awslabs / flexible-snapshot-proxy

High-performance open-source orchestration utility that utilizes EBS Direct APIs to efficiently clone, copy and migrate EBS snapshots to and from arbitrary File, Block or Object destinations.

License: Apache License 2.0

Python 100.00%
ebs snapshots

flexible-snapshot-proxy's Introduction

Flexible Snapshot Proxy

Basic Usage

Help is available by running src/main.py -h.

Some usage examples are available as full-stack canaries in test_functional.py.

Example scenarios we have tested are in Scenarios

IAM permissions required for reading and writing snapshots are documented here.

Below is an example IAM template:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSnapshots",
                "ec2:DescribeRegions",
                "ebs:StartSnapshot",
                "ebs:ListSnapshotBlocks",
                "ebs:ListChangedBlocks",
                "ebs:GetSnapshotBlock",
                "s3:GetBucketAcl",
                "s3:ListBucket",
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": "*"
        }
    ]
}
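
For readers who generate this policy from a script, here is a minimal Python sketch. It builds the same statement as a dictionary and serializes it with json.dumps, which always emits well-formed JSON. The variable names (ACTIONS, policy, policy_json) are illustrative and not part of the project.

```python
import json

# Actions copied from the example IAM template above.
ACTIONS = [
    "ec2:DescribeSnapshots",
    "ec2:DescribeRegions",
    "ebs:StartSnapshot",
    "ebs:ListSnapshotBlocks",
    "ebs:ListChangedBlocks",
    "ebs:GetSnapshotBlock",
    "s3:GetBucketAcl",
    "s3:ListBucket",
    "s3:PutObject",
    "s3:GetObject",
]

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ACTIONS, "Resource": "*"},
    ],
}

# json.dumps never emits trailing commas, so the output can be pasted as-is.
policy_json = json.dumps(policy, indent=4)
print(policy_json)
```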

The one-liner below generates a list of all commands for which test cases exist, and shows their syntax.

% cat test/test_functional.py|grep python3 | cut -d "/" -f 2-3 | awk -F"[{']" '{print $1 $3 " " $6 " " $9}'

src/main.py list ORG_SNAP  
src/main.py download snapshotId PATH_TO_RAW_DEVICE 
src/main.py deltadownload snapshot1 snapshot2 PATH_TO_RAW_DEVICE 
src/main.py upload UPLOAD_BLOCKS  
src/main.py copy snapshotId  
src/main.py diff snapshotId_1 snapshotId_2 
src/main.py sync snapshotId_1 snapshotId_2 snapshotId_parent 
src/main.py multiclone snapshotId PATH_TO_TEMP_DIRECTORY 
src/main.py fanout UPLOAD_BLOCKS PATH_TO_PROJECT_DIRECTORY 
src/main.py movetos3 snapshotId DEST_S3_BUCKET 
src/main.py getfroms3 snapshotId DEST_S3_BUCKET 

Installation

Currently, the utility enumerates all Python package dependencies at runtime and installs any missing packages via pip3. It shows no indication of progress and does not ask the user for permission before installing additional packages.

TODO: Ask user for permission, print a list of packages it is going to install.

TODO: Get the package into PyPI so it could be installed via pip3.
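
One possible shape for the permission prompt is sketched here. It is not the project's code: the function names, the prompt text and the package name hypothetical_fsp_dependency are all assumptions, and the sketch treats the import name and the pip package name as identical, which is not always true.

```python
import importlib.util
import subprocess
import sys


def missing_packages(import_names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


def install_with_consent(packages, ask=input):
    """List the packages, ask for permission, and only then call pip."""
    if not packages:
        return False
    print("The following packages are not installed:", ", ".join(packages))
    answer = ask("Install them with pip now? [y/N] ")
    if answer.strip().lower() not in ("y", "yes"):
        return False
    # Use the running interpreter so packages land in the same environment.
    subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])
    return True


# Usage: "json" ships with Python, the second name is a made-up dependency.
needed = missing_packages(["json", "hypothetical_fsp_dependency"])
# The lambda stands in for a user who answers "n", so nothing is installed.
installed = install_with_consent(needed, ask=lambda prompt: "n")
print(needed, installed)
```

Passing the prompt function as a parameter keeps the consent step testable without a terminal.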

Configuration

Configuration of the transfer is performed at runtime with the following CLI arguments:

Optional arguments:
  -h, --help            show this help message and exit
  -o ORIGIN_REGION, --origin_region ORIGIN_REGION
                        AWS Origin Region. Source of Snapshot copies. (default: .aws/config then us-east-1)
  -d, --dry_run         Perform a dry run of FSP operation to check valid AWS permissions. (default: false)
  -q, --quiet           quiet output
  -v, --verbosity       output verbosity. (Pass/Fail blocks per region)
  -vv                   increased output verbosity. (Pass/Fail for individual blocks)
  -vvv                  Maximum output verbosity. (All individual block retries will be recorded)
  --nodeps              Do not verify/install dependencies.
  --suppress_writes     Intended for underpowered devices. Will not write log files or check dependencies
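
As an illustration of how flags like these are commonly declared, here is a minimal argparse sketch. It is an assumption about structure, not a copy of the parser in src/main.py. The function name build_parser is hypothetical, and the repeated -v flag is modelled with argparse's count action.

```python
import argparse


def build_parser():
    """Declare the optional arguments listed in the help text above."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-o", "--origin_region", default=None,
                        help="AWS Origin Region. Source of Snapshot copies.")
    parser.add_argument("-d", "--dry_run", action="store_true",
                        help="Perform a dry run to check valid AWS permissions.")
    parser.add_argument("-q", "--quiet", action="store_true", help="quiet output")
    # action="count" turns -v, -vv and -vvv into 1, 2 and 3.
    parser.add_argument("-v", "--verbosity", action="count", default=0,
                        help="output verbosity")
    parser.add_argument("--nodeps", action="store_true",
                        help="Do not verify/install dependencies.")
    parser.add_argument("--suppress_writes", action="store_true",
                        help="Will not write log files or check dependencies.")
    return parser


args = build_parser().parse_args(["-vv", "-o", "eu-west-1", "--nodeps"])
print(args.verbosity, args.origin_region, args.nodeps)  # prints: 2 eu-west-1 True
```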

Additional advanced tuneables are currently in the source itself.

NUM_JOBS # controls the parallelism
FULL_COPY # enables transfer of sparse chunks, which are skipped by default
offset # in chunk_and_align() controls the maximum size of S3 Objects generated. 64 chunks = 32 MB.
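
The "64 chunks = 32 MB" figure follows from the 512 KiB block size used by the EBS direct APIs. The arithmetic can be checked with a short sketch. The names CHUNK_SIZE and max_object_size are illustrative and may not match the identifiers in the source.

```python
CHUNK_SIZE = 512 * 1024  # EBS direct API block size in bytes (512 KiB)


def max_object_size(offset_chunks):
    """Largest S3 object, in bytes, produced when grouping this many chunks."""
    return offset_chunks * CHUNK_SIZE


size_bytes = max_object_size(64)
print(size_bytes, size_bytes // (1024 * 1024))  # prints: 33554432 32
```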

TODO: implement a setup.py script for CLI configuration/installation.

Features

Flexible Snapshot Proxy supports the following commands:

list                Returns accurate size of an EBS Snapshot by enumerating
                    actual consumed/allocated space. 

diff                Returns accurate size of an EBS Snapshot Delta by
                    enumerating the incremental block-level difference 
                    between 2 Snapshots with a common parent.

download            Transfers an EBS Snapshot to an arbitrary file or
                    block device.

deltadownload       Downloads the delta between any two snapshots with a
                    common parent on top of an arbitrary file or block device.
                    Can be used for synchronizing existing volumes created from
                    the parent.

upload              Transfers an arbitrary file or block device to a new
                    EBS Snapshot.

copy                Transfers an EBS Snapshot to another EBS Direct API
                    Endpoint. Intended use case: copy EBS Snapshots across
                    accounts and/or regions.

sync                Synchronizes the incremental difference between 2
                    EBS Snapshots, delta(A,B) to Snapshot C (clone of A),
                    resulting in Snapshot D (clone of B). Intended use case:
                    `copy` the parent snapshot, then use `sync` to synchronize
                    changes.

movetos3            Transfers an EBS Snapshot or an arbitrary image file / block
                    device to a customer-owned S3 Bucket (any S3 Storage Class, or
                    Snow Family), with zstandard compression, tuneable object
                    size and an independent segment checksum.
                    (TODO: verify block->S3 path)

getfroms3           Transfers a Snapshot stored in a customer-owned S3
                    Bucket to a new block volume or file.

multiclone          Same functionality as `download`, but writing to
                    multiple destinations in parallel. Intended use case: clone a
                    snapshot to multiple volumes.

fanout              Upload from arbitrary file or block device to
                    multiple EBS Snapshot(s) in parallel, provided a list
                    of regions.
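
To make the `diff` command more concrete, the sketch below counts changed blocks by paging through ListChangedBlocks responses. It is an illustration under stated assumptions, not the project's implementation. The function changed_block_stats and the StubEbsClient class are hypothetical. The stub stands in for boto3.client("ebs") so the example runs offline, and its response fields (ChangedBlocks, BlockSize, NextToken) follow the EBS direct API response shape.

```python
def changed_block_stats(ebs_client, first_snapshot_id, second_snapshot_id):
    """Return (changed_block_count, changed_bytes) between two snapshots."""
    kwargs = {"FirstSnapshotId": first_snapshot_id,
              "SecondSnapshotId": second_snapshot_id}
    count = 0
    block_size = 0
    while True:
        page = ebs_client.list_changed_blocks(**kwargs)
        count += len(page.get("ChangedBlocks", []))
        block_size = page.get("BlockSize", block_size)
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token  # request the next page
    return count, count * block_size


class StubEbsClient:
    """Hypothetical offline stand-in for boto3.client("ebs")."""

    def __init__(self, pages):
        self._pages = pages

    def list_changed_blocks(self, **kwargs):
        # The stub uses the page index as its pagination token.
        return self._pages[int(kwargs.get("NextToken", 0))]


pages = [
    {"ChangedBlocks": [{"BlockIndex": i} for i in range(5)],
     "BlockSize": 524288, "NextToken": "1"},
    {"ChangedBlocks": [{"BlockIndex": i} for i in range(5, 8)],
     "BlockSize": 524288},
]
stats = changed_block_stats(StubEbsClient(pages), "snap-a", "snap-b")
print(stats)  # prints: (8, 4194304)
```

With a real client, the stub would be replaced by boto3.client("ebs") and the snapshot IDs by actual values.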

Design Overview

TODO: Outline the design choices behind high parallelization and memory sharing when completing the same job in different regions (e.g. multiclone and fanout).

System requirements

Memory

CPU

Network

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

flexible-snapshot-proxy's People

Contributors

amazon-auto, kdavyd, robbieowens15

Forkers

kdavyd kenhui

flexible-snapshot-proxy's Issues

deltadownload results in OSError: [Errno 9] Bad file descriptor

I have 2 snapshots from the same parent: an initial and a diff.

Original
Snapshot ID: snap-01e5f2fee19e08fae
Size: 500 GiB
Progress: Available (100%)
Snapshot status: Completed
Owner: 387198252553
Volume ID: vol-0b4c2a2591f78c536
Started: Wed Aug 31 2022 11:42:19 GMT-0500 (Central Daylight Time)

2nd snapshot
Snapshot ID: snap-01b3747942520453a
Size: 500 GiB
Progress: Available (100%)
Snapshot status: Completed
Owner: 387198252553
Volume ID: vol-0b4c2a2591f78c536

When I do a diff I get
C:\flexible-snapshot-proxy\src/main.py diff snap-01e5f2fee19e08fae snap-0095db1e0578136bc
Changes between snap-01e5f2fee19e08fae and snap-0095db1e0578136bc contain 8 chunks and 4194304 bytes, took 0.15 seconds.

however DeltaDownload throws an error:

PS C:\flexible-snapshot-proxy\src/main.py deltadownload snap-01e5f2fee19e08fae snap-01b3747942520453a \.\PhysicalDrive2
Changes between snap-01e5f2fee19e08fae and snap-01b3747942520453a contain 8 chunks and 4194304 bytes, took 0.16 seconds.
['\\.\PhysicalDrive2']
C:\flexible-snapshot-proxy\src/main.py : Traceback (most recent call last):
At line:1 char:1

+ C:\flexible-snapshot-proxy\src/main.py deltadownload snap-01e5f2fee19 ...
    + CategoryInfo          : NotSpecified: (Traceback (most recent call last)::String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError
    
    File "C:\flexible-snapshot-proxy\src\main.py", line 484, in <module>
      deltadownload(snapshot_id_one=args.snapshot_one, snapshot_id_two=args.snapshot_two, file_path=args.file_path)
    File "C:\flexible-snapshot-proxy\src\fsp.py", line 613, in deltadownload
      parallel(
    File "C:\Python310\lib\site-packages\joblib\parallel.py", line 1056, in __call__
      self.retrieve()
    File "C:\Python310\lib\site-packages\joblib\parallel.py", line 935, in retrieve
      self._output.extend(job.get(timeout=self.timeout))
    File "C:\Python310\lib\multiprocessing\pool.py", line 774, in get
      raise self._value
    File "C:\Python310\lib\multiprocessing\pool.py", line 125, in worker
      result = (True, func(*args, **kwds))
    File "C:\Python310\lib\site-packages\joblib\_parallel_backends.py", line 595, in __call__
      return self.func(*args, **kwargs)
    File "C:\Python310\lib\site-packages\joblib\parallel.py", line 262, in __call__
      return [func(*args, **kwargs)
    File "C:\Python310\lib\site-packages\joblib\parallel.py", line 262, in <listcomp>
      return [func(*args, **kwargs)
    File "C:\flexible-snapshot-proxy\src\fsp.py", line 388, in get_changed_blocks
      parallel2(
    File "C:\Python310\lib\site-packages\joblib\parallel.py", line 1056, in __call__
      self.retrieve()
    File "C:\Python310\lib\site-packages\joblib\parallel.py", line 935, in retrieve
      self._output.extend(job.get(timeout=self.timeout))
    File "C:\Python310\lib\multiprocessing\pool.py", line 774, in get
      raise self._value
    File "C:\Python310\lib\multiprocessing\pool.py", line 125, in worker
      result = (True, func(*args, **kwds))
    File "C:\Python310\lib\site-packages\joblib\_parallel_backends.py", line 595, in __call__
      return self.func(*args, **kwargs)
    File "C:\Python310\lib\site-packages\joblib\parallel.py", line 262, in __call__
      return [func(*args, **kwargs)
    File "C:\Python310\lib\site-packages\joblib\parallel.py", line 262, in <listcomp>
      return [func(*args, **kwargs)
    File "C:\flexible-snapshot-proxy\src\fsp.py", line 226, in get_changed_block
      write_block_to_file(file, block, data)
    File "C:\flexible-snapshot-proxy\src\fsp.py", line 177, in write_block_to_file
      f.write(data)
    

OSError: [Errno 9] Bad file descriptor
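
For context on the failing step, the sketch below shows one common way to write a single block at its byte offset in an existing target. It does not diagnose or fix the error above, and it is not the write_block_to_file function from fsp.py. The name write_block is hypothetical, and the 512 KiB block size is an assumption that matches the reported 8 chunks and 4194304 bytes.

```python
import os
import tempfile

CHUNK_SIZE = 512 * 1024  # assumed block size in bytes (512 KiB)


def write_block(path, block_index, data):
    """Write one block at its byte offset without truncating the target."""
    # "r+b" opens an existing file for in-place binary updates.
    with open(path, "r+b") as f:
        f.seek(block_index * CHUNK_SIZE)
        f.write(data)


# Usage: a temporary 4-block file stands in for the destination device.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.truncate(4 * CHUNK_SIZE)
    target = tmp.name

write_block(target, 2, b"\xab" * CHUNK_SIZE)

with open(target, "rb") as f:
    f.seek(2 * CHUNK_SIZE)
    first_byte = f.read(1)
size = os.path.getsize(target)
os.remove(target)
print(size, first_byte)
```

Opening the target inside each call means no file object is shared between worker processes. Whether that is relevant to this report is not established here.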

Ability to use IAM Role instead of user

.\main.py : Traceback (most recent call last):
At line:1 char:1

+ .\main.py download snap-\.\PhysicalDrive1
    + CategoryInfo          : NotSpecified: (Traceback (most recent call last)::String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError
    
    File "C:\flexible-snapshot-proxy\src\main.py", line 270, in setup_singleton
      user_account = sts.get_caller_identity().get("Account")
    File "C:\Python310\lib\site-packages\botocore\client.py", line 508, in _api_call
      return self._make_api_call(operation_name, kwargs)
    File "C:\Python310\lib\site-packages\botocore\client.py", line 898, in _make_api_call
      http, parsed_response = self._make_request(
    File "C:\Python310\lib\site-packages\botocore\client.py", line 921, in _make_request
      return self._endpoint.make_request(operation_model, request_dict)
    File "C:\Python310\lib\site-packages\botocore\endpoint.py", line 119, in make_request
      return self._send_request(request_dict, operation_model)
    File "C:\Python310\lib\site-packages\botocore\endpoint.py", line 198, in _send_request
      request = self.create_request(request_dict, operation_model)
    File "C:\Python310\lib\site-packages\botocore\endpoint.py", line 134, in create_request
      self._event_emitter.emit(
    File "C:\Python310\lib\site-packages\botocore\hooks.py", line 412, in emit
      return self._emitter.emit(aliased_event_name, **kwargs)
    File "C:\Python310\lib\site-packages\botocore\hooks.py", line 256, in emit
      return self._emit(event_name, kwargs)
    File "C:\Python310\lib\site-packages\botocore\hooks.py", line 239, in _emit
      response = handler(**kwargs)
    File "C:\Python310\lib\site-packages\botocore\signers.py", line 103, in handler
      return self.sign(operation_name, request)
    File "C:\Python310\lib\site-packages\botocore\signers.py", line 187, in sign
      auth.add_auth(request)
    File "C:\Python310\lib\site-packages\botocore\auth.py", line 407, in add_auth
      raise NoCredentialsError()
    

botocore.exceptions.NoCredentialsError: Unable to locate credentials
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\flexible-snapshot-proxy\src\main.py", line 457, in <module>
    setup_singleton(args)
  File "C:\flexible-snapshot-proxy\src\main.py", line 272, in setup_singleton
    except sts.exceptions as e:
TypeError: catching classes that do not inherit from BaseException is not allowed
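
The second traceback is a general Python behaviour: an except clause must name an exception class (or a tuple of classes), not an object that merely groups exception classes. The sketch below reproduces the TypeError with standard-library code only. FakeClientExceptions and NoCredentials are hypothetical stand-ins, not botocore classes. In real code the credentials error shown above is botocore.exceptions.NoCredentialsError.

```python
class FakeClientExceptions:
    """Hypothetical stand-in for an object that only groups exception classes."""

    class AccessDenied(Exception):
        pass


class NoCredentials(Exception):
    """Hypothetical stand-in for a 'credentials not found' error."""


exceptions_namespace = FakeClientExceptions()  # an instance, not an exception class


def lookup_account():
    # Simulates the first failure in the traceback above.
    raise NoCredentials("Unable to locate credentials")


def broken_handler():
    try:
        return lookup_account()
    except exceptions_namespace as e:  # not a BaseException subclass
        return f"handled: {e}"


def fixed_handler():
    try:
        return lookup_account()
    # Name concrete exception classes, optionally grouped in a tuple.
    except (NoCredentials, FakeClientExceptions.AccessDenied) as e:
        return f"handled: {e}"


try:
    broken_handler()
except TypeError as error:
    print("broken:", error)
print("fixed:", fixed_handler())
```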

Add Windows support

Is your idea related to a problem? Please describe.
Currently this code uses a lot of Linux- and macOS-specific API calls for reading devices and files, as well as for calculating the total size of files and filesystems.

Describe the solution you'd like
For example, fsp.py references os.O_RDONLY, os.O_NONBLOCK, and os.SEEK_END. Of these, O_NONBLOCK is only available on Unix-like systems such as Linux and macOS.

I tested some examples replacing O_NONBLOCK with O_BINARY which works with Windows and was able to successfully upload a VSS volume snapshot.

After digging in the code for a few days, significant portions of fsp.py would need to be rewritten to detect conditions for Linux / macOS and separate functions need to be called for each OS.

For example, here's some pseudocode for a new function that could be used to split off file and filesystem size calculations:

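import os
import platform

def calculateSize(file_path):
    """Return the size in bytes of file_path, or None if the platform is unsupported."""
    currentPlatform = platform.system()
    match currentPlatform:  # match/case requires Python 3.10+
        case 'Linux' | 'Darwin':
            # validate_file_paths_read is the helper referenced in this proposal;
            # it is assumed to be defined in fsp.py.
            validate_file_paths_read([file_path])
            # os.O_BINARY only exists on Windows, so it is not used in this branch.
            with os.fdopen(os.open(file_path, os.O_RDONLY), "rb") as fileHandle:
                # Seek to the end and read the position to get the size.
                fileHandle.seek(0, os.SEEK_END)
                size = fileHandle.tell()
            # The with block closes the handle, so no explicit close is needed.
            return size
        case 'Windows':
            # TODO: code for calculating Windows file and filesystem size
            raise NotImplementedError("Windows size calculation is not implemented yet")
        case _:
            print("Unsupported Platform!")
            return None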

Additionally, some extra steps need to be taken on a Windows filesystem to copy the filesystem, such as taking a VSS snapshot of the filesystem and then performing the copy and upload against the VSS snapshot instead of the live filesystem.

I'm going to try to work on it some more but I'm definitely not a coder so this will take a lot of time for me. Some leads that I was pursuing to calculate filesystem size for Windows involve using WMI calls and will need to import the WMI library.

Define list of IAM Permissions

I am trying to narrow down the IAM permissions required. So far I have these:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSnapshots",
                "ec2:DescribeRegions",
                "ebs:StartSnapshot",
                "s3:GetBucketAcl",
                "s3:ListBucket",
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": "*"
        }
    ]
}
But I am getting botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListBuckets operation: Access Denied
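
One detail that may be relevant: the S3 ListBuckets operation is authorized by the s3:ListAllMyBuckets action, which is distinct from s3:ListBucket (listing the objects in one bucket). The sketch below checks a policy document for that action. The helper allowed_actions is hypothetical, it only handles exact action names (no wildcards), and whether the tool needs ListBuckets at all is not confirmed here.

```python
def allowed_actions(policy):
    """Collect the action names from every Allow statement (exact names only)."""
    actions = set()
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        action = statement["Action"]
        # "Action" may be a single string or a list of strings.
        actions.update([action] if isinstance(action, str) else action)
    return actions


# The policy from the report above, as a Python dictionary.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSnapshots",
                "ec2:DescribeRegions",
                "ebs:StartSnapshot",
                "s3:GetBucketAcl",
                "s3:ListBucket",
                "s3:PutObject",
                "s3:GetObject",
            ],
            "Resource": "*",
        }
    ],
}

# Action that authorizes the ListBuckets operation.
required = {"s3:ListAllMyBuckets"}
missing = required - allowed_actions(policy)
print(missing)  # prints: {'s3:ListAllMyBuckets'}
```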
