
aws-s3-multipart-presigned-upload's Introduction

Multipart + Presigned URL upload to AWS S3/Minio via the browser

Motivation

I created this demo repo because documentation for multipart uploading of large files using presigned URLs was very scant.

I wanted to create a solution to allow users to upload files directly from the browser to AWS S3 (or any S3-compliant storage server). This worked great when I used AWS SDK's getSignedUrl API to generate a temporary URL that the browser could upload the file to.

However, I hit a snag with files larger than 5GB: a single presigned PUT only allows a maximum of 5GB to be uploaded in one go. As such, this repo demonstrates the use of multipart uploads + presigned URLs to upload large files to an AWS S3-compliant storage service.

Components used in this demo

  • Frontend Server: React (Next.js)
  • Backend Server: Node.js (Express), using the AWS JS SDK
  • Storage Server: Minio (but this can easily be switched out to AWS S3)

How to run

  • Clone the repo and change directory into the repo
  • Open three different terminal windows.

Storage Server

In window 1, run:

# Set up the Minio server (ignore this if you are using AWS S3)
# Minio docs: https://docs.minio.io/docs/minio-quickstart-guide
minio server /data

Note: Set the bucket policy as appropriate to allow the required access.

Backend Server

Replace the following code in backend/server.js with your AWS S3 or S3-compliant storage server config.

const s3 = new AWS.S3({
  accessKeyId: '<ACCESS_KEY_ID>',         // Replace with your access key ID
  secretAccessKey: '<SECRET_ACCESS_KEY>', // Replace with your secret access key
  endpoint: 'http://127.0.0.1:9000',
  s3ForcePathStyle: true,                 // Required for Minio (path-style access)
  signatureVersion: 'v4'
});

Note: If you are using AWS S3, follow the docs on the AWS website to instantiate a new AWS S3 client.
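For reference, a minimal sketch of a plain AWS S3 client is below; the region and credential values are placeholders you must fill in for your own setup.

const AWS = require('aws-sdk');

const s3 = new AWS.S3({
  accessKeyId: '<ACCESS_KEY_ID>',         // Replace with your access key ID
  secretAccessKey: '<SECRET_ACCESS_KEY>', // Replace with your secret access key
  region: 'us-east-1',                    // Replace with your bucket's region
  signatureVersion: 'v4'                  // v4 signing is required in newer regions
});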

In window 2, run:

cd backend
npm install
node server.js

Frontend Server

In window 3, run:

cd frontend
npm install
npm run dev

Upload File

Go to http://localhost:3000 in your browser window and upload a file.


aws-s3-multipart-presigned-upload's Issues

Improvement: Retrieve all presigned URLs in one network call, instead of making a call for each uploaded part.

Thank you for this example repo for frontend multipart presigned URL uploads.

I noticed a change that will improve performance and help anyone taking this approach! This issue was opened to get that improvement implemented.

Context

The current frontend code does something like this:

for (let index = 1; index < NUM_CHUNKS + 1; index++) {
    start = (index - 1) * FILE_CHUNK_SIZE
    end = index * FILE_CHUNK_SIZE
    blob = (index < NUM_CHUNKS)
        ? this.state.selectedFile.slice(start, end)
        : this.state.selectedFile.slice(start)

    // (1) Generate a presigned URL for each part
    let getUploadUrlResp = await axios.get(`${this.state.backendUrl}/get-upload-url`, {
        params: {
            fileName: this.state.fileName,
            partNumber: index,
            uploadId: this.state.uploadId
        }
    })
    ...
}
...

This creates a growing network cost: larger files have more chunks (NUM_CHUNKS) than smaller files, so the call is made correspondingly more times.

However, even though larger files naturally take longer to upload, this particular cost can be avoided! An example of the improvement is shown below:

let getUploadUrlResp = await axios.get(`${this.state.backendUrl}/get-upload-urls`, {
    params: {
        fileName: this.state.fileName,
        numberOfParts: NUM_CHUNKS,
        uploadId: this.state.uploadId
    }
});

for (let index = 1; index < NUM_CHUNKS + 1; index++) {
    ...
}
...

This change converts /get-upload-url from being called NUM_CHUNKS times to /get-upload-urls being called a single time, no matter the file size!
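For completeness, here is a hedged sketch of how the frontend loop could consume that single response; the { urls } response shape and the direct axios.put upload are assumptions, not this repo's exact code.

// Assumption: the backend responds with { urls: [...] }, one presigned URL per part.
const { urls } = getUploadUrlResp.data;

for (let index = 1; index < NUM_CHUNKS + 1; index++) {
    const start = (index - 1) * FILE_CHUNK_SIZE;
    const blob = (index < NUM_CHUNKS)
        ? this.state.selectedFile.slice(start, index * FILE_CHUNK_SIZE)
        : this.state.selectedFile.slice(start);

    // Upload each part straight to its presigned URL (no per-part backend call).
    // Collecting each response's ETag for the completion call is elided here.
    await axios.put(urls[index - 1], blob);
}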

As part of fixing this issue, the following should be done:

  • Update frontend code to use single network call
  • Convert and update the /get-upload-url logic to /get-upload-urls (see the backend sketch after this list)
  • Bonus: Comment on the following GitHub issue showing off new changes for visibility and recognition! aws/aws-sdk-js#1603
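A minimal sketch of such a backend route is below, assuming the AWS SDK v2 client (s3) configured in backend/server.js; the route name, query parameters, bucket placeholder, and response shape are illustrative, not prescriptive.

// Return presigned URLs for every part in a single response.
app.get('/get-upload-urls', async (req, res) => {
  const { fileName, numberOfParts, uploadId } = req.query;
  try {
    const promises = [];
    for (let partNumber = 1; partNumber <= Number(numberOfParts); partNumber++) {
      promises.push(s3.getSignedUrlPromise('uploadPart', {
        Bucket: '<BUCKET_NAME>', // Replace with your bucket
        Key: fileName,
        PartNumber: partNumber,
        UploadId: uploadId,
        Expires: 60 * 60         // URL lifetime in seconds
      }));
    }
    const urls = await Promise.all(promises);
    res.json({ urls }); // urls[i] is the presigned URL for part i + 1
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});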

Complete multipart upload gets called before uploading any part

I have used your code as a reference for multipart upload, and the problem I am facing is that my complete-multipart-upload call executes before any part of the file starts uploading.
I have created a for loop inside which I initialize partNum, partSizeToUpload, dataToUpload, and position. Then I make an API call to my Django backend to get a presigned URL, and once that call succeeds I make a PUT call to upload the part. My complete-multipart-upload API call is outside the for loop, and on debugging I found that it executes while my for loop is still running.
Below is my code for this task:
const partSize = 1024 * 1024 * 5;
const wavSize = wav.byteLength;
const partCount = parseInt(Math.ceil(wavSize / parseFloat(partSize, 10)), 10);
// let fileData = {};
let partNum = 0;
let position = 0;
let iterator = 0;
const uploadPartsArray = [];
let promisesArray = [];

for (iterator = 0; iterator < partCount; iterator++) {
  partNum = iterator + 1;
  const partSizeToUpload = Math.min(partSize, (wavSize - position));
  const dataToUpload = wav.slice(position, partSizeToUpload);
  position = position + partSizeToUpload;

  apiService.uploadFilePart(this.props.meetingData.meeting.uuid, { upload_id: responseUrl.data.upload_id, key: responseUrl.data.file_key, file_part_number: partNum })
    .then((responseUploadPart) => {
      console.log('I am response from uploadFIlePart: ', responseUploadPart);
      const uploadRespPromise = apiService.uploadAudioRecording(responseUploadPart.data.url, dataToUpload, { headers: { 'Access-Control-Expose-Headers': '*' } })
        .then((responseUpload) => {
          console.log('Upload AUdio iam response headers: ', responseUpload.headers.etag);
          uploadPartsArray.push({
            ETag: responseUpload.headers.etag,
            PartNumber: partNum
          });
          console.log('Upload parts array: ', uploadPartsArray);
        }).catch(error => { console.log('Error in uploading to presigned URL..', error); });

      promisesArray.push(uploadRespPromise);
    }).catch(error => { console.log('Error in API call of uploadFIlepart..', error); });
}

// Complete multipart upload
const resolvedArray = await Promise.all(promisesArray);
console.log(resolvedArray, ' resolvedArray');
console.log('Final Upload parts array: ', uploadPartsArray);
apiService.completeMultipartUpload(this.props.meetingData.meeting.uuid, { upload_id: responseUrl.data.upload_id, key: responseUrl.data.file_key, file_parts: uploadPartsArray })
  .then((responseCompleteMltipart) => {
    console.log('Multi part upload response: ', responseCompleteMltipart);
  })
  .catch((errCMU) => { console.log('Error in completing multipart upload..,', errCMU); });
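A likely cause: promisesArray.push(...) runs inside the asynchronous .then callback, so Promise.all(promisesArray) executes while the array is still empty and the code falls through to completeMultipartUpload immediately. Below is a hedged sketch of one possible fix (it also corrects the slice call, which takes an end offset rather than a length); the apiService methods and variables are the ones from the code above.

// Build promisesArray synchronously so Promise.all actually waits for every part.
const promisesArray = [];
for (let iterator = 0; iterator < partCount; iterator++) {
  const partNum = iterator + 1;
  const partSizeToUpload = Math.min(partSize, wavSize - position);
  // slice(begin, end) takes an end offset, not a length.
  const dataToUpload = wav.slice(position, position + partSizeToUpload);
  position += partSizeToUpload;

  // Push the whole chain (get URL, then PUT the part) before any await.
  promisesArray.push(
    apiService.uploadFilePart(this.props.meetingData.meeting.uuid, { upload_id: responseUrl.data.upload_id, key: responseUrl.data.file_key, file_part_number: partNum })
      .then((responseUploadPart) =>
        apiService.uploadAudioRecording(responseUploadPart.data.url, dataToUpload, { headers: { 'Access-Control-Expose-Headers': '*' } }))
      .then((responseUpload) => ({
        ETag: responseUpload.headers.etag,
        PartNumber: partNum
      }))
  );
}

// Now this genuinely waits for every part before completing the upload.
const uploadPartsArray = await Promise.all(promisesArray);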

Error when running the script: "MalformedXML"

Hello, thank you very much for this script; it is exactly what I'm looking for. I'm trying to use it to upload a small object to S3, and I'm running into the following issue:

Do you have the same issue on your end or is it my configuration?

Thanks for your help


> [email protected] start aws-s3-multipart-presigned-upload/backend
> node server.js

Example app listening on port 4000!
{
  Bucket: 'mybucket',
  Key: 'myobject',
  PartNumber: '1',
  UploadId: 'KAafi66SQfghLBIYD19RUZjgwe.47_BODU3uXFs10X1BKQ5VIez_HktLe2m3dTYC0i9hDwzOoSJ4ft5wKaZjdy3n10Brpxk2lq.qkJaW0uoqsUDbTHeoz3F7bMnGfD2i'
}
{
  params: {
    fileName: 'myobject',
    parts: [ [Object] ],
    uploadId: 'KAafi66SQfghLBIYD19RUZjgwe.47_BODU3uXFs10X1BKQ5VIez_HktLe2m3dTYC0i9hDwzOoSJ4ft5wKaZjdy3n10Brpxk2lq.qkJaW0uoqsUDbTHeoz3F7bMnGfD2i'
  }
} : body
{
  Bucket: 'mybucket',
  Key: 'myobject',
  MultipartUpload: { Parts: [ [Object] ] },
  UploadId: 'KAafi66SQfghLBIYD19RUZjgwe.47_BODU3uXFs10X1BKQ5VIez_HktLe2m3dTYC0i9hDwzOoSJ4ft5wKaZjdy3n10Brpxk2lq.qkJaW0uoqsUDbTHeoz3F7bMnGfD2i'
}
MalformedXML: The XML you provided was not well-formed or did not validate against our published schema
    at Request.extractError [...]
 {
    message: 'The XML you provided was not well-formed or did not validate against our published schema',
    code: 'MalformedXML',
    region: null,
    time: 2019-10-16T20:42:21.454Z,
    requestId: '6F80BAEE1032B0F8',
    extendedRequestId: 'TUz6H3z/lq3RPHj3QioVw4R+g7lSjwvvqeC8XH6p2Fcz1ptXYk2pir9l8pGmjhabvKUnq0CQhYU=',
    cfId: undefined,
    statusCode: 400,
    retryable: false,
    retryDelay: 61.98304100596486
  },
  isOperational: true,
  code: 'MalformedXML',
  region: null,
  time: 2019-10-16T20:42:21.454Z,
  requestId: '6F80BAEE1032B0F8',
  extendedRequestId: 'TUz6H3z/lq3RPHj3QioVw4R+g7lSjwvvqeC8XH6p2Fcz1ptXYk2pir9l8pGmjhabvKUnq0CQhYU=',
  cfId: undefined,
  statusCode: 400,
  retryable: false,
  retryDelay: 61.98304100596486
}

NUM_CHUNKS miscalculation

I noticed an issue with how the number of chunks is calculated:

const NUM_CHUNKS = Math.round(fileSize / FILE_CHUNK_SIZE) + 1

If the fractional part of fileSize / FILE_CHUNK_SIZE is >= 0.5, Math.round will round up and the number of chunks will be one more than needed. Math.floor would solve this issue.
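As an aside, Math.ceil also covers the exact-multiple edge case, where Math.floor(...) + 1 would still produce one chunk too many; a minimal sketch:

// Exact number of parts needed for a given file size.
const numChunks = (fileSize, chunkSize) => Math.ceil(fileSize / chunkSize);

numChunks(25 * 1024 * 1024, 10 * 1024 * 1024); // 3 (partial last chunk)
numChunks(20 * 1024 * 1024, 10 * 1024 * 1024); // 2 (exact multiple, no extra chunk)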

This is a really great example library. Thanks for posting this!

How to use this in the browser?

Hi,
I need the frontend to run in the browser. How would I do that using your multipart presigned upload? Can you give an example?
