
s3-streams's Introduction

s3-streams

Support for streaming reads and writes from and to S3 using Amazon's native API.


Amazon makes it a giant pain to do anything stream-like when it comes to S3 (given the general restriction that every request needs a Content-Length header). We provide native stream classes (both Readable and Writable) that wrap aws-sdk S3 requests and responses to make your life easier.

IMPORTANT: This library uses the streams3 API. In order to provide compatibility with older versions of node we make use of readable-stream. This is unlikely to have any effect on your code but has not yet been well tested.

If you are using node 0.8 you must ensure your version of npm is at least 1.4.6.

Features:

  • Native read streams,
  • Native write streams,
  • Smart piping.

Usage

npm install s3-streams

Write Streams

Create streams for uploading to S3:

var S3 = require('aws-sdk').S3,
	S3S = require('s3-streams');

var upload = S3S.WriteStream(new S3(), {
	Bucket: 'my-bucket',
	Key: 'my-key',
	// Any other AWS SDK options
	// ContentType: 'application/json'
	// Expires: new Date('2099-01-01')
	// ...
});
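
A minimal usage sketch, assuming the upload stream created above and a hypothetical local file path: pipe any readable stream into it and listen for 'finish':

var fs = require('fs');

fs.createReadStream('path/to/local/file')
	.pipe(upload)
	.on('finish', function() {
		console.log('Upload complete.');
	})
	.on('error', function(err) {
		console.error('Upload failed:', err);
	});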

Read Streams

Create streams for downloading from S3:

var S3 = require('aws-sdk').S3,
	S3S = require('s3-streams');

var download = S3S.ReadStream(new S3(), {
	Bucket: 'my-bucket',
	Key: 'my-key',
	// Any other AWS SDK options
});

Smart Piping

Smart pipe files over HTTP:

var http = require('http'),
	S3 = require('aws-sdk').S3,
	S3S = require('s3-streams');

http.createServer(function(req, res) {
	var src = S3S.ReadStream(...);
	// Automatically sets the correct HTTP headers
	src.pipe(res);
});
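
Since the read stream emits 'error' for failed requests (see the "Crash when file does not exist" issue below), here is a hedged variant that turns a failed lookup into an HTTP error response instead of crashing the process; the bucket and key are placeholders:

var http = require('http'),
	S3 = require('aws-sdk').S3,
	S3S = require('s3-streams');

http.createServer(function(req, res) {
	var src = S3S.ReadStream(new S3(), { Bucket: 'my-bucket', Key: 'my-key' });
	src.on('error', function(err) {
		// An unhandled 'error' event would crash the server.
		res.statusCode = err.statusCode || 500;
		res.end();
	});
	src.pipe(res);
}).listen(8080);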

Smart pipe files on S3:

var S3 = require('aws-sdk').S3,
	S3S = require('s3-streams');

var src = S3S.ReadStream(...),
	dst = S3S.WriteStream(...);

// The data streams straight from one S3 object to the other; nothing is ever written to local disk.
src.pipe(dst);

Extras

You can create pre-configured stream factories by partially applying the stream constructors to a specific S3 instance (using lodash's _.partial):

var _ = require('lodash'),
	S3 = require('aws-sdk').S3,
	S3S = require('s3-streams');

var instance = new S3(),
	s3 = {
		createReadStream: _.partial(S3S.ReadStream, instance),
		createWriteStream: _.partial(S3S.WriteStream, instance)
	};

var stream = s3.createReadStream({ Bucket: 'my-bucket', Key: 'my-key' });

Existing frameworks:

  • knox (doesn't use native AWS SDK, no true streaming support)
  • s3-upload-stream (doesn't use node streams API, no support for streaming downloads)
  • s3-download-stream (only does downloads, downloads are streamed by S3 part, not by individual buffer chunks)
  • streaming-s3 (overall terrible API; no actual streams)
  • create-s3-object-write-stream (probably one of the better ones)

s3-streams's People

Contributors: dci-aloughran, izaakschroeder

s3-streams's Issues

Error: stream.push() after EOF

Sometimes an error occurs (roughly 3 times out of 10):

// hapijs handler
const handler = function(request, reply) {
  const imageStream = s3.createReadStream({ 
        Bucket: 'bucket', 
        Key: 'filename'
  });
  reply(null, imageStream);
};
Error: stream.push() after EOF
at readableAddChunk (/.../node_modules/readable-stream/lib/_stream_readable.js:264:30)
at S3ReadStream.Readable.push (/.../node_modules/readable-stream/lib/_stream_readable.js:238:10)
at PassThrough.readable (/.../node_modules/s3-streams/lib/read.js:85:9)
at emitNone (events.js:105:13)
at PassThrough.emit (events.js:207:7)
at emitReadable_ (_stream_readable.js:516:10)
at emitReadable (_stream_readable.js:510:7)
at onEofChunk (_stream_readable.js:495:3)
at readableAddChunk (_stream_readable.js:223:5)
at PassThrough.Readable.push (_stream_readable.js:211:10)
at PassThrough.Transform.push (_stream_transform.js:147:32)
at done (_stream_transform.js:218:17)
at PassThrough.prefinish (_stream_transform.js:141:5)
at emitNone (events.js:105:13)
at PassThrough.emit (events.js:207:7)
at prefinish (_stream_writable.js:592:14)
at finishMaybe (_stream_writable.js:600:5)
at endWritable (_stream_writable.js:611:3)
at PassThrough.Writable.end (_stream_writable.js:562:5)
at PassThrough.checkContentLengthAndEmit (/.../node_modules/aws-sdk/lib/request.js:617:20)
at emitNone (events.js:110:20)
at PassThrough.emit (events.js:207:7)

Create output directory if it doesn't exist

Hi,

I am getting the following error if the directory to which I would like to write the file doesn't exist:

Unable to download file: { Error: ENOENT: no such file or directory, open '/myfilepath/'
errno: -2,
code: 'ENOENT',
syscall: 'open',
path: '/myfilepath/' }

I can fix this by checking if the directory exists and creating it before calling the download.

if (!fs.existsSync(localDir)) { try { fs.mkdirSync(localDir); } catch (err) { console.log('Error creating folder:', err); } }

But that doesn't seem very efficient, because it has to check for the existence of the directory on every file download.

It would be better if creating the directory were handled by the download as a try/catch: on "ENOENT: no such file or directory", create the directory and proceed.

Is that possible for you to change, or can you suggest a better way to handle this? For example, somehow catching it with your .on('error') handler, creating the directory inside it, and re-triggering the first step?
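
A hedged alternative, assuming Node 10.12+ where fs.mkdirSync accepts { recursive: true }: create the directory unconditionally before opening the write stream. localDir, file, and download are the same hypothetical names as in the snippet above.

var fs = require('fs'),
	path = require('path');

// No existence check needed: with { recursive: true }, mkdirSync
// succeeds silently if localDir already exists (Node 10.12+).
fs.mkdirSync(localDir, { recursive: true });
download.pipe(fs.createWriteStream(path.join(localDir, file)));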

Node process crashes

Trying to upload the following file [1] cloud.tar.gz (autodeletes in 48 hrs)

For some reason the entire node process crashes... the file isn't particularly large, and I can't for the life of me figure out what the issue is.

Worst case, I'll take a different path.

import fs from 'mz/fs';
import aws from 'aws-sdk';
import s3s from 's3-streams';

const s3 = new aws.S3();

fs
    .createReadStream('path/to/file')
    .pipe(s3s.WriteStream(s3, { Bucket: 'S3_BUCKET', Key: 'path/to/file' }));

[1] https://expirebox.com/download/bd4b7be5a2501c1639919e24e0620eaa.html
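
One thing worth checking, sketched below under the same imports as the snippet above: an unhandled 'error' event on either stream crashes a Node process outright, so attaching handlers to both ends at least surfaces the underlying failure instead of killing the process.

import fs from 'mz/fs';
import aws from 'aws-sdk';
import s3s from 's3-streams';

const s3 = new aws.S3();
const src = fs.createReadStream('path/to/file');
const dst = s3s.WriteStream(s3, { Bucket: 'S3_BUCKET', Key: 'path/to/file' });

// Without these handlers, any S3 or filesystem error is fatal.
src.on('error', (err) => console.error('read error:', err));
dst.on('error', (err) => console.error('upload error:', err));

src.pipe(dst);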

Running from a Lambda function, nothing is created in S3

const archiver = require('archiver');
const aws = require('aws-sdk');
const S3 = require('aws-sdk').S3;
const S3S = require('s3-streams');
const fs = require('fs');
const bucket = 'testbucket-download';

exports.handler = (event, context, callback) => {
	const s3 = new S3();
	const upload = S3S.WriteStream(s3, {
		Bucket: bucket,
		Key: 'download.zip'
	});

	const rs = fs.createReadStream('./3.png');
	rs.pipe(upload);
	upload.on('finish', function() {
		callback(null, 'yolosssssss');
	});
};

Large files not uploaded (size > highWaterMark)

I can't get large file uploads to work unless I raise highWaterMark above the file size. I thought highWaterMark was the threshold at which a new chunk should be sent. Have I misunderstood something?

const fs = require('fs');
const AWS = require('aws-sdk');
AWS.config.update({
  accessKeyId: accessKey,
  secretAccessKey: secretKey
});

const S3 = AWS.S3;
const S3S = require('s3-streams');
const read = fs.createReadStream('./big.pdf'); // 18 MB file

const s3Upload = S3S.WriteStream(new S3(), {Bucket: 'myBucket', Key: 'test'});

read.pipe(s3Upload)
.on('finish', function() {
  console.log(`finished piping to S3.. `);
})

Example for saving to a local file

Can you please add an example of how to take the download stream and actually save it to a local file?

Is there more to it than doing:

download.pipe(fs.createWriteStream('/tmp/' + file));
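
Not much more than that. Here is a minimal sketch under the same assumptions as the README examples (bucket, key, and target path are placeholders), with error handling added on both ends:

var fs = require('fs'),
	S3 = require('aws-sdk').S3,
	S3S = require('s3-streams');

var download = S3S.ReadStream(new S3(), { Bucket: 'my-bucket', Key: 'my-key' });

download
	.on('error', function(err) { console.error('Download failed:', err); })
	.pipe(fs.createWriteStream('/tmp/my-file'))
	.on('finish', function() { console.log('Saved to /tmp/my-file'); });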

Crash when file does not exist

It's not a real issue for this package, but I'm writing this down here so it can help others having the same problem.

When the requested file does not exist, I get an error.

It's easily solved with the following code:

s3Client.headObject(getObjectOptions, (err, metadata) => {
	if (err && err.code === 'NotFound') {
		console.log('file not found!');
	} else {
		let src = new S3S.ReadStream(s3Client, getObjectOptions);
		// ...
	}
});

Extra options break multipart upload logic

I'm trying to add the ServerSideEncryption option:
var ws = S3S.WriteStream(core.S3Client, { ServerSideEncryption: 'AES256', Bucket: bucketName, Key: fileName });

However, that results in this error at the end of the upload:
Unhandled rejection UnexpectedParameter: Unexpected key 'ServerSideEncryption' found in params.MultipartUpload.Parts[0]

That's because of this code in MultipartUpload.prototype.uploadPart:
return (err) ? reject(err) : resolve(_.assign({ }, result, { PartNumber: partNumber }));

The problem is that the extra option comes back as part of result, and you then pass it through to the Parts array. Parts entries may only contain ETag and PartNumber, hence the error.
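
A possible fix, sketched against the line quoted above: copy only the fields that CompleteMultipartUpload accepts in each Parts entry (ETag from the S3 response, plus the locally tracked PartNumber), instead of the whole result object.

// Hypothetical patch to MultipartUpload.prototype.uploadPart:
// whitelist ETag rather than copying every key from `result`.
return (err) ? reject(err) : resolve({ ETag: result.ETag, PartNumber: partNumber });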

Unexpected key 'ETag' found in params

The following code

var S3 = require('aws-sdk').S3
var S3S = require('s3-streams')

var zlib = require('zlib')

var compress = zlib.createGzip()

var upload = S3S.WriteStream(new S3(), {
  Bucket: 'my-bucket',
  Key: 'my-key'
})

// readStream is a multipart file upload
readStream.pipe(compress).pipe(upload)

throws the error:

"err": {
    "message": "Unexpected key 'ETag' found in params",
    "name": "UnexpectedParameter",
    "stack": "UnexpectedParameter: Unexpected key 'ETag' found in params
                at fail (my-application/node_modules/aws-sdk/lib/param_validator.js:115:37)
                at validateStructure (my-application/node_modules/aws-sdk/lib/param_validator.js:50:14)
                at validateMember (my-application/node_modules/aws-sdk/lib/param_validator.js:61:21)
                at validate (my-application/node_modules/aws-sdk/lib/param_validator.js:9:10)
                at Request.VALIDATE_PARAMETERS (my-application/node_modules/aws-sdk/lib/event_listeners.js:88:32)
                at Request.callListeners (my-application/node_modules/aws-sdk/lib/sequential_executor.js:100:18)
                at callNextListener (my-application/node_modules/aws-sdk/lib/sequential_executor.js:90:14)
                at my-application/node_modules/aws-sdk/lib/event_listeners.js:75:9
                at finish (my-application/node_modules/aws-sdk/lib/config.js:228:7)
                at my-application/node_modules/aws-sdk/lib/config.js:244:9
                at SharedIniFileCredentials.get (my-application/node_modules/aws-sdk/lib/credentials.js:126:7)
                at getAsyncCredentials (my-application/node_modules/aws-sdk/lib/config.js:238:24)
                at Config.getCredentials (my-application/node_modules/aws-sdk/lib/config.js:258:9)
                at Request.VALIDATE_CREDENTIALS (my-application/node_modules/aws-sdk/lib/event_listeners.js:70:26)
                at Request.callListeners (my-application/node_modules/aws-sdk/lib/sequential_executor.js:97:18)
                at Request.emit (my-application/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
                at Request.emit (my-application/node_modules/aws-sdk/lib/request.js:604:14)
                at Request.transition (my-application/node_modules/aws-sdk/lib/request.js:21:12)
                at AcceptorStateMachine.runTo (my-application/node_modules/aws-sdk/lib/state_machine.js:14:12)
                at Request.runTo (my-application/node_modules/aws-sdk/lib/request.js:369:15)
                at Request.send (my-application/node_modules/aws-sdk/lib/request.js:353:10)
                at makeRequest (my-application/node_modules/aws-sdk/lib/service.js:169:27)
                at svc.(anonymous function) [as completeMultipartUpload] (my-application/node_modules/aws-sdk/lib/service.js:402:21)
                at multipartPromise (my-application/node_modules/s3-streams/lib/multipart.js:149:13)
                at tryCatcher (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/util.js:24:31)
                at Promise._resolveFromResolver (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/promise.js:427:31)
                at new Promise (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/promise.js:53:37)
                at afterParts (my-application/node_modules/s3-streams/lib/multipart.js:148:11)
                at tryCatcher (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/util.js:24:31)
                at Promise._settlePromiseFromHandler (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/promise.js:454:31)
                at Promise._settlePromiseAt (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/promise.js:530:18)
                at Promise._settlePromiseAtPostResolution (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/promise.js:224:10)
                at Async._drainQueue (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/async.js:182:12)
                at Async._drainQueues (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/async.js:187:10)
                at Async.drainQueues (my-application/node_modules/s3-streams/node_modules/bluebird/js/main/async.js:15:14)
                at process._tickDomainCallback (node.js:492:13)",
    "code": "UnexpectedParameter"
  }

aws-sdk is at

"aws-sdk": "^2.1.29"

Incompatible with Node v10

When installing this module, the following error is returned.

error [email protected]: The engine "node" is incompatible with this module. Expected version "^1.2.0".

If installed with the --ignore-engines flag, it works without issue. Please relax your node engines restrictions in the package.json file.

Headers from S3 object?

I'm trying to stream the S3 object to the browser as a response, with correct Content-Type header.

I've got an S3 object with its Content-Type set, and I do:

  const stream = S3S.ReadStream(new S3(), {
    Bucket: config.aws_upload_bucket,
    Key: key
  })
  stream.pipe(res)

There doesn't seem to be any Content-Type, and I couldn't find any headers at all on the stream object.
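
A hedged workaround until the wrapper exposes response headers: fetch the object metadata with a separate headObject call (a standard aws-sdk method) and set the header yourself before piping. This sketch assumes a plain Node http res and the same config/key variables as above.

var params = { Bucket: config.aws_upload_bucket, Key: key };

new S3().headObject(params, function(err, metadata) {
  if (err) {
    res.statusCode = 404;
    return res.end();
  }
  res.setHeader('Content-Type', metadata.ContentType);
  S3S.ReadStream(new S3(), params).pipe(res);
});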

Race condition reading stream larger than highWaterMark

Hi,

I found a race condition when getting fast (i.e. cached) S3 responses: all of the 'readable' events fire before _read is ever called.

I used the following code (with some features removed; consume() is called from both request() and the 'readable' handler):

const _ = require('lodash');
const Stream = require('readable-stream');
const util = require('util');

/**
 * @constructor
 * @param {AWS.S3} client S3 client.
 * @param {Object} options AWS options.
 */
function S3ReadStream(client, options) {
  if (this instanceof S3ReadStream === false) {
    return new S3ReadStream(client, options);
  }
  if (!client || !_.isFunction(client.getObject)) {
    throw new TypeError();
  }
  if (!_.has(options, 'Bucket')) {
    throw new TypeError();
  }
  if (!_.has(options, 'Key')) {
    throw new TypeError();
  }
  Stream.Readable.call(this, _.assign({ highWaterMark: 4194304 }));
  this._more = 0;
  this.options = options;
  this.client = client;
}
util.inherits(S3ReadStream, Stream.Readable);

S3ReadStream.prototype.request = function request() {
  if (this.req) {
    return this.consume();
  }
  this.req = this.client.getObject(_.assign({}, this.options));
  this.stream = this.req
    .on('httpHeaders', statusCode => {
      if (statusCode >= 300) {
        this.emit('error', { statusCode: statusCode });
        return;
      }
    })
    .createReadStream()
    .on('end', () => {
      this.push(null);
    })
    .on('error', err => {
      this.emit('error', err);
    })
    .on('readable', () => {
      this.consume();
    });
  return this.stream;
};

S3ReadStream.prototype.consume = function consume() {
  let chunk;
  while (null !== (chunk = this.stream.read(this._more)) && this._more) {
    this._more -= chunk.length;
    this.push(chunk);
  }
};

S3ReadStream.prototype.pipe = function pipe() {
  return Stream.Readable.prototype.pipe.apply(this, arguments);
};

/**
 * @param {Number} size Amount of data to read.
 * @returns {Undefined} Nothing.
 * @see Stream.Readable._read
 */
S3ReadStream.prototype._read = function _read(size) {
  this._more += size;
  this.request();
};

module.exports = S3ReadStream;
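
For reference, a usage sketch of the patched class above; the file name, bucket, and key are hypothetical:

const AWS = require('aws-sdk');
const S3ReadStream = require('./s3-read-stream'); // the module above

new S3ReadStream(new AWS.S3(), { Bucket: 'my-bucket', Key: 'my-key' })
  .pipe(process.stdout);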
