
node-tar

Fast and full-featured Tar for Node.js

The API is designed to mimic the behavior of tar(1) on unix systems. If you are familiar with how tar works, most of this will hopefully be straightforward for you. If not, then hopefully this module can teach you useful unix skills that may come in handy someday :)

Background

A "tar file" or "tarball" is an archive of file system entries (directories, files, links, etc.) The name comes from "tape archive". If you run man tar on almost any Unix command line, you'll learn quite a bit about what it can do, and its history.

Tar has 5 main top-level commands:

  • c Create an archive
  • r Replace entries within an archive
  • u Update entries within an archive (ie, replace if they're newer)
  • t List out the contents of an archive
  • x Extract an archive to disk

The other flags and options modify how this top level function works.

High-Level API

These 5 functions are the high-level API. All of them have a single-character name (for unix nerds familiar with tar(1)) as well as a long name (for everyone else).

All the high-level functions take the following arguments, all three of which are optional and may be omitted.

  1. options - An optional object specifying various options
  2. paths - An array of paths to add or extract
  3. callback - Called when the command is completed, if async. (If sync or no file specified, providing a callback throws a TypeError.)

If the command is sync (ie, if options.sync=true), then the callback is not allowed, since the action will be completed immediately.

If a file argument is specified, and the command is async, then a Promise is returned. In this case, if async, a callback may be provided which is called when the command is completed.

If a file option is not specified, then a stream is returned. For create, this is a readable stream of the generated archive. For list and extract this is a writable stream that an archive should be written into. If a file is not specified, then a callback is not allowed, because you're already getting a stream to work with.

replace and update only work on existing archives, and so require a file argument.

Sync commands without a file argument return a stream that acts on its input immediately in the same tick. For readable streams, this means that all of the data is immediately available by calling stream.read(). For writable streams, it will be acted upon as soon as it is provided, but this can be at any time.

Warnings and Errors

Tar emits warnings and errors for recoverable and unrecoverable situations, respectively. In many cases, a warning only affects a single entry in an archive, or is simply informing you that it's modifying an entry to comply with the settings provided.

Unrecoverable errors will always raise an error (ie, emit 'error' on streaming actions, throw for non-streaming sync actions, reject the returned Promise for non-streaming async operations, or call a provided callback with an Error as the first argument). Recoverable warnings will raise an error only if strict: true is set in the options.

Respond to (recoverable) warnings by listening to the warn event. Handlers receive 3 arguments:

  • code String. One of the error codes below. This may not match data.code, which preserves the original error code from fs and zlib.
  • message String. More details about the error.
  • data Metadata about the error. An Error object for errors raised by fs and zlib. All fields are attached to errors raised by tar. Typically contains the following fields, as relevant:
    • tarCode The tar error code.
    • code Either the tar error code, or the error code set by the underlying system.
    • file The archive file being read or written.
    • cwd Working directory for creation and extraction operations.
    • entry The entry object (if it could be created) for TAR_ENTRY_INFO, TAR_ENTRY_INVALID, and TAR_ENTRY_ERROR warnings.
    • header The header object (if it could be created, and the entry could not be created) for TAR_ENTRY_INFO and TAR_ENTRY_INVALID warnings.
    • recoverable Boolean. If false, then the warning will emit an error, even in non-strict mode.

Error Codes

  • TAR_ENTRY_INFO An informative error indicating that an entry is being modified, but otherwise processed normally. For example, removing / or C:\ from absolute paths if preservePaths is not set.

  • TAR_ENTRY_INVALID An indication that a given entry is not a valid tar archive entry, and will be skipped. This occurs when:

    • a checksum fails,
    • a linkpath is missing for a link type, or
    • a linkpath is provided for a non-link type.

    If every entry in a parsed archive raises a TAR_ENTRY_INVALID error, then the archive is presumed to be unrecoverably broken, and TAR_BAD_ARCHIVE will be raised.

  • TAR_ENTRY_ERROR The entry appears to be a valid tar archive entry, but encountered an error which prevented it from being unpacked. This occurs when:

    • an unrecoverable fs error happens during unpacking,
    • an entry is trying to extract into an excessively deep location (by default, limited to 1024 subfolders),
    • an entry has .. in the path and preservePaths is not set, or
    • an entry is extracting through a symbolic link, when preservePaths is not set.
  • TAR_ENTRY_UNSUPPORTED An indication that a given entry is a valid archive entry, but of a type that is unsupported, and so will be skipped in archive creation or extracting.

  • TAR_ABORT When parsing gzip-encoded archives, the parser will abort the parse process and raise a warning for any zlib errors encountered. Aborts are considered unrecoverable for both parsing and unpacking.

  • TAR_BAD_ARCHIVE The archive file is totally hosed. This can happen for a number of reasons, and always occurs at the end of a parse or extract:

    • An entry body was truncated before seeing the full number of bytes.
    • The archive contained only invalid entries, indicating that it is likely not an archive, or at least, not an archive this library can parse.

    TAR_BAD_ARCHIVE is considered informative for parse operations, but unrecoverable for extraction. Note that, if encountered at the end of an extraction, tar WILL still have extracted as much as it could from the archive, so there may be some garbage files to clean up.

Errors that occur deeper in the system (ie, either the filesystem or zlib) will have their error codes left intact, and a tarCode matching one of the above will be added to the warning metadata or the raised error object.

Errors generated by tar will have one of the above codes set as the error.code field as well, but since errors originating in zlib or fs will have their original codes, it's better to read error.tarCode if you wish to see how tar is handling the issue.

Examples

The API mimics the tar(1) command line functionality, with aliases for more human-readable option and function names. The goal is that if you know how to use tar(1) in Unix, then you know how to use import('tar') in JavaScript.

To replicate tar czf my-tarball.tgz files and folders, you'd do:

import { create } from 'tar'
create(
  {
    gzip: <true|gzip options>,
    file: 'my-tarball.tgz'
  },
  ['some', 'files', 'and', 'folders']
).then(_ => { .. tarball has been created .. })

To replicate tar cz files and folders > my-tarball.tgz, you'd do:

// if you're familiar with the tar(1) cli flags, this can be nice
import fs from 'node:fs'
import * as tar from 'tar'
tar.c(
  {
    // 'z' is alias for 'gzip' option
    z: <true|gzip options>
  },
  ['some', 'files', 'and', 'folders']
).pipe(fs.createWriteStream('my-tarball.tgz'))

To replicate tar xf my-tarball.tgz you'd do:

tar.x( // or `tar.extract`
  {
    // or `file:`
    f: 'my-tarball.tgz'
  }
).then(_ => { .. tarball has been dumped in cwd .. })

To replicate cat my-tarball.tgz | tar x -C some-dir --strip=1:

fs.createReadStream('my-tarball.tgz').pipe(
  tar.x({
    strip: 1,
    C: 'some-dir', // alias for cwd:'some-dir', also ok
  }),
)

To replicate tar tf my-tarball.tgz, do this:

tar.t({
  file: 'my-tarball.tgz',
  onentry: entry => { .. do whatever with it .. }
})

For example, to just get the list of filenames from an archive:

const getEntryFilenames = async tarballFilename => {
  const filenames = []
  await tar.t({
    file: tarballFilename,
    onentry: entry => filenames.push(entry.path),
  })
  return filenames
}

To replicate cat my-tarball.tgz | tar t do:

fs.createReadStream('my-tarball.tgz')
  .pipe(tar.t())
  .on('entry', entry => { .. do whatever with it .. })

To do anything synchronous, add sync: true to the options. Note that sync functions don't take a callback and don't return a promise. When the function returns, it's already done. Sync methods without a file argument return a sync stream, which flushes immediately. But, of course, it still won't be done until you .end() it.

const getEntryFilenamesSync = tarballFilename => {
  const filenames = []
  tar.t({
    file: tarballFilename,
    onentry: entry => filenames.push(entry.path),
    sync: true,
  })
  return filenames
}

To filter entries, add filter: <function> to the options. Tar-creating methods call the filter with filter(path, stat). Tar-reading methods (including extraction) call the filter with filter(path, entry). The filter is called in the this-context of the Pack or Unpack stream object.

The arguments list to tar t and tar x specifies a list of filenames to extract or list, so it's equivalent to a filter that tests if the file is in the list.

For those who aren't fans of tar's single-character command names:

tar.c === tar.create
tar.r === tar.replace (appends to archive, file is required)
tar.u === tar.update (appends if newer, file is required)
tar.x === tar.extract
tar.t === tar.list

Keep reading for all the command descriptions and options, as well as the low-level API that they are built on.

tar.c(options, fileList, callback) [alias: tar.create]

Create a tarball archive.

The fileList is an array of paths to add to the tarball. Adding a directory also adds its children recursively.

An entry in fileList that starts with an @ symbol is a tar archive whose entries will be added. To add a file that starts with @, prepend it with ./.

The following options are supported:

  • file Write the tarball archive to the specified filename. If this is specified, then the callback will be fired when the file has been written, and a promise will be returned that resolves when the file is written. If a filename is not specified, then a Readable Stream will be returned which will emit the file data. [Alias: f]
  • sync Act synchronously. If this is set, then any provided file will be fully written after the call to tar.c. If this is set, and a file is not provided, then the resulting stream will already have the data ready to read or emit('data') as soon as you request it.
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
  • strict Treat warnings as crash-worthy errors. Default false.
  • cwd The current working directory for creating the archive. Defaults to process.cwd(). [Alias: C]
  • prefix A path portion to prefix onto the entries in the archive.
  • gzip Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip() [Alias: z]
  • filter A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
  • portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
  • preservePaths Allow absolute paths. By default, / is stripped from absolute paths. [Alias: P]
  • mode The mode to set on the created file archive
  • noDirRecurse Do not recursively archive the contents of directories. [Alias: n]
  • follow Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such. [Alias: L, h]
  • noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
  • noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive. [Alias: m, no-mtime]
  • mtime Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.

The following options are mostly internal, but can be modified in some advanced use cases, such as re-using caches between runs.

  • linkCache A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
  • statCache A Map object that caches calls to lstat.
  • readdirCache A Map object that caches calls to readdir.
  • jobs A number specifying how many concurrent jobs to run. Defaults to 4.
  • maxReadSize The maximum buffer size for fs.read() operations. Defaults to 16 MB.

tar.x(options, fileList, callback) [alias: tar.extract]

Extract a tarball archive.

The fileList is an array of paths to extract from the tarball. If no paths are provided, then all the entries are extracted.

If the archive is gzipped, then tar will detect this and unzip it.

Note that all directories that are created will be forced to be writable, readable, and listable by their owner, to avoid cases where a directory prevents extraction of child entries by virtue of its mode.

Most extraction errors will cause a warn event to be emitted. If the cwd is missing, or not a directory, then the extraction will fail completely.

The following options are supported:

  • cwd Extract files relative to the specified directory. Defaults to process.cwd(). If provided, this must exist and must be a directory. [Alias: C]
  • file The archive file to extract. If not specified, then a Writable stream is returned where the archive data should be written. [Alias: f]
  • sync Create files and directories synchronously.
  • strict Treat warnings as crash-worthy errors. Default false.
  • filter A function that gets called with (path, entry) for each entry being unpacked. Return true to unpack the entry from the archive, or false to skip it.
  • newer Set to true to keep the existing file on disk if it's newer than the file in the archive. [Alias: keep-newer, keep-newer-files]
  • keep Do not overwrite existing files. In particular, if a file appears more than once in an archive, later copies will not overwrite earlier copies. [Alias: k, keep-existing]
  • preservePaths Allow absolute paths, paths containing .., and extracting through symbolic links. By default, / is stripped from absolute paths, .. paths are not extracted, and any file whose location would be modified by a symbolic link is not extracted. [Alias: P]
  • unlink Unlink files before creating them. Without this option, tar overwrites existing files, which preserves existing hardlinks. With this option, existing hardlinks will be broken, as will any symlink that would affect the location of an extracted file. [Alias: U]
  • strip Remove the specified number of leading path elements. Pathnames with fewer elements will be silently skipped. Note that the pathname is edited after applying the filter, but before security checks. [Alias: strip-components, stripComponents]
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
  • preserveOwner If true, tar will set the uid and gid of extracted entries to the uid and gid fields in the archive. This defaults to true when run as root, and false otherwise. If false, then files and directories will be set with the owner and group of the user running the process. This is similar to -p in tar(1), but ACLs and other system-specific data is never unpacked in this implementation, and modes are set by default already. [Alias: p]
  • uid Set to a number to force ownership of all extracted files and folders, and all implicitly created directories, to be owned by the specified user id, regardless of the uid field in the archive. Cannot be used along with preserveOwner. Requires also setting a gid option.
  • gid Set to a number to force ownership of all extracted files and folders, and all implicitly created directories, to be owned by the specified group id, regardless of the gid field in the archive. Cannot be used along with preserveOwner. Requires also setting a uid option.
  • noMtime Set to true to omit writing mtime value for extracted entries. [Alias: m, no-mtime]
  • transform Provide a function that takes an entry object, and returns a stream, or any falsey value. If a stream is provided, then that stream's data will be written instead of the contents of the archive entry. If a falsey value is provided, then the entry is written to disk as normal. (To exclude items from extraction, use the filter option described above.)
  • onentry A function that gets called with (entry) for each entry that passes the filter.
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
  • chmod Set to true to call fs.chmod() to ensure that the extracted file matches the entry mode. This may necessitate a call to the deprecated and thread-unsafe process.umask() method to determine the default umask value, unless a processUmask option is also provided. Otherwise tar will extract with whatever mode is provided, and let the process umask apply normally.
  • processUmask Set to an explicit numeric value to avoid calling process.umask() when chmod: true is set.
  • maxDepth The maximum depth of subfolders to extract into. This defaults to 1024. Anything deeper than the limit will raise a warning and skip the entry. Set to Infinity to remove the limitation.

The following options are mostly internal, but can be modified in some advanced use cases, such as re-using caches between runs.

  • maxReadSize The maximum buffer size for fs.read() operations. Defaults to 16 MB.
  • umask Filter the modes of entries like process.umask().
  • dmode Default mode for directories
  • fmode Default mode for files
  • dirCache A Map object of which directories exist.
  • maxMetaEntrySize The maximum size of meta entries that is supported. Defaults to 1 MB.

Note that using an asynchronous stream type with the transform option will cause undefined behavior in sync extractions. MiniPass-based streams are designed for this use case.

tar.t(options, fileList, callback) [alias: tar.list]

List the contents of a tarball archive.

The fileList is an array of paths to list from the tarball. If no paths are provided, then all the entries are listed.

If the archive is gzipped, then tar will detect this and unzip it.

If the file option is not provided, then returns an event emitter that emits entry events with tar.ReadEntry objects. However, they don't emit 'data' or 'end' events. (If you want to get actual readable entries, use the tar.Parse class instead.)

If a file option is provided, then the return value will be a promise that resolves when the file has been fully traversed in async mode, or undefined if sync: true is set. Thus, you must specify an onentry method in order to do anything useful with the data it parses.

The following options are supported:

  • file The archive file to list. If not specified, then a Writable stream is returned where the archive data should be written. [Alias: f]
  • sync Read the specified file synchronously. (This has no effect when a file option isn't specified, because entries are emitted as fast as they are parsed from the stream anyway.)
  • strict Treat warnings as crash-worthy errors. Default false.
  • filter A function that gets called with (path, entry) for each entry being listed. Return true to emit the entry from the archive, or false to skip it.
  • onentry A function that gets called with (entry) for each entry that passes the filter. This is important for when file is set, because there is no other way to do anything useful with this method.
  • maxReadSize The maximum buffer size for fs.read() operations. Defaults to 16 MB.
  • noResume By default, entry streams are resumed immediately after the call to onentry. Set noResume: true to suppress this behavior. Note that by opting into this, the stream will never complete until the entry data is consumed.
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")

tar.u(options, fileList, callback) [alias: tar.update]

Add files to an archive if they are newer than the entry already in the tarball archive.

The fileList is an array of paths to add to the tarball. Adding a directory also adds its children recursively.

An entry in fileList that starts with an @ symbol is a tar archive whose entries will be added. To add a file that starts with @, prepend it with ./.

The following options are supported:

  • file Required. Write the tarball archive to the specified filename. [Alias: f]
  • sync Act synchronously. If this is set, then any provided file will be fully written after the call to tar.u.
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
  • strict Treat warnings as crash-worthy errors. Default false.
  • cwd The current working directory for adding entries to the archive. Defaults to process.cwd(). [Alias: C]
  • prefix A path portion to prefix onto the entries in the archive.
  • gzip Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip() [Alias: z]
  • filter A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
  • portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
  • preservePaths Allow absolute paths. By default, / is stripped from absolute paths. [Alias: P]
  • maxReadSize The maximum buffer size for fs.read() operations. Defaults to 16 MB.
  • noDirRecurse Do not recursively archive the contents of directories. [Alias: n]
  • follow Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such. [Alias: L, h]
  • noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
  • noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive. [Alias: m, no-mtime]
  • mtime Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.

tar.r(options, fileList, callback) [alias: tar.replace]

Add files to an existing archive. Because later entries override earlier entries, this effectively replaces any existing entries.

The fileList is an array of paths to add to the tarball. Adding a directory also adds its children recursively.

An entry in fileList that starts with an @ symbol is a tar archive whose entries will be added. To add a file that starts with @, prepend it with ./.

The following options are supported:

  • file Required. Write the tarball archive to the specified filename. [Alias: f]
  • sync Act synchronously. If this is set, then any provided file will be fully written after the call to tar.r.
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
  • strict Treat warnings as crash-worthy errors. Default false.
  • cwd The current working directory for adding entries to the archive. Defaults to process.cwd(). [Alias: C]
  • prefix A path portion to prefix onto the entries in the archive.
  • gzip Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip() [Alias: z]
  • filter A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
  • portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
  • preservePaths Allow absolute paths. By default, / is stripped from absolute paths. [Alias: P]
  • maxReadSize The maximum buffer size for fs.read() operations. Defaults to 16 MB.
  • noDirRecurse Do not recursively archive the contents of directories. [Alias: n]
  • follow Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such. [Alias: L, h]
  • noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
  • noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive. [Alias: m, no-mtime]
  • mtime Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.

Low-Level API

class tar.Pack

A readable tar stream.

Has all the standard readable stream interface stuff. 'data' and 'end' events, read() method, pause() and resume(), etc.

constructor(options)

The following options are supported:

  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")

  • strict Treat warnings as crash-worthy errors. Default false.

  • cwd The current working directory for creating the archive. Defaults to process.cwd().

  • prefix A path portion to prefix onto the entries in the archive.

  • gzip Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip()

  • filter A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.

  • portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.

  • preservePaths Allow absolute paths. By default, / is stripped from absolute paths.

  • linkCache A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.

  • statCache A Map object that caches calls to lstat.

  • readdirCache A Map object that caches calls to readdir.

  • jobs A number specifying how many concurrent jobs to run. Defaults to 4.

  • maxReadSize The maximum buffer size for fs.read() operations. Defaults to 16 MB.

  • noDirRecurse Do not recursively archive the contents of directories.

  • follow Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such.

  • noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.

  • noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.

  • mtime Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.

add(path)

Adds an entry to the archive. Returns the Pack stream.

write(path)

Adds an entry to the archive. Returns true if flushed.

end()

Finishes the archive.

class tar.Pack.Sync

Synchronous version of tar.Pack.

class tar.Unpack

A writable stream that unpacks a tar archive onto the file system.

All the normal writable stream stuff is supported. write() and end() methods, 'drain' events, etc.

Note that all directories that are created will be forced to be writable, readable, and listable by their owner, to avoid cases where a directory prevents extraction of child entries by virtue of its mode.

'close' is emitted when it's done writing stuff to the file system.

Most unpack errors will cause a warn event to be emitted. If the cwd is missing, or not a directory, then an error will be emitted.

constructor(options)

  • cwd Extract files relative to the specified directory. Defaults to process.cwd(). If provided, this must exist and must be a directory.
  • filter A function that gets called with (path, entry) for each entry being unpacked. Return true to unpack the entry from the archive, or false to skip it.
  • newer Set to true to keep the existing file on disk if it's newer than the file in the archive.
  • keep Do not overwrite existing files. In particular, if a file appears more than once in an archive, later copies will not overwrite earlier copies.
  • preservePaths Allow absolute paths, paths containing .., and extracting through symbolic links. By default, / is stripped from absolute paths, .. paths are not extracted, and any file whose location would be modified by a symbolic link is not extracted.
  • unlink Unlink files before creating them. Without this option, tar overwrites existing files, which preserves existing hardlinks. With this option, existing hardlinks will be broken, as will any symlink that would affect the location of an extracted file.
  • strip Remove the specified number of leading path elements. Pathnames with fewer elements will be silently skipped. Note that the pathname is edited after applying the filter, but before security checks.
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
  • umask Filter the modes of entries like process.umask().
  • dmode Default mode for directories
  • fmode Default mode for files
  • dirCache A Map object of which directories exist.
  • maxMetaEntrySize The maximum size of meta entries that is supported. Defaults to 1 MB.
  • preserveOwner If true, tar will set the uid and gid of extracted entries to the uid and gid fields in the archive. This defaults to true when run as root, and false otherwise. If false, then files and directories will be set with the owner and group of the user running the process. This is similar to -p in tar(1), but ACLs and other system-specific data is never unpacked in this implementation, and modes are set by default already.
  • win32 True if on a windows platform. Causes behavior where filenames containing <|>? chars are converted to windows-compatible values while being unpacked.
  • uid Set to a number to force ownership of all extracted files and folders, and all implicitly created directories, to be owned by the specified user id, regardless of the uid field in the archive. Cannot be used along with preserveOwner. Requires also setting a gid option.
  • gid Set to a number to force ownership of all extracted files and folders, and all implicitly created directories, to be owned by the specified group id, regardless of the gid field in the archive. Cannot be used along with preserveOwner. Requires also setting a uid option.
  • noMtime Set to true to omit writing mtime value for extracted entries.
  • transform Provide a function that takes an entry object, and returns a stream, or any falsey value. If a stream is provided, then that stream's data will be written instead of the contents of the archive entry. If a falsey value is provided, then the entry is written to disk as normal. (To exclude items from extraction, use the filter option described above.)
  • strict Treat warnings as crash-worthy errors. Default false.
  • onentry A function that gets called with (entry) for each entry that passes the filter.
  • chmod Set to true to call fs.chmod() to ensure that the extracted file matches the entry mode. This may necessitate a call to the deprecated and thread-unsafe process.umask() method to determine the default umask value, unless a processUmask option is also provided. Otherwise tar will extract with whatever mode is provided, and let the process umask apply normally.
  • processUmask Set to an explicit numeric value to avoid calling process.umask() when chmod: true is set.
  • maxDepth The maximum depth of subfolders to extract into. This defaults to 1024. Anything deeper than the limit will raise a warning and skip the entry. Set to Infinity to remove the limitation.
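The umask, dmode, and fmode options above reduce to simple bit arithmetic. A minimal pure-JavaScript sketch (the concrete numbers are illustrative defaults, not values node-tar hard-codes):

```javascript
// Illustrative sketch: how a umask masks default modes during extraction.
const umask = 0o22   // a typical process umask
const fmode = 0o666  // assumed default mode for files
const dmode = 0o777  // assumed default mode for directories

// Clearing the umask bits yields the mode actually applied on disk.
const fileMode = fmode & ~umask // 0o644 → rw-r--r--
const dirMode  = dmode & ~umask // 0o755 → rwxr-xr-x

console.log(fileMode.toString(8), dirMode.toString(8)) // 644 755
```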

class tar.Unpack.Sync

Synchronous version of tar.Unpack.

Note that using an asynchronous stream type with the transform option will cause undefined behavior in sync unpack streams. MiniPass-based streams are designed for this use case.

class tar.Parse

A writable stream that parses a tar archive stream. All the standard writable stream stuff is supported.

If the archive is gzipped, then tar will detect this and unzip it.

Emits 'entry' events with tar.ReadEntry objects, which are themselves readable streams that you can pipe wherever.

Each entry will not emit until the one before it is flushed through, so make sure to either consume the data (with on('data', ...) or .pipe(...)) or throw it away with .resume() to keep the stream flowing.

constructor(options)

Returns an event emitter that emits entry events with tar.ReadEntry objects.

The following options are supported:

  • strict Treat warnings as crash-worthy errors. Default false.
  • filter A function that gets called with (path, entry) for each entry being listed. Return true to emit the entry from the archive, or false to skip it.
  • onentry A function that gets called with (entry) for each entry that passes the filter.
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")

abort(error)

Stop all parsing activities. This is called when there are zlib errors. It also emits an unrecoverable warning with the error provided.

class tar.ReadEntry extends MiniPass

A representation of an entry that is being read out of a tar archive.

It has the following fields:

  • extended The extended metadata object provided to the constructor.
  • globalExtended The global extended metadata object provided to the constructor.
  • remain The number of bytes remaining to be written into the stream.
  • blockRemain The number of 512-byte blocks remaining to be written into the stream.
  • ignore Whether this entry should be ignored.
  • meta True if this represents metadata about the next entry, false if it represents a filesystem object.
  • All the fields from the header, extended header, and global extended header are added to the ReadEntry object. So it has path, type, size, mode, and so on.

constructor(header, extended, globalExtended)

Create a new ReadEntry object with the specified header, extended header, and global extended header values.

class tar.WriteEntry extends MiniPass

A representation of an entry that is being written from the file system into a tar archive.

Emits data for the Header, and for the Pax Extended Header if one is required, as well as any body data.

Creating a WriteEntry for a directory does not also create WriteEntry objects for all of the directory contents.

It has the following fields:

  • path The path field that will be written to the archive. By default, this is also the path from the cwd to the file system object.
  • portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
  • myuid If supported, the uid of the user running the current process.
  • myuser The env.USER string if set, or ''. Set as the entry uname field if the file's uid matches this.myuid.
  • maxReadSize The maximum buffer size for fs.read() operations. Defaults to 1 MB.
  • linkCache A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
  • statCache A Map object that caches calls to lstat.
  • preservePaths Allow absolute paths. By default, / is stripped from absolute paths.
  • cwd The current working directory for creating the archive. Defaults to process.cwd().
  • absolute The absolute path to the entry on the filesystem. By default, this is path.resolve(this.cwd, this.path), but it can be overridden explicitly.
  • strict Treat warnings as crash-worthy errors. Default false.
  • win32 True if on a windows platform. Causes behavior where paths replace \ with / and filenames containing the windows-compatible forms of <|>?: characters are converted to actual <|>?: characters in the archive.
  • noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
  • noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.

constructor(path, options)

path is the path of the entry as it is written in the archive.

The following options are supported:

  • portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
  • maxReadSize The maximum buffer size for fs.read() operations. Defaults to 1 MB.
  • linkCache A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
  • statCache A Map object that caches calls to lstat.
  • preservePaths Allow absolute paths. By default, / is stripped from absolute paths.
  • cwd The current working directory for creating the archive. Defaults to process.cwd().
  • absolute The absolute path to the entry on the filesystem. By default, this is path.resolve(this.cwd, this.path), but it can be overridden explicitly.
  • strict Treat warnings as crash-worthy errors. Default false.
  • win32 True if on a windows platform. Causes behavior where paths replace \ with /.
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
  • noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.
  • umask Set to restrict the modes on the entries in the archive, somewhat like how umask works on file creation. Defaults to process.umask() on unix systems, or 0o22 on Windows.

warn(message, data)

If strict, emit an error with the provided message.

Otherwise, emit a 'warn' event with the provided message and data.

class tar.WriteEntry.Sync

Synchronous version of tar.WriteEntry

class tar.WriteEntry.Tar

A version of tar.WriteEntry that gets its data from a tar.ReadEntry instead of from the filesystem.

constructor(readEntry, options)

readEntry is the entry being read out of another archive.

The following options are supported:

  • portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
  • preservePaths Allow absolute paths. By default, / is stripped from absolute paths.
  • strict Treat warnings as crash-worthy errors. Default false.
  • onwarn A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
  • noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.

class tar.Header

A class for reading and writing header blocks.

It has the following fields:

  • nullBlock True if decoding a block which is entirely composed of 0x00 null bytes. (Useful because tar files are terminated by at least 2 null blocks.)
  • cksumValid True if the checksum in the header is valid, false otherwise.
  • needPax True if the values, as encoded, will require a Pax extended header.
  • path The path of the entry.
  • mode The 4 lowest-order octal digits of the file mode. That is, read/write/execute permissions for world, group, and owner, and the setuid, setgid, and sticky bits.
  • uid Numeric user id of the file owner
  • gid Numeric group id of the file owner
  • size Size of the file in bytes
  • mtime Modified time of the file
  • cksum The checksum of the header. This is generated by adding all the bytes of the header block, treating the checksum field itself as all ascii space characters (that is, 0x20).
  • type The human-readable name of the type of entry this represents, or the alphanumeric key if unknown.
  • typeKey The alphanumeric key for the type of entry this header represents.
  • linkpath The target of Link and SymbolicLink entries.
  • uname Human-readable user name of the file owner
  • gname Human-readable group name of the file owner
  • devmaj The major portion of the device number. Always 0 for files, directories, and links.
  • devmin The minor portion of the device number. Always 0 for files, directories, and links.
  • atime File access time.
  • ctime File change time.
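The cksum rule above can be sketched in a few lines of plain JavaScript. The offsets follow the standard ustar header layout (the 8-byte checksum field starts at byte 148); treat this as an illustration, not node-tar's implementation:

```javascript
// Sum all 512 bytes of a header block, treating the 8-byte checksum
// field (offsets 148-155) as ASCII spaces (0x20).
function headerCksum(block) {
  let sum = 0
  for (let i = 0; i < 512; i++) {
    sum += (i >= 148 && i < 156) ? 0x20 : block[i]
  }
  return sum
}

// An all-zero block sums to 8 * 0x20 = 256.
console.log(headerCksum(Buffer.alloc(512))) // 256
```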

constructor(data, [offset=0])

data is optional. It is either a Buffer that should be interpreted as a tar Header starting at the specified offset and continuing for 512 bytes, or a data object of keys and values to set on the header object, and eventually encode as a tar Header.

decode(block, offset)

Decode the provided buffer starting at the specified offset.

Buffer length must be greater than 512 bytes.

set(data)

Set the fields in the data object.

encode(buffer, offset)

Encode the header fields into the buffer at the specified offset.

Returns this.needPax to indicate whether a Pax Extended Header is required to properly encode the specified data.

class tar.Pax

An object representing a set of key-value pairs in a Pax extended header entry.

It has the following fields. Where the same name is used, they have the same semantics as the tar.Header field of the same name.

  • global True if this represents a global extended header, or false if it is for a single entry.
  • atime
  • charset
  • comment
  • ctime
  • gid
  • gname
  • linkpath
  • mtime
  • path
  • size
  • uid
  • uname
  • dev
  • ino
  • nlink

constructor(object, global)

Set the fields found in the provided object. global is a boolean that defaults to false.

encode()

Return a Buffer containing the header and body for the Pax extended header entry, or null if there is nothing to encode.

encodeBody()

Return a string representing the body of the pax extended header entry.

encodeField(fieldName)

Return a string representing the key/value encoding for the specified fieldName, or '' if the field is unset.
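The pax key/value encoding is a record of the form "<length> <key>=<value>\n", where the leading decimal counts the whole record, including its own digits. A small sketch of that fixed-point calculation (illustrative, not node-tar's code):

```javascript
// Encode one pax record; the length prefix counts the entire record,
// including the digits of the length itself, so iterate until stable.
function paxField(key, value) {
  const rest = (' ' + key + '=' + value + '\n').length
  let len = rest + 1 // first guess: a one-digit length prefix
  while (String(len).length + rest !== len) {
    len = String(len).length + rest
  }
  return len + ' ' + key + '=' + value + '\n'
}

console.log(JSON.stringify(paxField('path', 'foo'))) // "12 path=foo\n"
```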

tar.Pax.parse(string, extended, global)

Return a new Pax object created by parsing the contents of the string provided.

If the extended object is set, then also add the fields from that object. (This is necessary because multiple metadata entries can occur in sequence.)

tar.types

A translation table for the type field in tar headers.

tar.types.name.get(code)

Get the human-readable name for a given alphanumeric code.

tar.types.code.get(name)

Get the alphanumeric code for a given human-readable name.
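A sketch of the translation-table idea, using a few well-known ustar type codes (node-tar's real table covers more types):

```javascript
// Two lookups derived from one list of [code, name] pairs, mirroring
// the shape of tar.types.name and tar.types.code.
const pairs = [
  ['0', 'File'],
  ['2', 'SymbolicLink'],
  ['5', 'Directory'],
]
const name = new Map(pairs)
const code = new Map(pairs.map(([c, n]) => [n, c]))

console.log(name.get('0'))         // File
console.log(code.get('Directory')) // 5
```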


node-tar's Issues

Octal usage issue

I see that parts of tar make use of octal handling, but octal literals are restricted in strict mode. Could this become a risk in the future?

transform the path when packing

Is it possible to transform the path of each file as it goes into the tarball?
In other words, provide a mapping function that changes the path name that is attached to each file as it is added to the archive.

This can be done in the command line tar by using the --transform=expression option (see http://www.gnu.org/software/tar/manual/html_section/transform.html)

I have a need to prefix each file with an extra folder name.

I can't see an obvious way of achieving this without hacking the node-tar library.

tar.Extract hangs on large files

When using the extracter.js example as-is with tar.Extract on a large (~100mb) tar file (with plenty subfolders, small and large files) the extraction sometimes stops and the application hangs waiting for events that never come.

When I check the same tar with tar.Parse and log everything I see all the content entries and then the pipes finish as expected.

If I add some log handlers to the tar.Extract pipes I see it usually hangs on the larger files. With a bit of googling and browsing implementers I see mentions that if the file writer pauses the pipe node-tar also stops and won't emit new events.

I looked into some of the modules that use node-tar and they either use way more complicated code (like in npm) or reimplement the extraction based on tar.Parse and just buffer everything in memory.

An example tar that does this consistently on my workstation is the one in this .tar.bz2 archive (both the .bz2 as .tar work fine with plain cli tar + bunzip2). It usually hangs after reaching that 80mb file.

pack loses data when using pause() / resume()

It's always difficult to tell if I'm doing something wrong when I use pause/resume, but I think this is correct. If so, tar is buggy because it's emitting data events while paused. If I'm doing something wrong, please let me know. Here's my simplest repro code:

var tar = require('tar');
var fs = require('fs');
var fstream = require('fstream');
var child_process = require('child_process');

var reader = fstream.Reader('./somedir/');
var pack = tar.Pack();
var _pack = pack;
var extractProc = child_process['spawn']('tar',['xv', '-C','/tmp'], {'stdio':['pipe',1,2]});

reader.pipe(pack);

// monkey-patch emit with logging
pack.emit = (function(orig) {
    return function(evt) {
        console.log("pack.emit("+evt+") -> " + pack.listeners(evt).length + " listeners");
        return orig.apply(this, arguments);
    }
})(pack.emit);

var readChunk = function() {
    var onData = function(data) {
        console.log("got " + data.length + "b");
        console.log("pause()");
        pack.pause();
        pack.removeListener('data', onData);

        var cont = function() {
            // emulate a slow reader
            setTimeout(readChunk, 100);
        };

        var written = function() {
            extractProc.stdin.removeListener('drain', written);
            cont();
        }
        var writtenImmediately = extractProc.stdin.write(data);
        console.log("writtenImmediately:", writtenImmediately);
        if(writtenImmediately) {
            cont();
        } else {
            console.log("(await drain ...)");
            extractProc.stdin.on('drain', written);
        }
    };
    pack.addListener('data', onData);
    console.log("resume()");
    pack.resume();
};
pack.addListener('end', function() {
    console.log('pack ended');
    //setTimeout(function() {
        console.log('calling end()');
        extractProc.stdin.end();
    //}, 1000);
});
readChunk();

extractProc.on('exit', function(rv) { console.log('done', rv); });

Without the setTimeout() it seems to work, but presumably only by chance, because the extract process happens to be keeping up with the reader. When there's some delay between calls to pause()/resume() you eventually end up with this sort of thing in the console output:

# (truncated ...)
pack.emit(data) -> 1 listeners
got 512b
pause()
pack.emit(pause) -> 0 listeners
writtenImmediately: true
resume()
pack.emit(data) -> 1 listeners
got 512b
pause()
pack.emit(pause) -> 0 listeners
pack.emit(data) -> 0 listeners
pack.emit(data) -> 0 listeners
pack.emit(data) -> 0 listeners
pack.emit(data) -> 0 listeners
pack.emit(data) -> 0 listeners
pack.emit(data) -> 0 listeners
# (badness and a corrupt tar file ensues)

As far as I understand it, in node 0.10+ a stream's pause method is not advisory. Wrapping the pack stream with stream.Readable.wrap does seem to work in the above code.

doesn't handle unknown types of entries. Should handle as normal files?

Hi

I am trying to extract several archives, but one of them doesn't extract the files. The following sample demonstrates the bug. The unzipping works as expected. The tar is correct, but the extraction doesn't work right:

var zlib = require('zlib')
  , tar = require('tar')
  , request = require('request');

request.get("http://curl.haxx.se/download/curl-7.28.1.tar.gz")
.pipe(zlib.createGunzip())
.pipe(tar.Extract({path: "./test"}))
.on("error", function(err){
  console.log("Error on extract", err);
})
.on("end", function(){
  console.log("done.");
});

When I parse, I see entries such as:

ignoredEntry?!? { path: 'curl-7.28.1/buildconf',
  mode: 493,
  uid: 1000,
  gid: 1000,
  size: 14743,
  mtime: Wed Apr 25 2012 17:29:20 GMT+0200 (CEST),
  cksum: 4866,
  type: '',
  linkpath: '',
  ustar: false }

the following code should be more flexible (with option?):

Parse.prototype._startEntry 
...
   default:
      EntryType = Entry
      ev = "entry"
      break

I use [email protected] on MacOSX 10.8 and [email protected].

Cheers

More documentation

There is not even nearly enough documentation for this.

  • How are Pack's options used? At least provide a link to something that makes sense about "Global Extended Headers"
  • How would I pack multiple files together?
  • How does Pack actually work? Do you have to use fstream?
  • Why pass in a Writer to Pack? I would have expected the usage to be input.pipe(tar).pipe(output), no?

Some examples would help too.

tar.Extract class should return an instance of fstream

For example:

Module tar.gz uses such construction to pipe all logic:

return fstream.Reader({[options]}).pipe(zlib.createGunzip())).pipe(tar.Extract().on("close", function);

So the point is that the ending event .on('close', function) doesn't get fired, because there is no return in the Extract class; there should be a return this._fst at the end of the function.

Extract does not fire error on corrupted tar

http://cl.ly/3j3v1o161M0m

The tar file above is a tar.gz file that failed while downloading. If passed into Extract() there's no end or error event being fired. It simply silently fails. I tracked down all the end events in streams and it gets fired correctly for every stream (gunzip, block stream, etc) except for _fst (filestream). Can this be related to the padding applied to block stream?

Sample code:

var fs = require('fs');
var zlib = require('zlib');
var tar = require('tar');

console.log('start');

fs.createReadStream('foo.tar.gz')
.on('end', console.log.bind(console, 'fs read end'))
.on('error', console.log.bind(console, 'error'))
.pipe(zlib.createGunzip())
.on('end', console.log.bind(console, 'gunzip end'))
.on('error', console.log.bind(console, 'error'))
.pipe(tar.Extract({
    path: 'foo'
}))
.on('error', console.log.bind(console, 'error'))
.on('end', console.log.bind(console, 'done'));

.Pack should have a way to use global space vs the directory prefix

Code such as the following:

fstream.Reader({ path: path.join(__dirname, './xyz'), type: "Directory" })
.pipe(tar.Pack({ noProprietary: true }))
.pipe(destination)

Will always prefix the entries of a tar with xyz/... , there should be a sane way to prevent this and set a default if you want to add a custom prefix

Silently fails to extract

I'm using the module npm-cache and for some reason it won't extract the modules.
It however works for the bower modules.
This is the code used for both type of modules:

  var extractor = tar.Extract({path: targetPath})
                     .on('error', onError)
                     .on('end', onEnd);

  fs.createReadStream(cachePath)
    .on('error', onError)
    .pipe(extractor);

Cache path: C:\Users\***\.package_cache\npm\3.6.0\6cd7ceb02dc80477b9c0084e50507a72.tar.gz
Size: 115MB
Target path: C:\TeamCity\buildAgent\work\6e8ae23de4bcc9d9\***SE_src\***.SKT\***.SKT.Api\***.SKT.Api\Application

I have double checked the paths. They are both correct.
I'm using Windows 2012 Server.

I don't get any error message. Is the error handling applied correctly?
If so, how come I don't get any error?

I have tried to extract the file manually. It works and the structure is fine.

tar.Pack breaks node JSON.stringify for buffers! REALLY ANNOYING!

tar.Pack method alters how JSON.stringify works with buffers. In the example below I stringify a buffer before tar.Pack and I get a result like this:

{"type":"Buffer","data":[212,44,136,22,122,74,202,144,84,253,59,173,15,210,27,89]}

after tar.Pack for the same buffer I get:

"�,�\u0016zJʐT�;�\u000f�\u001bY"

From then on, JSON.stringify always behaves incorrectly. You can try the code below.

var tar = require("tar");
var fs = require("fs");

var pathFile = "/Users/edoardo/porcaccio.json";
var pathTar = "/Users/edoardo/porcaccio.tar";

var array = [];
for (var i = 0; i < 16; i++) {
  array[i] = Math.floor((Math.random() * 256) + 1);
}
//
// create a new buffer from an array
var buffer = new Buffer(array);
console.log(JSON.stringify(buffer));
//
// create read and write strams
var rs = fs.createReadStream(pathFile);
var ws = fs.createWriteStream(pathTar);
//
// tar the file
rs.pipe(tar.Pack()).pipe(ws).on("finish",function () {
  console.log(JSON.stringify(buffer));
});

Can't extract tar archives with non-empty directories lacking +w permission

(Similar to #7.)

Reproduction:

$ git clone https://github.com/glasser/tar-unwritable-dir
$ cd tar-unwritable-dir
$ npm install
$ node tar-unwritable-dir.js 

events.js:72
        throw er; // Unhandled 'error' event
              ^
Error: EACCES, open '/var/folders/2k/tmccc7sj7pg2c0qt6646cz8r0000gn/T/tmp-659231kcne1s/foo/bar'

It's because node-tar/fstream chmods the directory before writing its children, I guess.

The 'end' event does not work!

Hi, I'm trying to unpack a tar file downloaded with the aws s3 sdk... all seems to be fine, except for the 'end' event, which doesn't work at all...

Here is my code:
self.main.getS3Client().getObject({Bucket: self.main.s3config.bucket, Key: key}, function(err, data) {

    if (err) { return callback(err); }

    console.log("WORK");

    var stream = new Stream({chunkSize: data.ContentLength});
    var extractor = tar.Extract({path: self.data.cwd});

    extractor.on('error', function(e) {
        return callback(e);
    });

    extractor.on('end', function() {
        console.log("END"); // It doesn't work!
    });

    stream.put(data.Body);
    stream.pipe(extractor);

});

It works but the end event doesn't get triggered

symbolic link gets corrupted

Example:

var request = require('request');
var tar = require('tar');
var zlib = require('zlib');

request('http://nodejs.org/dist/v0.12.1/node-v0.12.1-darwin-x64.tar.gz')
  .pipe(zlib.Gunzip())
  .pipe(tar.Extract({path: __dirname}));

Result:

» ls -lh node-v0.12.1-darwin-x64/bin
total 36328
-rwxr-xr-x  1 jose  staff    18M Mar 23 23:02 node
lrwxr-xr-x  1 jose  staff    73B Mar 23 23:02 npm -> /tmp/test-tar/lib/node_modules/npm/bin/npm-cli.js

It seems node-tar is resolving the sym link incorrectly and is missing some directories.

If I rather use tar(1):

» wget http://nodejs.org/dist/v0.12.1/node-v0.12.1-darwin-x64.tar.gz
» tar -xzf node-v0.12.1-darwin-x64.tar.gz
» ls -lh node-v0.12.1-darwin-x64/bin
total 36328
-rwxr-xr-x  1 jose  wheel    18M Mar 23 23:02 node
lrwxr-xr-x  1 jose  wheel    38B Mar 23 23:02 npm -> ../lib/node_modules/npm/bin/npm-cli.js

Questions:

  1. Am I doing something wrong?
  2. Is there a way to prevent node-tar changing relative paths to absolute?

Extract throws unhandled error if you remove error handler after the first error

So this is a little weird. Basically, I'm attaching an error handler to a tar Extract stream. If I attach it and leave it forever, it does what I'd expect:

var tar = require('tar');
var fs = require('fs');

var extract = tar.Extract({path: '/tmp'});
var errorHandler = function(e) {
    console.warn("Caught error from extract: " + e);
    setTimeout(function() { console.log("all quiet");}, 1000);
}
extract.on('error', errorHandler);

extract.write('not likely to be a tar');
extract.end();

$ node tar-test.js 
Caught error from extract: Error: invalid tar file
all quiet

However, if I am a good citizen and remove my error handler when it's no longer needed (I'm not just being pedantic, this is in library code which listens for various events and always makes sure to clean up after itself):

var tar = require('tar');
var fs = require('fs');

var extract = tar.Extract({path: '/tmp'});
var errorHandler = function(e) {
    console.warn("Caught error from extract: " + e);
    extract.removeListener('error', errorHandler);
}
extract.on('error', errorHandler);

extract.write('not likely to be a tar');
extract.end();


$ node tar-test.js 
Caught error from extract: Error: invalid tar file

stream.js:94
      throw er; // Unhandled stream error in pipe.
            ^
Error: invalid tar file
    at Extract.Parse._startEntry (/home/tim/dev/oni/stratifiedjs/node_modules/tar/lib/parse.js:145:13)
    at Extract.Parse._process (/home/tim/dev/oni/stratifiedjs/node_modules/tar/lib/parse.js:127:12)
    at BlockStream.<anonymous> (/home/tim/dev/oni/stratifiedjs/node_modules/tar/lib/parse.js:47:8)
    at BlockStream.emit (events.js:95:17)
    at BlockStream._emitChunk (/home/tim/dev/oni/stratifiedjs/node_modules/tar/node_modules/block-stream/block-stream.js:145:10)
    at BlockStream.resume (/home/tim/dev/oni/stratifiedjs/node_modules/tar/node_modules/block-stream/block-stream.js:58:15)
    at Extract.Reader.resume (/home/tim/dev/oni/stratifiedjs/node_modules/fstream/lib/reader.js:255:34)
    at DirWriter.<anonymous> (/home/tim/dev/oni/stratifiedjs/node_modules/tar/lib/extract.js:57:8)
    at DirWriter.emit (events.js:92:17)
    at /home/tim/dev/oni/stratifiedjs/node_modules/fstream/lib/dir-writer.js:39:8

Notice that it handles the error, then removes the error handler, then another error is thrown (which doesn't get emitted if I simply leave the error handler attached).

So I dunno, it definitely seems like a bug. Is there something I should be doing when I receive the first error to tell the stream "hope is lost, just shut down and stop erroring"? Either way, it seems pretty odd that the second error doesn't get sent to my handler, but gets raised when it's absent.

Falls over extracting .bin folder

First - thanks for the awesome library, works wonderfully.

Just hit an issue that I managed to work around through trial and error, which seems to occur on RedHat Linux (we are using 5.7). The issue, however, does not occur on MacOS.

If a tar contains a .bin folder then the following exception is raised, when attempting to extract the library.

fs.js:223
  binding.open(path, stringToFlags(flags), mode, callback);
          ^
TypeError: Bad argument
    at Object._open (fs.js:223:11)
    at open (/usr/local/steelmesh/node_modules/steelmesh-dash/node_modules/tar/node_modules/fstream/node_modules/graceful-fs/graceful-fs.js:73:6)
    at Object.open (/usr/local/steelmesh/node_modules/steelmesh-dash/node_modules/tar/node_modules/fstream/node_modules/graceful-fs/graceful-fs.js:67:3)
    at Object.lutimes (/usr/local/steelmesh/node_modules/steelmesh-dash/node_modules/tar/node_modules/fstream/node_modules/graceful-fs/graceful-fs.js:28:6)
    at setProps (/usr/local/steelmesh/node_modules/steelmesh-dash/node_modules/tar/node_modules/fstream/lib/writer.js:267:18)
    at Object.oncomplete (/usr/local/steelmesh/node_modules/steelmesh-dash/node_modules/tar/node_modules/fstream/lib/writer.js:205:7)

Feel free to flick me an email (damon.oehlman -at- sidelab.com) and I'll be able to send you an archive that replicates the problem. Example code of how I'm running the extract process is below (I'm using the write interface rather than piping streams):

fs.readFile(packageFile, function(err, input) {
    if (err) {
        callback({ errors: ['Could not read package' ]});
        return;
    }

    // unzip the file
    zlib.unzip(input, function(err, buffer) {
        if (err) {
            callback(_.extend(data, { errors: ['Could not unzip package '] }));
        }
        else {
            // we have an in memory tar... time to process it
            var extractor = new Extract({ path: _getPackageFolder(packageFile) });

            extractor.on('entry', function() {
                clearTimeout(callbackTimer);
                callbackTimer = setTimeout(callback, 100);
            });

            // END handling doesn't seem to be working...
            extractor.on('end', function() {
                callback();
            });

            extractor.write(buffer);
        }
    });
});

Thanks again for taking the time to write a pure JS tar library :)

Cheers,
Damon.

Error: Adding a cache directory to the cache will make to world implode

After running into this issue trying to install packages that were using tgz packages as dependencies, I tried to find where it could come from. I'm not sure I'm in the right place though, but here is what I get, in an empty folder:

$ npm install tar
npm http GET https://registry.npmjs.org/tar
npm http 304 https://registry.npmjs.org/tar
npm http GET https://registry.npmjs.org/tar/-/tar-0.1.17.tgz
npm http 200 https://registry.npmjs.org/tar/-/tar-0.1.17.tgz
npm ERR! Error: Adding a cache directory to the cache will make the world implode.
npm ERR!     at addLocalDirectory (/home/jeremie/lib/node_modules/npm/lib/cache.js:1060:45)
npm ERR!     at /home/jeremie/lib/node_modules/npm/lib/cache.js:1112:7
npm ERR!     at cb (/home/jeremie/lib/node_modules/npm/lib/utils/tar.js:144:7)
npm ERR!     at /home/jeremie/lib/node_modules/npm/lib/utils/tar.js:141:9
npm ERR!     at exports.unlock (/home/jeremie/lib/node_modules/npm/node_modules/lockfile/lockfile.js:43:43)
npm ERR!     at Object.oncomplete (fs.js:297:15)
npm ERR! If you need help, you may report this log at:
npm ERR!     <http://github.com/isaacs/npm/issues>
npm ERR! or email it to:
npm ERR!     <[email protected]>

npm ERR! System Linux 3.5.0-31-generic
npm ERR! command "/home/jeremie/bin/node" "/home/jeremie/bin/npm" "install" "tar"
npm ERR! cwd /home/jeremie/workspace/test
npm ERR! node -v v0.8.23
npm ERR! npm -v 1.2.18
npm ERR! 
npm ERR! Additional logging details can be found in:
npm ERR!     /home/jeremie/workspace/test/npm-debug.log
npm ERR! not ok code 0

Is it a problem on my end, or is it really broken?

github-generated tar files

I have trouble using this tar with a github-generated tar file. Steps:

mkdir node_modules
npm install tar
curl https://nodeload.github.com/documentcloud/underscore/tarball/1.3.1 > u.tar.gz
gunzip u.tar.gz
node tartest.js u.tar

where tartest.js looks like:

var fs = require('fs'),
    extract = require('tar').Extract,
    path = require('path'),
    fullpath = path.resolve(process.argv[1]);

console.log('untarring: ' + fullpath);

fs.createReadStream(fullpath)
    .pipe(extract({ path: 'ust' }))
    .on("error", function (err) { console.log('error: ' + err); })
    .on("end", function (msg) { console.log('DONE: ' + msg); });

I get the following output:

error: Error: invalid tar file
error: Error: unexpected eof
DONE: undefined

However, tar -xvf u.tar works on OS X.

I am still new to the code and tar, so it could very well be just user error.

tar.Extract is very slow

I'm trying to use node-tar to extract a tar file. The code is at:

https://github.com/raymondfeng/node-tar-perf/blob/master/untar.js

The performance is really bad compared to the tar command. For a 152MB tar with some big files, it took more than 1 min.

After debugging, I found out the tar entries are written out in chunks of 512 bytes. That is probably due to the tar format.

I did an experiment to add a buffered stream before sending to fs. The new code is:

https://github.com/raymondfeng/node-tar-perf/blob/master/untar-with-buffer.js

Now it only took around 1 second to extract the tar.

I initially tried to add the buffered stream into the node-tar code, but I had trouble getting it working. Maybe fstream is special.

I also wonder if the Node stream/fs APIs should have options to support buffering.
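The buffering trick above boils down to coalescing tar's 512-byte blocks into larger writes before they hit fs. A minimal sketch of that idea (coalesce is a hypothetical helper, not code from node-tar or the linked repos):

```javascript
// Merge a run of small chunks (tar emits 512-byte blocks) into a few
// large buffers, so the filesystem sees one big write per batch
// instead of hundreds of tiny ones.
function coalesce(chunks, batchSize) {
  const out = [];
  let pending = [];
  let pendingBytes = 0;
  for (const chunk of chunks) {
    pending.push(chunk);
    pendingBytes += chunk.length;
    if (pendingBytes >= batchSize) {
      out.push(Buffer.concat(pending, pendingBytes));
      pending = [];
      pendingBytes = 0;
    }
  }
  if (pendingBytes > 0) out.push(Buffer.concat(pending, pendingBytes));
  return out;
}
```

Wrapping the same logic in a Transform stream and piping the extractor's file output through it before fs is presumably what the untar-with-buffer experiment does.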

File name too long issue

I found that if a filename is too long, it may result in the filename being cut to a specific length and losing its file extension. I've tried cutting the filename so that its length is <= 100 (the GNU tar standard), but the problem remains.
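For context, the 100-character limit is the size of the name field in a plain ustar header; POSIX ustar also defines a 155-byte prefix field, so a long path can often be split at a slash rather than truncated. A sketch of that split (splitUstarPath is a hypothetical helper, not part of this module):

```javascript
// Try to fit a long path into plain ustar's 100-byte name field plus
// its 155-byte prefix field by splitting at a '/'. Returns null when
// the path cannot fit and needs a GNU longname or PAX extension.
function splitUstarPath(p) {
  if (Buffer.byteLength(p) <= 100) return { prefix: '', name: p };
  for (let i = p.length - 1; i > 0; i--) {
    if (p[i] !== '/') continue;
    const prefix = p.slice(0, i);
    const name = p.slice(i + 1);
    if (Buffer.byteLength(name) <= 100 && Buffer.byteLength(prefix) <= 155) {
      return { prefix: prefix, name: name };
    }
  }
  return null;
}
```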

Thank you.

Can't install on eCryptfs

This is an odd one, I'll grant that; but you can't install node-tar or even clone this repository in an eCryptfs subdirectory, which is used on Ubuntu (et al.) for encrypted home directories.

$ git clone https://github.com/isaacs/node-tar.git
Cloning into node-tar...
remote: Counting objects: 649, done.
remote: Compressing objects: 100% (231/231), done.
remote: Total 649 (delta 370), reused 647 (delta 368)
Receiving objects: 100% (649/649), 153.89 KiB | 68 KiB/s, done.
Resolving deltas: 100% (370/370), done.
error: unable to create file test/fixtures/200ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc (File name too long)
$ npm install tar
npm http GET https://registry.npmjs.org/tar
npm http 304 https://registry.npmjs.org/tar
npm http GET https://registry.npmjs.org/tar/-/tar-0.1.11.tgz
npm http 200 https://registry.npmjs.org/tar/-/tar-0.1.11.tgz

npm ERR! Error: UNKNOWN, unknown error '/home/rvagg/.npm/tar/0.1.11/___package.npm/package/test/fixtures/200ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc'
npm ERR! You may report this log at:
npm ERR!     <http://github.com/isaacs/npm/issues>
npm ERR! or email it to:
npm ERR!     <[email protected]>
npm ERR! 
npm ERR! System Linux 3.0.0-14-generic
npm ERR! command "node" "/usr/local/bin/npm" "install" "tar"
npm ERR! cwd /home/rvagg/git
npm ERR! node -v v0.6.6
npm ERR! npm -v 1.1.0
npm ERR! path /home/rvagg/.npm/tar/0.1.11/___package.npm/package/test/fixtures/200ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
npm ERR! code UNKNOWN
npm ERR! message UNKNOWN, unknown error '/home/rvagg/.npm/tar/0.1.11/___package.npm/package/test/fixtures/200ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc'
npm ERR! errno {}
npm ERR! 
npm ERR! Additional logging details can be found in:
npm ERR!     /home/rvagg/git/npm-debug.log
npm not ok

I only discovered this because of a failing npm update npm -g; the "UNKNOWN, unknown" isn't very helpful and took a bit of digging to figure out.

The issue is that eCryptfs can't handle filenames over 144 characters (at least on top of ext4, which has a limit of 255; I think it may be different if the underlying fs has a different filename length limit). The workaround for me was to install as root, whose home directory isn't set up as eCryptfs on Ubuntu (even using sudo, your HOME is preserved, so it still uses ~/.npm).

Does this particular filename need to be that long? GitHub even has random problems with the file when navigating to it: https://github.com/isaacs/node-tar/tree/master/test/fixtures

Fails with ENOENT: no such file or directory, open when filename contains "<"

This module fails to extract files whose names contain special characters such as <; when I use the tar command in Cygwin, however, these characters are automatically replaced.

The exact error I get is:

{ Error: ENOENT: no such file or directory, open 'C:\path\to\file\containing_<.json'
    at Error (native)
  errno: -4058,
  code: 'ENOENT',
  syscall: 'open',
  path: 'C:\\path\\to\\file\\containing_<.json' }

node-tar pack should allow the uid+gid to be overwritten

A common use case when creating a tar is to override the source filesystem uid+gid with 0:0 when creating the tar entries, so that when the tar is extracted by a root user it will correctly be written with root:root as the owner/group, rather than inheriting the uid+gid stored with the .tgz.

e.g.,

$ tar -zcv --owner=root --group=root --numeric-owner -f /tmp/filename.tgz package/
$ cd /tmp && sudo tar -zxf /tmp/filename.tgz
$ ls -l package/*
-rw-r--r-- 1 root root package/package.json
etc.

Currently, if I create a tar using node-tar and distribute it to someone who untars it as root, the files will be written with whatever uid/gid my user happens to have on the machine on which I created the tar.
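The requested behavior amounts to rewriting an entry's ownership props before its header is encoded. A minimal sketch of that rewrite (forceOwner is a hypothetical helper operating on the usual tar header fields; it is not an existing node-tar option):

```javascript
// Force root:root ownership on an entry's props, the way
// `tar --owner=root --group=root --numeric-owner` does, so the
// archive doesn't carry the packing user's uid/gid.
function forceOwner(props, uid = 0, gid = 0) {
  return Object.assign({}, props, {
    uid: uid,
    gid: gid,
    uname: '', // --numeric-owner: drop the symbolic names too
    gname: ''
  });
}
```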

TypeError: me.error is not a function

I need help debugging a case where error() is occasionally called incorrectly in extended-header.js.

/Users/wizard/src/t2-compiler/node_modules/tar/lib/extended-header.js:138
  me.error(msg)
     ^

TypeError: me.error is not a function
    at error (/Users/wizard/src/t2-compiler/node_modules/tar/lib/extended-header.js:138:6)
    at ExtendedHeader.parse [as _parse] (/Users/wizard/src/t2-compiler/node_modules/tar/lib/extended-header.js:89:11)
    at emitOne (events.js:77:13)
    at ExtendedHeader.emit (events.js:169:7)
    at ExtendedHeader.Entry._read (/Users/wizard/src/t2-compiler/node_modules/tar/lib/entry.js:112:10)
    at ExtendedHeader.Entry.write (/Users/wizard/src/t2-compiler/node_modules/tar/lib/entry.js:69:8)
    at Parse._process (/Users/wizard/src/t2-compiler/node_modules/tar/lib/parse.js:105:29)
    at BlockStream.<anonymous> (/Users/wizard/src/t2-compiler/node_modules/tar/lib/parse.js:47:8)
    at emitOne (events.js:77:13)
    at BlockStream.emit (events.js:169:7)
    at BlockStream._emitChunk (/Users/wizard/src/t2-compiler/node_modules/block-stream/block-stream.js:145:10)
    at BlockStream.flush (/Users/wizard/src/t2-compiler/node_modules/block-stream/block-stream.js:70:8)
    at BlockStream.end (/Users/wizard/src/t2-compiler/node_modules/block-stream/block-stream.js:66:8)
    at Parse.end (/Users/wizard/src/t2-compiler/node_modules/tar/lib/parse.js:86:23)
    at onend (/Users/wizard/src/t2-compiler/node_modules/readable-stream/lib/_stream_readable.js:495:10)
    at g (events.js:260:16)

I'm working on a project where I need to parse every tar in npm. Sometimes I think I'm getting a bad tar file and I end up with this crash. Unfortunately there isn't a way to catch this error, and since at any one time I may be streaming up to 20 tar files, I'm at a loss to figure out what causes it or to do any kind of logging or writing to disk to help determine the issue. When I retry, I'm able to parse the tar without issue. I'm also having trouble reproducing it with only 1 or 2 operations in flight at a time.

The function where I use tar is as follows.

const debug = require('debug')('t2:registry');
const gunzip = require('gunzip-maybe');
const tar = require('tar');
const got = require('got');


function checkForGypFile(url) {
  debug(`Scanning ${url} for binding.gyp`);
  const response = got.stream(url);
  const unzip = gunzip();
  const untar = tar.Parse();

  let clientResponse;
  response.on('response', res => (clientResponse = res));

  return new Promise((accept, reject) => {
    let detected = false;
    let finished = false;
    function finish() {
      if (finished) { return; }
      finished = true;
      response.end();

      // this seriously doesn't always exist
      if (clientResponse && clientResponse.destroy) {
        clientResponse.destroy();
      }

      if (!detected) {
        debug(`No binding.gyp found in ${url}`);
      } else {
        debug(`Found binding.gyp in ${url}`);
      }
      accept(detected);
    }

    function error(e) {
      if (finished) { return; }
      finished = true;
      reject(e);
    }

    untar.on('entry', (entry) => {
      entry.abort();
      const file = entry.props.path;
      if (file.match(/binding\.gyp$/)) {
        detected = true;
        finish();
      }
    });

    untar.on('end', finish);
    response.on('error', error);
    unzip.on('error', error);
    untar.on('error', error);

    response.pipe(unzip).pipe(untar);
  });
}

Hope anyone has some ideas or suggestions. Thanks!

No longer installs on Node 0.8

I imagine the response here will just be "upgrade," but I thought I'd open an issue anyway. Because node-tar has a ~ dependency on fstream and fstream was just released with a ^ dependency on graceful-fs, node-tar no longer installs on Node 0.8.

npm ERR! Error: No compatible version found: graceful-fs@'^3.0.2'
npm ERR! Valid install targets:
npm ERR! ["1.0.0","1.0.1","1.0.2","1.1.0","1.1.1","1.1.2","1.1.3","1.1.4","1.1.5","1.1.6","1.1.7","1.1.8","1.1.9","1.1.10","1.1.11","1.1.12","1.1.13","1.1.14","1.2.0","1.2.1","1.2.2","1.2.3","2.0.0","2.0.1","2.0.2","2.0.3","3.0.0","3.0.1","3.0.2"]
npm ERR!     at installTargetsError (/Users/tschaub/.nvm/v0.8.26/lib/node_modules/npm/lib/cache.js:719:10)
npm ERR!     at /Users/tschaub/.nvm/v0.8.26/lib/node_modules/npm/lib/cache.js:641:10
npm ERR!     at saved (/Users/tschaub/.nvm/v0.8.26/lib/node_modules/npm/node_modules/npm-registry-client/lib/get.js:138:7)
npm ERR!     at Object.oncomplete (fs.js:297:15)
npm ERR! If you need help, you may report this log at:
npm ERR!     <http://github.com/isaacs/npm/issues>
npm ERR! or email it to:
npm ERR!     <[email protected]>

Assuming people want to test on Node 0.8 (e.g. on Travis), I'm interested to hear suggestions for modules that depend on node-tar.

I think these are the alternatives:

  • shrinkwrap
  • wait for a release of 0.8 that includes a version of npm that uses a version of semver that can parse ^.

Needs example for tar.Pack. Does the packer even work?

I tried looking at what you've got in test/pack.js, but the test doesn't pass and I couldn't figure out how to create a simple example using tar.Pack.

It seems that it immediately pauses the fstream.Reader and then quits. Or at least I get events when I test fstream.Reader#on('entry', fn) by itself, but nothing once I pipe it to pack.

Reliance on local system when packing?

Curious if and how the local OS affects the use of this module. Does node-tar essentially mock the system tar lib or does it utilize the system's local tar lib somehow?

The reason I'm asking is that I'm using this module for packing tars, then later validating the file by inspecting for the ustar magic number. This check fails if a tar contains an extended tar header and has PaxHeader in the filepath, and I'm not sure how that header is being created. The only thing I can think of is that the local system is causing this. I noticed you test for it here.

Any input is much appreciated, thank you!

Can't extract tar components into directory that doesn't have +x permission.

If a tarball contains a directory that isn't set to be searchable (+x permission flag), then the "tar" module will fail to extract its contents.

This is admittedly an odd corner case, but it came up in practice (see Hogan issue #52). And, FWIW, the traditional "tar" clients seem to handle this case, even though the resulting output isn't particularly useful.

Here's a quick script that demos the problem.

#!/bin/bash

# Set up the example tarfile.
curl -k -o hogan.tar.gz https://registry.npmjs.org/hogan/-/hogan-1.0.5-dev.tgz
rm -f hogan.tar
gunzip hogan.tar.gz

# Clean up from last run.
chmod -R +x out; rm -rf out 
rm -rf node_modules

# Problem!
npm install 'tar@~0.1.12'
node -e '
    var fs = require("fs");
    var tar = require("tar");
    fs.createReadStream("hogan.tar")
        .pipe(tar.Extract({ type: "Directory", path: "out" }));'

When I run this, I consistently get an error along the lines of:

Error: EACCES, permission denied '[...]/out/package/bin/hulk'

which is explained by the fact that the bin directory in the tarball isn't set to be searchable. I'd guess that the traditional tar utilities always create searchable directories and then only change permissions after extraction.

I hope this helps!

TarHeader.encode: writeNumeric incorrectly adds space to numeric fields

According to your comment in writeNumeric:

// god, tar is so annoying
// if the string is small enough, you should put a space
// between the octal string and the \0, but if it doesn't
// fit, then don't.

But according to the GNU Tar spec (emphasis mine):

The name, linkname, magic, uname, and gname are null-terminated character strings. All other fields are zero-filled octal numbers in ASCII. Each numeric field of width w contains w minus 1 digits, and a null.

It's possible that you are mimicking a tar variant that does not behave in this manner, in which case this is not a bug. But I was unable to find such a variant online.

Older variants use spaces instead of zeroes for padding; from Wikipedia:

Additionally, versions of tar from before the first POSIX standard from 1988 pad the values with spaces instead of zeroes.

In addition, Wikipedia suggests that some formats use an ending space instead of a NUL, but does not insinuate that a format ever used both.
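The POSIX rule quoted above is simple to state in code. A sketch of a spec-conformant encoder (a hypothetical function illustrating the spec, not node-tar's writeNumeric):

```javascript
// Encode a numeric header field the way the POSIX/GNU spec describes:
// a field of width w holds w-1 zero-padded octal digits followed by
// one NUL, with no trailing space in any case.
function writeNumericPosix(value, width) {
  const octal = value.toString(8);
  const digits = width - 1; // reserve the last byte for the NUL
  if (octal.length > digits) {
    throw new Error('value too large for a ' + width + '-byte field');
  }
  return '0'.repeat(digits - octal.length) + octal + '\0';
}
```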

a5337a6 breaks tests

Test run for 5da515d passes but a5337a6 does not (nor does v2.0.0 for that matter)

tero@fasaani:~/src/node-tar$ git checkout 5da515d
Note: checking out '5da515d'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 5da515d... Emit finish from Extract when finished.
tero@fasaani:~/src/node-tar$ npm run test

> [email protected] test /home/tero/src/node-tar
> tap test/*.js

fixtures/
fixtures/200.tar
fixtures/200ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
fixtures/200L.hex
fixtures/200longlink.tar
fixtures/200longname.tar
fixtures/a.hex
fixtures/a.tar
fixtures/a.txt
fixtures/b.hex
fixtures/b.tar
fixtures/b.txt
fixtures/c.hex
fixtures/c.tar
fixtures/c.txt
fixtures/cc.txt
fixtures/dir/
fixtures/dir.tar
fixtures/foo.hex
fixtures/foo.js
fixtures/foo.tar
fixtures/hardlink-1
fixtures/hardlink-2
fixtures/omega.hex
fixtures/omega.tar
fixtures/omega.txt
fixtures/omegapax.tar
fixtures/packtest/
fixtures/r/
fixtures/Ω.txt
fixtures/r/e/
fixtures/r/e/a/
fixtures/r/e/a/l/
fixtures/r/e/a/l/l/
fixtures/r/e/a/l/l/y/
fixtures/r/e/a/l/l/y/-/
fixtures/r/e/a/l/l/y/-/d/
fixtures/r/e/a/l/l/y/-/d/e/
fixtures/r/e/a/l/l/y/-/d/e/e/
fixtures/r/e/a/l/l/y/-/d/e/e/p/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/a/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/a/t/
tar: Ignoring unknown extended header keyword 'SCHILY.dev'
tar: Ignoring unknown extended header keyword 'SCHILY.ino'
tar: Ignoring unknown extended header keyword 'SCHILY.nlink'
tar: Ignoring unknown extended header keyword 'SCHILY.dev'
tar: Ignoring unknown extended header keyword 'SCHILY.ino'
tar: Ignoring unknown extended header keyword 'SCHILY.nlink'
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/a/t/h/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/a/t/h/cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
fixtures/packtest/omega.txt
fixtures/packtest/star.4.html
tar: Ignoring unknown extended header keyword 'SCHILY.dev'
tar: Ignoring unknown extended header keyword 'SCHILY.ino'
tar: Ignoring unknown extended header keyword 'SCHILY.nlink'
fixtures/packtest/Ω.txt
fixtures/dir/sub/
ok test/00-setup-fixtures.js ............................ 4/4
slow chmod /home/tero/src/node-tar/test/tmp/extract-test/dir
slow utimes /home/tero/src/node-tar/test/tmp/extract-test/dir
slow chmod /home/tero/src/node-tar/test/tmp/extract-test/dir/sub
slow utimes /home/tero/src/node-tar/test/tmp/extract-test/dir/sub
ok test/extract-move.js ................................. 5/5
ok test/extract.js .................................... 45/45
ok test/header.js ..................................... 22/22
ok test/pack-no-proprietary.js ...................... 201/201
ok test/pack.js ..................................... 204/204
ok test/parse.js ...................................... 43/43
ok test/zz-cleanup.js ................................... 3/3
total ............................................... 527/527

ok
tero@fasaani:~/src/node-tar$ git checkout master
Previous HEAD position was 5da515d... Emit finish from Extract when finished.
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
tero@fasaani:~/src/node-tar$ npm run test

> [email protected] test /home/tero/src/node-tar
> tap test/*.js

fixtures/
fixtures/200.tar
fixtures/200ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
fixtures/200L.hex
fixtures/200longlink.tar
fixtures/200longname.tar
fixtures/a.hex
fixtures/a.tar
fixtures/a.txt
fixtures/b.hex
fixtures/b.tar
fixtures/b.txt
fixtures/c.hex
fixtures/c.tar
fixtures/c.txt
fixtures/cc.txt
fixtures/dir/
fixtures/dir.tar
fixtures/foo.hex
fixtures/foo.js
fixtures/foo.tar
fixtures/hardlink-1
fixtures/hardlink-2
fixtures/omega.hex
fixtures/omega.tar
fixtures/omega.txt
fixtures/omegapax.tar
fixtures/packtest/
fixtures/r/
fixtures/Ω.txt
fixtures/r/e/
fixtures/r/e/a/
fixtures/r/e/a/l/
fixtures/r/e/a/l/l/
fixtures/r/e/a/l/l/y/
fixtures/r/e/a/l/l/y/-/
fixtures/r/e/a/l/l/y/-/d/
fixtures/r/e/a/l/l/y/-/d/e/
fixtures/r/e/a/l/l/y/-/d/e/e/
fixtures/r/e/a/l/l/y/-/d/e/e/p/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/
tar: Ignoring unknown extended header keyword 'SCHILY.dev'
tar: Ignoring unknown extended header keyword 'SCHILY.ino'
tar: Ignoring unknown extended header keyword 'SCHILY.nlink'
tar: Ignoring unknown extended header keyword 'SCHILY.dev'
tar: Ignoring unknown extended header keyword 'SCHILY.ino'
tar: Ignoring unknown extended header keyword 'SCHILY.nlink'
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/a/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/a/t/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/a/t/h/
fixtures/r/e/a/l/l/y/-/d/e/e/p/-/f/o/l/d/e/r/-/p/a/t/h/cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
fixtures/packtest/omega.txt
fixtures/packtest/star.4.html
tar: Ignoring unknown extended header keyword 'SCHILY.dev'
tar: Ignoring unknown extended header keyword 'SCHILY.ino'
tar: Ignoring unknown extended header keyword 'SCHILY.nlink'
fixtures/packtest/Ω.txt
fixtures/dir/sub/
ok test/00-setup-fixtures.js ............................ 4/4
not ok test/dir-normalization.js ........................ 7/9
    Command: "/usr/bin/nodejs dir-normalization.js"
    TAP version 13
    ok 1 cleaned!
    ok 2 tar entry 1 fixtures/
    ok 3 tar entry 2 fixtures/the-chumbler
    ok 4 should see 2 entries
    ok 5 unpacked file 1
    ok 6 unpacked file 2 /fixtures
    not ok 7 unpacked file 3 /fixtures/the-chumbler
      ---
        file:   /home/tero/src/node-tar/test/dir-normalization.js
        line:   131
        column: 9
        stack:  
          - |
            getCaller (/home/tero/src/node-tar/node_modules/tap/lib/tap-assert.js:439:17)
          - |
            assert (/home/tero/src/node-tar/node_modules/tap/lib/tap-assert.js:21:16)
          - |
            equivalent (/home/tero/src/node-tar/node_modules/tap/lib/tap-assert.js:183:12)
          - |
            Function.similar (/home/tero/src/node-tar/node_modules/tap/lib/tap-assert.js:284:10)
          - |
            Test._testAssert (/home/tero/src/node-tar/node_modules/tap/lib/tap-test.js:87:16)
          - |
            DirReader.foundEntry (/home/tero/src/node-tar/test/dir-normalization.js:131:9)
          - |
            DirReader.emit (events.js:95:17)
          - |
            DirReader.emitEntry (/home/tero/src/node-tar/node_modules/fstream/lib/dir-reader.js:249:8)
          - |
            LinkReader.EMITCHILD (/home/tero/src/node-tar/node_modules/fstream/lib/dir-reader.js:158:12)
          - |
            LinkReader.emit (events.js:95:17)
        found:  
          path:     /fixtures/the-chumbler
          mode:     120777
          type:     SymbolicLink
          depth:    2
          size:     76
          linkpath: /home/tero/src/node-tar/test/tmp/dir-normalization-test/a/b/c/d/the-chumbler
          nlink:    1
        wanted: 
          path:     /fixtures/the-chumbler
          mode:     120755
          type:     SymbolicLink
          depth:    2
          size:     95
          linkpath: /home/tero/src/node-tar/test/tmp/dir-normalization-test/a/b/c/d/the-chumbler
          nlink:    1
        diff:   |
          {
            "path" : "/fixtures/the-chumbler",
            "mode" : "120777", // != "120755"
            "type" : "SymbolicLink",
            "depth" : 2,
            "size" : 76, // != 95
            "linkpath" : "/home/tero/src/node-tar/test/tmp/dir-normalization-test/a/b/c/d/the-chumbler",
            "nlink" : 1
          }
      ...
    ok 8 should have 3 items
    not ok 9 test/dir-normalization.js
      ---
        exit:    1
        command: "/usr/bin/nodejs dir-normalization.js"
      ...

    1..9
    # tests 9
    # pass  7
    # fail  2

slow chmod /home/tero/src/node-tar/test/tmp/extract-test/dir
slow utimes /home/tero/src/node-tar/test/tmp/extract-test/dir
slow chmod /home/tero/src/node-tar/test/tmp/extract-test/dir/sub
slow utimes /home/tero/src/node-tar/test/tmp/extract-test/dir/sub
ok test/extract-move.js ................................. 5/5
ok test/extract.js .................................... 45/45
ok test/header.js ..................................... 22/22
ok test/pack-no-proprietary.js ...................... 201/201
ok test/pack.js ..................................... 204/204
ok test/parse.js ...................................... 43/43
ok test/zz-cleanup.js ................................... 3/3
total ............................................... 534/536

not ok

Superseded by tar-stream?

It looks to me like tar-stream has more functionality, generality, and a better API. Should work be done on node-tar when another package covers all that ground better than this one does?

[Q] A way not to use POSIX headers?

Hi @isaacs
Is there any way to force tar not to use POSIX headers?

I encountered an issue where a certain tar extractor module does not support POSIX headers properly.
So, I wonder if there is any way to avoid POSIX headers when using this tar node module, like this:
https://dev.openwrt.org/browser/trunk/tools/ipkg-utils/patches/200-force_gnu_format.patch?rev=34261

not compatible with [email protected]

When I try to parse and repack a tarball, the result looks like
a tar file, but it is not compatible with GNU tar.

//extract.js
var tar = require('./')
var zlib = require('zlib')

process.stdin
  .pipe(zlib.createGunzip())
  .pipe(tar.Parse())
  .pipe(tar.Pack())
  .pipe(zlib.createGzip())
  .pipe(process.stdout)

then I run this command:

node extract.js < ~/.npm/tar/0.1.19/package.tgz | tar -tz

which should transform a tarball into a node-tar tarball,
and then list its entries with the tar command.
Output:

package.json
.npmignore
tar: Skipping to next header
README.md
tar: Skipping to next header
LICENCE
tar: Skipping to next header
tar.js
tar: Skipping to next header
.travis.yml
extracter.js
tar: Skipping to next header
reader.js
tar: Skipping to next header
buffer-entry.js
tar: Skipping to next header
entry-writer.js
tar: Skipping to next header
entry.js
extended-header-writer.js
extended-header.js
extract.js
global-header-writer.js
header.js
pack.js
parse.js
00-setup-fixtures.js
extract.js
header.js
pack-no-proprietary.js
pack.js
parse.js
zz-cleanup.js
fixtures.tgz
tar: Exiting with failure status due to previous errors

It looks like there is something messed up with the paths.
When I pipe through | gunzip | less I see that the directories have been chopped off.

package.json^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@000644 ^@001750 ^@000012 ^@0000000744 ^@12173645457^@011435 ^@0^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ustar^@00^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@000000 ^@000000 ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@{
  "author": "Isaac Z. Schlueter <[email protected]> (http://blog.izs.me/)",
  "name": "tar",
  "description": "tar for node",
  "version": "0.1.18",
  "repository": {
    "type": "git",
    "url": "git://github.com/isaacs/node-tar.git"
  },
  "main": "tar.js",
  "scripts": {
...

There also seem to be spaces at the end of the numbers, but there aren't spaces at the end of the numbers in the tarball that does work.
Is there an option that makes this work by default? I've been looking through the npm code to see how it packs/unpacks stuff, but I can't find anything unusual.

Can't transform packages in memory using node-tar

Just for reference, as talked about in irc.

Getting a package from the interwebz and piping it to a parser, changing its directory structure and then sending it back to the interwebz should be possible:

var request = require("request");
var fstream = require("fstream");
var tar     = require("tar");
var zlib    = require("zlib");

request
  .get("https://nodeload.github.com/dscape-testing/hello-world-flatiron-api/"+
       "tarball/master")
  .pipe(zlib.Gunzip())
  .pipe(tar.Parse())
  .on("entry", function (entry) {
    var path = entry.path && entry.path.split("/");
    if(Array.isArray(path)) {
      path[0] = "package";
      entry.path = path.join("/");
      console.log(entry.path);
    }
    return entry;
  })
  .pipe(tar.Pack({ noProprietary: true }))
  .pipe(zlib.createGzip())
  .pipe(fstream.Writer("output.tar.gz"))

Right now this errors out:

/Users/dscape/Desktop/dev/testing-tar/node_modules/tar/lib/pack.js:137
  Object.keys(entry.props).forEach(function (k) {
         ^
TypeError: Object.keys called on non-object
    at Function.keys (native)
    at Pack._process (/Users/dscape/Desktop/dev/testing-tar/node_modules/tar/lib/pack.js:137:10)
    at Pack.add (/Users/dscape/Desktop/dev/testing-tar/node_modules/tar/lib/pack.js:66:8)
    at Pack.<anonymous> (/Users/dscape/Desktop/dev/testing-tar/node_modules/tar/lib/pack.js:42:8)
    at Pack.EventEmitter.emit (events.js:93:17)
    at Parse.Stream.pipe (stream.js:112:8)
    at Parse.Reader.pipe (/Users/dscape/Desktop/dev/testing-tar/node_modules/tar/node_modules/fstream/lib/reader.js:238:32)
    at Object.<anonymous> (/Users/dscape/Desktop/dev/testing-tar/index.js:19:4)
    at Module._compile (module.js:449:26)
    at Object.Module._extensions..js (module.js:467:10)

Bummer 🈹

Spam in tar package.json from npmjs.org

I'm not 100% sure the problem was specific to this module, but I got this error while trying to install tar with npm. Obviously, someone injected something bad into a response from npmjs.org.

npm http 200 https://registry.npmjs.org/tar
npm ERR! registry error parsing json
npm ERR! SyntaxError: Unexpected token y
npm ERR! ny-leone-porn-sex-scene/Sunny-Leones-deleted-sex-scene-from-Ragini-MMS-2/photostory/33010796.cms"})^@9"},"��������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������npmVersion":"0.2.7-2","_nodeVersion":"v0.3.1-pre","dist":{"tarball":"http://registry.npmjs.org/tar/-/tar-0.0.1
.tgz","shasum":"2f94ccd48020df33ade32241ccb21ea109b11f56"}},"0.1.0":{"author":{"name":"Isaac Z. Schlueter","email":"[email protected]","url":"http://blog.izs.me/"},"name":"tar","description":"tar for node","version":"0.1.0","repository":{"type":"git","url":"git://github.com/isaacs/node-tar.git"},"main":"tar.js","scripts":{"test":"rm -rf test/tmp; tap test/*.js"},"engines":{"node":"~0.5.9 || 0.6 || 0.7 || 0.8"},"dependencies":{"inh
erits":"1.x","block-stream":"*","fstream":"~0.1"},"devDependencies":{"tap":"0.x","rimraf":"1.x"},"_npmUser":{"name":"isaacs","email":"[email protected]"},"_id":"[email protected]","_engineSupported":true,"_npmV

The response continues on with normal JSON, but it's rather spooky to get something like this from the npmjs.org registry.

Fixtures are too long...

Any workaround for Ubuntu 12.04 with ext4?

⇾ npm install tar
npm WARN package.json [email protected] No README.md file found!
npm http GET https://registry.npmjs.org/tar
npm http 200 https://registry.npmjs.org/tar
npm http GET https://registry.npmjs.org/tar/-/tar-0.1.15.tgz
npm http 200 https://registry.npmjs.org/tar/-/tar-0.1.15.tgz
npm ERR! Error: ENAMETOOLONG, open '/home/charles/tmp/npm-25493/1359832037556-0.9535427638329566/package/test/fixtures/200ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc'
npm ERR! If you need help, you may report this log at:
npm ERR!     <http://github.com/isaacs/npm/issues>
npm ERR! or email it to:
npm ERR!     <[email protected]>

npm ERR! System Linux 3.2.0-37-generic-pae
npm ERR! command "/home/charles/.nvm/v0.8.18/bin/node" "/home/charles/.nvm/v0.8.18/bin/npm" "install" "tar"
npm ERR! cwd /home/charles/Repositories/advertiser
npm ERR! node -v v0.8.18
npm ERR! npm -v 1.2.2
npm ERR! path /home/charles/tmp/npm-25493/1359832037556-0.9535427638329566/package/test/fixtures/200ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
npm ERR! code ENAMETOOLONG
npm ERR! errno 49
npm ERR! 
npm ERR! Additional logging details can be found in:
npm ERR!     /home/charles/Repositories/advertiser/npm-debug.log
npm ERR! not ok code 0
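ENAMETOOLONG here is the filesystem's per-component filename limit, not a total-path limit. Plain ext4 allows 255-byte names, which the ~200-character fixture name fits under, so one plausible explanation is Ubuntu's encrypted home directory (eCryptfs), which cuts the effective limit to roughly 143 bytes. You can check the limit for the target filesystem with `getconf`:

```shell
# Report the maximum filename length (in bytes) on a path's filesystem.
# Plain ext4 typically reports 255; an eCryptfs-encrypted $HOME reports less.
getconf NAME_MAX /tmp
getconf NAME_MAX "$HOME"
```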

tar.Extract class should accept a 'filter' function

I would expect this to work:

var tar = require("../tar.js")
  , fs = require("fs")

fs.createReadStream(__dirname + "/../test/fixtures/c.tar")
  .pipe(tar.Extract({ path: __dirname + "/extract", filter: filter }))

function filter (entry) {
  if (entry.props.path == 'something') {
    return true
  }
  return false
}

But it doesn't work: the filter callback is never invoked, and every entry in the tarball is extracted regardless.
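The filter being requested is just a predicate over entries: given an entry, return `true` to keep it. Independent of the tar stream machinery, the intended shape can be sketched like this (the `applyFilter` helper is hypothetical, for illustration only):

```javascript
// A filter is a function from an entry to a boolean.
function filter (entry) {
  return entry.props.path === 'something'
}

// Hypothetical helper showing how an extractor could consult the
// filter before writing each entry to disk.
function applyFilter (entries, filter) {
  return entries.filter(function (entry) { return filter(entry) })
}

var entries = [
  { props: { path: 'something' } },
  { props: { path: 'other/file.txt' } }
]

console.log(applyFilter(entries, filter).map(function (e) { return e.props.path }))
// [ 'something' ]
```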
