
fileclerk's Introduction

ಠ‿ಠ fileclerk

A node.js library for file and folder collation. Sort. Separate. Organize.


Overview

Fileclerk arose from a personal need to organize media files from disparate sources into a common location while maintaining a consistent folder structure.

More specifically, my family has media that is synchronized, backed up, or copied from many different devices into wildly differing folder structures on a centralized NAS device. I wanted to get those files sorted by file type and by date.

I originally began writing it as a simple script that would run from a Docker container, but quickly realized the value of abstracting it out into a more general library. And here we are...

Requirements

Currently requires Node.js v8 or later.

Install

npm install fileclerk

Usage

const FileClerk = require('fileclerk');

API

organize (sourcePath, targetPath [, options])

Returns a promise that resolves with a list of files that were moved / copied during the process.

Use Case

Pull files out of a recursive directory structure - starting at the sourcePath - and move or copy them to a different directory structure starting at the targetPath.

Example
  1. Files of many different types in an arbitrary-depth directory structure under a /INCOMING folder.
  2. Want to pull out all of the image files, move them to a new directory, and organize them by year, month, and full date (based on created date of source file).
  3. Delete any source directories below the sourcePath that are left empty after the operation.

const options = {
  recursive: true,
  cleanDirs: true,
  extensions: ['jpg', 'png', 'tiff', 'gif'],
  collateFn: (file) => {
    const date = new Date(file.ctime);
    const year = date.getFullYear();
    // getMonth() is zero-based (January is 0), so add 1 for the calendar month
    const month = date.getMonth() + 1;
    const day = date.getDate();

    // Return the path relative to targetPath
    return `${year}/${month}/${year}-${month}-${day}/${file.filename}`;
  },
};

const results = await FileClerk.organize(
  '/INCOMING',
  '/Pictures',
  options,
);

organizeByDate (sourcePath, targetPath[, options])

Helper for organizing easily by date

TODO: Document

organizeByExtension (sourcePath, targetPath[, options])

Helper for organizing easily by file extension

TODO: Document

organizeByAlphabetical (sourcePath, targetPath[, options])

Helper for organizing easily by alphabetical order

TODO: Document


TODO

  • Documentation.
  • Transpile to support earlier Node.js versions
  • Handle paths in a cross-platform way. Not sure how this library behaves on Windows devices, for instance.
  • General resiliency improvements for edge-cases and non-happy paths

Contributing

  1. Fork repo
  2. Add / modify tests
  3. Add / modify implementation
  4. Open PR

License

MIT License

Copyright (c) 2018 Jonathan Griggs

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

fileclerk's Issues

Should check for existence of targetPath during dryRun

In CollationService.collate, we should check for the existence of the targetPath before even attempting to write to it.

This way, we can accurately report on the probable outcome of the collate operation during a dryRun.
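A minimal sketch of such a pre-flight check, assuming a synchronous helper is acceptable. The name inspectTargetPath and the shape of the returned report are hypothetical and not part of the current fileclerk API; the point is only that the target can be inspected without writing to it.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of fileclerk): inspect targetPath
// without creating or modifying anything, so it is safe during a dryRun.
function inspectTargetPath(targetPath) {
  const report = { exists: false, isDirectory: false, writable: false };

  try {
    const stats = fs.statSync(targetPath);
    report.exists = true;
    report.isDirectory = stats.isDirectory();
  } catch (err) {
    // A missing path (or missing parent component) is an expected outcome.
    if (err.code === 'ENOENT' || err.code === 'ENOTDIR') return report;
    throw err;
  }

  try {
    // accessSync throws if the current process lacks write permission.
    fs.accessSync(targetPath, fs.constants.W_OK);
    report.writable = true;
  } catch (err) {
    // Leave writable as false.
  }

  return report;
}
```

A dryRun could then report a probable failure whenever the report shows the target is missing, is not a directory, or is not writable.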

Ability to specify list of Regex Expressions for includes / excludes

Currently, there are two ways to filter which files are selected to be organized:

  • Implement your own sourceFilter function and pass that in on the options object,
  • Pass in a string or array of strings in the extensions property on the options object.

Would like to add two new options parameters, includes and excludes.

These should be either String, RegExp, or an Array of Strings or RegExps. The logic should be implemented as follows:

  • If extensions is not empty, the file extension must match ANY of the provided extensions.
  • If includes is not empty, the file path (relative to the sourcePath) must match ANY of the provided regex strings or RegExps.
  • If excludes is not empty, the file path (relative to the sourcePath) must NOT match ANY of the provided regex strings or RegExps.
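The three rules above could be sketched as a single predicate. This is an illustration only: buildFilter and toRegExps are hypothetical names, and matching extensions case-insensitively is an assumption the issue does not state.

```javascript
const path = require('path');

// Normalize a String, RegExp, or Array of either into an array of RegExps.
function toRegExps(patterns) {
  if (patterns === undefined || patterns === null) return [];
  const list = Array.isArray(patterns) ? patterns : [patterns];
  return list.map((p) => (p instanceof RegExp ? p : new RegExp(p)));
}

// Reset lastIndex so RegExps created with the "g" or "y" flag behave
// consistently across repeated test() calls.
function matches(re, text) {
  re.lastIndex = 0;
  return re.test(text);
}

// Hypothetical helper: returns a predicate over paths relative to sourcePath.
function buildFilter(options) {
  const opts = options || {};
  // Assumption: extensions compare case-insensitively, with or without a dot.
  const extensions = [].concat(opts.extensions || [])
    .map((ext) => ext.replace(/^\./, '').toLowerCase());
  const includes = toRegExps(opts.includes);
  const excludes = toRegExps(opts.excludes);

  return (relativePath) => {
    const ext = path.extname(relativePath).slice(1).toLowerCase();
    if (extensions.length > 0 && !extensions.includes(ext)) return false;
    if (includes.length > 0 && !includes.some((re) => matches(re, relativePath))) return false;
    if (excludes.some((re) => matches(re, relativePath))) return false;
    return true;
  };
}

const keep = buildFilter({
  extensions: ['jpg', 'png'],
  includes: /^photos\//,
  excludes: ['\\.@thumbnails'],
});
console.log(keep('photos/2018/a.jpg'));
console.log(keep('photos/.@thumbnails/a.jpg'));
```

Each non-empty option narrows the selection, so a file must pass all three checks to be organized.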

Need to reconsider the default dateProperty to use when organizing by date.

Currently, the default File.stats property we are using is ctime, which I mistakenly interpreted as 'creation time'.

Sometimes ctime can be synonymous with the creation date, but not always.

Would want to consider an approach where we either default it to mtime, birthtime, or the earliest date amongst atime, mtime, birthtime.
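As a rough sketch of the "earliest date" option, the helper below picks the oldest of atime, mtime, and birthtime from an fs.Stats-like object. The name earliestTimestamp is hypothetical, and skipping timestamps at or before the Unix epoch is an assumption meant to ignore filesystems that do not record a birthtime.

```javascript
// Hypothetical helper: return the earliest usable timestamp from an
// fs.Stats-like object, or null if none is available.
function earliestTimestamp(stats) {
  const candidates = [stats.atime, stats.mtime, stats.birthtime]
    // Assumption: a zero (epoch) value means "not recorded", so skip it.
    .filter((d) => d instanceof Date && d.getTime() > 0)
    .map((d) => d.getTime());

  if (candidates.length === 0) return null;
  return new Date(Math.min(...candidates));
}

const sample = {
  atime: new Date('2018-03-01T00:00:00Z'),
  mtime: new Date('2018-01-15T00:00:00Z'),
  birthtime: new Date(0), // filesystem did not record a birthtime
};
console.log(earliestTimestamp(sample).toISOString());
```

In real use the argument would come from a call such as fs.statSync(filePath).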

Client should be able to specify the timezone used for organizeByDate collation

When organizing files into directories based on the various unix file timestamps, it would be helpful to specify the timezone that should be used when creating the directory names.

This is particularly useful in situations where the fileclerk client may be running in an environment with a timezone that is different from the one used to write the files in the source directories (network shares, Docker, etc.).
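One possible approach, sketched here with the standard Intl.DateTimeFormat API, is to compute the directory name from the date parts as seen in an explicit IANA timezone. The function name dateDirInZone is hypothetical, and the year/month/year-month-day layout simply mirrors the organize example earlier in this document.

```javascript
// Hypothetical helper: build a "YYYY/MM/YYYY-MM-DD" directory name using
// the calendar date as observed in the given IANA timezone.
function dateDirInZone(date, timeZone) {
  const formatter = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  });

  // formatToParts yields entries such as { type: 'month', value: '12' }.
  const parts = {};
  formatter.formatToParts(date).forEach((part) => {
    parts[part.type] = part.value;
  });

  return `${parts.year}/${parts.month}/${parts.year}-${parts.month}-${parts.day}`;
}

const written = new Date('2018-01-01T03:30:00Z');
console.log(dateDirInZone(written, 'UTC'));
console.log(dateDirInZone(written, 'America/New_York'));
```

The same instant lands in different directories depending on the timezone, which is exactly the discrepancy a client-specified timezone option would control.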

Exception on FileService.stat if file does not exist

FileService.stat throws an uncaught exception if the path does not exist.

This seemingly should never happen as we call this function immediately after having just obtained a list of file system objects that exist in the parent path.

In practice, there are situations where another process may be monitoring the source directory in real-time (thumbnail generator) and something like this occurs:

  1. Fileclerk moves some files successfully from source -> target directory
  2. Fileclerk checks for more files, gets a list back (including files in a temp /.@thumbnails directory)
  3. Other process deletes files in /.@thumbnails directory
  4. Fileclerk tries to do an lstat on file in /.@thumbnails directory, using list generated by step 2.
  5. Exception thrown

What we should probably do here is just swallow the exception, and return null. This function is primarily used in the context of building a list of directories and files, so we should just compact that list to remove any nulls in it and safely assume any missing files were ones that were deleted in the narrow window between the ls and the lstat.
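A minimal sketch of that idea, written synchronously for brevity. The names statOrNull and statAll are hypothetical and do not reflect the actual FileService implementation. As a design choice, only the "file not found" error (ENOENT) is swallowed here, so other failures such as permission errors still surface.

```javascript
const fs = require('fs');

// Hypothetical helper: lstat a path, returning null if it no longer exists.
function statOrNull(filePath) {
  try {
    return fs.lstatSync(filePath);
  } catch (err) {
    // The file vanished between the directory listing and the lstat.
    if (err.code === 'ENOENT') return null;
    throw err;
  }
}

// Stat every path and compact the result, dropping entries that disappeared.
function statAll(filePaths) {
  return filePaths
    .map((filePath) => ({ path: filePath, stats: statOrNull(filePath) }))
    .filter((entry) => entry.stats !== null);
}
```

Callers that build directory and file lists would then see only the entries that still existed at the time of the lstat.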
