csv-normalizer

How to Build and Run

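  • Clone the repo and enter its directory

    git clone https://github.com/hydeenoble/csv-normalizer.git
    cd csv-normalizer

  • Install dependencies

    npm install

  • Run the application, passing the input CSV on stdin; the normalized CSV is written to stdout

    node app.js < /path/to/csv/file.csv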

Assumptions about the Input File

  • Input CSV is UTF-8
  • First row contains column headers
  • Column names are: Timestamp, Address, ZIP, FullName, FooDuration, BarDuration, TotalDuration, Notes
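
As an illustration of these assumptions, the sketch below reads the header row and turns each following row into an object keyed by column name. It is a minimal hand-written parser that tolerates commas inside quoted fields, not necessarily what app.js does; `parseCsv` and `toRecords` are hypothetical names, and a maintained CSV library is likely the better choice in practice.

```javascript
// Minimal RFC 4180-style parser (illustrative only).
// Returns an array of rows, each row an array of field strings.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // a doubled quote inside a quoted field is a literal quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch; // commas and newlines are kept as data here
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      row.push(field);
      field = '';
    } else if (ch === '\n') {
      row.push(field);
      rows.push(row);
      row = [];
      field = '';
    } else if (ch !== '\r') {
      field += ch;
    }
  }
  // Flush a final row that has no trailing newline.
  if (field !== '' || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Use the first row as column headers and key every other row by them.
function toRecords(text) {
  const [header = [], ...body] = parseCsv(text);
  return body.map((cells) =>
    Object.fromEntries(header.map((name, i) => [name, cells[i] || '']))
  );
}
```

With this sketch, a quoted Address value such as "123 4th St, Anywhere, AA" stays in a single field instead of being split at its commas.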

Problem Statement

Please write a tool that reads a CSV formatted file on stdin and emits a normalized CSV formatted file on stdout. Normalized, in this case, means:

  • The entire CSV is in the UTF-8 character set.
  • The Timestamp column should be formatted in ISO-8601 format.
  • The Timestamp column should be assumed to be in US/Pacific time; please convert it to US/Eastern.
  • All ZIP codes should be formatted as 5 digits. If there are fewer than 5 digits, pad on the left with 0.
  • All name columns should be converted to uppercase. There will be non-English names.
  • The Address column should be passed through as is, except for Unicode validation. Please note there are commas in the Address field; your CSV parsing will need to take that into account. Commas will only be present inside a quoted string.
  • The columns FooDuration and BarDuration are in HH:MM:SS.MS format (where MS is milliseconds); please convert them to a floating point seconds format.
  • The column "TotalDuration" is filled with garbage data. For each row, please replace the value of TotalDuration with the sum of FooDuration and BarDuration.
  • The column "Notes" is free form text input by end-users; please do not perform any transformations on this column. If there are invalid UTF-8 characters, please replace them with the Unicode Replacement Character.
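
Several of the rules above are simple per-field transformations. The following is a rough sketch of how the ZIP, name, and duration rules could look; the function names are hypothetical and this is not necessarily how app.js implements them. It assumes the part after the decimal point in a duration is a decimal fraction of a second (so `.5` means 500 ms).

```javascript
// Left-pad a ZIP code with zeros to 5 digits.
function normalizeZip(zip) {
  return String(zip).trim().padStart(5, '0');
}

// Uppercase a name. String.prototype.toUpperCase follows Unicode case
// mappings, so accented and other non-English letters are handled too.
function normalizeName(name) {
  return name.toUpperCase();
}

// Convert "HH:MM:SS.MS" to a floating point number of seconds.
// Assumption: the fractional part is a decimal fraction of a second.
function durationToSeconds(text) {
  const match = /^(\d+):(\d{1,2}):(\d{1,2}(?:\.\d+)?)$/.exec(text.trim());
  if (!match) {
    throw new Error(`Unparseable duration: ${text}`);
  }
  const [, hours, minutes, seconds] = match;
  return Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds);
}

// Replace the duration columns with seconds and recompute TotalDuration
// as FooDuration + BarDuration, ignoring whatever value it had before.
function normalizeDurations(row) {
  const foo = durationToSeconds(row.FooDuration);
  const bar = durationToSeconds(row.BarDuration);
  return { ...row, FooDuration: foo, BarDuration: bar, TotalDuration: foo + bar };
}
```

For example, `normalizeZip('123')` gives `'00123'`, and `durationToSeconds('0:01:30.5')` gives `90.5`.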

You can assume that the input document is in UTF-8 and that any times that are missing timezone information are in US/Pacific. If a character is invalid, please replace it with the Unicode Replacement Character. If that replacement makes data invalid (for example, because it turns a date field into something unparseable), print a warning to stderr and drop the row from your output.
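
The replacement and drop-row behaviour described above might be sketched as follows. This assumes a Node.js version where `TextDecoder` is available as a global; `decodeUtf8` and `normalizeRows` are hypothetical names, and `normalize` stands for whatever per-row function the tool applies, which is expected to throw when a field can no longer be parsed.

```javascript
// Decode raw bytes as UTF-8. In its default (non-fatal) mode, TextDecoder
// substitutes U+FFFD, the Unicode Replacement Character, for invalid bytes.
function decodeUtf8(bytes) {
  return new TextDecoder('utf-8').decode(bytes);
}

// Apply `normalize` to every row. A row whose normalization throws is
// reported through `warn` and left out of the result.
// console.warn writes to stderr in Node.js.
function normalizeRows(rows, normalize, warn = console.warn) {
  const kept = [];
  rows.forEach((row, index) => {
    try {
      kept.push(normalize(row));
    } catch (err) {
      warn(`Warning: dropping row ${index + 1}: ${err.message}`);
    }
  });
  return kept;
}
```

Taking `warn` as a parameter is only a convenience for testing; the default keeps warnings on stderr so that stdout carries nothing but the normalized CSV.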

You can assume that the sample data we provide will contain all date and time format variants you will need to handle.
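
For the time zone conversion itself, one possible approach that avoids extra dependencies is the built-in `Intl.DateTimeFormat`. The sketch below is an assumption about how this could be done rather than a description of app.js. It takes already-parsed date fields as numbers, because the input date formats are not listed here, and it uses the IANA zone names `America/Los_Angeles` and `America/New_York` for US/Pacific and US/Eastern. All function names are hypothetical.

```javascript
// Wall-clock fields of a UTC instant (in ms) as seen in an IANA time zone.
function wallClock(ms, timeZone) {
  const fmt = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric', month: '2-digit', day: '2-digit',
    hour: '2-digit', minute: '2-digit', second: '2-digit',
  });
  const fields = {};
  for (const { type, value } of fmt.formatToParts(new Date(ms))) {
    if (type !== 'literal') fields[type] = Number(value);
  }
  fields.hour %= 24; // guard against engines that report midnight as 24
  return fields;
}

// Offset of the zone from UTC at the given instant, in milliseconds
// (negative for zones west of Greenwich).
function offsetMs(ms, timeZone) {
  const w = wallClock(ms, timeZone);
  const asUtc = Date.UTC(w.year, w.month - 1, w.day, w.hour, w.minute, w.second);
  return asUtc - Math.floor(ms / 1000) * 1000;
}

// Interpret the given fields as US/Pacific wall-clock time and return the
// same instant as an ISO-8601 string in US/Eastern, with its UTC offset.
function pacificToEasternIso(year, month, day, hour, minute, second) {
  const local = Date.UTC(year, month - 1, day, hour, minute, second);
  // The first pass guesses the offset; the second re-checks it at the
  // resulting instant, which matters near daylight-saving changes.
  let utc = local - offsetMs(local, 'America/Los_Angeles');
  utc = local - offsetMs(utc, 'America/Los_Angeles');

  const e = wallClock(utc, 'America/New_York');
  const offMin = offsetMs(utc, 'America/New_York') / 60000;
  const sign = offMin < 0 ? '-' : '+';
  const abs = Math.abs(offMin);
  const p = (n, width = 2) => String(n).padStart(width, '0');
  return `${p(e.year, 4)}-${p(e.month)}-${p(e.day)}` +
    `T${p(e.hour)}:${p(e.minute)}:${p(e.second)}` +
    `${sign}${p(Math.floor(abs / 60))}:${p(abs % 60)}`;
}
```

For example, noon Pacific on 15 January 2020, `pacificToEasternIso(2020, 1, 15, 12, 0, 0)`, yields `2020-01-15T15:00:00-05:00`. Simply adding three hours would give the same wall-clock result on most days, but not during the few hours each year when only one of the two zones has already changed its daylight-saving state.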
