The goal of this assignment is to implement two MapReduce programs in Java (using Apache Hadoop). Specifically, my MapReduce jobs analyze a data set consisting of New York City taxi trip reports from the year 2013.
The data set is about 20 GB in total. Using Amazon EMR, I supply its files as the input to a Hadoop MapReduce job.