imuqtadir / nasdaq-stocksvolatility-mapreduce

The project aims to find the ten most volatile and the ten least volatile stocks using MapReduce over data stored in HDFS. HDFS is designed to store very large datasets reliably. Three datasets are provided: large, medium, and small. Each dataset contains many files, and each file holds stock pricing data for a company. We use MapReduce to speed up reading the files and to process them in parallel. The input files are partitioned into smaller parts and read by multiple mappers. Each mapper outputs <key, value> pairs, which serve as input to the reducer. The reducer takes each <key, List<value>> pair as its input, processes it, and outputs the results to the user.
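The mapper-to-reducer flow described above can be sketched in plain Java, without the Hadoop API, to show how <key, value> pairs become <key, List<value>> pairs and then a ranking. This is a minimal sketch under stated assumptions, not the repository's implementation: the class name `VolatilitySketch`, the `SYMBOL,price` line format, and the choice of sample standard deviation of prices as the volatility measure are all hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class VolatilitySketch {

    // Map phase: turn one input line into a <key, value> pair.
    // ASSUMPTION: lines look like "SYMBOL,price"; the real dataset layout may differ.
    public static Map.Entry<String, Double> map(String line) {
        String[] fields = line.split(",");
        return new SimpleEntry<>(fields[0].trim(), Double.parseDouble(fields[1].trim()));
    }

    // Shuffle phase: group mapper output into <key, List<value>> pairs.
    public static Map<String, List<Double>> shuffle(List<Map.Entry<String, Double>> pairs) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (Map.Entry<String, Double> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse one key's values into a single volatility figure.
    // ASSUMPTION: volatility is the sample standard deviation of the prices.
    public static double reduce(List<Double> values) {
        int n = values.size();
        if (n < 2) {
            return 0.0; // a single observation has no spread
        }
        double mean = values.stream().mapToDouble(Double::doubleValue).sum() / n;
        double sumSq = values.stream().mapToDouble(v -> (v - mean) * (v - mean)).sum();
        return Math.sqrt(sumSq / (n - 1));
    }

    // Runs map, shuffle, and reduce in sequence over all input lines.
    public static Map<String, Double> volatilities(List<String> lines) {
        List<Map.Entry<String, Double>> pairs =
                lines.stream().map(VolatilitySketch::map).collect(Collectors.toList());
        Map<String, Double> result = new TreeMap<>();
        shuffle(pairs).forEach((symbol, prices) -> result.put(symbol, reduce(prices)));
        return result;
    }

    // Returns up to n symbols, ordered from most volatile or from least volatile.
    public static List<String> top(Map<String, Double> vol, int n, boolean mostVolatile) {
        Comparator<Map.Entry<String, Double>> byVol = Map.Entry.comparingByValue();
        if (mostVolatile) {
            byVol = byVol.reversed();
        }
        return vol.entrySet().stream()
                .sorted(byVol)
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        List<String> lines = Arrays.asList(
                "AAA,10", "AAA,10", "AAA,10",
                "BBB,10", "BBB,20", "BBB,30",
                "CCC,10", "CCC,12", "CCC,14");
        Map<String, Double> vol = volatilities(lines);
        System.out.println("Most volatile:  " + top(vol, 10, true));
        System.out.println("Least volatile: " + top(vol, 10, false));
    }
}
```

In a real Hadoop job the shuffle step is performed by the framework between the mapper and the reducer, and the mappers run in parallel on separate parts of the input; the sketch runs everything sequentially in one process only to make the data flow visible.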

Languages: Java 81.46%, Shell 18.54%
