dataplayr / dse230_data_analysis_using_hadoop_and_spark_ucsd
This project is forked from mgalarnyk/dse230_data_analysis_using_hadoop_and_spark_ucsd.
Map-reduce, streaming analysis, and external memory algorithms, and their implementation using Hadoop and its ecosystem: HBase, Hive, Pig, and Spark. The class will include assignments that analyze large existing databases.