💗Interests
- Hadoop, Spark, Kafka, Docker ...
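- Large-scale data processing, distributed systems, data analysis
🌺Blog Everything I am studying to become a data engineer!!
🎀Projects Web development/data analysis/recommender systems/deep learning, etc.
🌸CS Data structures/algorithms/computer architecture/operating systems/networks/databases, etc.
🧁Paper reviews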
- Piranha : Optimizing Short Jobs in Hadoop, Elmeleegy K
- Robert H Bonczek, Clyde W Holsapple, and Andrew B Whinston. Foundations of decision support systems. Academic Press, 2014.
- Yingyi Bu, Bill Howe, Magdalena Balazinska, and Michael D Ernst. HaLoop: efficient iterative data processing on large clusters. Proceedings of the VLDB Endowment,
- An Experimental Comparison of Pregel-like Graph Processing Systems, Han M Daudjee K Ammar K Ozsu M Wang X Jin T
- Twister : A Runtime for Iterative MapReduce, Ekanayake J Li H Zhang B Gunarathne T Bae S Qiu J Fox G
- The Hadoop Distributed File System, Shvachko K Kuang H Radia S Chansler
- MapReduce : Simplified Data Processing on Large Clusters, Dean J Ghemawat S
- Jeffrey Dean and Sanjay Ghemawat. MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.
- Hive: a warehousing solution over a map-reduce framework. Proceedings of the VLDB
- MapReduce Online, Condie T Conway N Alvaro P Hellerstein J Elmeleegy K Sears R
- PACMan: Coordinated memory caching for parallel jobs, Ananthanarayanan G Ghodsi A Wang A
- Flink Forward conference in Berlin. Flink vs Spark slideshare. http://www.slideshare.net/sbaltagi/flink-vs-spark?related=2.
- Resilient Distributed Datasets : A Fault-Tolerant Abstraction for In-Memory Cluster Computing, Zaharia M Chowdhury M Das T Dave A Ma J McCauley M Franklin M
- Streaming Data Analysis using Apache Cassandra and Zeppelin
- Analysis of Hadoop performance and unstructured data using Zeppelin
- iMapReduce: A Distributed Computing Framework for Iterative Computation
- Improving MapReduce Performance in Heterogeneous Environments