Pramit Mitra's Projects
The exercises/samples repo for bit.ly/9stepsawesome presentation
Containerization of Database
2-month data structures and algorithmic scripting challenge starting from 20th December 2018 - Coding is Fun! 💯💯 Do it every day!! Also, do give us a ⭐ if you liked the repository
Machine Learning and Data Analysis Case Studies using Spark.
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
List of Data Science Cheatsheets to rule the world
Monospaced font with programming ligatures
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
A list of machine learning books, covering ML, RL, NLP
Repository for the code used in my Medium articles
Machine learning cheatsheet
PipelineAI: Real-Time Enterprise AI Platform
Config files for my GitHub profile.
Example project implementing best practices for PySpark ETL jobs and applications.
Python3Code
Apache Spark 3 - Structured Streaming Course Material
PySpark standalone codebase
This repo captures Spark SQL functions and use cases
Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs and reports.