This repository contains a PySpark data analysis project that explores and analyzes several datasets using PySpark's DataFrame API. It demonstrates PySpark for big data processing, including data exploration, transformation, and aggregation, and includes real-world datasets and Jupyter notebooks that present the analysis and the insights derived from the data.
asvivs / pyspark-data-analysis-project