This repo will be updated as I progress along my learning path at WorldQuant University.
First Project 1-Housing in Mexico
This project uses a dataset of 21,000 properties for sale in Mexico, listed on the real estate website Properati.com. Your goal is to determine whether sale prices are influenced more by property size or by location.
Some of the things you'll learn in this project are:
How to organize information using basic Python data structures.
How to import data from CSV files and clean it using the pandas library.
How to create data visualizations like scatter and box plots.
How to examine the relationship between two variables using correlation.
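As a rough illustration of the cleaning and correlation steps listed above, here is a minimal sketch with pandas. The column names (`area_m2`, `price_usd`) and the sample rows are hypothetical stand-ins; the actual project reads the Properati listings from CSV files.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real CSV data.
df = pd.DataFrame(
    {
        "area_m2": [50.0, 80.0, 120.0, None, 200.0],
        "price_usd": [40_000.0, 65_000.0, 90_000.0, 70_000.0, 150_000.0],
    }
)

# Drop rows with a missing size before comparing the two variables.
clean = df.dropna(subset=["area_m2"])

# Pearson correlation coefficient between property size and price.
corr = clean["area_m2"].corr(clean["price_usd"])
print(f"Correlation between area and price: {corr:.3f}")
```

A coefficient close to 1 suggests that price rises with size in this toy sample; repeating the calculation within each state or city is one way to compare the effect of size against that of location.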
Second Project 2-Housing in Buenos Aires
For this project, you'll build on those skills and move from descriptive to predictive data science. Your focus is still real estate, but now you need to create a machine learning model that predicts apartment prices in Buenos Aires, Argentina.
Some of the things you'll learn in this project are:
How to create a linear regression model using the scikit-learn library.
How to build a data pipeline for imputing missing values and encoding categorical features.
How to improve model performance by reducing overfitting.
How to create a dynamic dashboard for interacting with your completed model.
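The pipeline idea above can be sketched with scikit-learn roughly as follows. The feature names, neighborhoods, and prices are made-up illustration data, and `Ridge` (a regularized linear regression) is used here as one assumed way to limit overfitting; the project itself may take a different approach.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical apartments; one surface value is missing on purpose.
X = pd.DataFrame(
    {
        "surface_covered_in_m2": [40.0, 55.0, None, 70.0, 90.0, 65.0],
        "neighborhood": [
            "Palermo", "Recoleta", "Palermo",
            "Belgrano", "Recoleta", "Belgrano",
        ],
    }
)
y = pd.Series([95_000.0, 130_000.0, 110_000.0, 150_000.0, 210_000.0, 140_000.0])

# Impute the numeric column with its mean and one-hot encode the category.
preprocess = ColumnTransformer(
    [
        ("num", SimpleImputer(strategy="mean"), ["surface_covered_in_m2"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["neighborhood"]),
    ]
)

# Chain preprocessing and the regularized linear model into one estimator.
model = make_pipeline(preprocess, Ridge(alpha=1.0))
model.fit(X, y)

new_apartment = pd.DataFrame(
    {"surface_covered_in_m2": [60.0], "neighborhood": ["Palermo"]}
)
prediction = model.predict(new_apartment)
print(f"Predicted price: {prediction[0]:,.0f}")
```

Keeping imputation and encoding inside the pipeline means the same transformations are applied at training and prediction time, which is also convenient when the model is later wired into a dashboard.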
Third Project 3-Air Quality in Nairobi
For this project, you'll work with data from openAfrica, one of Africa's largest open data platforms. You'll look at air quality data from Nairobi, Lagos, and Dar es Salaam, and build a time series model to predict PM 2.5 readings throughout the day.
Some of the things you'll learn in this project are:
How to get data by querying a MongoDB database.
How to prepare time series data for analysis.
How to build an autoregression model.
How to improve a model by tuning its hyperparameters.
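To make the autoregression and hyperparameter-tuning items above more concrete, here is a small sketch that fits an autoregressive model by ordinary least squares with NumPy and picks the number of lags by walk-forward validation. The readings are synthetic, and the helper names (`fit_ar`, `predict_next`, `walk_forward_mae`) are hypothetical; the project may well rely on a dedicated time series library instead.

```python
import numpy as np


def fit_ar(series, p):
    """Fit y[t] = c + a1*y[t-1] + ... + ap*y[t-p] by least squares."""
    y = np.asarray(series, dtype=float)
    # Column k holds the series lagged by k + 1 steps.
    lags = np.column_stack([y[p - k - 1 : len(y) - k - 1] for k in range(p)])
    design = np.column_stack([np.ones(len(lags)), lags])
    coefs, *_ = np.linalg.lstsq(design, y[p:], rcond=None)
    return coefs  # coefs[0] is the intercept, coefs[1:] the lag weights


def predict_next(series, coefs):
    """One-step-ahead forecast from the most recent p observations."""
    p = len(coefs) - 1
    recent = np.asarray(series[-p:], dtype=float)[::-1]  # most recent first
    return coefs[0] + recent @ coefs[1:]


def walk_forward_mae(train, test, p):
    """Mean absolute error when forecasting one step at a time."""
    coefs = fit_ar(train, p)
    history = list(train)
    errors = []
    for actual in test:
        errors.append(abs(predict_next(history, coefs) - actual))
        history.append(actual)  # reveal the true value before the next step
    return float(np.mean(errors))


# Synthetic stand-in for PM 2.5 readings: an AR(1) process with noise.
rng = np.random.default_rng(42)
readings = [25.0]
for _ in range(499):
    readings.append(5.0 + 0.8 * readings[-1] + rng.normal(scale=1.0))
readings = np.array(readings)

train, test = readings[:400], readings[400:]

# Treat the number of lags p as the hyperparameter to tune.
scores = {p: walk_forward_mae(train, test, p) for p in (1, 2, 3, 5)}
best_p = min(scores, key=scores.get)
print(scores)
print("Best number of lags:", best_p)
```

In practice the candidate lag values would be compared on a validation split that is separate from the final test set, so that the reported test error is not biased by the tuning step.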
Time series models are not only important in public health; they're a key part of Financial Engineering. Plus, concepts you learn in this project will be helpful in Natural Language Processing.