This project forked from mbkraus/predicting_real_estate_prices_using_scikit-learn

Predicting Amsterdam house / real estate prices using Ordinary Least Squares, XGBoost, KNN, Lasso, Ridge, Polynomial, Random Forest, and Neural Network (MLP) regression (via scikit-learn)

Approach:

  • load a Pandas DataFrame containing housing data (Dec-17) retrieved with a scraper, supplemented with latitude and longitude coordinates mapped to zip code (via GeoPy)
  • do some simple data exploration / visualisation
  • remove non-numeric data, NaNs, and outliers (everything above 3 x the standard deviation of y)
  • define the explanatory variables (surface, latitude, and longitude) and the dependent variable (price in EUR)
  • split the data into train and test sets (and normalise the explanatory variables where required)
  • find the optimal model parameters using scikit-learn's GridSearchCV
  • fit the model using GridSearchCV's optimal parameters
  • evaluate estimator performance by means of 5-fold 'shuffled' nested cross-validation
  • predict cross-validated estimates of y for each data point and plot them on a scatter diagram against the true y
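The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the choice of `RandomForestRegressor`, the grid values, and the random seeds are assumptions made for the example, and the outlier rule is read here as "more than 3 standard deviations from the mean of y", which is only one possible interpretation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import (
    GridSearchCV,
    KFold,
    cross_val_predict,
    cross_val_score,
    train_test_split,
)

# Synthetic stand-in for the housing data (NOT the project's dataset).
rng = np.random.default_rng(0)
n = 300
surface = rng.uniform(30, 200, n)
latitude = rng.uniform(52.30, 52.42, n)
longitude = rng.uniform(4.80, 5.00, n)
price = 4000 * surface + rng.normal(0, 30000, n)

X = np.column_stack([surface, latitude, longitude])  # explanatory variables
y = price                                            # dependent variable

# Outlier removal (assumed reading of the rule): drop rows whose price is
# more than 3 standard deviations away from the mean price.
keep = np.abs(y - y.mean()) <= 3 * y.std()
X, y = X[keep], y[keep]

# Train/test split. Tree-based models do not need normalised inputs, so no
# scaling step is shown here.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Grid search over a small, illustrative parameter grid.
param_grid = {"max_depth": [4, 6], "n_estimators": [10, 20]}
inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    RandomForestRegressor(random_state=0), param_grid, cv=inner_cv, scoring="r2"
)
search.fit(X_train, y_train)
best_model = search.best_estimator_  # refitted with the best parameters
test_r2 = best_model.score(X_test, y_test)

# Nested cross-validation: the outer loop scores the whole grid-search
# procedure, so parameter tuning never sees the outer test fold.
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)
nested_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="r2")

# Cross-validated prediction for every data point, for a predicted-vs-true plot.
y_pred = cross_val_predict(best_model, X, y, cv=outer_cv)

print("best parameters:", search.best_params_)
print("hold-out R-squared:", round(test_r2, 3))
print("nested CV R-squared (mean):", round(nested_scores.mean(), 3))
```

Passing the `GridSearchCV` object itself to `cross_val_score` is what makes the evaluation nested: each outer fold reruns the inner search on its own training portion.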

Packages required

Scores (5-fold nested 'shuffled' cross-validation, R-squared)

1. XGBoost Regression

  • Parameters: max_depth: 5, min_child_weight: 6, gamma: 0.01, colsample_bytree: 1, subsample: 0.7
  • Score: 0.887

2. Random Forest Regression

  • Parameters: max_depth: 6, max_features: None, n_estimators: 10
  • Score: 0.839

3. Polynomial Regression

  • Parameters: degree: 2
  • Score: 0.731

4. Neural Network MLP Regression

  • Parameters: activation: relu, alpha: 0.01, hidden_layer_sizes: (10, 10), learning_rate: invscaling
  • Score: 0.715

5. KNN Regression

  • Parameters: n_neighbors: 10
  • Score: 0.711

6. Ordinary Least-Squares Regression

  • Parameters: None
  • Score: 0.694

7. Ridge Regression

  • Parameters: alpha: 0.01
  • Score: 0.694

8. Lasso Regression

  • Parameters: alpha: 0.01
  • Score: 0.693
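For reference, the R-squared metric used for the scores above is the coefficient of determination, 1 - SS_res / SS_tot. The sketch below computes it in plain Python; the prices and predictions are made-up numbers chosen only to make the arithmetic easy to follow, not results from this project.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical asking prices and predictions (illustrative numbers only).
y_true = [300000, 400000, 500000, 600000]
y_pred = [320000, 380000, 520000, 580000]

print(round(r_squared(y_true, y_pred), 3))  # 0.968
```

A score of 1.0 means perfect predictions, and 0.0 means the model does no better than always predicting the mean price.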

Sample data input (Pandas DataFrame)

   surface  rooms_new  zipcode_new  price_new   latitude  longitude
0    138.0        4.0         1060     420000  40.804672 -73.963420
1    130.0        5.0         1087     550000  52.355590   5.000561
2    116.0        5.0         1061     425000  52.373044   4.837568
3     92.0        5.0         1035     349511  52.416895   4.906767
4    127.0        4.0         1013    1050000  52.396789   4.876607
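As a small sketch, the sample rows above can be rebuilt as a DataFrame and split into the explanatory variables and the target described in the approach. The variable names (`feature_cols`, `in_area`) and the bounding box are assumptions made for this example only. The box is a rough, hypothetical sanity check: row 0 has coordinates far from the other rows, which may point to a geocoding mismatch for that zip code.

```python
import pandas as pd

# The five sample rows shown above.
df = pd.DataFrame(
    {
        "surface": [138.0, 130.0, 116.0, 92.0, 127.0],
        "rooms_new": [4.0, 5.0, 5.0, 5.0, 4.0],
        "zipcode_new": [1060, 1087, 1061, 1035, 1013],
        "price_new": [420000, 550000, 425000, 349511, 1050000],
        "latitude": [40.804672, 52.355590, 52.373044, 52.416895, 52.396789],
        "longitude": [-73.963420, 5.000561, 4.837568, 4.906767, 4.876607],
    }
)

# Explanatory variables and dependent variable, as listed in the approach.
feature_cols = ["surface", "latitude", "longitude"]
X = df[feature_cols]
y = df["price_new"]

# Hypothetical, deliberately wide bounding box around the Amsterdam area,
# used only to flag rows whose coordinates look implausible.
in_area = df["latitude"].between(52.0, 53.0) & df["longitude"].between(4.0, 6.0)

print(X.shape, y.shape)
print(in_area.tolist())  # [False, True, True, True, True]
```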

[Figure: Scatter plot - Surface vs. Asking Price (EUR)]

[Figure: XGBoost - Predicted prices vs. True price (EUR)]

[Figure: Random Forest - Predicted prices vs. True price (EUR)]

[Figure: Polynomial - Predicted prices vs. True price (EUR)]

[Figure: Neural Network MLP - Predicted prices vs. True price (EUR)]

[Figure: KNN - Predicted prices vs. True price (EUR)]

[Figure: OLS - Predicted prices vs. True price (EUR)]

[Figure: Lasso - Predicted prices vs. True price (EUR)]

[Figure: Ridge - Predicted prices vs. True price (EUR)]
