insight's Introduction

Rent or Buy

The goal of my Insight project was to create a web app, http://randomnumbergator.com/input, that helps users navigate the NYC residential real estate market based on permit information. The permits data is available from https://data.cityofnewyork.us/Housing-Development/DOB-Permit-Issuance/ipu4-2q9a, and the home value and rent data is available from Zillow at https://www.zillow.com/research/data/. The project can be divided into three parts. The first was to understand the causal impact of permits on rents or home values, or vice versa. The second was to incorporate permit information into a predictive model. Finally, we tried to understand which permits, if any, affect home values and rents.

[Figures: Rents vs Permits; Prices vs Permits]

As the plots above show, the data is non-stationary: there is a clear trend. We take first differences to remove it, which seems to take care of the issue. To determine the number of autoregressive terms, we choose the one that minimizes the negative log likelihood. In this case, it turns out to be 2.
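A minimal sketch of these two preprocessing steps, assuming NumPy and a single series. The function names are hypothetical, and the AIC (a penalized log likelihood) is used as the selection criterion here as an assumption, because the raw log likelihood of nested least-squares fits never gets worse as lags are added.

```python
import numpy as np

def first_difference(series):
    """Return x[t] - x[t-1]; the result is one observation shorter."""
    return np.diff(np.asarray(series, dtype=float))

def ar_aic(x, p, p_max):
    """AIC of a Gaussian AR(p) fitted by least squares on observations t >= p_max.

    Using the same sample for every p keeps the criteria comparable.
    """
    n = len(x)
    y = x[p_max:]
    lags = [x[p_max - k : n - k] for k in range(1, p + 1)]  # x[t-1], ..., x[t-p]
    X = np.column_stack([np.ones(n - p_max)] + lags)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / len(y)                 # ML estimate of the noise variance
    loglik = -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1.0)
    n_params = p + 2                                # intercept, p lags, noise variance
    return 2 * n_params - 2 * loglik

def select_ar_order(x, p_max=6):
    """Return the lag order in 1..p_max with the lowest AIC."""
    x = np.asarray(x, dtype=float)
    return min(range(1, p_max + 1), key=lambda p: ar_aic(x, p, p_max))

# Synthetic trending series, for illustration only (not the project data).
rng = np.random.default_rng(0)
level = np.cumsum(0.5 + rng.normal(size=200))
stationary = first_difference(level)
print(select_ar_order(stationary))
```
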

For the first part, we used the concept of Granger causality https://en.wikipedia.org/wiki/Granger_causality for time series. This is a statistical test for determining whether one time series is helpful in forecasting another. However, it is a method for univariate time series, so I used a Singular Value Decomposition for dimensionality reduction and ran the Granger test on the principal eigenvectors.

Predictor   Response   F statistic
Permits     Rents      1.43
Permits     Prices     0.86
Prices      Permits    0.38
Rents       Permits    0.15

The F statistics are larger when Permits is the predictor, which suggests that Permits are more likely to have a (Granger-)causal effect on Home values and Rents than the other way round.
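The reduce-then-test idea can be sketched as follows, assuming NumPy, inputs that are already differenced, and 2 lags. All function names are hypothetical, and this is an illustration of a standard Granger F test on a leading SVD component, not necessarily the exact procedure behind the table.

```python
import numpy as np

def leading_component(panel):
    """Scores of a (time x series) matrix on its first singular direction."""
    panel = np.asarray(panel, dtype=float)
    centered = panel - panel.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]

def lagged(x, p):
    """Columns x[t-1], ..., x[t-p] aligned with x[p:]."""
    n = len(x)
    return np.column_stack([x[p - k : n - k] for k in range(1, p + 1)])

def residual_ss(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ beta) ** 2))

def granger_f(predictor, response, p=2):
    """F statistic for H0: lags of `predictor` add nothing beyond lags of `response`."""
    predictor = np.asarray(predictor, dtype=float)
    response = np.asarray(response, dtype=float)
    y = response[p:]
    restricted = np.hstack([np.ones((len(y), 1)), lagged(response, p)])
    unrestricted = np.hstack([restricted, lagged(predictor, p)])
    rss_r = residual_ss(restricted, y)
    rss_u = residual_ss(unrestricted, y)
    dof = len(y) - unrestricted.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)

# Synthetic example: y follows x with a one-step delay, so x should help forecast y.
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = np.zeros(300)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=299)
print(granger_f(x, y), granger_f(y, x))
```

Each panel (for example, permits across neighborhoods) would first be passed through `leading_component`, and the resulting univariate series would then be compared pairwise with `granger_f`.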

For the predictive modeling, we are interested in leveraging home value and rent information as well as permits data from all neighborhoods. Thus we fit an L1-regularized Vector Autoregressive model (VAR), a Bayesian Autoregressive model (BAR) with a horseshoe prior http://proceedings.mlr.press/v5/carvalho09a/carvalho09a.pdf, and a Recurrent Neural Network (RNN) with an LSTM layer, and compared them with Facebook's Prophet https://facebookincubator.github.io/prophet/. For the VAR, RNN and BAR models we used 2 autoregressive terms.
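To make the L1-regularized VAR concrete, here is a minimal sketch assuming NumPy, centered (differenced) data with no intercept, and a plain proximal-gradient (ISTA) solver. The function names, the penalty `lam`, and the solver choice are assumptions for illustration, not the project's implementation.

```python
import numpy as np

def build_var_design(Y, p=2):
    """Row for time t holds [Y[t-1], ..., Y[t-p]]; the target is Y[t]."""
    T = Y.shape[0]
    X = np.hstack([Y[p - k : T - k] for k in range(1, p + 1)])
    return X, Y[p:]

def soft_threshold(A, tau):
    """Proximal operator of the L1 norm: shrink toward zero, clip at zero."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def fit_l1_var(Y, p=2, lam=0.1, n_iter=500):
    """Minimize (1/2n)||Z - X B||_F^2 + lam * ||B||_1 by proximal gradient."""
    Y = np.asarray(Y, dtype=float)
    X, Z = build_var_design(Y, p)
    n = X.shape[0]
    step = n / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    B = np.zeros((X.shape[1], Z.shape[1]))
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Z) / n
        B = soft_threshold(B - step * grad, step * lam)
    return B

def forecast_one_step(Y, B, p=2):
    """Predict the next observation from the last p rows of Y."""
    x_next = np.hstack([Y[-k] for k in range(1, p + 1)])
    return x_next @ B

# Synthetic 3-series example in which only the first series depends on its own past.
rng = np.random.default_rng(2)
A = np.diag([0.5, 0.0, 0.0])
Y = np.zeros((500, 3))
for t in range(1, 500):
    Y[t] = Y[t - 1] @ A.T + rng.normal(size=3)
B = fit_l1_var(Y, p=2, lam=0.01)
print(np.round(B, 2))
```

Stacking rents, home values and permits for all neighborhoods as columns of `Y` lets every series borrow information from every other one, while the L1 penalty sets most cross-series coefficients to exactly zero.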

In this case, it turns out that the simplest model, the ℓ1-VAR, has the best performance in terms of predictive ability. However, for model selection, the Bayesian paradigm made more sense.
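One reason a horseshoe prior suits variable selection is its shrinkage profile: it either shrinks a coefficient almost fully or leaves it almost untouched. The sketch below, assuming NumPy, draws from the prior as defined in the Carvalho et al. paper linked above (half-Cauchy local scales times a global scale `tau`). The function names are hypothetical, and the shrinkage weight shown applies to the simplified case of unit noise variance with one observation per coefficient.

```python
import numpy as np

def sample_horseshoe_prior(n_coef, n_draws, tau=1.0, seed=0):
    """Draw beta_j ~ N(0, lam_j^2 * tau^2) with lam_j ~ half-Cauchy(0, 1)."""
    rng = np.random.default_rng(seed)
    lam = np.abs(rng.standard_cauchy(size=(n_draws, n_coef)))  # local scales
    beta = rng.normal(size=(n_draws, n_coef)) * lam * tau
    return beta, lam

def shrinkage_weight(lam, tau=1.0):
    """kappa = 1 / (1 + tau^2 * lam^2); near 1 is fully shrunk, near 0 is unshrunk."""
    return 1.0 / (1.0 + (tau * lam) ** 2)

beta, lam = sample_horseshoe_prior(n_coef=4, n_draws=20000)
kappa = shrinkage_weight(lam)
# Share of prior mass at the two extremes of the shrinkage weight.
print(np.mean(kappa < 0.1), np.mean(kappa > 0.9))
```
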

The purpose of model selection is to understand which permits have the biggest impact on rents and home values. Any non-zero coefficient from the BAR model indicates an effect on home values and rents. To figure out which permits have the largest effects, we look at the largest coefficients divided by their standard deviations. The green dot shows the area we are interested in. The blue dots show the largest positive impacts and the red dots the largest negative impacts, with larger circles indicating a larger effect.

[Figure: Bayesian Variable Selection]
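The ranking step described above (coefficient divided by its standard deviation) can be sketched as follows, assuming NumPy and an array of posterior draws from some fitted Bayesian model. The function name, the `draws` layout, and the permit labels are hypothetical.

```python
import numpy as np

def rank_effects(draws, names, top=3):
    """Rank coefficients by posterior mean / posterior standard deviation.

    draws: array of shape (n_samples, n_coefficients), assumed to have a
    non-zero spread in every column.
    Returns (largest positive effects, largest negative effects).
    """
    draws = np.asarray(draws, dtype=float)
    score = draws.mean(axis=0) / draws.std(axis=0, ddof=1)
    order = np.argsort(score)                      # most negative first
    negative = [(names[i], float(score[i])) for i in order[:top] if score[i] < 0]
    positive = [(names[i], float(score[i])) for i in order[::-1][:top] if score[i] > 0]
    return positive, negative

# Synthetic draws for three made-up permit types.
rng = np.random.default_rng(3)
draws = rng.normal(loc=[2.0, 0.0, -3.0], scale=1.0, size=(2000, 3))
positive, negative = rank_effects(draws, ["permit_a", "permit_b", "permit_c"], top=1)
print(positive, negative)
```

In a map like the one described above, the positive list would correspond to the blue dots and the negative list to the red dots, with the absolute score setting the circle size.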

insight's People

Contributors

shr264
