Giter Site home page Giter Site logo

tutorial-predicting_nba_games's Introduction

Predicting NBA Games using SQL Server R Services

In this repo I included the R code, T-SQL code, SQL Server database, and Power BI desktop file used in my presentation on SQL Server R Services. The steps that I used in the presentation are as follows:

  1. I developed the models to predict the final quarter of the 2008 - 2009 NBA season using data from the first 3/4 of the season. The model was developed in RTVS. Listed are the R scripts used to develop the model broken out into logical units of work:
- STEP_1.R:   Sets the working directory

- STEP_2.R:  The checkpoint library is used to make sure that the most recent version of the packages are used as of the date that was passed to the "checkpoint" function. It scans all of the scripts in the working directory to perform the test and if the test fails "checkpoint" will download the package that meets that criteria from MRAN.

- STEP_3.R:  Retrieves data from SQL Server  

- STEP_4.R:  Partitions the data into a data set that will be used to create the training data set and a data set that will be used to create the testing data set

- STEP_5.R:  This is the feature engineering step. The following features are created in this step:  day_type, home.team.record.level, away.team.record.level, home.team.overall.record.level, and away.team.overall.record.level.

- STEP_6.R:  In this step a XDF file is created to store the "model.train.data" data frame to disk. The XDF file was used by the rxGLM function to build 8 different logistic regression models. The models were ranked by using the AIC statistics.

- STEP_7.R:  A ROC curve is created in this step and the AUC statistics is also calculated. This information is used to test the worthiness of the model that had the highest AIC score.
  1. I created the NBAModels table and the AddModel stored procedure in the NBAPredictions database. The NBAModels table warehouses the actual model as well as additional columns for the model attributes. The AddModel stored procedure is used to get the serialized version of the model from R, convert the model to a binary format, then insert the model along with some of its attributes into the NBAModels table.

  2. I executed the STEP_8.R script to add the 8 models and some of its attributes to the NBAModels table by calling the AddModel stored procedure that was created in step 2.

  3. I created the AwayRecords.sql, HomeRecords.sql, OverallRecords.sql, and ScoreData.sql views. These views were used to create the features for the games in the last quarter of the season.

  4. I executed the PredictGameBatchMode stored procedure to predicted the last quarter of the season.

  5. I visualized the results in Power BI via the NBA_Report.pbix file.

The raw data was obtained from the www.basketballgeek.com/data/ website. I used the CombineGamePlayByPlayData.R script to combine all of the 1,176 individual game data files into one data frame. The resulting data frame was saved to a csv file and used as the source data for the PlayLevelData table data in the NBAPredictions database.

tutorial-predicting_nba_games's People

Contributors

dieselanalytics avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.