NBA-Prediction-Algorithms

This is an unofficial Python-based scraper and parser for www.basketball-reference.com that lets NBA and statistics enthusiasts explore and analyze NBA data.

Goals

The goal of this repository is to provide easy-to-execute methods to access NBA player and team data.

Using this repo, our hope is that everyday fans like us can perform statistical analyses to potentially build models that can:

  1. Regress team wins on player biodata (e.g. win percentage vs. height, age, etc.)
  2. Generate comprehensive statistics to determine the GOAT (greatest of all time) and GOED (greatest of each decade)
  3. Assess anomalies in betting lines

and much more.

Getting Started

The two easiest ways to use this code are to run it in the Google Colab environment or to set up your own virtualenv.

Google Colab

The following notebooks have been provided if you prefer to use Google Colab:

  • single_player_search.ipynb: scrape biodata and basic (per game, total) and advanced (per-minute, per-possession, per-play, shooting, salary) season-wide statistics for single players, as well as gamelogs for that player
  • basketball-reference-scraper.ipynb: scrape the same data as single_player_search.ipynb but for all players in the entire database
    • Note: because this issues tens of thousands of HTTP requests, it takes many hours to complete, but the data can be saved so that you only have to run it once

Virtualenv

Follow The Hitchhiker's Guide to Python to set up a virtual environment. Most of the packages used ship with Python 3, but you may need to install the following with pip. The examples below also rely on pandas and NumPy, which are third-party packages.

BeautifulSoup: pip3 install bs4

tqdm: pip3 install tqdm

Scraping

Single Player Function Calls

kb_meta, kb_data, kb_gamelogs = single_player_scraper('Kobe Bryant')

Pandas DataFrames

kb_meta

Biodata for Kobe Bryant

kb_data

Season data for Kobe Bryant

kb_gamelogs

Gamelogs for Kobe Bryant

Entire Database Scraping Calls

This only needs to be run once.

ROOT = '<set/path/to/repo>'  # replace with the path to your local copy of the repo
df_players_meta, df_players_data, df_players_gamelogs = main(ROOT)

After scraping, you can reload the saved data in later sessions by unpickling the DataFrames:

players_df_meta = pickle_load(DATA_PATH+'players_df_meta.pkl')

players_df_data = pickle_load(DATA_PATH+'players_df_data.pkl')

players_df_gamelogs = pickle_load(DATA_PATH+'players_df_gamelogs.pkl')
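The scrape-once, reload-later workflow can be wrapped in a small cache helper. The sketch below is not the repository's `pickle_load`; `load_or_build` and the stand-in `fake_scrape` are hypothetical names introduced for illustration, and the helper uses only the standard-library `pickle` module.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_build(path, build):
    """Return the object pickled at `path`; build and save it first if missing.

    `build` is any zero-argument callable (a hypothetical stand-in for the
    slow scraping step).
    """
    path = Path(path)
    if path.exists():
        # Cache hit: skip the expensive step entirely.
        with path.open('rb') as f:
            return pickle.load(f)
    result = build()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open('wb') as f:
        pickle.dump(result, f)
    return result


calls = []


def fake_scrape():
    # Stand-in for the hours-long scrape; records each invocation.
    calls.append(1)
    return {'table': 'meta', 'rows': 3}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / 'players_df_meta.pkl'
    first = load_or_build(cache_file, fake_scrape)   # runs fake_scrape
    second = load_or_build(cache_file, fake_scrape)  # reads the pickle instead
```

The same pattern works for pandas DataFrames, since they can be pickled. Only unpickle files you created yourself, because loading an untrusted pickle can execute arbitrary code.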

Example Analyses

Biodata — Filter for Largest Recorded Players

# Player Meta Query
df_large = df_players_meta.loc[(df_players_meta['height']>80) & 
                   (df_players_meta['weight']>30)]

df_large = df_large.dropna(how='all', axis='columns') # drops all columns that are empty
display(df_large[df_large['weight'] == df_large['weight'].max()])
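To see what the query does step by step, here is a minimal sketch on a made-up table. The player names and measurements are invented, and treating `height` as inches and `weight` as pounds is an assumption; the real `df_players_meta` may differ.

```python
import numpy as np
import pandas as pd

# Invented rows for illustration only; not real player data.
toy_meta = pd.DataFrame({
    'player': ['Player A', 'Player B', 'Player C'],
    'height': [78, 85, 88],                 # assumed inches
    'weight': [200, 290, 310],              # assumed pounds
    'nickname': [np.nan, np.nan, np.nan],   # entirely empty column
})

# Keep rows above both thresholds (same shape as the query above).
toy_large = toy_meta.loc[(toy_meta['height'] > 80) & (toy_meta['weight'] > 30)]

# Drop columns in which every value is missing.
toy_large = toy_large.dropna(how='all', axis='columns')

# Row(s) tied for the maximum weight among the filtered players.
heaviest = toy_large[toy_large['weight'] == toy_large['weight'].max()]
print(heaviest)
```

Comparing against `.max()` returns every row tied for the maximum, not just the first one.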

Season Data — Find Specific Game Stat Maxes

Select any of the following data types for different table fields:

  • Per Game (per_game)
  • Totals (totals)
  • Advanced
  • Per Minute
  • Per Possession
  • Adjusted Shooting
  • Play-By-Play
  • Shooting
  • All-Star
  • Salaries

The following example filters for Per Game statistics:

import numpy as np

table_selected = players_df_data[players_df_data['data_type'] == 'per_game']

# Row(s) with the highest points per game, ignoring missing values
table_selected[table_selected['pts_per_g'] == np.nanmax(table_selected['pts_per_g'])]
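The two-step pattern (select a table by `data_type`, then match the maximum) can be tried on a small invented table. The rows below are made up; only the `data_type` and `pts_per_g` column names are taken from the example above.

```python
import numpy as np
import pandas as pd

# Invented season rows; 'data_type' mimics the table selector used above.
toy_data = pd.DataFrame({
    'player': ['Player A', 'Player A', 'Player B', 'Player C'],
    'data_type': ['per_game', 'totals', 'per_game', 'per_game'],
    'pts_per_g': [21.5, np.nan, 30.1, np.nan],
})

# Step 1: keep only the Per Game rows.
per_game = toy_data[toy_data['data_type'] == 'per_game']

# Step 2: np.nanmax ignores missing values when computing the maximum.
best = per_game[per_game['pts_per_g'] == np.nanmax(per_game['pts_per_g'])]
print(best[['player', 'pts_per_g']])
```

Rows with a missing `pts_per_g` never match the comparison, so they drop out of the result.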

Gamelogs — Find Specific Game Stat Maxes

players_df_gamelogs[players_df_gamelogs['pts']==np.nanmax(players_df_gamelogs['pts'])]
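The equality filter returns only the row(s) tied for the single highest value. If a ranked list is more useful, pandas' `DataFrame.nlargest` is one alternative. The sketch below uses invented gamelog rows, and the `date` column is a hypothetical name for illustration.

```python
import numpy as np
import pandas as pd

# Invented gamelog rows for illustration only.
toy_logs = pd.DataFrame({
    'player': ['Player A', 'Player B', 'Player C', 'Player D'],
    'date': ['2001-01-01', '2001-01-02', '2001-01-03', '2001-01-04'],
    'pts': [41, np.nan, 55, 48],
})

# The two highest-scoring games, sorted from highest to lowest.
top2 = toy_logs.nlargest(2, 'pts')
print(top2)
```

Swapping the column name (for example, to a rebounds or assists column, if present) ranks games by a different stat.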

Acknowledgements

This project was built entirely by Rahim Hashim and Aunoy Poddar. None of this could have been done without the tireless and comprehensive effort of those who work at Basketball Reference, who provide an open-source, API-friendly database containing millions of datapoints on which the entirety of this codebase is built.
