
amy-panda / nba_career_prediction


Predicting whether an NBA rookie will last at least 5 years in the league

Python 0.06% Jupyter Notebook 9.96% Dockerfile 0.01% HTML 89.98%
crisp-dm imbalanced-data imputation-methods feature-engineering classification-algorithims hyperparameters-tuning


NBA Career Prediction

Following the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology, this project processed the data and developed several classification models to predict whether a rookie will remain in the NBA for at least five years. The models were Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest, AdaBoost, and XGBoost. Logistic Regression and XGBoost performed best, judged on ROC-AUC scores and confusion matrices.
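To make the two evaluation metrics concrete, here is a minimal pure-Python sketch of what they measure. It is not the project's code: the labels and scores are made up, and the helper names `confusion_counts` and `roc_auc` are hypothetical. In practice scikit-learn's `confusion_matrix` and `roc_auc_score` compute the same quantities.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels in {0, 1}."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = lasted at least five years) and predicted probabilities.
y_true = [1, 1, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.35, 0.6, 0.7, 0.2, 0.55, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}  ROC-AUC={auc:.3f}")
```

The confusion matrix depends on the chosen threshold, while ROC-AUC depends only on how the scores rank positives against negatives, which is why the two are usually reported together.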

🤝 Contributors

  • Amy Yang
  • Chanthru Vimalasri
  • Yatindra Vegunta

🗼 Project Organization

├── README.md          <- README file with project details.
│
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- Including training and validation sets.
│   └── raw            <- Including 2022_train.csv and 2022_test.csv files.
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Including the data preprocessing and two best models.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
│
└── src                <- Source code for use in this project.
    ├── __init__.py    <- Makes src a Python module
    │
    ├── data           <- Scripts to download or generate data
    │   └── sets.py
    │
    ├── features       <- Scripts to turn raw data into features for modeling
    │   └── build_features.py
    │
    └── models         <- Scripts to train models and then use trained models to make
        │                 predictions
        ├── null.py
        └── performance.py

Note: The project organisation above is adapted from the cookiecutter data science project template.

🛠 Tools and Techniques

  • Feature engineering
  • Imputation methods such as single imputation using the mean/median, multiple imputation and nearest-neighbour imputation
  • Imbalanced data treatment including oversampling, undersampling, SMOTE and hyperparameter settings
  • Model training with packages including lazypredict and scikit-learn
  • Hyperparameter tuning with random search, grid search and automatic search using the Hyperopt package
  • Model evaluation with ROC-AUC score and Confusion Matrix plot
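As an illustration of two of the techniques listed above, the following sketch shows median imputation followed by a simplified SMOTE-style oversampling step, using only the Python standard library. It is a teaching sketch under stated assumptions, not the project's pipeline: the feature columns, the data, and the helpers `impute_median` and `smote_like` are hypothetical. Real projects would typically use scikit-learn's `SimpleImputer` or `KNNImputer` and the `SMOTE` class from imbalanced-learn.

```python
import random
import statistics


def impute_median(rows):
    """Replace None in each column with the median of that column's observed values."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        medians.append(statistics.median(observed))
    return [[medians[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points (needs at least two minority rows).

    Each point is a random interpolation between a minority sample and one
    of its k nearest minority neighbours, which is the core idea of SMOTE.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Made-up rows: [games_played, points_per_game]; None marks a missing value.
X = [[70, 12.1], [55, None], [None, 4.0], [30, 3.2],
     [62, 9.5], [25, 2.8], [80, 15.0]]
y = [1, 1, 0, 0, 1, 0, 1]

X_imputed = impute_median(X)
minority = [row for row, label in zip(X_imputed, y) if label == 0]
n_needed = y.count(1) - y.count(0)  # synthetic rows needed to balance classes

X_balanced = X_imputed + smote_like(minority, n_needed)
y_balanced = y + [0] * n_needed
print(len(X_balanced), y_balanced.count(0), y_balanced.count(1))
```

Resampling like this should be fitted on the training split only, so that synthetic points never leak into the validation data used for ROC-AUC and confusion-matrix evaluation.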

ℹ️ Data Source

Kaggle Competition [UTS AdvDSI 2022-11] NBA Career Prediction
