The NBA is one of the four major professional sports leagues in North America. It is very competitive, and every team wants to attract the best players. Imagine that your team is trying to sign a new player: how do you compare him with other players, and how much should you offer to get the best deal?
This project aims to help answer these questions. More precisely, we carry out an in-depth statistical analysis of NBA players and build machine learning models to predict a player's salary. As expected, a player's salary depends on many factors, such as his position, his performance, his team, and many more.
We will use a public dataset to train and test our models. It can be downloaded from this website
This file contains an extensive list of variables. Here are some that we find interesting (please see the file for a complete list of all variables; they are also described on Wikipedia):
3P: Average number of shots made from beyond the 3-point line.
FT: Average number of Free Throws made per game.
AST: Average number of Assists per game.
PF: Average number of Personal Fouls per game.
We discovered several interesting facts about NBA statistics. For example, even though there are more than 40 types of statistics, our data analysis shows that the following stats strongly influence a player's salary:
PER: Player Efficiency Rating.
FG: Field Goals per game.
2P: Average number of two-point shots made per game.
FT: Free Throws.
PTS: Points scored per game.
WS: Win Shares.
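One simple way to check which stats move together with salary is to rank them by absolute Pearson correlation. The sketch below is only an illustration of that idea, not the analysis from the notebook: the function name `rank_by_salary_correlation`, the column name `Salary`, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def rank_by_salary_correlation(df: pd.DataFrame, target: str = "Salary") -> pd.Series:
    """Rank numeric columns by absolute Pearson correlation with `target`."""
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    return corr.abs().sort_values(ascending=False)


# Synthetic stand-in for the real dataset (all values are made up).
rng = np.random.default_rng(0)
n = 200
pts = rng.uniform(2, 30, n)
demo = pd.DataFrame({
    "PTS": pts,                      # points per game
    "PF": rng.uniform(0, 5, n),      # personal fouls, unrelated to salary here
    "Salary": 500_000 * pts + rng.normal(0, 1_000_000, n),
})

ranking = rank_by_salary_correlation(demo)
print(ranking)
```

On the real data, the same function could be applied to the loaded table, provided the salary column is named accordingly. Correlation only captures linear association, so it is a first screen rather than a measure of influence.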
Our best-performing models are the gradient boosting model and the random forest model.
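For readers who want to see what fitting these two model families looks like, here is a minimal sketch using scikit-learn regressors. It assumes salary prediction is treated as a regression problem; the synthetic features, the hyperparameters, and the use of R^2 as the score are illustrative choices, not the settings used in the notebook.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (all values are made up).
rng = np.random.default_rng(42)
n = 300
X = np.column_stack([
    rng.uniform(2, 30, n),   # PTS: points per game
    rng.uniform(0, 10, n),   # WS: win shares
    rng.uniform(0, 5, n),    # PF: personal fouls
])
y = 400_000 * X[:, 0] + 1_000_000 * X[:, 1] + rng.normal(0, 1_000_000, n)

# Hold out a quarter of the players to estimate generalization.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

models = {
    "gradient_boosting": GradientBoostingRegressor(random_state=42),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

print(scores)
```

Both models are tree ensembles, so they need no feature scaling, which makes them convenient baselines for tabular data like player statistics.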
We refer readers to the Jupyter Notebook for more detailed conclusions.
I am thankful to Briana Konnick for helpful and constructive feedback on this project.
[1] Sebastian Raschka, Vahid Mirjalili, Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow, 2nd Edition.
[2] Aurelien Geron, Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems.