HDSC Winter ’22 Premiere Project Presentation: Machine Learning and Professional Basketball: A Look into the NBA Playoffs Tournament
A project by Team Adaboost
Basketball is a popular sport played between two teams, each aiming to shoot the ball through the opponent’s goal (an elevated hoop and net called the “basket”). The team with the most points at the end of the final quarter wins the game.
The National Basketball Association (NBA), based in North America, is the world’s leading professional basketball league in terms of popularity, wages, talent, and level of competition.
The NBA playoffs are a best-of-seven elimination tournament between sixteen teams, eight each from the Eastern and Western Conferences, with the final four teams advancing to the conference finals. The winners of the Western and Eastern Conference finals then compete for the NBA championship.
Objective
This project aims to identify the regular-season metrics that contribute most to a team’s chances of making the NBA playoffs finals.
Data Description
The dataset was sourced from Kaggle.
There are 1536 rows and 60 columns in the dataset.
Most of the columns are numeric.
Most columns have few missing values.
The playoff data are the exception, with values missing in almost 50 percent of rows. This is reasonable because only 16 teams participate in the playoffs each season.
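The missing-value check described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the rows and column names (`team`, `league`, `won`, `playoff`) are hypothetical stand-ins, not the actual Kaggle schema.

```python
# Hypothetical rows standing in for the Kaggle table.
# None marks a missing value (e.g. a team that did not reach the playoffs).
rows = [
    {"team": "A", "league": "NBA", "won": 50, "playoff": "F"},
    {"team": "B", "league": "NBA", "won": 30, "playoff": None},
    {"team": "C", "league": "ABA", "won": 41, "playoff": "C1"},
    {"team": "D", "league": "NBA", "won": 22, "playoff": None},
]


def missing_share(rows):
    """Return the fraction of missing (None) values for each column."""
    if not rows:
        return {}
    n = len(rows)
    return {col: sum(r[col] is None for r in rows) / n for col in rows[0]}


shares = missing_share(rows)
for col, share in shares.items():
    print(f"{col}: {share:.0%} missing")
```

If the data is loaded into a pandas DataFrame `df`, the same per-column shares are given by `df.isna().mean()`.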
Definition of Terms
Field Goal: any successful shot other than a free throw. It is worth two or three points, depending on the distance.
Free Throw: an unhindered shot awarded because of a foul by an opponent. It is worth one point.
Trey: a field goal made from beyond the three-point line. It is worth three points.
Data Exploration
The dataset contains data from six different leagues, with the NBA accounting for the most records.
The Boston Celtics and the Los Angeles Lakers have won the NBA title 17 and 16 times respectively.
Up to 2011, the Western Conference had produced the most NBA champions (21).
While field goal attempts have declined, shooting accuracy has generally improved over time.
Free throws were awarded most often between 1950 and 1970. Their number has dropped significantly in recent years, while the conversion rate has risen slightly.
The number of trey attempts per game has increased markedly over the years, and the conversion rate has also improved. However, there are no trey attempts recorded before 1980. This is most likely not an entry error: the NBA only introduced the three-point line in the 1979-80 season.
Personal fouls have declined over time, which helps explain why the number of free throws has fallen over the years.
So far, we can observe that the play style in the NBA has evolved over time. Teams have been shooting more accurately from the field, committing fewer fouls, and improving their free throw conversion rate.
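The trends above rest on conversion rates, that is, shots made divided by shots attempted, tracked over time. The sketch below shows one way to compute such a rate per decade. The season totals are invented for illustration and are not taken from the dataset.

```python
from collections import defaultdict

# Hypothetical season totals: (year, shots made, shots attempted).
# The 1975 row has zero attempts, as happens for treys before 1980.
seasons = [
    (1960, 300, 1000),
    (1965, 350, 1000),
    (1975, 0, 0),
    (2005, 450, 1000),
    (2010, 470, 1000),
]


def conversion_by_decade(seasons):
    """Return {decade: made / attempted}, skipping decades with no attempts."""
    made = defaultdict(int)
    attempted = defaultdict(int)
    for year, m, a in seasons:
        decade = year // 10 * 10
        made[decade] += m
        attempted[decade] += a
    return {d: made[d] / attempted[d] for d in sorted(made) if attempted[d] > 0}


rates = conversion_by_decade(seasons)
for decade, rate in rates.items():
    print(f"{decade}s: {rate:.1%}")
```

Summing made and attempted shots before dividing weights each season by its shot volume, and skipping decades with zero attempts avoids a division by zero for statistics that did not yet exist.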
Model Building
After subsetting the NBA playoffs data, the target classes were imbalanced, with only about 20 percent positive instances. Training on this data gave poor performance on the positive class. This is expected, as the models lean heavily toward the majority class.
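To see why imbalance hurts the positive class, consider a baseline that always predicts the majority class. The sketch below uses made-up labels with a 20/80 split matching the proportion described above. It is an illustration of the effect, not the project's evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Made-up labels: 20 positives and 80 negatives.
y_true = [1] * 20 + [0] * 80
# A baseline that always predicts the majority (negative) class.
y_pred = [0] * 100

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall(y_true, y_pred)
print(f"accuracy={baseline_accuracy:.2f}, recall={baseline_recall:.2f}")
```

The baseline scores 80 percent accuracy while finding none of the positive instances, which is why per-class metrics such as recall are more informative than accuracy here.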
The majority class was undersampled to a quarter of its original size, which roughly balances the two classes, and the models were retrained. This gave a marked improvement on the positive class at the cost of some performance on the negative class.
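Random undersampling of this kind can be sketched as follows. The function name, the `keep_fraction` parameter, and the toy data are all hypothetical, and reading "a quarter" as keeping 25 percent of the majority rows is an assumption. The repository may implement this step differently.

```python
import random


def undersample_majority(X, y, keep_fraction=0.25, majority=0, seed=42):
    """Keep a random fraction of majority-class rows and all other rows."""
    rng = random.Random(seed)  # seeded for reproducibility
    majority_idx = [i for i, label in enumerate(y) if label == majority]
    minority_idx = [i for i, label in enumerate(y) if label != majority]
    # Number of majority rows to keep, never more than are available.
    n_keep = min(len(majority_idx), max(1, round(len(majority_idx) * keep_fraction)))
    kept = rng.sample(majority_idx, n_keep)
    idx = sorted(kept + minority_idx)  # preserve the original row order
    return [X[i] for i in idx], [y[i] for i in idx]


# Toy data: 100 rows, 20 positive and 80 negative.
X = [[i] for i in range(100)]
y = [1] * 20 + [0] * 80

X_res, y_res = undersample_majority(X, y)
print(len(y_res), sum(y_res))
```

Undersampling is normally applied to the training split only, so that the test set keeps the original class proportions and the reported metrics reflect real conditions.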
New features were also engineered, but this did not greatly improve the performance of the models.
The feature importance plot of the best performing model (XGBoost) shows that win% is a very important feature. We can also see that a handful of the engineered features are informative.
Link to repo: https://github.com/Abdulbaasit95/AdaBoost/tree/main/model_building
Conclusion
While there may be other indicators that represent a team’s form, the regular season victories, losses, and opponents’ performance are useful in projecting how far a basketball team can go in the NBA playoffs.
Recommendation
Bet simulators and bookies can benefit from the accuracy, speed and efficiency of this kind of model. This can also be useful to NBA teams in preparing for the playoffs tournament.