HDSC Winter ’22 Premiere Project Presentation: Machine Learning and Professional Basketball - A look into the NBA playoffs tournament

HamoyeHQ
Published in Hamoye Blog · 4 min read · Mar 10, 2022

A project by Team Adaboost

Basketball game opponents trying to keep the game in their respective team’s favour

Basketball is a popular sport played between two teams whose main aim is to shoot the ball through the opponent’s goal (an elevated net called the “basket”). The team with the most points at the end of the final quarter wins the game.

The American National Basketball Association (NBA) is the world’s leading professional basketball league in terms of popularity, wages, talent, and level of competition.

The NBA playoffs are a best-of-seven elimination tournament between sixteen teams, eight each from the Eastern and Western Conferences, with the final four teams advancing to the conference finals. The winners of the two conference finals then compete for the NBA championship.

Objective

This project aims to identify the regular-season metrics that contribute most to a team’s chances of reaching the NBA playoffs finals.

Data Description

The dataset was sourced from Kaggle

There are 1536 rows and 60 columns in the dataset

Most of the columns are numeric

Few values are missing overall.

The exception is the playoffs field, where almost 50 percent of the values are missing. This is reasonable because only 16 teams participate in the playoffs.
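As a rough illustration of how the share of missing values per column can be checked, here is a minimal sketch in plain Python. The rows, values, and column names (`team`, `won`, `playoff`) are hypothetical stand-ins, not taken from the Kaggle dataset.

```python
def missing_share(rows, columns):
    """Return the fraction of missing (None) values for each column."""
    total = len(rows)
    return {
        col: sum(1 for row in rows if row.get(col) is None) / total
        for col in columns
    }

# Toy rows with made-up values; the column names are hypothetical.
rows = [
    {"team": "A", "won": 50, "playoff": "lost final"},
    {"team": "B", "won": 30, "playoff": None},
    {"team": "C", "won": 45, "playoff": "won title"},
    {"team": "D", "won": 20, "playoff": None},
]

shares = missing_share(rows, ["team", "won", "playoff"])
print(shares)
```

In this toy example, half of the `playoff` entries are missing simply because those teams did not qualify, which mirrors the pattern described above.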

Definition of Terms

Field Goal: any successful shot other than a free throw. It is worth two or three points depending on the distance.

Free Throw: an unhindered shot awarded because of a foul by an opponent. It is worth one point.

Trey: a field goal made from outside the three-point line. It is worth three points.

A pictorial description of the flow of work from start to finish

Data Exploration

The dataset contains data from 6 different leagues with NBA data being the most recorded.

The Boston Celtics and the Los Angeles Lakers have won the NBA title 17 and 16 times respectively.

Up until 2011, the Western Conference had produced the most NBA champions (21).

While attempts at goal have fallen, shot accuracy has generally improved over time.

Free throws were awarded most often between 1950 and 1970. The number awarded has fallen significantly in recent years, while the conversion rate has slightly increased.

There has been a marked increase in the number of trey attempts per game over the years, and the conversion rate has also improved. There are no trey attempts before 1980. This is probably not an entry error, since it is consistent with the NBA adopting the three-point line in the 1979-80 season.

Personal fouls have declined over time, which likely explains why the number of free throws has fallen over the years.

So far, we can observe that the play style in the NBA has evolved over time. Teams have become more accurate from the field, committed fewer fouls, and improved their free throw conversion rate.
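The conversion rates discussed in this section are simply shots made divided by shots attempted. The sketch below shows that arithmetic for two seasons. The season totals and key names are invented for illustration and do not come from the dataset. Seasons with zero attempts return `None` rather than a misleading 0 percent.

```python
def conversion_rate(made, attempted):
    """Fraction of attempts converted; None when there were no attempts."""
    return made / attempted if attempted else None

# Made-up season totals for illustration only; key names are hypothetical.
seasons = {
    1975: {"fg_made": 3500, "fg_att": 8000, "tri_made": 0, "tri_att": 0},
    2005: {"fg_made": 3000, "fg_att": 6500, "tri_made": 450, "tri_att": 1250},
}

rates = {
    year: {
        "fg_pct": conversion_rate(s["fg_made"], s["fg_att"]),
        "three_pct": conversion_rate(s["tri_made"], s["tri_att"]),
    }
    for year, s in seasons.items()
}
print(rates)
```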

Model Building

After sub-setting the NBA playoffs data, the target classes were imbalanced, with only about 20 percent positive instances. Training on this data gave poor performance on the positive class. This is expected, as the models lean heavily toward the majority class.
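To see why overall accuracy can hide this problem, consider a toy example with the same 20/80 class split and a degenerate model that always predicts the majority class. This is an illustrative sketch, not the team's evaluation code. The labels are synthetic and the helper functions are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def recall(y_true, y_pred, label):
    """Share of the true `label` instances that were predicted as `label`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(1 for p in relevant if p == label) / len(relevant)

# Synthetic labels: 20 percent positive (1), 80 percent negative (0).
y_true = [1] * 20 + [0] * 80
# A degenerate model that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
pos_recall = recall(y_true, y_pred, 1)
neg_recall = recall(y_true, y_pred, 0)
print(acc, pos_recall, neg_recall)
```

The model scores 80 percent accuracy while finding none of the positive instances, which is why per-class metrics are the more informative view for this kind of target.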

The majority instances were undersampled to a quarter of their original number and the models were retrained. This resulted in a marked improvement on the positive class, at the cost of some performance on the negative class.
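A minimal sketch of random undersampling is shown below, assuming that "a quarter" means a random quarter of the majority rows is kept. The function name, the `target` key, and the synthetic rows are hypothetical. The team's actual implementation is in the linked repository and may differ.

```python
import random

def undersample_majority(rows, label_key, majority_label, keep_fraction, seed=0):
    """Keep every minority row and a random `keep_fraction` of majority rows."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    majority = [r for r in rows if r[label_key] == majority_label]
    minority = [r for r in rows if r[label_key] != majority_label]
    n_keep = int(len(majority) * keep_fraction)
    sampled = rng.sample(majority, n_keep)
    balanced = minority + sampled
    rng.shuffle(balanced)  # avoid leaving the rows grouped by class
    return balanced

# Synthetic data: 20 positive rows (target=1) and 80 negative rows (target=0).
rows = [{"id": i, "target": 1 if i < 20 else 0} for i in range(100)]

balanced = undersample_majority(rows, "target", 0, keep_fraction=0.25)
positives = sum(r["target"] for r in balanced)
print(len(balanced), positives)
```

With a 20/80 split, keeping a quarter of the majority rows leaves the two classes roughly equal in size. Undersampling is normally applied to the training split only, so that evaluation still reflects the original class balance.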

New features were also engineered, but they did not greatly improve the performance of the models.

The feature importance plot of the best performing model (XGBoost) shows that win% is a very important feature. We can also see that a handful of the engineered features are informative.

Link to repo: https://github.com/Abdulbaasit95/AdaBoost/tree/main/model_building

Conclusion

While there may be other indicators that represent a team’s form, the regular season victories, losses, and opponents’ performance are useful in projecting how far a basketball team can go in the NBA playoffs.

Recommendation

Betting simulators and bookmakers can benefit from the accuracy, speed, and efficiency of this kind of model. It can also be useful to NBA teams preparing for the playoffs tournament.

