Welcome to the day before the start of the men’s NCAA March Madness Sweet 16 games, also known as the last work day of this week for basketball junkies — the end of halftime, if you will.
Among the many astounding moments of last weekend, we were inspired by Nevada’s incredible comeback win over Cincinnati after trailing by 22 points with 11 minutes to go, not to mention Nevada rallying from 14 down against Texas to win its first-round game two days earlier. So halftime deficit reduction seems particularly apropos as we dive into the next round of the tournament.
Halftime is significant in college basketball for a variety of reasons. It allows student athletes to get coached up. It enables a coaching staff to review and adjust in-game strategies in response to the first 20 minutes of the game. Halftime also marks when teams start to manage fouls differently, such as when a player sits early in the game with 2 first half personal fouls but then plays more in the second half because their coach saved the “extra” foul for the final 20 minutes of the game (the fifth foul leads to disqualification).
Halftime is a natural break for coaches, players, and teams to evaluate where they are during the course of a 40-minute game, and what steps are necessary either to maintain or alter their overall performance in order to win. It gives coaches space for a good old-fashioned talk with the team, often leaving fans wondering what inspiring (or incendiary) words were exchanged during the break to ignite one team to ultimate victory.
Getting a better sense of how teams deal with leads and deficits coming out of the half fits into our broader close and late prediction modeling, too. Today’s post will explore questions like: what was the biggest losing margin at the half (and did it close or widen)? What was the biggest comeback to win from a halftime deficit? And which team is historically best at protecting halftime leads?
Let’s dig in.
Setting Up the Analysis
Our first step in analysis was getting data on team scores and stats by the half. Fortunately for us, the BigQuery public dataset has data at the game, period, and play-by-play level. For this exercise we chose to use the `bigquery-public-data.ncaa_basketball.mbb_pbp_ncaa` table, which contains every play from every game since 2009. That’s approximately 24M rows.
You might be thinking, why waste your time looking at play-by-play data when you already have team scores by the half in `bigquery-public-data.ncaa_basketball.mbb_teams_periods_ncaa`? While it’s true we have stats by half in that “periods” table, we wanted to be able to ask questions about scores and stats in tighter windows (e.g. 5-minute windows within each half). BigQuery lets you nest views, which makes it easy to roll 5-minute aggregations up into halves, and halves up into full-game stats.
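To make the roll-up idea concrete, here is a minimal Python sketch of the same layering: plays are summed into 5-minute windows, windows into periods, and periods into game totals. It is not the 278-line query from this post. The tuple layout `(period, seconds_elapsed, team, points)`, the function names, and the sample plays are all assumptions made up for illustration, and elapsed time is assumed to be measured from the start of each period.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60  # 5-minute windows; a 20-minute half holds 4 of them


def window_scores(plays):
    """Sum points per (period, window, team).

    Each play is a hypothetical tuple:
    (period, seconds_elapsed_in_period, team, points).
    """
    totals = defaultdict(int)
    for period, elapsed, team, points in plays:
        window = elapsed // WINDOW_SECONDS  # 0-based window index within the period
        totals[(period, window, team)] += points
    return dict(totals)


def period_scores(window_totals):
    """Roll window totals up to (period, team) totals."""
    totals = defaultdict(int)
    for (period, _window, team), points in window_totals.items():
        totals[(period, team)] += points
    return dict(totals)


def game_scores(period_totals):
    """Roll period totals up to full-game totals per team."""
    totals = defaultdict(int)
    for (_period, team), points in period_totals.items():
        totals[team] += points
    return dict(totals)


# Made-up scoring plays for two hypothetical teams, "A" and "B".
plays = [
    (1, 30, "A", 2),
    (1, 400, "B", 3),
    (1, 1100, "A", 2),
    (2, 10, "B", 2),
    (2, 650, "A", 3),
]

by_window = window_scores(plays)
by_period = period_scores(by_window)
by_game = game_scores(by_period)
```

Each layer reads only from the layer below it, which mirrors how a nested view lets the coarser aggregation reuse the finer one instead of rescanning the raw plays.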
Note: the term period refers to the ordinal sequence of the game where period 1 is 1st half, period 2 is 2nd half and then periods 3+ are overtime.
Using the play-by-play table we built, well, a really big query. 278 lines to be exact. The screenshot below is just a snippet.
Biggest Halftime Comeback (Since 2009)
Next, we turned the initial query into a view and built a top-level query on top of it to look at halftime deficits and the final outcome of each game.
For our first query, we wanted to see games in which the team that was losing at the half ended up with a winning score delta at the end of the second half. And the winner is… Drexel coming back to beat Delaware just this past February! They came back from being down by 27 to win by 2.
Tweaking this query, we looked at the biggest halftime deficit/lead. UNC-Asheville had a bit of a rough go back in November 2009 vs. Tennessee: in that game, they were down by 52 at the half and lost by 75.
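The logic behind those two lookups can be sketched in a few lines of Python. This is an illustration of the filtering and sorting, not the actual BigQuery view: the dictionary keys and function names are assumptions, margins are from the listed team's point of view (negative means trailing), and the third row is a hypothetical game added only to give the filter something to reject. The first two rows use the margins quoted above.

```python
def biggest_halftime_comeback(games):
    """Return the winning game with the largest halftime deficit, or None."""
    comebacks = [
        g for g in games
        if g["half_margin"] < 0 and g["final_margin"] > 0  # trailed at half, won
    ]
    return min(comebacks, key=lambda g: g["half_margin"], default=None)


def biggest_halftime_deficit(games):
    """Return the game with the largest halftime deficit, win or lose."""
    return min(games, key=lambda g: g["half_margin"], default=None)


games = [
    {"team": "Drexel", "opponent": "Delaware",
     "half_margin": -27, "final_margin": 2},
    {"team": "UNC-Asheville", "opponent": "Tennessee",
     "half_margin": -52, "final_margin": -75},
    # Hypothetical row, not a real game.
    {"team": "Team A", "opponent": "Team B",
     "half_margin": -6, "final_margin": 4},
]

comeback = biggest_halftime_comeback(games)
deficit = biggest_halftime_deficit(games)
```

The only difference between the two functions is the win filter, which matches the "tweak" described above: drop the requirement that the trailing team won, and the most lopsided half of all rises to the top.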
Sweet 16 — Wins When Losing At the Half
So much for the outlier games. Now let’s look for team trends for the schools in the Sweet 16. The table below shows the number of games each team won when they were losing at the half. You can see that Kentucky has the highest average halftime deficit and still managed to win 5 games.
Sweet 16 — Wins When Leading at the Half
This is as you’d expect. These teams are good. Gonzaga tops the list with the most wins, while Duke holds the largest margins on average: winning by 15 at the half and 23 at the end. Keep in mind though, these are full season stats, so there is a great deal of early season padding here.
Sweet 16 — Losses When Leading At the Half
This list is the most sobering. WVU and Florida State have 10 losses combined when leading at the half by an average of 7 (at least they’re not playing each other, for the moment!).
Note: Can you pick out the missing teams on this list? Three of the teams above did not lose a game when they were winning at the half this season. :-)
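The three Sweet 16 tables above all come from the same grouping: label each game by who led at the half and who won, then count games and average the margins per team. Here is a rough Python sketch of that grouping, assuming each game is a `(team, half_margin, final_margin)` tuple from the team's point of view. The category names, the choice to skip games tied at the half, and the sample rows for a hypothetical "Team A" are all assumptions for illustration, not our actual query or data.

```python
from collections import defaultdict


def halftime_category(half_margin, final_margin):
    """Label a game, e.g. 'wins_when_losing'; None if tied at the half or end."""
    if half_margin == 0 or final_margin == 0:
        return None  # assumption: ties at the half are left out of every table
    state = "leading" if half_margin > 0 else "losing"
    result = "wins" if final_margin > 0 else "losses"
    return f"{result}_when_{state}"


def summarize(games):
    """Count games and average margins per (team, category)."""
    buckets = defaultdict(list)
    for team, half_margin, final_margin in games:
        category = halftime_category(half_margin, final_margin)
        if category is not None:
            buckets[(team, category)].append((half_margin, final_margin))

    summary = {}
    for key, margins in buckets.items():
        count = len(margins)
        summary[key] = {
            "games": count,
            "avg_half_margin": sum(h for h, _ in margins) / count,
            "avg_final_margin": sum(f for _, f in margins) / count,
        }
    return summary


# Made-up season for a hypothetical team.
season = [
    ("Team A", -4, 3),
    ("Team A", -8, 5),
    ("Team A", 10, 20),
    ("Team A", 6, -2),
]

summary = summarize(season)
```

A team that never lost after leading at the half simply has no `losses_when_leading` entry, which is why some schools are missing from the last list.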
Armed with these basic views one could build an entropy score (weighting each team’s opponents) as a means to quantify how good or bad each team is at lead or deficit management, but that is for another post. For now, go forth and write some queries and protect those halftime leads!!!