How the Final Score Can Lie: Rating “Score Control”

Alok Pattani
Analyzing NCAA Basketball with GCP
Mar 21, 2019 · 8 min read

As a college basketball fan, if you miss watching a particular game involving your favorite team, the main thing you’ll look for afterward is the final score — who won and by how much? But if you actually watch enough games from start to finish, you know the final score does not always tell the whole story.

Let’s take a couple Villanova games from this season as examples:

A 6-point win where they played well enough to win and a 6-point loss where they must've played worse, right? That's what you might conclude if you only saw the final score of each. But if you followed the path of both games, you'd understand how differently the two games arrived at those final 6-point margins.

Let’s look at a plot of the score throughout the game (time on x-axis, each team’s points on y-axis) for each game, starting with the Providence game.

Despite the relatively close final margin, Villanova actually controlled this game from start to finish, leading for 99% of the game, by double digits for nearly 80% of the game, and by as much as 21. Yes, the Friars did make it close at the end, but they never really threatened to win.

Now, let’s look at the same plot for the St. John’s game:

St. John’s won, but this was clearly a come-from-behind win (or in Villanova’s case, a fall-from-ahead loss). From the Wildcats’ perspective, this is actually more similar to the Providence game than the final score shows — they led for more than 80% of the game, had a lead as big as 19 — until the end, of course.

The key takeaway here is that while the final score ultimately matters more from a results standpoint, looking at the score evolution throughout a game can give us more information about a team’s performance. Following this premise, we wanted to build a team metric that measured how well a team controlled the score of its games over the course of the season: what we’ve aptly named “Score Control.”

In the course of building this metric, we used:

  • our NCAA play-by-play data in BigQuery, building off the infrastructure described here
  • Colab for more statistical analysis and interactive visualization
  • Data Studio to help build some interactive reports

Each of these helped generate many unique insights along the way.

We start with our play-by-play data, detailing the key events on every play of every D-1 game this season. From there, we do a few things:

  • Isolate only scoring plays
  • Get some pre- and post-play running score fields — sometimes we want the score from before the play, other times after
  • Get the time between each scoring play, using navigational functions LAG and LEAD over the time-ordered scoring plays within each game

A large chunk of the query we use for this purpose is shown below, for reference.

Part of Code to Generate Play-by-Play Scoring Log in BigQuery

You’ll notice we don’t do any aggregation here. That’s intentional: this is a preliminary step toward our goal, and it also helps with other “score over time”-type analysis (e.g. it also tags each tie and lead change within a game). We store the results from this query in a view (a virtual table defined by a SQL query) that we call “pbp_scoring_log.” This log of ordered scoring plays ends up being pretty useful for a couple of different calculations.
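The LAG/LEAD sequencing described above can be sketched with Python's built-in sqlite3 module standing in for BigQuery (SQLite has supported the same LAG and LEAD window functions since version 3.25). The table, columns, and plays below are invented for illustration; they are not the actual schema behind our play-by-play data.

```python
# Minimal sketch of the scoring-log idea: running score plus time between
# scoring plays via LAG/LEAD. Hypothetical schema, not the real one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plays (
    game_id TEXT, elapsed_secs INTEGER, team TEXT, points INTEGER
);
INSERT INTO plays VALUES
    ('g1',  30, 'A', 2),
    ('g1',  75, 'B', 3),
    ('g1', 140, 'A', 2),
    ('g1', 200, 'A', 3);
""")

rows = conn.execute("""
SELECT
    game_id,
    elapsed_secs,
    team,
    points,
    -- running score after this play
    SUM(CASE WHEN team = 'A' THEN points ELSE 0 END) OVER w AS team_a_score,
    SUM(CASE WHEN team = 'B' THEN points ELSE 0 END) OVER w AS team_b_score,
    -- time since the previous scoring play and until the next one
    elapsed_secs - LAG(elapsed_secs)  OVER w AS secs_since_prev_score,
    LEAD(elapsed_secs) OVER w - elapsed_secs AS secs_until_next_score
FROM plays
WINDOW w AS (PARTITION BY game_id ORDER BY elapsed_secs)
ORDER BY game_id, elapsed_secs
""").fetchall()

for r in rows:
    print(r)
```

The same partition-and-order window drives both the running totals and the between-play timing, which is why a single ordered scoring log is such a convenient building block.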

When we actually run the query over our millions of rows of play-by-play data, doing some intricate sequencing across all scoring plays, it finishes in under 30 seconds — that’s the power of BigQuery!

Game Play-by-Play Scoring Log Results

This result still has more than two million rows (across five seasons of play-by-play data), so we know we’ll want to do more aggregation in BigQuery before we take results into Colab or elsewhere for further processing. Fortunately, we can build on this view to get one row per team per scoring play — the result above only has a row for the scoring team, but having a row for each team at each scoring play makes it easier to aggregate up to the team level later on. Once we have one row per team, we can build another view on top of that to aggregate further, to the team-game level.

We’ll skip over some of the details here, but basically we create a view on a view on a view (we really like views!), ending up with game-level results that look something like this:

Team Game-Level Play-by-Play Aggregation Results

For each team in each game, we have a variety of information about how the score evolved from its perspective: biggest lead, largest deficit, average point differential throughout the game, and percentage of game time spent leading and trailing. Once again, BigQuery processed 1 GB of data and returned results in less than 15 seconds.
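To make the aggregation concrete, here is a small pure-Python sketch (not the actual SQL behind our views) that computes the same kinds of team-game fields from one team's running point differential. The timestamps and differentials are toy numbers, and the field names only mirror the real ones.

```python
# Aggregate one team's score evolution to game-level summary fields.
# Input: the point differential (team minus opponent) each time it changed.

GAME_SECS = 40 * 60  # regulation college game, no overtime

# (seconds elapsed when the differential changed, new differential)
score_changes = [(0, 0), (30, 2), (75, -1), (140, 1), (200, 4)]

def team_game_summary(changes, total_secs=GAME_SECS):
    lead_secs = trail_secs = 0
    weighted_diff = 0.0
    # Pair each differential with the moment the next one takes over,
    # treating the last differential as held until the final buzzer.
    for (t, diff), (t_next, _) in zip(changes, changes[1:] + [(total_secs, 0)]):
        held = t_next - t  # seconds this differential was on the scoreboard
        weighted_diff += diff * held
        if diff > 0:
            lead_secs += held
        elif diff < 0:
            trail_secs += held
    diffs = [d for _, d in changes]
    return {
        "biggest_lead": max(diffs),
        "biggest_deficit": -min(diffs),
        "avg_pt_diff": weighted_diff / total_secs,   # time-weighted mean
        "pct_time_leading": lead_secs / total_secs,
        "pct_time_trailing": trail_secs / total_secs,
    }

print(team_game_summary(score_changes))
```

The time weighting is the important detail: a differential that sits on the scoreboard for ten minutes counts ten times as much as one that lasts a minute, which is exactly what makes the average in-game differential different from the final margin.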

Since those stats are useful in and of themselves, we’ve shared them for all 2018–19 games on this page of our public-facing March Madness dashboard. You can sort, filter, and play around to find out:

  • Who faced the largest deficit to come back and win this season? (hint: Duke’s stunning 23-point comeback at Louisville was among the biggest deficits overcome, but not #1!)
  • How much time did Wofford spend leading in each of its games?
  • In how many games did Gonzaga trail for more than 50% of game time?

But back to our Score Control metric. We are interested in the average in-game point difference (the “AvgPtDiff” field), which summarizes a team’s scoring margin throughout the game. To study this further and build the team season-level metric, we read the team game-level data into Colab. Using plotly in Colab, we generated the following plot to see whether this average in-game point margin tells us anything different from the final point margin at the game level.

There’s a strong correlation between final margin and in-game margin (as we’d expect), but still a good deal of variation across games with the same final margin. Let’s zoom in further on the plot above to look only at games where a team won by 1–9 points.

For these final score margins, we can see average in-game margins ranging from as low as -10 to more than +15. As we saw with the Villanova example earlier, not all 6-point wins are created equal!

From this game-level metric, we can aggregate to get the average in-game point differential for a team across all its games in the season. You can see that on the season-level in-game score metrics page of our dashboard, with a screenshot of the top 10 teams below (numbers through Tuesday’s games).

Top 10 Teams in Average In-Game Point Differential

Gonzaga is dominant, nearly four points ahead of every other team! The Bulldogs have spent more than half their season leading by double digits, and have held a lead of 20+ points at some point in two-thirds of their games. The teams ranked behind them are interesting: a mix of other top NCAA Tournament seeds (like Duke and Virginia) and some teams that dominated their lower-level conferences (including Lipscomb, who didn’t even make the tournament).

If you followed our work from last week, the next step based on this initial list might be predictable: schedule adjustment. While Lipscomb may have a higher average in-game point differential than Duke, it’s clear statistically — and from a basketball standpoint — that the Blue Devils played a much tougher schedule than the Bisons. Comparing their raw numbers can be a bit misleading if used for rating the teams from an overall quality standpoint.

So to get to our final Score Control metric, we use the game-level average in-game point differential data, and adjust for opponent and site using ridge regression (see the post linked above for much more detail). We’ve shared these ratings and rankings on the BracketIQ Metrics page of our public-facing dashboard (along with our other BracketIQ metrics, some of which we’ll explain in more depth during the tournament). Here’s the top 10 (through Tuesday):

Top 10 Teams in Score Control

Gonzaga is still #1 in the adjusted version of our metric, but the distance to other teams is much smaller. We see Duke, Virginia, and Michigan all moving up from their spots on the previous list, and they’re now joined by some of the more successful programs (and top NCAA Tournament seeds) rather than mid-majors who dominated smaller conferences. In fact, the top seven teams in Score Control are all 1- or 2-seeds. While not the be-all and end-all, this certainly makes us feel better about the metric’s ability to measure overall team quality.

But still, can we learn something from Score Control that we can’t otherwise learn from just looking at a team’s average final point margin? To study this, we plotted those two quantities for the 68 teams that qualified for the 2019 NCAA Tournament in the following interactive plot. The diagonal line is the “expected” Score Control given a team’s average final point margin — basically, where every team would lie if the two metrics rated them nearly identically.

We can see some teams pretty far off the line, which suggests that looking at in-game point margin and adjusting for schedule can make a big difference. As an example, VCU and UCF — two teams that face off in the Round of 64 on Friday — see their rankings improve more than 10 spots when going from average point margin to Score Control. On the other hand, both New Mexico State and Liberty rank in the top 15 when only looking at final scores, but are dozens of spots lower when looking at the evolution of their scores within games and accounting for schedule.

Now that we’ve made the case for the importance of following the score throughout games in rating team quality, hopefully it’ll help frame the many, many hours of college basketball you’ll watch in the next few days. Come back to this dashboard throughout the tournament to see Score Control and other in-game scoring metrics. Enjoy the Madness!
