A game illustration using a dot plot. Each circle is a play; darker ones are more successful plays.

What does a football game look like?

Using data visualization to quickly describe a sporting event.

So, you missed the game. Something came up, and you’d been planning this outing with this friend for a long time, and… anyway, you missed it. You’ll need to find out how the game went afterwards. You could…

  • Find a recording/replay of the game and watch it end to end
  • Ask your friends how the game went
  • Ask your best friend, the Internet, how the game went

Most of this discovery will uncover qualitative impressions: this is usually informative enough, but the worst of it can be conveyed in sentiments like “that [winning] team just wanted it more.” Hmm. While qualitative analysis is valuable (e.g., sports journalism), using data can help avoid biased perspectives and home in on performance. If we’re checking to see how the game went, here’s what that data usually looks like:

The box score from the game we’ll be looking at.

It’s a box score. Get it… it’s like a box. With scores in it. It’s often accompanied by other baseline metrics that give some view of what happened during a game: total yards, first downs, turnovers… the basics. There are outlets for the more statistically-inclined, too. There are some smart and passionate sports stats communities online: for football, check out Football Study Hall and Football Outsiders. There have never been more (and more interesting) ways to talk about performance in sports.

But, basic and advanced alike, these stats are usually aggregated, often at the whole-game level (but sometimes by quarters or halves), so the flow of the game is a bit of a mystery. E.g., in that box score, what was happening during that 0–0 first quarter? And was that 3rd quarter as “dominant” as it looks according to the scores? Here are some of the things we may be looking for when we ask the question, “how did the game go?”:

  • Score (duh). Which team won? By how much? In which quarter(s)?
  • Pacing, momentum. How did the teams accrue their points, yardage, etc.? Big plays (even lucky plays), or sustained success? Were there big swings in the game, or a constant back and forth?
  • Methods. What types of plays did the team(s) run? Run or pass? Which formations? Which were successful?
  • Tone, execution, luck. Did the winning team dominate, or were they barely holding on to their lead? Which team made the most of their opportunities? Which team created the most opportunities?
  • Individual performance. Which players shone? Which didn’t?
  • Expectations. Was it a “normal-looking game” for these teams? Did it follow existing trends? Any surprises?
  • (there are surely more. Let’s pretend that I wrote several more bullets and that they all sounded really smart)

The point is, there are many ways of thinking about how a football game went. Different mediums and methods for doing so will favor certain types of answers… e.g., score and individual performance are commonly emphasized by mass sports media, whereas your friend might give a better sense of the feel of the game: “Well, the score ended up pretty lopsided, but the game was neck and neck for most of the first half. They just couldn’t seem to finish drives.”

The box score is handy for a quick score overview, but I’d like to explore other facets; using data visualizations, we should be able to convey some additional layers without asking too much time or effort from the end user.

For these charts (and the above examples) I chose to use the Alabama vs Michigan State 2015 Cotton Bowl. This was due to a few reasons: one, because the extreme differences in performance should make chart results pop off the page; and, two, because I’m an Alabama fan and this game was a hilarious rout. The data I use is scraped from RollTide.com, then parsed and digested in Google Sheets.

In exploring ways to visualize a game, there are a few things we can do to nuance the data:

Identify more levels of “success”

One extreme, in discussing “success,” is simply identifying and valuing the win (W-L records are rife in mainstream analysis). The other extreme would be something like raw yardage per play. To get to a middle ground, I’ll look to one of Football Outsiders’ efficiency metrics, Success Rate, to talk about which plays were “successful” versus not. Here’s how Success Rate (SR) works for plays on offense:

  • 1st down: a successful play gains 50% of needed yardage
  • 2nd down: a successful play gains 70% of needed yardage
  • 3rd/4th down: a successful play gains 100% of needed yardage
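Here’s a minimal sketch of those rules in Python. The `Play` record and its field names are my own illustration, not the actual layout of the scraped data, and treating a turnover as an automatic failure is a simplifying assumption.

```python
from dataclasses import dataclass

# Percent of the needed yardage a play must gain to count as successful,
# keyed by down (the thresholds listed above).
SUCCESS_THRESHOLD_PCT = {1: 50, 2: 70, 3: 100, 4: 100}


@dataclass
class Play:
    # Hypothetical record layout; the real play-by-play data may differ.
    down: int
    yards_to_go: int
    yards_gained: int
    turnover: bool = False


def is_successful(play: Play) -> bool:
    """Return True if the play meets the Success Rate threshold for its down."""
    if play.turnover:  # assumption: a turnover is never a successful play
        return False
    pct = SUCCESS_THRESHOLD_PCT[play.down]
    # Integer arithmetic avoids floating-point surprises at the threshold.
    return play.yards_gained * 100 >= pct * play.yards_to_go


def success_rate(plays: list[Play]) -> float:
    """Share of plays that were successful (0.0 for an empty list)."""
    if not plays:
        return 0.0
    return sum(is_successful(p) for p in plays) / len(plays)


drive = [
    Play(down=1, yards_to_go=10, yards_gained=5),  # exactly 50%: successful
    Play(down=2, yards_to_go=5, yards_gained=3),   # 60% < 70%: not successful
    Play(down=3, yards_to_go=2, yards_gained=2),   # 100%: successful
    Play(down=1, yards_to_go=10, yards_gained=12, turnover=True),  # turnover
]
print(success_rate(drive))  # 0.5
```

Filtering the play list by team and quarter before calling `success_rate` would give the kind of per-quarter rollup a bar chart needs.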

Obviously, plays that don’t gain yardage, turnovers, etc., are negative plays. Here’s an example of Success Rate in action: the aggregated offensive success rates for both teams, broken down by quarter.

UPDATE: the original version of this post had punts included in success rate calculations. I’ve changed that to be more in line with other stats folks, so the SRs have gone up a bit and many of these charts have been updated.

A bar chart showing team Success Rates per quarter for this game.

Looks like one team had the clear advantage for most of the game (Roll Tide, folks), but interestingly the other team was much closer late in the game: too bad that the 4th quarter ended up being entirely “garbage time,” which is normally removed for success metrics at the season level.

Another success designation I’ll use is Explosiveness. This represents how “big” the plays are that a team successfully completes on offense. E.g., a team that throws long passes may have a lower success rate than a run-first team, but could have a higher explosiveness due to the successful plays accruing a lot of yards per play. There are a few ways to talk about explosiveness: one is iPPP, a metric that expresses average explosiveness as an index aggregated over multiple plays (e.g., an average iPPP of 1.25). Another is to simply mark plays that “are explosive.” The definitions for an explosive play vary across different teams and analysts, but the one I’ll use goes like this: a run of 12+ yards, or a pass of 16+ yards, is considered “explosive.”
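As a rough sketch, that explosive-play rule is a two-line check in Python. The `"run"` / `"pass"` labels and the function name are placeholders of mine, and treating every other play type (punts, kicks, etc.) as non-explosive is an assumption.

```python
# Yardage cutoffs for an "explosive" play, per the definition above.
RUN_EXPLOSIVE_YARDS = 12
PASS_EXPLOSIVE_YARDS = 16


def is_explosive(play_type: str, yards_gained: int) -> bool:
    """Return True for a run of 12+ yards or a pass of 16+ yards."""
    if play_type == "run":
        return yards_gained >= RUN_EXPLOSIVE_YARDS
    if play_type == "pass":
        return yards_gained >= PASS_EXPLOSIVE_YARDS
    # Assumption: anything that isn't a run or a pass is never explosive.
    return False


# Hypothetical (play type, yards gained) pairs, purely for illustration.
plays = [("run", 12), ("run", 11), ("pass", 15), ("pass", 40), ("punt", 45)]
explosive = [p for p in plays if is_explosive(*p)]
print(explosive)  # [('run', 12), ('pass', 40)]
```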

Parse timing into smaller chunks

Again, one extreme of the time dimension is “the whole game,” and the other is something like a play-by-play description of everything that happened (ugh). I’ve tried to strike a balance using quarter quintiles: divide each quarter into 5 chunks (for the game clock, 15:00–12:01, 12:00–9:01, 9:00–6:01, and so on). Five is a fairly arbitrary figure, but for these visualizations it gave a good balance of width/height. By having more granular time increments, we should be able to spot more detail, team momentum, interesting events, etc.
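For the curious, here’s a sketch of how the quintile bucketing could be computed from the game clock. The function names are mine, and I’m assuming the clock arrives as an “MM:SS” string of time remaining in a 15-minute quarter, with the last chunk running 3:00–0:00.

```python
QUARTER_SECONDS = 15 * 60               # a 15:00 quarter
CHUNK_SECONDS = QUARTER_SECONDS // 5    # five chunks of 3:00 each


def clock_to_seconds(clock: str) -> int:
    """Convert an 'MM:SS' game clock string to seconds remaining."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def quarter_quintile(clock: str) -> int:
    """Return 1-5: 15:00-12:01 is 1, 12:00-9:01 is 2, ... 3:00-0:00 is 5."""
    remaining = clock_to_seconds(clock)
    if not 0 <= remaining <= QUARTER_SECONDS:
        raise ValueError(f"clock out of range: {clock}")
    # Subtracting 1 puts boundary times like 12:00 in the later chunk;
    # max() keeps 0:00 in the final chunk.
    return 5 - max(remaining - 1, 0) // CHUNK_SECONDS


def game_bin(quarter: int, clock: str) -> int:
    """Position of the chunk within the whole game, from 1 to 20."""
    return (quarter - 1) * 5 + quarter_quintile(clock)


print(quarter_quintile("12:01"))  # 1
print(quarter_quintile("12:00"))  # 2
print(game_bin(3, "9:00"))        # 13
```

Swapping the chunk count for a different number (say, three) would only mean changing the two constants and the `5`s, which makes it easy to experiment with coarser or finer time slices.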

Break down the performers

Each football team has 11 players on the field at a given time, so there’s plenty of detail to be had. You can break down stats by unit (offense/defense), positional grouping (e.g., receivers), or individual player. Individual performance is something that mainstream sports analysis gravitates towards, as personalities are easy to tell stories around. I didn’t go into this, but may explore it at a later time.

Position the game and metrics within context

Some of this is dependent on the factors above; e.g., how well a team did one quarter is inevitably compared to performance in another quarter. Zooming out, you can look at team (or player) performance versus other games, or control for opponent strength (Bill Connelly and his cadre at SBNation do a great job with both of these). This is something I’ll get into at some point (it’s pretty awesome), but not in this post.

By pulling from some of the tactics above, we can provide a data illustration of a football game that’s both information rich and quickly digestible.

Semi-aggregated views

I talked about aggregated metrics views above, e.g., displaying how many total first downs each team got during the game. To extract something more interesting, I broke things down into “semi-aggregated views.” The plays and stats still roll up, e.g., into quarter quintiles, but it’s broken down into smaller chunks to tell a richer story.

A more detailed version of the dot plot title illustration. Each circle is a play; the darker circles are more successful, with the darkest being “explosive plays.” Scores (e.g., 7), Penalties (*), and change of possession plays (e.g., P) are called out.

The first thing I put together is a horizontal dot plot showing successful, unsuccessful, explosive, and neutral plays. The above image is a more detailed version of the one in the title of this post: I’m calling out scoring plays, penalties (*), and change of possession plays, and there are cumulative stats (score and success rate) updated at the end of each quarter.

There’s a lot going on here, but I think it strikes a nice balance of detail and theme. The general directionality (one team pulling up, another down) and color scales suggest larger themes to the game, while colors, numbers, and symbols help identify big moments in the game. The breakdown by quarter evokes the familiar “box score” structure.

But the visualization does get a little involved, and there is a learning curve. And, while the circles are intended to keep the stacks from reading as bars in a bar chart, the heights of those vertical stacks are a bit arbitrary: e.g., if you see a very tall stack of circles by a very short one, it’s not necessarily meaningful… it’s just that plays happened to line up within a specific time range in a quarter, e.g., 6:00–3:01, then perhaps they punted the ball away.

I tried a different version in an attempt to clean up some of the visual noise: in a way, this view is “even more aggregated,” as plays have been grouped into success type (successful vs. not successful in separate charts).

A similar dot plot, separated by team and play success type. It’s big!

Visually, this structure cleans up nicely: more white space and less close-proximity color contrast helps. The big themes from the game (e.g., one team pretty consistently kicked ass) are perceivable at a glance: the big “shapes” of the plots tell a story.

Unfortunately, the quarter-quintile aggregation on a dot plot still requires some learning. E.g., two dark circles and two light circles stacked up does not imply the order of the plays… just that they happened in the same quarter quintile. Also, this arrangement in particular takes up a ton of space! The plays from one team are lined up with concurrent plays from the other team, but the dots are so far apart that you’d need to read the axis labels to match them up by quarter quintile. A grid could help with this, but it’d still require precise visual scanning. Perhaps if the whole thing were zoomed out (e.g., get rid of scores and neutral play types), it could be salvaged, but as it is, it’s pretty ungainly.

Taking a similar semi-aggregated approach, we can try to simplify things using different shapes: stacked area charts. And by rolling the metrics up to be cumulative over the course of the game, we get a broader view.

Stacked line/area charts clean up the big stories nicely, but you lose some detail (like, uh, the score).

It’s nice: simple and soft, in comparison to the dotted views. As long as there’s an understanding of what “success” and “explosiveness” are, the charts quickly convey a sense of the game. Not only was one team scoring all of the points here… they had sustained success (though, notably, the less efficient and successful team did have more explosive plays, and they started happening earlier in the game).

However, with the cumulative rollup we’ve lost some detail: trends and directionality are apparent, but by the 4th quarter we’re looking at data that’s mostly already been defined in the prior 3 quarters. The viewer’s ability to identify truly incremental activity over the course of the game is limited. Also, separating the teams into two charts makes time-oriented comparison difficult; in a closer game, it’d be hard to tell who was performing better during a given moment in the 2nd or 3rd quarter. We could overlay these charts and delineate with additional visualization (patterned lines, for example), but we’d lose some visual appeal.

Chronological views

The aggregated views are pretty good at showing game trends, but they have caveats around user learning. So instead of “semi-aggregating” the plays of the game (e.g., rolling up plays by quarter quintile and ordering them by success designation), let’s just lay the plays out chronologically and see if the game makes more sense that way. The color coding is the same as the prior charts, with darker colors corresponding to successful and/or explosive plays.

A chronological “block” view of the plays in a game. Red is ALA and Green is MSU. It’s pretty concise, but hard to follow.

Bleh. This was disappointing: relying on simple chronology was a tempting and elegant-sounding solution, but without laying it all across one axis — which would be space-inefficient — it doesn’t communicate well. Like a football game itself, there is a lot of noise over the regular course of a game, and plenty of back-and-forth. In watching a game, the human brain can interpret the important stuff (and generally enjoy the ride, results and team loyalty depending), but this analogous visualization doesn’t give us any assistance in doing so.

Perhaps breaking the chronology back down a bit would help: here’s a similar visualization built vertically and broken down by the drive (there’s a line break with each change in possession). This is also a narrower visualization, so I’ve mocked it up in a restrictive small-ish phone:

Another chronological block view, this time broken down by drives and laid out in a small phone format.

Well, it’s better, not that the bar was high. And it lays out well vertically (something that many of these visualizations can’t claim). With drives broken out, there is some semblance of overall trend here… it’s slightly easier to see comparisons like the length of drives (which usually correlates with efficiency metrics like success rate). But the back and forth nature of this visualization is jarring… trended information is difficult to grok when the context switches so rapidly.

Another huge disadvantage in these chronological charts so far is the reliance on color to distinguish between teams… obviously a hugely important distinction. This red and green coloration (which corresponds to teams) is especially egregious, as around 8% of men of northern European descent (and 0.5% of women) are red-green colorblind. That’s a lot of people that couldn’t interpret this particular visualization at all! (Sorry to any red/green colorblind folks who are reading this. I assure you, these chronological “block” visualizations aren’t worth your time anyway).

So, in an effort to save the chronological view, let’s reintroduce space as a dimension, this time for teams (similar to the dot plots above) and for success. The scoring system is crude, but I’ve mapped all of the plays on a “success scale”: e.g., a scoring play is on the highest point of the scale, +3. A defensive score (like an interception returned for a touchdown) is +3 for the defense, aka, -3 for the offense. It makes more sense in a visualization: in this case, a line/area chart with the +/- success on the vertical axis and play number (just all the plays in order) on the horizontal axis. Take a look.

All of the plays of the game, charted on a “success scale,” with scores being the most extreme points on the scale.

This is… kinda interesting. It reminds me vaguely of the in-game win probability charts you see on some sports stats sites. This line/area chart is better than the earlier “block” chronological visualizations, in that we have a clearer idea of who’s having success and when. The spatial mashup of “team” and “success,” similar to the first semi-aggregated dot plot, is a handy way to tell the story. With the correct labeling, we’ve solved the most critical of the colorblind interpretation issues, too.

But it’s still pretty opaque: this rapid back and forth, the up-and-down nature of the changes in the game, makes for a messy visualization. It’s difficult enough for people to make 2D spatial comparisons between objects (see Stephen Few’s denouncements of pie charts to the same effect), and here we’ve even broken them up into lots of small areas. It’s a mess. This could perhaps be addressed by using a scatterplot instead of lines, but then we’re back to dot plots again, pretty close to where we started… but with a somewhat-arbitrary success scale attached.

Conclusion (for now)

The charts explored here each have pros and cons, but I’m leaning towards the semi-aggregated variety as the better way to convey details and trends about a football game. The first horizontal dot plot is a fun one to develop: manipulating the different levels of detail (e.g., quarter terciles instead of quintiles, or suppressing play type details) could help improve its digestibility. It’s also versionable… we could stack multiple charts to compare metrics, games, etc. Also, it could be worth exploring a vertical option to better accommodate mobile viewing.

Further explorations

This obviously isn’t an exhaustive effort; there are potentially infinite ways to display this data! And per my earlier lists, there are entire information categories I haven’t explored yet. Here are some natural extensions:

  • Performance vs. season averages. This is rich context for game stats, and helps put games in context of the season (or even era). Saxon at RollBamaRoll.com has done an excellent job with this.
  • Run vs. pass. This is something the Alabama fan base is especially sensitive to (“Roll Tide! Run the damn ball!”). I avoided this layer for now, but it could be pretty easily incorporated.
  • Player and position stats. Though, the per-game data for individuals often suffers from small sample sizes. Even season stats aren’t very rich until well into the year.
  • Key plays (offense or defense). I’d love to go beyond the “explosive offensive play” designation and figure out how to isolate the most important or impactful plays on the game result.
  • Other sports. e.g., shooting rates over time in a basketball game. Though, in basketball the scoring is so frequent that it might not tell us anything new. Surely there are some other metrics that could be isolated to show “how the game went.”

Experimenting in D3

What would be really cool is to run visualization experiments with this data using a data visualization library like D3.js. I’m still in the process of learning it (I just had to get these ideas out in the meantime, so they’re manually assembled), but I’m excited to tweak dimensions and filters on the fly and to discover visualizations during execution (rather than with necessary pre-planning).

This is all forthcoming, and football season is looming, so look for follow-up sometime soon; now that I have the data extraction in place, things could get fun fast. Thanks for reading.