Best Football Analytics Pieces

Sam Gregory
Jun 21 · 5 min read

Since Tom Worville wrote this piece highlighting the best of football analytics in 2016, there have been a lot of influential pieces, and I figured enough time has passed to publish a follow-up. I got a lot of good suggestions and tried to select a variety of pieces while keeping the list short enough that it doesn’t feel like an impossibly long course syllabus for anyone trying to catch up.

Expected Threat — Karun Singh

This piece employs a Markov approach to valuing specific actions, something that has been employed with varying success in the past (I’d be remiss not to mention Sarah Rudd’s excellent presentation from NESSIS in 2011 here). What I think makes Karun’s approach novel is twofold: the limited scope (a goal in the next 5 events), which avoids some of the strange non-linearities that exist when you go back further in a possession, and the excellent interactive visualizations used to illustrate the approach.
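To make the Markov idea concrete, here is a minimal sketch of this kind of recursion, not Karun’s actual code. It assumes a toy pitch of three zones, and every probability in it is made up for illustration. The function name `expected_threat` and all of its parameters are hypothetical. Each pass of the loop looks one more action ahead, so five passes corresponds to the limited scope described above.

```python
def expected_threat(shoot_prob, goal_prob, move_prob, transition, n_iter=5):
    """Value each zone by the chance of scoring within the next n_iter actions.

    shoot_prob[z]    : probability a player in zone z shoots
    goal_prob[z]     : probability a shot from zone z scores
    move_prob[z]     : probability a player in zone z moves the ball instead
    transition[z][t] : probability a move from zone z ends in zone t
                       (rows may sum to less than 1; the rest is lost possession)
    """
    n = len(shoot_prob)
    xt = [0.0] * n  # with zero actions left, no zone has any value
    for _ in range(n_iter):
        # value of z = immediate shot value + value of where a move takes you
        xt = [
            shoot_prob[z] * goal_prob[z]
            + move_prob[z] * sum(transition[z][t] * xt[t] for t in range(n))
            for z in range(n)
        ]
    return xt


# Toy data (all invented): zones are defensive, middle and attacking third.
shoot_prob = [0.00, 0.05, 0.30]
goal_prob = [0.00, 0.02, 0.10]
move_prob = [1.00, 0.95, 0.70]
transition = [
    [0.5, 0.3, 0.0],
    [0.2, 0.4, 0.2],
    [0.0, 0.3, 0.4],
]

xt = expected_threat(shoot_prob, goal_prob, move_prob, transition)
for name, value in zip(["defensive", "middle", "attacking"], xt):
    print(f"{name:>9}: {value:.4f}")
```

The value of an action can then be read off as the difference in zone value between where it starts and where it ends. A real implementation would estimate the probabilities from event data on a much finer grid.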

Game of Throw-Ins — Eliot McKinley

American Soccer Analysis has been a treasure trove for public analytics work over the past few years and I could have picked any number of their pieces to include here, but I really liked this one because of the subject: throw-ins. In the aftermath of Liverpool employing a throw-in coach, throw-ins became an in-vogue topic in the football world, but surprisingly, up until this point, they had not really been touched (at least in public) by the data community. Eliot’s work here is an excellent jumping-off point for analyzing throw-ins.

Messi Walks Better Than Most Players Run — Bobby Gardiner

This article on 538 is essentially a summary of a 2018 Sloan paper by Javier Fernandez and Luke Bornn, but I think Bobby picks up on what I found the most fascinating insight of the paper: how efficient and effective Messi is at making space while walking. He makes it accessible to a broader audience by eloquently highlighting why the work is important and putting the narrative in context.

Decomposing the Immeasurable Sport — Javier Fernandez, Luke Bornn and Dan Cervone

Sticking with the theme of Javier’s work, I had to include his paper from this year at Sloan, Decomposing the Immeasurable Sport. Sit down with anyone involved in the football analytics world and ask them what they think the future of the discipline will look like, and I guarantee their response will reference this paper in one way or another. This paper assigns what is commonly called an EPV (expected possession value) to every situation on a football pitch. There is lots of debate in the analytics community about how close we are to seeing these kinds of approaches applied at scale, or how their insights can be translated into actually affecting performance, but it is without a doubt one of the most influential things to come out of the field in the last few years.

Pass Footedness in the Premier League — James Yorke

When StatsBomb transitioned from a consultancy to a data company, one of the key additional data points they highlighted was the foot each pass was made with. These new data points allowed James to write this piece, which is the first comprehensive look at how footedness affects passing ability and decision-making, uncovering a few interesting and unexpected insights.

Breaking Down Set Pieces: Picks, Packs, Stacks and More — Euan Dewar

I don’t even know if this counts as an “analytics” piece in the classic use of the term to describe a data-driven approach, but I had to include it because it’s one of my favourite football pieces I’ve read, full stop. Euan’s insights into different set piece techniques have genuinely affected how I watch the game, which is about as high praise as I can give.

Rethinking Shots — Marek Kwiatkowski

A classic Marek piece: modelling shot generation yields some interesting results that go beyond typical xG-shot-type analysis and get at the underlying processes defining shot generation.

Phases of Play: An Introduction — Tom Worville

One of the biggest gaps between how coaches/performance analysts talk about the game and the work generated by the public analytics community has been around phases of play: how teams play in transition, build-up and attacking phases. Tom here introduces a data-driven framework for breaking up possessions or sequences into their smaller sub-components.

Full disclosure: this is a project I worked on quite a bit during my time at Opta, but I’m really happy with how Tom has expanded upon it and improved it here.

Passing out at the Back — Will Gürpinar-Morgan

Will’s OptaPro debut was unsurprisingly excellent: an analysis of defender passing tendencies in the Premier League, looking at both how league-wide tendencies have changed over time and how they differ between the Top 6 and the rest of the league.

This is obviously an incomplete list and probably includes a bit of recency bias in terms of the pieces that have had the biggest influence on me over the past few months, but I hope it provides a good summary of the best recent work in the field.

Written by Sam Gregory

⚽️📊📉📈 | Data Analyst @SPORTLOGiQ | Previously @OptaPro