Predicting Football Match Outcomes with Python & the Understat Package

James Harris
Published in The Football Hub
8 min read · Nov 23, 2021

Predict outcomes and scorelines across Europe’s top leagues.

Photo by Bence Balla-Schottner on Unsplash

This article comes with one blatant caveat: football is inherently random. Results surprise us each week, making it incredibly difficult to predict outcomes with any real certainty. As a result, bookies’ profits pile up as punters fail to outsmart them with predictions shaped by biases, hunches, and last week’s performances. This is especially true for teams we don’t watch week in, week out.

It is, of course, impossible to predict every match result correctly. However, we can use football data to make our predictions more objective and to better understand the teams we pay less attention to. In this article, we will create a simple model and apply it to the top leagues in Europe.

First, we will use the Understat package to retrieve the data we need. Then we will process that data to estimate the expected goals for each fixture, before using a Poisson distribution to turn those expected goals into probabilities for each result.
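
To make the Poisson step concrete before we touch any real data, here is a minimal sketch of how two expected-goals figures can be turned into win, draw, and loss probabilities. It is an illustration only, not the final model: the function names and the expected-goals values are hypothetical, and it assumes that each team's goal count follows its own Poisson distribution, independent of the other team's.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(home_xg: float, away_xg: float, max_goals: int = 10):
    """Return (home_win, draw, away_win) probabilities for one fixture.

    Assumption: home and away goals are independent Poisson variables.
    Scorelines above max_goals per team are ignored, so the three
    probabilities sum to very slightly less than 1.
    """
    home_win = draw = away_win = 0.0
    for home_goals in range(max_goals + 1):
        for away_goals in range(max_goals + 1):
            # Probability of this exact scoreline under independence.
            p = poisson_pmf(home_goals, home_xg) * poisson_pmf(away_goals, away_xg)
            if home_goals > away_goals:
                home_win += p
            elif home_goals == away_goals:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Hypothetical expected-goals values, chosen only for illustration.
home_win, draw, away_win = outcome_probabilities(1.6, 1.1)
print(f"Home win: {home_win:.1%}, draw: {draw:.1%}, away win: {away_win:.1%}")
```

Each cell of the scoreline grid is also the probability of that exact score, so the same loop could be used to pick out the most likely scoreline.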

Understat is a great package for accessing basic football data in Python.
