Predicting the Scottish Premiership
Last-minute winners, the triumph of underdogs and sudden changes in fortune are the essential ingredients of football.
For me, as an Englishman who grew up in Scotland, the Euro ’96 match between my two nations provides a perfect example. I’m not going to say which country I was supporting that day, but what was palpable was the drama. With characters like Paul ‘Gazza’ Gascoigne, Ally McCoist and Colin Hendry on the pitch, almost anything could happen. And it did. After England took the lead, Scotland were awarded a penalty with 15 minutes to go. But Gary McAllister’s spot kick was saved by David Seaman, and seconds later Gazza finished Scotland off with a fantastic left-footed flick over Hendry’s head and a volley into the Scotland goal.
What we learn from sudden changes in fortune, after we have recovered from the disappointment or got over our elation, is that football is unpredictable. Goals can come at any time from either side and it is almost impossible to predict when one will be scored.
It isn’t just single matches that are difficult to predict. In the race for the Scottish Premiership title this season, Aberdeen are chasing Celtic, with only one point separating the two teams. The middle of the table is extremely close, and the big question in the coming weeks is which teams will make the top-six split. With three matches to go, seven teams (Ross County, St Johnstone, Motherwell, Dundee, Partick Thistle, Inverness CT and Hamilton) have between 33 and 39 points. How likely is it that each of these teams will be in the top half of the league on the 9th of April?
To work out probabilities we need to look at the statistics and think mathematically. Let’s look at the upcoming encounter between Kilmarnock and Celtic on Saturday (19 March). The Hoops have scored 31 goals in 14 away matches, so their average is 31/14 = 2.21 goals per away match. Kilmarnock have conceded 30 goals in their 15 home games, an average of 30/15 = 2 goals conceded per match. A reasonable estimate of the number of goals that Celtic will score when they visit Rugby Park is halfway between these two numbers, so we can expect Celtic to score about 2.11 goals this Saturday.
A number like 2.11 might make sense in a maths lesson, but it makes little sense on a football pitch. Even a mathematician like me understands that the score line on Saturday will not be:
Kilmarnock 1.00–2.11 Celtic
It is here that the unpredictability of football comes in. It turns out that the randomness that makes games so exciting to watch is also very helpful in making predictions. The fact that goals are equally likely to go in at any point in the match allows us to calculate the probability of different score lines. Using a piece of mathematics called the Poisson distribution, we can work out these probabilities for Kilmarnock vs. Celtic. These are shown below.
Away wins of 1–2 and 0–2 for Celtic are the most likely outcomes, each at close to 10%. But there are other possibilities. There is a 9.4% chance that Killie hold on for a 1–1 draw. And in the other direction, Celtic have around a 1 in 12 chance of winning by four goals or more. There is a great deal of uncertainty in a football match, even one between the league leaders and the team that is second from bottom.
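For readers who want to check these numbers themselves, here is a minimal Python sketch of the calculation. It is an illustration rather than the exact model behind the figures: it takes Celtic's scoring rate straight from the goal counts quoted above, borrows Kilmarnock's rate of 1.00 from the illustrative score line, and assumes that the two teams' goals are independent of each other. The function names are my own.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """Probability of exactly k goals when the expected number is `rate`."""
    return exp(-rate) * rate**k / factorial(k)


# Celtic's expected goals: halfway between their away scoring average
# and Kilmarnock's home conceding average (counts quoted in the article).
celtic_away_scored = 31 / 14
killie_home_conceded = 30 / 15
celtic_rate = (celtic_away_scored + killie_home_conceded) / 2

# Assumption: Kilmarnock's rate is read off the illustrative score line;
# the article does not say how it was derived.
killie_rate = 1.0


def scoreline_prob(home_goals, away_goals):
    """Probability of one exact score line, assuming independent scoring."""
    return (poisson_pmf(home_goals, killie_rate)
            * poisson_pmf(away_goals, celtic_rate))


def prob_away_win_by(margin, max_goals=15):
    """Probability that Celtic win by at least `margin` goals.

    Score lines above max_goals are ignored; their probability is negligible.
    """
    return sum(scoreline_prob(h, a)
               for h in range(max_goals + 1)
               for a in range(max_goals + 1)
               if a - h >= margin)


for h, a in [(1, 2), (0, 2), (1, 1)]:
    print(f"Kilmarnock {h}-{a} Celtic: {scoreline_prob(h, a):.1%}")
print(f"Celtic win by four or more: {prob_away_win_by(4):.1%}")
```

Under these assumptions the 1–1 draw comes out at about 9.4% and the two most likely Celtic wins at just under 10% each, in line with the figures quoted above.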
Prediction doesn’t stop with a single match. By calculating the probability of the outcomes of all upcoming games we can get a glimpse of the future. This is exactly what I have done for the Scottish Premiership up to the point of the split. But instead of calculating one possible future, I have run 10,000 league simulations. It takes just 30 seconds to do this on my laptop. It turns out that, although the two teams are separated by only a point, Celtic’s game in hand and easier run-in mean that Aberdeen have only a 10.4% chance of leading the league at the split. Celtic are league leaders in 89.6% of simulations and Hearts in only 0.01%. Good luck to the Dons, but Celtic are still strong favourites.
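The idea of running the league many times over can be sketched in a few lines of Python. This is a toy version, not the simulation behind the numbers above: the three teams, their points, the fixtures and the goal rates are all made up for illustration, each team's goals are drawn independently from a Poisson distribution, and ties at the top are broken at random rather than by goal difference.

```python
import random
from collections import Counter
from math import exp


def sample_poisson(rate, rng):
    """Draw a Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = exp(-rate)
    goals, product = 0, rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals


def simulate_league(points, fixtures, n_runs=10_000, seed=0):
    """Estimate each team's chance of topping the table after the fixtures.

    points   -- current points per team
    fixtures -- (home, away, home_rate, away_rate) tuples of expected goals
    """
    rng = random.Random(seed)
    leader_counts = Counter()
    for _ in range(n_runs):
        table = dict(points)  # fresh copy of the table for every run
        for home, away, home_rate, away_rate in fixtures:
            home_goals = sample_poisson(home_rate, rng)
            away_goals = sample_poisson(away_rate, rng)
            if home_goals > away_goals:
                table[home] += 3
            elif away_goals > home_goals:
                table[away] += 3
            else:
                table[home] += 1
                table[away] += 1
        best = max(table.values())
        leaders = [team for team, pts in table.items() if pts == best]
        # Simplification: a tie on points is broken at random.
        leader_counts[rng.choice(leaders)] += 1
    return {team: leader_counts[team] / n_runs for team in points}


# Hypothetical data, for illustration only.
points = {"Team A": 60, "Team B": 59, "Team C": 50}
fixtures = [
    ("Team A", "Team C", 1.8, 0.9),
    ("Team C", "Team B", 1.0, 1.5),
    ("Team B", "Team A", 1.3, 1.3),
]
result = simulate_league(points, fixtures)
for team, prob in result.items():
    print(f"{team}: {prob:.1%} chance of leading")
```

The same loop extends to the top-six question: instead of counting who finishes first, count how often each team ends a simulated run in the top six places.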
In the middle of the league table things are a lot less certain and my model can provide numbers that measure this uncertainty. Below is the probability that different teams will make the split.
It will be an exciting run-in. Ross County and St Johnstone are nearly there, but the chances of Motherwell and Dundee making the top six are about the same as those of a coin toss. After their away win in Inverness, even Hamilton Accies are in with a small chance.
The application of maths to football goes well beyond predicting the league. I have spent the last year looking at all aspects of football in terms of mathematics. How does the Barcelona midfield create a network of passes? How can managers use game theory to outwit the opposition? How do Bayern’s defenders narrow down space? And how can probability theory make you money at the bookies?
My book Soccermatics, which comes out in May, looks at these and many other aspects of our favourite game.
On Friday 8 April, I will be talking to Pat Nevin at the Edinburgh International Science Festival about why maths and football are “More Than a Game”. Come along and listen. Before that, I will be updating my predictions on Twitter at @Soccermatics. By the time of the talk we’ll nearly know who made the split and whether or not I got it right. With all the randomness at play in football, I might well have some more explaining to do.
David Sumpter is Professor of Applied Mathematics at Uppsala University in Sweden.
Originally published at blogs.scotland.gov.uk on March 16, 2016.