Understanding Expected Goals — xG

Ishdeep Chadha
5 min read · Dec 23, 2023

For the past six years or so, after almost every football game, some social media page has posted a summary of the match with statistics or visuals. The most common stat you see almost every time is something called xG. Let’s try to understand what this stat actually is and what players and coaches can take from it.

xG is a statistical metric used in football (soccer) to quantify the quality of goal-scoring opportunities. It assigns a numerical value to each chance based on various factors such as the distance from goal, angle of the shot, type of play leading to the chance, and historical data on similar situations.

When a shot has an xG value of 0.2 (or 20%), it means that, on average, shots with similar characteristics would result in a goal 20% of the time. This metric provides a more nuanced evaluation of a team’s performance beyond just looking at the number of shots taken. It helps analysts and coaches understand the likelihood of a goal being scored from a particular chance, allowing for a more in-depth analysis of a team’s attacking efficiency and defensive resilience.

The most common context in which xG is discussed in the media is the summary of a single match. The figure below shows the shot map for both Manchester City and Crystal Palace from their meeting on 16 December 2023.

Manchester City vs Crystal Palace, 2023/24 PL

Someone who hasn’t seen the match could look at the stat above and conclude that it was a pretty “even” game, judging by both the result and the xG, since both teams overperformed their xG. This is where I feel the stat is misleading: people read it as an estimate of how many goals a team should have scored. That does not realistically show the whole picture. While xG is a valuable tool, like all statistics or performance indicators, we have to think about what it is really telling us. It isn’t simply the case that the team with the highest xG was the worthy winner of the match.
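To see why a match xG total is not a prediction of the scoreline, here is a minimal Python sketch. It treats every shot as an independent event (a simplifying assumption) and works out the probability of each possible goal tally from a list of per-shot xG values. The two shot lists are made up for illustration and are not taken from the City vs Palace match.

```python
def goal_distribution(shot_xgs):
    """Return [P(0 goals), P(1 goal), ...] for a list of per-shot xG values.

    Assumes each shot is scored independently with probability equal to its xG.
    """
    dist = [1.0]  # before any shot is taken: 0 goals with certainty
    for p in shot_xgs:
        new = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            new[goals] += prob * (1 - p)   # this shot is missed
            new[goals + 1] += prob * p     # this shot is scored
        dist = new
    return dist


# Hypothetical shot lists: both add up to a total xG of 1.0
one_big_chance = [0.8, 0.1, 0.1]
many_small_chances = [0.1] * 10

for name, shots in [("one big chance", one_big_chance),
                    ("many small chances", many_small_chances)]:
    dist = goal_distribution(shots)
    print(name, "| total xG:", round(sum(shots), 2),
          "| P(0, 1, 2 goals):", [round(p, 3) for p in dist[:3]])
```

Both lists sum to 1.0 xG, yet under these assumptions the first team fails to score about 16% of the time and the second about 35% of the time. The same total can hide very different matches.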

Now that we have a clearer understanding of what this metric really tells us, let’s look at some of the factors that influence how xG is calculated:

Distance and Angle from the goal -

This is the most obvious factor: it is much easier to score from close to goal than from further out. But where exactly are players instructed to shoot from most often?

This is where the angle comes in. The first thing to consider when evaluating a shot is the view the player has of the goal: the more of it they can see, the better their chance of scoring. Players learn this early. They notice that if they overrun the ball in the box, they end up hitting the side netting. It also underlies the most basic advice for defenders: show the attacking player the way out towards the goal line, to narrow the angle. Note that moving out to the side is equivalent to moving further away from the goal. The same principle applies: the wider the angle between the goalposts, the better the chance of scoring.

Probability of scoring a goal from different areas
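The geometry described above can be sketched in a few lines of Python. The coordinate system is an assumption made for this example: distances are in metres, the centre of the goal is at (0, 0), x is the distance out from the goal line and y is the sideways offset from the centre of the goal. The only fixed fact used is the 7.32 m width of a standard goal.

```python
import math

GOAL_WIDTH = 7.32  # metres between the posts on a standard goal


def shot_distance(x, y):
    """Straight-line distance from the shot location to the centre of the goal."""
    return math.hypot(x, y)


def shot_angle(x, y):
    """Angle (in radians) between the two goalposts as seen from (x, y).

    Uses the cross and dot products of the vectors from the shooter to each
    post; atan2 keeps the result valid even very close to the goal line.
    """
    half = GOAL_WIDTH / 2
    cross = GOAL_WIDTH * x               # |cross product| of the two post vectors
    dot = x * x + y * y - half * half    # dot product of the two post vectors
    return math.atan2(cross, dot)


# Example locations (illustrative): penalty spot, and the same depth but 15 m wide
for label, (x, y) in {"penalty spot": (11.0, 0.0), "wide": (11.0, 15.0)}.items():
    print(f"{label}: distance {shot_distance(x, y):.1f} m, "
          f"angle {math.degrees(shot_angle(x, y)):.1f} degrees")
```

Moving the shot sideways at the same depth both increases the distance and shrinks the angle, which is the point made above about wide positions.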

Let’s back this up with a real-life dataset, with the help of Python. I have the 2017/18 Premier League dataset from Wyscout, from which we will plot a heatmap showing the xG of shots taken over the season.

To understand this, let’s first plot the shots and goals:

From the heatmaps above we can conclude that most shots are taken, and most goals scored, in and around the penalty area. That fits with a penalty, a central shot from 11 metres with only the goalkeeper to beat, having an xG of around 0.8.
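The idea behind such heatmaps can be sketched as follows: divide the pitch into grid cells, count shots and goals in each cell, and take goals divided by shots as a rough scoring probability for that cell. This is a simplified illustration, not the code from the case study. The function name, the cell size and the handful of shots are all invented for the example, and loading the Wyscout events and drawing the plot are left out.

```python
from collections import defaultdict


def scoring_rate_by_cell(shots, cell_size=2.0, min_shots=1):
    """Group shots into square grid cells and return goals / shots per cell.

    shots: iterable of (x, y, is_goal) tuples, with x and y in metres.
    Cells with fewer than min_shots shots are dropped as too noisy.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [number of shots, number of goals]
    for x, y, is_goal in shots:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell][0] += 1
        counts[cell][1] += int(is_goal)
    return {cell: goals / n
            for cell, (n, goals) in counts.items()
            if n >= min_shots}


# Made-up shots: (distance from goal line, sideways offset, scored?)
example_shots = [
    (5.0, 0.0, True),
    (5.5, 0.5, False),
    (20.0, 3.0, False),
    (21.0, 2.5, False),
]

for cell, rate in sorted(scoring_rate_by_cell(example_shots).items()):
    print(cell, rate)
```

With a full season of shots, raising `min_shots` avoids reading too much into cells that only contain one or two attempts.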

Goalkeeper and Defenders -

Up to now, we have ignored a very important factor in determining whether a shot becomes a goal: the defending players and the goalkeeper! We can account for these in an expected goals model using tracking data. Using a machine learning model, we can also measure how these factors combine with distance and goal angle to determine the probability of a shot’s success.

Jernej Flisar, a data scientist at Twelve, has developed a method based on what are known as Shapley values to calculate how much different factors contribute to the quality of a chance. In this example, distance and angle (in red) are positive contributions. The number of defenders between Mané and the goal (in blue) is a negative contribution. Without the opponents in the way, this would have been a 0.25 xG chance, but with the opponents it drops to 0.18 xG.
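To give a feel for what a Shapley value is, here is a toy Python sketch. It is not Twelve's method or model: `toy_xg` is a hypothetical logistic function with made-up coefficients, and the "baseline" shot is an invented reference point standing in for an average shot. The sketch computes exact Shapley values by switching each feature from its baseline value to the shot's value in every possible order and averaging the change in xG.

```python
import math
from itertools import permutations


def toy_xg(distance, angle, defenders):
    """Hypothetical xG model; the coefficients are invented for illustration."""
    z = -1.0 - 0.1 * distance + 1.5 * angle - 0.4 * defenders
    return 1 / (1 + math.exp(-z))


def shapley_values(model, shot, baseline):
    """Exact Shapley values relative to a single baseline shot.

    For every ordering of the features, move from the baseline to the shot one
    feature at a time and record how much each switch changes the prediction.
    """
    names = list(shot)
    totals = dict.fromkeys(names, 0.0)
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(**current)
        for name in order:
            current[name] = shot[name]
            value = model(**current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


# Illustrative values only: distance in metres, angle in radians
shot = {"distance": 10.0, "angle": 0.7, "defenders": 2}
baseline = {"distance": 20.0, "angle": 0.35, "defenders": 0}

phi = shapley_values(toy_xg, shot, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
print("sum of contributions:", round(sum(phi.values()), 3))
print("xG(shot) - xG(baseline):", round(toy_xg(**shot) - toy_xg(**baseline), 3))
```

The contributions always add up to the difference between the shot's xG and the baseline's xG, which is what lets each one be read as that factor's share of the chance quality. Production tools usually average over many background shots rather than a single baseline.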

There are several other factors that I feel influence xG, such as the reaction time of the goalkeeper, how often a player (or players of similar quality) has scored from a particular position, and why some players are encouraged to shoot from distance while others are not. But this is a good starting point from which to learn more. One thing I would like to dig into is how players see this metric, because as a player myself I know that the art of shooting is very instinctive. Of course we cannot take the human element out of the sport, but how these minute details affect decision-making at every level is what really motivates me to dig deeper.

To get the code for the case study shown above, go to: https://github.com/ishdeep-10/Expected-Goals-Model/tree/main

References:

  1. https://soccermatics.readthedocs.io/en/latest/lesson2/GeometryOfShooting.html
  2. https://soccermatics.readthedocs.io/en/latest/lesson2/MoreonxG.html
  3. https://statsbomb.com/soccer-metrics/expected-goals-xg-explained/
  4. https://statsbomb.com/articles/soccer/explaining-and-training-shot-quality/
