What is xG?

Wes Swager
Sep 3, 2021

Definition

‘The most difficult thing in [soccer] is to score a goal’

-Pep Guardiola

xG measures the quality of a shot.

xG calculates the likelihood that a shot will be scored based on the characteristics of the shot and the play preceding the shot.

xG is measured on a scale between zero and one, with one representing a certain goal. For example, a shot with an xG of 0.5 is calculated to have a 50% chance of being scored.

xG History

‘Quality without results is pointless, results without quality is boring’

-Johan Cruyff

In 2002, ‘moneyball’ transformed baseball when the Oakland Athletics used ‘sabermetrics’ models to outperform what their budget suggested they were capable of.

In much the same way, data science has begun to penetrate the highest levels of soccer, largely exemplified by the popularization of the xG metric.

In 2004, Jake Ensum, Richard Pollard, and Samuel Taylor conducted a study of matches from the 2002 World Cup ‘to investigate and quantify factors that might affect the success of a shot,’ and, using logistic regression, identified five factors which had a significant effect:

  • Distance from Goal
  • Angle from Goal
  • Distance from the Nearest Defender
  • If the Assist was a Cross
  • Number of Players Between the Shot and Goal

In 2012, Sam Green of Opta, a British sports analytics company, introduced the term ‘expected goals’ (xG) to soccer. He described the metric calculated by Opta as a means to:

  • ‘Accurately and effectively increase…chances of scoring and therefore winning matches’
  • ‘Use data from a defensive perspective to limit the better chances by defending key areas of the pitch’
  • ‘Look at each player’s shots and tally up the probability of each of them being a goal to give a [cumulative] expected goal (xG) value’

Early adoption by betting markets has since helped xG become a regular feature of mainstream soccer broadcasts and media analysis.

xG Calculation

‘You study and think and discuss, and you come up with a model of playing.’

-Carlo Ancelotti

xG is not a standardized metric, so many different proprietary variants exist among sports analytics organizations, betting companies, soccer clubs, and others.

Generally, xG is built using a classifier model with input from a large historical dataset of shot event data.

Some of the most commonly included features are:

  • Distance to the Goal
  • Angle to the Goal
  • Opponent Players Between the Shot and Goal
  • Total Players Between the Shot and Goal
  • Body Part Used to Shoot
  • Type of Assist
  • Pattern of Play
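As an illustration of the first two features in the list above, the following is a minimal sketch of how distance and angle might be derived from a shot location. The coordinate system and the function name `shot_distance_and_angle` are assumptions made for this example, not taken from any particular provider; the only fixed fact used is the goal width of 7.32 metres (8 yards).

```python
import math

GOAL_WIDTH = 7.32  # metres between the posts (8 yards)

def shot_distance_and_angle(x, y):
    """Return (distance to goal centre, angle subtended by the goal mouth).

    Assumed coordinate system (hypothetical, for illustration only):
    the goal centre is at (0, 0), x is the distance in metres from the
    goal line, and y is the sideways offset in metres from the goal centre.
    """
    distance = math.hypot(x, y)
    # Angle between the lines from the shot location to each post.
    near_side = math.atan2(y + GOAL_WIDTH / 2, x)
    far_side = math.atan2(y - GOAL_WIDTH / 2, x)
    angle = math.degrees(near_side - far_side)
    return distance, angle

# A central shot 11 metres out, roughly the penalty spot.
print(shot_distance_and_angle(11.0, 0.0))
# The same distance from the goal line, but 15 metres wide of centre.
print(shot_distance_and_angle(11.0, 15.0))
```

The wider shot sees a much narrower goal mouth, which is one reason angle tends to be included alongside distance.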

A shot’s xG is estimated by comparing the shot to previous shots with similar features and calculating the proportion of those shots that were scored.
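The idea of comparing a shot with similar historical shots can be sketched in its simplest form as a lookup table. This is only an illustration: it groups shots by distance alone, the helper names `scoring_rate_by_bucket` and `estimate_xg` are hypothetical, and the ten-shot history is made up. Real models use far larger datasets and many more features, typically through a fitted classifier rather than fixed buckets.

```python
from collections import defaultdict

def scoring_rate_by_bucket(shots, bucket_size=5.0):
    """Map each distance bucket to the fraction of its shots that were scored.

    `shots` is an iterable of (distance, scored) pairs. Bucketing by
    distance alone is a simplifying assumption for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # bucket -> [goals, shots]
    for distance, scored in shots:
        bucket = int(distance // bucket_size)
        totals[bucket][0] += int(scored)
        totals[bucket][1] += 1
    return {b: goals / n for b, (goals, n) in totals.items()}

def estimate_xg(distance, rates, bucket_size=5.0):
    """Look up the historical scoring rate for a new shot's bucket.

    Returns None if no historical shot falls in that bucket.
    """
    return rates.get(int(distance // bucket_size))

# Made-up toy history, for illustration only: (distance in yards, scored?)
history = [(4, True), (3, False), (8, True), (9, False), (7, False),
           (6, False), (18, False), (16, False), (19, True), (17, False)]
rates = scoring_rate_by_bucket(history)
print(estimate_xg(8.5, rates))  # share of the 5-10 yard shots that were scored
```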

xG does not consider the quality of the players or teams.

xG Application

‘I tell the players that the bus is moving. This club has to progress. And the bus wouldn’t wait for them. I tell them to get on board.’

-Sir Alex Ferguson

xG can add context beyond traditional statistics by assigning a measurement to the quality of shots.

The most direct application of xG is in the assessment of an individual shot: whether a miss was a high-quality opportunity, or how difficult a goal was to score.

Assessment of individual shots can help in providing recommendations using the features significant to xG. For example, a training staff can take a shot from 20 yards out and show, using video replay, that there was an opportunity to take an additional dribble. They can then compare the xG of the actual shot with the xG of a shot with the same features taken 15 yards from goal, which helps quantify the value of the difference.
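The 20-yard versus 15-yard comparison can be sketched as follows. Since the article does not give a concrete model, `toy_xg` is a placeholder logistic function of distance only, and its coefficients are invented for illustration and not fitted to any data. The point is the shape of the comparison, not the specific numbers it prints.

```python
import math

def toy_xg(distance_yards, intercept=0.5, slope=-0.12):
    """Placeholder logistic xG model using distance only.

    The coefficients are invented for illustration and are not fitted
    to any data; a real model would use many more features.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * distance_yards)))

def xg_gain(model, actual_distance, alternative_distance):
    """Difference in xG between an alternative shot and the actual one."""
    return model(alternative_distance) - model(actual_distance)

gain = xg_gain(toy_xg, actual_distance=20, alternative_distance=15)
print(f"xG at 20 yards: {toy_xg(20):.2f}")
print(f"xG at 15 yards: {toy_xg(15):.2f}")
print(f"Value of the extra dribble: {gain:+.2f} xG")
```

Any real xG model could be passed to `xg_gain` in place of the toy function, as long as it accepts the changed feature.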

The more common application of xG is cumulative. For example, the sum of the xG for a team’s shots across the length of a match indicates how many goals they were likely to score given the characteristics of those shots.
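A minimal sketch of the cumulative calculation: the expected number of goals is the sum of the individual shot values. The second function additionally assumes that shots are independent events, which is a simplification, and the shot values are hypothetical.

```python
import math

def cumulative_xg(shot_xgs):
    """Expected number of goals: the sum of each shot's xG."""
    return math.fsum(shot_xgs)

def chance_of_scoring(shot_xgs):
    """Probability of at least one goal, assuming shots are independent."""
    return 1.0 - math.prod(1.0 - xg for xg in shot_xgs)

match_shots = [0.05, 0.10, 0.35, 0.50]  # hypothetical xG values for one team
print(f"Cumulative xG: {cumulative_xg(match_shots):.2f}")
print(f"Chance of at least one goal: {chance_of_scoring(match_shots):.2f}")
```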

The cumulative xG of a player across a game or set of games can be compared with actual goals to assess over- or underperformance. For example, if a player had a cumulative xG of 3.5 across five games but only scored 1 goal, they may be assessed as underperforming. Training staff may then review the player’s shots across that period and conclude that the player requires additional technical training to improve their finishing.
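One way to judge how unusual such a shortfall is would be to look at the full distribution of possible goal totals rather than only the sum. The sketch below treats every shot as an independent event whose chance of scoring equals its xG, which is an assumption, and uses seven hypothetical shots of 0.5 xG each so that the total matches the 3.5 in the example.

```python
def goal_distribution(shot_xgs):
    """Return a list where index k holds the probability of exactly k goals.

    Treats every shot as an independent event whose chance of
    scoring equals its xG (a simplifying assumption).
    """
    dist = [1.0]  # before any shot: zero goals with certainty
    for xg in shot_xgs:
        new = [0.0] * (len(dist) + 1)
        for goals, p in enumerate(dist):
            new[goals] += p * (1.0 - xg)  # this shot is missed
            new[goals + 1] += p * xg      # this shot is scored
        dist = new
    return dist

# Hypothetical shots summing to 3.5 xG, matching the example total.
shots = [0.5] * 7
dist = goal_distribution(shots)
print(f"P(1 goal or fewer) = {dist[0] + dist[1]:.4f}")
```

A small probability here suggests the shortfall is unlikely to be chance alone, although a handful of games is still a small sample.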

Cumulative xG for players can also be used as a metric for comparison. For example, if a scouting staff are assessing two players who scored a similar number of goals from a similar number of shots, they can use xG for a deeper comparison. They might conclude that the player with more actual goals relative to xG is the superior finisher, or that the player with the smaller difference between goals and xG is more likely to be consistent and maintain their current rate of scoring (rather than regressing toward their xG).

Cumulative xG can also be measured for a team. For example, for a sports betting company, in assessing a team which has won nine of their previous ten matches, the assumption may be that they are likely to win their next match. However, if that team has also significantly overperformed their xG while their opponents have significantly underperformed their xG, then the expectation may change as a regression toward their xG means is likely.

Cumulative xG can also be measured across the course of a match using the timestamps of shots. For example, a media company may create a visual of a match showing each team’s cumulative xG over time as a means of demonstrating which team was more threatening at various points.
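The data behind such a visual can be sketched as a running total per team. The shot events below are hypothetical, and the tuple layout of (minute, team, xG) is an assumption for this example rather than any provider's format.

```python
from itertools import accumulate

def xg_timeline(shots, team):
    """Return [(minute, running xG), ...] for one team's shots."""
    team_shots = sorted((minute, xg) for minute, t, xg in shots if t == team)
    minutes = [minute for minute, _ in team_shots]
    running = accumulate(xg for _, xg in team_shots)
    return list(zip(minutes, running))

# Hypothetical shot events: (minute, team, xG)
shots = [(12, "Home", 0.08), (27, "Away", 0.31), (41, "Home", 0.45),
         (63, "Away", 0.05), (78, "Home", 0.12)]
for team in ("Home", "Away"):
    print(team, xg_timeline(shots, team))
```

Plotting each team's list as a step chart gives the familiar match xG timeline.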

Because soccer is a relatively low-scoring sport, the ability to measure the likelihood of a goal being scored can add context when quantifying the flow and story of a match.