xGSR: A Second-Order xG Metric

Saleemakhtar
17 min read · Aug 15, 2023

1. Introduction

Football analytics has grown enormously, with expected goals (xG) emerging as a key metric for evaluating chance quality. As a reminder, xG assigns a value between 0 and 1 to shots based on factors like distance and angle. This enables deeper analysis of match events beyond just goals scored.

However, xG has limitations in capturing the full narrative. Consider two matches where a team has 1 total xG. In one game, they create many low xG chances (0.05 each) and one 0.5 xG chance. In another game, they have fewer but higher quality chances (0.2 xG each). The total xG is identical but the story differs — one reflects erratic chance creation versus another with more consistency.

This raises a key question — while xG quantifies quality, how can we analyze the variance and consistency in a team’s attacking performance? This article introduces Expected Goals Sharpe Ratio (xGSR), a metric aiming to provide that deeper perspective into the sustainability and predictability of chance creation.

2. Introducing xGSR: Evaluating Chance Creation Consistency

What is xGSR?

  • Formula: For S = [S1, …, SN], a set of xG values for N shots, xGSR_S = mean(S) / stdev(S)
  • In words: Say you want to calculate xGSR over a set of N shots (this might be all of the shots in a game, all of the shots in a season or just the last 100 shots for a given player/team), then the xGSR value is obtained by dividing the mean of those shots’ xG values by the standard deviation of those shots’ xG values.
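As a minimal sketch, the definition translates directly into code. I use Python's `statistics` module here; taking the sample (rather than population) standard deviation is my own assumption, as the article doesn't specify:

```python
import statistics

def xgsr(shot_xgs):
    """xG Sharpe Ratio: mean of the shots' xG values divided by their
    standard deviation. Needs at least two non-identical values."""
    if len(shot_xgs) < 2:
        raise ValueError("xGSR needs at least two shots")
    std = statistics.stdev(shot_xgs)  # sample standard deviation (assumption)
    if std == 0:
        raise ValueError("identical xG values: xGSR is undefined")
    return statistics.mean(shot_xgs) / std

# Two games with identical total xG (1.0) but very different profiles:
erratic = [0.05] * 10 + [0.5]            # ten poor chances plus one big one
steady = [0.25, 0.2, 0.2, 0.2, 0.15]     # fewer, consistently decent chances
```

Both lists sum to 1.0 xG, but the steady profile scores a far higher xGSR than the erratic one.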

Why the name “xGSR”?

The term “xGSR” draws inspiration from the Sharpe Ratio in finance. Just as the Sharpe Ratio evaluates investment return vs risk, xGSR assesses a team’s ability to create quality chances while accounting for variability in those chances. High xGSR indicates steady creation of good opportunities, like a reliable stock. Low xGSR suggests erratic chance creation, like a high-risk cryptocurrency.

The temptation to call it the “Expected Goals Saleem Ratio” was hard to resist, especially if the metric takes off in the analytics world. If Sky are reading this, it’s at least an official alias for the metric!

Why does xGSR matter?

Let’s break it down:

  • Mean of xG: Represents the average quality of chances in a match. A higher mean indicates better average shot quality.
  • Standard Deviation of xG: Captures the variability of shot quality. A low standard deviation means a team consistently produces shots of similar quality, while a high standard deviation indicates variability in shot quality.

Combining the two gives us xGSR, which assesses not just the quality but the consistency of attacking play. A higher xGSR suggests a team creates consistent, good-quality chances, while a low xGSR could indicate a more hit-or-miss attacking approach.

xGSR can be increased in two ways:

  • Increasing the quality of chances created.
  • Reducing the variance in the quality of chances created.

A good way to showcase the utility of this metric is to simulate some games under different conditions and see how xGSR influences win %.

Analysis of xGSR’s Correlation to Match Outcomes

Methodology & Simulations

Team Formulation: We set out to create 5,000 distinct virtual team performances. The process for each performance was as follows:

  1. Randomly determine the number of shots the team took in a match, with this number ranging from 3 to 20.
  2. Generate individual xG values for each shot such that the total xG for that team equalled 2. This involved distributing the xG value across the randomly determined number of shots in a way that their sum amounted to 2.

Match Setup: Each of these 5,000 teams was pitted against Team X, a consistent opponent that always scored 2 goals, in a series of 10,000 simulated matches. This provided us with a robust dataset of 50 million match outcomes to analyze.

Outcome Metrics: The key metrics captured from these simulations were the Win, Draw, and Loss percentages for each of the virtual teams when playing against Team X.

Example for Clarity:

Let’s walk through a sample simulation:

  1. For one of our 5,000 virtual teams, we randomly determine it took 5 shots in a match.
  2. We then randomly generate xG values for these shots. An example distribution might be: 0.6, 0.5, 0.4, 0.3, and 0.2. The sum of these xG values is 2.
  3. In the simulated match against Team X, this virtual team would score a number of goals based on these xG values. For instance, in each of the 10,000 simulations, the 0.6 xG shot has a 60% chance of being converted into a goal and so on for all of the shots. In this particular instance, the maximum number of goals is 5 (as there are 5 shots in total).
  4. The outcome of this simulated match is then determined by comparing the number of goals the virtual team scores against the fixed 2 goals of Team X.
  5. This process is repeated 10,000 times for this team, and similarly for all other virtual teams, to compute the Win, Draw, and Loss percentages.
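The steps above can be sketched as follows. The random-splitting scheme is my own assumption; the article doesn't specify how the fixed total xG was distributed across shots:

```python
import random

def random_xg_split(n_shots, total_xg=2.0, rng=random):
    """Distribute a fixed total xG across n_shots shots (assumed scheme:
    normalised random weights, resampled until every shot's xG is below 1)."""
    while True:
        weights = [rng.random() for _ in range(n_shots)]
        shots = [w * total_xg / sum(weights) for w in weights]
        if max(shots) < 1.0:
            return shots

def simulate_match(shot_xgs, opponent_goals=2, rng=random):
    """Each shot scores with probability equal to its xG; the result is
    judged against the fixed 2 goals of Team X."""
    goals = sum(rng.random() < xg for xg in shot_xgs)
    return "W" if goals > opponent_goals else "D" if goals == opponent_goals else "L"

def win_draw_loss(shot_xgs, n_sims=10_000, rng=random):
    """Win/Draw/Loss percentages over n_sims simulated matches."""
    results = [simulate_match(shot_xgs, rng=rng) for _ in range(n_sims)]
    return {r: results.count(r) / n_sims for r in "WDL"}
```

Running `win_draw_loss([0.6, 0.5, 0.4, 0.3, 0.2])` reproduces the worked example: five shots, 2.0 total xG, 10,000 simulations.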

With this framework, we delved into understanding the relationship between xGSR and the potential match outcomes.

Correlation Coefficients and Statistical Significance

When we dove deep into the data, some fascinating correlations emerged between xGSR and various match outcomes:

  • Win Percentage: A Pearson Correlation Coefficient of 0.21 indicates a weak positive correlation between xGSR and winning. The p-value is 0.00, pointing to this correlation being statistically significant.
  • Draw Percentage: There’s a weak negative correlation of -0.12 between xGSR and draws. Once again, with a p-value of 0.00, this relationship is statistically significant.
  • Loss Percentage: The coefficient of -0.01 suggests a negligible relationship between xGSR and losses. This is further reinforced by a p-value of 0.51, suggesting that the correlation is not statistically significant.
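For readers wanting to reproduce this kind of check, the Pearson coefficient itself is straightforward to compute. A pure-Python sketch (in practice `scipy.stats.pearsonr` also returns the p-value):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Feeding in each team's xGSR against its simulated Win%, Draw%, and Loss% yields the coefficients reported above.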

Clearly, not all xG is created equal — each point on these scatter plots is a simulated game where a team generates 2 xG in total, yet their win probability can vary greatly. The main point to take away from this particular simulation setup is that increasing your xGSR raises the floor of your Win% and reduces the ceiling of your Draw%. This isn't a cherry-picked example — it was the first scenario I simulated, but feel free to simulate other scenarios and share your results!

Using Predictive Models to Validate the Importance of Standard Deviation

In the world of data, we often lean on predictive models not just for forecasting, but also for gaining insights about the underlying data and the importance of various features. In our context, we want to understand the impact of xG_mean and xG_std on our outcome: the win percentage. To do this, we adopt a sequential modelling strategy.

Sequential Modelling Approach:

  1. Random Noise Model (RMSE: 4.43): The first model we consider is our baseline, incorporating only a random noise feature. This model helps set a benchmark for the least informed predictions.
  2. xG_mean + Random Noise Model (RMSE: 2.87): Here, we introduce the xG_mean feature alongside our random noise. The idea is to see how much improvement in prediction accuracy we achieve by simply considering the mean xG.
  3. xG_mean + xG_std + Random Noise Model (RMSE: 0.67): Lastly, we augment the model with the xG_std feature. By doing so, we aim to gauge the added predictive power of considering shot variability.

The RMSE, or Root Mean Squared Error, provides an indication of the accuracy of our predictions. In essence, it measures the difference between the predicted and actual values. A smaller RMSE denotes a more precise model, suggesting that our features (xG_mean, xG_std, and random_noise) are effective in predicting the Win%.
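The idea can be sketched with a baseline-versus-feature comparison. The toy data below is my own and purely illustrative; the article's models and RMSE figures come from the full simulation dataset:

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error between actual and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fit_simple_lr(xs, ys):
    """Closed-form ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

# Toy data: win% falls as xG_std rises (illustrative values only)
xg_std = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
win_pct = [62.0, 57.5, 52.0, 48.5, 43.0, 38.5]

baseline = [sum(win_pct) / len(win_pct)] * len(win_pct)  # "least informed" model
model = fit_simple_lr(xg_std, win_pct)
informed = [model(x) for x in xg_std]
```

The informed model's RMSE drops well below the baseline's, mirroring the sequential improvements reported above when xG_mean and xG_std are added.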

Why is this method useful?

This sequential approach paints a clear narrative. The stark drop in RMSE from the random noise model to the one incorporating xG_mean and xG_std underscores the significance of these features. The fact that the RMSE drops drastically from 2.87 to 0.67 when xG_std is introduced, despite the presence of xG_mean, solidifies the importance of considering shot variability or standard deviation.

This methodology validates the intuition behind xGSR. The RMSE values demonstrate that considering both the mean and standard deviation of xG values (i.e., the essence of xGSR) makes our predictions far more accurate than considering either in isolation or not at all.

In summary, xGSR encapsulates both central tendency (mean) and dispersion (standard deviation), and our modelling results underscore the importance of considering both dimensions when assessing a team’s performance.

Now that we know these values are important, let’s look at how they affect our model using SHAP values. A SHAP summary plot:

The above plot shows that lower xG_std values bolster our Win% predictions, reinforcing the intuitive belief that consistency in shots is favorable. This consistency, characterized by a lower standard deviation, seems to play a pivotal role in swaying outcomes. Our model astutely recognizes that the random noise, a control feature, should remain neutral.

The analysis around xG_mean is more intricate. Given that we set a fixed total xG of 2 for each simulated game, drawing definitive insights from this feature becomes challenging. However, in a broader perspective, a higher xG_mean typically indicates a higher cumulative xG, which would intuitively correlate with a higher chance of winning.

Bridging the Simulation-Reality Gap

The simulations demonstrate important relationships between xGSR and match outcomes through a robust dataset. But how do these simulated correlations translate to practical insights?

A positive correlation between xGSR and win percentage indicates teams that consistently create good chances improve their odds of victory. This suggests managers should focus not just on chance quality, but steady creation.

The negative xGSR-draw correlation implies inconsistent chance creation makes draws more likely. Teams may play to stylistic strengths, but at the cost of unpredictability.

While a simulated environment simplifies reality, it validates xGSR’s potential value. The metrics warrant further real-world analysis through case studies on specific teams, players and managers. Questions to explore:

  • How does xGSR relate to different playing styles and tactics?
  • Which players consistently supply high xGSR chances? How can teams maximize their involvement?
  • Do certain managers consistently achieve high or low xGSR across teams? What factors drive this?

While prudent interpretation is always required, these initial results suggest xGSR could offer tangible insights into improving performance, tactics and evaluation. The next step is targeted real-world examination.

3. Real-World Applications of xGSR

Whenever I come across an article unveiling a new metric, my instinct is to grasp its core concept and then quickly delve into how various players or teams measure up using this metric. If you’re on the same page and swiftly scrolling through, PAUSE RIGHT HERE!

The Data Source:

The data is scraped from Understat.

The Standouts and the Inconsistents:

xGSR by player

Diving into last season’s data unveils the top 15 performers, filtered for those with a minimum of 50 shots taken. Immediately, the metric’s limitations surface. Consider players like Carles Pérez and Lazar Samardzic. Their track record shows a consistent tendency for low-quality shots.

Carles Pérez shot map

For instance, examining Pérez's shot map uncovers a pattern of long shots taken from just outside the right side of the box, each with an average xG of about 0.05 — not a single one resulting in a goal. Only three of his attempts culminated in goals, all from more central zones. The right half-space is where talents like Trent or KDB often deliver precise crosses for their teammates, who more often than not get a shot on target. Perhaps Carles could draw some inspiration from their playbook!

Using xG-received as the lens to evaluate players might not be the best approach. More often than not, the xG a player receives is a testament to the team’s chance-creation prowess, rather than an individual’s capabilities. While one might argue that a player’s off-the-ball movement affects these stats, it’s a less influential factor.

xGSR Generated by players (top 15)

On the contrary, evaluating xG generated by a player is quite insightful — we do this by aggregating over the assister of the shot as opposed to the shooter of the shot. This leads us to the xGSR from a generation standpoint. Among the top 15 in this aspect, familiar names emerge. Yet, Alexis Sánchez, standing tall at the top, underscores a crucial caveat: always consider xGSR in tandem with total xG created. Although xGSR doesn’t factor in xG totals, their combined insights offer a richer understanding of both individual and team potential. Sanchez didn’t create scoring opportunities often, as indicated by his low total xG. However, when he did, they were consistently of high quality, suggesting that while he might not always be involved, his contributions tend to be valuable and potent.

Drawing inspiration from its financial counterpart, the Sharpe Ratio, investors primarily focus on a high SR for their investments. They understand that strategies with a high SR can be leveraged to achieve any desired ROI. However, in the context of players, the analogy isn’t directly translatable. While one might think of increasing a player’s involvement as a parallel to leveraging, there are inherent limits to how often a player can be at the centre of play — think of it as a “liquidity” constraint in the game.

xGSR Generated by players (bottom 15)

Switching our gaze to the 15 players at the other end of the spectrum, it's crucial to note that this list is filtered to include only those who created at least 50 shots last season. So, these are still the worst of a relatively good bunch. Players like Szoboszlai and Maddison, both of whom have had recent high-profile transfers, don't shine in this metric. Their roles last year, which often saw them in varying pitch positions (something of a free role) aiming to create chances, naturally led to diverse xG outputs. The inconsistency in their xG generation might stem from the variability in their field position. Clearly, interpreting this metric for players demands a nuanced approach.

Let’s take a look at the team-level instead.

Teams like PSG and Bayern maintain a stronghold on their respective leagues, which is getting increasingly boring to all bar perhaps the most ardent supporters. Given their consistent dominance domestically, it’s no shock to find them among the top 10 for the season’s xGSR — their ability to craft superior shot opportunities against relatively weaker adversaries is well-documented. However, the spotlight shines brightly on Tottenham, a top 3 entrant. Despite their total xG tally being eclipsed by Man City’s staggering 70.95, they almost mirror City in xGSR. This brings about a fascinating comparison when looking at the worst teams:

Take Everton for instance, positioned at the other end of the leaderboard. Everton’s shots, on average, carry a higher likelihood of finding the back of the net compared to Spurs; their xG_mean of 0.11 outshines Tottenham’s 0.10. Yet, when it comes to xGSR, Spurs, with their 0.86, are clearly a class apart from Everton’s 0.67.

To visualize this difference, let’s delve into a density plot showcasing the xG values of shots from both teams. As one might predict from the statistics, Tottenham’s density is more centralized. Everton, in contrast, has a significant proportion of their shots at the extreme ends. You might question the advantage of having fewer high xG value shots but it’s worth noting that a team consistently aiming for chances with >0.8 xG might inadvertently pass up several decent xG opportunities, thereby diminishing their overall xG sum. It’s reminiscent of the age-old adage from commentators: “Arsenal’s dilemma? They always seem to want to walk the ball into the net.”
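To make that concrete, here are two hypothetical shot profiles — illustrative values of my own, chosen to match the reported means of 0.10 and 0.11, not the actual Understat data:

```python
import statistics

def xgsr(xgs):
    """Mean xG divided by the (sample) standard deviation of xG."""
    return statistics.mean(xgs) / statistics.stdev(xgs)

# Centralised profile (Spurs-like) vs heavy-tailed profile (Everton-like):
spurs_like = [0.08, 0.09, 0.10, 0.10, 0.11, 0.12]    # shots cluster near the mean
everton_like = [0.02, 0.03, 0.04, 0.10, 0.20, 0.27]  # many poor shots, a few big ones
```

The Everton-like profile has the higher mean (0.11 vs 0.10), yet its spread drags its xGSR far below the Spurs-like profile's — exactly the pattern the density plot shows.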

Reframing Defence: A Different Angle

Analyzing xGSR isn’t limited to offensive prowess; it offers crucial insights into defensive fortitude as well. The defensive goal, as elucidated by xGSR, is twofold: firstly, strive to minimize the average xG conceded. Then, for a set average xG conceded, aim to ensure that opponents’ shots fluctuate in quality. The ideal defence makes it challenging for attackers to consistently produce high-quality shots.

Diving deeper into the data, Italian clubs notably emerge as stalwarts. This pattern could indicate a tactical inclination in Serie A towards frustrating opponents and mitigating their shooting consistency. Among these defensive powerhouses, Jose Mourinho’s AS Roma shines brightly, surpassed only by Zenit. Another commendable mention is Brentford from the Premier League — their aggressive pressing style has historically made it challenging for opponents to stabilize their shot quality.

However, there’s always another perspective to consider. Examining the defences that faltered, we find a blend of the predictable and the astonishing. Relegated teams like Southampton and Leeds understandably appear, given their tumultuous seasons. But the shocker? Liverpool, a team with a storied defensive legacy, languishes at the bottom. Their struggles last season were so pronounced that they ranked as the most vulnerable defence across all leagues covered by Understat. We’ll dive deeper into this a little later.

xGSRD: Tipping the Scales

Introducing xGSRD — a metric that offers an alternative way to measure the performance difference between a team’s defence and attack. While on the surface it may seem like a straightforward subtraction, in reality, it delves deeper, encapsulating the dynamic interplay between offensive prowess and defensive resilience. This metric shines a light on the equipoise teams strike in their game plan, as it measures how adept they are at crafting high-quality chances, juxtaposed against their effectiveness in curbing opponents from doing the same.
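A minimal sketch of the computation, under my assumption that xGSRD is a team's attacking xGSR minus the xGSR of the shots it concedes (so a strong defence, which scrambles opponents' shot quality, lowers the subtracted term):

```python
import statistics

def xgsr(xgs):
    """Mean xG divided by the (sample) standard deviation of xG."""
    return statistics.mean(xgs) / statistics.stdev(xgs)

def xgsrd(xg_for, xg_against):
    """Attacking xGSR minus the xGSR conceded to opponents."""
    return xgsr(xg_for) - xgsr(xg_against)

# Hypothetical season: consistent quality created, erratic quality conceded
created = [0.15, 0.18, 0.20, 0.22, 0.25]
conceded = [0.02, 0.03, 0.05, 0.30, 0.40]
```

A team with this profile posts a comfortably positive xGSRD: steady chance creation paired with a defence that forces wildly varying shot quality.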

Our leaderboard echoes familiar names, with PSG reigning supreme. However, with the departure of stalwarts Messi and Neymar, both renowned for their impeccable knack for creating high-caliber chances, the future landscape might be less predictable for PSG. Will there be a notable shift in their chance creation next season? Time will tell.

On the other end of the spectrum, familiar narratives persist. Leeds find themselves at an unfavourable end, a telling reflection of their season’s challenges. Liverpool, too, makes an appearance, signalling the pronounced issues they’ve had in their defensive configuration.

We can take a further look into some of these interesting cases over time.

Both Merseyside teams have witnessed fluctuations in their xGSRD metrics. However, what might raise eyebrows is that last season, Liverpool unexpectedly lagged behind Everton in this regard. But, context is key: while xGSR provides insight into shot quality, it doesn’t account for the sheer volume of shots. With Liverpool boasting a shot_count_diff of 79, compared to Everton’s surprising -294, they clearly had the upper hand in terms of attack frequency and total xG (shot_count_diff is the delta between shots_taken and shots_faced).

Concerns for Liverpool’s defence have only really grown louder in the last season. Their aggressive gegenpressing has left space for transitions as their initial pressures have failed more regularly. The data, however, suggests this wasn’t just a one-season hiccup; the decline started earlier — they’ve been getting worse every year since the 18/19 season! And, judging by their recent performance against Chelsea, their rebuild may not be complete just yet.

Spurs under Conte presented an intriguing narrative. Despite some claiming his tenure missed the mark, the xGSRD data paints a promising picture. Tottenham not only carved out quality chances but also effectively thwarted their rivals. With Ange now in charge, there's a palpable sense of anticipation about the club's tactical shift. Ange's approach appears to starkly contrast the Mourinho/Conte era.

Speaking of Mourinho, over in Italy, AS Roma is turning heads. Under the guidance of José, they've mastered the art of keeping opponents confined to inconsistent and low-quality shots. Mourinho has been known to tailor his match-specific training to nullify the strengths of his opposition — this metric lets us see that he's amongst the best in the business at doing so! It's interesting to note that coaches in that mould (Mourinho/Conte) seem to be faring well in this metric.

4. Balancing Quality with Quantity: The Final Verdict on xGSR Metrics

The introduction of xGSR, xGSRc (the conceded counterpart) and xGSRD as new metrics offers a fresh perspective on evaluating both individual player and team performances. Instead of merely focusing on the quantity of chances created or conceded, these metrics shed light on the consistency and quality of those opportunities.

For players, xGSR allows for a deeper dive into the nature of chances they’re involved in. It’s not just about how many shots or chances a player creates, but the regularity and quality of those opportunities. As illustrated, certain players may not frequently create chances, but when they do, they tend to be of higher quality — can we use this metric to increase play through certain players?

From a team perspective, xGSRD gives an insight into the equilibrium between a team’s offensive potency and defensive solidity. Teams with higher xGSRD values not only generate better chances consistently but also thwart their opponents from doing the same. The metric, however, doesn’t function in isolation. For a comprehensive assessment, it must be juxtaposed with traditional xG and shot count data. Generally, this metric can be used to understand the progress a club is making under a new coach or new style of play.

What I was most pleased about, and not just because I’m a United fan, is seeing Liverpool sit right at the very bottom of the xGSRc leaderboard. A lot of people have commented on how their engine room last year let so many transitional plays through far too easily. These transitional opportunities often yield high xG values as you’ve caught your opponent out of their rest defence. We now have a metric which tells us they were the very worst team in the top leagues at letting through these kinds of chances!

For fans and analysts, these metrics introduce another layer to the rich tapestry of football analytics. As with all statistics, caution is advised. No single number can capture the entirety of a game as fluid and multifaceted as football. However, by integrating these new perspectives, we can gain a richer, more nuanced understanding of the sport we love.

AOB

This article ventures beyond the surface of xG by delving into a “second-order” perspective, which essentially means we’re eyeing the standard deviation, often dubbed the second moment. But one might wonder, why halt at the second moment? Why not venture into Skew (the third moment) or even Kurtosis (the fourth)?

The rationale is pretty straightforward. As you climb up the ladder of moments, the demand for data swells. To truly grasp these moments, we’d have to rewind further, incorporating more shots and games into our analysis. Yet, there’s a snag. The deeper we dive into the past, the less pertinent those shots or games become to today’s team dynamics.

Still, it’s worth noting that xG isn’t the sole metric ripe for this expanded perspective. Consider metrics like xT and VAEP, which zoom into the minutiae of the game. Given their detailed nature, they amass data at a brisker pace and in these domains, exploring the third or fourth moments may well shine a revelatory light.

I haven’t written an article of this kind for quite a few years and quite enjoyed it — would welcome any comments or critiques! Thanks to my co-writer ChatGPT and to T. Barrie for giving it a brief edit.
