Attributing xG Allowed to individual defenders

Ricardo Tavares
Published in Football Crunching
Mar 14, 2021

A defender is a player “whose primary roles are to stop attacks during the game and prevent the opposing team from scoring goals”, according to Wikipedia. There are a lot of tools to do that: some are active (e.g. tackles, pressures, aerial duels), some are passive (positioning).

Active defending can be measured by events, but passive defending can’t. The effect has to be measured indirectly: if your positioning is good, the opponents will have fewer opportunities to score. xG Allowed (or Against) is, in fact, widely used as a measure of a team’s defensive solidity.

From FBref.com

How do we go from there to an individual measure? Short answer: mix it with marking, and attribute the xG to the players marking each shooter.

For a perfect analysis we would need to use tracking data in order to attribute each shot to a specific defender, but there isn’t a big enough public dataset to try this idea with: even for experimentation we need at least multiple matches with players consistently identified.

For this post, I’m trying a simpler approach, using Statsbomb’s free WC2018 data. Statsbomb’s data includes, for each shot, the location of every visible defender and attacker. Here’s an example, from Spain vs. Portugal:

We consider the optimal marking position to be between the shooter and the goal, and any player within a certain distance of that position is considered a marker. If multiple markers exist, we calculate a weight for each marker based on how close each player is to the optimal marking position (a higher weight given to the player closest to it). Each shot’s weights must total 100%.
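Here is a minimal sketch of how that weighting could look in Python. The exact numbers are not spelled out above, so the `offset` (how far in front of the shooter the optimal position sits), the `radius` (how close a defender must be to count as a marker) and the inverse-distance weighting are all assumptions for illustration, not the values behind the charts in this post. Coordinates follow Statsbomb's 120 x 80 pitch, with the goal centre at (120, 40).

```python
import math

# Statsbomb pitch: 120 x 80, shots assumed normalised to attack towards x = 120.
GOAL_CENTRE = (120.0, 40.0)


def optimal_position(shooter, offset=1.5):
    """Point `offset` units from the shooter on the line to the goal centre.

    The offset value is an assumption made for this sketch.
    """
    dx = GOAL_CENTRE[0] - shooter[0]
    dy = GOAL_CENTRE[1] - shooter[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return shooter
    step = min(offset, dist)  # never place the target behind the goal
    return (shooter[0] + dx / dist * step, shooter[1] + dy / dist * step)


def marker_weights(shooter, defenders, radius=3.0, offset=1.5, eps=0.1):
    """Return {defender: weight} for defenders within `radius` of the target.

    Weights are inverse-distance based (an assumed formula) and sum to 1.
    An empty dict means the shot is unmarked.
    """
    target = optimal_position(shooter, offset)
    closeness = {}
    for name, (x, y) in defenders.items():
        d = math.hypot(x - target[0], y - target[1])
        if d <= radius:
            closeness[name] = 1.0 / (d + eps)  # eps avoids division by zero
    total = sum(closeness.values())
    return {name: c / total for name, c in closeness.items()}
```

Called with a shooter location and a dictionary of defender locations, this returns one weight per marker, with the largest share going to the defender closest to the optimal position.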

Finally, we add each shot’s xG to its markers (multiplied by the weight), and get the xG Allowed by player. Unmarked shots’ xG are added up separately.
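The aggregation step can be sketched as below. The input format (a list of `(xg, weights)` pairs, where `weights` is empty for an unmarked shot) is a hypothetical structure chosen for this example, and the numbers in any call are made up rather than taken from the World Cup data.

```python
from collections import defaultdict


def xg_allowed(shots):
    """Split each shot's xG across its markers.

    `shots` is an iterable of (xg, weights) pairs, where `weights` maps
    defender -> share of the shot (shares sum to 1) and is empty when
    nobody was marking the shooter. Returns (per_player, unmarked_xg).
    """
    per_player = defaultdict(float)
    unmarked = 0.0
    for xg, weights in shots:
        if not weights:
            unmarked += xg  # unmarked shots are tracked separately
            continue
        for player, share in weights.items():
            per_player[player] += xg * share
    return dict(per_player), unmarked
```

A useful sanity check is that the per-player totals plus the unmarked total add up to the team's xG Allowed.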

This is the outcome for Portugal’s 4 matches at the World Cup:

As expected, most players featured on the list are defenders. Surprisingly though, João Moutinho, a midfielder, has the most xG allowed, due to what seems to be a particularly bad match against Spain.

Now let’s add shots and goals:

Some interesting points that warrant further analysis:

  • Unmarked shots overperformed the xG. Can this be a valuable new feature to the xG model?
  • Although Pepe had high xG and Shots allowed numbers, he didn’t allow any goals. Can good defenders overperform the xG Allowed (e.g. by blocking shots better)?

While the numbers seem interesting, the results don’t seem to highlight the better defenders: Pepe had a great tournament, but his values are similar to José Fonte’s; Cedric is a better defender than Raphael Guerreiro, but he had much worse numbers (0.18 xG Allowed per 90 vs 0.03).

Can we make this metric better?

Going deeper

Let’s look at the above example again:

Pepe (#3) seems to be the main marker for Diego Costa, with José Fonte (#6) close by.

Now let’s look at the two key moments before the shot:

The moment of the two passes before the shot

We can now see that it was in fact José Fonte marking Diego Costa.

When the shot is made, Pepe is actively decreasing the chance of scoring by being between the shooter and the goal. Saying he “allowed” the xG to happen is not really true. It happened mostly on Fonte’s watch.

Let’s look at another example: Cavani’s goal against Portugal.

With this information, we can assume either José Fonte (#6) or Raphael Guerreiro (#5) allowed Cavani to shoot. The model gives José Fonte a weight of 57%.

Now let’s look at the freeze frame for the assist. I collected this one myself and translated it to Statsbomb’s coordinate system.

A much clearer picture appears: Cavani was being marked by Raphael Guerreiro, who let Cavani appear uncontested in front of the goal. While Fonte tried to get there, he never had much chance of success.

Adding the assist to the marking calculation would, assuming each moment is worth 50%, increase Raphael Guerreiro’s weight from 43% to 72%, a much better representation of what happened.
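As a rough illustration of that blending, the sketch below averages the marking weights from several moments of the same play. The shot-frame weights (57% and 43%) are the ones quoted above. The assist-frame weights are an assumption: I take Raphael Guerreiro to be the only marker at that moment, which is consistent with the 72% figure. The function name and input format are hypothetical.

```python
def blend_weights(frames, moment_weights=None):
    """Combine marking weights from several moments of one play.

    `frames` is a list of {defender: weight} dicts, one per moment.
    `moment_weights` says how much each moment counts and defaults to an
    equal split. If a frame is empty (nobody marking), its share is simply
    not assigned to any player.
    """
    if moment_weights is None:
        moment_weights = [1.0 / len(frames)] * len(frames)
    if len(moment_weights) != len(frames):
        raise ValueError("need exactly one moment weight per frame")
    blended = {}
    for frame, moment_weight in zip(frames, moment_weights):
        for player, weight in frame.items():
            blended[player] = blended.get(player, 0.0) + moment_weight * weight
    return blended


# Shot frame: weights quoted in the post.
shot_frame = {"José Fonte": 0.57, "Raphael Guerreiro": 0.43}
# Assist frame: assumed single marker, for illustration only.
assist_frame = {"Raphael Guerreiro": 1.0}

blended = blend_weights([shot_frame, assist_frame])
```

With these inputs Raphael Guerreiro ends up with roughly 72% of the shot and José Fonte with the rest. Moments further back in the play could be added the same way, possibly with smaller moment weights.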

Watching the full play will reinforce that idea:

Conclusion

This is mostly an experiment, not a proper study, but hopefully I’ve convinced you that:

  1. It is valuable to try to attribute xG Allowed to specific defenders in order to measure their performance
  2. We need more than the moment of the shot to make a proper attribution

Statsbomb is showing, in the Statsbomb Evolve conference on March 17th, their new 360 product, which promises to include freeze frame information for every event collected.

I believe that should be quite valuable to perfect a measure like this one. Exciting times are ahead. Stay tuned!

Final notes:

If you liked this post, you may be interested in my previous post about Marking, focused on analyzing the relationships between attackers and markers during a whole match.

This post wouldn’t be possible without Statsbomb’s free data. Thanks for making it available to everybody!

Also thanks to Andrew Rowlinson, who created mplsoccer, the Python library used to make the graphics in this post.
