Top 5 leagues styles of play using data
(And some interactive dashboards to explore them.)
Football (soccer) analytics is becoming more and more popular, even with the mainstream audience. Data is now widely used to assess the performances of teams and players, and the classic stats (goals, assists, tackles, …) are flanked or replaced by advanced metrics (expected goals/assists/threat, field tilt, …) that enable an in-depth understanding of the game that was impossible a few years ago.
Evaluating teams’ levels of performance is a core aspect of football analytics, but identifying tactical patterns and common playstyles is also a task that data can support. The boundary between evaluating performance and evaluating playing style with data is thin, yet the two are distinct: for example, two teams can produce very different amounts of non-penalty expected goals while relying on similar principles and ideas in terms of buildup and threat creation.
In this story you’ll read how I tried to profile the 98 teams of the top 5 European leagues in terms of style of play, focusing on stats that in my opinion highlight this aspect rather than the effectiveness of the performance. Long story short, I didn’t use any stats strongly related to results and performance like goal difference, xG, or field tilt.
This analysis was inspired by the brilliant work of Matteo Pilotto. At the end of the article, you’ll also find two interactive dashboards to freely explore the data that I will try to keep updated until the end of the season.
How I built this
The first step was collecting and preparing the data. FBref is a great free source of information: to build the dataset I just downloaded all the team data for the top 5 leagues for this season. One metric that I like to use in my analyses, PPDA, is not available on FBref. It is published by Understat, so I took the PPDA values from that website and joined them to each team in the FBref dataset.
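The join between the two sources can be sketched in a few lines of plain Python. This is a minimal illustration, not the exact code used for this article: the field names (`team`, `ppda`), the sample rows, and the `name_map` used to reconcile team names spelled differently in the two sources are all assumptions, and the numbers are placeholders rather than real data.

```python
def join_ppda(fbref_rows, ppda_by_team, name_map=None):
    """Attach a 'ppda' value to each FBref row, matching on team name.

    name_map is a hypothetical dict translating FBref team names into the
    spelling used by the PPDA source when the two differ.
    """
    name_map = name_map or {}
    joined, unmatched = [], []
    for row in fbref_rows:
        key = name_map.get(row["team"], row["team"])
        if key in ppda_by_team:
            joined.append({**row, "ppda": ppda_by_team[key]})
        else:
            unmatched.append(row["team"])
    return joined, unmatched


# Placeholder rows and values, not real data.
fbref_rows = [
    {"team": "Team A", "touches": 700, "passes_attempted": 520},
    {"team": "Team B FC", "touches": 610, "passes_attempted": 400},
]
ppda_by_team = {"Team A": 9.5, "Team B": 13.2}
name_map = {"Team B FC": "Team B"}

joined, unmatched = join_ppda(fbref_rows, ppda_by_team, name_map)
```

Checking `unmatched` after the join is a cheap way to catch teams whose names differ between the two sources.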
Once the initial dataset was ready I started to dig into the data to explore and identify the stats that could be useful to highlight and define aspects of the style of play. Lastly, I ranked all 98 teams according to the indicators I had chosen. Again, the focus in the features/metrics selection was on highlighting the nuances of the style of play of each team, so in this analysis a low rank doesn’t necessarily mean “bad” and a higher one “good”.
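The ranking step can be sketched as follows. The rank values quoted later in the article (such as “over 80” or “over 90”) suggest a 0 to 100 scale, but the exact formula is not spelled out here, so this sketch assumes a simple percentile rank with ties sharing the same value. The function name, the `invert` flag, and the sample numbers are illustrative only.

```python
def percentile_ranks(values, invert=False):
    """Map each team's raw value to a 0-100 rank among all teams.

    values: dict of team name -> raw metric value.
    invert: set to True when a lower raw value should get a higher rank
            (for example PPDA, where fewer passes allowed per defensive
            action means more intense pressing).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two teams to rank")
    ranks = {}
    for team, value in values.items():
        below = sum(1 for other in values.values() if other < value)
        ties = sum(1 for other in values.values() if other == value) - 1
        rank = 100 * (below + 0.5 * ties) / (n - 1)
        ranks[team] = 100 - rank if invert else rank
    return ranks


# Placeholder values, not real data.
possession = {"Team A": 0.62, "Team B": 0.48, "Team C": 0.55}
ppda = {"Team A": 8.0, "Team B": 14.0, "Team C": 11.0}

possession_rank = percentile_ranks(possession)
pressing_rank = percentile_ranks(ppda, invert=True)
```

With this convention the team with the lowest raw value gets 0 and the one with the highest gets 100, which matches the idea that a rank describes a tendency rather than a quality.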
The metrics
Here’s a brief description of the 10 aspects of the style of play I chose to include, divided into 3 macro-areas, and the stats and metrics I used to evaluate and rank them.
Possession metrics
- Ball possession: after considering different stats I chose to simply use the average ball possession share. I’m not completely satisfied with this choice because the amount of time a team keeps the ball is obviously not only the result of a tactical choice, but I think it’s fair enough.
- Passing tempo: here I wanted to quantify the speed of the possession phase. I used the ratio between the attempted passes and the touches made by the team. A high value of passing tempo indicates few touches between one pass and the next, while a low value means a lot of touches between passes.
- Directness: the idea was to evaluate how vertical and direct the approach of the team is. This rank is based on the share of progressive passes and runs over the total amount of passes and runs made by the team.
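The three possession metrics above can be sketched as a small function. The dictionary keys are hypothetical names for season totals (I map “runs” to a `carries` field here, which is an assumption), and the numbers are placeholders, not real data.

```python
def possession_metrics(team):
    """Compute the possession-related style metrics from season totals."""
    # Passing tempo: attempted passes per touch (higher = fewer touches per pass).
    passing_tempo = team["passes_attempted"] / team["touches"]
    # Directness: progressive passes and carries over all passes and carries.
    progressive = team["progressive_passes"] + team["progressive_carries"]
    total_actions = team["passes_attempted"] + team["carries"]
    return {
        "possession": team["possession"],
        "passing_tempo": passing_tempo,
        "directness": progressive / total_actions,
    }


team = {  # placeholder totals, not real data
    "possession": 0.55,
    "touches": 625,
    "passes_attempted": 500,
    "carries": 400,
    "progressive_passes": 40,
    "progressive_carries": 20,
}
metrics = possession_metrics(team)
```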
Offensive and threat metrics
- Crossing: to estimate how much a team relies on crossing to create threats, I considered the share of penalty area entries made via cross.
- Dribbling: that’s simply the number of attempted dribbles, as a measure of how much a team relies on 1vs1 situations.
- Air duels: this rank is just the sum of air duels played by the team, both won and lost. Using FBref’s data it’s not possible to distinguish between defensive and offensive air duels, and the choice to consider them as an “offensive and threat” trait of the playstyle is certainly debatable. What I wanted to highlight is the tendency of each team to lift the ball. Even if the number of air duels played also depends on what the opponents do, I thought that in the long run the teams that rely massively on this solution would emerge.
- Set pieces: FBref publishes the data on SCA (shot-creating actions), and I chose to use the share of SCA created via dead-ball passes to estimate how much a team relies on set pieces to generate threats.
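As a rough sketch, the four offensive metrics above reduce to two shares and two plain counts. The key names below are hypothetical labels for season totals, and the values are placeholders, not real data.

```python
def share(part, total):
    """Return part / total, or 0.0 when the total is zero."""
    return part / total if total else 0.0


def offensive_metrics(team):
    """Compute the offensive style metrics from season totals."""
    return {
        # Share of penalty area entries that came from a cross.
        "crossing": share(team["box_entries_via_cross"], team["box_entries"]),
        # Raw number of attempted dribbles.
        "dribbling": team["dribbles_attempted"],
        # All air duels played, won and lost.
        "air_duels": team["aerials_won"] + team["aerials_lost"],
        # Share of shot-creating actions that came from dead-ball passes.
        "set_pieces": share(team["sca_dead_ball_pass"], team["sca_total"]),
    }


team = {  # placeholder totals, not real data
    "box_entries": 200,
    "box_entries_via_cross": 50,
    "dribbles_attempted": 300,
    "aerials_won": 250,
    "aerials_lost": 230,
    "sca_total": 400,
    "sca_dead_ball_pass": 30,
}
metrics = offensive_metrics(team)
```

How the total number of penalty area entries is assembled from the source columns is left open here, since it depends on whether crosses are already counted among the passes into the box.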
Defensive and pressing metrics
- Defense height: when it comes to playstyle in the defensive phase it’s important to understand if a team prefers to stay deep and defend in a positional way or tries to press and regain the ball higher up the pitch. Finding a way to model this nuance of the style of play using stats is not an easy task, and it would probably need more data than those available on FBref, like the average height of the team and of the defensive line. After digging a bit into the data I chose to use the share of total tackles attempted in the defensive third. This is not a perfect criterion, but it’s based on the assumption that a team that tends to defend in a positional and cautious way will attempt a larger share of its tackles in the defensive third.
- Pressing intensity: for this ranking, I just used the PPDA.
- High pressing: in a similar way to the “defense height”, I used the share of pressures applied in the attacking third.
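The defensive metrics above can be sketched in the same way. The key names and numbers are placeholders, and the orientation of `defense_height` (one minus the defensive-third share, so that a higher value means a higher defense) is my assumption for illustration; the same effect could be obtained by inverting the rank instead.

```python
def defensive_metrics(team):
    """Compute the defensive style metrics from season totals."""
    def_third_share = team["tackles_def_third"] / team["tackles"]
    return {
        # Assumption: fewer tackles in the defensive third = higher defense.
        "defense_height": 1 - def_third_share,
        # PPDA is taken as published; a lower value means more intense pressing.
        "pressing_intensity": team["ppda"],
        # Share of pressures applied in the attacking third.
        "high_pressing": team["pressures_att_third"] / team["pressures"],
    }


team = {  # placeholder totals, not real data
    "tackles": 400,
    "tackles_def_third": 180,
    "pressures": 3000,
    "pressures_att_third": 900,
    "ppda": 10.0,
}
metrics = defensive_metrics(team)
```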
Exploring the data
After ranking all the teams according to the defined playstyle metrics I put together a simple visualization using coxcomb/pizza charts.
I found myself spending hours digging and exploring the data and I wrote down some examples of interesting stuff that can be discovered. I encourage you to do the same using the interactive dashboards linked at the end of the page.
Extreme styles of play produce extreme chart shapes
After generating the charts I started to navigate through the leagues and the teams for a first-look evaluation of the result, and the shape of Manchester City’s playstyle chart caught my attention. I was quite satisfied to see such a sharp and extreme shape for a team with clear and well-implemented principles.
I think that the intricate and complex possession phase of the Sky Blues is accurately represented by an extremely high possession share and passing tempo and minimal directness. Top ranks for high pressing and height of tackles seem pretty reasonable too.
Why we should watch more Ligue 1 matches
For some years now Ligue 1’s slogan has been “the league of talents”. For sure the French championship is the league of dribbling. The teams from Ligue 1 have the highest average rank in the dribbling metric, and that’s not only due to the presence of Neymar and Mbappé.
Even teams in the lowest positions in the standings like Angers, Clermont Foot, and Saint-Étienne rely heavily on dribbling and 1vs1 situations to create scoring chances or overcome the opponents’ pressure. Besides PSG’s stars, there are other exciting dribblers like Boufal, Faivre, Doku, Aouar, Sulemana, and many others. Ligue 1 is often considered the Cinderella of the classic top 5 leagues, but it’s probably one of the most fun to watch.
When it comes to being direct, Bundesliga teams don’t mess around
The Bundesliga is by far the league with the highest average rank for directness, with 12 of the 18 teams having a rank over 80. Mainz 05, Greuther Fürth, Union Berlin, Wolfsburg, Hoffenheim, FC Cologne, RB Leipzig, and Freiburg even have a rank over 90.
The emphasis placed in the German league on aspects such as physicality, aerial game, and verticality is reflected in the average ranks for air duels and share of shot-creating actions from set pieces too. Even Bayern Munich, despite low values for set-piece reliance, air duels attempted, and crossing, follows the league trend with a directness rank of 85, the highest among the league leaders of the top 5 leagues.
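League-level comparisons like the ones above boil down to grouping the team ranks by league, then averaging them and counting how many teams exceed a threshold. Here is a minimal sketch under stated assumptions: the row layout, the `directness_rank` key, and the `league_summary` helper are hypothetical, and the ranks are placeholders, not real data.

```python
from collections import defaultdict
from statistics import mean


def league_summary(rows, metric, threshold=80):
    """Average a rank per league and count the teams above a threshold."""
    by_league = defaultdict(list)
    for row in rows:
        by_league[row["league"]].append(row[metric])
    return {
        league: {
            "avg_rank": mean(ranks),
            "teams_over": sum(1 for r in ranks if r > threshold),
        }
        for league, ranks in by_league.items()
    }


rows = [  # placeholder ranks, not real data
    {"team": "Team A", "league": "League X", "directness_rank": 90},
    {"team": "Team B", "league": "League X", "directness_rank": 85},
    {"team": "Team C", "league": "League X", "directness_rank": 40},
    {"team": "Team D", "league": "League Y", "directness_rank": 30},
    {"team": "Team E", "league": "League Y", "directness_rank": 50},
]
summary = league_summary(rows, "directness_rank")
```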
Fluidity over duels?
The Italian Serie A is the league where, on average, the fewest dribbles are attempted, and even in this context Inter ranks second to last for both completed and attempted dribbles. No other high-ranking team relies so little on this kind of situation. The team may not include extremely skilled dribblers, but the low numbers are clearly not a matter of the quality of individual players.
Simone Inzaghi decided to focus on rotations and positional fluidity, instead of individual duels and dribbling, to create space and progress the ball. So far this approach has not limited the team’s attacking and threat-creation potential, as shown by xG and other attacking stats.
Sarri’s legacy is one that lasts
Maurizio Sarri left Napoli in 2018 and since then three very different coaches have sat on that bench: Ancelotti, Gattuso, and Spalletti. Even if only a few players from that Napoli side are still consistently involved (Koulibaly, Insigne, and to a lesser extent Mertens), Sarri’s legacy still seems to be rooted deep in the DNA of the club. The elaborate possession phase, the intense and high pressing, and the very low reliance on crossing and air duels are principles that have been in place for many years now. Maurizio Sarri is now facing several difficulties on Lazio’s bench, but it’s nice to see how similar the chart shapes of his current team and of Napoli are. Even if the effectiveness and performance levels of the two teams are very different, the charts show the similarities in their styles of play.
Conclusions & interactive dashboards
And now here’s the reward for reading all my considerations of doubtful interest (or just skipping to the end of the article). On my Tableau Public profile, you’ll find a dashboard to explore the data of the top 5 leagues and another one for single-team analysis. You can play around with the data and find your own insights. There’s a lot of data and stuff here, so I didn’t want to just fill a page with a ton of charts; I preferred to focus on the process that led to the creation of these charts and then make the tools available. Again, this is not an exact science and the metrics are not perfect, but I think that they are an adequate approximation.
Click here for the leagues dashboard
Click here for the single team dashboard
Ping me on Twitter with any feedback and, above all, constructive criticism!
In the next article, I’ll try to identify the most similar teams in terms of style of play by applying some clustering algorithms to my ranked dataset.