# Football Insights from FIFA Data: What Comes and Goes with Age

## Essential football lessons anyone can learn using data from the popular Electronic Arts game series. Part #1: What comes and goes with age.

14 min read · Nov 15, 2022


# Introduction

I, like many of you, have played the EA FIFA game series for more than two decades. So when I looked for data for a new project and saw the open FIFA dataset, I was immediately intrigued. It has full player ratings spanning almost a decade, granting a great opportunity to learn about the development of football players over time. With market value and additional metadata on top of players’ attributes, this dataset may produce fascinating insights into players’ development, transfers, and price dynamics.

This mini-series will focus on what football insights we can squeeze out of the EA FIFA dataset alone. We will start from the basics, exercising some descriptive statistics to get a grasp on the domain. Gradually, we will tackle more complex (and useful) tasks that require heavier mathematical tools.

# Data

The data for this project is essentially a merge of three different datasets I found on Kaggle: the FIFA 21 dataset (stats from FIFA 15–21), the FIFA 22 dataset, and the FIFA 23 dataset. Combined, these datasets cover 45,630 players (by ID) and 1,017 clubs and teams over the years 2015–2023.
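To make the merge step concrete, here is a minimal sketch of how yearly snapshots from several sources could be stacked and de-duplicated. The field names (`player_id`, `fifa_version`, `overall`) and the toy rows are assumptions for illustration, not the real Kaggle schema, and the overlap between sources is hypothetical.

```python
# Toy stand-ins for the three Kaggle sources. The field names
# (player_id, fifa_version, overall) are assumptions, not the real schema.
fifa_15_21 = [
    {"player_id": 1, "fifa_version": 20, "overall": 88},
    {"player_id": 1, "fifa_version": 21, "overall": 89},
    {"player_id": 2, "fifa_version": 21, "overall": 75},
]
fifa_22 = [
    {"player_id": 1, "fifa_version": 22, "overall": 89},
    {"player_id": 3, "fifa_version": 22, "overall": 70},
]
fifa_23 = [
    {"player_id": 1, "fifa_version": 22, "overall": 89},  # hypothetical overlap
    {"player_id": 3, "fifa_version": 23, "overall": 72},
]


def combine(*sources):
    """Stack sources, keeping one row per (player_id, fifa_version)."""
    merged = {}
    for source in sources:
        for row in source:
            key = (row["player_id"], row["fifa_version"])
            merged.setdefault(key, row)  # the first source wins on duplicates
    return [merged[key] for key in sorted(merged)]


rows = combine(fifa_15_21, fifa_22, fifa_23)
unique_players = {row["player_id"] for row in rows}
print(len(rows), "player-season rows,", len(unique_players), "unique players")
```

Counting players "by ID", as done above, simply means taking the set of distinct IDs across all seasons.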

We will use the following information: (1) Player metadata: age, position, etc.; (2) Player attributes: overall and potential ratings; six main attributes (pace, shooting, passing, dribbling, defending, physic); and 40 skill ratings (crossing, jumping, free-kick accuracy, and more); (3) Player value: price, wage, and contract end-date.

Addressing major limitations and biases: Firstly, the selection of leagues distorts the true worldwide population of players. Secondly, players’ ratings and values may be over- or underestimated due to human evaluator bias, overall hype, sponsored entities, etc. To mitigate these understandable concerns, we won’t address the absolute values of the attributes, but rather self-ratios and their trends over time. Additional details regarding bias will be elaborated on later.

# What comes and goes with age

Naturally, all people, including athletes, experience degradation in some physical aspects as they age, such as pace, acceleration, or stamina. However, footballers like Karim Benzema and Virgil van Dijk prove that years of experience and seasons of practice can benefit players, producing late bloomers. What is it that players gain with time? What might be lost or damaged? And how are these opposing forces aggregated into their overall performance and value? With the FIFA dataset in hand, let’s find out.

This first chapter will cover three main topics: the evolution of on- and off-ball skills, career development, and the first step in understanding players’ potential.

## Warm up: Global trends

To get a basic feel for the data, we will start by describing some ‘global’ trends. Specifically, we’ll inspect a few basic characteristics over age for all players, without any differentiation.

No need to act surprised. The opening figure, Fig. 2, acts as a sanity check for our data collection and processing. As expected, pace decays as we get older, peaking at 22–25 and starting its steep decline after the age of 30. Interestingly, this trend is not aligned with the behavior of the overall rating curve. The latter tops out at the age of 33, flattens, and then starts its steep descent only after the age of 35. This may be a first hint that players can gain more than they lose with time. Another, rather esoteric conclusion from Fig. 2 is that weight, on average, consistently grows with age. It is reasonable to assume that at younger ages this growth comes more from muscle than from fat, and less so at older ages.
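Curves like these boil down to averaging an attribute over all players of the same age. A minimal sketch of that aggregation, using invented toy rows and assumed field names (`age`, `pace`, `overall`):

```python
from collections import defaultdict
from statistics import mean

# Toy rows; the values are invented for illustration only.
rows = [
    {"age": 22, "pace": 80, "overall": 68},
    {"age": 22, "pace": 76, "overall": 70},
    {"age": 30, "pace": 70, "overall": 76},
    {"age": 30, "pace": 66, "overall": 74},
    {"age": 35, "pace": 58, "overall": 73},
]


def mean_by_age(rows, attribute):
    """Average one attribute over all players of the same age."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row["age"]].append(row[attribute])
    return {age: mean(values) for age, values in sorted(buckets.items())}


pace_curve = mean_by_age(rows, "pace")
overall_curve = mean_by_age(rows, "overall")
print(pace_curve)
print(overall_curve)
```

In this toy data, pace falls with age while the overall rating is higher at 30 than at 22, which is the kind of divergence described above.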

Warmup time is over. It’s time for the kickoff ⚽️️

## Skill evolution with age

To better understand what drives performance with age, we will take three different views on the elements composing the overall rating: (1) main attributes, (2) physical abilities, and (3) mentality & skills.

Observing Fig. 3, the first thing to stand out is the variance in the attribute peak timings (marked by 🔝 on the same age-x vertical line). With just a brief look, it is clear that some attributes peak early, around age 27, while others are like wine, getting better with age. Focusing on the bottom left subfigure, some of the skills peak early (24–25 for pace and dribbling), while others peak after 31, up to 34–35 for shooting, composure, and passing. An interesting lesson from the right subfigure is that although movement-related abilities are negatively affected before 25, strength remains high.
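Given an age-averaged curve per attribute, the peak age is just the age at which the curve reaches its maximum. A small sketch, with hypothetical curves that are not taken from the dataset:

```python
# Hypothetical age-averaged curves: attribute -> {age: mean rating}.
curves = {
    "pace": {21: 70, 24: 72, 27: 71, 30: 68, 33: 63},
    "shooting": {21: 50, 24: 54, 27: 57, 30: 59, 33: 60},
}


def peak_age(curve):
    """Return the age with the highest mean rating (earliest age wins ties)."""
    return max(sorted(curve), key=lambda age: curve[age])


peaks = {attribute: peak_age(curve) for attribute, curve in curves.items()}
print(peaks)
```

Breaking ties toward the earliest age is an arbitrary choice made for this sketch.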

So far, results suggest that on-ball skills can compensate for the decline in physical shape, keeping overall performance up. However, in reality, we know that tactical objectives vary across positions, hence the set of skills they require is distinct as well. For example, losing your speed as a striker can be more impactful than as a goalkeeper or a central defensive midfielder.

Fig. 4 shows how attackers and their related skills are affected by age (right figure) as well as defenders and their corresponding abilities (left figure). It appears that although offensive attributes are in a steady state at age 30, their mix is different. While on-ball skills seem to consistently improve (e.g., free-kicks, passing, and heading), off-ball skills show an opposing trend: mentality gets higher, and physical attributes mostly get lower (Fig. 3).

Defenders (Fig. 4, right) especially seem to be late bloomers. Their rating, on average, steadily grows until the age of 30, exhibiting a further increase at ages 33–35. Then, almost miraculously, all attributes peak together. Note: sample size > 150 for all values in the figures, except for attackers at age 36 (82).

Overall, attackers and defenders reach their top performance at different ages: 33 for attackers (like the general population, Fig. 2), and 35 for defenders. Both are way over the age of 30, not so long ago considered the beginning of the end for players’ ability to perform at the highest levels. But, one has to wonder (or doubt) — does the overall rating really represent the quality of the player?

Even the common amateur FIFA player probably knows that the overall rating alone is not enough. A high-rated but slow defense may be at high risk against fast-as-light attackers who can exploit that with through balls. Just imagine facing Mbappe and Haaland, with (the current) Gerard Pique and Harry Maguire as your duo at the back. Horrific indeed.

And yet, these trends do align with basic knowledge we have of the game, providing a sort of validation of the analysis. In addition, one can focus on specific attributes of interest, rather than the overall rating. Besides, FIFA is not yet a real football match, where a collection of individuals plays as one, guided by tactics that compensate for their weaknesses and amplify their strengths.

So far we’ve analyzed aggregated trends. Now, let’s bring it down to earth with some real individual-level examples. Fig. 5 provides an interactive view of several well-known players over time (left) as well as the main skills evolution of the best player the world’s ever seen — Leo Messi (right subfigure).

Interestingly, despite Messi’s overall rating being steady during this period, his physical attributes decreased after the age of 27–28, while his playmaking skills (passing, vision, etc.) actually improved. For example, his long passing (about 75 at the age of 27) overtook his acceleration (96 at the age of 27) before the age of 32, as Messi shifted from a false 9/right winger to more of a CAM position. This can also be shown with heatmaps.

The left subfigure provides some evidence of the variety of career patterns players may have. Among them, we can witness stars on the rise like João Cancelo or N’Golo Kante; players whose performance declines with age like Falcao; and players who were able (so far, in terms of overall rating) to maintain their level, such as Luka Modric, Messi, Lewandowski, and Jordi Alba. Moreover, some players, like Coutinho, exhibit a full parabolic career shape, squeezed into roughly half the duration of a full career. For players like him, the golden age is probably over.

Acknowledging that each player is unique in his development (Fig. 5), it is time to pull out more suitable statistical tools, inspecting how these attributes are distributed over age and players.

## Year over Year (YoY) analysis

Here, we take the first step in understanding how overall performance is distributed by age, by studying the yearly change in each player’s overall rating.

Inspecting the left subfigure of Fig. 6, the majority of players grow (positions above the red line) until the ages of 23–24, where there is a split in behavior: either decay or further growth. Subsequently, ages 29–33 seem to mirror the age 23–24 crossroads. Later, it seems like the sun goes down for essentially everyone. As mentioned before, the development process can differ across positions, as demonstrated in the right sub-chart of Fig. 6.
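The YoY quantity behind this kind of chart can be sketched as the difference between a player's overall rating at a given age and at the previous age. The record layout below (`player_id`, `age`, `overall`) is an assumption, and skipping ages with no preceding season is one possible design choice, not necessarily the one used for the figures.

```python
from collections import defaultdict

# Toy (player_id, age, overall) records; the values are invented.
records = [
    ("a", 21, 70), ("a", 22, 74), ("a", 23, 75),
    ("b", 30, 82), ("b", 31, 81), ("b", 33, 78),  # age 32 is missing for "b"
]


def yoy_changes(records):
    """Return (player_id, age, change) for every pair of consecutive ages."""
    by_player = defaultdict(dict)
    for player_id, age, overall in records:
        by_player[player_id][age] = overall

    changes = []
    for player_id, ratings in by_player.items():
        for age in sorted(ratings):
            if age - 1 in ratings:  # only compare truly consecutive seasons
                changes.append((player_id, age, ratings[age] - ratings[age - 1]))
    return changes


changes = yoy_changes(records)
print(changes)
```

A positive change corresponds to a point above the red line, a negative one to a point below it.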

## Segmenting career paths of football players

In this section, we will extend the previous YoY analysis by addressing the multi-year curves of players’ ratings. With these sequences in hand, we will segment them into clusters. Such groupings can help us find useful patterns, and can even be used to forecast future development.

For this task, we’ve collected all players’ overall ratings from the ages of 21–27: a six-year span around the age of 24, a spot we found interesting. A sequence is considered valid only if it has no missing values, meaning players that are younger (today) than 24 are excluded, as well as players that were not part of the FIFA series at least once during this period. Overall, 357 players matched these filters.
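The validity filter described above can be sketched as follows: keep a player only if a rating exists for every age from 21 through 27. The data layout and player labels are hypothetical.

```python
AGES = range(21, 28)  # ages 21..27 inclusive, i.e. six year-over-year steps

# Hypothetical input: player -> {age: overall rating}.
ratings = {
    "complete": {20: 64, 21: 66, 22: 69, 23: 71, 24: 74, 25: 75, 26: 76, 27: 76},
    "too_young": {21: 70, 22: 73, 23: 75},
    "gap_year": {21: 60, 22: 61, 23: 63, 25: 64, 26: 64, 27: 63},
}


def valid_sequences(ratings, ages=AGES):
    """Keep only players with a rating at every age in the window."""
    sequences = {}
    for player, by_age in ratings.items():
        if all(age in by_age for age in ages):
            sequences[player] = [by_age[age] for age in ages]
    return sequences


sequences = valid_sequences(ratings)
print(sequences)
```

The resulting fixed-length sequences are what a clustering step would take as input; the text does not name the clustering algorithm, so none is assumed here.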

Cluster description:

| Cluster ix | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| #Players | 37 | 67 | 34 | 26 | 69 | 73 | 51 |
| %Players | 10.3% | 18.8% | 9.5% | 7.3% | 19.3% | 20.4% | 14.3% |

Interpretation of selected clusters

• Cluster 0 (10.3% of players) represents players who reached a high overall rating but lost some of their charm after the age of 24. Some members of this cluster are Eric Bailly, André Gomes, F. Bernardeschi, Gerard Deulofeu, A. Oxlade-Chamberlain and Franco Cervi.
• Cluster 1 (18.8% of players) has the median starting point: 65 rating points. It consists of players who peaked early, at the age of 23, and were not able to recover. Honestly, I am familiar with only very few of them, such as Jeison Murillo and Erik Godoy.
• Cluster 2 (9.5% of players) represents the individuals that made persistent progress. Players in this cluster started 3rd from the bottom at the age of 21 and consistently improved until the age of 26, to such an extent that they closed a > 10 overall rating gap with cluster 0 at this point. Among players in this group, we can find Jordan Pickford, Matteo Politano, Ante Rebić, and Odysseas Vlachodimos.
• Cluster 3 (7.3% of players) represents the big stars — it contains players such as Harry Kane, Paulo Dybala, Joao Cancelo, and Memphis Depay. This cluster does not present any sign of degradation in this period.
• Cluster 6 (14.3% of players) stands for a collection of players that peaked exactly at the age of 24, similar to cluster 0, but with an average overall rating of 10 points lower. Interestingly, members of this cluster outgrew those of cluster 1, despite having an inferior starting point.

## From age to career phases

Analyzing the curves over time allowed us more long-period observations of players’ careers. However, we, the fans, often feel more comfortable using ordinal phases to describe player states, such as growth, decline, or peak performance. For our needs, we will define five distinct career phases:

1. Accelerated growth: the time the player gains most of his progress in a short period of time. Defined as an increase ≥ 5 of overall rating YoY.
2. Mild growth: player performance still improving but at a subtler pace. Usually, this is a transition state, often before reaching the peak. Defined as an increase of 2 or 3 of overall rating YoY.
3. Stagnation: reaching a (or the) peak. Though it also goes the other way around, when reaching rock bottom. Defined as a growth of either 0 or 1 of overall rating YoY.
4. Mild decay: the same as mild growth, in the opposite direction and for opposing reasons. Defined as a decrease between 1–3 of overall rating YoY.
5. Accelerated decay: hopefully occurs nearly at the end of a player’s career. Defined as a decrease ≥ 4 of overall rating YoY.

Each player can visit any of these phases repeatedly, or remain static within a single one. For simplicity, the boundaries of these phases are determined by the YoY overall rating difference as mentioned above. Values themselves are a product of an additional analysis (Jupyter notebook coming soon).
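The phase boundaries listed above translate directly into a small classifier over the integer YoY change. One caveat: the list covers +2 and +3 for mild growth and ≥ +5 for accelerated growth, so a change of exactly +4 is not covered. Assigning it to mild growth below is an assumption of this sketch, not something the definitions state.

```python
def career_phase(yoy_change):
    """Map an integer year-over-year change in overall rating to a phase."""
    if yoy_change >= 5:
        return "accelerated growth"
    if yoy_change >= 2:
        # The definitions list +2 and +3 only; treating +4 as mild growth
        # is an assumption made here to close the gap.
        return "mild growth"
    if yoy_change >= 0:
        return "stagnation"
    if yoy_change >= -3:
        return "mild decay"
    return "accelerated decay"


for change in (6, 3, 1, -2, -5):
    print(change, "->", career_phase(change))
```

Applying this function to each player's sequence of YoY changes yields the sequence of phases he visits over his career.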

What can we learn about each phase in the context of growth, potential fulfillment and even value? Let’s start with validating that they behave as we expect:

Naturally, decay is associated with older ages. Surprisingly, however, Fig. 8 (left) shows a substantial number of younger players across all categories. The means of potential ratio and overall rating (right subfigure) are aligned with the cluster interpretations made above.

## Graph-based analysis of career phases

Career phases open the door for a new form of analysis — a graph-based one. The nodes of this graph can be the career phases, where links are the transitions of the players as observed in the dataset.
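As a rough sketch of such a graph, the edges can be stored as counts of observed phase-to-phase transitions, and then normalized per source node into transition frequencies. The per-player phase sequences below are hypothetical, and the plain dictionary representation is just one simple option.

```python
from collections import Counter

# Hypothetical per-player phase sequences, ordered by season.
phase_paths = {
    "p1": ["accelerated growth", "mild growth", "stagnation"],
    "p2": ["mild growth", "stagnation", "mild decay"],
    "p3": ["mild growth", "stagnation", "stagnation"],
}


def transition_counts(paths):
    """Count how often each (source phase, target phase) edge is observed."""
    edges = Counter()
    for path in paths.values():
        for source, target in zip(path, path[1:]):
            edges[(source, target)] += 1
    return edges


def transition_frequencies(edges):
    """Normalize edge counts by the total outgoing count of each source."""
    totals = Counter()
    for (source, _), count in edges.items():
        totals[source] += count
    return {edge: count / totals[edge[0]] for edge, count in edges.items()}


edges = transition_counts(phase_paths)
frequencies = transition_frequencies(edges)
print(edges)
print(frequencies)
```

The same code works unchanged if each node is a tuple such as `(age, phase)` instead of a bare phase name.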

We can enrich our node set by defining nodes as combinations of age and phase. For example, here is the career phases graph for attackers aged 18–21, after adding the age dimension:

We will use the above representations multiple times across this series. In this section, we will explore how to leverage this representation into insights. I started by digging deep, manually inspecting different career paths, diving in for some real examples, and only after that trying to generalize. One of those examples is the fascinating still-ongoing journey of Dele Alli and Ollie Watkins.

## The Tale of D. Alli and O. Watkins

On their 20th birthdays, Alli and Watkins were about 20 rating points apart, in favor of Alli, one of the most promising English football players that year. This gap remained high as Alli continued to shine at Tottenham Hotspur. Nevertheless, by the age of 22, Alli reached stagnation, i.e., the peak. Over time, Watkins’ consistent improvement let him catch up, while Alli struggled to meet the challenge of his own high bar, resulting in a dropping potential. Today, Alli still leads overall. However, judging on potential, Watkins is now rated higher than Alli for the first time ever. Any gambles on next year’s ratings?

Judging by transfermarkt.com, this trend had already reversed by June 8, 2021, when Alli’s price completed a jaw-dropping 70% decrease in 3.5 years (from €100m to €30m), at the ages of 22–25. It’s the comeback of the notorious crossroads. It was also one year after Watkins put on his new Villa shirt, raising his value from €12m to €30m in a single year.

What we can learn from this very specific example, beyond that not all careers behave the same, is that these patterns of behavior can be crucial. If we can understand them, maybe we will be able to predict them as well.

🏁 3 minutes of extra time⏳

## The first step in understanding players’ potential

In the last part of this post, we will deal with players’ potential. For our needs, potential is defined as the best score a player would ever achieve during his career, that is, his max score. Conveniently, the FIFA game provides exactly such an attribute. Alternatively, one can normalize player rating by the max overall rating over his career.
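Both definitions can be expressed as simple ratios. The sketch below assumes `overall` and `potential` are the two rating fields, and the function names are made up for illustration.

```python
def potential_ratio(overall, potential):
    """Share of the game-provided potential that the player has reached."""
    return overall / potential


def career_max_ratio(overall_by_age):
    """Alternative: normalize each rating by the player's career-best rating."""
    best = max(overall_by_age.values())
    return {age: rating / best for age, rating in overall_by_age.items()}


print(potential_ratio(72, 90))
print(career_max_ratio({21: 70, 24: 80, 27: 76}))
```

Note that the second variant can only be computed in hindsight, since the career-best rating is unknown while the career is still in progress.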

Understanding the level of player potential is a hard task even for a jury of world-class experts. Thus, it is reasonable to believe these potential scores are neither accurate nor unbiased. Yet, using patterns and additional factors, we may be able to target players before they explode (both in rating and price), or go the other way around, looking for cheap after-peak players hoping they will keep in shape.

From the left subfigure, we can see that the potential ratio presents roughly the same behavior across positions: gradual, consistent growth until the player’s early thirties. Nevertheless, some distinctions exist, such as the peak of the average value for CMs, which comes latest, at the age of 37 (although the average at age 33 is very close). Considering this resemblance, the center figure describes the distribution of the overall-potential ratio at each age, for all players combined.

The right subfigure adds another dimension to the analysis, allowing us to get a first impression of players’ value. It demonstrates the relationship between the overall rating, the overall-potential ratio, and normalized player value, indicating how much we pay (on average) per overall rating point. As shown in the figure, market value moves faster than rating, but it is the rating that sets the direction.

That was a long ride, so we will have to stop here. In the next chapter, we will focus on players’ prices and transfers, so potential will be back soon. We’ll also have an entire post on predicting potential further down the road.

# Summary

In the first post of this EA FIFA mini-series, we took a bird’s eye view of the development of football players as they aged. Merely scratching the surface, we visualized how attributes like speed and stamina wear out with time, while also observing the growth of a compensating set of on-the-ball skills as players continue to master their sport over the years, as well as improved tactical and mental abilities.

In terms of career development, we observed that ages 23–24 may be a crucial turning point for many players, such as Dele Alli and Ollie Watkins. However, we also found that these dynamics change across different positional roles.

Setting the foundations for the following chapters, we explored how graphs can benefit us in studying footballers’ careers, defined and (briefly) examined potential, and segmented players’ career curves. A Jupyter notebook of code will be available in the future, including many additional figures that didn’t make it through editing. Stay tuned!

See you in chapter two :)

Final whistle ⌛️