Demystifying the FIFA Algorithm

Dec 15, 2021 · 18 min read

By Vishaal Kumar, Vikram Bala, and Jeffrey Xiao

Football is a religion. There is a reason it is the most popular sport in the world. But due to its large fanbase, there is a lot of debate amongst members of the football community regarding various aspects of the game — Which domestic league is the best? Is the wage balance between players in different domestic leagues fair? Does the reputation of a team boost the value of a player more than his/her skills and performances?

These are all interesting questions and are often brought up when fans talk about the sport. Apart from the physical game itself, the infamous video game series, FIFA, has grown to be a very influential entity amongst members of the football community.

FIFA is billed as the best-selling sports video game franchise in the world. Football fans and players alike are obsessed with the game. Every year, fans are eager to know what the ratings of their favorite players are for the next iteration of FIFA. Electronic Arts (EA), the creator of FIFA, is often questioned for its rating decisions, and there is a lot of debate in the football community about the validity of these player ratings.

We thought it would be interesting to analyze the player ratings and team statistics in order to see some trends in the way EA rates and values its football players. We wanted to answer some of football’s most debated questions:

- Does your transfer value depend on the league you play in?
- Which domestic league is the most competitive (player rating wise)?
- What aspects of the game does EA value the most?

Apart from exploring the interesting question of EA’s FIFA rating/value process, we also thought it would be interesting to explore how well FIFA expresses the norms and beliefs of the football community. Does FIFA conform to the most popular expectations or is it fair when evaluating players in the game? We attempt to answer these questions by using various Machine Learning techniques and we want to take you along our journey of demystifying the FIFA algorithm.

Step 1 — Find a Data Source

We will use a FIFA player dataset which contains statistics regarding football players from FIFA 17 to FIFA 22. We chose this dataset because it has a wide variety of features, including player attributes as well as market value and release clauses. This dataset contains data that was scraped from Sofifa (a database containing up-to-date information about player statistics in most iterations of FIFA).

We also wanted to see the relationships between football leagues (especially Europe’s Big Five — yes we realize that the topic of Europe’s Big Five is another topic of debate but we do not really touch on that topic) on other features. Unfortunately, the FIFA player dataset above does not map football clubs to football leagues directly. As a result, we scraped the mappings from football clubs to football leagues from their respective Wikipedia pages.

Step 2 — Data Parsing

We chose Wikipedia as a source for several reasons. Firstly, Wikipedia is the largest encyclopedia in the world — if there’s any place that contains the information we’re looking for, it’ll be here. Secondly, Wikipedia pages are very structured, as they are divided into numerous sections, headers, and tables. Two similar pages (eg. two football club pages) will have similar layouts, making it easier to programmatically parse and scrape information.

The basic algorithm is as follows:

- Go through all players in every FIFA dataset
- For each player, find the Wikipedia page for their football club
- Use XPath to scrape the “League” term from the information box

Our first attempt at scraping football leagues was to use a single-threaded approach. This was the simplest and shortest approach, but also the most unsuccessful one. For one thing, it was incredibly slow. With about 17000 players for each of the 6 datasets, this approach took over 10 minutes for what is logically a simple task. Given the nature of this project (and Wikipedia scraping policy), we began optimizing our algorithm further.

The second attempt was a significant improvement in terms of speed. The bottleneck of the prior approach was that we were only performing one search at a time. Using an open-source data manipulation library called Pandas, we were able to optimize our approach and run multiple searches at once. This approach cut the practical running time down to about 1 minute. However, there’s always a tradeoff between speed and accuracy. Although this approach was quick, we were left with over 200 teams that did not have a corresponding league. The reason: Wikipedia’s disambiguation pages. Terms such as “Everton” can refer to things other than the football club. Since none of us wanted to manually find the league for 200 or so teams, we needed to find a different approach. Back to the drawing board we went.
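As a rough illustration of what "running multiple searches at once" can look like, here is a minimal sketch. It swaps in a thread pool from Python's standard library rather than reproducing the Pandas-based code used for this article, and `lookup_league` with its `FAKE_WIKI` table is a hypothetical stand-in for the real Wikipedia request.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a network call that returns a club's league.
FAKE_WIKI = {"Everton": "Premier League", "Juventus": "Serie A"}


def lookup_league(club):
    return FAKE_WIKI.get(club)  # None when no league is found


def lookup_all(clubs, lookup, max_workers=8):
    """Look up each distinct club once, running several lookups concurrently."""
    unique = sorted(set(clubs))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        leagues = list(pool.map(lookup, unique))
    return dict(zip(unique, leagues))


club_to_league = lookup_all(
    ["Everton", "Juventus", "Everton", "Unknown FC"], lookup_league
)
```

Deduplicating the club names first is likely to matter as much as the concurrency, since many players share the same club.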

Let’s take a step back and revisit the prior approaches. The primary issue with the first approach was speed; to solve that, we took advantage of Pandas’ built-in optimizations. The issue with the second approach was accuracy: we needed a way to account for disambiguation pages. Luckily, using Wikipedia’s Python module, we can fetch the most relevant pages given a search query. So, say we get the top 5 most relevant pages for a team. Assuming that these teams are relatively popular, if we were to search through all 5 of these pages (or until we find a match), this should account for the missing clubs above, and it did just that! This updated approach left us with only 36 unmatched teams, whilst running for approximately 2–3 minutes.

- Go through all players in every FIFA dataset
- For each player, find the top 5 most relevant pages by their team name
- If any of the top pages matches onto the actual club page, use XPath to scrape the “League” term from the information box
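The steps above can be sketched as follows. This is a minimal illustration, not the scraper itself: `search` and `scrape_league` are hypothetical callables standing in for the Wikipedia search and the XPath lookup, and the small dictionaries are made-up test data.

```python
def find_league(club, search, scrape_league, max_candidates=5):
    """Try the top search results in order and return the first league found."""
    for title in search(club)[:max_candidates]:
        league = scrape_league(title)
        if league is not None:
            return league
    return None  # counted as a "no match"


# Hypothetical stand-ins for the Wikipedia search and the infobox scrape.
SEARCH_RESULTS = {"Everton": ["Everton", "Everton, Liverpool", "Everton F.C."]}
INFOBOX_LEAGUE = {"Everton F.C.": "Premier League"}


def fake_search(club):
    return SEARCH_RESULTS.get(club, [])


def fake_scrape_league(title):
    return INFOBOX_LEAGUE.get(title)


everton_league = find_league("Everton", fake_search, fake_scrape_league)
missing_league = find_league("Nonexistent FC", fake_search, fake_scrape_league)
```

Stopping at the first page that yields a league keeps the number of requests low for clubs whose top result is already the right page.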

Step 3 — Cleaning, Structuring, and Integration

In order to run an ML algorithm on our dataset, we need to clean and structure our data. To do this we performed the following:

- Removing HTML tags in certain fields
- Removing currency symbols from fields such as wages and value
- Converting strings to doubles (e.g. 1K to 1000 and 1M to 1000000)
- Converting units to conform to a standard unit of measurement (height in centimeters and weight in kilograms)
- Dropping null values
- Dropping columns that were deemed unnecessary for our analysis (e.g. Flag, Joined, Loaned From, etc.)
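A few of these cleaning steps can be sketched with small helper functions. The helper names are hypothetical, and the input formats (values like "€110.5M", heights like 5'7, weights like "159lbs") are assumptions about how the raw fields look, not something confirmed by the dataset description above.

```python
import re

_MULTIPLIERS = {"K": 1_000, "M": 1_000_000}


def strip_tags(text):
    """Remove anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


def parse_money(text):
    """Convert strings such as '€110.5M' or '€565K' to a float."""
    cleaned = re.sub(r"[^\d.KM]", "", text.upper())  # drop currency symbols
    if not cleaned:
        return None
    suffix = cleaned[-1]
    if suffix in _MULTIPLIERS:
        return float(cleaned[:-1]) * _MULTIPLIERS[suffix]
    return float(cleaned)


def height_to_cm(text):
    """Convert a feet'inches string (assumed format) to centimeters."""
    feet, inches = text.strip().rstrip('"').split("'")
    return round((int(feet) * 12 + int(inches)) * 2.54, 1)


def weight_to_kg(text):
    """Convert a pounds string such as '159lbs' (assumed format) to kilograms."""
    return round(float(text.lower().replace("lbs", "")) * 0.45359237, 1)
```

For example, `parse_money("€565K")` gives `565000.0` and `height_to_cm("5'7")` gives `170.2`.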

Step 4 — Exploratory Data Analysis (EDA)

Now let us explore our dataset!

FIFA is a complicated game, with billions of dollars poured into it every year simply to create and tweak a variety of player attributes. In our EDA, we chose a small subset of player attributes — the ones that we found most relevant and interesting.

attributes = ['Crossing', 'Finishing', 'Dribbling', 'BallControl', 'Acceleration',
              'SprintSpeed', 'Agility', 'Balance', 'ShotPower', 'Stamina', 'Strength',
              'Vision']

After selecting the attributes we wanted to focus on, we started to create some plots to see trends in player data. We only selected a few plots to display in this article. Our full EDA can be found here.

1. Age vs Overall

The first interesting plot we created was Age vs Overall. In the past decade, football has seen a large influx of young talent and we thought it would be interesting to see if FIFA values experience when setting an overall rating for a player.

The heatmaps above tell us that younger players between the ages of 20 and 25 generally have higher ratings, while the opposite seems to happen for players above the age of 30, which is consistent with the influx of young talent.

Additionally, it’s interesting to see where most players fall. In the results below, we see that the most frequent age in each year lies between 22 and 26.

Year 2017, Age = 26, Overall = 55
Year 2018, Age = 24, Overall = 52 
Year 2019, Age = 22, Overall = 52 
Year 2020, Age = 26, Overall = 49 
Year 2021, Age = 26, Overall = 49 
Year 2022, Age = 22, Overall = 46

2. Overall Rating vs League

Bar chart of Average Overall Rating vs. League by League

Bar chart of Average Overall Rating vs. League by Year

For years, pundits have argued as to which league has the highest quality of players. Some argue that the Premier League is the most competitive league in the world, but it is safe to say that the Spanish, Germans, Italians, and the French would disagree. To test this claim, we plotted overall rating vs league. Interestingly, on average for the last 5 iterations of FIFA, the Premier League is on the lower end of player averages in comparison to the other Big 5 leagues. However, focusing on this year’s statistics, the league with the highest average overall rating is La Liga, followed by the Premier League, Serie A, Bundesliga, and Ligue 1.

3. Height/Weight vs Player Attributes

Zlatan Ibrahimovic, Lionel Messi, Cristiano Ronaldo, and Adama Traore are all popular names amongst fans and players, but what differentiates them is not only their skill but also their heights and weights. We thought it would be interesting to see how a player’s height and weight relate to certain attributes. After creating a large variety of plots comparing a player’s height and weight with several attributes, we found the following trends:

- Height and Strength are positively correlated.
- Height and Agility are negatively correlated.
- Height and Dribbling are negatively correlated. Does a lower center of mass make it easier to move with the ball? This is very interesting!
- Weight and Strength are positively correlated. This makes sense, since a higher weight could correspond to a higher muscle mass, allowing the players to be stronger.
- Weight and Agility are negatively correlated.
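To make "positively" and "negatively correlated" concrete, here is a small sketch that computes the Pearson correlation coefficient by hand. The five data points are made up purely for illustration and are not taken from the FIFA dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up example values: height in cm, with strength and agility ratings.
heights = [165, 172, 180, 188, 195]
strength = [55, 62, 70, 78, 85]
agility = [88, 80, 72, 64, 55]

r_strength = pearson(heights, strength)  # close to +1: rises with height
r_agility = pearson(heights, agility)    # close to -1: falls with height
```

A value near +1 means the two quantities rise together, a value near -1 means one falls as the other rises, and a value near 0 means no linear relationship.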

4. Wage vs League

Interestingly, when we compared the average player overalls in the Big 5, the averages were very close and only differed by a few decimals. However, there is a large difference between the average player wages in the Premier League and in other domestic leagues. This alludes to the seemingly high reputation the Premier League has and the fact that the Premier League is known to have much richer clubs in comparison to other top-flight domestic leagues. Does FIFA harshly rate Premier League players or are Premier League players overpaid? This is again an interesting question that a lot of football pundits debate.

5. Wage vs Position

This one is very interesting and speaks to one of the most controversial topics in football: are strikers the most valuable members of a team? Are goals more important than tackles, assists, and clean sheets? From the graphs above we can see that CFs (centre-forwards) are paid the most by a large margin. This suggests that players who provide teams with goals are considered much more valuable and are thus paid more!

6. Overall vs Value

Scatterplot of Overall Rating vs Transfer Value in FIFA17

Scatterplot of Overall Rating vs Transfer Value in FIFA22

Now onto a less hotly-debated topic: overall rating vs value. It’s expected that the higher rated a player is, the more likely they are to be valued highly. The graphs above for FIFA17 and FIFA22 confirm this, but also show that the relationship between overall and value looks exponential. Put simply, a given gap in rating at the top end (e.g. 80–90) corresponds to a much larger gap in value than the same gap lower down (e.g. 60–70). Perhaps this is due to the fact that it’s rarer to find “better” players than mediocre ones, causing their value to skyrocket.
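One way to check for this kind of relationship is to fit a straight line to the logarithm of the value: if value ≈ a · exp(b · overall), then log(value) is linear in overall. The sketch below uses synthetic values generated from an exact exponential, so the numbers are illustrative assumptions rather than figures from the dataset.

```python
import math


def fit_exponential(overalls, values):
    """Fit value ~ a * exp(b * overall) by least squares on log(value)."""
    logs = [math.log(v) for v in values]
    n = len(overalls)
    mean_x = sum(overalls) / n
    mean_y = sum(logs) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(overalls, logs)) / sum(
        (x - mean_x) ** 2 for x in overalls
    )
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic example: value grows by a factor of exp(0.2) per rating point.
overalls = [60, 70, 80, 90]
values = [1000 * math.exp(0.2 * (o - 60)) for o in overalls]

a, b = fit_exponential(overalls, values)
gap_low = values[1] - values[0]   # value gap between ratings 60 and 70
gap_high = values[3] - values[2]  # value gap between ratings 80 and 90
```

Under an exponential model every 10-point step multiplies the value by the same factor, which is why the absolute gap between 80 and 90 dwarfs the gap between 60 and 70.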

7. Wage vs Value

Scatterplot of Wage vs Transfer Value in FIFA17

Scatterplot of Wage vs Transfer Value in FIFA22

As we suspected, players with a higher salary/wage typically also have a higher transfer value. FIFA22 seems to have more of a scatter in terms of wage and value, but this is to be expected due to the football world’s big names. Not much of a surprise here!

8. Transfer Value vs League

Bar graphs of League vs Transfer Value by League

Bar graphs of League vs Transfer Value by Year

To finish off our journey visualizing transfer value, we looked into the relationship between leagues and transfer value. After normalizing the data, we see that among the Big 5, Ligue 1 is consistently the lowest, whereas the Premier League sits comfortably at the top. Additionally, we noticed a decline in normalized transfer values in recent years in La Liga and Serie A. This may correspond to a large influx of talented players to the Premier League as it grows in popularity; for instance, Cristiano Ronaldo and Raphael Varane moved to Manchester United from Juventus and Real Madrid, respectively.

Modeling: Identifying Fraud in International Football Trades?

As we began our EDA, we stumbled upon a NY Times article in the corner of the sports section. Its title: What’s a Soccer Player Worth? Italy’s Regulators Are Asking. The article explained the accounting maneuvers that some of the biggest football teams, like FC Barcelona and Juventus, had been making in order to register profits on their balance sheets. Highly overvalued players were being used to help teams fix their balance sheet, and regulators were starting to take notice. The value of an athlete, even in the NFL, NHL, and NBA has always been contested and controversial. An aging veteran might get a large contract just because of their reputation, while fans might react in outrage when their favorite player gets a highly undervalued contract. However, surely there must be some objective measurement to the range of a player’s value. This is what we set out to test, to see if there was actually a means to predict a player’s transfer value based on their different skills and other attributes.

Thus, after making graphs for seemingly every possible combination of features, as well as analyzing various trends, we settled on predicting transfer value as being our goal for the modeling portion of our analysis. In this section, we will discuss data preparation for modeling, as well as 4 different models we made in attempting to predict a player’s transfer value.

*Note: A player’s transfer value is different from the “wage” feature seen in our data. Wage is the weekly salary of a player, whereas transfer value is how much money that player can be traded for.*

How did EDA Inform our Modeling

For the amount of EDA we performed, surely there must be some takeaways that could help us in our modeling. Our EDA first and foremost demonstrated that there is indeed a correlation between many skill attributes and a player’s overall rating. Additionally, since we showed that there is a correlation between overall rating and transfer value, this suggests that there is some correlation between skill attributes and transfer value. Thus, we are not modeling on random data, but data that visually appears to have some relationship to the problem we are trying to solve. Let’s get onto preparing our data to solve that problem now!

Data Preparation

To prepare our data for modeling we had to first make sure we had the features that we wanted. We prepared a list of columns to drop that weren’t really correlated with player transfer value. These included columns like “Jersey Number,” “Name,” “Contract Valid Until.” Additionally, since International Reputation is very subjective to which International team you play for, we decided to drop that column as well.

After dropping these columns, we were tasked with converting some categorical features into numeric form in our table. Some of these features included “Body Type,” “Club,” and “Position.” In addition, we bucketed our transfer value label into 10 buckets, so that we could train not only continuous prediction models like linear regression, but also classification-based ones, like the Random Forest classifier and a Logistic Regression classifier. Lastly, we split our data into an 80% training set and a 20% test set (there was a total of 97133 rows in our data before splitting).

drop_columns = ['ID', 'Name', 'Preferred Foot', 'Jersey Number', 'Contract Valid Until', 'Work Rate', 'Best Position','International Reputation']
cat_columns = ['Nationality', 'League', 'Position', 'Body Type', 'Club']

drop_columns were dropped, while cat_columns were encoded using a LabelEncoder() from sklearn, which maps each category to an integer code rather than a one-hot vector.
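The bucketing and splitting steps can be sketched in plain Python. This is a minimal illustration assuming equal-width buckets between the smallest and largest transfer value; the function names and the example values are hypothetical.

```python
import random


def equal_width_bucket(value, low, high, n_buckets=10):
    """Map a value in [low, high] to a bucket index 0..n_buckets-1 of equal width."""
    if high == low:
        return 0
    index = int((value - low) / (high - low) * n_buckets)
    return min(index, n_buckets - 1)  # the maximum value belongs to the last bucket


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


values = [0, 4_000_000, 55_000_000, 100_000_000]  # made-up transfer values
labels = [equal_width_bucket(v, min(values), max(values)) for v in values]
train_rows, test_rows = split_rows(range(100))
```

With equal-width buckets, a skewed value distribution can concentrate most rows in the lowest bucket, which is worth keeping in mind when reading classification accuracies.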

PCA

Before we jump into the actual models themselves, it’s important to note our attention to multicollinearity. We did perform PCA on our 42 features, and found that the explained variance was at least 99% at 33 components (note that at a 99.5% threshold there was barely any dimension reduction, so we lowered our standard to 99%). We performed PCA on our data after standardizing it so that each feature has zero mean and unit variance. While we trained the Linear Regression model on both the PCA data and the non-PCA dataset, and also used PCA data for the Logistic Regression model, we did not use it for the Random Forest. This is because PCA ended up not improving the performance of either the Random Forest or the Linear Regression.

Here’s how we did PCA and scaling, with the help of some tools from sklearn. We’ve attached an image of our explained variance graph, where the x-axis is the number of components and the y-axis is the cumulative explained variance.
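The original code is not reproduced in this text. A minimal sketch of the same idea, assuming sklearn's `StandardScaler` and `PCA` and using a synthetic matrix in place of the 42 FIFA features, might look like this (`components_for_variance` is a hypothetical helper name):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def components_for_variance(X, threshold=0.99):
    """Smallest number of components whose cumulative explained variance
    reaches `threshold`, plus the full cumulative curve."""
    X_scaled = StandardScaler().fit_transform(X)  # zero mean, unit variance
    pca = PCA().fit(X_scaled)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    n_components = int(np.searchsorted(cumulative, threshold) + 1)
    return n_components, cumulative


# Synthetic stand-in: 6 columns, of which 3 are linear mixes of the other 3.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 3))
X_demo = np.hstack([base, base @ rng.normal(size=(3, 3))])

n_components, cumulative = components_for_variance(X_demo)
```

Plotting `cumulative` against the component index gives the kind of explained variance curve described above.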

Naive Bayes

We first ran a Naive Bayes classifier on our data, specifically to see how well it could predict what range (among 10 possible equal range buckets) a player’s transfer value could be in. It had an accuracy of 46.35%, which is about 4.5 times better than randomly guessing. Thus, there’s clearly some work the model has done, but it’s not great. No worries though, we have many more tunable and exciting models to work with!

Linear Regression

Let’s get into our best model, the Linear Regression model! This model fits a linear function of the features and gives us continuous output values, rather than classifications. Thus, we did not use the value buckets created as our labels, but rather used the transfer value originally from the dataset as our labels here. We ran our model both with PCA-transformed data and without it, to see which would yield better results. Below are our results.

Note that for each test set, we chronologically built up the training sets used. Thus, FIFA19 for example, was only predicted using data from previous years. This allowed us to see if the FIFA algorithm that generates player stats and other metrics changed every year, possibly making it harder for us to create accurate predictions.
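The chronological scheme described above can be sketched as a small generator. The function name is hypothetical, and the years are simply the FIFA editions in the dataset.

```python
def chronological_splits(years):
    """Yield (training_years, test_year) pairs where every test year is
    predicted only from the years that came before it."""
    ordered = sorted(years)
    for i in range(1, len(ordered)):
        yield ordered[:i], ordered[i]


splits = list(chronological_splits([17, 18, 19, 20, 21, 22]))
# First pair: train on FIFA 17, test on FIFA 18.
# Last pair: train on FIFA 17 through 21, test on FIFA 22.
```

FIFA 17 never appears as a test set because there is no earlier year to train on, which matches the five test years reported in the results.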

Results for non-PCA data (see above for which train and test sets were used):

Test Data Accuracy without PCA for FIFA 18: 0.9065569580480287
Test Data Accuracy without PCA for FIFA 19: 0.9118843390386203
Test Data Accuracy without PCA for FIFA 20: 0.923835636293558
Test Data Accuracy without PCA for FIFA 21: 0.9082690376474717
Test Data Accuracy without PCA for FIFA 22: 0.8439685316620665

Results for PCA data (This is the result after training and testing with all years 2017–2022 of data):

Train Data Accuracy On ALL Years with PCA: 0.8766950420015381
Test Data Accuracy On ALL Years with PCA: 0.7951320486481293

We observed first and foremost that our model trained on PCA data performed considerably worse than the one trained without PCA. In addition, FIFA22 had a much worse predictive accuracy than any of our other years, a trend we noticed in many of our models that will be explored below as well. We next wanted to observe which features were most influential in our model. For each year we plotted the top 10 most important factors in the model, based on their coefficient in the linear regression. They were all generally the same, so here’s one from FIFA 2021.

Random Forest

Now, we’ll dive right into the famous Random Forest method! Often considered one of the more thorough classifiers, random forests work by building multiple smaller estimators (decision trees), training them up to a certain depth, and then using the majority vote of all the trees to create one final classifier (for more information, look into ensemble methods). Typically, this model produces more accurate results (although, remembering our speed vs. accuracy tradeoff from earlier, this means that random forests are more computationally expensive).

To get the best prediction, we want to use the best possible hyperparameters for our random forest model. Accomplishing this in Python is quite easy with the help of sklearn’s GridSearchCV class (ignoring the grueling hour long wait for the grid search to finish):
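The grid search itself is not reproduced in this text. A minimal sketch of the pattern, assuming a small hypothetical parameter grid and synthetic data from `make_classification` in place of the FIFA features:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the FIFA features and bucketed transfer values.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical grid; the values actually searched are not given in the article.
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}

search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)  # tries every combination with 3-fold CV

best_model = search.best_estimator_  # refit on the full training set
test_accuracy = best_model.score(X_test, y_test)
```

`search.best_params_` then holds the winning combination. The cost grows with the product of the grid sizes and the number of folds, which is where the long wait on a real dataset comes from.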

The rest was relatively straightforward as well: we passed these best parameters into the RandomForestClassifier, fit the training data, and then scored the test data. Ultimately, our random forest model performed quite well, scoring an accuracy of about 76% with the best hyperparameters found.

Accuracy for Random Forest Model: 0.7576053945539712

Logistic Regression

Onto the final model! Here, we chose to run several logistic regressions to again model and predict the relationship between player attributes and transfer value. Logistic regressions are typically used for categorical data when we want to assign an entry to a specific bucket. Although transfer value is a continuous variable, we can approximate a continuous variable by creating multiple buckets — computing the predicted transfer value and then assigning it to its corresponding bucket range.

For the purposes of this model, we used 10 buckets. Similar to the linear regression models, we first wanted to treat each FIFA dataset as a test set, using all prior years for training.

Before running the logistic regression models, since many attributes are correlated with each other, we needed a way to ensure that multicollinearity would not negatively affect the predictions. As a result, we built a sequential pipeline consisting of a StandardScaler to normalize/scale the data, PCA to decorrelate the features, and a logistic regression with an L2 (ridge) regularization penalty.

pipe = Pipeline(steps=[('Scale',StandardScaler()),('PCA',PCA()),('LogReg',LogisticRegression(max_iter=100, penalty='l2'))])
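For readers who want to run the pipeline above, a self-contained sketch might look like the following. The imports are the standard sklearn ones; the data from `make_classification` is a made-up stand-in for the FIFA features, so the resulting accuracy says nothing about the numbers reported in this article.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic multi-class data standing in for the bucketed transfer values.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

pipe = Pipeline(steps=[
    ("Scale", StandardScaler()),   # standardize each feature
    ("PCA", PCA()),                # rotate onto uncorrelated components
    ("LogReg", LogisticRegression(max_iter=100, penalty="l2")),
])
pipe.fit(X_train, y_train)         # each step is fit on the training data only
accuracy = pipe.score(X_test, y_test)
```

Wrapping the scaler and PCA in the pipeline means they are fit only on the training split, which avoids leaking test-set statistics into the model.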

In the end, the predictions hovered around an accuracy of 65%, with the exception of FIFA22, which had an accuracy of about 50%. Now, this is actually quite good! Remember that we’re not working with a binary classifier — instead, we have 10 buckets. This means that if we were to randomly guess a label, we would expect an accuracy of about 10%, much lower than our model.

Accuracy for FIFA 18 (based on previous years):  0.6290490513651087
Accuracy for FIFA 19 (based on previous years):  0.6357539315448658
Accuracy for FIFA 20 (based on previous years):  0.6818243426402819
Accuracy for FIFA 21 (based on previous years):  0.6825070639609555
Accuracy for FIFA 22 (based on previous years):  0.49523062851169475

Lastly, we aggregated the data from all FIFA years into one large dataset and ran a logistic regression over it. Instead of breaking it down by year, we wanted to see simply whether we can accurately predict transfer value, ignoring which year our data came from. Having the same pipeline set up, the regression model had an accuracy of about 66% — quite similar to most of the prior accuracies. In fact, it seems as if FIFA 22’s lower accuracy didn’t have as large of an effect as imagined.

Accuracy for Logistic Regression (based on all years): 0.6616564575075925

All Models

Accuracies of all performed models

Let’s take a step back and see what we have here. One thing is that in both instances where we used prior FIFA data to predict the next year, FIFA22 suffered the lowest predictive accuracy. We suspect that this may be due to the several complaints EA receives regarding the lack of balance in gameplay leading to a very different rating structure (maybe EA does listen).

Takeaways

And there we have it! We set out to answer a couple of questions, and after both visualizing the data and modeling it, we’ve discovered a couple of fascinating things. For example, transfer value is correlated with the league a player plays in. Given a random player in the Premier League vs a random player in Ligue 1, the transfer value of the Premier League player will typically be higher than that of the Ligue 1 player.

Overall, we’ve found some interesting information regarding how EA approaches the entire rating problem for its games. Have we completely exposed how EA assigns ratings to players? Not yet — but we’re at a starting point. EA clearly dumps millions of dollars into their rating algorithms, so we’ve barely scratched the surface of what they do. Still, it’s cool to see that there are some factors that weigh more into the algorithm than others. Until next time!

Link to full EDA is here.