Moneyball — How Grassroots Football Clubs Compete with Europe’s Elite

Daniel Adams
INST414: Data Science Techniques
Apr 29, 2024

Intro

Football, or soccer as it is called in America, is one of the most popular sports in the world. In Europe, the football season runs from early August to May. Football is also a key part of European culture, much as American football draws more viewers and revenue in the United States than other domestic leagues such as the NBA and NHL. European football generates enormous revenue both domestically and globally. While the sport is extremely popular and lucrative, smaller clubs have fallen far behind globally renowned clubs financially. These smaller clubs therefore need innovative solutions to keep up with Europe’s elite.

Question

A key factor in footballing success is the players that make up the team. In football, players are acquired through a club’s academy system or the transfer market. The global football transfer market is vastly different from the trading and drafting systems used in the NFL, NBA, or NHL. Instead of trading players for draft picks or other players, clubs pay transfer fees. These fees depend on a handful of factors, including the player’s age, ability, potential, and remaining contract length. The elite European clubs have no problem signing the players they want, as they have hundreds of millions at their disposal. For example, the elite club Real Madrid signed Jude Bellingham in the summer of 2023 for $110,000,000. Bellingham had the key attributes to command a fee of this magnitude: he was signed as a 19-year-old with three years of professional experience at some of the highest levels in Europe. While elite clubs can spend hundreds of millions of dollars on players, the majority of clubs cannot. This reality raises the question: how do less wealthy clubs keep up with these elite teams?

Stakeholders and Approach

As mentioned, the majority of clubs in Europe’s first divisions cannot spend as much on players as the few elite clubs do. The stakeholders for this problem are therefore managers of clubs that face stricter financial constraints but still have high ambitions: doing well in their domestic league and qualifying for continental tournaments. This analysis aims to find higher-quality players who do not command a high transfer fee. Targeting such players lets teams with smaller transfer budgets minimize their spending while maximizing player quality.

Data

The data for this analysis was sourced from kaggle.com. The data set, titled Football Player Database Top 5 Leagues, contains key data points for 2,613 players from the English Premier League (EPL), Spanish La Liga, German Bundesliga, Italian Serie A, and French Ligue 1 during the 2022/2023 season. While the data set includes player details such as height and place of birth, my analysis used only the key data points listed below:

  • Name: To identify the player being analyzed
  • Age: To identify the impact of the player’s age on their transfer fee
  • Current transfer fee (In Euros): To analyze the price the player currently sells for
  • Peak transfer fee (In Euros): To analyze the relationship between the player’s peak fee and their current fee

Data Cleaning

I began the data cleaning process by removing extraneous information from the csv file. The removed columns were full name, place of birth, height, player agent, brand sponsor, jersey sponsor, and the date of joining the club. Once the dataframe contained only the information I wanted, I dropped any players with null values. This left a csv file containing the pertinent information listed above with usable values.

The data set contains 2,613 different players, so many of them will be either too low in quality or too expensive for the stakeholder. To account for this, I sorted the dataframe in descending order of current transfer fee and used an English club, Aston Villa, as the benchmark for my filtering process. I chose this club because it is a good example of the stakeholder I intend to help: it has tight financial constraints but high ambition. At the time of writing, Aston Villa is fourth in the English Premier League and a semi-finalist in the Europa Conference League, a qualification-based tournament for some of Europe’s best clubs. The average transfer fee that Aston Villa paid from 2022 to 2024 is 53.35 million Euros. Therefore, I set the upper bound on the transfer fee of the players kept at 55 million Euros.
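The cleaning steps above can be sketched in pandas roughly as follows. This is an illustrative sketch, not the code from the repository linked at the end: the sample rows, the column names (`name`, `age`, `current_fee_eur`, `peak_fee_eur`), and the helper `clean_players` are all assumptions, and the 55 million Euro cap is treated as inclusive.

```python
import pandas as pd

# Hypothetical sample rows. The column names are assumptions,
# not the actual headers of the Kaggle data set.
raw = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C", "Player D"],
    "age": [22, 27, 19, 24],
    "height_cm": [180, 185, 175, 190],
    "player_agent": ["Agency X", None, "Agency Y", "Agency Z"],
    "current_fee_eur": [40_000_000, 90_000_000, None, 12_000_000],
    "peak_fee_eur": [45_000_000, 100_000_000, 20_000_000, 12_000_000],
})

KEEP = ["name", "age", "current_fee_eur", "peak_fee_eur"]
MAX_FEE_EUR = 55_000_000  # upper bound derived from the Aston Villa benchmark


def clean_players(df, max_fee=MAX_FEE_EUR):
    """Keep the four analysis columns, drop nulls, sort by price, cap the fee."""
    out = df[KEEP].dropna()  # nulls in dropped columns no longer matter
    out = out.sort_values("current_fee_eur", ascending=False)
    out = out[out["current_fee_eur"] <= max_fee]  # inclusive cap (assumption)
    return out.reset_index(drop=True)


cleaned = clean_players(raw)
print(cleaned)
```

Selecting the columns before calling `dropna()` matters: a player with a missing agent but complete fee data is kept, while a player with a missing current fee is dropped.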

Data Analysis

Now that the data is cleaned, it contains only players who are of interest to the stakeholder. While the price will be right, other factors beyond the transfer value make a signing a good one. A key factor in a smaller club’s transfer decision is whether the transfer will be lucrative in the long run, meaning whether the club will be able to sell the player for more than it paid. Therefore, I focused on players who have not yet reached their prime by restricting the data to players under 25 years old. With the transfer fee and age limits applied, we are left with 31 valuable transfer targets for the stakeholder, which are listed below.

A reasonable hypothesis is that, for the remaining players, the current price equals the peak price. Young players of this quality tend to see their transfer value rise as they prove their capabilities with more first-division experience. That being said, clubs that own these prospects tend to ask for more money than the market value, as they believe the player has more transfer potential. Therefore, I filtered these 31 players down to those whose current value is less than their peak value, as shown below.

This boils 2,613 players down to only 13 key players for the stakeholder to analyze. A current transfer fee below the peak fee can reflect a decline in a player’s abilities. While that can be true, a significant factor is often that a young talent was given too high a valuation before he had proved his abilities. If the stakeholder were to buy any of these 13 players at their relatively low current price, the stakeholder would have the opportunity to acquire a high-quality player whose transfer value has room to rise.
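The two filters in this section, the age limit and the requirement that the current fee sit below the peak fee, can be sketched as below. As before, this is a hedged illustration rather than the code from the linked repository: the players and values are made up, the column names are assumed, and `shortlist` is a hypothetical helper. The age limit is applied as strictly under 25.

```python
import pandas as pd

# Hypothetical players that already passed the 55 million Euro price cap.
# Names, values, and column names are invented for illustration.
capped = pd.DataFrame({
    "name": ["Player A", "Player D", "Player E", "Player F"],
    "age": [22, 25, 24, 20],
    "current_fee_eur": [40_000_000, 12_000_000, 30_000_000, 18_000_000],
    "peak_fee_eur": [45_000_000, 20_000_000, 30_000_000, 25_000_000],
})

MAX_AGE = 25  # players must be younger than this


def shortlist(df, max_age=MAX_AGE):
    """Return (players under max_age, those of them priced below their peak)."""
    young = df[df["age"] < max_age]
    below_peak = young[young["current_fee_eur"] < young["peak_fee_eur"]]
    return young.reset_index(drop=True), below_peak.reset_index(drop=True)


young, below_peak = shortlist(capped)
print(len(capped), "->", len(young), "->", len(below_peak))  # prints: 4 -> 3 -> 2
```

In this toy data, Player D is priced below their peak but is excluded by the age limit, and Player E is young enough but is still at their peak price, so only Player A and Player F remain.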

Limitations

A key limitation of this analysis is that the data cannot predict future changes in a player’s transfer value. That being said, the analysis gives the stakeholder a good general foundation of players to keep track of. Since the analysis yields such a small group of players, the stakeholder will be able to follow the value of these potential transfer targets over time. Another limitation is that the stakeholder’s exact financial constraints are unknown. While Aston Villa was used as an example, the amount of money a team can spend depends on club performance and player sales, which can vary significantly from year to year.

GitHub Code:

https://github.com/dadams16/INST414AdamsModules/tree/main/module1
