Daniel Adams
INST414: Data Science Techniques
May 16, 2024


Closing the Gap — How Free to Play Users of EA FC24 can succeed on a budget (Extended)

By Daniel Adams

Introduction

EA FC24, formerly known as the FIFA series, is a soccer video game played around the world. A key game mode in FC24 is Ultimate Team. In this mode, users can compete online and build their “ultimate” squad with players from leagues around the world. While some users focus on signing players that they like or support, the majority of users seek to sign the best players so they can dominate their opponents. To sign players, users rely on the in-game transfer market, which contains cards of real-life players that can be bought using the in-game currency. Player cards are broken down into three rarities: bronze, silver, and gold. Gold cards are the best of the three, but their overall ratings still range from 75 to 91, and cards with higher overalls cost more in-game currency. In addition to the transfer market, users can claim free packs by unlocking rewards, or they can spend real money on store packs. However, free packs and store packs do not yield the same quality of players.

Question

A critical grievance many users hold against Electronic Arts (EA), the creators of FC24, is that the game has become pay-to-win over the years. Users make this claim because of the difference in returns between free packs and store packs: getting high-overall players from free packs has become exceedingly rare. As a result, those who do not spend real money on packs have fallen behind in player quality, which has caused great frustration among the free-to-play users in the FC24 user base. Regardless, many free-to-play players still play the game hoping to be lucky enough to get the best players from free card packs. These users still need to field a team competitive enough to win and therefore unlock the free reward packs. This creates a cycle in which these users are unable to unlock the best cards because their teams cannot carry them through the requirements for free packs. Therefore, how can these users field a good enough team to break this cycle?

Stakeholders and Approach

The stakeholders for this analysis are the free-to-play users of FC24. This analysis seeks to answer the question of how to acquire good players that do not require significant amounts of in-game currency. To solve the stakeholder’s problem, the analysis uses KMedoids clustering to find players comparable to the game’s best and most expensive players. These replacement players will allow free-to-play users to build a team of less expensive alternatives that still maintains a competitive level of quality. Therefore, this analysis will recommend specific players to the stakeholder that can efficiently replicate the quality of the game’s best at a significantly cheaper cost.

Data Cleaning

The data used for this analysis was gathered from Kaggle.com. The dataset consisted of all players in EA FC24, with each row representing one player. Each row included relevant identifying data, such as the player name, nation, club, position, age, and overall rating. Additionally, the in-game statistics of each player were included, which were the only data points required for this analysis. For the scope of this analysis, the nation, club, age, and other non-statistical features were removed from the table. Other than this, the dataset was ready for further analysis.
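The cleaning step described above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the author's actual code: the sample rows are hypothetical, and the column names (Nation, Club, Age, and so on) are assumptions based on the fields named in the paragraph, so the real Kaggle file may label them differently.

```python
# Hypothetical rows; the column names mirror the fields described above,
# but the real Kaggle file may name them differently.
rows = [
    {"Name": "Player A", "Nation": "X", "Club": "Y", "Position": "ST",
     "Age": 24, "Overall": 80, "Finishing": 82, "Strength": 75},
    {"Name": "Player B", "Nation": "Z", "Club": "W", "Position": "CB",
     "Age": 29, "Overall": 78, "Finishing": 35, "Strength": 84},
]

# Non-statistical columns that are not needed for clustering (assumed names).
DROP_COLUMNS = {"Nation", "Club", "Age"}


def drop_columns(records, to_drop):
    """Return copies of the records without the unwanted columns."""
    return [
        {key: value for key, value in record.items() if key not in to_drop}
        for record in records
    ]


cleaned = drop_columns(rows, DROP_COLUMNS)
print(cleaned[0])
```

With a dataframe library the same step would typically be a single column-drop call; the plain-Python version is used here only to keep the sketch self-contained.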

Data

Once the dataset contained only the identifying features and the in-game stats, it was ready to be analyzed. These features are listed below. Note: all features besides name and position are bounded between 1 and 99.

Identifying features:

  • Name: The player’s name
  • Position: The player’s position
  • Overall: The overall rating of the player

Physical statistics:

  • Acceleration: The rate at which a player can obtain top speed
  • Sprint: The maximum speed a player can reach
  • Jumping: The height a player can jump
  • Stamina: The amount of stamina a player has throughout the match
  • Strength: Player strength when tackling or protecting the ball
  • Composure: Ability to withstand pressure from the defense when shooting and passing
  • Reactions: Time to execute a user given command

Attacking statistics:

  • Positioning: The AI controlled off-ball position of a player
  • Finishing: The inside the box shooting capability
  • Shot: The shot power of the player
  • Long: The outside the box shooting capability
  • Volleys: The first touch shooting capability (shooting upon receiving a pass)
  • Penalties: The capability of scoring from the penalty spot

Passing statistics:

  • Passing: Short passing
  • Vision: Ability to pass through open gaps between defenders
  • Crossing: Passing from the outside into the opponent’s box
  • Free: Free kick accuracy
  • Curve: The amount of bend applied on crosses, shots, and aerial passes

Dribbling statistics:

  • Agility: The player’s elusiveness on and off the ball
  • Balance: The player’s ability to stay on their feet when facing pressure
  • Ball: Ball control when receiving a pass and dribbling
  • Dribbling: A player’s elusiveness with the ball at feet

Defending statistics:

  • Interceptions: Capability of reaching an opponent’s pass
  • Heading: Heading accuracy when passing and shooting
  • Def: The AI controlled off-ball defensive position of a player
  • Standing: Stand tackling
  • Sliding: Slide tackling
  • Aggression: The aggression when fighting for a loose ball or aerial pass

Data Analysis

The analysis sought to find cheaper alternatives to some of the game’s best players. To do this, the cleaned dataset was broken into three separate tables, one each for forwards, midfielders, and defenders. Each of these three tables then had its own instance of KMedoids in order to represent the different play styles of each area of the field.
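The three-way split can be sketched as follows. This is an illustrative outline, not the author's code: the position codes are the ones this article lists for each area of the field, while the sample players and the "GK" code for goalkeepers are assumptions made for the example.

```python
# Position codes per area, as listed in the article.
FORWARDS = {"ST", "CF", "LW", "LM", "RW", "RM"}
MIDFIELDERS = {"CAM", "CDM", "CM"}
DEFENDERS = {"LB", "RB", "CB"}


def split_by_area(players):
    """Group player rows into forwards, midfielders, and defenders."""
    groups = {"forwards": [], "midfielders": [], "defenders": []}
    for player in players:
        position = player["Position"]
        if position in FORWARDS:
            groups["forwards"].append(player)
        elif position in MIDFIELDERS:
            groups["midfielders"].append(player)
        elif position in DEFENDERS:
            groups["defenders"].append(player)
        # Any other position (for example goalkeepers) is left out.
    return groups


# Hypothetical rows; "GK" is an assumed code for goalkeepers.
players = [
    {"Name": "Player A", "Position": "ST"},
    {"Name": "Player B", "Position": "CDM"},
    {"Name": "Player C", "Position": "LB"},
    {"Name": "Player D", "Position": "GK"},
    {"Name": "Player E", "Position": "RW"},
]

groups = split_by_area(players)
for area, members in groups.items():
    print(area, [p["Name"] for p in members])
```

Each of the three lists can then be clustered independently, so that a striker is only ever compared with other forwards.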

Data Analysis — Tools

The previous analysis used cosine similarity from Module 3 for various positions on the field. However, a key limitation of that analysis is that it did not account for the different play styles of forwards, midfielders, and defenders. Therefore, this analysis uses KMedoids clustering to identify similar, cheaper alternatives to the game’s best and most expensive players based on specific play styles. KMedoids is similar to KMeans from Module 4, but its cluster centers are actual data points, and it allows the user to initialize the algorithm with a chosen set of initial cluster centroids. This allowed the analysis to build clusters around players of my choice, which was important because specific clusters were intended to represent specific play styles of various positions on the field. In addition, KMedoids allows the user to choose the similarity metric on which the algorithm bases its clustering. For this analysis, cosine similarity was used to cluster the players.
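To make the idea concrete, here is a small pure-Python sketch of k-medoids that starts from hand-picked seed rows and uses cosine distance (one minus cosine similarity). It is a simplified stand-in written for illustration, not the library implementation the analysis used; the function names, the three-number stat vectors, and the choice of seeds are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length stat vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def assign(vectors, medoids):
    """Label each vector with the position of its nearest medoid."""
    return [
        min(range(len(medoids)),
            key=lambda c: cosine_distance(v, vectors[medoids[c]]))
        for v in vectors
    ]


def seeded_k_medoids(vectors, seed_indices, max_iter=20):
    """Alternate assignment and medoid updates, starting from chosen seeds."""
    medoids = list(seed_indices)
    for _ in range(max_iter):
        labels = assign(vectors, medoids)
        new_medoids = []
        for c, current in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(current)
                continue
            # The new medoid is the member closest to all other members.
            best = min(members, key=lambda i: sum(
                cosine_distance(vectors[i], vectors[j]) for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(vectors, medoids)


# Hypothetical [pace, finishing, passing] vectors; rows 0 and 1 are the seeds.
vectors = [
    [95, 90, 70],  # seed for a scoring-focused style
    [75, 70, 92],  # seed for a playmaking style
    [90, 84, 66],
    [72, 68, 88],
    [93, 86, 68],
    [70, 66, 90],
]

medoids, labels = seeded_k_medoids(vectors, seed_indices=[0, 1])
print(labels)
```

Because cosine distance compares the shape of a stat profile rather than its magnitude, a lower-rated player with the same proportions as a star lands in the same cluster. Note that in this sketch the update step may move a medoid away from its seed player, so the seed only determines where the cluster starts.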

Data Analysis — Forwards

Now that the data set was separated into three subsets, KMedoids could be run on the forwards table. This table includes positions that are considered attackers, such as Strikers and Wingers, denoted in the table as ST, CF, LW, LM, RW, RM.

Forwards — Goal Scoring Winger

To initialize the KMedoids centroid for goal scoring wingers, Kylian Mbappé was used to represent wingers (LW, LM, RW, RM) that are scoring focused. Mbappé was used to initialize this centroid as he possesses high speed, agility, dribbling, and finishing, which are the key statistics for this style of the winger position.

As shown, this cluster focused on those exact statistics: speed, agility, dribbling, and finishing are its main focus, as opposed to statistics such as passing. Timo Werner, Karim Adeyemi, and Ferran Torres are shown to be great alternatives to Mbappé at a very low price. The stakeholder would therefore be recommended to purchase one of these players to mimic Mbappé’s ability on the wing.
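One plausible way to check which statistics a cluster emphasizes is to average each statistic over the cluster's members and rank the results. The sketch below assumes this approach, which the article does not confirm; the helper name and the stat values are hypothetical placeholders, not real card data.

```python
def top_stats(cluster_rows, stat_names, n=3):
    """Return the n statistics with the highest average across the cluster."""
    averages = {
        stat: sum(row[stat] for row in cluster_rows) / len(cluster_rows)
        for stat in stat_names
    }
    return sorted(averages, key=averages.get, reverse=True)[:n]


# Hypothetical members of a goal scoring winger cluster.
cluster_rows = [
    {"Sprint": 94, "Agility": 90, "Dribbling": 88, "Finishing": 86, "Passing": 72},
    {"Sprint": 92, "Agility": 86, "Dribbling": 84, "Finishing": 82, "Passing": 70},
]
stat_names = ["Sprint", "Agility", "Dribbling", "Finishing", "Passing"]

print(top_stats(cluster_rows, stat_names))
```

The same helper could be applied to every cluster in the analysis to confirm that its top statistics match the play style its seed player was chosen to represent.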

Forwards — Playmaker Winger

Another key play style of a winger is being a playmaker rather than scoring oriented. This style of player does not focus as much on speed and finishing, but rather on dribbling, passing, and crossing. For this cluster, Ángel Di María was used as he has some of the best playmaking stats for a winger in the game.

This cluster reflects a playmaker style of winger as dribbling attributes are higher than speed attributes. Similarly, passing takes priority over finishing statistics. Therefore, the stakeholder would be advised to purchase cheap alternatives to Di María’s playmaking ability, such as Samuel Chukwueze, Marco Asensio, and Rodrygo.

Forwards — Strikers

The last type of forward analyzed was the striker (ST, CF), which focuses on physicality and finishing. For this cluster, Erling Haaland was used as he is the highest rated striker in the game.

As shown, this cluster focuses on physicality attributes such as jumping, strength, and heading. Additionally, finishing, penalties, and long shots are key stats that define this cluster. For the stakeholder, players such as Youssef En-Nesyri, Sébastien Haller, and Tammy Abraham are capable of replicating Haaland’s striker abilities at a much lower cost.

Data Analysis — Midfielders

For the midfielder subset of data, the KMedoids clustering method was used to identify three specific play styles of midfielders. These were central attacking midfielders (CAMs), central defensive midfielders (CDMs), and center midfielders (CMs).

Midfielders — Central Attacking Midfielders (CAMs)

The CAM style of midfielder revolves around the player’s ability to create scoring opportunities for the team. That being said, these players have very high vision and passing statistics. Additionally, this style of midfielder is expected to be able to finish chances of their own, so they have relatively high shooting statistics as well. In order to initialize this cluster, Kevin De Bruyne was used as he is the game’s highest rated CAM.

The statistics that are paramount for this cluster are ball control, shot power, vision, and dribbling. As mentioned, these statistics are critical for a CAM’s ability to create chances for themselves and the team. The stakeholder would therefore be advised to replicate De Bruyne’s playmaking ability with cheaper alternatives such as Giacomo Bonaventura, Florian Wirtz, and Nabil Fekir.

Midfielders — Central Defensive Midfielders (CDMs)

This style of midfielder revolves around regaining possession of the ball while still maintaining midfielder qualities. Therefore, defensive, physical, and passing attributes are critical for representing the quality of a CDM. For this cluster’s centroid, the highest rated CDM in the game, Rodri, was chosen.

This cluster highlights the previously mentioned qualities: aggression, standing tackle, defensive awareness (Def), and interceptions are some of the best statistics of this cluster. With that in mind, Rodri’s defensive style of play can be emulated well by Granit Xhaka, Danilo Pereira, and Ali Al Musrati at a much cheaper cost.

Midfielders — Center Midfielders (CMs)

Center Midfielders are a hybrid of CAMs and CDMs. These midfielders are commonly known as “box-to-box” midfielders as they are expected to be able to defend their own box and attack the opposition’s box. Therefore, key features of a CM are passing, dribbling, defending, and decent shooting. In order to create this cluster, Jude Bellingham was chosen as the centroid as he has some of the most well rounded CM attributes in the game.

This cluster consists of players who possess the aforementioned qualities of a center midfielder, specifically statistics related to passing, dribbling, and defending. The stakeholder would be recommended to replace Bellingham’s box-to-box qualities with cheap alternatives such as Mikel Merino or Youri Tielemans.

Data Analysis — Defenders

In soccer, most teams’ defenses consist of four players, a left back (LB), right back (RB), and two center backs (CBs). LBs and RBs are considered “fullbacks” and this style of defender focuses on speed and agility in order to keep up with the wingers of the other team. On the other hand, CBs are typically slow yet much more powerful in order to be able to guard the strong strikers of the opposition.

Defenders — Fullbacks (LB & RB)

As mentioned, fullbacks must have high speed and agility in order to not be beaten by the opposition’s wingers. Additionally, these players must possess strong defensive attributes, as defending is one of this position’s main priorities. The centroid used for this cluster was Theo Hernández as he is one of the highest rated fullbacks in the game.

This cluster is defined by top statistics in speed and defending, which are the most important for fullbacks. Therefore, the stakeholder would be advised to sign players such as Alphonso Davies or Robin Gosens as replacements for Theo Hernández.

Defenders — Center Backs (CBs)

The most important statistics for center backs are physical attributes and defensive attributes. That being said, Virgil van Dijk was used as this cluster’s centroid since he possesses high statistics in these categories.

As shown, this cluster contains players with high strength, jumping, and aggression, which are physical attributes. Additionally, defensive attributes such as standing tackle, defensive awareness, and interceptions are among this cluster’s top statistics. The stakeholder could look towards signing players such as Sven Botman, Raphaël Varane, or Gianluca Mancini as replacements for Virgil van Dijk.

Final Recommendations

Overall, a plethora of players were recommended for the stakeholder to choose from. Using my domain expertise, I would directly recommend the set of players listed below.

  • Left Wing: A left winger is commonly a goal scoring winger since the player can cut in on his dominant right foot. Therefore, I would recommend Timo Werner.
  • Striker: I recommend Sébastien Haller, since he is one of the game’s best aerial threats at a super low cost.
  • Right Wing: Right wingers are commonly playmakers as they can go to the outside and pass into the box using their dominant right foot. For a cheap but quality player, I recommend Marco Asensio.
  • Center Attacking Midfielder: A great cheap alternative CAM would be Nabil Fekir. He possesses all the necessary passing statistics but also boasts very good shooting, so I highly recommend this player.
  • Center Defensive Midfielder: For a CDM, I highly recommend Granit Xhaka as he possesses great defensive qualities while also having a decent shooting threat from outside the box.
  • Center Midfielder: Mikel Merino is one of my favorite CMs in the game as he has great defending ability and also playmaking ability. He lacks elusive agility and dribbling, but his physicality makes up for it.
  • Left Back: Alphonso Davies has extremely high sprint speed and can be outrun by very few attackers in the game. Therefore, Davies is a must in regards to cheap, quality players.
  • Center Back: A great cheap alternative CB is Raphaël Varane as he is relatively quick in addition to having great strength and defensive statistics.
  • Center Back: Another great CB would be Sven Botman. He is not as quick as Varane; however, his physicality and defensive ability make up for this deficiency.
  • Right Back: Robin Gosens is a very good cheap fullback in the game. He is quick, strong, and has good defensive abilities. Additionally, he is quite unique in possessing relatively high shooting and passing quality for a defender.

Limitations

A critical limitation of this analysis was the dataset’s lack of player prices. This prevented the analysis from showing the alternative players’ price tags, which would significantly help the stakeholder. That being said, many websites, such as Futbin.com, show the current market price of any player. Another limitation was the exclusion of goalkeepers from the analysis. Goalkeepers only possess six unique statistics, so clustering this position would not be very useful. Additionally, most goalkeepers, including some of the highest rated, are moderately priced in the game, so there is little need to search for cheap alternatives.

GitHub Code:

https://github.com/dadams16/INST414AdamsModules/tree/main/ModuleExtension
