In this week’s post, I walk through how I prepared and analyzed the FIFA 21 dataset. The post is more hands-on than explanatory. I will share some background, but mainly I want this article to encourage you to dig further, because I think you learn the most while doing your own research.
After downloading the dataset to your computer, you can create a new folder on your desktop and move the dataset into it.
I created a folder called “Medium Series” and moved the dataset into it. Then I clicked the “New” button and chose “Python 3” to create a new notebook, which I named “Fifa_2021”.
After the installation, I opened the “Fifa_2021.ipynb” notebook and imported the libraries that I wanted to use.
Next, I read the file into a data frame with the “read_csv()” function from the pandas library. I generally work on a copy of the original data, made with the “copy()” method, to protect it from accidental changes. My data frame is now called “df”. To get a first impression of the dataset, I looked at the first 5 rows with the “head()” method.
With the “tail()” method I could also look at the last 5 rows of the dataset.
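The loading and previewing steps can be sketched roughly as follows. This is a minimal sketch, not the original notebook: the inline CSV text stands in for the downloaded file, and the column names other than “player_id”, “team” and “position” are assumptions.

```python
import io

import pandas as pd

# Tiny inline stand-in for the downloaded CSV (hypothetical rows and column names).
csv_text = """player_id,name,team,position,overall,age
1,Player A,Team X,ST|CF,90,30
2,Player B,Team Y,GK,88,28
3,Player C,Team X,CB,85,27
4,Player D,Team Z,CM|CDM,84,25
5,Player E,Team Y,LW,83,24
6,Player F,Team Z,RB,80,22
"""

# With a real file you would pass its path to read_csv instead of a StringIO object.
raw = pd.read_csv(io.StringIO(csv_text))

# Work on a copy so the originally loaded frame stays untouched.
df = raw.copy()

first_rows = df.head()  # first 5 rows by default
last_rows = df.tail()   # last 5 rows by default
print(first_rows)
print(last_rows)
```

Both “head()” and “tail()” accept a number, for example “df.head(10)”, if you want to see more than 5 rows.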
I got the number of rows (observations) and columns (features) of the dataset with the “shape” attribute. Then I checked for missing values with “.isnull().sum()” and saw that there are none.
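Here is a small sketch of both checks on a toy data frame. Unlike the real dataset, the toy data contains missing values on purpose, so that the counts are not all zero.

```python
import numpy as np
import pandas as pd

# Toy data with deliberate gaps (hypothetical values).
df = pd.DataFrame({
    "player_id": [1, 2, 3],
    "team": ["Team X", "Team Y", None],
    "overall": [90, 88, np.nan],
})

# shape is an attribute, not a method: (number of rows, number of columns).
n_rows, n_cols = df.shape

# isnull() marks missing cells; sum() then counts them per column.
missing_per_column = df.isnull().sum()

print(n_rows, n_cols)
print(missing_per_column)
```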
Every player must have a unique “player_id”. To make sure there are no duplicate IDs, I used the “drop_duplicates()” method, which removes any duplicated entries.
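One way to do this is sketched below. Restricting the check to the “player_id” column with the “subset” argument is my assumption about how the duplicates were defined; without it, pandas only drops rows that are identical in every column.

```python
import pandas as pd

# Toy data with one repeated player_id (hypothetical values).
df = pd.DataFrame({
    "player_id": [1, 2, 2, 3],
    "name": ["Player A", "Player B", "Player B", "Player C"],
})

# Count the rows whose player_id has already appeared earlier in the frame.
n_duplicates = df.duplicated(subset="player_id").sum()

# Keep the first occurrence of each player_id and drop the rest.
df = df.drop_duplicates(subset="player_id", keep="first")

print(n_duplicates)
print(df)
```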
To see the data types of the columns, I used the “info()” method. Its output also shows that the dataset has 17981 rows and 9 columns, with index values running from 0 to 17980.
To trim spaces at the beginning and end of the values in the “team” column, I used the “str.strip()” method. If you want to learn more about string methods, you can take a look at my “Most Common String Methods in Python” post.
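A quick sketch of the trimming step on made-up team names:

```python
import pandas as pd

# Team names with stray leading and trailing spaces (hypothetical values).
df = pd.DataFrame({"team": [" Team X ", "Team Y  ", "  Team Z"]})

# str.strip() removes whitespace at both ends of every value in the column.
df["team"] = df["team"].str.strip()

print(df["team"].tolist())
```

Trimming matters here because “Team X” and “ Team X ” would otherwise be treated as two different teams when grouping.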
Number Of Players Per Team
I wanted to see how many players each team has. Using the “groupby()”, “count()” and “sort_values()” methods, I counted the players per team and sorted the result. I assigned it to “df_sum_of_plyrs”.
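The counting step might look like the sketch below. Counting the “player_id” column and sorting in descending order are assumptions on my part; the toy data is made up.

```python
import pandas as pd

# Toy data: three teams of different sizes (hypothetical values).
df = pd.DataFrame({
    "player_id": [1, 2, 3, 4, 5, 6],
    "team": ["Team X", "Team X", "Team X", "Team Y", "Team Y", "Team Z"],
})

# Group rows by team, count the players in each group, then sort by that count.
df_sum_of_plyrs = (
    df.groupby("team")["player_id"]
    .count()
    .sort_values(ascending=False)
)

print(df_sum_of_plyrs)
```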
I wanted to focus on teams with more than 12 players, so that teams with very few players don’t distort the averages. Therefore I filtered “df_sum_of_plyrs” accordingly.
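The filter can be written as a boolean mask, as in this sketch. The counts are made up, and “min_players” is a name I introduced for illustration.

```python
import pandas as pd

# Players per team, as produced by a groupby/count step (hypothetical counts).
df_sum_of_plyrs = pd.Series(
    {"Team X": 30, "Team Y": 12, "Team Z": 13, "Team W": 5},
    name="player_id",
)

min_players = 12  # threshold from the text: keep teams with MORE than 12 players

# The comparison yields True/False per team; indexing with it keeps only the True rows.
df_sum_of_plyrs = df_sum_of_plyrs[df_sum_of_plyrs > min_players]

print(df_sum_of_plyrs)
```

Note that a team with exactly 12 players is dropped, because the comparison is strictly greater than.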
The counts show that 211 players are free agents. I separated them from the dataset and assigned them to “free_agents”. I also assigned the players who already have a team to “df_team_player”.
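A possible way to make this split is sketched below. The label “Free Agents” in the “team” column is an assumption; check how free agents are actually marked in your copy of the data.

```python
import pandas as pd

# Toy data; "Free Agents" as the team label is a hypothetical value.
df = pd.DataFrame({
    "player_id": [1, 2, 3, 4],
    "team": ["Team X", "Free Agents", "Team Y", "Free Agents"],
})

# Boolean mask that is True for players without a club.
is_free_agent = df["team"] == "Free Agents"

free_agents = df[is_free_agent]      # players without a team
df_team_player = df[~is_free_agent]  # ~ inverts the mask: players with a team

print(len(free_agents), len(df_team_player))
```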
Using the “mean()” method, I calculated the average number of players per team as “24,95…”. You can also see the five teams with the most players by using “sort_values(ascending=False).head()”.
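Both calculations are sketched here on toy data, so the numbers differ from the real result. The variable names “players_per_team”, “avg_players” and “largest_teams” are my own.

```python
import pandas as pd

# Toy version of the players who have a team (hypothetical values).
df_team_player = pd.DataFrame({
    "player_id": [1, 2, 3, 4, 5, 6],
    "team": ["Team X", "Team X", "Team X", "Team Y", "Team Y", "Team Z"],
})

# Number of players in each team.
players_per_team = df_team_player.groupby("team")["player_id"].count()

# Average squad size across all teams.
avg_players = players_per_team.mean()

# Teams with the most players; head() returns at most 5 rows by default.
largest_teams = players_per_team.sort_values(ascending=False).head()

print(avg_players)
print(largest_teams)
```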
Overall Average Per Team
Let’s take a look at the overall average of each team. I calculated it with the “mean()” method and then sorted the teams from highest to lowest.
Let’s take a look at the 10 teams with the highest and the 10 teams with the lowest overall average.
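The team averages and the two rankings can be sketched like this. I assume the rating column is called “overall”; the toy data has only three teams, so “head(10)” and “tail(10)” simply return all of them.

```python
import pandas as pd

# Toy data; the "overall" column name and the ratings are assumptions.
df_team_player = pd.DataFrame({
    "team": ["Team X", "Team X", "Team Y", "Team Y", "Team Z"],
    "overall": [90, 80, 70, 72, 60],
})

# Mean overall rating per team, sorted from highest to lowest.
team_overall = (
    df_team_player.groupby("team")["overall"]
    .mean()
    .sort_values(ascending=False)
)

top_teams = team_overall.head(10)     # highest averages
bottom_teams = team_overall.tail(10)  # lowest averages

print(top_teams)
print(bottom_teams)
```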
Extracting Base Positions of Players from Positions
I used the “str.strip()” method again to trim spaces at the beginning and end of the values in the “position” column. Then I defined a new column called “base_position” and filled it with each player’s base position by using the “str.extract()” method.
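A sketch of the extraction follows. It assumes that a player’s positions are stored in one string separated by “|” and that the first listed position counts as the base position. Both the format and the regular expression are my assumptions, not necessarily what the notebook used.

```python
import pandas as pd

# Toy data; the "ST|CF" style format is an assumption about the position column.
df = pd.DataFrame({"position": [" ST|CF ", "GK", "CM|CDM "]})

# Remove stray spaces first so the pattern can anchor at the start of the string.
df["position"] = df["position"].str.strip()

# Capture the leading run of capital letters, i.e. the first listed position.
# expand=False returns a Series instead of a one-column DataFrame.
df["base_position"] = df["position"].str.extract(r"^([A-Z]+)", expand=False)

print(df)
```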
I got the number of players per base position across the whole dataset with the “value_counts()” method.
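On a toy column, the count looks like this:

```python
import pandas as pd

# Toy base positions (hypothetical values).
df = pd.DataFrame({"base_position": ["ST", "GK", "ST", "CB", "ST", "CB"]})

# value_counts() counts each distinct value and sorts from most to least frequent.
position_counts = df["base_position"].value_counts()

print(position_counts)
```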
Overall Average According to Players’ Age Average and Base Positions
I selected the columns I wanted to use from “df” and stored them in a new data frame called “df_pos”.
Then I grouped the data by base position, calculated the average overall and the average age for each group, and sorted the result by average overall from highest to lowest.
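The selection and grouping steps can be sketched as below. The column names “overall” and “age” are assumptions, and “pos_summary” is a name I introduced for the result.

```python
import pandas as pd

# Toy data; "overall" and "age" are assumed column names.
df = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C", "Player D"],
    "base_position": ["ST", "ST", "GK", "CB"],
    "overall": [90, 80, 88, 70],
    "age": [30, 20, 28, 25],
})

# Keep only the columns needed for this analysis.
df_pos = df[["base_position", "overall", "age"]]

# Average overall and age per base position, sorted by average overall.
pos_summary = (
    df_pos.groupby("base_position")[["overall", "age"]]
    .mean()
    .sort_values("overall", ascending=False)
)

print(pos_summary)
```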
In this post I aimed to highlight methods that are useful when analyzing data. I hope it is helpful for you.
See you in the next post :)