How I Created an End-To-End Personal Data Project (Fantasy Premier League)

Andrej Trajkovski
Nov 1, 2022

Building a data portfolio is a first step in showcasing your technical skills beyond your resume. Choosing the right dataset to analyze can be challenging given how many are available. On the one hand, the most popular public datasets are great for getting started and for comparing your work to others'. On the other hand, projects built on them appear in the portfolios of the vast majority of data analysts. The Titanic dataset, for example, is the most famous one used for projects and has been downloaded over 100,000 times on Kaggle. How much more value can you add to such an overused dataset? Most importantly, is using a publicly available dataset the most effective way to land your dream data job?

That is why a personal project is crucial for setting yourself apart in the crowded data world. In this post, I will talk about how I combined my passion for Fantasy Premier League (FPL) with my data analytics skills to create an end-to-end data project using a dataset that is unique to me.

What is an end-to-end data project?

An end-to-end data project allows you to demonstrate a wide range of skills, from data engineering to advanced visualisation, all in one place. A project of this nature can be divided into four components:

  • Data extraction
  • Data cleaning and transformation
  • Analysis
  • Reporting (visuals or a dashboard)

As I mentioned before, this project’s scope is Fantasy Premier League — more specifically, my FPL team.

I fell in love with FPL several years ago, but it wasn’t until recently that I had the idea to make a tool to help me better view my team’s performance.

Follow along to see how I completed this project and what the final output looks like!

Data Extraction

To extract the data from the FPL website, I needed to connect to their official API. As there was a lack of documentation, I looked at various Reddit threads and articles to find the most up-to-date method of connecting to the FPL API. A more detailed approach to accessing the FPL data can be found here.

In short, the FPL API provides many endpoints that hold various data relevant to the game (fixtures, player stats, manager info, etc.). For this project I have used the following endpoints:

*{TID} = your team ID

Establishing the connection to the FPL API

With these three data sources, I was able to extract the data I needed to make my personalized FPL dashboard.
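
The connection step could be sketched roughly as follows. The endpoint list did not survive in this copy of the article, so the three URLs below are assumptions based on commonly used FPL endpoints and are not necessarily the ones used in the project. The helper names `build_urls` and `fetch_json` are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Base URL of the FPL API (undocumented, so it may change without notice).
BASE_URL = "https://fantasy.premierleague.com/api"


def build_urls(team_id: int, gameweek: int) -> dict:
    """Return the three endpoint URLs assumed for this sketch."""
    return {
        # General game data: players, teams, positions, gameweeks.
        "bootstrap": f"{BASE_URL}/bootstrap-static/",
        # Season history for one manager ({TID} in the article).
        "history": f"{BASE_URL}/entry/{team_id}/history/",
        # Squad picks of that manager for a single gameweek.
        "picks": f"{BASE_URL}/entry/{team_id}/event/{gameweek}/picks/",
    }


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Download one endpoint and parse the JSON response."""
    # A browser-like User-Agent is sent as a precaution (assumption).
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a real team ID in place of a placeholder such as 123456, `fetch_json(build_urls(123456, 10)["history"])` would return that manager's season history as a Python dictionary.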

Data Cleaning & Transformations

The next step in this process was to clean the data and make it usable for the end user (in this case, myself).

First, after loading all the data, I needed to add the proper dimensions (mappings) to the raw data. FPL assigns each player a unique ID, along with his position (element type) and team. I therefore used the ‘map’ function in Python to attach the appropriate dimensions to each player. Afterwards, I joined the all-players data with my team data so that I had complete information on my current squad.
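
A minimal sketch of the mapping and join, assuming the data sits in pandas DataFrames. The sample rows, the lookup dictionaries, and the column names (`id`, `element_type`, `team`, `element`) are illustrative assumptions and are not taken from the notebook.

```python
import pandas as pd

# Hypothetical sample rows standing in for the raw all-players data.
players = pd.DataFrame({
    "id": [1, 2, 3],
    "web_name": ["Player A", "Player B", "Player C"],
    "element_type": [1, 3, 4],  # numeric position code
    "team": [10, 11, 10],       # numeric team code
})

# Lookup tables (assumed values) that turn codes into readable labels.
positions = {1: "GKP", 2: "DEF", 3: "MID", 4: "FWD"}
teams = {10: "Team X", 11: "Team Y"}

# Series.map replaces each code with its label from the dictionary.
players["position"] = players["element_type"].map(positions)
players["team_name"] = players["team"].map(teams)

# Hypothetical squad picks: "element" holds the player ID.
my_team = pd.DataFrame({"element": [1, 3], "is_captain": [False, True]})

# Left join keeps every squad member and adds the player details.
squad = my_team.merge(players, left_on="element", right_on="id", how="left")
```

A left join is used here so that a squad member missing from the players table would still appear, with empty detail columns, instead of silently disappearing.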

Next, I noticed that the player prices were multiplied by 10 when compared to the official FPL app. For example, Salah’s price was displayed as 127 instead of 12.7. Hence, I divided all player prices by 10 to match the values shown in the app.
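
The price fix amounts to a single division. The sketch below assumes a pandas DataFrame and a raw price column called `now_cost`; the second row is a made-up player added for illustration.

```python
import pandas as pd

# Raw prices as delivered: 127 stands for 12.7 (the article's Salah example).
players = pd.DataFrame({
    "web_name": ["Salah", "Player B"],  # "Player B" is hypothetical
    "now_cost": [127, 45],
})

# Divide by 10 to match the prices shown in the official app.
players["price"] = players["now_cost"] / 10
```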

Data transformation & manipulation

Finally, I exported the cleaned data into three different tables:

  • Gameweeks
  • My team history
  • All Players
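
The export could look something like the sketch below. The article does not say which file format was used, so CSV is an assumption, as are the file names, the sample data, and the helper `export_tables`.

```python
import tempfile
from pathlib import Path

import pandas as pd


def export_tables(tables: dict, out_dir: Path) -> list:
    """Write each DataFrame to <out_dir>/<name>.csv and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, frame in tables.items():
        path = out_dir / f"{name}.csv"
        frame.to_csv(path, index=False)  # drop the index, keep only the data
        written.append(path)
    return written


# Hypothetical stand-ins for the three cleaned tables.
tables = {
    "gameweeks": pd.DataFrame({"gameweek": [1, 2]}),
    "my_team_history": pd.DataFrame({"gameweek": [1, 2], "points": [60, 55]}),
    "all_players": pd.DataFrame({"id": [1], "web_name": ["Player A"]}),
}

# A temporary folder keeps the example free of side effects.
out_dir = Path(tempfile.mkdtemp())
paths = export_tables(tables, out_dir)
```

Flat files like these can then be connected to a BI tool as separate data sources.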

Analysis & Reporting

The final piece of this end-to-end project is to report the findings in a useful way using a BI tool. In this case, I opted for Tableau. More specifically, I wanted to design a dashboard that highlights my team's performance in the current week in an easy-to-digest manner.

In a business setting, the contents of a dashboard would have to be agreed upon with its main stakeholders, depending on their KPIs and the planned use case. However, this dashboard will only be used by me. Therefore, I had the freedom to choose the metrics that help me assess my performance in the game and potentially make better decisions in the future.

I decided to break down the dashboard into three sections:

The Key Performance Indicators (KPIs)

KPI BANs

I wanted this section to quickly highlight the following:

  • Current overall rank and the percent change from the previous rank update
  • Latest gameweek (GW) points & the number of points I have left on my bench
  • The current team value with a change indicator in case of any price rises/falls

Rank Trend & Top Performers

Rank Line Chart & Latest GW Top Performers

This section of the dashboard takes a deeper dive into my ranking and the top performers from the latest gameweek.

I would like to point out two things on the line chart:

  • Reversed overall rank y-axis — we all want to climb *up* the rankings and not down!
  • Minimum and Maximum data markers — a quick way to show the best and worst GWs of the current season in terms of the overall ranking

Next, the horizontal bar graph displays the top performers sorted by their total number of points in the latest gameweek. The highlighted player in this view is my team’s captain, whose points are doubled for the week (the doubling is already included in this view). Naturally, I would want the highlighted player to be near the top of this graph, as blanks are not encouraged!
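
The ranking logic behind such a chart can be sketched in a few lines. The field names (`points`, `multiplier`) and the sample data are assumptions, with a multiplier of 2 marking the captain.

```python
# Hypothetical gameweek picks; multiplier 2 marks the captain.
picks = [
    {"name": "Player A", "points": 8, "multiplier": 1},
    {"name": "Player B", "points": 6, "multiplier": 2},
    {"name": "Player C", "points": 2, "multiplier": 1},
]


def top_performers(picks: list, n: int = 5) -> list:
    """Apply the captain multiplier and return the n highest scorers."""
    scored = [
        {
            **pick,
            "total": pick["points"] * pick["multiplier"],
            "is_captain": pick["multiplier"] >= 2,
        }
        for pick in picks
    ]
    # Highest total first, as in the bar graph.
    return sorted(scored, key=lambda p: p["total"], reverse=True)[:n]


ranked = top_performers(picks)
```

Here the captain, Player B, tops the list with 12 points once the doubling is applied.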

Current Squad Table

Current squad with selected metrics

The last section of this dashboard covers all members of my squad, with several metrics that are relevant to me when deciding on the starting XI and transfers.

Final Dashboard

Full Dashboard View

This is it! The dashboard above provides a quick overview with relevant metrics to help me with decision-making and performance analysis.

I am planning to expand this dashboard with more tabs displaying various player in-game performance metrics (xG, SoT, xA) to make a comprehensive one-stop tool for conquering FPL!

The full Jupyter notebook is available on GitHub and the dashboard on Tableau Public.

If you liked this article or have any questions, please connect with me on LinkedIn or give me a follow on Twitter!
