How to win at Fantasy Premier League using data — Part 2 — Picking your team
EDIT: Updated player forecasts, team optimisation tool and team of the week can now be found each gameweek at https://twitter.com/solpaul7
In Part 1 I went through building a model to project player scores for the upcoming season. In this part I discuss how to use these projections in practice. I share:
- My general philosophy of decision making to maximise expectation
- A tool to select optimal teams and interrogate player projections
- (For those who are interested) a step-by-step guide to team optimisation in Microsoft Excel using the Solver Add-in
It’s all about maximising expectation
FPL is all about points — I want to pick the combination of players that delivers the most expected points.
Let’s be clear about what expected points means.
A trick I often use is to imagine thousands of simulations of the season, each with slightly different outcomes. I want to pick the team that scores highest on average across all those seasons.
Now, I could be unlucky and start the season badly. If that happens, so be it. My actual points ended up on the low end of the scale in those thousands of seasons, but so long as I was right about my expected points then I’m happy. The FPL season is very long, so if I consistently make decisions that maximise expectation then the luck is likely to even itself out.
Selecting an opening team using this principle is straightforward, assuming that we think our forecast model is good. All we need to do is pick the team with the highest predicted score over, say, the first 10 weeks.
How do I figure out this optimal team?
I can use trial and error, looking at the predicted total scores for different combinations of players, but unless I’ve got a few million years to spare then I’m not going to find the highest scoring team. There is a much better way: optimisation. We can use an algorithm to find the top scoring combination of players almost instantly.
I’ve built a tool in Excel to do exactly this. I’ve also added functionality to allow you to make manual adjustments. My principle is to use data to augment, not replace, human expertise. There will be information that the model cannot capture, or clear biases in the data — I always want to create tools that allow me to recognise and correct these issues.
Download it here and have a play. I’ll be using it over the next two days to get my team ready for the season opener this Friday. I also plan to update the predictions and share the tool each week. It’s accuracy will improve significantly as the season progresses and it consumes more information.
If you’re interested in how I actually performed the optimisation in Excel, read on. Otherwise, good luck for gameweek 1!
Team optimisation in Microsoft Excel, step by step
Let’s start by opening the raw forecasts in Excel. We have one row for each player and week, so first we need to create another sheet with just one row per player. We’ll also turn this data into named tables (side note, if you’re not using named ranges, tables and pivot tables in Excel, then you should be).
We want to look at predictions for gameweeks 1 to 10, but let’s make it adjustable by creating named cells for the starting and ending gameweek, and calculating total predicted score for each player using the SUMIFS function.
One more bit of data cleaning, we need to create columns for each position (1 if it’s the players position, 0 if not) and a final selection column.
Solver will select players by putting a 1 in the ‘Selected’ column. This means that we can use the SUMPRODUCT function to calculate total points, total price and number of players in each position (I set a few players to 1 to make sure it’s working).
Ok, we’re ready to use Solver to find our optimal team. Here’s what we need to tell it:
- Which cell to maximise: Total Points
- Which cells to change: top 200 in the Selection column (200 is the maximum amount of cells Solver can handle, so we order the players by predicted score, and take the top 200)
- Constraints: let’s pick eleven players in a 4–4–2 with a £78m budget, we also need to tell it to make our selection cells binary (1 or 0)
- Solving method: Simplex LP
Click solve and voila! We have the selections that give the maximum points based on these predictions and criteria (you may need to enable Solver first).
Note: The final team shown in the GIF highlights an issue I mentioned in Part 1. Players are selected who we do not expect to start this weekend. I’ve got around this in the final tool shared above by setting a zero point prediction for players who are not expected to start. Over the next few weeks I plan to replace this with an approach that takes into account injury status and recent minutes. Watch this space.