The Best F1 Fantasy 2020 Line up

Keiren Mullage
Published in F1FantasyTracker
Mar 12, 2020

Produced by an artificial neural network and machine learning.

First of all, we want to thank everyone who got in touch with amazing feedback on the last blog post, as well as everyone engaging in great chats on the Discord channels.

Some of you have been asking for raw data from previous years, so we have uploaded last year's website. You can find it under “Legacy Site” at the top left.

One user, Kumagi, wanted to build a neural network that produces the best lineups based on last year's performances, various statistics, and impressions of driver and car strength from testing. This is possible because this year's rule set is very similar to last year's.

The Statistically Strongest Line up

MERCEDES, HAMILTON, STROLL, GASLY, RUSSELL, PEREZ

The Statistically Strongest Line up (Minus Hamilton and Mercedes)

RED BULL, VERSTAPPEN, ALBON, STROLL, GASLY, PEREZ

What is really exciting about the first line up is that the strongest selection, found through artificial intelligence and neural network modelling, is exactly the line up we recommended in the “F1 Fantasy 2020 Guide” blog post!

Again, like last time, these estimates and models are based on the data we have from last year and on impressions from testing. Come race day, these biases could change dramatically. Based on what we know so far, this is the best line up.

What is a Neural Network?

By Kumagi

Neural networks are machine learning models that loosely mimic the biological networks of neurons found in the human brain, with the purpose of learning the associations between the ‘features’ (inputs) of data. Although complex in their mathematical basis, they work on the principle of connecting layers of ‘nodes’, which hold data at various stages of processing, with layers of ‘weights’, which are scalar values held in matrices. A network can be ‘trained’ so that data passed through it produces values at the output layer that match the real-world outcomes associated with the input data. Neural networks are effective because they identify patterns between the features of data that a human often cannot. This is what makes this form of machine learning an effective and modestly accurate method of event classification and prediction.
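To make the idea of ‘training’ concrete, here is a minimal sketch of a single node with two weights and a bias being adjusted by gradient descent. It is a toy, not the network used for this project: the data (a logical OR), the sigmoid activation, the learning rate and the function names are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Output of one node: weighted sum of the inputs, then sigmoid."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(total)


def train_neuron(samples, epochs=2000, learning_rate=0.5):
    """Nudge the weights after every sample so the output moves toward the target."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, target in samples:
            error = predict(weights, bias, features) - target
            # For a sigmoid output with cross-entropy loss, the gradient
            # of the loss with respect to each weight is error * input.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Toy training data (assumption): two binary features, target is their OR.
SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

if __name__ == "__main__":
    w, b = train_neuron(SAMPLES)
    for features, target in SAMPLES:
        print(features, target, round(predict(w, b, features), 3))
```

A full network repeats the same idea across many nodes and layers, with the error passed backwards through each weight matrix.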

The network architecture used for this project was a multi-class classification neural network consisting of 4 layers: 8 input nodes (one for each of the 8 features of data), hidden layers of 10 and 20 nodes respectively, and 20 output nodes (each representing the probability of a driver achieving Fantasy points within an allotted range in a race).
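The shape of that architecture can be sketched in plain Python as a single forward pass. Only the layer sizes (8, 10, 20, 20) come from the description above; the ReLU hidden activation, the softmax output, the random starting weights and all function names are assumptions, and the real project used Keras and TensorFlow rather than hand-written code.

```python
import math
import random

# Layer sizes from the article: inputs, two hidden layers, outputs.
LAYER_SIZES = [8, 10, 20, 20]


def init_weights(sizes, seed=0):
    """Create one (weight matrix, bias vector) pair per connection between layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def relu(values):
    """Hidden-layer activation (assumption): negative values become zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, features):
    """Pass one driver's 8 features through the network to get 20 probabilities."""
    activations = list(features)
    for index, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activations = softmax(pre) if is_output else relu(pre)
    return activations


def count_parameters(sizes):
    """Number of trainable weights and biases for the given layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    network = init_weights(LAYER_SIZES)
    probabilities = forward(network, [0.5] * 8)  # placeholder feature values
    print(len(probabilities), round(sum(probabilities), 6))
    print(count_parameters(LAYER_SIZES))
```

With these sizes there are 730 trainable values in total, which is small by neural network standards and fits the limited amount of data available from one season.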

By providing the statistics of each driver for each race in the 2019 season, the network learned to relate the features of individual drivers (their qualifying position, positions gained over the race and track type, along with 4 other variables sourced from F1 Fantasy Tracker and various other sites) to the total Fantasy points gained in the race*. Eventually, the network had adjusted its weight matrices such that it was fairly accurate (~88%) in predicting the Fantasy points bracket of the next race for each driver, based on the trends of the races before. These relations, combined with simple calculations using each driver's fluctuating cost at every race last season, allow the network to predict the points likely to be accrued by each driver in each race to come.
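As a rough sketch of the ‘simple calculations’ involved, the snippet below turns a set of bracket probabilities into an expected points figure, relates it to a driver's cost, and measures bracket accuracy. The article does not state the bracket ranges or the exact cost formula, so the bracket midpoints, the points-per-million measure and every function name here are assumptions.

```python
def expected_points(probabilities, bracket_midpoints):
    """Probability-weighted average of the points brackets."""
    if len(probabilities) != len(bracket_midpoints):
        raise ValueError("need one midpoint per output probability")
    return sum(p * m for p, m in zip(probabilities, bracket_midpoints))


def most_likely_bracket(probabilities):
    """Index of the output node with the highest probability."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def points_per_million(points, price_millions):
    """Value measure (assumption): predicted points per $1M of cost."""
    return points / price_millions


def accuracy(predicted_brackets, actual_brackets):
    """Fraction of races where the predicted bracket matched the real one."""
    matches = sum(p == a for p, a in zip(predicted_brackets, actual_brackets))
    return matches / len(actual_brackets)


if __name__ == "__main__":
    # Three made-up brackets centred on 0, 10 and 20 points.
    probs = [0.25, 0.5, 0.25]
    points = expected_points(probs, [0, 10, 20])
    print(points, most_likely_bracket(probs), points_per_million(points, 5.0))
```

The accuracy figure quoted above would correspond to `accuracy` returning roughly 0.88 over the held-back races, assuming it was measured per driver per race.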

To calculate the optimal team for the start of the 2020 season, the network needed many of these data features to be imputed, that is, predicted by a separate regression neural network, to account for the variability of individual races and the inherent luck and race conditions that plague the consistency of race outcomes. Using the relations it had learned between the key variables of the 2019 season, the imputed values for 2020, and an additional bias placed on drivers to account for great strides in the cars of some teams over the off season (particularly Racing Point), the network could calculate the optimal teams for the start of 2020 within the $100 million budget.
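The final selection step can be pictured as a search over every valid team. The sketch below tries each constructor with every combination of five drivers and keeps the highest-scoring team that fits the budget. The article does not say how the search was actually done, so brute force is an assumption, and the names, prices and predicted points are made-up placeholders, not real 2020 values.

```python
from itertools import combinations

BUDGET = 100.0          # $100 million cap, from the article
DRIVERS_PER_TEAM = 5    # matches the line ups above: 1 constructor + 5 drivers

# Hypothetical data: name -> (price in $M, predicted points).
DRIVERS = {
    "driver_a": (30, 40),
    "driver_b": (20, 28),
    "driver_c": (15, 20),
    "driver_d": (10, 15),
    "driver_e": (8, 12),
    "driver_f": (6, 7),
}
CONSTRUCTORS = {
    "constructor_x": (30, 45),
    "constructor_y": (15, 20),
}


def best_team(drivers, constructors, budget=BUDGET):
    """Return (points, price, constructor, driver names) for the best affordable team."""
    best = None
    for constructor, (c_price, c_points) in constructors.items():
        for picks in combinations(sorted(drivers), DRIVERS_PER_TEAM):
            price = c_price + sum(drivers[name][0] for name in picks)
            if price > budget:
                continue  # over the cap, not a legal team
            points = c_points + sum(drivers[name][1] for name in picks)
            if best is None or points > best[0]:
                best = (points, price, constructor, picks)
    return best


if __name__ == "__main__":
    print(best_team(DRIVERS, CONSTRUCTORS))
```

With 20 drivers and 10 constructors there are only 155,040 possible teams, so checking all of them is quick and guarantees the best answer for a given set of predictions.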

*Neural networks typically require a far bigger corpus of data to train, so this network needed a high learning rate and made use of the modelling libraries available in Keras and TensorFlow to make programming the network easier and predictions more accurate.
