How I Analyze Formula 1 Data With Python: 2021 Dutch GP Hamilton vs. Bottas

Jasper
Towards Formula 1 Analysis
5 min read · Sep 24, 2021

As a data fanatic and a Formula 1 fan, I find the amount of data coming out of a Formula 1 weekend simply amazing to play around with. How cool is it to create insights that you haven’t even seen on TV during the weekend?!

However, even though an astonishing amount of data is being generated each weekend, this data can be overwhelming and complex. This tutorial will therefore show you an example of how you can use this data to gain insights.

This tutorial assumes basic knowledge of Python, including packages like pandas and Matplotlib. However, beginners should be able to follow along!

The 2021 Dutch Grand Prix

What. A. Weekend. The first Dutch Grand Prix since 1985. Even though this race did not have too many overtakes, there still was a lot going on (and the atmosphere was amazing, I was lucky enough to be there).

A very memorable thing from this weekend was Bottas setting the fastest lap towards the end of the race, while explicitly being told by his team not to go for the fastest lap (according to Bottas, he was “just playing around”).

So, why not dive into the telemetry of both Hamilton and Bottas’ fastest laps, to see what exactly the differences were?

Comparing Hamilton and Bottas’ fastest laps

In order to do so, we will use the Fastf1 Python package. All information on how to install and get started can be found in their documentation.

Step 1: Set up the basics

So, we create a Jupyter notebook and start by loading all required packages: fastf1 for the data and matplotlib for the visualization.

import fastf1 as ff1
from fastf1 import plotting
from matplotlib import pyplot as plt
from matplotlib.pyplot import figure

With the packages imported, we need to take care of two small things. First, set up the built-in plotting functionality provided by fastf1, and second, enable caching, since large amounts of data can take a long time to load.

# Setup plotting
plotting.setup_mpl()
# Enable the cache
ff1.Cache.enable_cache('cache')
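One thing to watch out for: in the FastF1 versions I have used, enable_cache expects the folder to exist already and complains if it does not. Below is a minimal sketch for creating the folder first. Note that ensure_cache_dir is a hypothetical helper name of my own, not something fastf1 provides.

```python
from pathlib import Path


def ensure_cache_dir(path="cache"):
    """Create the cache folder if it is missing and return it as a Path.

    Hypothetical helper for this tutorial, not part of fastf1.
    """
    cache_dir = Path(path)
    # parents=True also creates missing parent folders,
    # exist_ok=True makes it safe to rerun the notebook cell
    cache_dir.mkdir(parents=True, exist_ok=True)
    return cache_dir
```

With that in place, ff1.Cache.enable_cache(str(ensure_cache_dir('cache'))) should behave the same as the call above, also on a fresh checkout.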

Step 2: Collect the data

Now that everything has been set up, we can gather the data.

The first step to take is specifying the session we’re interested in. We specify the year, the race (could be either the circuit or the race name), and the session (could range from “FP1” to “Q” and “R”).

# Load the session data
race = ff1.get_session(2021, 'Zandvoort', 'R')

After that, we can collect the race laps. Since we’re comparing telemetry, we obviously also want to load the telemetry. This can take a few seconds, which is why we enabled caching.

# Collect all race laps
# Note: newer FastF1 releases replace load_laps() with race.load() followed by race.laps
laps = race.load_laps(with_telemetry=True)

It is recommended to explore the dataset for a bit (for example by looking at its columns and first few rows), so you get a feeling for what’s going on within the data.

Now that we have an entire dataset of all laps that were completed during the race, we can zoom in a little. Since we’re only interested in the fastest laps from Hamilton and Bottas, we do the following:

# Get laps of the drivers (BOT and HAM)
laps_bot = laps.pick_driver('BOT')
laps_ham = laps.pick_driver('HAM')
# Extract the fastest laps
fastest_bot = laps_bot.pick_fastest()
fastest_ham = laps_ham.pick_fastest()

These methods are conveniently built into fastf1, but basically they just perform some selections within the DataFrame.
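To make that a bit more concrete, here is a rough sketch of the same kind of selection in plain pandas. It runs on a tiny toy table with made-up lap times, and it only approximates what the fastf1 methods do internally (that part is my assumption, not taken from the fastf1 source).

```python
import pandas as pd

# Toy stand-in for the laps table: the values are made up,
# and the real table has many more columns
laps_demo = pd.DataFrame({
    "Driver": ["BOT", "BOT", "HAM", "HAM"],
    "LapNumber": [1, 2, 1, 2],
    "LapTime": pd.to_timedelta([75.0, 74.0, 76.0, 73.0], unit="s"),
})

# Roughly what pick_driver('BOT') does: keep only the rows of one driver
demo_bot = laps_demo[laps_demo["Driver"] == "BOT"]

# Roughly what pick_fastest() does: take the row with the smallest lap time
demo_fastest_bot = demo_bot.loc[demo_bot["LapTime"].idxmin()]

print(demo_fastest_bot["LapNumber"], demo_fastest_bot["LapTime"])
```

The result of the last selection is a single row (a pandas Series), which is also why the fastest lap behaves like one record rather than a table.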

If you have a look at the variables fastest_bot and fastest_ham, you’ll notice that there is no telemetry data yet. To include this, we will have to take one more step:

# Get telemetry from fastest laps
telemetry_bot = fastest_bot.get_car_data().add_distance()
telemetry_ham = fastest_ham.get_car_data().add_distance()

From the fastest laps, we load the car data, which consists of many telemetry variables like Speed, Throttle and Brake. Another variable we want to have is Distance, since we can use this as the variable on the x-axis.
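As far as I understand it, add_distance derives the Distance column by integrating speed over time. The sketch below shows that idea in plain Python. The function name cumulative_distance and the simple step-by-step summation are my own illustration, not FastF1's actual implementation.

```python
def cumulative_distance(times_s, speeds_kmh):
    """Return the distance in metres covered up to each telemetry sample.

    times_s: sample timestamps in seconds, in increasing order
    speeds_kmh: car speed in km/h at each timestamp
    """
    if not times_s:
        return []
    distances = [0.0]  # the first sample is the starting point
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]   # seconds since previous sample
        speed_ms = speeds_kmh[i] / 3.6     # km/h -> m/s
        distances.append(distances[-1] + speed_ms * dt)
    return distances
```

For example, cumulative_distance([0.0, 1.0, 2.0], [180, 180, 180]) grows by about 50 metres per sample, since 180 km/h equals 50 m/s.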

Step 3: Plot the data

Now that we’ve got everything in place, it is time to finally visualize the data! We use the subplots functionality from Matplotlib, which allows us to have multiple plots in a single figure. This is convenient for comparing multiple variables, like speed, throttle and brake.

So, first of all, we define the subplots and set the title. Since we’re making three plots, we tell matplotlib that we will be having 3 subplots.

fig, ax = plt.subplots(3)
fig.suptitle("Fastest Race Lap Telemetry Comparison")

After that, we can create the three subplots. One for speed, one for throttle and one for brake. All subplots are stored in the ax variable. For each subplot, we create two lines: one for Bottas and one for Hamilton.

ax[0].plot(telemetry_bot['Distance'], telemetry_bot['Speed'], label='BOT')
ax[0].plot(telemetry_ham['Distance'], telemetry_ham['Speed'], label='HAM')
ax[0].set(ylabel='Speed')
ax[0].legend(loc="lower right")
ax[1].plot(telemetry_bot['Distance'], telemetry_bot['Throttle'], label='BOT')
ax[1].plot(telemetry_ham['Distance'], telemetry_ham['Throttle'], label='HAM')
ax[1].set(ylabel='Throttle')
ax[2].plot(telemetry_bot['Distance'], telemetry_bot['Brake'], label='BOT')
ax[2].plot(telemetry_ham['Distance'], telemetry_ham['Brake'], label='HAM')
ax[2].set(ylabel='Brakes')
# Hide x labels and tick labels for top plots and y ticks for right plots.
for a in ax.flat:
    a.label_outer()

plt.show()

Step 4: Analyse the data

And finally, we have a plot displaying the telemetry data of both drivers during their fastest race lap. Of course, you can play around with the dimensions of the final figure for a bit, but for now this is fine.
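If you do want to play around with the dimensions, one option is the figsize argument of plt.subplots. The sketch below uses made-up placeholder values instead of real telemetry, and the (12, 8) size is just an example I picked.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; not needed in a notebook
from matplotlib import pyplot as plt

# Made-up placeholder values standing in for the telemetry columns
distance_demo = [0, 100, 200, 300]
channels_demo = {
    "Speed": [200, 250, 120, 180],
    "Throttle": [100, 100, 0, 80],
    "Brake": [0, 0, 1, 0],
}

# figsize is (width, height) in inches; sharex=True keeps the x-axes aligned
fig_demo, axes_demo = plt.subplots(3, figsize=(12, 8), sharex=True)
for axis, (name, values) in zip(axes_demo, channels_demo.items()):
    axis.plot(distance_demo, values)
    axis.set(ylabel=name)
```

A wider figure tends to work well here, because a full lap has a lot of detail along the distance axis.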

“Valtteri, it’s James. Please abort the fastest lap attempt before the end of the lap”

Bottas was on his way to set the fastest lap, despite being told that they were not going for the fastest lap. During his attempt, Bottas was asked by James Vowles to abort his lap to make sure that Hamilton would get the extra point.

As visible in the telemetry data, Bottas only slightly backed off at the end of the lap. He still set the fastest lap at that point (raising some eyebrows here and there), but the time was not too difficult for Hamilton to beat in the end.
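If you want to put a number on "slightly backed off", one option is to compare both speed traces at the same distances. The telemetry samples of two laps do not line up exactly, so the sketch below linearly interpolates both traces onto a shared distance grid. The helper names, the 50 metre step and the plain-Python approach are my own choices for illustration.

```python
def interpolate(x, xs, ys):
    """Linearly interpolate y at position x from sample points sorted by xs."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + frac * (ys[i] - ys[i - 1])


def speed_delta(dist_a, speed_a, dist_b, speed_b, step=50):
    """Return (distance, speed_a - speed_b) pairs on a shared distance grid.

    A positive delta means driver A is faster at that point of the lap.
    """
    end = min(dist_a[-1], dist_b[-1])  # only compare where both laps have data
    grid = range(0, int(end) + 1, step)
    return [
        (d, interpolate(d, dist_a, speed_a) - interpolate(d, dist_b, speed_b))
        for d in grid
    ]
```

With the real data, you could pass in telemetry_bot['Distance'].tolist() and telemetry_bot['Speed'].tolist() (and the same for Hamilton), and then look at the last few entries to see how much speed Bottas gave up at the end of the lap.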

Interesting insight if you ask me!

Thanks for paying attention to this tutorial. Feel free to ask any questions you have and to leave feedback if you have any!
