FORMULA 1 STORIES : SAUDI ARABIA GP #2

Rajaah Dhananjey
4 min read · Apr 9, 2022

Qualifying stories : Part 2 (powered by FastF1 for Python)

Part 0 : A shout out!

I was planning to write a short tutorial on analyzing telemetry data, but when I scanned through the existing articles, I realized there is already a very recent, fully up-to-date article here.

This article by Jasper is honestly better than what I was planning to put out, so I had to change up the content. I highly recommend checking it out!

That being said, please stick around for the rest of the content!

Part 1 : The time delta problem

For this, let’s look at a comparison of Perez’s and Leclerc’s qualifying laps. We start with the same process we saw previously: loading the laps from the qualifying session.

A key nuance is that the lap data does not omit laps deleted for exceeding ‘track limits’, so we always need to verify the lap times against the official results.
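As a rough sketch of that loading step, assuming a recent FastF1 release where `fastf1.get_session` and `Session.load` are available. The helper name `load_fastest_laps` and the default driver codes are illustrative choices, not taken from the original notebook:

```python
def load_fastest_laps(year=2022, gp="Saudi Arabia", drivers=("PER", "LEC")):
    """Return {driver code: fastest lap} for a qualifying session.

    Hypothetical helper; it needs the fastf1 package and network access
    (or a populated cache) when it is called.
    """
    import fastf1  # imported lazily so this file loads without fastf1

    session = fastf1.get_session(year, gp, "Q")
    session.load()  # downloads timing and telemetry data

    fastest = {}
    for code in drivers:
        # Filter the laps table down to one driver, then take the quickest lap
        driver_laps = session.laps[session.laps["Driver"] == code]
        fastest[code] = driver_laps.pick_fastest()
    return fastest
```

Calling `load_fastest_laps()["PER"].get_car_data().add_distance()` should then give a speed trace with a distance column to plot against.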

666 data points for a single lap. Is that really enough, though?
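To put the 666 samples in perspective, here is a quick back-of-the-envelope calculation. The lap time of about 88.2 s and the circuit length of about 6,174 m are assumptions plugged in for illustration, not values read from the telemetry:

```python
n_samples = 666          # telemetry rows in the lap (from the article)
lap_time_s = 88.2        # assumed lap time in seconds
track_length_m = 6174.0  # assumed circuit length in metres

samples_per_second = n_samples / lap_time_s
metres_per_sample = track_length_m / n_samples

print(f"{samples_per_second:.1f} samples per second")
print(f"{metres_per_sample:.1f} m between samples on average")
```

Under those assumptions the car covers roughly nine metres between consecutive samples, so anything that happens within a few metres has to be interpolated.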

Now let’s say we want to use this data to calculate the time delta throughout the lap. There is a handy function for this in fastf1.utils, so let’s quickly try it out.

The dotted time-delta line does not make sense when compared with the speed delta, except up to the 1000 m mark. From 1000 m to 2000 m, Leclerc is faster through the turns, but Perez still shows a consistent top-speed advantage.

In the middle portion of the lap (from the 2500 m mark to the 4500 m mark), it is odd to see Perez losing 0.2 seconds despite his speed advantage.

Let’s compare the time delta here with the sector times to see if we can validate it.
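The idea behind such a delta calculation can be sketched in plain Python: interpolate one lap's elapsed time onto the other lap's distance samples and subtract. This is a simplified illustration with made-up sample values, not FastF1's actual implementation, and the function names `interp` and `time_delta` are hypothetical.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Linearly interpolate y at x for sorted xs, clamping at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def time_delta(ref_dist, ref_time, cmp_dist, cmp_time):
    """Comparison time minus reference time at each reference distance."""
    return [interp(d, cmp_dist, cmp_time) - t
            for d, t in zip(ref_dist, ref_time)]


# Made-up samples: the two laps are sampled at different distances.
ref_dist, ref_time = [0.0, 100.0, 200.0, 300.0], [0.0, 2.0, 4.0, 6.0]
cmp_dist, cmp_time = [0.0, 150.0, 300.0], [0.0, 3.3, 6.3]

delta = time_delta(ref_dist, ref_time, cmp_dist, cmp_time)
print([round(d, 3) for d in delta])
```

Every value in `delta` that falls between two comparison samples is an estimate, which is where the trouble described in the rest of this article starts.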

Total difference vs sector differences (Perez minus Leclerc, in seconds) = -0.025 vs 0.165 (sector 1) ; -0.024 (sector 2) ; -0.166 (sector 3)

These numbers are more in line with what we see in the speed traces: Leclerc enjoys an advantage in the twisty latter portion of sector 1, while Perez’s top speed gives him the advantage in sector 3.
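A quick sanity check on those numbers: the three sector differences should add up to the total lap difference. The snippet only re-uses the figures quoted above.

```python
# Sector differences and total difference quoted above (seconds)
sector_diffs = [0.165, -0.024, -0.166]
total_diff = -0.025

summed = sum(sector_diffs)
# Compare with a small tolerance to allow for floating-point rounding
matches = abs(summed - total_diff) < 1e-9

print(f"sum of sectors = {summed:.3f}, total = {total_diff:.3f}, match = {matches}")
```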

Part 2 : So what is the problem?

The FastF1 documentation calls it out quite clearly :

It becomes clearer once we look at how the data points are captured. Let’s plot the captured data points.

This is how Formula 1 shows the circuit, so we will replicate it with this code.

Notice how the individual “points” differ between Sergio Perez and Charles Leclerc.

As you can see from the plot, it becomes clearer why calculating the time delta throughout the lap is so hard: the telemetry for the two laps is sampled at different points on the track, so it is very tricky to “impute” the missing values for a like-for-like comparison.
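A minimal sketch of such a plot, assuming each lap is a FastF1 lap object whose `get_telemetry()` result carries `X` and `Y` position columns, and that matplotlib is installed. The helper name `plot_sample_points` is hypothetical and this is not the original notebook code:

```python
def plot_sample_points(laps_by_driver):
    """Scatter the telemetry sample positions of each lap on a track map.

    Hypothetical helper: `laps_by_driver` maps a label such as "PER" to a
    FastF1 lap object. Needs matplotlib and loaded FastF1 data when called.
    """
    import matplotlib.pyplot as plt  # lazy import, only needed for plotting

    fig, ax = plt.subplots(figsize=(8, 8))
    for label, lap in laps_by_driver.items():
        tel = lap.get_telemetry()  # assumed to include X/Y position columns
        ax.scatter(tel["X"], tel["Y"], s=4, label=label)

    ax.set_aspect("equal")  # keep the circuit shape undistorted
    ax.axis("off")
    ax.legend()
    return fig
```

Drawing individual markers rather than a connected line is the point here: it makes the position of every sample visible, so differences in where each lap was sampled stand out.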

The other issue is that the distance values are themselves computed, so there is always a margin of error. When calculating the time delta over the distance covered, these errors add up quite significantly.

However, this data is still valuable for comparing speeds throughout the lap. Speed changes fairly linearly between samples (corner speed data is available for the most part, hence ‘linear’ acceleration or deceleration in between), so little is lost by interpolating it.

In this speed chart, we can clearly see Perez’s straight-line speed advantage over Leclerc, which gives him the edge in sectors 2 and 3.
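To see how such errors accumulate, consider a toy model in which distance is obtained by summing speed times the sampling interval. The constant speed, the sampling interval and the 1 km/h measurement bias are all invented for illustration:

```python
def integrate_distance(speeds_kmh, dt_s):
    """Cumulative distance in metres from speed samples taken every dt_s seconds."""
    total = 0.0
    distances = []
    for v in speeds_kmh:
        total += v / 3.6 * dt_s  # km/h -> m/s, then times the interval
        distances.append(total)
    return distances


dt_s = 0.25                  # assumed sampling interval in seconds
n = 352                      # 352 samples * 0.25 s = 88 s of running
true_speeds = [250.0] * n    # invented constant speed in km/h
biased_speeds = [251.0] * n  # same run with a 1 km/h measurement bias

true_dist = integrate_distance(true_speeds, dt_s)[-1]
biased_dist = integrate_distance(biased_speeds, dt_s)[-1]
drift_m = biased_dist - true_dist

print(f"drift after {n * dt_s:.0f} s: {drift_m:.1f} m")
```

In this toy model the drift comes to about 24 m, which at roughly 69 m/s is around a third of a second: far larger than the gap between the two laps being compared.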

Part 3 : Closing thoughts

I still can’t get over how easy and simple it is to get this level of ‘quality’ data. Despite a few caveats, it is a very powerful tool to understand where the advantages and the disadvantages are.

Now you can monitor the speeds of the cars and tell Hamilton and the other drivers (you will get it if you watched Saudi GP quali) where their speed is lacking compared to the competition.

Go on then! Try this out for today’s Australian GP qualifying!

Rajaah Dhananjey

Data Analyst. Fascinated towards data and insightful analytics