Boost up brute force with mathematics

5 min readSep 26, 2020

Photo by ThisisEngineering RAEng on Unsplash

In this post, I’m going to discuss a little bit about brute forcing method and a special scenario where I found brute force vulnerable system. In fact, I boosted up the brute force with use of simple mathematics, so I could actually prove that the system is completely manipulable with a finite (relatively small) amount of time. I’ll briefly state everything, stay tuned.

What is Brute force?

Brute force is trying to get something correct by trial and error. This can be applied to entering to a system with a 4 number pin code. Let’s say some device unlocking requires a 4 digit pin code. Brute force into the system means, try entering all possible combinations of the 4 digits. For example: 0000, 0001, 0002, … , 9998, 9999.

In real life cases, this brute forcing may not be practical due to 2 main reasons.

Too many combinations are there to try.
In above example, there were 10⁴ (=10,000) trials. Considering average inputs needed to unlock such a system with brute force, we can say (0+10000)/2 = 5000. (The attempt can work on the very first attempt or on the last attempt). But imagine if there is a 8 character password in which a character can be one of 64 character. Then number of all attempts would be 64⁸ = 2.8147498 x 10¹⁴; which is a huge number and brute force will be impractical.
Limited number of attempts / Attempts are limited with time
You have seem this for sure. Your mobile devices let you to try the pin/password incorrectly only for some limited number of times. (say 5 times). After that number of attempts, you should wait 30 seconds. Then if you again get your pin/password incorrect 5 times, the waiting time would get increment, and so on. This way, even if for a simple 4 digit pin, you wouldn’t get to brute force in a decent amount of time. May be after some total number of incorrect attempts, the device will get into a different state from which you can unlock the device by proving your identity in a different way.

Brute force vulnerable systems

Basically, a brute force vulnerable system would have following behaviors.

Let the user know the result of his/her input after the attempt. (Not an essential vulnerability characteristic, but this is included in brute force check list)
Letting the user to try and fail repeatedly without blocking attempts or time breaks.
Having a small number of input combinations

But brute force necessarily doesn’t have to be just trial and error on all input combinations. We can use some techniques to reduce the number of attempts. Let’s call such system indirectly brute force-able systems.

Indirect brute force vulnerable systems

In this post, I’ll mention a real life system, which doesn’t display 2 of above brute force vulnerable characteristics, but yet it can be successfully brute forced in a decent amount if time. (finite number of attempts)

The vulnerable competition portal

This particular system is a prediction uploading portal for an online AI competition. The competitors should upload a .csv file in which a series of predicted numbers present. The numbers should be as close as to the expected number series. For example, let’s say the expected number series is 40, 50, 25, 72, 100. In the .csv file, competitors can upload any 5 number series. Let’s say some competitor uploaded 38, 45, 27, 75, 100. The only correct answer is 100. But the others are somewhat closer to the answer. Therefore, this competition measures the closeness to the actual number series with a measurement called MAE (Mean Absolute Error). MAE is the average of absolute difference between each expected and provided number.

Eg: MAE of above example:

MAE = (|40–38| + |50-45| + |25–27| + |72–75| + |100–100|)/5
MAE = 2.4

The competition portal only lets one user to upload 3 .csv file per day. But competitors can team up and try even more attempts per day. Let’s say 10 competitors team up; then they have 30 attempts. But the expected series can be anything, therefore one can see this cannot be brute forced at all. But that’s not correct. Since the portal calculates the MAE for every upload of a .csv file and return the MAE with a leader board ranking. This way, competitors can measure their progress and see their position on the competition. The competition duration is 3 months and the length of the predicted number series we should upload is 450.

Let’s see how we can use the legit upload of the .csv files with a little bit of mathematics to come to the very top of the leader board ;)

Hacking — Not exactly, but it’ll do the thing

Let yᵢ be the expected number for the iᵗʰ position of the number series.
Let xᵢ be the competitor uploading number for the iᵗʰ position of the number series.
Let E be the MAE of the submission, without the iᵗʰ number counted. Then,

MAE = (|yᵢ-xᵢ| + E )/n

By the nature of the competition, let’s say we know that the number is somewhere between 1–1000. Then we’ll need to upload 0 and 10,000 for a particular position and do calculations. (0 and a very large number out of the expected range, you’ll see why in next steps)

Say, for a particular position i, xᵢ = 0 and xᵢ = 10000 and other values are kept unchanged. Note that this will cost 2 .csv uploads and we can obtain 2 MAEs. Say, MAE₁ and MAE₂. Then,

|0-yᵢ| + E = n ✕ MAE₁
|10000-yᵢ| + E = n ✕ MAE₂

Note that since we carefully chose 0 and a very large number, we can simplify above equations’ absolute differences further.

yᵢ+ E = n ✕ MAE₁ — — — — (1)
10000-yᵢ+ E = n ✕ MAE₂ — — — — (2)

Now, perform (1)-(2)

=> 2yᵢ-10000 = n(MAE₁-MAE₂)

=> yᵢ = (n(MAE₁-MAE₂) + 10000) / 2

Now, we have successfully found expected iᵗʰ number in the series. We’ll have to repeat this by n times.

Since the competition has a n=450 and 3 per day upload limitations per user, let’s imagine 10 competitors team up and do this hack.

(2 ✕ 450 attempts)/ (3 ✕ 10 attempts per day)= 30 days

Yay!, we have mathematically proven that this portal enabled competition can be brute forced with only limited number of attempts and limited amount of time.

Final words…

Please do not try these kinds of hacks and tricks on real life competitions. If such exists, try to prove them mathematically; specially if you’re building some system, try to perform some mathematical operations on it and see whether you can actually break through it with the information you’re exposing.

So, this competition could have easily prevented the suggested mathematical brute force by simply not returning the MAE directly to the competitor. Rather, it could have calculated the MAE and return the ranking of the competitor/team.

Thanks for making it to the end… I hope you learnt something useful.