The Brilliance Of Quake’s Fast Inverse Square Root Algorithm
The game developer of Quake, have made the code to Quake III open source, revealing something interesting for programmers. This is the Fast Inverse Square Root algorithm, as applied in the game engine. This might seem nothing special, since most programming languages already have a function for it, but there is a purpose for this.
Inverse Square Root
The inverse square root formula is simply:
let a = 1 / Math.sqrt(x)
When you need to render the physics of lighting and reflections in a game engine, you need to use inverse square root calculations. It gives the sense of depth and shadow like what one would see graphics cards do with ambient occlusion in games.
It looks simple enough but it can be really expensive to compute for the CPU. That also means it is not very fast. When you need to perform zillions of calculations per second, it can over burden the CPU. Optimizations to inverse square root calculations were necessary.
Fast Inverse Square Root Algorithm
This is the original code snippet used in Quake III, written in the C language (id Software):
float Q_rsqrt( float number )
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration,
// this can be removed
What is going on here? Let’s break it down into 3 parts, Evil Bit Hack, WTF, and Newton Method Of Approximation.
Evil Bit Hack
First we need to store floating-point bits into integer format.
y = number;
int i = *(int*)&x;
We need to put a single-precision float stored in memory as a 32-bit number. The number is then reinterpreted as an integer, where the most significant bit stores the sign.
i = * ( long * ) &y;
Now we take a single bit shift to the right, which inserts a 0 as the most significant bit and dropping the least significant bit. This is what divides the integer interpretation of the number by 2.
If the same number were to be interpreted as a float, the change is much more far-reaching compared to division by 2. The bitwise shift negates the exponent and divides it by 2. This shifts a bit from the exponent into the mantissa, and drops its least significant bit.
When shifting to the right by one, there are 2 possible cases: either the number’s exponent is odd or even. The hex number (aka magic number), 0x5f3759df, was chosen to minimize the error in the 2 cases.
i = 0x5f3759df - ( i >> 1 );
The idea here is that whether the bit-shifted number is an odd or even exponent, when you subtract the number from the hex number, it yields an estimate of the inverse square root of the original number.
Returning to the code, we can reinterpret the result as a float.
y = * ( float * ) &i;
Since the value of i is stored in memory as an integer, it must be converted to floating-point in a mantissa-exponent form. This is what the function above performs.
Newton Method Of Approximation
An initial estimate of the inverse root is made using Newton’s Method technique called Newton-Raphson Iteration. It then goes through a single iteration process to return the result.
y = y * ( threehalfs — ( x2 * y * y ) );
This is used to calculate a more accurate approximation of the inverse square root. The result is accurate to a maximum error of 0.175%.
Notice that it uses no actual division operation. This applies Newton’s method for the function f(y) = 1/y² — x. In summary, the method was able to determine the inverse square root by using subtraction, bit-shifting, and multiplication operations only.
How Was The Magic Number Chosen?
The big question here is how was the magic number chosen for this algorithm, which is probably why the code comment was what it was. This has something to do with how floating-point numbers and integers are stored by a computer. This is best explained in the paper by Chris Lomont.
When it comes to normalizing vectors in graphics processing, the calculation of a square root and a floating-point division is required. Both operations are costly to the CPU. What Quake developers did was to create an algorithm to optimize their game engine which is able to compute fast inverse square root using simpler operations not requiring a powerful GPU.
This method of calculating the inverse square root seems outdated by today’s standards. Nonetheless, it offers an optimization technique that was necessary at one time and a revelation of “tricks” only developers knew. This can be put in the computing archives for its ingenious hack.