The Legendary Fast Inverse Square Root
In the 90s, 3D video games were in their infancy, and the ability to render 3D graphics was constrained by hardware. This led programmers to devise creative ways to get around hardware limitations. One such innovation, known as the Fast Inverse Square Root function, was introduced in id Software’s First Person Shooter from 1999, Quake III Arena.
Why inverse square roots?
Inverse square roots are useful in video game graphics and particularly in 3D game engines. Pathfinding, lighting, reflections, and many other game programming techniques require vector normalization, which involves an inverse square root operation. Inverse square roots, which rely on floating point division, are expensive for processors. In a fast paced, graphically intense game like Quake III Arena, these calculations are made millions of times every second, so even a minor improvement in the performance of such a calculation could significantly improve graphics computation speed and ultimately the game’s frame rate. So how did programmers of the id Tech 3 engine get around the expensive inverse square root function? They implemented a very accurate, very fast approximation.
Here it is, in all it’s glory: the fast inverse square root source code from Quake III Arena! The C preprocessor directives have been removed, but the code and comments are exactly as they appeared in the Quake engine.
float Q_rsqrt( float number )
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration,
// this can be removed
Okay… so that’s ridiculous. Nothing short of voodoo black magic, but let me try to explain. The overall idea of this approximation is essentially a Newton-Raphson iteration. The Newton-Raphson root finding method works like this: given an approximation of a numbers root, c, a better approximation is found using
y = c/2 * (3 — xc²) . The fast inverse square root function implements a single Newton-Raphson iteration starting with a really clever first guess.
Let’s take a closer look at how it works:
1. Get the bits of the floating point number
y = number;
In C, a single-precision float is stored in memory as a 32 bit number. Bit 31 represents the sign, bits 30 - 23 represent an 8 bit exponent, and bits 22 - 0 represent a 23 bit mantissa — a binary number with the decimal point to the left.
2. Reinterpret the number as an integer
i = * ( long * ) &y;
The same 32 bit number can also be interpreted as an integer, where the most significant bit stores the sign.
3. Bitwise shift the 32 bit number to the right by one
( i >> 1 );
A single bit shift to the right inserts a 0 as the most significant bit and drops the least significant bit of a number. This effectively divides the integer interpretation of the number by 2. However if the same number were to be interpreted as a float, the change is much more drastic than division by 2. The bitwise shift negates the exponent and divides it by 2, shifts a bit from the exponent into the mantissa, and drops the least significant bit of the mantissa.
3. Subtract the bit-shifted number from 0x5f3759df
i = 0x5f3759df — ( i >> 1 );
Chris Lomont explains in his paper that the initial guess used in Quake III Arena’s fast inverse square “is piecewise linear, and the function being approximated is not the same in all cases.” When shifting the number’s bits to the right by one, there are two possible cases: either the number’s exponent is even or it is odd. The magic hex number, 0x5f3759df, was chosen such that it minimizes the error between the two cases. Thus, whether the bit-shifted number has an odd or even exponent, subtracting the number from the magic number yields an accurate approximation of the inverse square root of the original number (once the result,
i , has been reinterpreted as a float).
4. Reinterpret the result as a float
y = * ( float * ) &i;
At this point,
i is stored in memory as an integer. The function reinterprets
i and stores the result in memory as a floating point number. This number is a relatively accurate initial guess of the inverse square root of the input value.
5. Perform a Newton-Raphson iteration
y = y * ( threehalfs — ( x2 * y * y ) );
A single Newton-Raphson iteration is performed to calculate a more accurate approximation of the inverse square root of the input. The result of the Newton-Raphson iteration is the return value of the function. The result is extremely accurate with a maximum error of 0.175%.
The real genius of this code, is the line:
i = 0x5f3759df - ( i >> 1 );
As I said before, voodoo black magic. Where did it come from? Who on Earth conjured such sorcery?
Origin of the Fast Inverse Square Root
The Fast Inverse Square Root has an interesting history. The original author was not revealed until 2006, 7 years after the game’s release, and around 20 years after the code was originally written. Rys Sommerfeldt, Senior Manager of the European Game Engineering Team at AMD RTG, launched an investigation into the function’s origins in 2004. He began his search with John Carmack, lead programmer of Quake III, who denied authorship and suggested Terje Mathisen, a master of x86 assembly programming who helped with optimization on the Quake III engine.
Sommerfeldt reached out to Mathisen, who replied that he had, in fact, written an inverse square root function some 5+ years earlier, but that the Quake fast inverse square root function was not the one that he wrote. He suggested that it looked like something that might have come from MIT’s HAKMEM documents.
More investigative work from Sommerfeldt led him to a contact at NVIDIA who suggested he ask Gary Tarolli, an employee at NVIDIA and a founding member of 3dfx. Sommerfeldt emailed Gary, who responded that the function definitely passed by his keyboard many times. Gary said that he had simulated different hex constants in his work on the IRIS Indigo workstation computer, but that he was not the original author. With no other leads, the investigation lay dormant for about a year.
As the article that Sommerfeldt wrote gained publicity, it finally reached the eyes of the original author of the Fast Inverse Square Root function, Greg Walsh! *thunderous applause* Greg Walsh is a monument in the world of computing. He helped engineer the first WYSIWYG (“what you see is what you get”) word processor at Xerox PARC and helped found Ardent Computer. Greg worked closely with Cleve Moler, author of Matlab, while at Ardent and it was Cleve who Greg called the inspiration for the Fast Inverse Square Root function.
The Fast Inverse Square Root function is a beautiful bit of code. I am both inspired and mesmerized by it. The world has Greg Walsh, the artist, and Cleve Moler, the muse, to thank for bringing such a brilliant work of art to creation.