Fast inverse square root in Go (and JavaScript) for fun

Adrien
4 min readFeb 3, 2020

--

Implementing the fast inverse square root algorithm in Go (& JS)

The fast inverse square root is an algorithm that takes a few shortcuts to quickly calculate an estimate of 1 ⁄ √x (the inverse square root).

It takes advantage of the IEEE 754 standard representation of floating point numbers in memory to avoid division and iteration to reach an answer, instead using simple bit manipulation, subtraction and a magic number.

IEEE 754 32 bit float

Why?

The inverse square root is used widely in vector math and computer graphics to calculate normals and angles, for lighting, shading and plenty of others. Calculating the inverse square root could be expensive, the fast inverse square root technique is a reasonably accurate, cheap alternative.

The technique was made famous by John Carmack’s use of it in Quake III Arena, but it had been bouncing around the computer graphics industry for a few years before Q3A.

Ok, on with it

0x5F3759DF was the original magic number for 32 bit floats, apparently discovered by trial and error, and refined to 0x5F375A86 by Chris Lomont.

0x5FE6EB50C7B537A9 is the magic number for 64 bit floats, discovered later by Matthew Robertson.

The 32 bit Algorithm

import "math"
const magic32 = 0x5F375A86
func FastInvSqrt32(n float32) float32 {
// If n is negative return NaN
if n < 0 {
return float32(math.NaN())
}
// n2 and th are for one iteration of Newton's method later
n2, th := n*0.5, float32(1.5)
// Use math.Float32bits to represent the float32, n, as
// an uint32 without modification.
b := math.Float32bits(n)
// Use the new uint32 view of the float32 to shift the bits
// of the float32 1 to the right, chopping off 1 bit from
// the fraction part of the float32.
b = magic32 - (b >> 1)
// Use math.Float32frombits to convert the uint32 bits back
// into their float32 representation, again no actual change
// in the bits, just a change in how we treat them in memory.
// f is now our answer of 1 / sqrt(n)
f := math.Float32frombits(b)
// Perform one iteration of Newton's method on f to improve
// accuracy
f *= th - (n2 * f * f)

// And return our fast inverse square root result
return f
}

https://github.com/arccoza/go-fastinvsqrt

The 64 bit Algorithm

import "math"
const magic64 = 0x5FE6EB50C7B537A9
func FastInvSqrt64(n float64) float64 {
if n < 0 {
return math.NaN()
}
n2, th := n*0.5, float64(1.5)
b := math.Float64bits(n)
b = magic64 - (b >> 1)
f := math.Float64frombits(b)
f *= th - (n2 * f * f)
return f
}

Performance?

Here are some numbers vs Go’s math.Sqrt built-in wrapped in the function invSqrt64:

goos: linux
goarch: amd64
pkg: github.com/arccoza/go-fastinvsqrt
#Go Native
BenchmarkInvSqrt64_1-4 149562032 7.81 ns/op
BenchmarkInvSqrt64_2-4 153053457 7.77 ns/op
BenchmarkInvSqrt64_3-4 150307755 7.82 ns/op
BenchmarkInvSqrt64_4-4 153308527 7.80 ns/op
BenchmarkInvSqrt64_5-4 199944013 5.96 ns/op
# Fast inverse square root 32 bit
BenchmarkFastInvSqrt32_1-4 694597858 1.77 ns/op
BenchmarkFastInvSqrt32_2-4 634092076 1.69 ns/op
BenchmarkFastInvSqrt32_3-4 706532166 1.69 ns/op
BenchmarkFastInvSqrt32_4-4 700042736 1.70 ns/op
BenchmarkFastInvSqrt32_5-4 698131202 1.79 ns/op
# Fast inverse square root 64 bit
BenchmarkFastInvSqrt64_1-4 447052143 2.39 ns/op
BenchmarkFastInvSqrt64_2-4 506831372 2.26 ns/op
BenchmarkFastInvSqrt64_3-4 524493175 2.26 ns/op
BenchmarkFastInvSqrt64_4-4 525436268 2.47 ns/op
BenchmarkFastInvSqrt64_5-4 470687276 2.28 ns/op
PASS
ok github.com/arccoza/go-fastinvsqrt 23.582s

Not bad 👍.

Again, in JavaScript

Why not? So JavaScript being a dynamic language has no typed values, so the only way to manipulate a piece of memory first as a float then an int then a float again is to use Typed Arrays.

Actually we’re using ArrayBuffer and DataView directly instead, so we can access the same bytes as different types.

const m = 0x5F375A86,
// Creating the buffer and view outside the function
// for performance, but this is not thread safe.
buffer = new ArrayBuffer(4),
view = new DataView(buffer)
function fastInvSqrt (n) {
var f, n2 = n * 0.5, th = 1.5
view.setFloat32(0, n)
view.setUint32(0, m - (view.getUint32(0) >> 1))
f = view.getFloat32(0)
f *= (th - (n2 * f * f))
f *= (th - (n2 * f * f))
return f
}

https://github.com/arccoza/js-fastinvsqrt

And performance is.. horrible 💩. No real surprise, oh well, you can check for yourself here: https://jsbench.me/vjk61wfui5/1

--

--