In For Want Of A More Limited Computer, I mentioned the Fast Inverse Square Root algorithm. I came across it in one of the most famous applications to use it, Quake III Arena, as I'm sure many have. While the algorithm may not have been created for Quake III Arena, the game certainly gave it quite the spotlight.
But what is it? What does it do? Where did it come from? Where did it go? Cotton Eye Joe?
What Is An Inverse Square Root?
Well, it's simple!
\(\frac{1}{\sqrt{x}}\)
Moving on!
Alright, just kidding. Here's what you need to know. A reciprocal, or multiplicative inverse, of a number \(x\) is a number that can be multiplied by \(x\) to get the number 1. 1 is what's called the "multiplicative identity", which is just a fancy way of saying you can multiply any number by 1 without changing it.
An inverse square root, then, is just the reciprocal of a square root: the number that can be multiplied by the square root of \(x\) to get 1.
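For example, take \(x = 4\): its square root is \(2\), and \(\frac{1}{\sqrt{4}} = \frac{1}{2}\), since \(2 \times \frac{1}{2} = 1\).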
What Can You Do With One?
A common use of the inverse square root of a number is normalizing a vector. A vector is a quantity that has both a direction and a magnitude, and normalizing a vector means making a vector with the same direction, but a magnitude of 1. A vector with a magnitude of 1 is called a unit vector, or a normalized vector.
Okay… cool, so what?
A normalized vector makes a lot of calculations much simpler to do, often ones relating to the angles between vectors. And computers do a lot of calculations about what angles things are at relative to other things, for example with lighting: angles of incidence and reflection are all computed from the angles between vectors, as the sketch below shows.
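Here's a minimal sketch of both ideas (Vec3, normalize, and cos_angle are just illustrative names, not anything from Quake): normalize scales each component by the inverse square root of the squared length, and once two vectors are unit length, their dot product is directly the cosine of the angle between them, no division required.

#include <math.h>

typedef struct { float x, y, z; } Vec3;

/* Same direction, magnitude 1: multiply each component by the
   inverse square root of the squared length. */
Vec3 normalize(Vec3 v) {
    float inv_len = 1.0f / sqrtf(v.x * v.x + v.y * v.y + v.z * v.z);
    Vec3 n = { v.x * inv_len, v.y * inv_len, v.z * inv_len };
    return n;
}

/* For unit vectors a and b, the dot product is cos(angle between them). */
float cos_angle(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Every normalize costs one inverse square root, and a 3D scene needs a lot of them, which is why computing it quickly matters.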
How Do We Get One… Fast?
Here's where we get to the cool computer part. Strap yourself in for that famous snippet of code!
float Q_rsqrt(float number)
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y = number;
    i = * ( long * ) &y;                        // evil floating point bit level hacking
    i = 0x5f3759df - ( i >> 1 );                // what the fuck?
    y = * ( float * ) &i;
    y = y * ( threehalfs - ( x2 * y * y ) );    // 1st iteration
//  y = y * ( threehalfs - ( x2 * y * y ) );    // 2nd iteration, this can be removed

    return y;
}
"What the fuck", indeed. So what's going on here? Let's go through it, step by step.
The first three lines:
long i;
float x2, y;
const float threehalfs = 1.5F;
Are fairly normal. We declare a long integer, i, and two floating point numbers, x2 and y, as well as a constant, threehalfs, holding the value 1.5 for later. Let's see what values we give them next:
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y;    // evil floating point bit level hacking
Here we see x2 is called that because it's "x over two", or half of x. y is just the passed-in number. Why copy it? The next line tells us: we take the address of y, treat it as a pointer to a long integer, and read through it, so i now holds the raw bits of the float. Yep, this is where the evil floating point bit level hacking starts. By doing this, we can then:
i = 0x5f3759df - ( i >> 1 );    // what the fuck?
y = * ( float * ) &i;
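(A quick aside before we dig into that magic: in modern C, casting a float pointer to a long pointer like this is technically undefined behaviour under strict aliasing rules, and long isn't even 32 bits on every platform. If you were writing this today, a memcpy does the same bit-reinterpretation safely; a minimal sketch, where float_bits is just an illustrative name:)

#include <stdint.h>
#include <string.h>

/* Copy the raw IEEE-754 bits of a float into a 32-bit integer,
   without any pointer casting. Compilers optimize the memcpy away. */
static uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}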
Here's where the main bit of magic comes into play. But first:
What Even Is A Floating Point?
A floating point number lets you represent real numbers using two integers: one of fixed precision called the significand (or sometimes the mantissa), and another called the exponent. The value is the significand multiplied by a fixed base raised to the exponent.
Let's take a look:
\(123 \times 10^4\)
This is a number in scientific notation. Here, 123 is the significand, 10 is the fixed base, and 4 is the exponent. You can think of a floating point number like a number in scientific notation.
There are a number of ways floating point numbers can be represented in memory, but a common standard, and the one we're concerned with at the moment, is IEEE-754 32-bit single precision. Here's what our example number for the rest of this article, 1,230,000, looks like in that format:
Sign bit
|  Exponent (2^20, encoded as 147)
|  |        Significand (1 + 0.17..., encoded as 1451392)
|  |        |
0 10010011 00101100010010110000000
So the first bit says whether the value is positive or negative. The next 8 bits are an exponent, in base 2, stored with a bias: you subtract 127 from the encoded value to get the actual exponent, so the 147 here means \(2^{147-127} = 2^{20}\). The remaining 23 bits are the significand. This is a value between 0 and 1, with 1 added to it implicitly.
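To make that concrete, here's a minimal sketch that pulls the three fields out of our example number (the variable names are just illustrative):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    float f = 1230000.0f;    /* our example number */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    uint32_t sign     = bits >> 31;            /* 1 bit: 0 means positive    */
    uint32_t exponent = (bits >> 23) & 0xFF;   /* 8 bits, biased by 127      */
    uint32_t fraction = bits & 0x7FFFFF;       /* 23 bits of the significand */

    /* Prints: sign=0 exponent=147 (2^20) fraction=1451392 */
    printf("sign=%u exponent=%u (2^%d) fraction=%u\n",
           sign, exponent, (int)exponent - 127, fraction);
    return 0;
}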
That's the general idea, but you can play around with the following calculator to get a better feel for it yourself; the page also contains more specifics and information about the standard: IEEE-754 Floating Point Converter
Yeah, But What's That Weird Number?
Good question! That strange number, 0x5F3759DF, is used to get an estimate of the inverse square root from the number. Let's break it down by writing our example number and the magic constant in binary.
0101 1111 0011 0111 0101 1001 1101 1111 - 0x5F3759DF
0100 1001 1001 0110 0010 0101 1000 0000 - i
If you look at the code from earlier:
i = 0x5f3759df - ( i >> 1 );
You'll see that it first shifts i right by 1. This divides the value as an integer by 2. Here's what that makes it look like:
0101 1111 0011 0111 0101 1001 1101 1111 - 0x5F3759DF
0010 0100 1100 1011 0001 0010 1100 0000 - i >> 1
Then i is subtracted from the constant, resulting in:
0101 1111 0011 0111 0101 1001 1101 1111 - 0x5F3759DF
0010 0100 1100 1011 0001 0010 1100 0000 - i >> 1
---------------------------------------
0011 1010 0110 1100 0100 0111 0001 1111
When we read that as a float again, we get 0.00090132834, which is rather close to the true value of 0.000901669634667! We have successfully made our first approximation.
What actually happened here? If you want the inverse square root of a number in scientific notation, you can simply divide the exponent by -2. Remember how the exponent is stored in those eight bits? When we shifted the bits, the exponent got divided by two, avoiding any expensive floating point division, and that's what makes this algorithm fast! The reason we subtract from the magic number is that the constant has a bunch of useful properties: it (roughly) fixes up the significand, negates the exponent, and handles both odd and even exponents. For more information, this paper explains how it does this! I will attempt to explain it myself in another article, "Why I Love: Fast Inverse Square Root, The Mathematics", to be released in the future.
So, by right shifting, we divide the exponent by two, and by subtracting from the magic number, we fix up our value to get our first approximation!
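As a rough sketch of why that works (hand-waving that the future article will firm up): a float's bits, read as an integer, approximate a scaled and shifted \(\log_2\) of its value. Taking \(\log_2\) of both sides of \(y = \frac{1}{\sqrt{x}}\) and translating into the integer representations \(I_x\) and \(I_y\) gives:

\[ \log_2 y = -\frac{1}{2}\log_2 x \quad\Longrightarrow\quad I_y \approx \frac{3}{2} \cdot 2^{23} \cdot (127 - \mu) - \frac{1}{2} I_x \]

where \(\mu \approx 0.045\) is a small correction term. The first term works out to almost exactly 0x5F3759DF, and \(\frac{1}{2} I_x\) is our right shift.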
What's The Last Line Doing?
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
Oh, this one? This little line performs one iteration of Newton's method. What is that? It's a mathematical technique for turning an approximation into a better one. I'll also cover this in "Why I Love: Fast Inverse Square Root, The Mathematics".
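For the curious, here's the one-line version of where that expression comes from: apply Newton's method to \(f(y) = \frac{1}{y^2} - x\), whose positive root is exactly \(y = \frac{1}{\sqrt{x}}\):

\[ y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)} = y_n \left( \frac{3}{2} - \frac{x}{2} y_n^2 \right) \]

With x2 holding \(\frac{x}{2}\), that's exactly y = y * ( threehalfs - ( x2 * y * y ) ).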
Where Did It Come From?
When this code was first being discussed online as part of the Quake III Arena source code, many speculated that it had been created by John Carmack. This isn't right, however; John Carmack has repeatedly stated that he didn't write it. I must personally thank him, though, for being part of the push to release the engine's source code under a free (as in freedom) license. It pushed me to start programming.
So where did it come from? A couple of places, it turns out! In May of 1986, a mathematician and computer scientist named William Kahan, along with K. C. Ng, wrote an unpublished paper that described the basic bit fiddling that made the algorithm possible.
Later in the 1980s, Cleve Moler, working at Ardent Computer, found out about this and told a coworker, Greg Walsh. Greg Walsh was the one who derived the constant and developed the algorithm. Gary Tarolli, a consultant for Kubota (a company funding Ardent Computer), brought the algorithm from there to 3dfx Interactive around 1994.
Jim Blinn then demonstrated the technique in a 1997 column for IEEE Computer Graphics and Applications.
So it's a very neat little trick that was bounced around a lot of people!
Where Did It Go?
So, is this little algorithm still in use? No, not really. Most hardware nowadays includes dedicated instructions that calculate the same thing (x86's SSE added an rsqrtss instruction back in 1999), which are far faster than this trick, and floating point operations in general have gotten much faster too.
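For illustration, here's a hedged sketch of what you'd reach for today instead; the second option assumes x86 with SSE, since the intrinsics don't exist elsewhere:

#include <math.h>
#include <xmmintrin.h>   /* SSE intrinsics; x86/x86-64 only */

/* Option 1: a plain divide. Modern FPUs make this fast and fully accurate. */
float rsqrt_exact(float x) {
    return 1.0f / sqrtf(x);
}

/* Option 2: the hardware's own fast approximation, via the SSE rsqrtss
   instruction (accurate to roughly 12 bits). */
float rsqrt_fast(float x) {
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
}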
Why Do You Love It?
This is an algorithm that lies so firmly between arcane programming techniques and obscure mathematics that the first time I saw it I decided to become a programmer to understand it. While I'm still wrapping my head around the mathematics as a result of my byte-oriented brain, I'm finally able to understand what it's even doing. It's a marker of my own growth in knowledge. I don't plan to stop here though.
Plus, even now, it's motivating me to fix my lack of understanding of the mathematics so that I can hopefully share that knowledge with you, the reader of this article!
So while it's an interesting algorithm, and I love it to bits, it's very rarely useful these days, other than, say, getting nerds into the world of computer science through sheer intrigue.