Similar trick to that used in the fast inverse square root routine[1] popularize...

fph · on Dec 31, 2024

For some more explanation: the main idea behind both tricks is that the IEEE floating point formats are designed so that the most significant bits represent its exponent, that is, floor(log2(x)). Hence reinterpret-casting a float x to an unsigned integer uint(x) approximates a multiple of log2(x). So these kinds of approximations work like logarithm arithmetic: float(C + a*uint(x)) approximates x^a, for a suitable constant C. Quake's invsqrt is a=-1/2, this post is a=-1.

More detail on https://en.wikipedia.org/wiki/Fast_inverse_square_root#Alias... .

IEEE754 floats are a very well-designed binary format, and the fact that these approximations are possible is part of this design; indeed, the first known instance of this trick is for a=1/2 by W. Kahan, the main designer of IEE754.

mbitsnbites · on Dec 31, 2024

Yep. Along the same lines. This one is even simpler, though, as it requires only a single integer CPU instruction (and the simplest of all instructions too).

If you want full precision, you need to do three Newton-Raphson iterations after the initial approximation. One iteration is:

    y = y * (2.0F - x * y);

magicalhippo · on Dec 31, 2024

It's a neat trick, and could be very useful on microcontrollers which doesn't have hardware division but does have hardware multiplication.

actionfromafar · on Dec 31, 2024

Hm, like 68000?

pechay · on Dec 31, 2024

The 68k has both signed and unsigned 32/16 bit divide instructions.