On my machine (i7-3770), replacing 1/sqrtf with the famous Quake 3 fast inverse square root hack (http://en.wikipedia.org/wiki/Fast_inverse_square_root) is faster but not accurate enough with one iteration (the image is very distorted), and slower but accurate with two iterations.
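For reference, here is a minimal sketch of the hack with a selectable number of Newton-Raphson refinement steps. It is not the exact code I benchmarked: the function name `fast_rsqrt` and the `iterations` parameter are made up for illustration, and `memcpy` is used in place of the original pointer cast to avoid undefined behaviour. The magic constant is the one given on the Wikipedia page.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive, finite x.
 * `iterations` is a hypothetical knob: 1 matches the classic
 * Quake 3 version, 2 adds a second refinement step. */
static float fast_rsqrt(float x, int iterations)
{
    uint32_t bits;
    float y;

    /* Reinterpret the float's bit pattern as an integer. */
    memcpy(&bits, &x, sizeof bits);

    /* Initial guess from the well-known magic constant. */
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&y, &bits, sizeof y);

    /* Each Newton-Raphson step roughly squares the relative error. */
    for (int i = 0; i < iterations; ++i)
        y = y * (1.5f - 0.5f * x * y * y);

    return y;
}
```

Calling `fast_rsqrt(x, 1)` versus `fast_rsqrt(x, 2)` in place of `1/sqrtf(x)` is the comparison described above; whether either beats the plain version will probably depend on the compiler, flags, and CPU.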