On ARM there won't be any branches: the sequences are short enough that conditional execution can be used. And the deeply pipelined processors (x86 and such) have fast multiplies.
This looks especially handy for embedded processors. I was planning to use a lookup table and linear interpolation in an upcoming project to approximate log on an AVR, but I think I'll go with this instead.
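For comparison, here is roughly what the lookup-table-plus-linear-interpolation approach could look like. This is only a sketch: the function name `log2_q8`, the 17-entry table and the Q8 fixed-point output (result times 256) are all assumptions picked for illustration, not anything from the linked code.

```c
#include <stdint.h>

/* Hypothetical table: round(256 * log2(1 + i/16)) for i = 0..16. */
static const uint16_t log2_tab[17] = {
      0,  22,  44,  63,  82, 100, 118, 134, 150,
    165, 179, 193, 207, 220, 232, 244, 256
};

/* Approximate 256 * log2(x) for a nonzero 16-bit x (assumed Q8 output). */
uint16_t log2_q8(uint16_t x)
{
    uint8_t exp = 15;

    /* Normalize: shift left until the top bit is set, tracking the exponent. */
    while (!(x & 0x8000u)) {
        x <<= 1;
        exp--;
    }

    /* The 15 bits below the top bit are the fraction of the mantissa. */
    uint8_t  idx  = (x >> 11) & 0x0F;   /* top 4 fraction bits pick the table entry */
    uint16_t frac = x & 0x07FF;         /* low 11 bits interpolate between entries */
    uint16_t lo   = log2_tab[idx];
    uint16_t hi   = log2_tab[idx + 1];

    return (uint16_t)(exp << 8) + lo
         + (uint16_t)(((uint32_t)(hi - lo) * frac) >> 11);
}
```

Note that this version needs both a normalization loop and a multiply for the interpolation, which is the cost you would be weighing against the other approach on an AVR.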
For loops and similar situations are where the comma operator really comes in handy - although you really need to watch out for the rule against modifying the same variable more than once between sequence points, especially as many compilers will only warn you about it at the highest warning level - or not at all.
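A small sketch of the for-loop case, with a made-up helper called `reverse_bytes` (the name and the example itself are just for illustration). The `i++, j--` in the loop header is the comma operator; the `i = 0, j = len - 1` part is an ordinary declaration with two declarators, not the operator.

```c
#include <stddef.h>

/* Hypothetical helper: reverse a byte buffer in place using two indices. */
void reverse_bytes(unsigned char *buf, size_t len)
{
    if (len < 2)
        return;   /* nothing to swap, and avoids len - 1 wrapping for len == 0 */

    /* The comma operator steps both indices in the third clause. */
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        unsigned char tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
}
```

The comma operator is itself a sequence point, so `i++, j--` is fine. The undefined cases are things like `i = i++` or `buf[i] = i++`, where the same variable is modified and used within a single sub-expression.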
Well, if you haven't got multiplication, your CPU is probably not pipelined, either. Depending on the platform, you may, however, be able to find certain non-branching conditional instructions that help with this sort of algorithm.
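To give an idea of what "non-branching" can look like at the source level, here is the well-known bit-twiddling way to get the integer part of log2 with comparisons and shifts instead of if/else. The name `ilog2_u32` is made up for this sketch, and it is not taken from the linked commit. Whether the generated machine code is actually branch-free depends on the compiler and target: the comparisons usually map to set-on-condition or conditionally executed instructions where those exist.

```c
#include <stdint.h>

/* Hypothetical helper: floor(log2(x)) for x != 0, no if/else in the source. */
unsigned ilog2_u32(uint32_t x)
{
    unsigned r, s;

    /* Each comparison yields 0 or 1, which is scaled into a shift amount. */
    r = (unsigned)(x > 0xFFFFu) << 4; x >>= r;
    s = (unsigned)(x > 0xFFu)   << 3; x >>= s; r |= s;
    s = (unsigned)(x > 0xFu)    << 2; x >>= s; r |= s;
    s = (unsigned)(x > 0x3u)    << 1; x >>= s; r |= s;

    /* x is now 1, 2 or 3; its top bit supplies the last bit of the result. */
    return r | (unsigned)(x >> 1);
}
```

On a CPU without a barrel shifter the variable shifts may themselves turn into loops, so this is more a demonstration of the idea than something to drop in as-is.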
See http://git.videolan.org/?p=x264.git;a=commitdiff;h=549cc55b5... .