2016-08-26 3:43 GMT+02:00 MarLinn via Haskell-Cafe <haskell-cafe@haskell.org>:
[...]  Divisions are expensive, too.

Yes, considered in isolation they are, but things look quite different when you take modern CPU architectures and the program as a whole into account: With deeply pipelined architectures, out-of-order execution, register renaming etc., the high cost of the division itself is often hidden, e.g. because its result is not needed immediately. The other observation is that the cunning bit-fiddling algorithms of the past often have high hidden costs: they need more registers, have strong value dependencies (leading to pipeline stalls), interact badly with branch prediction etc. As a result, it is not uncommon that the most straightforward code is also the fastest. In the end you have to measure... (on various platforms)
 
(interesting fact: Java uses a seven-fingered merge sort with bit-shift-trickery to approximate a division by seven because the division is so damn expensive) [...]

This is a perfect example of a bad algorithm on current CPUs: An integer division by a constant can and should be replaced by an integer multiplication plus some tiny shift/add correction, see e.g. section "8.1 Replacing division with multiplication" in AMD's optimization guide (http://support.amd.com/TechDocs/25112.PDF) or "Computing Magic Numbers" in Hacker's Delight (http://www.hackersdelight.org/). Looking at e.g. the Skylake table in http://www.agner.org/optimize/instruction_tables.pdf (the CPU performance "bible" ;-), you can see how blazingly fast such a multiplication is compared to an actual division.

In a nutshell: We don't program on our grandpa's CPUs anymore :-D (at least not on the desktop/server), so a lot of traditional advice from the past is misleading nowadays.