
On 29 Oct 2008, at 8:31 am, Andrew Coppin wrote:
Hi guys.
This isn't specifically to do with Haskell, but... does anybody have any idea roughly how fast various CPU operations are?
For example, is integer arithmetic faster or slower than floating-point? Is addition faster or slower than multiplication? How much slower are the trigonometric functions? etc. Does using 8-bit integers make arithmetic any faster than using wider values?
Does anybody have a good resource for this kind of information?
lmbench3
http://sourceforge.net/projects/lmbench

Building and running it will give you some answers for your machine.

It's hard to answer your questions as they are posed, because of the difference between throughput and latency. For example, you might be able to start a new multiplication on every cycle on some machine (throughput is 1 multiply per cycle) but the result of any one of them might not be available for twelve cycles (latency is 12 cycles per multiply).

Rough guesses: integer adds, subtracts, and compares are fast; integer multiplies and divides are much slower, slow enough that compilers go to some trouble to do something else when multiplying or dividing by a constant.

The speed of the trig functions depends on how much hardware support you have and on whether your library writer favoured speed or accuracy (especially accuracy over a wide range). I don't think lmbench measures these, but it wouldn't be hard to add something suitable.

Using 8-bit integers is unlikely to make your program *directly* faster unless you are using a compiler which is smart enough to exploit SIMD instructions without programmer-written hints. The Intel C compiler _is_ that smart. In my code I have found it to be extremely good at vectorising trivial loops, with no actual effect on the run-time of my programs. However, code written with the intention of exploiting that would be different.

The one thing that lmbench3 will tell you that will drop your jaw and widen your eyes is the bit at the end where it tells you about memory speed. Main memory is exceedingly slow compared with cache. This is why I said that switching over to 8-bit integers might not make your program *directly* faster: if you have a lot of data which you can pack tightly, so that as 32-bit words it would not fit into L2 cache but as bytes it does, you may well get a very useful speedup from that. Or you may not. There is no substitute for measurement.
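
To make the packing argument concrete, here is a rough back-of-the-envelope sketch in Python (not part of lmbench). It only does the size arithmetic: how many bytes a data set occupies as 32-bit words versus as bytes, compared against a cache size. The 2 MiB figure and the names `ASSUMED_L2_BYTES`, `working_set_bytes`, and `fits_in_cache` are made up for illustration; substitute the real cache size of your machine, and bear in mind that the cache is shared with everything else your program touches.

```python
import ctypes

# Hypothetical L2 cache size, for illustration only; check your own machine.
ASSUMED_L2_BYTES = 2 * 1024 * 1024


def working_set_bytes(n_elements, ctype):
    """Bytes needed to store n_elements values of a fixed-width C type."""
    return n_elements * ctypes.sizeof(ctype)


def fits_in_cache(n_elements, ctype, cache_bytes=ASSUMED_L2_BYTES):
    """True if the packed data is no larger than the assumed cache size."""
    return working_set_bytes(n_elements, ctype) <= cache_bytes


if __name__ == "__main__":
    n = 1_000_000  # number of data items, chosen arbitrarily
    for name, ctype in (("32-bit", ctypes.c_int32), ("8-bit", ctypes.c_int8)):
        size = working_set_bytes(n, ctype)
        print(f"{name}: {size} bytes, fits in assumed L2: {fits_in_cache(n, ctype)}")
```

With these assumed numbers, a million items overflow the cache as 32-bit words but fit comfortably as bytes. That is the situation in which packing might pay off, and whether it actually does still has to be measured.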