
phil:
On Mon, Aug 20, 2007 at 09:57:38PM +0100, Simon Peyton-Jones wrote:
GHC does some constant folding, but little by way of strength reduction, or using shifts instead of multiplication. It's pretty easy to add more: it's all done in a single module. Look at primOpRules in the module PrelRules.
Patches welcome! But please also supply test-suite tests that check the correctness of the rules.
Sucking another example out of comp.lang.functional:
This:
import System
f :: Int -> Int -> Int f s n = if n > 0 then f (s+n) (n-1) else s
main = do [n] <- getArgs putStrLn $ show $ f 0 (read n)
is 3-4x slower than this:
#include
#include #include int f(int s, int n) { return n > 0 ? f(s+n, n-1) : s; }
int main(int argc, char *argv[]) { assert(argc == 2); printf("%d\n", f(0, strtol(argv[1],0,0))); }
The generated assembler suggests (if I've read it correctly) that gcc is spotting that it can replace the tail call with a jump in the C version, but for some reason it can't spot it for the Haskell version when compiling with -fvia-C (and neither does ghc itself using -fasm). So the haskell version ends up pushing and popping values on and off the stack for every call to f, which is a bit sad.
That doesn't sound quite right. The C version should get a tail call , with gcc -O2, the Haskell version should be a tail call anyway. Let's see: C $ gcc -O t.c -o t $ time ./t 1000000000 zsh: segmentation fault (core dumped) ./t 1000000000 ./t 1000000000 0.02s user 0.22s system 5% cpu 4.640 total Turning on -O2 $ time ./t 1000000000 -243309312 ./t 1000000000 1.89s user 0.00s system 97% cpu 1.940 total And GHC: $ ghc -O2 A.hs -o A $ time ./A 1000000000 -243309312 ./A 1000000000 3.21s user 0.01s system 97% cpu 3.289 total So, what, 1.6x slower than gcc -O2 Seems ok without any tuning. -- Don