RE: Faster, GHC, and floating point.

> I added some `seq`s to the code. I also used SSE (not P4; I don't have a P4 available right now, but I'll test it), and indeed it runs _a bit_ faster: ~640 ms now vs. 711 ms before (speedup 1.11).
> Now Haskell takes 4.57 times as long as C++ (Cygwin GNU C++ with -O2).
> But if I look at the interface file with -ddump-hi, I see lots of U(L)s instead of S or similar in the function signatures. I think that U(L) is better than L, but can I do better than U(L) somehow?
Hard to tell without investigating the code in detail. U(L) is indeed better than L, though: it means the argument is strict and can be passed unboxed.

If you're up to reading Core, you could try -ddump-simpl and see if the code generated by the compiler is what you expect. This is probably the best tool for investigating performance problems on a small scale.

Or, you could try converting your code to use unboxed Double# values and primitive operations instead of plain Double. In general, it is better to convince the compiler to do this itself (by making things strict enough), but sometimes using Double# directly is a good idea for really performance-critical code. It will also tell you whether your performance problems are due to strictness/boxing or something else.

It should be possible to obtain the same performance as C++ (or better).

Cheers,
Simon
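
To make the two options above concrete, here is a minimal sketch, not taken from the original program. The functions sumSquaresStrict and sumSquaresUnboxed are hypothetical names invented for illustration; the first keeps plain Double and forces the accumulator with `seq`, the second uses Double# and primitive operations by hand. It assumes a reasonably recent GHC, where comparison primops return Int# and are wrapped with isTrue#; older compilers spell this differently.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts
  ( Double (D#), Double#, Int (I#), Int#
  , (+##), (*##), (+#), (>#), int2Double#, isTrue# )

-- Hypothetical example 1: plain Double, made strict with `seq`.
-- With -O, GHC's strictness analysis can usually unbox the accumulator.
sumSquaresStrict :: Int -> Double
sumSquaresStrict n = go 1 0
  where
    go :: Int -> Double -> Double
    go i acc
      | i > n     = acc
      | otherwise =
          let x    = fromIntegral i
              acc' = acc + x * x
          in acc' `seq` go (i + 1) acc'

-- Hypothetical example 2: the same loop written directly on unboxed values.
-- Nothing here can be a thunk, so no boxing or laziness is possible.
sumSquaresUnboxed :: Int -> Double
sumSquaresUnboxed (I# n) = D# (go 1# 0.0##)
  where
    go :: Int# -> Double# -> Double#
    go i acc
      | isTrue# (i ># n) = acc
      | otherwise        = go (i +# 1#) (acc +## (x *## x))
      where
        x = int2Double# i  -- unlifted binding, evaluated strictly

main :: IO ()
main = do
  print (sumSquaresStrict 1000)
  print (sumSquaresUnboxed 1000)
```

Compiling with something like `ghc -O2 -ddump-simpl` and comparing the Core for the two inner loops is one way to check whether the `seq` version was already unboxed by the compiler. If the two loops look alike, boxing is probably not where the time goes.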