
Don Stewart
xj2106:
I used `unsafePerformIO' with `INLINE', because I don't know where `inlinePerformIO' is now. And also the `-optc-march' is changed to `nocona'.
Using unsafePerformIO here would break some crucial inlining. (The same trick is used in Data.ByteString, by the way.)
You can find inlinePerformIO in Data.ByteString.Internal.
Comparing the two, n=5500, ghc 6.8:
$ ghc -O -fglasgow-exts -fbang-patterns -optc-O3 -optc-march=pentium4 -optc-mfpmath=sse -optc-msse2 -optc-ffast-math spec.hs -o spec_hs --make
With inlinePerformIO:
$ time ./spec_hs 5500
1.274224153
./spec_hs 5500  26.32s user 0.00s system 99% cpu 26.406 total
As expected, and comparable to the shootout result for the same N. With unsafePerformIO, the whole thing falls apart:
$ time ./spec_hs 5500
^Cspec_hs: interrupted
./spec_hs 5500  124.86s user 0.11s system 99% cpu 2:05.04 total
I gave up after 2 minutes. This FFI peek/poke code, which acts like an ST monad under a pure interface, relies on inlinePerformIO.
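For anyone following along, here is a minimal sketch of the pattern I mean. It is not the spec.hs source. The names `at`, `sumBuffer` and `sumViaBuffer` are made up for illustration, and inlinePerformIO is defined locally (following the definition Data.ByteString.Internal has historically used) because the exported name has changed across bytestring releases.

```haskell
{-# LANGUAGE BangPatterns, MagicHash, UnboxedTuples #-}

import Control.Exception (evaluate)
import Foreign.Marshal.Array (allocaArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff)
import GHC.Exts (realWorld#)
import GHC.IO (IO (..))

-- Local copy of the trick: run the IO action against a fake world token,
-- with none of the bookkeeping that unsafePerformIO does, so GHC can
-- inline the action straight into the caller.
inlinePerformIO :: IO a -> a
inlinePerformIO (IO m) = case m realWorld# of (# _, r #) -> r
{-# INLINE inlinePerformIO #-}

-- Hypothetical helper: a pure-looking read of element i.
-- Only sensible while nothing is writing to the buffer.
at :: Ptr Double -> Int -> Double
at p i = inlinePerformIO (peekElemOff p i)
{-# INLINE at #-}

-- Hypothetical inner loop: strict sum of the first n elements,
-- written as ordinary pure code over the raw buffer.
sumBuffer :: Ptr Double -> Int -> Double
sumBuffer p n = go 0 0
  where
    go !acc i
      | i < n     = go (acc + at p i) (i + 1)
      | otherwise = acc

-- Fill a temporary buffer with 0 .. n-1 in IO, then force the pure sum
-- before the buffer is released.
sumViaBuffer :: Int -> IO Double
sumViaBuffer n = allocaArray n $ \p -> do
  mapM_ (\i -> pokeElemOff p i (fromIntegral i)) [0 .. n - 1]
  evaluate (sumBuffer p n)

main :: IO ()
main = sumViaBuffer 10 >>= print
```

The point is that `at` sits in the innermost loop. With inlinePerformIO the read can compile down to a plain memory load inside `go`; swapping in unsafePerformIO there puts a much heavier call in the hot path, which would be consistent with the timings above. The usual caveat applies: this is only reasonable for small actions such as peek that do not allocate, and only when you control when the writes happen.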
Thanks for pointing this out.
Xiao-Yong