Re: a compiled program is slower than byte code

If I compile the attach code with GHC of the newest ghc-7.0 darcs branch, the compiled program is much slower than byte code. This phenomenon does not exist in GHC 6.12.3.
6.12.3 runghc -- 6.23s user 0.59s system 98% cpu 6.912 total ghc -- 5.72s user 0.70s system 99% cpu 6.422 total ghc -O -- 5.70s user 0.67s system 99% cpu 6.376 total ghc -O2 -- 5.69s user 0.67s system 99% cpu 6.373 total
ghc-7.0 runghc -- 6.43s user 0.10s system 99% cpu 6.593 total ghc -- 9.20s user 0.09s system 99% cpu 9.302 total ghc -O -- 9.20s user 0.09s system 99% cpu 9.298 total ghc -O2 -- 9.38s user 0.09s system 99% cpu 9.478 total
Is this a bug?
I don't see the same results here (x86_64/Linux). This is the 7.0 branch built last night: $ ghc-stable-nightly2 ./kazu.hs -O2 -fforce-recomp; time ./kazu [1 of 1] Compiling Main ( kazu.hs, kazu.o ) Linking kazu ... 8.58s real 8.54s user 0.04s system 99% ./kazu $ time runghc -f ghc-stable-nightly2 ./kazu.hs -O2 -fforce-recomp; 16.92s real 14.21s user 2.70s system 99% runghc -f ghc-stable-nightly2 ./kazu.hs -O2 -fforce-recomp However, there is a fairly large difference between the 7.0 branch and HEAD: $ ghc-nightly2 ./kazu.hs -O2 -fforce-recomp; time ./kazu [1 of 1] Compiling Main ( kazu.hs, kazu.o ) Linking kazu ... 4.17s real 4.16s user 0.01s system 99% ./kazu HEAD is more than twice as fast on this program. Fusion not working, perhaps? Cheers, Simon
My environment is Mac which runs Snow Leopard.
--Kazu
---- import System.IO
n :: Int n = 10000
main :: IO () main = withFile "/dev/null" WriteMode $ \h -> hPutStr h . foldr1 (++) . replicate n . replicate n $ 'a' ----

Hello Simon,
$ ghc-nightly2 ./kazu.hs -O2 -fforce-recomp; time ./kazu [1 of 1] Compiling Main ( kazu.hs, kazu.o ) Linking kazu ... 4.17s real 4.16s user 0.01s system 99% ./kazu
OK. I ran it on 32bit Linux. 6.12.3 runghc -- 2.22s user 0.40s system 96% cpu 2.724 total ghc -- 1.96s user 0.14s system 97% cpu 2.151 total ghc -O -- 2.18s user 0.10s system 97% cpu 2.333 total ghc -O2 -- 2.27s user 0.07s system 97% cpu 2.393 total ghc-7.0 runghc -- 3.43s user 0.35s system 97% cpu 3.861 total ghc -- 5.11s user 0.07s system 97% cpu 5.299 total ghc -O -- 5.38s user 0.03s system 97% cpu 5.534 total ghc -O2 -- 5.54s user 0.10s system 97% cpu 5.783 total ghc-7.0 is slower than 6.12.3. And if the code is compiled with ghc-7.0, it is slower than runghc.
HEAD is more than twice as fast on this program. Fusion not working, perhaps?
Here is the results of "ghc -O -ddump-simpl-stats": ghc-7.0: 19 RuleFired 6 ++ 2 <=# 1 fold/build 5 foldr/app 1 foldr/augment 1 foldr/single 1 map 1 unpack 1 unpack-list 6.12.3: 25 RuleFired 6 ++ 2 <=# 3 fold/build 6 foldr/app 1 foldr/augment 1 map 2 repeat 2 take 1 unpack 1 unpack-list They are exactly the same on Linux and Mac. --Kazu

On 09/02/2011 04:35, Kazu Yamamoto (山本和彦) wrote:
Hello Simon,
$ ghc-nightly2 ./kazu.hs -O2 -fforce-recomp; time ./kazu [1 of 1] Compiling Main ( kazu.hs, kazu.o ) Linking kazu ... 4.17s real 4.16s user 0.01s system 99% ./kazu
OK. I ran it on 32bit Linux.
6.12.3 runghc -- 2.22s user 0.40s system 96% cpu 2.724 total ghc -- 1.96s user 0.14s system 97% cpu 2.151 total ghc -O -- 2.18s user 0.10s system 97% cpu 2.333 total ghc -O2 -- 2.27s user 0.07s system 97% cpu 2.393 total
ghc-7.0 runghc -- 3.43s user 0.35s system 97% cpu 3.861 total ghc -- 5.11s user 0.07s system 97% cpu 5.299 total ghc -O -- 5.38s user 0.03s system 97% cpu 5.534 total ghc -O2 -- 5.54s user 0.10s system 97% cpu 5.783 total
Thanks, I do see the problem now (I was inadvertently running the compiled code with runghc, because I used "runghc foo.hs -fforce-recomp", which doesn't do what I thought it did). I created a ticket: http://hackage.haskell.org/trac/ghc/ticket/4951 I think the difference between interpreted and compiled is due to http://hackage.haskell.org/trac/ghc/ticket/917. What I'm more worried about is the slowdown in pre-7.0.2 relative to other versions.
HEAD is more than twice as fast on this program. Fusion not working, perhaps?
Here is the results of "ghc -O -ddump-simpl-stats":
ghc-7.0:
19 RuleFired 6 ++ 2<=# 1 fold/build 5 foldr/app 1 foldr/augment 1 foldr/single 1 map 1 unpack 1 unpack-list
6.12.3:
25 RuleFired 6 ++ 2<=# 3 fold/build 6 foldr/app 1 foldr/augment 1 map 2 repeat 2 take 1 unpack 1 unpack-list
My stats look very different. 6 RuleFired 1 ++ 2 <=# 1 foldr/app 1 unpack 1 unpack-list Are your libraries compiled with -O2? Cheers, Simon

Hello,
My stats look very different.
6 RuleFired 1 ++ 2 <=# 1 foldr/app 1 unpack 1 unpack-list
Are your libraries compiled with -O2?
I don't know. How can I check? I just installed ghc-7.0 by "perl boot; configure; make; make install". --Kazu

On 09/02/2011 09:25, Kazu Yamamoto (山本和彦) wrote:
Hello,
My stats look very different.
6 RuleFired 1 ++ 2<=# 1 foldr/app 1 unpack 1 unpack-list
Are your libraries compiled with -O2?
I don't know. How can I check?
I just installed ghc-7.0 by "perl boot; configure; make; make install".
(sorry, I missed your reply earlier) The default build settings use -O2 for the libraries, so your build is fine. I don't understand the differences between your -ddump-simpl-stats and mine, but I think we have this particular performance problem fixed now anyway (I'm right now validating patches for the 7.0 branch). Cheers, Simon
participants (2)
-
Kazu Yamamoto
-
Simon Marlow