Difference in Runtime but no explanation

Hello,

I'm still new to Haskell, and after I read through
http://users.aber.ac.uk/afc/stricthaskell.html#seq I thought that these two
fragments are really the same after term rewriting:

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = l `seq` len xs (l+1)

    main = print $ myLength [1..10000000]

    -- vs.

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = len xs $! (l+1)

    main = print $ myLength [1..10000000]

But the first expression evaluates more than twice as fast as the second one.
Tested on GHC 6.10.4 and Windows XP, dual core (for what it's worth).

It's on http://moonpatio.com/fastcgi/hpaste.fcgi/view?id=5321#a5321 btw.

I can't see the difference, especially as $! is expressed in terms of seq.

On Tue, 2009-12-15 at 09:52 -0800, Johann Höchtl wrote:
Hello,
I'm still new to Haskell, and after I read through
http://users.aber.ac.uk/afc/stricthaskell.html#seq
I thought that these two fragments are really the same after term rewriting:

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = l `seq` len xs (l+1)

    main = print $ myLength [1..10000000]

    -- vs.

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = len xs $! (l+1)

    main = print $ myLength [1..10000000]

But the first expression evaluates more than twice as fast as the second one.
Tested on GHC 6.10.4 and Windows XP, dual core (for what it's worth).
It's on http://moonpatio.com/fastcgi/hpaste.fcgi/view?id=5321#a5321 btw.
I can't see the difference, especially as $! is expressed in terms of seq
The second one is, IMO:

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = let l' = l+1 in l' `seq` len xs l'

So in the first, + is not forced to be evaluated.

My results (ghc 6.12.1, Core 2 Duo 2.8 GHz, Linux 2.6.32, Gentoo):

Not optimized & not compiled:
    First:  12.47 secs, 1530911440 bytes
    Second: 17.40 secs, 1929614816 bytes
Optimized & compiled:
    First:  1.24 secs,  966280832 bytes
    Second: 1.11 secs,  966277152 bytes

Repeating gave similar results - the first being better without optimisation
(about 1.2:1.7) and the second being better with optimisation (-O).

Why is the first one better unoptimised?

Regards
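On why the two spellings can differ at all: ($!) is itself written in terms
of seq. The following is only a sketch of the base definition of that era
(the real one may carry INLINE pragmas and live in a different module),
redefined locally here so the module is self-contained and runnable:

    import Prelude hiding (($!))

    infixr 0 $!
    ($!) :: (a -> b) -> a -> b
    f $! x = x `seq` f x            -- force the argument, then apply

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l       = l
            len (_:rest) l = len rest $! (l + 1)  -- forces the new accumulator

    main :: IO ()
    main = print (myLength [1 .. 10000000 :: Integer])

So len xs $! (l+1) forces the new accumulator before the call, while
l `seq` len xs (l+1) forces the old one and passes (l+1) unevaluated - the
forcing lags one step behind, as described above.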

On Tue, Dec 15, 2009 at 10:30 AM, Maciej Piechotka
My results (ghc 6.12.1, Core 2 Duo 2.8 GHz, Linux 2.6.32, Gentoo):
Not Optimized & not compiled:
    First:  12.47 secs, 1530911440 bytes
    Second: 17.40 secs, 1929614816 bytes
Optimized & compiled:
    First:  1.24 secs,  966280832 bytes
    Second: 1.11 secs,  966277152 bytes
Seconded, I consistently get 0.350 seconds vs 0.300 when compiling the original code via -O2.
Repeating gave similar results - the first being better without optimisation
(about 1.2:1.7) and the second being better with optimisation (-O).
Why is the first one better unoptimised?
I think that's not a specific enough question. I replaced the 'Integer' with
'Int' and found (via -ddump-asm) that the assembly is _identical_. A test
shows what you would expect - the performance was identical.

So the question in my mind is: what presumably trivial optimization doesn't
happen when Integer is used?

Thomas
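For anyone wanting to repeat that experiment, here is a sketch of the Int
variant (assuming the only change is the type signature; the file name and
exact flags are just examples - -ddump-asm is the flag mentioned above):

    -- MyLengthInt.hs: same accumulator loop, Int instead of Integer.
    -- Compile both variants with e.g.  ghc -O2 -ddump-asm MyLengthInt.hs
    -- and diff the generated assembly.
    myLength :: [a] -> Int
    myLength xs = len xs 0
      where len [] l       = l
            len (_:rest) l = l `seq` len rest (l + 1)

    main :: IO ()
    main = print (myLength [1 .. 10000000 :: Int])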

On Tue, 2009-12-15 at 10:56 -0800, Thomas DuBuisson wrote:
On Tue, Dec 15, 2009 at 10:30 AM, Maciej Piechotka
wrote:
My results (ghc 6.12.1, Core 2 Duo 2.8 GHz, Linux 2.6.32, Gentoo):
Not Optimized & not compiled:
    First:  12.47 secs, 1530911440 bytes
    Second: 17.40 secs, 1929614816 bytes
Optimized & compiled:
    First:  1.24 secs,  966280832 bytes
    Second: 1.11 secs,  966277152 bytes
Seconded, I consistently get 0.350 seconds vs 0.300 when compiling the original code via -O2.
Repeating gave similar results - the first being better without optimisation
(about 1.2:1.7) and the second being better with optimisation (-O).
Why is the first one better unoptimised?
I think that's not a specific enough question. I replaced the 'Integer' with
'Int' and found (via -ddump-asm) that the assembly is _identical_. A test
shows what you would expect - the performance was identical. So the question
in my mind is: what presumably trivial optimization doesn't happen when
Integer is used?
Thomas
Oops. I forgot about the types :(

Integer
    Not optimized & compiled:
        First:  19.81 secs, 2774731128 bytes
        Second: 22.64 secs, 3092270464 bytes
    Optimized & compiled:
        First:  0.64 secs,  980673088 bytes
        Second: 0.65 secs,  975408024 bytes

Int
    Not optimized & compiled:
        First:  19.60 secs, 2774208376 bytes
        Second: 22.65 secs, 3092274608 bytes
    Optimized & compiled:
        First:  0.46 secs,  808978336 bytes
        Second: 0.43 secs,  803460216 bytes

As for Integer: in the case of Int, l+1 `seq` something can simply be
compiled into (pseudo-assembler):

    call l
    incq %rax       ; or addq $1, %rax
    call something

But for Integer:

    call l
    movq $1, %rbx
    call Integer.+
    call something

Regards
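To make the Int/Integer gap a bit more concrete: the S#/J# constructors that
show up in the core dumps later in this thread come from integer-gmp, where
Integer of that era is roughly "data Integer = S# Int# | J# Int# ByteArray#".
Below is only a toy model with made-up names (MyInteger, plusMy), not the real
library type, to illustrate why (+) on Integer is an out-of-line call with a
case analysis, while Int addition is a single machine instruction:

    -- Toy stand-in for the two-constructor Integer representation.
    data MyInteger = Small Int | Big [Int]   -- Small ~ S#, Big ~ J#
      deriving Show

    plusMy :: MyInteger -> MyInteger -> MyInteger
    plusMy (Small a) (Small b) = Small (a + b)  -- still a call + case + boxing
    plusMy _         _         = error "big-number path elided in this sketch"

    main :: IO ()
    main = print (plusMy (Small 1) (Small 2))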

On Tuesday, 15 December 2009, 19:30:06, Maciej Piechotka wrote:
On Tue, 2009-12-15 at 09:52 -0800, Johann Höchtl wrote:
Hello,
I'm still new to Haskell, and after I read through
http://users.aber.ac.uk/afc/stricthaskell.html#seq
I thought that these two fragments are really the same after term rewriting:

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = l `seq` len xs (l+1)

    main = print $ myLength [1..10000000]

    -- vs.

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = len xs $! (l+1)

    main = print $ myLength [1..10000000]

But the first expression evaluates more than twice as fast as the second one.
Tested on GHC 6.10.4 and Windows XP, dual core (for what it's worth).
Is that
a) interpreted
b) compiled without optimisations
c) compiled with optimisations?

If c), that's a serious bug. With optimisations turned on, both should give
(approximately) identical code, the same as with

    len (x:xs) l = len xs (l+1)

With -O2, that gives exactly the same core as the first; the second has one
(unnecessary, because everything is recognised as strict) case expression in
otherwise identical core. All three perform identically here.

If a) or b), it's because the second has one more function call per list
element: first ($!) is called and then seq via that, while the first calls
seq directly.
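To make that concrete, here is a sketch of the strict loop that, with
optimisations on, all three formulations should effectively become. The
BangPatterns version is just a hand-written stand-in for what the strictness
analyser infers, not what GHC literally generates:

    {-# LANGUAGE BangPatterns #-}

    -- Hand-written strict accumulator: the accumulator is evaluated on every
    -- step, so no chain of (+1) thunks can build up.
    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len []       !l = l
            len (_:rest) !l = len rest (l + 1)

    main :: IO ()
    main = print (myLength [1 .. 10000000 :: Integer])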
It's on http://moonpatio.com/fastcgi/hpaste.fcgi/view?id=5321#a5321 btw.
I can't see the difference, especially as $! is expressed in terms of seq
The second one is, IMO:

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = let l' = l+1 in l' `seq` len xs l'

So in the first, + is not forced to be evaluated.
My results (ghc 6.12.1, Core 2 Duo 2.8 GHz, Linux 2.6.32, Gentoo):
Not Optimized & not compiled:
    First:  12.47 secs, 1530911440 bytes
    Second: 17.40 secs, 1929614816 bytes
Wow. Here, interpreted and not optimised:

    First:  10.10 secs (Integer)   10.45 secs (Int)
    Second: 11.80 secs (Integer)   11.93 secs (Int)

ghc 6.12.1, built from source, 3GHz Pentium 4 (2 Cores),
Linux linux-mkk1 2.6.27.39-0.2-pae (openSuse 11.1)

With ghci-6.10.3:

    First:  14.58 secs (Integer)   14.53 secs (Int)
    Second: 17.05 secs (Integer)   18.31 secs (Int)

Nowhere near the difference that you have.
Optimized & compiled:
    First:  1.24 secs,  966280832 bytes
    Second: 1.11 secs,  966277152 bytes
Compiled with -O (same with -O2):

    First:  0.36 secs (Integer)   0.30 secs (Int)
    Second: 0.36 secs (Integer)   0.30 secs (Int)
    length: 0.32 secs

Astonishing that compiling does so much less on your box.

With ghc-6.10.3 (-O/-O2):

    First:  0.23 secs (Integer)   0.17 secs (Int)
    Second: 0.26 secs (Integer)   0.17 secs (Int)
    length: 0.19 secs

Ouch! Why is 6.10.3 so much better here?

Compiled without optimisations:

    First:  0.48 secs (Integer)   0.49 secs (Int)
    Second: 1.07 secs (Integer)   1.01 secs (Int)

With 6.10.3:

    First:  0.38 secs (Integer)   0.36 secs (Int)
    Second: 1.47 secs (Integer)   1.46 secs (Int)

Wait, what?

On Dec 15, 9:30 pm, Daniel Fischer
On Tuesday, 15 December 2009, 19:30:06, Maciej Piechotka wrote:
On Tue, 2009-12-15 at 09:52 -0800, Johann Höchtl wrote:
Hello,
I'm still new to Haskell, and after I read through
http://users.aber.ac.uk/afc/stricthaskell.html#seq
I thought that these two fragments are really the same after term rewriting:

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = l `seq` len xs (l+1)

    main = print $ myLength [1..10000000]

    -- vs.

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = len xs $! (l+1)

    main = print $ myLength [1..10000000]

But the first expression evaluates more than twice as fast as the second one.
Tested on GHC 6.10.4 and Windows XP, dual core (for what it's worth).
Is that
a) interpreted
b) compiled without optimisations
c) compiled with optimisations?
It's b, compiled without optimisations. From what I understand of the
explanations so far, the first solution with seq should actually be slower;
in the unoptimised case it's more than two times faster than the version

    len (x:xs) l = len xs $! (l+1)

If c), that's a serious bug. With optimisations turned on, both should give
(approximately) identical code, the same as with "len (x:xs) l = len xs
(l+1)". With -O2, that gives exactly the same core as the first; the second
has one (unnecessary, because everything is recognised as strict) case
expression in otherwise identical core. All three perform identically here.
If a) or b), it's because the second has one more function call per list
element: first ($!) is called and then seq via that, while the first calls
seq directly.
Ah, OK, I was not aware that this is really a function call. I thought it was
syntactic sugar only and would be rewritten into a call to `seq`. I was
thinking too much of a macro in this case.
It's on http://moonpatio.com/fastcgi/hpaste.fcgi/view?id=5321#a5321 btw.
I can't see the difference, especially as $! is expressed in terms of seq
The second one is, IMO:

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = let l' = l+1 in l' `seq` len xs l'

So in the first, + is not forced to be evaluated.
My results (ghc 6.12.1, Core 2 Duo 2.8 GHz, Linux 2.6.32, Gentoo):
Not Optimized & not compiled:
    First:  12.47 secs, 1530911440 bytes
    Second: 17.40 secs, 1929614816 bytes
Wow. Here, interpreted and not optimised:
    First:  10.10 secs (Integer)   10.45 secs (Int)
    Second: 11.80 secs (Integer)   11.93 secs (Int)
ghc 6.12.1, built from source, 3GHz Pentium 4 (2 Cores),
Linux linux-mkk1 2.6.27.39-0.2-pae (openSuse 11.1)
With ghci-6.10.3:
    First:  14.58 secs (Integer)   14.53 secs (Int)
    Second: 17.05 secs (Integer)   18.31 secs (Int)
Nowhere near the difference that you have.
Optimized & compiled:
    First:  1.24 secs,  966280832 bytes
    Second: 1.11 secs,  966277152 bytes
Compiled with -O (same with -O2):
    First:  0.36 secs (Integer)   0.30 secs (Int)
    Second: 0.36 secs (Integer)   0.30 secs (Int)
    length: 0.32 secs
Astonishing that compiling does so much less on your box.
With ghc-6.10.3 (-O/-O2):
    First:  0.23 secs (Integer)   0.17 secs (Int)
    Second: 0.26 secs (Integer)   0.17 secs (Int)
    length: 0.19 secs
Ouch! Why is 6.10.3 so much better here?
Compiled without optimisations:
    First:  0.48 secs (Integer)   0.49 secs (Int)
    Second: 1.07 secs (Integer)   1.01 secs (Int)
With 6.10.3:
    First:  0.38 secs (Integer)   0.36 secs (Int)
    Second: 1.47 secs (Integer)   1.46 secs (Int)
Wait, what?

On Tuesday, 15 December 2009, 21:50:22, Johann Höchtl wrote:
It's b, compiled without optimisations.
Don't do that. Maybe if you have huge projects with large compile times. But without optimisations, small variations can incur huge performance costs (as we see here).
If a) or b), it's because the second has one more function call per list element, first ($!) is called and then seq via that, while the first calls seq directly.
Ah, OK, I was not aware that this is really a function call. I thought it was
syntactic sugar only and would be rewritten into a call to `seq`. I was
thinking too much of a macro in this case.
It's a real function, defined in Haskell. Unfortunately, without -O, ghc is
not keen to inline across module boundaries. It would be inlined within the
same module. seq is a primitive from GHC.Prim, thus it's special and doesn't
produce a function call.
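A small, untested sketch of that last point: if you define the strict
application operator in the same module (here under the made-up name ($!!),
to avoid clashing with the Prelude's ($!)), it should get inlined even
without -O, and the loop ought to behave like the `seq` version when compiled
unoptimised:

    -- ($!!) is a local copy of ($!); being defined in the same module, GHC
    -- can inline it even without -O, so no extra call per list element.
    infixr 0 $!!
    ($!!) :: (a -> b) -> a -> b
    f $!! x = x `seq` f x

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l       = l
            len (_:rest) l = len rest $!! (l + 1)

    main :: IO ()
    main = print (myLength [1 .. 10000000 :: Integer])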

On Dec 15, 7:30 pm, Maciej Piechotka
On Tue, 2009-12-15 at 09:52 -0800, Johann Höchtl wrote:
Hello,
I'm still new to Haskell, and after I read through
http://users.aber.ac.uk/afc/stricthaskell.html#seq
I thought that these two fragments are really the same after term rewriting:

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = l `seq` len xs (l+1)

    main = print $ myLength [1..10000000]

    -- vs.

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = len xs $! (l+1)

    main = print $ myLength [1..10000000]

But the first expression evaluates more than twice as fast as the second one.
Tested on GHC 6.10.4 and Windows XP, dual core (for what it's worth).
It's on http://moonpatio.com/fastcgi/hpaste.fcgi/view?id=5321#a5321 btw.
I can't see the difference, especially as $! is expressed in terms of seq
The second one is, IMO:

    myLength :: [a] -> Integer
    myLength xs = len xs 0
      where len [] l     = l
            len (x:xs) l = let l' = l+1 in l' `seq` len xs l'

So in the first, + is not forced to be evaluated.
Please describe for me as a beginner why there _is_ a difference:

1. Does

       len (x:xs) l = l `seq` len xs (l+1)
   vs.
       len (x:xs) l = len xs $! (l+1)

   expand into something different?

2. Do I understand correctly that the first expression "should" actually be
   slower, but (for whatever reason) in the unoptimised case isn't?

3. The function is annotated with Integer. Why is Int suddenly of importance?

(4. When optimisation is switched on, the second expression executes faster;
    as such I assume that there is a difference between these two statements.)

Thank you!
My results (ghc 6.12.1, Core 2 Duo 2.8 GHz, Linux 2.6.32, Gentoo):
Not Optimized & not compiled:
    First:  12.47 secs, 1530911440 bytes
    Second: 17.40 secs, 1929614816 bytes
Optimized & compiled:
    First:  1.24 secs,  966280832 bytes
    Second: 1.11 secs,  966277152 bytes
Repeating gave similar results - the first being better without optimisation
(about 1.2:1.7) and the second being better with optimisation (-O).
Why is the first one better unoptimised?
Regards

On Tuesday, 15 December 2009, 21:43:46, Johann Höchtl wrote:
Please describe for me as a beginner, why there _is_ a difference:
1. Does

       len (x:xs) l = l `seq` len xs (l+1)
   vs.
       len (x:xs) l = len xs $! (l+1)

   expand into something different?

Yes. How different depends on optimisation level and compiler version.

Without optimisations, seq is inlined, giving one self-contained loop, while
($!) is not inlined, so in every iteration of the loop there's a call to ($!)
- bad for performance. ghc-6.10.3 and ghc-6.12.1 produce nearly identical
core for each of the functions.

With optimisations (-O2), both functions get compiled into a self-contained
loop; 6.12.1 produces near-identical core for the two functions. The core for
the second contains one case expression the core for the first doesn't, but
they should produce the same assembly. One improvement versus the unoptimised
core is that plusInteger is called instead of GHC.Num.+.

6.10.3 produces very different core with -O2. The core for the second variant
is close to that which 6.12.1 produces; I don't know enough about core to see
how that would influence performance. For the first variant, 6.10.3 produces
different core, special-casing small and large Integers, which proves to be
more efficient. Again, I'm not specialist enough to know why a) it produces
such different core, b) that core is so much faster.

2. Do I understand correctly that the first expression "should" actually be
   slower, but (for whatever reason) in the unoptimised case isn't?

No. In principle, with

    len (x:xs) l = l `seq` len xs (l+1)

the evaluation of l lags one step behind, so you'd have the reduction

    len [1,2,3] 0 ~> len [2,3] (0+1) ~> len [3] (1+1) ~> len [] (2+1)
                  ~> (2+1) ~> 3

while

    len (x:xs) l = let l' = l+1 in l' `seq` len xs l'

gives

    len [1,2,3] 0 ~> len [2,3] 1 ~> len [3] 2 ~> len [] 3 ~> 3

but that difference isn't measurable (if they produce different machine
instructions at all, the difference is at most a handful of clock cycles).
*If* len xs $! (l+1) were expanded into the latter, both would be - for all
practical purposes at least - equally fast.

3. The function is annotated with Integer. Why is Int suddenly of importance?
Thomas DuBuisson tried Int to investigate. It's always interesting what changes when you change types.
(4. When optimisation is switched on, the second expression executes faster;
    as such I assume that there is a difference between these two statements.)

Not here. With 6.12.1 and -O(2), both are equally fast; with 6.10.3, the
first is faster. I would rather expect 6.10.4 to behave more like 6.10.3. It
may be, of course, that it's a hardware/OS issue which code is faster. Can
you run

    ghc-6.10.4 -O2 -fforce-recomp -ddump-simpl --make Whatever.hs > Whatever.core

so I can see what core that produces?
Thank you!

On Dec 15, 10:45 pm, Daniel Fischer
On Tuesday, 15 December 2009, 21:43:46, Johann Höchtl wrote:
Please describe for me as a beginner, why there _is_ a difference:
1. Does len (x:xs) l = l `seq` len xs (l+1) vs. len xs $! (l+1) expand into something different?
Yes. How different depends on optimisation level and compiler version.
Without optimisations, seq is inlined, giving one self-contained loop, while ($!) is not inlined, so in every iteration of the loop there's a call to ($!) - bad for performance. ghc-6.10.3 and ghc-6.12.1 produce nearly identical core for each of the functions.
With optimisations (-O2), both functions get compiled into a self-contained loop, 6.12.1 produces near identical core for the two functions. The core for the second contains one case-expression the core for the first doesn't, but they should produce the same assembly. One improvement versus the unoptimised core is that plusInteger is called instead of GHC.Num.+.
6.10.3 produces very different core with -O2. The core for the second variant
is close to that which 6.12.1 produces; I don't know enough about core to see
how that would influence performance. For the first variant, 6.10.3 produces
different core, special-casing small and large Integers, which proves to be
more efficient. Again, I'm not specialist enough to know why a) it produces
such different core, b) that core is so much faster.
2. Do I understand correctly that the first expression "should" actually be
slower, but (for whatever reason) in the unoptimised case isn't?
No. In principle, with len (x:xs) l = l `seq` len xs (l+1), the evaluation of l lags one step behind, so you'd have the reduction len [1,2,3] 0 ~> len [2,3] (0+1) ~> len [3] (1+1) ~> len [] (2+1) ~> (2+1) ~> 3 while len (x:xs) l = let l' = l+1 in l' `seq` len xs l' gives len [1,2,3] 0 ~> len [2,3] 1 ~> len [3] 2 ~> len [] 3 ~> 3 but that difference isn't measurable (if they produce different machine instructions at all, the difference is at most a handful of clock cycles). *If* len xs $! (l+1) were expanded into the latter, both would be - for all practical purposes at least - equally fast.
3. The function is annotated with Integer. Why is Int suddenly of importance?
Thomas DuBuisson tried Int to investigate. It's always interesting what changes when you change types.
(4. When optimisation is switched on, the second expression executes faster; as such I assume that there is a difference between these two statements.)
Not here. With 6.12.1 and -O(2), both are equally fast, with 6.10.3, the first is faster. I would rather expect 6.10.4 to behave more like 6.10.3. It may be, of course, that it's a hardware/OS issue which code is faster. Can you
ghc-6.10.4 -O2 -fforce-recomp -ddump-simpl --make Whatever.hs > Whatever.core
so I can see what core that produces?
Thank you very much for the helpful explanations. With optimisations turned
on, the runtime performance (whatever.exe +RTS -s) is almost the same, with
the seq variant still a tad faster. I'll copy the .core in here:

======== BEGIN seq - variant =============

==================== Tidy Core ====================
Rec {
Main.len :: [GHC.Integer.Internals.Integer]
            -> GHC.Integer.Internals.Integer
            -> GHC.Integer.Internals.Integer
[GlobalId]
[Arity 2 NoCafRefs Str: DmdType SS]
Main.len =
  \ (ds_dzE :: [GHC.Integer.Internals.Integer])
    (l_afx :: GHC.Integer.Internals.Integer) ->
    case ds_dzE of wild_B1 {
      [] -> l_afx;
      : x_afz xs_afB ->
        Main.len xs_afB
          (case l_afx of wild1_dAc {
             GHC.Integer.Internals.S# i_dAe ->
               case GHC.Prim.addIntC# i_dAe 1 of wild2_dAk { (# r_dAm, c_dAn #) ->
               case c_dAn of wild3_dAp {
                 __DEFAULT ->
                   case GHC.Prim.int2Integer# i_dAe of wild4_dAq { (# s_dAs, d_dAt #) ->
                   case GHC.Prim.int2Integer# 1 of wild5_dAv { (# s1_dAx, d1_dAy #) ->
                   GHC.Integer.$splusInteger d1_dAy s1_dAx d_dAt s_dAs } };
                 0 -> GHC.Integer.Internals.S# r_dAm } };
             GHC.Integer.Internals.J# ds1_dAK ds11_dAL ->
               case GHC.Prim.int2Integer# 1 of wild2_dAR { (# s_dAT, d_dAU #) ->
               GHC.Integer.$splusInteger d_dAU s_dAT ds11_dAL ds1_dAK } })
    }
end Rec }

Main.lvl :: GHC.Integer.Internals.Integer
[GlobalId]
[NoCafRefs]
Main.lvl = GHC.Integer.Internals.S# 1

Main.lvl1 :: GHC.Integer.Internals.Integer
[GlobalId]
[NoCafRefs]
Main.lvl1 = GHC.Integer.Internals.S# 10000000

Main.lvl2 :: [GHC.Integer.Internals.Integer]
[GlobalId]
[]
Main.lvl2 = GHC.Num.up_list Main.lvl Main.lvl Main.lvl1

Main.lvl3 :: GHC.Integer.Internals.Integer
[GlobalId]
[NoCafRefs]
Main.lvl3 = GHC.Integer.Internals.S# 0

Main.lvl4 :: GHC.Integer.Internals.Integer
[GlobalId]
[]
Main.lvl4 = Main.len Main.lvl2 Main.lvl3

Main.lvl5 :: GHC.Base.String
[GlobalId]
[]
Main.lvl5 = GHC.Num.$wshowsPrec 0 Main.lvl4 (GHC.Types.[] @ GHC.Types.Char)

Main.a :: GHC.Prim.State# GHC.Prim.RealWorld
          -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
[GlobalId]
[Arity 1 Str: DmdType L]
Main.a =
  \ (eta_aBl :: GHC.Prim.State# GHC.Prim.RealWorld) ->
    case GHC.IO.a29 GHC.Handle.stdout Main.lvl5 eta_aBl of wild_aHm
    { (# new_s_aHo, a89_aHp #) ->
      GHC.IO.$wa10 GHC.Handle.stdout '\n' new_s_aHo }

Main.a1 :: GHC.Prim.State# GHC.Prim.RealWorld
           -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
[GlobalId]
[Arity 1 Str: DmdType L]
Main.a1 =
  GHC.TopHandler.a4 @ ()
    (Main.a
     `cast` (sym ((GHC.IOBase.:CoIO) ())
             :: GHC.Prim.State# GHC.Prim.RealWorld
                -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
                  ~ GHC.IOBase.IO ()))

Main.main :: GHC.IOBase.IO ()
[GlobalId]
[Arity 1 Str: DmdType L]
Main.main =
  Main.a
  `cast` (sym ((GHC.IOBase.:CoIO) ())
          :: GHC.Prim.State# GHC.Prim.RealWorld
             -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
               ~ GHC.IOBase.IO ())

:Main.main :: GHC.IOBase.IO ()
[GlobalId]
[Arity 1 Str: DmdType L]
:Main.main =
  Main.a1
  `cast` (sym ((GHC.IOBase.:CoIO) ())
          :: GHC.Prim.State# GHC.Prim.RealWorld
             -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
               ~ GHC.IOBase.IO ())

==================== Tidy Core Rules ====================

======== BEGIN $!- variant =============

==================== Tidy Core ====================
Main.lit :: GHC.Integer.Internals.Integer
[GlobalId]
[NoCafRefs Str: DmdType]
Main.lit = GHC.Integer.Internals.S# 1

Rec {
Main.len :: [GHC.Integer.Internals.Integer]
            -> GHC.Integer.Internals.Integer
            -> GHC.Integer.Internals.Integer
[GlobalId]
[Arity 2 NoCafRefs Str: DmdType SS]
Main.len =
  \ (ds_dzF :: [GHC.Integer.Internals.Integer])
    (l_afx :: GHC.Integer.Internals.Integer) ->
    case ds_dzF of wild_B1 {
      [] -> l_afx;
      : x_afz xs_afB ->
        case GHC.Integer.plusInteger l_afx Main.lit of x1_azT { __DEFAULT ->
        Main.len xs_afB x1_azT } }
end Rec }

Main.lvl :: GHC.Integer.Internals.Integer
[GlobalId]
[NoCafRefs]
Main.lvl = GHC.Integer.Internals.S# 10000000

Main.lvl1 :: [GHC.Integer.Internals.Integer]
[GlobalId]
[]
Main.lvl1 = GHC.Num.up_list Main.lit Main.lit Main.lvl

Main.lvl2 :: GHC.Integer.Internals.Integer
[GlobalId]
[NoCafRefs]
Main.lvl2 = GHC.Integer.Internals.S# 0

Main.lvl3 :: GHC.Integer.Internals.Integer
[GlobalId]
[]
Main.lvl3 = Main.len Main.lvl1 Main.lvl2

Main.lvl4 :: GHC.Base.String
[GlobalId]
[]
Main.lvl4 = GHC.Num.$wshowsPrec 0 Main.lvl3 (GHC.Types.[] @ GHC.Types.Char)

Main.a :: GHC.Prim.State# GHC.Prim.RealWorld
          -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
[GlobalId]
[Arity 1 Str: DmdType L]
Main.a =
  \ (eta_aBi :: GHC.Prim.State# GHC.Prim.RealWorld) ->
    case GHC.IO.a29 GHC.Handle.stdout Main.lvl4 eta_aBi of wild_aHj
    { (# new_s_aHl, a89_aHm #) ->
      GHC.IO.$wa10 GHC.Handle.stdout '\n' new_s_aHl }

Main.a1 :: GHC.Prim.State# GHC.Prim.RealWorld
           -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
[GlobalId]
[Arity 1 Str: DmdType L]
Main.a1 =
  GHC.TopHandler.a4 @ ()
    (Main.a
     `cast` (sym ((GHC.IOBase.:CoIO) ())
             :: GHC.Prim.State# GHC.Prim.RealWorld
                -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
                  ~ GHC.IOBase.IO ()))

Main.main :: GHC.IOBase.IO ()
[GlobalId]
[Arity 1 Str: DmdType L]
Main.main =
  Main.a
  `cast` (sym ((GHC.IOBase.:CoIO) ())
          :: GHC.Prim.State# GHC.Prim.RealWorld
             -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
               ~ GHC.IOBase.IO ())

:Main.main :: GHC.IOBase.IO ()
[GlobalId]
[Arity 1 Str: DmdType L]
:Main.main =
  Main.a1
  `cast` (sym ((GHC.IOBase.:CoIO) ())
          :: GHC.Prim.State# GHC.Prim.RealWorld
             -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
               ~ GHC.IOBase.IO ())

==================== Tidy Core Rules ====================

As you said, the $! variant of your 6.10.3 produces almost the same core as
that of 6.12.1, which would indicate that 6.12.1 misses an optimisation, as
the seq variant (at least on 6.10.4) is still faster, albeit negligibly so.
Thank you!
Thank you, Johann
participants (4)
  - Daniel Fischer
  - Johann Höchtl
  - Maciej Piechotka
  - Thomas DuBuisson