
Hi Tomasz! On Sun, Oct 03, 2004 at 03:07:01PM +0200, Tomasz Zielonka wrote:
Hello!
I was playing with monadic looping a'la replicateM_ and I created this function:
for :: Int -> IO () -> IO () for 0 _ = return () for n x = x >> for (n - 1) x
Compiled with -O2, it is really fast and makes no unnecessary allocations.
Yes, good code: T.$wfor = \r [ww w w1] case ww of ds { __DEFAULT -> case w w1 of wild { GHC.Prim.(#,#) new_s a41 -> case -# [ds 1] of sat_s1ZG { __DEFAULT -> T.$wfor sat_s1ZG w new_s; }; }; 0 -> GHC.Prim.(#,#) [w1 GHC.Base.()]; }; SRT(T.$wfor): [] T.for = \r [w w1 w2] case w of w3 { GHC.Base.I# ww -> T.$wfor ww w1 w2; }; SRT(T.for): []
So I made another version:
for :: Int -> IO () -> IO () for n x | n > 0 = x >> for (n - 1) x | otherwise = return ()
To my surprise, it was much slower and made many allocations: [... Then I noticed the cause: GHC.Prim.<# returns a boxed, heap allocated Bool, and so do other primitive comparison operators.
That's not really the cause. A function returning a boxed value does not necessarily have to allocate it, it is just a vectored return afaik. The code is: T.$wfor' = \r [ww w] case ># [ww 0] of wild { GHC.Base.True -> let { k = \u [] case -# [ww 1] of sat_s1Z9 { __DEFAULT -> T.$wfor' sat_s1Z9 w; }; } in let { sat_s20d = \r [eta] case w eta of wild1 { GHC.Prim.(#,#) new_s a41 -> k new_s; }; } in sat_s20d; GHC.Base.False -> lvl4; }; SRT(T.$wfor'): [] T.for' = \r [w w1] case w of w2 { GHC.Base.I# ww -> T.$wfor' ww w1; }; SRT(T.for'): [] The culprit is `let { k = \u ... }'. The cause seems to be that eta expansion is done at the wrong place, I do not know why. The code we would want is T.$wfor4 = \r [ww w w1] case ># [ww 0] of wild { GHC.Base.True -> case w w1 of wild1 { GHC.Prim.(#,#) new_s a41 -> case -# [ww 1] of sat_s1Y0 { __DEFAULT -> T.$wfor4 sat_s1Y0 w new_s; }; }; GHC.Base.False -> GHC.Prim.(#,#) [w1 GHC.Base.()]; }; SRT(T.$wfor4): [] T.for4 = \r [w w1 w2] case w of w3 { GHC.Base.I# ww -> T.$wfor4 ww w1 w2; }; SRT(T.for4): [] (Notice that $wfor again take three arguments, the last one being the state.) Actually, this is produced by the following, although I have no idea why. Just the optimizer working unpredictably, I guess. for4 :: Int -> IO () -> IO () for4 n x = if n `gt` 0 == 0 then return () else x >> (for4 (n-1) x) gt :: Int -> Int -> Int gt x y = if x > y then 1 else 0 If you test it, it should be fast. BTW, although counting upwards (and not solving the problem generally), the following is ok too: for2 :: Int -> IO () -> IO () for2 n x = sequence_ [x | i <- [1..n]] T.lvl = \r [s] GHC.Prim.(#,#) [s GHC.Base.()]; SRT(T.lvl): [] T.$wfor2 = \r [ww w] case ># [1 ww] of wild { GHC.Base.True -> T.lvl; GHC.Base.False -> let { go10 = \r [x1 eta] case w eta of wild1 { GHC.Prim.(#,#) new_s a41 -> case ==# [x1 ww] of wild11 { GHC.Base.True -> GHC.Prim.(#,#) [new_s GHC.Base.()]; GHC.Base.False -> case +# [x1 1] of sat_s1XA { __DEFAULT -> go10 sat_s1XA new_s; }; }; }; } in go10 1; }; SRT(T.$wfor2): [] T.for2 = \r [w w1] case w of w2 { GHC.Base.I# ww -> T.$wfor2 ww w1; }; SRT(T.for2): [] Playing with the code generated by ghc is a great way to waste time for me. Wait until you have found the RULES-pragma :-) Have fun, Carsten -- Carsten Schultz (2:38, 33:47), FB Mathematik, FU Berlin http://carsten.codimi.de/ PGP/GPG key on the pgp.net key servers, fingerprint on my home page.