Re: Bools are not unboxed

3 Oct 2004

      Hi Tomasz!

On Sun, Oct 03, 2004 at 03:07:01PM +0200, Tomasz Zielonka wrote:
...
Hello!
I was playing with monadic looping a'la replicateM_ and I created this
function:
for :: Int -> IO () -> IO ()
    for 0 _ = return ()
    for n x = x >> for (n - 1) x
Compiled with -O2, it is really fast and makes no unnecessary
allocations.
Yes, good code:

T.$wfor =
    \r [ww w w1]
        case ww of ds {
          __DEFAULT ->
              case w w1 of wild {
                GHC.Prim.(#,#) new_s a41 ->
                    case -# [ds 1] of sat_s1ZG {
                      __DEFAULT -> T.$wfor sat_s1ZG w new_s;
                    };
              };
          0 -> GHC.Prim.(#,#) [w1 GHC.Base.()];
        };
SRT(T.$wfor): []
T.for =
    \r [w w1 w2] case w of w3 { GHC.Base.I# ww -> T.$wfor ww w1 w2; };
SRT(T.for): []
...
So I made another version:
for :: Int -> IO () -> IO ()
    for n x | n > 0     = x >> for (n - 1) x
      | otherwise = return ()
To my surprise, it was much slower and made many allocations:
[...
Then I noticed the cause: 
    GHC.Prim.<# returns a boxed, heap allocated Bool, and so do other
    primitive comparison operators.
That's not really the cause.  A function returning a boxed value does
not necessarily have to allocate it, it is just a vectored return
afaik.

The code is:

T.$wfor' =
    \r [ww w]
        case ># [ww 0] of wild {
          GHC.Base.True ->
              let {
                k = \u []
                        case -# [ww 1] of sat_s1Z9 {
                          __DEFAULT -> T.$wfor' sat_s1Z9 w;
                        }; } in
              let {
                sat_s20d =
                    \r [eta]
                        case w eta of wild1 
			    { GHC.Prim.(#,#) new_s a41 -> k new_s; };
              } in  sat_s20d;
          GHC.Base.False -> lvl4;
        };
SRT(T.$wfor'): []
T.for' =
    \r [w w1] case w of w2 { GHC.Base.I# ww -> T.$wfor' ww w1; };
SRT(T.for'): []

The culprit is `let { k = \u ... }'.  The cause seems to be that eta
expansion is done at the wrong place, I do not know why.  The code we
would want is

T.$wfor4 =
    \r [ww w w1]
        case ># [ww 0] of wild {
          GHC.Base.True ->
              case w w1 of wild1 {
                GHC.Prim.(#,#) new_s a41 ->
                    case -# [ww 1] of sat_s1Y0 {
                      __DEFAULT -> T.$wfor4 sat_s1Y0 w new_s;
                    };
              };
          GHC.Base.False -> GHC.Prim.(#,#) [w1 GHC.Base.()];
        };
SRT(T.$wfor4): []
T.for4 =
    \r [w w1 w2] case w of w3 { GHC.Base.I# ww -> T.$wfor4 ww w1 w2; };
SRT(T.for4): []

(Notice that $wfor again take three arguments, the last one being the
state.)

Actually, this is produced by the following, although I have no idea
why.  Just the optimizer working unpredictably, I guess.

for4 :: Int -> IO () -> IO ()
for4 n x = if n `gt` 0 == 0 then return () else x >> (for4 (n-1) x)

gt :: Int -> Int -> Int
gt x y = if x > y then 1 else 0

If you test it, it should be fast.

BTW, although counting upwards (and not solving the problem
generally), the following is ok too:

for2 :: Int -> IO () -> IO ()
for2 n x = sequence_ [x | i <- [1..n]]

T.lvl = \r [s] GHC.Prim.(#,#) [s GHC.Base.()];
SRT(T.lvl): []
T.$wfor2 =
    \r [ww w]
        case ># [1 ww] of wild {
          GHC.Base.True -> T.lvl;
          GHC.Base.False ->
              let {
                go10 =
                    \r [x1 eta]
                        case w eta of wild1 {
                          GHC.Prim.(#,#) new_s a41 ->
                              case ==# [x1 ww] of wild11 {
                                GHC.Base.True -> 
				    GHC.Prim.(#,#) [new_s GHC.Base.()];
                                GHC.Base.False ->
                                    case +# [x1 1] of sat_s1XA {
                                      __DEFAULT -> go10 sat_s1XA new_s;
                                    };
                              };
                        };
              } in  go10 1;
        };
SRT(T.$wfor2): []

T.for2 =
    \r [w w1] case w of w2 { GHC.Base.I# ww -> T.$wfor2 ww w1; };
SRT(T.for2): []

Playing with the code generated by ghc is a great way to waste time
for me.  Wait until you have found the RULES-pragma :-)

Have fun,

Carsten

-- 
Carsten Schultz (2:38, 33:47), FB Mathematik, FU Berlin
http://carsten.codimi.de/
PGP/GPG key on the pgp.net key servers, 
fingerprint on my home page.