
Hi Bulat,
obviously, the arrays version should create no temporary cells.
that's why the memory traffic surprised me. i knew there had to be something wrong.
the problem was mainly due to 2 factors: 1) readArray m (i,j)
yes, indeed. since we are dealing in bulk operations anyway, we might as well take advantage of that, so dropping the repeated bounds-checks inside the loops makes a lot of sense. moral/note to self: bulk array operations are your friend (i knew that!-), but you need to remember that when defining them (unsafeRead et al are only for library writers, but library writers ought to use them; and here i was writing a small inlined library).
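just to make the point concrete, here is a minimal sketch of the kind of change involved - the names and array types are mine, not the code from the thread: inspect the bounds once, outside the loop, then use unsafeRead for the per-element accesses.

{-# LANGUAGE BangPatterns #-}
import Data.Array.IO   (IOUArray, getBounds)
import Data.Array.Base (unsafeRead)

-- hypothetical dot product over unboxed arrays, assumed 0-based so
-- the Ix index coincides with the flat offset unsafeRead expects;
-- the bounds are inspected once, and the tight loop does no checks
dotU :: IOUArray Int Double -> IOUArray Int Double -> IO Double
dotU a b = do
  (0,u) <- getBounds a
  let go !i !acc
        | i > u     = return acc
        | otherwise = do x <- unsafeRead a i
                         y <- unsafeRead b i
                         go (i+1) (acc + x*y)
  go 0 0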
2) 'op' in 'l', which was passed as a real closure and was not inlined, due to a weakness of the ghc optimizer
it irked me to write the same loop twice, so i factored it out, but i didn't consider the consequences. an INLINE pragma on l almost seems sufficient to regain the loss, so i prefer that; but writing out the loop twice is still a tiny tick faster.. (a sketch of such an 'l' follows below)
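for reference, here is what such an inlined loop combinator might look like - this is my reconstruction for illustration, not the exact 'l' from the code under discussion:

{-# LANGUAGE BangPatterns #-}

{-# INLINE l #-}
-- generic counting loop; the INLINE pragma lets ghc specialise the
-- loop at each call site, so 'op' gets compiled in directly
-- instead of being passed around as a closure
l :: Monad m => Int -> Int -> (acc -> Int -> m acc) -> acc -> m acc
l from to op = go from
  where
    go !i !acc
      | i > to    = return acc
      | otherwise = op acc i >>= go (i+1)

with the pragma in place, a use like 'l 0 n (\s i -> ...) 0' should compile to roughly the same code as the hand-written loop.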
we should help the strictness analyzer by marking all the variables used in tight loops as strict.
ah, there were a few surprises in that one. i thought i had considered the possible strictness problems, but obviously, i missed a few relevant possibilities. annotating everything is the safe option, and clearly documents the intentions, but i cannot help thinking about which annotations could be omitted:

- modArray: a and i are both used anyway
- the i index in the loop is definitely checked (but j isn't, and some others weren't, either; so better safe than sorry)
- some of the parameters need not be annotated in this example, but should be if one wanted to reuse the code elsewhere
- the one i really don't get yet is the one on the accumulators (!s in l, in dotA/matA); i thought i had covered that by using $! in applying the loop, but annotating the formal loop parameter, apart from being nicer, also seems faster.. (see the sketch below)

moral/note to self: don't try to be clever, try to be clear..; strictness in formal parameters is better than strictness in actual parameters; bang-patterns are good;-)
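to spell out that last point (an illustration in my own words, not the thread's code), the difference between forcing the actual argument at one call site and making the formal parameter itself strict looks roughly like this:

{-# LANGUAGE BangPatterns #-}

-- bang on the formal parameter: the accumulator is forced on every
-- iteration, no matter how the loop is entered
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go !acc !i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)

-- ($!) only at the outer application: the initial accumulator is
-- forced, but thunks can still pile up inside the recursion
sumTo' :: Int -> Int
sumTo' n = (go $! 0) 1
  where
    go acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)

(in fairness, ghc's strictness analyser will often rescue the second version anyway; the point is just that the bang states the intention where it matters, on the formal parameter.)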
after that is done, we get 1000 times less temporary data allocated and 5x faster execution. now it's a bit faster than strict lists.
is this now anywhere near what the number-crunchers use, when they use Haskell?-)

Bulat, thanks for looking into it and for isolating the issues so quickly!-)

claus