Re: [Haskell-cafe] How to write fast for loops

28 Apr 2014

      I've just uploaded my benchmarks to https://github.com/nh2/loop.

Please take a look. There are some interesting results.

The first thing I don't understand at all is:

http://htmlpreview.github.io/?https://github.com/nh2/loop/blob/master/result...

See how w32/loop and w32/unsafeLoop are equally fast. Then look at

http://htmlpreview.github.io/?https://github.com/nh2/loop/blob/master/result...

Here I run the same thing over the whole of Word32.
See how `loop` is faster here than `unsafeLoop`? How does that make sense?

Next thing:

It seems that V.enumFromTo and V.fromList are actually the same:

http://hackage.haskell.org/package/vector-0.10.9.1/docs/src/Data-Vector-Fusi...

However in my benchmark, at least for `Int`, the performance is
different - am I overlooking something?

On 28/04/14 01:34, John Lato wrote:
...
It can make a difference in that, with unboxed vectors, the compiler can
statically determine that it is able to use unboxed values, and
therefore is more likely to do so.  Having finally broken down and run
some tests, I can report that on my system using V.enumFromTo with
unboxed vectors results in the same performance as the hand-written loop.
I cannot see a difference between Vector.enumFromTo and
Vector.Unboxed.enumFromTo in my benchmark.

Vector.enumFromTo is as fast as the hand-written loop, but only for
`Int`. For `Word32`, it is 5 times slower. Any idea why?
...
> I would expect it's because you never force the argument.  With
    > `enumFromTo` the argument is forced because it needs to be checked for
    > termination, but `enumFromN` is probably building up a big chain of
    > thunks.  I guess for this case `enumFromN` has no benefit over
    > `enumFromTo` because the intention is to create a single loop
    instead of
    > actually allocating the vector, so the warning in the documentation
    > doesn't necessarily apply.
Also haven't checked that yet, but I suspect that instead of something
    thunk-related, the thing plainly allocates the vector.
Just to clarify: `V.enumFromTo` works much better than `V.enumFromN`
    because in contrast to the latter it doesn't actually try to create the
    fully sized vector.
I believe that if you check this with ghc-7.6.3 and -O2, you will
discover that my analysis is correct :)
Ok, I think I understand now what you mean.

Re: [Haskell-cafe] How to write fast for loops

Niklas Hambüchen