
I've just uploaded my benchmarks to https://github.com/nh2/loop. Please take a look. There are some interesting results. The first thing I don't understand at all is: http://htmlpreview.github.io/?https://github.com/nh2/loop/blob/master/result... See how w32/loop and w32/unsafeLoop are equally fast. Then look at http://htmlpreview.github.io/?https://github.com/nh2/loop/blob/master/result... Here I run the same thing over the whole of Word32. See how `loop` is faster here than `unsafeLoop`? How does that make sense? Next thing: It seems that V.enumFromTo and V.fromList are actually the same: http://hackage.haskell.org/package/vector-0.10.9.1/docs/src/Data-Vector-Fusi... However in my benchmark, at least for `Int`, the performance is different - am I overlooking something? On 28/04/14 01:34, John Lato wrote:
It can make a difference in that, with unboxed vectors, the compiler can statically determine that it is able to use unboxed values, and therefore is more likely to do so. Having finally broken down and run some tests, I can report that on my system using V.enumFromTo with unboxed vectors results in the same performance as the hand-written loop.
I cannot see a difference between Vector.enumFromTo and Vector.Unboxed.enumFromTo in my benchmark. Vector.enumFromTo is as fast as the hand-written loop, but only for `Int`. For `Word32`, it is 5 times slower. Any idea why?
> I would expect it's because you never force the argument. With > `enumFromTo` the argument is forced because it needs to be checked for > termination, but `enumFromN` is probably building up a big chain of > thunks. I guess for this case `enumFromN` has no benefit over > `enumFromTo` because the intention is to create a single loop instead of > actually allocating the vector, so the warning in the documentation > doesn't necessarily apply.
Also haven't checked that yet, but I suspect that instead of something thunk-related, the thing plainly allocates the vector.
Just to clarify: `V.enumFromTo` works much better than `V.enumFromN` because in contrast to the latter it doesn't actually try to create the fully sized vector.
I believe that if you check this with ghc-7.6.3 and -O2, you will discover that my analysis is correct :)
Ok, I think I understand now what you mean.