http://book.realworldhaskell.org/read/profiling-and-optimization.html

Would be a start

On Monday, 28 April 2014, Mark Thom <markjordanthom@gmail.com> wrote:
Has anyone written a blog post to the effect of "Haskell: The Slow Parts"? Or a heuristic for reliably identifying them, maybe? It sounds as though I could really benefit from it. I had no idea about forM.


On Sun, Apr 27, 2014 at 10:24 PM, John Lato <jwlato@gmail.com> wrote:
On Sun, Apr 27, 2014 at 7:23 PM, Niklas Hambüchen <mail@nh2.me> wrote:
I've just uploaded my benchmarks to https://github.com/nh2/loop.

Please take a look. There are some interesting results.


The first thing I don't understand at all is:


http://htmlpreview.github.io/?https://github.com/nh2/loop/blob/master/results/bench.html

See how w32/loop and w32/unsafeLoop are equally fast. Then look at


http://htmlpreview.github.io/?https://github.com/nh2/loop/blob/master/results/bench-traverse-w32.html

Here I run the same thing over the whole of Word32.
See how `loop` is faster here than `unsafeLoop`? How does that make sense?

Huh?  In the comments you wrote:

-- Note that some types (e.g. Word32) have bounds checks even for
-- `toEnum`. 

Doesn't that explain it?  For Int, toEnum/fromEnum is a noop, but on Word32 it's not.


Next thing:

It seems that V.enumFromTo and V.fromList are actually the same:


http://hackage.haskell.org/package/vector-0.10.9.1/docs/src/Data-Vector-Fusion-Stream-Monadic.html#enumFromTo

However in my benchmark, at least for `Int`, the performance is
different - am I overlooking something?

Probably with V.fromList, the list gets floated out?  Just guessing, check the core!


On 28/04/14 01:34, John Lato wrote:
> It can make a difference in that, with unboxed vectors, the compiler can
> statically determine that it is able to use unboxed values, and
> therefore is more likely to do so.  Having finally broken down and run
> some tests, I can report that on my system using V.enumFromTo with
> unboxed vectors results in the same performance as the hand-written loop.

I cannot see a difference between Vector.enumFromTo and
Vector.Unboxed.enumFromTo in my benchmark.

Vector.enumFromTo is as fast as the hand-written loop, but only for
`Int`. For `Word32`, it is 5 times slower. Any idea why?

Ahh, you made me look at the core again.  I think this is related to your observation about V.enumFromTo being the same as V.fromList.  With Word32 the generated core shows that this goes via a list representation instead of a nice loop.  Which makes me suspect there's some RULE that applies to Stream.enumFromTo that is firing in the first case but not the second.  And if I build both versions with -ddump-rule-firings, indeed I see that the Int version has
Rule fired: enumFromTo<Int> [Stream] 
With nothing comparable for the Word32 version.  I'd imagine if you grep for that in the Vector sources, you'd find something interesting.

The EnumFromN version does not seem to suffer from this (but again it's necessary to evaluate the argument).

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe