
On Tue, Jan 03, 2012 at 11:00:58AM +0000, Simon Marlow wrote:
On 21/12/2011 22:36, Roman Cheplyaka wrote:
* Ian Lynagh [2011-12-21 18:29:21+0000]
* The profiling and hpc implementations have been merged and overhauled. Visible changes include renaming of profiling flags: http://www.haskell.org/ghc/dist/stable/docs/html/users_guide/flag-reference.... and the cost-centre stacks have a new semantics, which should in most cases result in more useful and intuitive profiles. The +RTS -xc flag now also gives a stack trace.
Where can we learn more about the new semantics?
I haven't written down the semantics formally yet, I'm afraid, and it may yet change. However, you should find that the changes give more intuitive results, and profiling is now more robust to compiler optimisations.
There are a few visible changes. One is that -auto-all will label nested functions by default now (but you can revert to the previous behaviour of labelling only top-level functions with -fprof-auto-top).
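As a rough illustration of the difference, here is a minimal sketch (the names `mean` and `go` are invented for this example and do not come from the thread). Built with -prof -fprof-auto, both the top-level `mean` and the nested `go` should get their own cost centres; built with -prof -fprof-auto-top, only the top-level bindings should be labelled.

```haskell
-- Hypothetical example: 'mean' is a top-level function, 'go' is nested.
mean :: [Double] -> Double
mean xs = total / fromIntegral count
  where
    (total, count) = go 0 (0 :: Int) xs
    -- Nested helper: expected to be labelled under -fprof-auto,
    -- but not under -fprof-auto-top.
    go s n []     = (s, n)
    go s n (y:ys) = go (s + y) (n + 1) ys

main :: IO ()
main = print (mean [1 .. 1000])
```

Assuming the file is called Main.hs, something like `ghc -prof -fprof-auto -rtsopts Main.hs` followed by `./Main +RTS -p` should produce a Main.prof to compare against a build with -fprof-auto-top.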
The labeling of nested functions has been very convenient for me. It has made tracking down stack overflows due to the evaluation of large thunk chains much easier - I don't have to manually add SCCs everywhere. I can't even imagine how many hours this, combined with stack traces, has saved me compared to the debugging/profiling tools available in 7.0.x.
Another visible change is that higher-order functions are now pushed on the stack when they are called, not when they are referenced. For example, in "map f xs" you will see that map calls f, whereas previously f would be shown as a child of the function containing the call to map (sometimes!).
This also means that the costs of calling a higher-order function are always part of the aggregate costs of the caller, rather than being attributed to the higher-order function itself. For example, if you have
  f xs ys = {-# SCC "map1" #-} map g xs ++ {-# SCC "map2" #-} map g ys
    where g = ...
then you'll see that map1 calls g and map2 calls g, and the costs of calling g on the elements of xs are recorded separately from the costs of calling g on the elements of ys. Previously all the costs of g would be attributed to g itself, which is much less useful.
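For anyone who wants to try this, a self-contained version of the example might look like the sketch below. The body of `g` is a stand-in, since the original message elides it, and the parentheses around each annotated expression are added here only to make the intended scope of each SCC explicit.

```haskell
-- Sketch of the example above; the definition of 'g' is an assumption.
f :: [Int] -> [Int] -> [Int]
f xs ys = ({-# SCC "map1" #-} map g xs) ++ ({-# SCC "map2" #-} map g ys)
  where
    -- Placeholder work so that 'g' has some cost to attribute.
    g x = x * x + 1

main :: IO ()
main = print (sum (f [1 .. 100000] [1 .. 200000]))
```

Built with -prof and run with +RTS -p, the profile would be expected to show g once under map1 and once under map2, although with optimisation on g may be inlined and not show up as a separate entry.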
I noticed this in my profiles and it has also been really helpful.
I'd be interested in hearing feedback, particularly if you find a case where costs are attributed somewhere that you didn't expect, or the stack looks wrong.
This might be the expected behavior but I'll ask anyway. I have what seems to be a legitimate stack overflow (due to excessive recursion and not the evaluation of a big thunk). The stack trace from -xc only shows about 13 calls on the stack (with each function that is called only appearing once). Is it by design that functions only appear once?
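A small program along these lines might reproduce the situation. It is only a sketch: `count` is a made-up function, not the code that actually overflowed. The recursion is not a tail call, so every call keeps a frame on the stack until the base case is reached.

```haskell
-- Hypothetical repro: deep, non-tail recursion (not a thunk chain).
count :: Int -> Int
count 0 = 0
count n = 1 + count (n - 1)

main :: IO ()
main = print (count 10000000)
```

Built with -prof -fprof-auto -rtsopts and run with something like `+RTS -K1m -xc`, this should overflow the stack, and the trace can then be checked for whether `count` is listed once or once per call.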