
People,
I describe below my small experience with the time profiling in
ghc-6.6.
To make sure, consider first the following simple example:
-------------------------------------------------------
import List (sort)
main = {-# SCC "main" #-}
let n = 5*10^4 :: Int
in
putStr (shows (f [1 .. n]) "\n")
f :: [Int] -> Int
f xs =
{-# SCC "f" #-}
case xs of [] -> 0
x: ys -> (g x) + (h x) + (f ys)
where
g x = {-# SCC "g" #-} last $ sort [0 .. (rem x 10)]
h x = {-# SCC "h" #-} last $ sort [0 .. (rem x 20)]
-------------------------------------------------------
Making: ghc -O -prof --make Main
Running: ./Main +RTS -p -M100m -RTS
This produces the report (its indentation shows which center is called
from which):
--------------------------------------------------------------------
total time = 3.00 secs (60 ticks @ 50 ms)
total alloc = 130,608,300 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
h Main 65.0 67.3
g Main 33.3 30.9
main Main 1.7 1.5
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 1 0 0.0 0.0 100.0 100.0
main Main 162 1 0.0 0.0 0.0 0.0
CAF Main 156 3 0.0 0.0 100.0 100.0
main Main 163 0 1.7 1.5 100.0 100.0
f Main 164 50001 0.0 0.3 98.3 98.5
h Main 166 50000 65.0 67.3 65.0 67.3
g Main 165 50000 33.3 30.9 33.3 30.9
CAF GHC.Handle 109 3 0.0 0.0 0.0 0.0
--------------------------------------------------------------------
To my mind, these numbers look all right: like I expected.
Only I do not so firmly understand what is CAF, a
"constant application form", a data which has cost, is assigned cost,
but is out of the stack of calls.
Is this profiling all right?
Then, I need to find the main time eater in a certain concrete
(and complex) example. Its profiling looks strange.
I set SCC in all valuable inner loops in the modules Main.hs
and LemmaSearch.hs.
It shows
---------------------------------------------
LemSearchList +RTS -p -M200m -RTS
total time = 112.00 secs (2240 ticks @ 50 ms)
total alloc = 3,813,365,740 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
CAF Main 95.5 96.1
splitDisjunctScheme LemmaSearch 2.5 2.2
addEquationsWithReduction LemmaSearch 1.3 1.4
---------------------------------------------
My idea was that some of the loops inside LemmaSearch must take
almost all time.
Instead, it shows that some CAF (anonymous, to me) in Main takes
almost everything, and among places which I marked with SCC, the two
are shown as 2.5% and 1.4%, and all the rest shown as zero:
---------------------------------------------------
MAIN MAIN \ 1 0 0.0 0.0 100.0 100.0
CAF Main 2
80 57 96.0 96.1 99.9 99.9
...
---------------------------------------------------
The whole user project is build under -prof
("auto-all" skipped). The test Main.hs is built also under -prof.
SCC are set only inside the functions Main.main and
LemmaSearch.searchLemmataInInitial.
What may consitute this strange CAF cost of 96% ?
Kirsten Chevalier
I didn't look at your code all that carefully, but did you build the GHC libraries with "-prof -auto-all"? (Not just "-prof".) If you don't build the libraries with "-auto-all", then cost centres won't get inserted for library functions, and if it's really a standard library function that's taking all of that time, the profiling report won't indicate that.
I made ghc-6.6 from the official source in a standard way: ./configure ...; make; make install Does this presume that this also generates the .p.o GHC library versions for -prof -auto-all ? Generally, I do not guess, so far, what is happening. Regards, ----------------- Serge Mechveliani mechvel@botik.ru

On 12/6/06, Serge D. Mechveliani
What may consitute this strange CAF cost of 96% ?
Kirsten Chevalier
wrote I didn't look at your code all that carefully, but did you build the GHC libraries with "-prof -auto-all"? (Not just "-prof".) If you don't build the libraries with "-auto-all", then cost centres won't get inserted for library functions, and if it's really a standard library function that's taking all of that time, the profiling report won't indicate that.
I made ghc-6.6 from the official source in a standard way:
./configure ...; make; make install
Does this presume that this also generates the .p.o GHC library versions for -prof -auto-all ?
No; you must have built the profiling libraries (i.e., building the libraries with -prof), otherwise you wouldn't have been able to compile your code for profiling. But, building the profiling libraries in the standard way doesn't add "-auto-all" to the compile flags. So if you want to build the libraries with "-auto-all", do: make EXTRA_HC_OPTS=-auto-all but be careful! As I said in my previous message, adding cost centres disables some optimizations, so the profiling results you get from this may not be accurate with respect to what would happen if you ran your code after building it with -O and no profiling. Cheers, Kirsten -- Kirsten Chevalier* chevalier@alum.wellesley.edu *Often in error, never in doubt "I cannot remember a time when I did not take it as understood that everybody has at least two, if not twenty-two, sides to him."--Robertson Davies

On Wed, Dec 06, 2006 at 03:21:23PM +0000, Kirsten Chevalier wrote:
On 12/6/06, Serge D. Mechveliani
wrote: What may consitute this strange CAF cost of 96% ?
Kirsten Chevalier
wrote I didn't look at your code all that carefully, but did you build the GHC libraries with "-prof -auto-all"? (Not just "-prof".) If you don't build the libraries with "-auto-all", then cost centres won't get inserted for library functions, and if it's really a standard library function that's taking all of that time, the profiling report won't indicate that.
I made ghc-6.6 from the official source in a standard way:
./configure ...; make; make install
Does this presume that this also generates the .p.o GHC library versions for -prof -auto-all ?
No; you must have built the profiling libraries (i.e., building the libraries with -prof), otherwise you wouldn't have been able to compile your code for profiling. But, building the profiling libraries in the standard way doesn't add "-auto-all" to the compile flags. So if you want to build the libraries with "-auto-all", do: make EXTRA_HC_OPTS=-auto-all
Thank you. I am trying this. One could ask, why the GHC installation does not apply -auto-all by default?
but be careful! As I said in my previous message, adding cost centres disables some optimizations, so the profiling results you get from this may not be accurate with respect to what would happen if you ran your code after building it with -O and no profiling.
In my example, -O makes the run 2 times faster, and the profiling time per-cents are shown the same as for -Onot. I suspect that this particular invilisible time eater will present independently on -O flag and on the cost centers. ----------------- Serge Mechveliani mechvel@botik.ru
participants (2)
-
Kirsten Chevalier
-
Serge D. Mechveliani