
It seems I have two options:
A) compile without profiling support and run the compiled program with +RTS -sstderr
B) compile with profiling support (-prof -auto-all) and run the compiled program with +RTS -p -sstderr
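Concretely, the two setups might look like the sketch below; the source file Main.hs, the executable name prog, and the -O2 flag are my assumptions, not taken from the thread:

    # Option A: no profiling support; +RTS -sstderr gives the
    # mutator/GC time breakdown only.
    ghc -O2 Main.hs -o prog
    ./prog +RTS -sstderr

    # Option B: profiling support; +RTS -p additionally writes
    # per-cost-centre times to prog.prof.
    ghc -O2 -prof -auto-all Main.hs -o prog
    ./prog +RTS -p -sstderr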
In case A, I get a good measure of GC vs. mutator time, but I don't know how much time is spent in individual functions, so I can't separate the mutator time spent in the functions that really interest me from the time spent in the test harness.
In case B, I can separate the mutator time spent in individual functions, but I have the impression that the GC time I then get includes the GC caused by the profiler, and is hence useless to me: different GC times for different algorithms might just mean that one algorithm incurs more profiling overhead.
If you run a profiled program with +RTS -sstderr, the time breakdown includes an extra category, PROF, which counts the time spent in the heap profiler. The amount of GC time consumed by the profiled program will indeed differ from the unprofiled program because of profiling overheads; there's no way around this, I'm afraid. But you may find that the ratio of mutator to GC time in the profiled program is similar to that in the unprofiled program (I'd be interested to know whether this is or is not the case).

To get the most reliable measure of GC time in a profiled program, do not turn on heap profiling, because this will cause extra GCs to be performed and will inflate the GC time. Turning on time profiling should be OK.

Cheers,
Simon
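Following that advice, a measurement run might look like the sketch below (again, Main.hs, prog, and -O2 are placeholder assumptions). The key point is that -p is passed but no heap-profiling flag such as -hc or -hy:

    # Compile with profiling support.
    ghc -O2 -prof -auto-all Main.hs -o prog

    # Time profiling only: -p produces prog.prof with per-cost-centre
    # times. No -h* flag is given, so heap profiling stays off and the
    # profiler forces no extra GCs that would inflate the GC figure.
    ./prog +RTS -p -sstderr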