RE: profile not showing it all?

On 09 September 2005 15:58, Niels wrote:
Bulat Ziganshin wrote:
> I don't know what value the -xc option shows, but memory usage in GHC-compiled programs is increased because the "copying" garbage collection algorithm is used by default. Using the option "+RTS -c" will significantly reduce memory usage, which will make the program faster if it currently needs swapping. You can also, for example, force the compacting algorithm to be used once memory usage grows beyond 200 mb with the options "+RTS -M1G -c20". See "4.14.2. RTS options to control the garbage collector" in the GHC docs for details.

Running the profiling binary (that was compiled with -prof -auto-all) with the -c option resulted in:
top: showing a max of 161mb memory usage
profiled-graph: showing a peak at a little more than 80mb
I guess these results are more or less normal? It does, however, mean that my data structures are pretty harsh on the garbage collector, doesn't it?
The difference between the residency displayed by the heap profiler and the actual memory residency can be attributed to several factors (some of which have already been mentioned, I thought I'd list them for completeness):

- overhead of profiling itself, which currently runs at 2 extra words per heap object. At a guess, that probably results in about a 30% overhead.

- generational garbage collection. A major GC in the standard copying collector will usually require 3L bytes of memory, where L is the amount of live data. This is because by default (see the +RTS -F option) we allow the old generation to grow to twice its size (2L) before collecting it, and we require additionally L bytes to copy the live data into. When using compacting collection, this is reduced to 2L, and can further be reduced by tweaking the -F option. Additionally the allocation area (generation 0) takes 512Kb by default.

- The stack isn't counted in the heap profile by default. See the +RTS -xt option.

- The program text itself, the C stack, any non-heap data (eg. data allocated by foreign libraries, and data allocated by the RTS), and mmap()'d memory are not counted in the heap profile.

Cheers,
Simon
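To make the arithmetic in the list above concrete, here is a minimal sketch that turns those rules of thumb into a rough estimate of peak heap memory during a major GC. The names (`Collector`, `estimatePeakMb`, and so on) are hypothetical and not part of GHC or its RTS. The 1.3 factor is only the "about 30%" guess quoted above, and the stack, C stack, program text, and foreign allocations are deliberately left out, so real process sizes as reported by top can differ considerably.

```haskell
module Main where

-- Which strategy the old generation uses for a major GC.
data Collector = Copying | Compacting
  deriving (Show, Eq)

-- Peak memory as a multiple of the live data L with the default -F
-- setting: 3L for the copying collector, 2L for the compacting one.
peakFactor :: Collector -> Double
peakFactor Copying    = 3
peakFactor Compacting = 2

-- Assumption: profiling adds roughly 30% (a guess, not a measurement).
profilingOverhead :: Double
profilingOverhead = 1.3

-- Default allocation area (generation 0): 512Kb, expressed in MB.
allocAreaMb :: Double
allocAreaMb = 0.5

-- Rough peak heap estimate in MB for a given amount of live data in MB.
estimatePeakMb :: Collector -> Bool -> Double -> Double
estimatePeakMb gc profiled liveMb =
  peakFactor gc * liveMb * overhead + allocAreaMb
  where
    overhead = if profiled then profilingOverhead else 1.0

main :: IO ()
main = mapM_ report [(gc, p) | gc <- [Copying, Compacting], p <- [False, True]]
  where
    liveMb = 90  -- hypothetical input: live data of about 90 MB
    report (gc, p) =
      putStrLn (show gc ++ (if p then " (profiled): " else ": ")
                ++ show (estimatePeakMb gc p liveMb) ++ " MB")
```

These figures are rough upper bounds for the moment of a major collection, not predictions of what top will show at any given time.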

Simon Marlow wrote:
> - overhead of profiling itself, which currently runs at 2 extra words per heap object. At a guess, that probably results in about a 30% overhead.

top also shows 161mb for the non-profiling binary (the same as the profiling binary). This strikes me as odd, because the profiling binary *should* take more memory (because of all the reasons mentioned in this thread).
> - generational garbage collection. A major GC in the standard copying collector will usually require 3L bytes of memory, where L is the amount of live data. This is because by default (see the +RTS -F option) we allow the old generation to grow to twice its size (2L) before collecting it, and we require additionally L bytes to copy the live data into. When using compacting collection, this is reduced to 2L, and can further be reduced by tweaking the -F option. Additionally the allocation area (generation 0) takes 512Kb by default.

With -c, top records 149mb max (it just grows more gradually), and without -c, top records 161mb max.
I used the -B setting to sound a bell at the beginning of a major GC, so I could relate the memory growth showing up in top to the start of a GC. However, I'm not hearing anything (also -t, which should show GC stats after running the program, is not showing anything). Does this mean there was no GC?

Niels.

These are the latest numbers concerning the space behaviour of my application. Some seem very weird to me. I tested the binaries on two different hosts.

host1: Pentium4 2,33 Ghz, 512 mb, 1 gig swap. (linux) GHC version 6.2.1

                            | top shows VIRT/RSS/TIME | hc graph shows
----------------------------------------------------------------------
non-profiling bin (RTS -c)  | 144mb / 144mb / 1:39    | n.a.
non-profiling bin           | 161mb / 161mb / 1:32    | n.a.
profiling bin (RTS -hc -c)  | 188mb / 188mb / 13:59   | 90 mb
profiling bin (RTS -hc)     | 295mb / 295mb / 19:05   | 90 mb

host2: Pentium4 2,00 Ghz, 512 mb, 1 gig swap. (linux) GHC version 6.4

                            | top shows VIRT/RSS/TIME | hc graph shows
----------------------------------------------------------------------
non-profiling bin (RTS -c)  | 148mb / 144mb / 2:12    | n.a.
non-profiling bin           | 165mb / 161mb / 1:60    | n.a.
profiling bin (RTS -hc -c)  | 621mb / 420mb / 22:06   | 90 mb
profiling bin (RTS -hc)     | 300mb / 295mb / 17:11   | 90 mb

Another funny fact is that the profiling bin (RTS -hc -c) could not finish on my own host (Athlon XP 2200, 512mb, 1 gig swap. (debian) GHC version 6.4) because I ran out of 1 gig of virtual memory and was using 450mb real mem at that moment.

The first thing to notice is that in all cases the heap graph was the same (ignoring the time axis). What struck me as surprising was the enormous difference in memory consumption between the 2 hosts (which have almost similar hardware, but different GHC versions). This raises some questions I can't answer...

1) Can this difference be explained by the version difference of GHC!?

2) How can it be that my own host consumes a ridiculous amount of memory in comparison to host1? I couldn't even finish the run because I ran out of 1 gig of swap and 512mb memory, while host1 finished the job consuming 188mb (both VIRT and RSS) in 13:59. Even IF the difference can be explained by the version of GHC, the difference with host2 (which also has GHC 6.4) is still big.

Niels.
PS: of course the binaries were the same everywhere, and so was the test file, etc.
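One way to compare the profiling runs in the table above is to divide the RSS reported by top by the 90 mb peak of the heap graph. The sketch below does only that arithmetic. The figures are copied from the table, while the names `runs` and `ratio` are hypothetical helpers made up for this illustration.

```haskell
module Main where

import Text.Printf (printf)

-- (host, RTS flags, RSS in MB from top, heap-graph peak in MB),
-- copied from the profiling rows of the table.
runs :: [(String, String, Double, Double)]
runs =
  [ ("host1", "-hc -c", 188, 90)
  , ("host1", "-hc",    295, 90)
  , ("host2", "-hc -c", 420, 90)
  , ("host2", "-hc",    295, 90)
  ]

-- How many times larger the resident size is than the profiled heap.
ratio :: Double -> Double -> Double
ratio rssMb graphMb = rssMb / graphMb

main :: IO ()
main = mapM_ line runs
  where
    line :: (String, String, Double, Double) -> IO ()
    line (host, flags, rssMb, graphMb) =
      printf "%s %-7s RSS/graph = %.2f\n" host flags (ratio rssMb graphMb)
```

The runs without -c give the same ratio on both hosts, so the outlier is the -c run on host2.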
participants (2)
-
Niels van der Velden
-
Simon Marlow