
On 03/04/2012 00:46, Ben Lippmeier wrote:
On 02/04/2012, at 10:10 PM, Jurriaan Hage wrote:
Can anyone tell me what the exact difference is between "1,842,979,344 bytes maximum residency (219 sample(s))" and "4451 MB total memory in use (0 MB lost due to fragmentation)"?
I could not find this information in the docs anywhere, but I may have missed it.
The "maximum residency" is the peak amount of live data in the heap. The "total memory in use" is the peak amount that the GHC runtime requested from the operating system. Because the runtime system ensures that the heap is always bigger than the size of the live data, the second number will be larger.
The maximum residency is determined by performing a garbage collection, which traces out the graph of live objects. This means that the number reported may not be the exact peak memory use of the program, because objects could be allocated and then become unreachable before the next sample. If you want a more accurate number then increase the frequency of the heap sampling with the -i<sec> RTS flag.
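Both figures can also be read from inside a running program. The following is a minimal sketch, assuming a GHC recent enough to ship `getRTSStats` in `GHC.Stats` (that API is newer than this thread), and assuming the program is built with `-rtsopts` and run with `+RTS -T` so that statistics are collected. The workload and the helper `overheadRatio` are hypothetical and only there for illustration.

```haskell
import Data.Word (Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Mem (performGC)

-- | Hypothetical helper: peak memory requested from the OS divided by
-- peak live data, i.e. the memory manager's overhead factor.
overheadRatio :: Word64 -> Word64 -> Double
overheadRatio maxLive maxInUse = fromIntegral maxInUse / fromIntegral maxLive

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are disabled; rerun with +RTS -T"
    else do
      -- Stand-in workload (an assumption, replace with the real program).
      let xs = [1 .. 1000000] :: [Int]
      print (sum xs)
      -- Force a major collection so at least one residency sample is taken.
      performGC
      print (length xs)
      stats <- getRTSStats
      let live  = max_live_bytes stats        -- "maximum residency"
          inUse = max_mem_in_use_bytes stats  -- "total memory in use"
      putStrLn ("maximum residency (bytes):   " ++ show live)
      putStrLn ("total memory in use (bytes): " ++ show inUse)
      putStrLn ("ratio:                       " ++ show (overheadRatio live inUse))
```

As with the `-s` output, the residency figure is only updated at major collections, so it can miss a short-lived peak between samples.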
To put it another way, the difference between "maximum residency" and "total memory in use" is the overhead imposed by the runtime's memory manager.

Typically for the default settings the total memory in use will be about three times the maximum residency, because the runtime is using copying GC. If your maximum residency is L (for Live data), and we let the heap grow to size 2L before doing a GC (the 2 can be tuned with the -F flag), and we need another L to copy the live data into, then we need in total 3L.

This assumes that the live data remains constant, which it doesn't in practice, hence the overhead is not always exactly 3L. Generational GC also adds some memory overhead, but with the default settings it is limited to at most 1MB (512KB for the nursery, and another 512KB for aging).

Cheers,
Simon
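The 3L estimate above can be checked against the numbers in the original question. This is only a sketch of the arithmetic, not a model of the collector: `estimateTotal` is a hypothetical helper, and the conversion assumes that "MB" in the RTS output means 2^20 bytes.

```haskell
-- | Hypothetical helper: estimated heap size for a copying collector.
-- The heap grows to factor * live before a major GC (factor is the -F
-- value), and one more copy of the live data is needed as the target.
estimateTotal :: Double -> Double -> Double
estimateTotal factor live = (factor + 1) * live

main :: IO ()
main = do
  let liveBytes  = 1842979344 :: Double  -- maximum residency from the question
      reportedMB = 4451 :: Double        -- total memory in use from the question
      mib        = 1024 * 1024           -- assumption: "MB" means 2^20 bytes
      liveMB     = liveBytes / mib
  putStrLn ("live data (MB):          " ++ show liveMB)
  putStrLn ("estimate with -F2 (MB):  " ++ show (estimateTotal 2 liveMB))
  putStrLn ("reported / live:         " ++ show (reportedMB / liveMB))
```

For these figures the reported total is below the 3L estimate, which is consistent with the caveat that the live data does not stay constant over the run.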