RE: Re[2]: garbage collection

On 20 April 2005 15:56, Bulat Ziganshin wrote:
Tuesday, April 19, 2005, 4:15:53 PM, you wrote:
1) Can you add disableGC and enableGC procedures? This can significantly improve performance in some cases.
Sure. I imagine you want to do this to avoid a major collection right at the peak of a residency spike.
You probably only want to disable major collections though: it's safe for minor collections to happen.
No, in that particular case I have a very simple and fast algorithm which allocates plenty of memory. Minor GCs in such a situation are just a waste of time, so I want to do:
disableGC
result <- eatMemory
enableGC
with the effect that all memory allocated in the 'eatMemory' procedure is garbage collected only after returning from it. Currently I have these stats:
  INIT    time    0.01s  (  0.00s elapsed)
  MUT     time    0.57s  (  0.60s elapsed)
  GC      time    1.41s  (  1.41s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    1.99s  (  2.01s elapsed)

  %GC time      70.8%  (70.1% elapsed)

  Alloc rate    171,249,142 bytes per MUT second

  Productivity  28.7% of total user, 28.4% of total elapsed
As you can see, it is very inefficient.
I see (I think). Unfortunately, the size of the allocation area is currently fixed after a GC, so you'll have to change the code in the runtime to keep allocating more blocks for the nursery.
I guess you're proposing using madvise(MADV_FREE) (or whatever the equivalent is on your favourite OS). This would certainly be a good idea if the program is swapping, but it might impose an overhead when running in memory. I don't know, I haven't tried.
I don't see any reason why this would be slower. We would be "good citizens": return memory that is not used at the moment and reallocate it when needed.
It might be slower because it involves extra calls to the kernel to free/allocate memory, and the kernel has to update its page tables. I mentioned madvise() above: this is a compromise solution which involves telling the kernel that the data in memory is not relevant, but doesn't actually free the memory. The kernel is free to discard the pages if memory gets tight, without actually swapping them to disk. When the memory is faulted in again, it gets filled with zeros. This is ideal for copying GC: you madvise() the semispace you just copied from, because it contains junk. IIRC, madvise() is a BSD-ish interface, but other OSs probably have similar facilities. We could also consider really returning memory to the OS. This requires more work in the runtime, though.
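A minimal sketch of that idea for a copying collector's from-space, assuming a POSIX-style madvise() (the names discard_from_space, from_space and from_space_bytes are illustrative, not GHC's actual code):

/* A sketch only: after a copying collection the from-space contains junk,
 * so the kernel may drop those pages.  The mapping stays in place; for
 * anonymous memory, pages touched again later come back zero-filled. */
#define _DEFAULT_SOURCE        /* expose madvise() on glibc */
#include <stddef.h>
#include <sys/mman.h>

static void discard_from_space(void *from_space, size_t from_space_bytes)
{
    madvise(from_space, from_space_bytes, MADV_DONTNEED);
}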
The current implementation only allows memory usage to grow, and that is not perfect either. IMHO it would be better to release unneeded memory after a major GC and perform the next major GC after allocating a fixed amount of memory or, say, after doubling the used memory area.
GHC has quite a sophisticated block-based storage manager. It's not obvious how to understand your comments in the context of GHC - I suggest you take a look at the source code. Cheers, Simon

On Thu, 2005-04-21 at 10:57 +0100, Simon Marlow wrote:
I mentioned madvise() above: this is a compromise solution which involves telling the kernel that the data in memory is not relevant, but doesn't actually free the memory. The kernel is free to discard the pages if memory gets tight, without actually swapping them to disk. When the memory is faulted in again, it gets filled with zeros. This is ideal for copying GC: you madvise() the semispace you just copied from, because it contains junk.
IIRC, madvise() is a BSD-ish interface, but other OSs probably have similar facilities.
Linux and Solaris have this interface (Solaris possibly with different flags, MADV_DONTNEED/MADV_FREE). And there is also a standardised posix_madvise() (that no-one seems to support!). That probably covers it for the Unix-like systems (Linux, Solaris, *BSD, Darwin). I don't know about win32. Duncan
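A rough portability wrapper over those flags might look like the sketch below (the name advise_unused is made up, and the fallback chain is just the obvious one, not anything GHC actually does):

#define _DEFAULT_SOURCE        /* expose madvise() on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hint that the pages in [addr, addr+len) hold only junk. */
static void advise_unused(void *addr, size_t len)
{
#if defined(MADV_FREE)
    madvise(addr, len, MADV_FREE);                  /* *BSD, Darwin, newer Linux */
#elif defined(MADV_DONTNEED)
    madvise(addr, len, MADV_DONTNEED);              /* classic Linux behaviour */
#elif defined(POSIX_MADV_DONTNEED)
    posix_madvise(addr, len, POSIX_MADV_DONTNEED);  /* the standardised spelling */
#else
    (void)addr; (void)len;                          /* nothing available: no-op */
#endif
}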

Hello Duncan, Thursday, April 21, 2005, 5:36:28 PM, you wrote: DC> On Thu, 2005-04-21 at 10:57 +0100, Simon Marlow wrote:
I mentioned madvise() above: this is a compromise solution which involves telling the kernel that the data in memory is not relevant, but doesn't actually free the memory. The kernel is free to discard the pages if memory gets tight, without actually swapping them to disk. When the memory is faulted in again, it gets filled with zeros. This is ideal for copying GC: you madvise() the semispace you just copied from, because it contains junk.
IIRC, madvise() is a BSD-ish interface, but other OSs probably have similar facilities.
DC> Linux and Solaris have this interface (Solaris possibly with different flags, MADV_DONTNEED/MADV_FREE).
DC> And there is also a standardised posix_madvise() (that no-one seems to support!).
DC> That probably covers it for the Unix-like systems (Linux, Solaris, *BSD, Darwin). I don't know about win32.

It seems that madvise() is an ideal solution for the systems that support it; for other OSes we can use unmap+map. The drawback is that the OS then wastes time filling the reallocated memory with zeros (at least VirtualFree+VirtualAlloc under Windows XP does), so this would add 5-10% to the cost of major GCs.

As far as I can see in GC.c, we can't free the old memory before all the new memory has been allocated, so in the case of a physical memory shortage this algorithm would be worse than compacting?

Also, while reading MBlock.c, I wonder why you align megablocks to a megabyte boundary - maybe 8-byte alignment would be enough? Megablocks are never copied or allocated as a whole, and even in that case aligning to a CPU cache line boundary would be enough.

Another strange thing is that under win32 (and only win32), without the "+RTS -M" option we are restricted to 256 Mbytes of heap. Look at this code:
if ( (base_non_committed == 0) || (next_request + size > end_non_committed) ) {
    if (base_non_committed) {
        /* Tacky, but if no user-provided -M option is in effect,
         * set it to the default (==256M) in time for the heap overflow PSA.
         */
        if (RtsFlags.GcFlags.maxHeapSize == 0) {
            RtsFlags.GcFlags.maxHeapSize = size_reserved_pool / BLOCK_SIZE;
        }
        heapOverflow();
    }
I think it should be:
if ( (base_non_committed == 0) || (next_request + size > end_non_committed) ) {
    if (base_non_committed && RtsFlags.GcFlags.maxHeapSize) {
        heapOverflow();
    }
in order to allow programs without the "+RTS -M" option to allocate more than one 256-Mbyte chunk.

-- Best regards, Bulat mailto:bulatz@HotPOP.com
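To illustrate the VirtualFree+VirtualAlloc zeroing behaviour mentioned above, here is a small standalone Win32 sketch (this is not the program attached to the original mail, just an illustration using the documented MEM_DECOMMIT/MEM_COMMIT calls; error checking is mostly omitted):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T size = 1 << 20;   /* one 1-Mbyte chunk, like a GHC megablock */

    /* Reserve address space and commit storage for it. */
    char *p = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (p == NULL) return 1;
    p[0] = 42;

    /* Decommit: the address range stays reserved, the storage is given back. */
    VirtualFree(p, size, MEM_DECOMMIT);

    /* Re-commit the same range: Windows hands back zero-filled pages. */
    VirtualAlloc(p, size, MEM_COMMIT, PAGE_READWRITE);
    printf("after re-commit: p[0] = %d\n", p[0]);   /* prints 0, not 42 */

    VirtualFree(p, 0, MEM_RELEASE);
    return 0;
}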

Hello Simon, Thursday, April 21, 2005, 1:57:20 PM, you wrote: SM> On 20 April 2005 15:56, Bulat Ziganshin wrote:
1) Can you add disableGC and enableGC procedures? This can significantly improve performance in some cases.
SM> I see (I think). Unfortunately, the size of the allocation area is currently fixed after a GC, so you'll have to change the code in the runtime to keep allocating more blocks for the nursery.

So it is either impossible or too hard to do, as I understand you?
I don't see any reason why this would be slower. We would be "good citizens": return memory that is not used at the moment and reallocate it when needed.
SM> It might be slower because it involves extra calls to the kernel to free/allocate memory, and the kernel has to update its page tables.

It is very fast, at least on my Win XP box (120 thousand 1-Mbyte blocks are unmapped/mapped in one second! You can try the included program yourself). The real problem is that Windows goes on to zero all the memory it returns to our program.

So the best solution, I think, would be the current one, plus madvise() for systems that support it, plus one small but important change: the current code switches to compacting when more than 30% of RtsFlags.GcFlags.maxHeapSize is used. It should calculate 30% of PHYSICAL memory instead, since maxHeapSize is meant to limit VIRTUAL memory usage.

-- Best regards, Bulat mailto:bulatz@HotPOP.com
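A sketch of how that threshold could be tied to physical RAM rather than to maxHeapSize, assuming the usual OS queries (the names physical_memory_bytes and should_compact are invented for illustration; the real decision lives inside the RTS):

#include <stdint.h>

#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Amount of physical RAM in bytes, or 0 if it cannot be determined.
 * (_SC_PHYS_PAGES is widespread but not strictly POSIX.) */
static uint64_t physical_memory_bytes(void)
{
#if defined(_WIN32)
    MEMORYSTATUSEX ms;
    ms.dwLength = sizeof(ms);
    if (GlobalMemoryStatusEx(&ms))
        return ms.ullTotalPhys;
    return 0;
#else
    long pages     = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGE_SIZE);
    if (pages > 0 && page_size > 0)
        return (uint64_t)pages * (uint64_t)page_size;
    return 0;
#endif
}

/* E.g. switch to compacting collection once live data exceeds 30% of RAM. */
static int should_compact(uint64_t live_bytes)
{
    uint64_t phys = physical_memory_bytes();
    return phys != 0 && live_bytes > phys * 30 / 100;
}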