
Hello,

We've noticed that some applications exhibit significantly worse memory usage when compiled with ghc-7.6.1 compared to ghc-7.4, leading to out-of-memory errors in some cases. Running one app with +RTS -s, I see this:

ghc-7.4
  525,451,699,736 bytes allocated in the heap
   53,404,833,048 bytes copied during GC
       39,097,600 bytes maximum residency (2439 sample(s))
        1,547,040 bytes maximum slop
              628 MB total memory in use (0 MB lost due to fragmentation)

ghc-7.6
  512,535,907,752 bytes allocated in the heap
   53,327,184,712 bytes copied during GC
       40,038,584 bytes maximum residency (2391 sample(s))
        1,456,472 bytes maximum slop
             3414 MB total memory in use (2744 MB lost due to fragmentation)

The total memory in use (consistent with 'top's output) is much higher when built with ghc-7.6, due entirely to fragmentation.

I've filed a bug report (http://hackage.haskell.org/trac/ghc/ticket/7257, http://hpaste.org/74987), but I was wondering if anyone else has noticed this? I'm not entirely sure what's triggering this behavior (some applications work fine), although I suspect it has to do with allocation of pinned memory.

John L.
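To put the two reports side by side, the figures above can be reduced to a few ratios. The following is only an arithmetic sketch; the names `RtsSummary`, `fragmentationShare`, `usefulMB` and `growthFactor` are invented for illustration and are not part of GHC or its RTS.

```haskell
module Main where

-- Hypothetical record holding the two memory totals printed by +RTS -s,
-- both in megabytes.
data RtsSummary = RtsSummary
  { totalInUseMB :: Double  -- "MB total memory in use"
  , lostToFragMB :: Double  -- "MB lost due to fragmentation"
  }

-- Figures copied from the two runs quoted above.
ghc74, ghc76 :: RtsSummary
ghc74 = RtsSummary 628 0
ghc76 = RtsSummary 3414 2744

-- Fraction of the total footprint attributed to fragmentation.
fragmentationShare :: RtsSummary -> Double
fragmentationShare s = lostToFragMB s / totalInUseMB s

-- Memory in use once the fragmentation loss is subtracted.
usefulMB :: RtsSummary -> Double
usefulMB s = totalInUseMB s - lostToFragMB s

-- How many times larger the second footprint is than the first.
growthFactor :: RtsSummary -> RtsSummary -> Double
growthFactor old new = totalInUseMB new / totalInUseMB old

main :: IO ()
main = do
  putStrLn ("fragmentation share, 7.6: " ++ show (fragmentationShare ghc76))
  putStrLn ("non-fragmented MB, 7.4:   " ++ show (usefulMB ghc74))
  putStrLn ("non-fragmented MB, 7.6:   " ++ show (usefulMB ghc76))
  putStrLn ("footprint growth:         " ++ show (growthFactor ghc74 ghc76))
```

Roughly 80% of the 7.6 footprint is reported as fragmentation, and the remaining 670 MB is close to the 628 MB seen with 7.4, which is consistent with the statement that the difference is due entirely to fragmentation.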

So the problem is only with the data structures on the heap that are pinned
in place to play nice with C?
I'd be curious to understand the change too, though pinned memory (a la
Storable or ByteString) will by definition cause some memory fragmentation
in a GC'd language as a rule (or at least in one like Haskell).
-Carter
On Thu, Sep 20, 2012 at 8:59 PM, John Lato wrote:
> I'm not entirely sure what's triggering this behavior (some applications work fine), although I suspect it has to do with allocation of pinned memory.
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Yes, that's my current understanding. I see this with ByteString and
Data.Vector.Storable, but not
Data.Vector/Data.Vector.Unboxed/Data.Text. As ByteStrings are pretty
widely used for IO, I expected that somebody else would have
experienced this too.
I would expect some memory fragmentation with pinned memory, but the
change from ghc-7.4 to ghc-7.6 is rather extreme (no fragmentation to
several GB).
John L.
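The distinction drawn above between pinned and unpinned payloads can be made concrete. This is a minimal sketch, assuming the `bytestring` package that ships with GHC: a strict `ByteString` keeps its payload in pinned memory, while `ShortByteString` (from `Data.ByteString.Short`, which was added to the package after this thread) keeps it in an ordinary unpinned heap object. The workload, which allocates many small strings and retains every 64th, is a hypothetical stress pattern and not the program from the bug report.

```haskell
module Main where

import Control.Exception (evaluate)
import qualified Data.ByteString as B
import qualified Data.ByteString.Short as S

-- Keep the elements at positions 0, k, 2k, ... (k must be positive).
everyNth :: Int -> [a] -> [a]
everyNth _ []         = []
everyNth k xs@(y : _) = y : everyNth k (drop k xs)

-- Allocate n small pinned buffers, then keep only a sparse subset.
-- Each survivor can hold on to the whole block it was allocated in.
pinnedSurvivors :: Int -> IO [B.ByteString]
pinnedSurvivors n = do
  bufs <- mapM (\i -> evaluate (B.replicate 32 (fromIntegral i))) [1 .. n]
  return (everyNth 64 bufs)

-- Same pattern with unpinned payloads, which a copying GC is free to move.
unpinnedSurvivors :: Int -> IO [S.ShortByteString]
unpinnedSurvivors n = do
  bufs <- mapM (\i -> evaluate (S.pack (replicate 32 (fromIntegral i)))) [1 .. n]
  return (everyNth 64 bufs)

main :: IO ()
main = do
  ps <- pinnedSurvivors 200000
  print (length ps)  -- forces the spine so the non-survivors become garbage
  us <- unpinnedSurvivors 200000
  print (length us)
  -- Using both lists here keeps the survivors alive until the end.
  print (sum (map B.length ps), sum (map S.length us))
```

Building each half as a separate program and running it with `+RTS -s` is one way to compare the "lost due to fragmentation" figure between the two representations; the numbers will depend on the GHC version and RTS settings.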
On Fri, Sep 21, 2012 at 10:53 AM, Carter Schonwald wrote:
> So the problem is only with the data structures on the heap that are pinned in place to play nice with C?

On 21/09/2012 04:07, John Lato wrote:
> Yes, that's my current understanding. I see this with ByteString and Data.Vector.Storable, but not Data.Vector/Data.Vector.Unboxed/Data.Text. As ByteStrings are pretty widely used for IO, I expected that somebody else would have experienced this too.
> I would expect some memory fragmentation with pinned memory, but the change from ghc-7.4 to ghc-7.6 is rather extreme (no fragmentation to several GB).

This was a side effect of the improvements we made to the allocation of pinned objects, which ironically were made to avoid fragmentation of a different kind.

What is happening is that the memory for the pinned objects is now taken from the nursery, and so the nursery has to be replenished after GC. When we allocate memory for the nursery we like to allocate it in big contiguous chunks, because that works better with automatic prefetching, but the memory is horribly fragmented due to all the pinned objects, so the large allocation has to be satisfied from the OS.

The fix is not to allocate large chunks for the nursery unless there are no small chunks to use up, so I've implemented that.

Happily I also found two other bugs while looking for this one, one of which was a performance bug which caused this benchmark to run 10x slower than it should have. The other bug was a recent regression causing it to misreport the amount of allocated memory.

Thanks for the report.

Cheers,
Simon
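The two policies described above can be compared in a toy model. This is a sketch under strong simplifying assumptions, not GHC's block allocator: free memory is just a list of free group sizes (in blocks), and `FreeList`, `Outcome`, `allocContiguous` and `allocFromSmall` are names invented here.

```haskell
module Main where

-- Sizes, in blocks, of the free groups currently held by the allocator.
type FreeList = [Int]

-- Blocks that had to be requested from the OS, and the free list afterwards.
data Outcome = Outcome { fromOS :: Int, leftFree :: FreeList }
  deriving (Eq, Show)

-- Model of the problematic behaviour: the nursery is taken as one
-- contiguous group; if no free group is large enough, ask the OS.
allocContiguous :: Int -> FreeList -> Outcome
allocContiguous need free =
  case break (>= need) free of
    (before, g : after) -> Outcome 0 (before ++ [g - need | g > need] ++ after)
    (_, [])             -> Outcome need free

-- Model of the fix: use up existing free groups in list order, however
-- small, and only ask the OS for whatever is still missing at the end.
allocFromSmall :: Int -> FreeList -> Outcome
allocFromSmall need free = go need free
  where
    go 0 rest = Outcome 0 rest
    go n []   = Outcome n []
    go n (g : gs)
      | g <= n    = go (n - g) gs
      | otherwise = Outcome 0 ((g - n) : gs)

main :: IO ()
main = do
  -- 300 free blocks, but split into groups of 3 by surviving pinned objects.
  let fragmented = replicate 100 3
  print (fromOS (allocContiguous 128 fragmented))
  print (fromOS (allocFromSmall 128 fragmented))
```

In this model a 128-block nursery cannot be carved out of 100 free groups of 3 blocks under the contiguous policy, so all 128 blocks come from the OS even though 300 blocks are free; the second policy satisfies the same request entirely from the existing fragments. Repeating the first policy after every GC is what would let the footprint keep growing.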

Simon Marlow writes:
> What is happening is that the memory for the pinned objects is now taken from the nursery, and so the nursery has to be replenished after GC. When we allocate memory for the nursery we like to allocate it in big contiguous chunks, because that works better with automatic prefetching, but the memory is horribly fragmented due to all the pinned objects, so the large allocation has to be satisfied from the OS.

It seems that I was bitten badly by this bug, with productivity being reduced to 30% with 8 threads. While the fix on HEAD has brought productivity back up to the mid-90% mark, runtime for my program has regressed by nearly 40% compared to 7.4.1. It's been suggested that this is the result of the new code generator.

How should I proceed from here? It would be nice to test with the old code generator to verify that the new codegen is in fact the culprit, yet it doesn't seem there is a flag to accomplish this. Ideas?

Cheers,
- Ben
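For readers who want to track the productivity figure from inside a program rather than from the `+RTS -s` summary, here is a minimal sketch. It assumes the `GHC.Stats` interface of recent `base` versions (`getRTSStats`, available from GHC 8.2, so newer than the compilers discussed in this thread), and it approximates productivity as mutator CPU time over mutator plus GC CPU time. The helper `productivity` and the stand-in workload are hypothetical.

```haskell
module Main where

import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Share of CPU time spent running the program itself rather than the GC.
-- Both arguments are durations in the same unit (nanoseconds here).
productivity :: Integer -> Integer -> Double
productivity mutNs gcNs
  | total <= 0 = 0
  | otherwise  = fromIntegral mutNs / fromIntegral total
  where
    total = mutNs + gcNs

main :: IO ()
main = do
  -- Stand-in workload so that there is something to measure.
  print (sum [1 .. 5000000 :: Integer])
  enabled <- getRTSStatsEnabled
  if enabled
    then do
      s <- getRTSStats
      let p = productivity (fromIntegral (mutator_cpu_ns s))
                           (fromIntegral (gc_cpu_ns s))
      putStrLn ("approximate productivity: " ++ show p)
    else putStrLn "statistics are disabled; run with +RTS -T"
```

A value near 0.3 would correspond to the 30% productivity reported above, and a value around 0.95 to the mid-90% mark seen after the fix.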

On 26/09/2012 05:42, Ben Gamari wrote:
> It would be nice to test with the old code generator to verify that the new codegen is in fact the culprit, yet it doesn't seem there is a flag to accomplish this.

I removed the flag yesterday, so as long as you have a GHC from before yesterday you can use -fno-new-codegen to get the old codegen. You might need to compile libraries with the flag too, depending on where the problem is.

I'd be very interested to find out whether the regression really is due to the new code generator, because in all the benchmarking I've done the worst case I found is a program that goes 4% slower, and on average performance is the same as with the old codegen. It is likely that by 7.8.1, with some tweaking, we should be beating the old codegen consistently.

Cheers,
Simon
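One way to carry out the comparison suggested above is to time the same code in two builds, one compiled normally and one compiled with -fno-new-codegen (only on a GHC that still has the flag), and then express the difference as a percentage. The sketch below assumes only `base`; `cpuSeconds`, `slowdownPercent` and `workload` are hypothetical names, and the workload is a placeholder for the real program.

```haskell
module Main where

import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Percentage change of a new measurement relative to a baseline.
-- A positive result means the new build is slower.
slowdownPercent :: Double -> Double -> Double
slowdownPercent baseline new = (new - baseline) / baseline * 100

-- CPU seconds consumed while running the given action.
cpuSeconds :: IO a -> IO Double
cpuSeconds act = do
  start <- getCPUTime
  _ <- act
  end <- getCPUTime
  return (fromIntegral (end - start) / 1e12)  -- getCPUTime reports picoseconds

-- Placeholder for the code under test.
workload :: Int -> Int
workload n = length (filter even (map (* 3) [1 .. n]))

main :: IO ()
main = do
  t <- cpuSeconds (evaluate (workload 5000000))
  putStrLn ("cpu seconds: " ++ show t)
```

Feeding the two measured times to `slowdownPercent` gives a figure that can be compared directly with the numbers in this thread: a result near 40 would match the regression Ben reports, while a result of 4 or less would be in line with Simon's benchmarks.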
participants (4)
- Ben Gamari
- Carter Schonwald
- John Lato
- Simon Marlow