rts nursery and stopping the world minor gc

I've been reading about garbage collection. I understand that is stop-the-world, parallel and generational. Per capability also exists "the nursery", which is a block of memory where new objects are allocated. When the nursery of a capability gets filled, do we have to stop-the-world just for that local nursery gc?. To me this is different than gen0 or gen1 gc because isn't really a shared resource. -- Ruben Astudillo.

On Thu, Jun 8, 2017 at 3:00 PM, Ruben Astudillo
When the nursery of a capability gets filled, do we have to stop-the-world just for that local nursery gc?. To me this is different than gen0 or gen1 gc because isn't really a shared resource.
Things that survive garbage collection get promoted out of the nursery. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 08/06/17 15:03, Brandon Allbery wrote:
On Thu, Jun 8, 2017 at 3:00 PM, Ruben Astudillo
mailto:ruben.astud@gmail.com> wrote: When the nursery of a capability gets filled, do we have to stop-the-world just for that local nursery gc?. To me this is different than gen0 or gen1 gc because isn't really a shared resource.
Things that survive garbage collection get promoted out of the nursery.
but being promoted to gen0 is allocation, and thus there isn't need to stop the other threads. -- Ruben Astudillo

On Thu, Jun 8, 2017 at 3:05 PM, Ruben Astudillo
On 08/06/17 15:03, Brandon Allbery wrote:
On Thu, Jun 8, 2017 at 3:00 PM, Ruben Astudillo
mailto:ruben.astud@gmail.com> wrote: When the nursery of a capability gets filled, do we have to stop-the-world just for that local nursery gc?. To me this is different than gen0 or gen1 gc because isn't really a shared resource. Things that survive garbage collection get promoted out of the nursery.
but being promoted to gen0 is allocation, and thus there isn't need to stop the other threads.
It's not just allocation: if it didn't get garbage collected, then there is *at least* one reference to it that needs to be updated. So what happens when there is more than one, and some of those are active in other threads? -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

I didn't understand some parts On 08/06/17 15:08, Brandon Allbery wrote:
It's not just allocation: if it didn't get garbage collected, then there is *at least* one reference to it that needs to be updated.
which reference is that?. Immutability means that old data can't reference the new one, right?. What kind of reference (apart from live data on the stack) can we have on new data on the nursery?
So what happens when there is more than one, and some of those are active in other threads?
Same as before right? Sorry, maybe I don't have the right mental model. If you could show me a general picture of what happens when a nursery gets full I would appreciate it :-) . -- Ruben Astudillo

On Thu, Jun 8, 2017 at 3:29 PM, Ruben Astudillo
On 08/06/17 15:08, Brandon Allbery wrote:
It's not just allocation: if it didn't get garbage collected, then there is *at least* one reference to it that needs to be updated.
which reference is that?. Immutability means that old data can't reference the new one, right?. What kind of reference (apart from live data on the stack) can we have on new data on the nursery?
Another reference within the same nursery. Now consider that the same nursery can *also* contain newly created threads, which can be scheduled just like existing ones. Between two garbage collections, we may have: - allocation of new values - spawning of new sparks which have access to said values If you now have multiple hardware threads, these can be running concurrently and allocating their own heap values which reference those shared values: for example, each is building its own list of cons cells referencing the value in different ways as specified by other values passed to the thread. Relocating the shared value from the nursery now requires updating all such thread-specific cons cells. Those heap values are also being promoted out of the nursery, but that doesn't change anything; copying them does not magically also update the copied things being pointed to, that is part of what GC does. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On Thu, Jun 8, 2017 at 3:45 PM, Brandon Allbery
If you now have multiple hardware threads, these can be running concurrently and allocating their own heap values which reference those shared values: for example, each is building its own list of cons cells referencing the value in different ways as specified by other values passed to the thread. Relocating the shared value from the nursery now requires updating all such thread-specific cons cells. Those heap values are also being promoted out of the nursery, but that doesn't change anything; copying them does not magically also update the copied things being pointed to, that is part of what GC does.
I should also point out that the fact that the cons cells for the other thread are in a different nursery from the shared value *also* does not change anything (aside from possibly complicating the implementation of the garbage collector). They still must be updated to point to the new value, so that thread must be quiescent/not actively accessing through the pointer being updated. This stuff *is* complicated; I've had to stop and think through the whole thing a few times myself. And I might still have some things wrong. :/ -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

Simon Marlow wrote a paper about this issue a few years ago. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/local-gc... The conclusion IIRC was that while maintaining thread-local heaps could reduce the amount of time spent in GC, the benefit was surprisingly small, and came at a sizable cost in code complexity. So GHC HQ decided it was not worth merging at the time. I don't know if there's been any work on that front since the paper though. On Thu, Jun 8, 2017, at 12:45, Brandon Allbery wrote:
On Thu, Jun 8, 2017 at 3:29 PM, Ruben Astudillo
wrote: On 08/06/17 15:08, Brandon Allbery wrote:
It's not just allocation: if it didn't get garbage collected, then there is *at least* one reference to it that needs to be updated.
which reference is that?. Immutability means that old data can't reference the new one, right?. What kind of reference (apart from live data on the stack) can we have on new data on the nursery?
Another reference within the same nursery. Now consider that the same nursery can *also* contain newly created threads, which can be scheduled just like existing ones.
Between two garbage collections, we may have:
- allocation of new values - spawning of new sparks which have access to said values
If you now have multiple hardware threads, these can be running concurrently and allocating their own heap values which reference those shared values: for example, each is building its own list of cons cells referencing the value in different ways as specified by other values passed to the thread. Relocating the shared value from the nursery now requires updating all such thread-specific cons cells. Those heap values are also being promoted out of the nursery, but that doesn't change anything; copying them does not magically also update the copied things being pointed to, that is part of what GC does.
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

On 06/08/2017 09:29 PM, Ruben Astudillo wrote:
I didn't understand some parts
On 08/06/17 15:08, Brandon Allbery wrote:
It's not just allocation: if it didn't get garbage collected, then there is *at least* one reference to it that needs to be updated.
which reference is that?. Immutability means that old data can't reference the new one, right?. What kind of reference (apart from live data on the stack) can we have on new data on the nursery?
No, immutability does not mean that old data cannot reference new data. The reason is that we are in a lazy language: If you have an old value referencing a thunk and the thunk is evaluated, it is replaced by an indirection to the new value (it may be replaced if the new value is small enough, but this is just an optimization). So if two threads share a reference to an infinite list and one threads evaluates a piece of the list and later throws away its reference, the other thread still references the new evaluated part of the list - which was allocated in the nursery of the other thread! Additionally, not all data in Haskell is immutable. Just think about all the different vectors, arrays and things like IORefs you can modify as long as you are in the correct monad. The GC has to take care of that, otherwise your program would either leak memory or after the GC frees your data while you still hold an IORef pointing to it or similar.
So what happens when there is more than one, and some of those are active in other threads?
Same as before right?
Sorry, maybe I don't have the right mental model. If you could show me a general picture of what happens when a nursery gets full I would appreciate it :-)> -- Ruben Astudillo _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
participants (4)
-
Brandon Allbery
-
Eric Seidel
-
Jonas Scholl
-
Ruben Astudillo