[GHC] #15642: Improve the worst case performance of weak pointers

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature | Status: new request | Priority: normal | Milestone: 8.8.1 Component: Runtime | Version: 8.6.1-beta1 System | Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Runtime Unknown/Multiple | performance bug Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Garbage collecting weak pointers involves repeatedly traversing the weak pointer list, checking which keys are still alive, and in response marking the relevant values and finalizers live. This works well in most cases, but if there are long chains of weak pointers, it's terrible. I wonder if we can do something about that without significantly hurting performance elsewhere. Here's the (probably naive) idea: 1. Maintain a hash table mapping weak pointer keys to lists of weak pointers. This replaces the current weak pointer list. 2. Use a bit in the info table pointer of every heap object to indicate whether that object is referenced from any `Weak#`. This is the weakest link: if we don't have a spare bit there, or don't want to use one, then I think this whole idea is sunk. When creating a weak pointer, set the appropriate bit in the key and insert the key and pointer into the hash table. When performing early finalization, clear the bit in the key if no other `Weak#` points to it. When performing garbage collection: check the bit in each object as it is marked. If the bit is set, move the key and its attached `Weak#`s from the hash table into a new hash table (or an association list that gets turned into a hash table after evacuation or whatever), and mark all the linked weak pointer targets and finalizers live. In the end, the old hash table contains only unreachable objects. Now mark those objects live (for finalization and in case of resurrection), and queue up the finalizers. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): One more challenge: if many `Weak#`s are attached to the same key, we still need to be able to remove them from the hash table quickly in case of early finalization. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Somehow marking keys and being able map keys to weaks would be nice. As you say the mapping should be able to map a key to multiple weaks because an object may be key to multiple weaks. Out of curiosity, did you measure how long weak collection takes? I'd guess because we scan all weaks in collected gens repeatedly (until we stop evacuating things) the worst case performance (when we have a ton of weaks) would be pretty bad, but I haven't benchmarked this case myself.
This is the weakest link: if we don't have a spare bit there, or don't want to use one, then I think this whole idea is sunk.
I think in theory it is possible to use a bit in info pointer. We already do this to create forwarding pointers in GC, we could use a different bit to indicate that the objects is a weak key. This however requires changes in mutator code as mutators will have to untag info ptr before entering (in addition to untagging the pointer to the object itself in some cases). That's probably not great as we end up generating more code and probably making things slightly slower (one more instruction to enter -- not sure if measurable difference though) to be able to collect weaks faster, and a lot of programs don't have a lot of weaks. I wonder how other languages with weaks (Python, Java, Lua ...) do this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by osa1): * cc: osa1 (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): Taking a bit in the info pointer for this seems like a very high cost. We only have one spare bit (on 32-bit), and furthermore using that bit has a runtime cost because we'd have to mask it everywhere. Doesn't seem worth it to me. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): A more conservative approach (less cost for less benefit) would be to only build the hash table during the first/second/third round of fixpointing. I'm a bit surprised we can only spare one bit for 32-bit systems. Isn't the tables+code area likely to be much smaller than the address space? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

I'm a bit surprised we can only spare one bit for 32-bit systems. Isn't
#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): the tables+code area likely to be much smaller than the address space? Not when you have a 2GB binary... -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Yes, I suppose that's not too unreasonable a size. Bleh. Assume for the sake of discussion that the proposed mechanism is only used for 64-bit code. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): I realized two things today: 1. We don't have to tag the info pointer (or deal with the possibility that it's tagged) in the mutator. We can instead tag all the pointers in the table at the beginning of collection and untag as we go. 2. We should have enough bits, even for a large binary on a 32-bit system. Why? Because even a huge binary won't have billions of info tables except perhaps in a pathological case. So if we eventually need another bit, we can impose 8-byte alignment for the tables. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Hmm I think (1) is a good idea. Just one traversal over all weaks before GC to tag keys, then any live key will be untagged during evacuation. (did I get this right?) Out of curiosity, are you observing any slowness/long pauses in a real program because of collecting weaks? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:9 osa1]:
Hmm I think (1) is a good idea. Just one traversal over all weaks before GC to tag keys, then any live key will be untagged during evacuation. (did I get this right?)
Out of curiosity, are you observing any slowness/long pauses in a real
Sounds right to me. program because of collecting weaks? I've never written a practical program using weak references. I just think worst-case quadratic time garbage collection sounds pretty bad. Can we fix it without making more common cases worse? Dunno, but I think it's worth trying. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15642: Improve the worst case performance of weak pointers -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.6.1-beta1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Oh, and the hash table can just point to the first weak for each key and we can chain them up. We just need to skip over any manually finalized ones when we traverse. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15642#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC