[GHC] #13165: Speed up the RTS hash table

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Runtime | Version: 8.1 System | Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- The RTS hash table is rather slow. Every lookup involves at least one indirect call (to get a hash code). It also uses separate chaining, which is itself slow. Until recently, this didn't really matter, since the RTS hash table wasn't used for anything particularly performance critical other than `StablePtr` operations. However, it has since become the bottleneck when compacting with sharing, to the point that compacting without sharing is around 10x faster and is thus the default. Fortunately, there are easy ways to make the hash table faster. These include: - Use linear probing instead of separate chaining. - Specialize for the case of pointer keys - Don't use indirect calls for hashing - Use a fast, universal hash function for pointers. - Use SSE instructions where available to do fast searches. - Minimize the number of indirections in the use of the table. In both the case of the StablePtr table and that of compact regions, the keys of the table are not GC pointers, but the values are, so there needs to be a way to ensure that the GC handles the table correctly -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Runtime System | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): This would be great - are you planning to do it? To avoid the indirect calls don't we need to specialise the hash table to the key type? There are at least two key types in use (strings and pointers). Also, couldn't we pull in a good third-party hash table implementation instead of writing our own? When I was working on the compact code I did try https://troydhanson.github.io/uthash/userguide.html but it was slower than the RTS hash table. I'm sure there are better ones. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Runtime System | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by simonmar: @@ -6,4 +6,4 @@ - used for anything particularly performance critical other than `StablePtr` - operations. However, it has since become the bottleneck when compacting - with sharing, to the point that compacting without sharing is around 10x - faster and is thus the default. + used for anything particularly performance critical other than + `StableName` operations. However, it has since become the bottleneck when + compacting with sharing, to the point that compacting without sharing is + around 10x faster and is thus the default. New description: The RTS hash table is rather slow. Every lookup involves at least one indirect call (to get a hash code). It also uses separate chaining, which is itself slow. Until recently, this didn't really matter, since the RTS hash table wasn't used for anything particularly performance critical other than `StableName` operations. However, it has since become the bottleneck when compacting with sharing, to the point that compacting without sharing is around 10x faster and is thus the default. Fortunately, there are easy ways to make the hash table faster. These include: - Use linear probing instead of separate chaining. - Specialize for the case of pointer keys - Don't use indirect calls for hashing - Use a fast, universal hash function for pointers. - Use SSE instructions where available to do fast searches. - Minimize the number of indirections in the use of the table. In both the case of the StablePtr table and that of compact regions, the keys of the table are not GC pointers, but the values are, so there needs to be a way to ensure that the GC handles the table correctly -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Runtime System | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by siddhanathan): Different implementations for different key types would be nice. Doing so would allow type specific optimizations, such as using judy arrays for string keys. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table
-------------------------------------+-------------------------------------
Reporter: dobenour | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.4.1
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Runtime System | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): The switch to xxhash should help although I suspect more can be done here. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Runtime System | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): The xxHash patch only improves hashing for strings, compaction uses address hashing which is unaffected. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 8.1 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * keywords: => newcomer * milestone: 8.6.1 => Comment: Removing milestone as no one is currently working on this. Do holler if this is something that you would be interested in trying. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 8.1 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by Crazycolorz5): I'd be interested in helping in resolving this ticket. I'm just a bit unclear as to what needs to be altered / what it's used for. I see the previous changes improve string hashing, but word hashing seems fairly swift (no library calls, just bit-level manipulations) as-is? Are there specific examples that may illustrate the suboptimal performance? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 8.1 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): @Crazycolorz5 for a benchmark, try compacting a large data structure with `Data.Compact.compactWithSharing`. A good way to get some sample data is to take a large JSON and decode it using Aeson. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 8.1 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by NoidedSuper): I'm going to take a crack at implementing a linear probing table today, since that seems to be the easiest thing to do. I think specializing the hash table for different key types will probably yield more performance benefits in the long run, but getting some cache-friendliness with linear probing is better than nothing! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 8.1 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): I'm looking forward to hearing how it goes! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 8.1 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by NoidedSuper): Well, I attempted an implementation at [https://phabricator.haskell.org/D5073], but sadly the test suite seems to break when you run the check (although just building, compiling, and running was working fine). I haven't had the chance to look too much at it but I'm guessing I have a problem with either deleting or mapping over hash tables. I'll work on it some more tomorrow, see if I can fix it. Of course, help is always appreciated! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 8.1 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by mboes): * cc: mboes (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13165: Speed up the RTS hash table -------------------------------------+------------------------------------- Reporter: dobenour | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 8.1 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by ThrashAbaddon): * cc: ThrashAbaddon (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13165#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC