[GHC] #14727: Unboxed sum performance surprisingly poor

#14727: Unboxed sum performance surprisingly poor
-------------------------------------+-------------------------------------
           Reporter:  dfeuer         |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:  8.6.1
          Component:  Compiler       |           Version:  8.2.2
           Keywords:  UnboxedSums    |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  Runtime
  Unknown/Multiple                   |  performance bug
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 I tried performing worker-wrapper manually on `Data.IntMap.lookup`:

 {{{#!hs
 lookup# :: Int -> IntMap a -> (# (# #) | a #)
 lookup# = -- The obvious modification of the current implementation

 lookup :: Int -> IntMap a -> Maybe a
 lookup k m = case lookup# k m of
   (# | a #) -> Just a
   _ -> Nothing
 }}}

 Unfortunately, the `lookup` benchmark ''slowed down''. I verified that the
 benchmark indeed performs an immediate case analysis on the result (with
 `fromMaybe`), so it ''should'' go faster. And yet it goes slower.

 Caveat: I have not yet gotten things set up to be able to check with 8.4,
 so if there have been improvements in `UnboxedSums` performance since
 `8.2.2`, this may all be silly.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14727
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
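The worker/wrapper split described in the report can be sketched end to end on a toy structure. This is only an illustration: `Tree`, `lookupT#`, and `lookupT` are hypothetical names invented here, the tree is a plain binary search tree rather than the `IntMap` representation, and nothing below is taken from the containers source or from the reporter's actual patch.

```haskell
{-# LANGUAGE MagicHash, UnboxedSums, UnboxedTuples #-}
module Main (main) where

-- Hypothetical toy search tree standing in for IntMap (an assumption for
-- illustration, not the containers representation).
data Tree a = Leaf | Node (Tree a) !Int a (Tree a)

-- Worker: reports "absent" or "present with value" as an unboxed sum, so
-- the result itself needs no heap-allocated Maybe.
lookupT# :: Int -> Tree a -> (# (# #) | a #)
lookupT# _ Leaf = (# (# #) | #)
lookupT# k (Node l kx x r) = case compare k kx of
  LT -> lookupT# k l
  GT -> lookupT# k r
  EQ -> (# | x #)

-- Wrapper: reboxes the result as Maybe. Marked INLINE so that a caller
-- which scrutinises the result immediately can, in principle, have the
-- Just/Nothing allocation removed.
lookupT :: Int -> Tree a -> Maybe a
lookupT k t = case lookupT# k t of
  (# | a #) -> Just a
  _         -> Nothing
{-# INLINE lookupT #-}

sample :: Tree String
sample = Node (Node Leaf 1 "one" Leaf) 2 "two" (Node Leaf 3 "three" Leaf)

main :: IO ()
main = do
  print (lookupT 3 sample)  -- Just "three"
  print (lookupT 4 sample)  -- Nothing
```

Whether this shape is actually faster than returning `Maybe` directly is exactly what the ticket questions, so the sketch shows the pattern only, not a performance claim.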

Comment (by osa1):

 I'd make sure `lookup` is inlined, in which case allocations should be
 reduced.

 OTOH you use one more register to return two values in `lookup#` instead
 of one as before so that may make some things worse.

 It'd be helpful to see benchmark code's Cmm for both versions (with
 `lookup` inlined in the unboxed sums version).

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14727#comment:1
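One possible way to produce the Cmm asked for here is to enable GHC's dump flags in the module under inspection. The sketch below is an assumption about how one might set that up, not the procedure anyone in the thread used: `lookupOr` is a hypothetical use site, and the module relies on the containers package being available. The flags themselves (`-ddump-simpl`, `-ddump-cmm`, `-ddump-to-file`, `-dsuppress-all`) are standard GHC debugging flags.

```haskell
{-# OPTIONS_GHC -O2 -ddump-simpl -ddump-cmm -ddump-to-file -dsuppress-all #-}
module Main (main) where

import qualified Data.IntMap.Strict as IM
import Data.Maybe (fromMaybe)

-- Hypothetical use site: scrutinises the lookup result immediately via
-- fromMaybe, the situation described in the report.
lookupOr :: Int -> Int -> IM.IntMap Int -> Int
lookupOr def k m = fromMaybe def (IM.lookup k m)

main :: IO ()
main = print (lookupOr 0 2 (IM.fromList [(1, 10), (2, 20)]))  -- 20
```

With `-ddump-to-file`, compiling this module writes the Core and Cmm to files with `.dump-simpl` and `.dump-cmm` suffixes instead of printing them, which makes it easier to diff the boxed and unboxed-sum versions side by side.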

Comment (by dfeuer):

 Replying to [comment:1 osa1]:
 > I'd make sure `lookup` is inlined, in which case allocations should be reduced.
It is inlined. I'll have to look at allocations. Of course, it's somewhat possible that performance goes down ''because'' there's less allocation, if we're unknowingly relying on the GC to improve memory layout.
 > OTOH you use one more register to return two values in `lookup#` instead of one as before so that may make some things worse.
 >
 > It'd be helpful to see benchmark code's Cmm for both versions (with `lookup` inlined in the unboxed sums version).
 I'm not sure how much you're likely to get from a Criterion benchmark. Is
 there a specific thing you'd like me to try dumping?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14727#comment:2

Comment (by osa1):
 > I'm not sure how much you're likely to get from a Criterion benchmark. Is there a specific thing you'd like me to try dumping?
 I think looking at `lookup` in `benchmarks/IntMap.hs` (and any functions
 referenced by that function) would be helpful.

 I also think that it may be a good idea to add `{-# NOINLINE lookup #-}`
 in `benchmarks/IntMap.hs` just to avoid accidentally optimising the code
 at the use site (i.e. the benchmarking code).

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14727#comment:3
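The `NOINLINE` suggestion can be illustrated with a small stand-alone program. This is a sketch under assumptions: `lookupAll` is a hypothetical stand-in for the benchmark's own `lookup` function (the real `benchmarks/IntMap.hs` is not reproduced here), and the harness is reduced to a single `print` rather than Criterion.

```haskell
module Main (main) where

import qualified Data.IntMap as IM
import Data.List (foldl')
import Data.Maybe (fromMaybe)

-- Hypothetical benchmark body: sums the values found for a list of keys,
-- treating a missing key as 0. The library's lookup can still inline into
-- this function, so the immediate case analysis stays visible to GHC.
lookupAll :: [Int] -> IM.IntMap Int -> Int
lookupAll ks m = foldl' (\acc k -> acc + fromMaybe 0 (IM.lookup k m)) 0 ks
-- NOINLINE keeps the benchmark body from being inlined and specialised
-- into the harness, so the measured code is the code in its own dump.
{-# NOINLINE lookupAll #-}

main :: IO ()
main = print (lookupAll [1 .. 6] (IM.fromList (zip [1 .. 5] [10, 20 ..])))  -- 150
```

The pragma sits on the benchmark function rather than on the library function, so inlining of the library's `lookup` wrapper into the benchmark body is unaffected.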

Changes (by michalt):

 * cc: michalt (added)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14727#comment:4

Comment (by simonpj):

 It'd be great to make a reproducible test case that Omer (the author of
 unboxed sums) can look at. Thanks!

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14727#comment:5

Milestone: 8.8.1

Changes (by maoe):

 * cc: maoe (added)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14727#comment:7