
#10069: CPR related performance issue -------------------------------------+------------------------------------- Reporter: pacak | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1-rc2 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Runtime Unknown/Multiple | performance bug Test Case: | Blocked By: Blocking: | Related Tickets: Differential Revisions: | -------------------------------------+------------------------------------- By default CRP analysis can be too aggressive in trying to pass as much as possible in unboxed tuples, in general it's not a problem but when one big datatype is passed to several consumers it might end up pushed to stack several times instead of once - to heap, things are getting worse when there are sufficient fields to cause stack overflow which otherwise is possible to avoid - in our codebase adding one field with ExistentialQuantification (unused, but that prevents ghc from doing CRP transformation) reduces number of stack overflow by a factor of 1000 and increases overall performance by 10%. In provided example performance for both A and B should be identical and yet B is consistently faster by 3-5% It's possible to increase this performance gap by adding more and more fields. I was able to replicate this issue in ghc 7.8.3 and 7.10,1rc2 {{{#!hs {-# LANGUAGE ExistentialQuantification #-} module Blah where import Criterion import Criterion.Main import Data.Typeable data A = A () !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int data B = forall rep. (Typeable rep) => B rep !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int !Int a :: A a = A () 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 b :: B b = B () 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 {-# NOINLINE a1 #-} a1 :: A -> Int a1 (A _ f1 f2 f3 f4 f5 f6 f7 f8 g1 g2 g3 g4 g5 g6 g7 g8 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) = f1 {-# NOINLINE a2 #-} a2 :: A -> Int a2 (A _ f1 f2 f3 f4 f5 f6 f7 f8 g1 g2 g3 g4 g5 g6 g7 g8 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) = f2 {-# NOINLINE a3 #-} a3 :: A -> Int a3 (A _ f1 f2 f3 f4 f5 f6 f7 f8 g1 g2 g3 g4 g5 g6 g7 g8 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) = f3 {-# NOINLINE a4 #-} a4 :: A -> Int a4 (A _ f1 f2 f3 f4 f5 f6 f7 f8 g1 g2 g3 g4 g5 g6 g7 g8 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) = f4 {-# NOINLINE b1 #-} b1 :: B -> Int b1 (B _ f1 f2 f3 f4 f5 f6 f7 f8 g1 g2 g3 g4 g5 g6 g7 g8 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) = f1 {-# NOINLINE b2 #-} b2 :: B -> Int b2 (B _ f1 f2 f3 f4 f5 f6 f7 f8 g1 g2 g3 g4 g5 g6 g7 g8 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) = f2 {-# NOINLINE b3 #-} b3 :: B -> Int b3 (B _ f1 f2 f3 f4 f5 f6 f7 f8 g1 g2 g3 g4 g5 g6 g7 g8 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) = f3 {-# NOINLINE b4 #-} b4 :: B -> Int b4 (B _ f1 f2 f3 f4 f5 f6 f7 f8 g1 g2 g3 g4 g5 g6 g7 g8 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) = f4 {-# NOINLINE fa #-} fa :: A -> Int fa a = a1 a + a2 a + a3 a + a4 a {-# NOINLINE fb #-} fb :: B -> Int fb b = b1 b + b2 b + b3 b + b4 b main :: IO () main = defaultMain [ bgroup "single call" [ bench "A" $ whnf fa a , bench "B" $ whnf fb b ] ] }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10069 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler