
#9630: compile-time performance regression (probably due to Generics) -------------------------------------+------------------------------------- Reporter: hvr | Owner: simonpj Type: bug | Status: new Priority: high | Milestone: 8.0.1 Component: Compiler | Version: 7.9 Resolution: | Keywords: deriving-perf Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #9583, #10293 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by ezyang): Here is a minimized test-case with no dependencies, which may be useful for diagnosing: {{{ {-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE TypeOperators #-} {-# OPTIONS_GHC -fno-warn-missing-methods #-} module Gen where import GHC.Generics import Control.Monad import Data.Monoid data PairS a = PairS a !(() -> ()) newtype PutM a = Put { unPut :: PairS a } -- Use of this writer monad seems to be important; IO speeds it up type Put = PutM () -- type Put = IO () -- binary has INLINE pragmas on most of the instances but you can still -- trigger bad behavior without them. instance Functor PutM where fmap f m = Put $ let PairS a w = unPut m in PairS (f a) w -- Just to appease AMP instance Applicative PutM where pure = return (<*>) = ap instance Monad PutM where return a = Put $ PairS a id m >>= k = Put $ let PairS a w = unPut m PairS b w' = unPut (k a) in PairS b (w . w') class GBinary f where gput :: f t -> Put -- Forcing the dictionary to have two elements hurts -- the optimizer a lot. not_used :: f t instance GBinary a => GBinary (M1 i c a) where gput = gput . unM1 instance Binary a => GBinary (K1 i a) where gput = put . unK1 instance (GBinary a, GBinary b) => GBinary (a :*: b) where gput (x :*: y) = gput x >> gput y class Binary t where put :: t -> Put instance Binary () where put () = return () data T = T () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () deriving Generic -- Trigger specialization tput :: T -> Put tput = gput . from }}} On my machine, it takes 2.8s to build -O2, and 0.9s to build -O0. There are a few important ways you can tweak this: 1. This also exhibits the "out-of-line more expensive" behavior; moving the code out into a separate module jumps compile time from 2.7s to 5.2s. 2. If you replace `PutM ()` with `IO ()`, compile time goes from 2.7s to 1.9s 3. Removing `not_used`, pushes compile time from 2.7s to 1.5s. This DOES NOT stack with (2). So having to deal with dictionaries seems to make things work. I also wonder if this writer monad is actually leaking thunks, because apparently it's impossible to correctly implement writer without leaking. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9630#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler