
Hello!

tl;dr: the text package's pack function is creating huge chunks of code everywhere.

Michael Snoyman and I have been trying to nail down the performance problems of persistent's Template Haskell code: GHC was taking a lot of memory and CPU time to compile it. What we found is that the code size was being increased 20-fold by the simplifier in phase 0 on GHC 7.0 (see http://groups.google.com/group/yesodweb/msg/9f625fcf85575263).

So, what was increasing in size? Consider this extremely simple module (attached as Bug.hs):

    module Bug (text) where

    import qualified Data.Text as T

    text :: T.Text
    text = T.pack "text"

Until simplifier phase 0, the code size fluctuates but tops out at 12. Here's the core:

    Bug.text :: Data.Text.Internal.Text
    [LclIdX,
     Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
             ConLike=False, Cheap=False, Expandable=False,
             Guidance=IF_ARGS [] 11 0}]
    Bug.text =
      Data.Text.Fusion.unstream
        (Data.Text.Fusion.Common.streamList
           @ GHC.Types.Char
           (GHC.Base.map
              @ GHC.Types.Char
              @ GHC.Types.Char
              Data.Text.Internal.safe
              (GHC.Base.unpackCString# "text")))

Which is straightforward. However, in simplifier phase 0 the code size jumps to 391 (!!), a 32-fold increase. I've attached the core (after.hs) since it's too large to paste into the body of this message.

So it seems that the (unstream . streamList) pair above is getting inlined into a HUGE chunk of code (at least Data.Text.Array.new is getting inlined). Worse yet, this happens for every single pack that you use, even those packs hidden by OverloadedStrings!

Does anyone have any idea why GHC is inlining so much code everywhere?

While everything I said here was tested on GHC 7.0, we have evidence that GHC 7.4 suffers from the same problem. We don't know about GHC 6.12, though.

This seems to be a problem for everyone who uses text, which we hope is everyone using Haskell ;-).

Cheers,

-- Felipe.
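
P.S. To make the OverloadedStrings point concrete, here is a minimal sketch of a companion test module. The module name Bug2 is only an illustration and is not one of the attachments. With OverloadedStrings the literal is desugared to fromString "text", and the IsString instance for Text is defined in terms of pack, so the expectation (not a measured result) is that this module goes through the same pack code as Bug.hs:

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Hypothetical companion to Bug.hs; the name Bug2 is an assumption.
module Bug2 (text) where

import qualified Data.Text as T

-- No explicit T.pack here: the string literal becomes
-- `fromString "text"`, and Text's IsString instance uses pack,
-- so a pack call is still present, just hidden by the extension.
text :: T.Text
text = "text"
```

Compiling it with optimisation and a core dump (for example, ghc -O -ddump-simpl Bug2.hs) should let you compare its core with the core of Bug.hs.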