
#9400: poor performance when compiling modules with many Text literals at -O1
-------------------------------------+-------------------------------------
        Reporter:  rwbarton          |            Owner:
            Type:  bug               |           Status:  closed
        Priority:  normal            |        Milestone:
       Component:  Compiler          |          Version:  7.8.3
      Resolution:  invalid           |         Keywords:
Operating System:  Unknown/Multiple  |     Architecture:  Unknown/Multiple
 Type of failure:  Compile-time      |       Difficulty:  Unknown
  performance bug                    |       Blocked By:
       Test Case:                    |  Related Tickets:  #9370
        Blocking:                    |
Differential Revisions:              |
-------------------------------------+-------------------------------------
Changes (by rwbarton):

 * status:  new => closed
 * resolution:   => invalid


Comment:

OK this is a bit funny. Normally a Text literal `"abc"` gets desugared as
{{{
fromString $fIsStringText (unpackCString# "abc"#)
}}}
Now `fromString $fIsStringText = pack`, and
`pack = unstream . S.map safe . S.streamList`, and there is a rule in
`Data.Text`
{{{
{-# RULES "TEXT literal" forall a.
    unstream (S.map safe (S.streamList (GHC.unpackCString# a)))
      = unpackCString# a #-}
}}}
and `Data.Text.unpackCString#` has a NOINLINE pragma so we end up with the
nice small code: `Data.Text.unpackCString# "abc"`.

''But'', a ''single-character'' literal `"a"` is instead desugared as
{{{
fromString $fIsStringText (: (C# 'a') ([]))
}}}
and now there is no rule which matches this pattern. And `unstream` is
marked `INLINE [0]`, as Simon predicted; and it's rather large. And most
XML entities represent single Unicode characters, so GHC generated around
2000 copies of `unstream`.

I don't know why there is an `INLINE` pragma on `unstream`. Perhaps no good
reason. But anyways, there is a simple fix to the text package: add another
rule to match the pattern `unstream (S.map safe (S.streamList [c]))`. (And
similarly for empty string literals.)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:3
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
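
The fix proposed in the comment above (extra rules for the single-character and empty patterns) can be sketched with a toy model. This is only an illustration under stated assumptions: `Stream`, `Txt`, `smap`, `singletonLit` and `emptyLit` are hypothetical stand-ins defined here, the toy `safe`, `streamList`, `unstream` and `pack` merely borrow the names used in the comment, and none of it is the real code from the text package. The point is the shape of the rules: each one rewrites the whole `unstream (smap safe (streamList ...))` chain to a small NOINLINE function before `unstream` gets a chance to inline at phase 0.

```haskell
-- Toy model of the rewrite-rule setup discussed in the ticket comment.
-- Every definition here is a hypothetical stand-in, NOT the real text internals.

newtype Stream a = Stream [a]

newtype Txt = Txt String
  deriving (Eq, Show)

-- Replace surrogate code points with U+FFFD (a simplified `safe`).
safe :: Char -> Char
safe c
  | c >= '\xD800' && c <= '\xDFFF' = '\xFFFD'
  | otherwise                      = c
{-# INLINE [0] safe #-}

streamList :: [a] -> Stream a
streamList = Stream
{-# INLINE [0] streamList #-}

smap :: (a -> b) -> Stream a -> Stream b
smap f (Stream xs) = Stream (map f xs)
{-# INLINE [0] smap #-}

-- Delayed to phase 0 so the rules below can match first.
unstream :: Stream Char -> Txt
unstream (Stream cs) = Txt cs
{-# INLINE [0] unstream #-}

pack :: String -> Txt
pack = unstream . smap safe . streamList
{-# INLINE pack #-}

-- Small out-of-line targets for the rules (assumed names).
singletonLit :: Char -> Txt
singletonLit c = Txt [safe c]
{-# NOINLINE singletonLit #-}

emptyLit :: Txt
emptyLit = Txt ""
{-# NOINLINE emptyLit #-}

{-# RULES
"TOY singleton literal" forall c.
    unstream (smap safe (streamList [c])) = singletonLit c
"TOY empty literal"
    unstream (smap safe (streamList [])) = emptyLit
  #-}
```

Both rules preserve the result of `pack`, so they only affect code size. Rules do not fire in interpreted GHCi code; to check whether they fire, compile a module that uses `pack "a"` and `pack ""` with optimisation and `-ddump-rule-firings`.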