Re: [GHC] #9400: poor performance when compiling modules with many Text literals at -O1

6 Sep 2014

      #9400: poor performance when compiling modules with many Text literals at -O1
-------------------------------------+-------------------------------------
              Reporter:  rwbarton    |            Owner:  xnyhps
                  Type:  bug         |           Status:  new
              Priority:  normal      |        Milestone:
             Component:  Compiler    |          Version:  7.8.3
            Resolution:              |         Keywords:
      Operating System:              |     Architecture:  Unknown/Multiple
  Unknown/Multiple                   |       Difficulty:  Unknown
       Type of failure:  Compile-    |       Blocked By:
  time performance bug               |  Related Tickets:  #9370
             Test Case:              |
              Blocking:              |
Differential Revisions:              |
-------------------------------------+-------------------------------------
Changes (by xnyhps):

 * owner:   => xnyhps

Comment:

 This looked like quite a simple bug I could work on, so I decided to have
 a look.

 * Removing the `lengthFS str == 1` case in `mkStringExprFS` does not
 appear to break anything.

 * I've managed to modify `CoreSubst.exprIsConApp_maybe` to split calls to
 `unpackCString#` or `unpackCStringUtf8#` into a cons application. It also
 recognizes a single-character string and splits it into `(':', [c, nil])`,
 to avoid ever creating calls to `unpackCString*# ""#`. Looking at the
 generated Core, I've verified that `unpackCString*#` calls are now pushed
 through `case` expressions.

   This means that, even at -O0:

   {{{
 main = case "abc" of
     (x:xs) -> putStrLn xs
 }}}

   compiles to `main = System.IO.putStrLn (GHC.CString.unpackCString# "bc"#)`

 * I haven't timed it, but `src/Text/XmlHtml/HTML/Meta.hs` now compiles
 quickly and with only ~180MB RAM.

 * `unpackCString#` and `unpackCStringUtf8#` seem very similar, except that
 `unpackCStringUtf8#` decodes UTF-8. At least for `mkStringExprFS` it could
 work to only use `unpackCStringUtf8#`, with a slight run-time penalty
 (calling `leChar#` for each character in the string), but the only benefit
 would be making rewrite rules easier.
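
 To illustrate the literal-splitting idea from the second item, here is a
 minimal toy model. It is not the actual patch and does not use GHC's real
 Core types: `Expr`, `ConApp`, `splitConApp` and `tailOfKnown` are all
 hypothetical names invented for this sketch. It only shows the shape of the
 transformation: a string literal is viewed as a cons of its first character
 and the remaining literal, and a one-character literal yields `[c, nil]` so
 that an empty literal is never produced by a split. How an empty literal
 given as input should be handled is not stated above; treating it as `[]`
 is an assumption made here.

```haskell
-- Toy model only: these types are stand-ins, not GHC's Core.

-- | A tiny stand-in for Core expressions.
data Expr
  = UnpackCString String  -- models: unpackCString# "..."#
  | CharLit Char          -- models: a character literal
  | Nil                   -- models: []
  | Cons Expr Expr        -- models: x : xs
  deriving (Eq, Show)

-- | Models a "constructor application" view: constructor name plus
-- its value arguments.
data ConApp = ConApp String [Expr]
  deriving (Eq, Show)

-- | Hypothetical analogue of the modified exprIsConApp_maybe:
-- view an expression as a constructor application where possible.
splitConApp :: Expr -> Maybe ConApp
-- Assumption: an empty literal given as input is viewed as [].
splitConApp (UnpackCString [])     = Just (ConApp "[]" [])
-- A single character splits into (':', [c, nil]), so a split never
-- creates an empty literal.
splitConApp (UnpackCString [c])    = Just (ConApp ":" [CharLit c, Nil])
-- Otherwise: first character consed onto the remaining literal.
splitConApp (UnpackCString (c:cs)) = Just (ConApp ":" [CharLit c, UnpackCString cs])
splitConApp Nil                    = Just (ConApp "[]" [])
splitConApp (Cons x xs)            = Just (ConApp ":" [x, xs])
splitConApp (CharLit _)            = Nothing

-- | Models simplifying `case e of (_:xs) -> xs` once the scrutinee
-- is known to be a cons.
tailOfKnown :: Expr -> Maybe Expr
tailOfKnown e = case splitConApp e of
  Just (ConApp ":" [_, xs]) -> Just xs
  _                         -> Nothing
```

 Loaded into GHCi, `tailOfKnown (UnpackCString "abc")` evaluates to
 `Just (UnpackCString "bc")`, which mirrors the `"bc"#` literal in the Core
 shown for the `main` example.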

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:9
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler