
#8814: 7.8 optimizes attoparsec improperly
--------------------------------------------+------------------------------
        Reporter:  joelteon                 |            Owner:
            Type:  bug                      |           Status:  new
        Priority:  normal                   |        Milestone:
       Component:  Compiler                 |          Version:  7.8.1-rc1
      Resolution:                           |         Keywords:
Operating System:  MacOS X                  |     Architecture:  x86_64
 Type of failure:  Runtime performance bug  |  (amd64)
       Test Case:                           |       Difficulty:  Unknown
        Blocking:                           |       Blocked By:
                                            |  Related Tickets:
--------------------------------------------+------------------------------

Comment (by simonpj):

 I have not had any time to devote to this. I tried
 {{{
 ghc -O T8814.hs -ddump-simpl -o T8814
 }}}
 with and without `-fno-full-laziness`. Indeed I see the perf difference.

 The Core from `-ddump-simpl` looks very different. Inside `Main.$wa`
 you'll see a call to `runSTRep`. The function to which `runSTRep` is
 applied looks very different.

  * Without full laziness, it consists of a call to `newArray#` followed
    by a couple of `memcpy` calls

  * With full laziness, it has a rather complicated local recursive
    function that allocates a LOT of memory.

 I have no idea why. I think it must be to do with optimisations being
 done by RULES in the text library. If I add `-ddump-rule-firings` and
 grep for `TEXT` in the rule names, I get
 {{{
 -- With full laziness
 Rule fired: TEXT append -> fused
 Rule fired: TEXT append -> fused
 Rule fired: TEXT append -> fused
 Rule fired: TEXT append -> fused
 Rule fired: TEXT append -> unfused
 Rule fired: TEXT tail -> unfused
 Rule fired: TEXT tail -> unfused

 -- Without full laziness
 Rule fired: TEXT append -> fused
 Rule fired: TEXT append -> fused
 Rule fired: TEXT append -> fused
 Rule fired: TEXT append -> fused
 Rule fired: TEXT append -> unfused
 Rule fired: TEXT append -> unfused
 Rule fired: TEXT append -> unfused
 Rule fired: TEXT tail -> unfused
 Rule fired: TEXT tail -> unfused
 Rule fired: TEXT append -> unfused
 }}}
 So there is clearly a difference.

 Should that difference have such a massive performance impact?
 Ask the author of the text library!

 Why does full laziness have the effect? Well if you have
 `(\x. map (f x) (map g ys))`, say, full laziness may float out the
 `map g ys` and then the map/map fusion won't happen.

 At this point I hope that someone else will take over debugging to find
 out more.

 Simon

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8814#comment:8
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
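
The `(\x. map (f x) (map g ys))` example in the comment above can be sketched by hand in Haskell as follows. This is only an illustration of the shapes involved, not actual GHC output: `f`, `g` and `ys` are hypothetical stand-ins, and `floatedOut` and `fusedByHand` are hand-written approximations of what floating and fusion would each produce. Whether GHC really floats or fuses a given expression depends on the flags, the compiler version and the RULES in scope.

```haskell
-- Hypothetical stand-ins for the f, g and ys named in the comment.
f :: Int -> Int -> Int
f x y = x + y

g :: Int -> Int
g = (* 2)

ys :: [Int]
ys = [1 .. 10]

-- As written: the two maps are syntactically nested inside the lambda,
-- which is the shape a map/map fusion rule needs to see.
asWritten :: Int -> [Int]
asWritten = \x -> map (f x) (map g ys)

-- Floated by hand: `map g ys` does not mention x, so it can be bound
-- once outside the lambda and shared between calls. The outer map now
-- sees only the name `shared`, not a nested `map`.
floatedOut :: Int -> [Int]
floatedOut = \x -> map (f x) shared
  where
    shared = map g ys

-- Fused by hand: a single traversal with no intermediate list.
fusedByHand :: Int -> [Int]
fusedByHand = \x -> map (f x . g) ys

main :: IO ()
main = do
  let xs = [0 .. 3]
  -- All three definitions compute the same lists; they differ only in
  -- how much intermediate structure is built and shared.
  print (map asWritten xs == map floatedOut xs)
  print (map asWritten xs == map fusedByHand xs)
```

The three definitions are semantically equal, which is why the difference only shows up in allocation and run time, and in which rules are reported by `-ddump-rule-firings`.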