
#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): A smaller reproduction case that depends only on `text`: {{{ module Main where import qualified Data.Text.Lazy as LText import Data.Char import System.Environment type LText = LText.Text matches :: LText -> ([LText], [LText]) matches str = case LText.uncons str of Nothing -> ([], []) Just (c, str') -> let (upperMatches, lowerMatches) = matches $ LText.drop 1 str upper = isUpper c lower = isLower c match = LText.take 10 str upperMatches' = if upper then match:upperMatches else upperMatches lowerMatches' = if lower then match:lowerMatches else lowerMatches in (upperMatches', lowerMatches') main = do (arg0:args) <- getArgs input <- LText.pack <$> readFile arg0 let (upper, lower) = matches input putStrLn $ "Lowercase: " ++ show (take 1 lower) putStrLn $ "Uppercase: " ++ show (take 1 upper) print $ LText.take 10 input }}} This example program tries to roughly mimic the usage of lazy `Text`s in `regex-tdfa-text` without actually using any code from that. Specifically, it uses `drop` (which triggers the offending `RULE`), it snatches off characters from the front of the remaining string one by one, it passes the tail through a recursive loop, it holds on to chunks of text as it traverses the input, thus preventing them from being collected, and it eventually forces at least the first one of the accumulated chunks by printing them to the console. I'm not 100% sure whether preventing collection is absolutely necessary to trigger the bug, but in any case, the above example runs slower by about a factor 2 when compiled with optimizations, the dump shows that the offending `RULE` is being hit, and the ticky profiles hint at the same performance issue as well. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:43 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler