[GHC] #14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Runtime Unknown/Multiple | performance bug Test Case: | Blocked By: https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text-bug | Blocking: | Related Tickets: 13745 Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- When using `Text.RE.TDFA` from the regex-tdfa package on GHC 8.2 to match trivial regexes against `Data.Text.Lazy` strings, I see **exponential runtime**. On GHC 8.0, or using `Data.Text` strict strings or `String` strings, the runtime is as expected. Here's my repo with simple test code illustrating the regression and timing stats: https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug. In summary, for the problematic combination of GHC version and libraries, the run times are 3s, 10s, 22s, and 40s for counting regex matches in files with 10000, 20000, 30000, and 40000 lines, respectively. For all of the unproblematic combinations -- i.e. GHC 8.0.2, `String`, strict `Data.Text`, or building with profiling -- the run time is always about 1s! And **this is a Heisenbug**, in that the performance problem goes away if I build with profiling support! There is the separate Issue #13745 about **compile-time regressions** in GHC 8.2 + the regex-tdfa package, that [https://ghc.haskell.org/trac/ghc/ticket/13745#comment:18 I commented on]. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text-bug Blocked By: | Blocking: Related Tickets: 13745 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by ntc2): * Attachment "Main.hs" added. Test code that exhibits exponential runtime regression -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text-bug Blocked By: | Blocking: Related Tickets: 13745 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by ntc2): * Attachment "defs-30000.txt" added. Input file for test program -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text-bug Blocked By: | Blocking: Related Tickets: 13745 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by ntc2): As suggested by @nomeata, I'm going to try bisecting GHC between 8.0.2 and 8.2.0 to discover when the performance regression was introduced. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text-bug Blocked By: | Blocking: Related Tickets: 13745 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by ntc2): Having "unusable due to shadowed dependencies" trouble with the GHC 8.2.1 I built from source when trying to build my test code. Links to possibly related issues and a patch: - https://github.com/haskell/cabal/issues/4728 - https://github.com/commercialhaskell/stack/issues/3554 - https://phabricator.haskell.org/D4159 - https://ghc.haskell.org/trac/ghc/ticket/14381 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text-bug Blocked By: | Blocking: Related Tickets: #13745 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * owner: (none) => tdammers * related: 13745 => #13745 * milestone: => 8.4.1 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by ntc2): * testcase: https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug => https://github.com/ntc2/ghc-8.2.1-regex-lazy-text- bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Old description:
When using `Text.RE.TDFA` from the regex-tdfa package on GHC 8.2 to match trivial regexes against `Data.Text.Lazy` strings, I see **exponential runtime**. On GHC 8.0, or using `Data.Text` strict strings or `String` strings, the runtime is as expected.
Here's my repo with simple test code illustrating the regression and timing stats: https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug.
In summary, for the problematic combination of GHC version and libraries, the run times are 3s, 10s, 22s, and 40s for counting regex matches in files with 10000, 20000, 30000, and 40000 lines, respectively. For all of the unproblematic combinations -- i.e. GHC 8.0.2, `String`, strict `Data.Text`, or building with profiling -- the run time is always about 1s!
And **this is a Heisenbug**, in that the performance problem goes away if I build with profiling support!
There is the separate Issue #13745 about **compile-time regressions** in GHC 8.2 + the regex-tdfa package, that [https://ghc.haskell.org/trac/ghc/ticket/13745#comment:18 I commented on].
New description: When using `Text.RE.TDFA` from the regex-tdfa package on GHC 8.2 to match trivial regexes against `Data.Text.Lazy` strings, I see **exponential runtime**. On GHC 8.0, or using `Data.Text` strict strings or `String` strings, the runtime is as expected. Here's my repo with simple test code illustrating the regression and timing stats: https://github.com/ntc2/ghc-8.2.1-regex-lazy-text- bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 . In summary, for the problematic combination of GHC version and libraries, the run times are 3s, 10s, 22s, and 40s for counting regex matches in files with 10000, 20000, 30000, and 40000 lines, respectively. For all of the unproblematic combinations -- i.e. GHC 8.0.2, `String`, strict `Data.Text`, or building with profiling -- the run time is always about 1s! And **this is a Heisenbug**, in that the performance problem goes away if I build with profiling support! There is the separate Issue #13745 about **compile-time regressions** in GHC 8.2 + the regex-tdfa package, that [https://ghc.haskell.org/trac/ghc/ticket/13745#comment:18 I commented on]. -- Comment: I'm having trouble with the bisection search because `hashable` won't build on some intermediate dev GHCs I've tried. I tried changing the test case to use `regex-tdfa-text`+`Data.Text.Lazy` directly, instead of via the common interface provided by `regex`, but then the problem goes away. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Don't forget to try with `-fno-state-hack`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by ntc2): * related: #13745 => #13745, #14564 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): I took a stab at bisecting this myself, the problem however is that I'm getting too many false positives, even when cleaning thoroughly and starting with as clean a slate as possible, and when I do that, build times become prohibitive, so I gave up after a few night-long attempts. So I will now take a different route, trying to isolate the problem further on current GHC HEAD, see if I can pin it down. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Quick data point: in a test case using a modified `Lazy.hs`, the bad behavior can be reproduced when running certain regular expressions against a large test data set, but not others: - `^def `: bad - `^def[^.]`: bad - `^[^.]`: good - `^d[^.]`: good - `^de[^.]`: good - `^def[^.]`: bad - `^[^.]def`: good (I picked `[^.]` as a subexpression that can never match). Particularly interesting is `^de[^.]` vs. `^def[^.]`: it seems that having to consume 4 or more tokens from the start of the input triggers the bad behavior. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Actually it turns out that I must have done something wrong with my benchmarking; proper testing shows that *all* the examples that start with `^[.^]` (i.e. everything that fails on the first token) runs fast (~20 ms), while everything that matches on the first token at least runs slowly (~30,000 ms). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): I would urge you to build the program and libraries with `-ticky` and see which function is getting executed a lot. The advantage of `-ticky` is that it's guaranteed not to affect optimisation or indeed anything. It just logs what happens. Then you'll need a `-ddump-simpl -ddump-stg` of the relevant modules to match up with the ticky logs. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers):
I would urge you to build the program and libraries with -ticky and see which function is getting executed a lot. The advantage of -ticky is that it's guaranteed not to affect optimisation or indeed anything. It just logs what happens.
That's actually exactly what I'm doing right now. A comparison of running two regular expressions such that one is fast (`^[^.]d`) and the other is slow (`^def`) shows that all the metrics are the same, or nearly the same, except for ALLOC_PRIM_gds (from 669 up to 1801526) and ALLOC_PRIM_ctr (from 371392 up to 9325581456). I should probably dig a bit deeper from here. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Interesting. I was thinking of this part of the ticky file {{{ Entries Alloc Alloc'd Non-void Arguments STG Name -------------------------------------------------------------------------------- 1800047 78722320 0 1 L p_term{v raIX} (main:Main) (fun) 2160065 50401232 0 2 >L base:GHC.List.takeWhile{v rcW} (fun) 1509183 43941968 0 2 LL base:GHC.Base.++{v 03} (fun) 1086514 43446320 0 2 LL base:GHC.List.zip{v 0x} (fun) 1088129 37257616 0 2 >L base:GHC.Base.map{v 01X} (fun) 480015 24961008 0 1 L p2{v raIU} (main:Main) (fun) 240001 19200080 0 2 ML $w$j{v raIV} (main:Main) (fun,se) 665502 17440560 0 2 iL go1{v raJb} (main:Main) (fun) 545117 17421624 0 1 M subterms{v r1ki} (main:Main) (fun) 723688 15430288 0 3 LMM $wmatch'{v raJ5} (main:Main) (fun) 662950 15428096 0 3 MML $sfind'{v raIM} (main:Main) (fun) 363373 12627992 0 1 L go11{v saSX} (main:Main) (fun) in raJb 487124 11666944 0 3 >>M expr_fold{v r1jH} (main:Main) (fun) 1267348 11605760 0 2 LM find'{v raIN} (main:Main) (fun) }}} Usually when something exponential is going on you get some very bug numbers at the top. (You may need to sort first.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Correct, but that's why I needed to dig deeper - ticky-ing only the module itself didn't reveal anything, but compiling a bunch of dependencies with -ticky as well gives me this line: {{{ 2250377760180010799840 0 2 Mi $wnext2{v reC2} (regex-tdfa- text-1.0.0.3-CIfFZ6rjdCoJI5EFpqTwBO:Text.Regex.TDFA.Text.Lazy) (fun) }}} (note that it actually overflows the `Alloc` field, so that's two large numbers concatenated together there) I can't seem to find `wnext2` anywhere in the dumps though, which is a bit strange. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Compile `Text.Regex.TDFA.Text.Lazy` with `-ddump-simpl -ddump-stg -ticky`. I'd be surprised if `$wnext2{v reC2}` doesn't show up. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): That's what I thought I was doing, but apparently I fat-fingered a cabal command. Recompiling as we speak, will report back. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Recompiled everything, with `--force-reinstalls` and `-fforce-recomp`, and `-ddump-stg` on everything including dependencies, however, grepping the entire project tree for `wnext` only matches binaries (`.o`, `.a` and the like), but none of the dumps. So I ran GHC directly on the source file inside the cabal tree: {{{ ../ghc/inplace/bin/ghc-stage2 regex-tdfa- text-1.0.0.3/Text/Regex/TDFA/Text/Lazy.hs -fforce-recomp -package-db .cabal-sandbox/x86_64-linux-ghc-8.5.20180108-packages.conf.d -ticky -c -ddump-stg -rtsopts -XMultiParamTypeClasses }}} But to no avail, `wnext2` does not appear in the STG dump: {{{ ==================== Pre unarise: ==================== sat_s6o2 :: GHC.Types.Int -> GHC.Int.Int64 [LclId] = [] \u [] GHC.Enum.toEnum GHC.Int.$fEnumInt64; $cafter_r6mQ :: GHC.Types.Int -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text [GblId] = [] \u [] GHC.Base.. Data.Text.Lazy.drop sat_s6o2; sat_s6o3 :: GHC.Types.Int -> GHC.Int.Int64 [LclId] = [] \u [] GHC.Enum.toEnum GHC.Int.$fEnumInt64; $cbefore_r6nJ :: GHC.Types.Int -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text [GblId] = [] \u [] GHC.Base.. Data.Text.Lazy.take sat_s6o3; Text.Regex.TDFA.Text.Lazy.$fExtractText [InlPrag=NOUSERINLINE CONLIKE] :: Text.Regex.Base.RegexLike.Extract Data.Text.Internal.Lazy.Text [GblId[DFunId]] = NO_CCS Text.Regex.Base.RegexLike.C:Extract! [$cbefore_r6nJ $cafter_r6mQ Data.Text.Internal.Lazy.empty $cextract_r6nK]; $cextract_r6nK :: (GHC.Types.Int, GHC.Types.Int) -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text [GblId] = [] \u [] Text.Regex.Base.RegexLike.$dmextract Text.Regex.TDFA.Text.Lazy.$fExtractText; Text.Regex.TDFA.Text.Lazy.$fUnconsText [InlPrag=INLINE (sat-args=0)] :: Text.Regex.TDFA.NewDFA.Uncons.Uncons Data.Text.Internal.Lazy.Text [GblId[DFunId(nt)]] = [] \u [] Data.Text.Lazy.uncons; $cmakeRegexOptsM_r6nL :: forall (m :: * -> *). GHC.Base.Monad m => Text.Regex.TDFA.Common.CompOption -> Text.Regex.TDFA.Common.ExecOption -> Data.Text.Internal.Lazy.Text -> m Text.Regex.TDFA.Common.Regex [GblId, Arity=4, Unf=OtherCon []] = [] \r [$dMonad_s6o4 c_s6o5 e_s6o6 source_s6o7] let { sat_s6o8 [Occ=Once] :: GHC.Base.String [LclId] = [source_s6o7] \u [] Data.Text.Lazy.unpack source_s6o7; } in Text.Regex.Base.RegexLike.makeRegexOptsM Text.Regex.TDFA.String.$fRegexMakerRegexCompOptionExecOption[] $dMonad_s6o4 c_s6o5 e_s6o6 sat_s6o8; Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText [InlPrag=NOUSERINLINE CONLIKE] :: Text.Regex.Base.RegexLike.RegexMaker Text.Regex.TDFA.Common.Regex Text.Regex.TDFA.Common.CompOption Text.Regex.TDFA.Common.ExecOption Data.Text.Internal.Lazy.Text [GblId[DFunId]] = NO_CCS Text.Regex.Base.RegexLike.C:RegexMaker! [Text.Regex.TDFA.Common.$fRegexOptionsRegexCompOptionExecOption $cmakeRegex_r6nN $cmakeRegexOpts_r6nM $cmakeRegexM_r6nO $cmakeRegexOptsM_r6nL]; $cmakeRegexOpts_r6nM :: Text.Regex.TDFA.Common.CompOption -> Text.Regex.TDFA.Common.ExecOption -> Data.Text.Internal.Lazy.Text -> Text.Regex.TDFA.Common.Regex [GblId] = [] \u [] Text.Regex.Base.RegexLike.$dmmakeRegexOpts Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText; $cmakeRegex_r6nN :: Data.Text.Internal.Lazy.Text -> Text.Regex.TDFA.Common.Regex [GblId] = [] \u [] Text.Regex.Base.RegexLike.$dmmakeRegex Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText; $cmakeRegexM_r6nO :: forall (m :: * -> *). GHC.Base.Monad m => Data.Text.Internal.Lazy.Text -> m Text.Regex.TDFA.Common.Regex [GblId, Arity=1, Unf=OtherCon []] = [] \r [$dMonad_s6o9] Text.Regex.Base.RegexLike.$dmmakeRegexM Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText $dMonad_s6o9; $trModule1_r6nP :: GHC.Prim.Addr# [GblId, Caf=NoCafRefs, Unf=OtherCon []] = "main"#; $trModule2_r6nQ :: GHC.Types.TrName [GblId, Caf=NoCafRefs, Unf=OtherCon []] = NO_CCS GHC.Types.TrNameS! [$trModule1_r6nP]; $trModule3_r6nR :: GHC.Prim.Addr# [GblId, Caf=NoCafRefs, Unf=OtherCon []] = "Text.Regex.TDFA.Text.Lazy"#; $trModule4_r6nS :: GHC.Types.TrName [GblId, Caf=NoCafRefs, Unf=OtherCon []] = NO_CCS GHC.Types.TrNameS! [$trModule3_r6nR]; Text.Regex.TDFA.Text.Lazy.$trModule :: GHC.Types.Module [GblId, Caf=NoCafRefs, Unf=OtherCon []] = NO_CCS GHC.Types.Module! [$trModule2_r6nQ $trModule4_r6nS]; $cmatchTest_r6nT :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> GHC.Types.Bool [GblId] = [] \u [] Text.Regex.TDFA.NewDFA.Tester.matchTest Data.Text.Lazy.uncons; $cmatchAll_r6nU :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> [Text.Regex.Base.RegexLike.MatchArray] [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6oa s_s6ob] let { sat_s6od [Occ=Once] :: GHC.Types.Char [LclId] = NO_CCS GHC.Types.C#! ['\n'#]; } in let { sat_s6oc [Occ=Once] :: Text.Regex.TDFA.Common.Position [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Text.Regex.TDFA.NewDFA.Engine.execMatch Data.Text.Lazy.uncons r_s6oa sat_s6oc sat_s6od s_s6ob; Text.Regex.TDFA.Text.Lazy.compile :: Text.Regex.TDFA.Common.CompOption -> Text.Regex.TDFA.Common.ExecOption -> Data.Text.Internal.Lazy.Text -> Data.Either.Either GHC.Base.String Text.Regex.TDFA.Common.Regex [GblId, Arity=3, Unf=OtherCon []] = [] \r [compOpt_s6oe execOpt_s6of txt_s6og] let { sat_s6oh [Occ=Once] :: GHC.Base.String [LclId] = [txt_s6og] \u [] Data.Text.Lazy.unpack txt_s6og; } in case Text.Regex.TDFA.ReadRegex.parseRegex sat_s6oh of { Data.Either.Left err_s6oj [Occ=Once] -> let { sat_s6om [Occ=Once] :: [GHC.Types.Char] [LclId] = [err_s6oj] \u [] let { sat_s6ol [Occ=Once] :: [GHC.Types.Char] [LclId] = [err_s6oj] \u [] GHC.Show.show Text.Parsec.Error.$fShowParseError err_s6oj; } in let { sat_s6ok [Occ=Once] :: [GHC.Types.Char] [LclId] = [] \u [] GHC.CString.unpackCString# "parseRegex for Text.Regex.TDFA.Text.Lazy failed:"#; } in GHC.Base.++ sat_s6ok sat_s6ol; } in Data.Either.Left [sat_s6om]; Data.Either.Right pattern_s6on [Occ=Once] -> let { sat_s6oo [Occ=Once] :: Text.Regex.TDFA.Common.Regex [LclId] = [compOpt_s6oe execOpt_s6of pattern_s6on] \u [] Text.Regex.TDFA.TDFA.patternToRegex pattern_s6on compOpt_s6oe execOpt_s6of; } in Data.Either.Right [sat_s6oo]; }; $cmatchOnce_r6nV :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6op s_s6oq] let { sat_s6or [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray] [LclId] = [r_s6op s_s6oq] \u [] $cmatchAll_r6nU r_s6op s_s6oq; } in Data.Maybe.listToMaybe sat_s6or; $cmatchAllText_r6nW :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> [Text.Regex.Base.RegexLike.MatchText Data.Text.Internal.Lazy.Text] [GblId, Arity=2, Unf=OtherCon []] = [] \r [regex_s6os source_s6ot] let { go_s6ou [Occ=LoopBreaker] :: GHC.Types.Int -> Data.Text.Internal.Lazy.Text -> [GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)] -> [GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))] [LclId, Arity=3, Unf=OtherCon []] = sat-only [go_s6ou] \r [i_s6ov ds_s6ow ds1_s6ox] case i_s6ov of i1_s6oy { GHC.Types.I# _ [Occ=Dead] -> case ds1_s6ox of { [] -> [] []; : x_s6oB xs_s6oC [Occ=Once] -> let { ds2_s6oD :: (GHC.Types.Int, GHC.Types.Int) [LclId] = [x_s6oB] \u [] let { sat_s6oE [Occ=Once] :: GHC.Types.Int [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Data.Array.Base.! Data.Array.Base.$fIArrayArraye GHC.Arr.$fIxInt x_s6oB sat_s6oE; } in let { off0_s6oF :: GHC.Types.Int [LclId] = [ds2_s6oD] \u [] case ds2_s6oD of { (,) off1_s6oH [Occ=Once] _ [Occ=Dead] -> off1_s6oH; }; } in let { len0_s6oJ :: GHC.Types.Int [LclId] = [ds2_s6oD] \u [] case ds2_s6oD of { (,) _ [Occ=Dead] len1_s6oM [Occ=Once] -> len1_s6oM; }; } in let { sat_s6p0 [Occ=Once] :: [GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))] [LclId] = [go_s6ou ds_s6ow i1_s6oy xs_s6oC off0_s6oF len0_s6oJ] \u [] let { sat_s6oX [Occ=Once] :: GHC.Types.Int [LclId] = [i1_s6oy off0_s6oF len0_s6oJ] \u [] let { sat_s6oW [Occ=Once] :: GHC.Types.Int [LclId] = [i1_s6oy len0_s6oJ] \u [] GHC.Num.- GHC.Num.$fNumInt len0_s6oJ i1_s6oy; } in GHC.Num.+ GHC.Num.$fNumInt off0_s6oF sat_s6oW; } in case $cafter_r6mQ sat_s6oX ds_s6ow of t'_s6oY { __DEFAULT -> let { sat_s6oZ [Occ=Once] :: GHC.Types.Int [LclId] = [off0_s6oF len0_s6oJ] \u [] GHC.Num.+ GHC.Num.$fNumInt off0_s6oF len0_s6oJ; } in go_s6ou sat_s6oZ t'_s6oY xs_s6oC; }; } in let { sat_s6oV [Occ=Once] :: GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)) [LclId] = [ds_s6ow i1_s6oy x_s6oB] \u [] let { sat_s6oU [Occ=Once] :: (GHC.Types.Int, GHC.Types.Int) -> (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)) [LclId] = [ds_s6ow i1_s6oy] \r [pair_s6oN] case pair_s6oN of wild1_s6oO { (,) off_s6oP [Occ=Once] len_s6oQ [Occ=Once] -> let { sat_s6oT [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [ds_s6ow i1_s6oy off_s6oP len_s6oQ] \u [] let { sat_s6oR [Occ=Once] :: GHC.Types.Int [LclId] = [i1_s6oy off_s6oP] \u [] GHC.Num.- GHC.Num.$fNumInt off_s6oP i1_s6oy; } in let { sat_s6oS [Occ=Once] :: (GHC.Types.Int, GHC.Types.Int) [LclId] = NO_CCS (,)! [sat_s6oR len_s6oQ]; } in $cextract_r6nK sat_s6oS ds_s6ow; } in (,) [sat_s6oT wild1_s6oO]; }; } in GHC.Base.fmap GHC.Arr.$fFunctorArray sat_s6oU x_s6oB; } in : [sat_s6oV sat_s6p0]; }; }; } in let { sat_s6p2 [Occ=Once] :: [GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)] [LclId] = [regex_s6os source_s6ot] \u [] $cmatchAll_r6nU regex_s6os source_s6ot; } in let { sat_s6p1 [Occ=Once] :: GHC.Types.Int [LclId] = NO_CCS GHC.Types.I#! [0#]; } in go_s6ou sat_s6p1 source_s6ot sat_s6p2; $cmatchCount_r6nX :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> GHC.Types.Int [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6p3 s_s6p4] let { sat_s6pm [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray] [LclId] = [r_s6p3 s_s6p4] \u [] let { sat_s6pl [Occ=Once] :: GHC.Types.Char [LclId] = NO_CCS GHC.Types.C#! ['\n'#]; } in let { sat_s6pk [Occ=Once] :: Text.Regex.TDFA.Common.Position [LclId] = NO_CCS GHC.Types.I#! [0#]; } in let { sat_s6pj [Occ=Once] :: Text.Regex.TDFA.Common.Regex [LclId] = [r_s6p3] \u [] case r_s6p3 of wild_s6p5 { Text.Regex.TDFA.Common.Regex ds_s6p6 [Occ=Once] ds1_s6p7 [Occ=Once] ds2_s6p8 [Occ=Once] ds3_s6p9 [Occ=Once] ds4_s6pa [Occ=Once] ds5_s6pb [Occ=Once] ds6_s6pc [Occ=Once] ds7_s6pd [Occ=Once] ds8_s6pe [Occ=Once] _ [Occ=Dead] -> let { sat_s6pi [Occ=Once] :: Text.Regex.TDFA.Common.ExecOption [LclId] = [wild_s6p5] \u [] case Text.Regex.TDFA.Common.regex_execOptions wild_s6p5 of { Text.Regex.TDFA.Common.ExecOption _ [Occ=Dead] -> Text.Regex.TDFA.Common.ExecOption [GHC.Types.False]; }; } in Text.Regex.TDFA.Common.Regex [ds_s6p6 ds1_s6p7 ds2_s6p8 ds3_s6p9 ds4_s6pa ds5_s6pb ds6_s6pc ds7_s6pd ds8_s6pe sat_s6pi]; }; } in Text.Regex.TDFA.NewDFA.Engine.execMatch Data.Text.Lazy.uncons sat_s6pj sat_s6pk sat_s6pl s_s6p4; } in Data.Foldable.length Data.Foldable.$fFoldable[] sat_s6pm; $cmatchOnceText_r6nY :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> GHC.Base.Maybe (Data.Text.Internal.Lazy.Text, Text.Regex.Base.RegexLike.MatchText Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text) [GblId, Arity=2, Unf=OtherCon []] = [] \r [regex_s6pn source_s6po] let { sat_s6pL [Occ=Once] :: GHC.Base.Maybe (GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)) [LclId] = [regex_s6pn source_s6po] \u [] let { sat_s6pK [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray] [LclId] = [regex_s6pn source_s6po] \u [] let { sat_s6pJ [Occ=Once] :: GHC.Types.Char [LclId] = NO_CCS GHC.Types.C#! ['\n'#]; } in let { sat_s6pI [Occ=Once] :: Text.Regex.TDFA.Common.Position [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Text.Regex.TDFA.NewDFA.Engine.execMatch Data.Text.Lazy.uncons regex_s6pn sat_s6pI sat_s6pJ source_s6po; } in Data.Maybe.listToMaybe sat_s6pK; } in let { sat_s6pH [Occ=Once] :: GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int) -> (Data.Text.Internal.Lazy.Text, GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)), Data.Text.Internal.Lazy.Text) [LclId] = [source_s6po] \r [ma_s6pp] let { ds_s6pq :: (GHC.Types.Int, GHC.Types.Int) [LclId] = [ma_s6pp] \u [] let { sat_s6pr [Occ=Once] :: GHC.Types.Int [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Data.Array.Base.! Data.Array.Base.$fIArrayArraye GHC.Arr.$fIxInt ma_s6pp sat_s6pr; } in let { o_s6ps :: GHC.Types.Int [LclId] = [ds_s6pq] \u [] case ds_s6pq of { (,) o1_s6pu [Occ=Once] _ [Occ=Dead] -> o1_s6pu; }; } in let { sat_s6pG [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [source_s6po ds_s6pq o_s6ps] \u [] let { sat_s6pF [Occ=Once] :: GHC.Types.Int [LclId] = [ds_s6pq o_s6ps] \u [] let { sat_s6pE [Occ=Once] :: GHC.Types.Int [LclId] = [ds_s6pq] \u [] case ds_s6pq of { (,) _ [Occ=Dead] l_s6pD [Occ=Once] -> l_s6pD; }; } in GHC.Num.+ GHC.Num.$fNumInt o_s6ps sat_s6pE; } in $cafter_r6mQ sat_s6pF source_s6po; } in let { sat_s6pA [Occ=Once] :: GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)) [LclId] = [source_s6po ma_s6pp] \u [] let { sat_s6pz [Occ=Once] :: (GHC.Types.Int, GHC.Types.Int) -> (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)) [LclId] = [source_s6po] \r [ol_s6px] let { sat_s6py [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [source_s6po ol_s6px] \u [] $cextract_r6nK ol_s6px source_s6po; } in (,) [sat_s6py ol_s6px]; } in GHC.Base.fmap GHC.Arr.$fFunctorArray sat_s6pz ma_s6pp; } in let { sat_s6pw [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [source_s6po o_s6ps] \u [] $cbefore_r6nJ o_s6ps source_s6po; } in (,,) [sat_s6pw sat_s6pA sat_s6pG]; } in GHC.Base.fmap GHC.Base.$fFunctorMaybe sat_s6pH sat_s6pL; Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText [InlPrag=NOUSERINLINE CONLIKE] :: Text.Regex.Base.RegexLike.RegexLike Text.Regex.TDFA.Common.Regex Data.Text.Internal.Lazy.Text [GblId[DFunId]] = NO_CCS Text.Regex.Base.RegexLike.C:RegexLike! [Text.Regex.TDFA.Text.Lazy.$fExtractText $cmatchOnce_r6nV $cmatchAll_r6nU $cmatchCount_r6nX $cmatchTest_r6nT $cmatchAllText_r6nW $cmatchOnceText_r6nY]; Text.Regex.TDFA.Text.Lazy.regexec :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> Data.Either.Either GHC.Base.String (GHC.Base.Maybe (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text])) [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6pM txt_s6pN] case $cmatchOnceText_r6nY r_s6pM txt_s6pN of { GHC.Base.Nothing -> Data.Either.Right [GHC.Base.Nothing]; GHC.Base.Just ds_s6pP [Occ=Once!] -> case ds_s6pP of { (,,) pre_s6pR [Occ=Once] mt_s6pS post_s6pT [Occ=Once] -> let { sat_s6pZ [Occ=Once] :: [Data.Text.Internal.Lazy.Text] [LclId] = [mt_s6pS] \u [] let { sat_s6pY [Occ=Once] :: [(Data.Text.Internal.Lazy.Text, (Text.Regex.Base.RegexLike.MatchOffset, Text.Regex.Base.RegexLike.MatchLength))] [LclId] = [mt_s6pS] \u [] let { sat_s6pX [Occ=Once] :: [(Data.Text.Internal.Lazy.Text, (Text.Regex.Base.RegexLike.MatchOffset, Text.Regex.Base.RegexLike.MatchLength))] [LclId] = [mt_s6pS] \u [] Data.Array.Base.elems Data.Array.Base.$fIArrayArraye GHC.Arr.$fIxInt mt_s6pS; } in GHC.List.tail sat_s6pX; } in GHC.Base.map Data.Tuple.fst sat_s6pY; } in let { sat_s6pW [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [mt_s6pS] \u [] let { sat_s6pV [Occ=Once] :: (Data.Text.Internal.Lazy.Text, (Text.Regex.Base.RegexLike.MatchOffset, Text.Regex.Base.RegexLike.MatchLength)) [LclId] = [mt_s6pS] \u [] let { sat_s6pU [Occ=Once] :: GHC.Types.Int [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Data.Array.Base.! Data.Array.Base.$fIArrayArraye GHC.Arr.$fIxInt mt_s6pS sat_s6pU; } in Data.Tuple.fst sat_s6pV; } in let { sat_s6q0 [Occ=Once] :: (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text]) [LclId] = NO_CCS (,,,)! [pre_s6pR sat_s6pW post_s6pT sat_s6pZ]; } in let { sat_s6q1 [Occ=Once] :: GHC.Base.Maybe (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text]) [LclId] = NO_CCS GHC.Base.Just! [sat_s6q0]; } in Data.Either.Right [sat_s6q1]; }; }; $cmatch_r6nZ :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text [GblId] = [] \u [] Text.Regex.Base.Impl.polymatch Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText; $cmatchM_r6o0 :: forall (m :: * -> *). GHC.Base.Monad m => Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> m Data.Text.Internal.Lazy.Text [GblId, Arity=1, Unf=OtherCon []] = [] \r [$dMonad_s6q2] Text.Regex.Base.Impl.polymatchM Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText $dMonad_s6q2; Text.Regex.TDFA.Text.Lazy.$fRegexContextRegexTextText [InlPrag=NOUSERINLINE CONLIKE] :: Text.Regex.Base.RegexLike.RegexContext Text.Regex.TDFA.Common.Regex Data.Text.Internal.Lazy.Text Data.Text.Internal.Lazy.Text [GblId[DFunId]] = NO_CCS Text.Regex.Base.RegexLike.C:RegexContext! [Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText $cmatch_r6nZ $cmatchM_r6o0]; Text.Regex.TDFA.Text.Lazy.execute :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> Data.Either.Either GHC.Base.String (GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray) [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6q3 txt_s6q4] let { sat_s6q5 [Occ=Once] :: GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray [LclId] = [r_s6q3 txt_s6q4] \u [] $cmatchOnce_r6nV r_s6q3 txt_s6q4; } in Data.Either.Right [sat_s6q5]; ==================== STG syntax: ==================== sat_s6o2 :: GHC.Types.Int -> GHC.Int.Int64 [LclId] = [] \u [] GHC.Enum.toEnum GHC.Int.$fEnumInt64; $cafter_r6mQ :: GHC.Types.Int -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text [GblId] = [] \u [] GHC.Base.. Data.Text.Lazy.drop sat_s6o2; sat_s6o3 :: GHC.Types.Int -> GHC.Int.Int64 [LclId] = [] \u [] GHC.Enum.toEnum GHC.Int.$fEnumInt64; $cbefore_r6nJ :: GHC.Types.Int -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text [GblId] = [] \u [] GHC.Base.. Data.Text.Lazy.take sat_s6o3; Text.Regex.TDFA.Text.Lazy.$fExtractText [InlPrag=NOUSERINLINE CONLIKE] :: Text.Regex.Base.RegexLike.Extract Data.Text.Internal.Lazy.Text [GblId[DFunId]] = NO_CCS Text.Regex.Base.RegexLike.C:Extract! [$cbefore_r6nJ $cafter_r6mQ Data.Text.Internal.Lazy.empty $cextract_r6nK]; $cextract_r6nK :: (GHC.Types.Int, GHC.Types.Int) -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text [GblId] = [] \u [] Text.Regex.Base.RegexLike.$dmextract Text.Regex.TDFA.Text.Lazy.$fExtractText; Text.Regex.TDFA.Text.Lazy.$fUnconsText [InlPrag=INLINE (sat-args=0)] :: Text.Regex.TDFA.NewDFA.Uncons.Uncons Data.Text.Internal.Lazy.Text [GblId[DFunId(nt)]] = [] \u [] Data.Text.Lazy.uncons; $cmakeRegexOptsM_r6nL :: forall (m :: * -> *). GHC.Base.Monad m => Text.Regex.TDFA.Common.CompOption -> Text.Regex.TDFA.Common.ExecOption -> Data.Text.Internal.Lazy.Text -> m Text.Regex.TDFA.Common.Regex [GblId, Arity=4, Unf=OtherCon []] = [] \r [$dMonad_s6o4 c_s6o5 e_s6o6 source_s6o7] let { sat_s6o8 [Occ=Once] :: GHC.Base.String [LclId] = [source_s6o7] \u [] Data.Text.Lazy.unpack source_s6o7; } in Text.Regex.Base.RegexLike.makeRegexOptsM Text.Regex.TDFA.String.$fRegexMakerRegexCompOptionExecOption[] $dMonad_s6o4 c_s6o5 e_s6o6 sat_s6o8; Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText [InlPrag=NOUSERINLINE CONLIKE] :: Text.Regex.Base.RegexLike.RegexMaker Text.Regex.TDFA.Common.Regex Text.Regex.TDFA.Common.CompOption Text.Regex.TDFA.Common.ExecOption Data.Text.Internal.Lazy.Text [GblId[DFunId]] = NO_CCS Text.Regex.Base.RegexLike.C:RegexMaker! [Text.Regex.TDFA.Common.$fRegexOptionsRegexCompOptionExecOption $cmakeRegex_r6nN $cmakeRegexOpts_r6nM $cmakeRegexM_r6nO $cmakeRegexOptsM_r6nL]; $cmakeRegexOpts_r6nM :: Text.Regex.TDFA.Common.CompOption -> Text.Regex.TDFA.Common.ExecOption -> Data.Text.Internal.Lazy.Text -> Text.Regex.TDFA.Common.Regex [GblId] = [] \u [] Text.Regex.Base.RegexLike.$dmmakeRegexOpts Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText; $cmakeRegex_r6nN :: Data.Text.Internal.Lazy.Text -> Text.Regex.TDFA.Common.Regex [GblId] = [] \u [] Text.Regex.Base.RegexLike.$dmmakeRegex Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText; $cmakeRegexM_r6nO :: forall (m :: * -> *). GHC.Base.Monad m => Data.Text.Internal.Lazy.Text -> m Text.Regex.TDFA.Common.Regex [GblId, Arity=1, Unf=OtherCon []] = [] \r [$dMonad_s6o9] Text.Regex.Base.RegexLike.$dmmakeRegexM Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText $dMonad_s6o9; $trModule1_r6nP :: GHC.Prim.Addr# [GblId, Caf=NoCafRefs, Unf=OtherCon []] = "main"#; $trModule2_r6nQ :: GHC.Types.TrName [GblId, Caf=NoCafRefs, Unf=OtherCon []] = NO_CCS GHC.Types.TrNameS! [$trModule1_r6nP]; $trModule3_r6nR :: GHC.Prim.Addr# [GblId, Caf=NoCafRefs, Unf=OtherCon []] = "Text.Regex.TDFA.Text.Lazy"#; $trModule4_r6nS :: GHC.Types.TrName [GblId, Caf=NoCafRefs, Unf=OtherCon []] = NO_CCS GHC.Types.TrNameS! [$trModule3_r6nR]; Text.Regex.TDFA.Text.Lazy.$trModule :: GHC.Types.Module [GblId, Caf=NoCafRefs, Unf=OtherCon []] = NO_CCS GHC.Types.Module! [$trModule2_r6nQ $trModule4_r6nS]; $cmatchTest_r6nT :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> GHC.Types.Bool [GblId] = [] \u [] Text.Regex.TDFA.NewDFA.Tester.matchTest Data.Text.Lazy.uncons; $cmatchAll_r6nU :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> [Text.Regex.Base.RegexLike.MatchArray] [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6oa s_s6ob] let { sat_s6od [Occ=Once] :: GHC.Types.Char [LclId] = NO_CCS GHC.Types.C#! ['\n'#]; } in let { sat_s6oc [Occ=Once] :: Text.Regex.TDFA.Common.Position [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Text.Regex.TDFA.NewDFA.Engine.execMatch Data.Text.Lazy.uncons r_s6oa sat_s6oc sat_s6od s_s6ob; Text.Regex.TDFA.Text.Lazy.compile :: Text.Regex.TDFA.Common.CompOption -> Text.Regex.TDFA.Common.ExecOption -> Data.Text.Internal.Lazy.Text -> Data.Either.Either GHC.Base.String Text.Regex.TDFA.Common.Regex [GblId, Arity=3, Unf=OtherCon []] = [] \r [compOpt_s6oe execOpt_s6of txt_s6og] let { sat_s6oh [Occ=Once] :: GHC.Base.String [LclId] = [txt_s6og] \u [] Data.Text.Lazy.unpack txt_s6og; } in case Text.Regex.TDFA.ReadRegex.parseRegex sat_s6oh of { Data.Either.Left err_s6oj [Occ=Once] -> let { sat_s6om [Occ=Once] :: [GHC.Types.Char] [LclId] = [err_s6oj] \u [] let { sat_s6ol [Occ=Once] :: [GHC.Types.Char] [LclId] = [err_s6oj] \u [] GHC.Show.show Text.Parsec.Error.$fShowParseError err_s6oj; } in let { sat_s6ok [Occ=Once] :: [GHC.Types.Char] [LclId] = [] \u [] GHC.CString.unpackCString# "parseRegex for Text.Regex.TDFA.Text.Lazy failed:"#; } in GHC.Base.++ sat_s6ok sat_s6ol; } in Data.Either.Left [sat_s6om]; Data.Either.Right pattern_s6on [Occ=Once] -> let { sat_s6oo [Occ=Once] :: Text.Regex.TDFA.Common.Regex [LclId] = [compOpt_s6oe execOpt_s6of pattern_s6on] \u [] Text.Regex.TDFA.TDFA.patternToRegex pattern_s6on compOpt_s6oe execOpt_s6of; } in Data.Either.Right [sat_s6oo]; }; $cmatchOnce_r6nV :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6op s_s6oq] let { sat_s6or [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray] [LclId] = [r_s6op s_s6oq] \u [] $cmatchAll_r6nU r_s6op s_s6oq; } in Data.Maybe.listToMaybe sat_s6or; $cmatchAllText_r6nW :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> [Text.Regex.Base.RegexLike.MatchText Data.Text.Internal.Lazy.Text] [GblId, Arity=2, Unf=OtherCon []] = [] \r [regex_s6os source_s6ot] let { go_s6ou [Occ=LoopBreaker] :: GHC.Types.Int -> Data.Text.Internal.Lazy.Text -> [GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)] -> [GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))] [LclId, Arity=3, Unf=OtherCon []] = sat-only [go_s6ou] \r [i_s6ov ds_s6ow ds1_s6ox] case i_s6ov of i1_s6oy { GHC.Types.I# _ [Occ=Dead] -> case ds1_s6ox of { [] -> [] []; : x_s6oB xs_s6oC [Occ=Once] -> let { ds2_s6oD :: (GHC.Types.Int, GHC.Types.Int) [LclId] = [x_s6oB] \u [] let { sat_s6oE [Occ=Once] :: GHC.Types.Int [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Data.Array.Base.! Data.Array.Base.$fIArrayArraye GHC.Arr.$fIxInt x_s6oB sat_s6oE; } in let { off0_s6oF :: GHC.Types.Int [LclId] = [ds2_s6oD] \u [] case ds2_s6oD of { (,) off1_s6oH [Occ=Once] _ [Occ=Dead] -> off1_s6oH; }; } in let { len0_s6oJ :: GHC.Types.Int [LclId] = [ds2_s6oD] \u [] case ds2_s6oD of { (,) _ [Occ=Dead] len1_s6oM [Occ=Once] -> len1_s6oM; }; } in let { sat_s6p0 [Occ=Once] :: [GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))] [LclId] = [go_s6ou ds_s6ow i1_s6oy xs_s6oC off0_s6oF len0_s6oJ] \u [] let { sat_s6oX [Occ=Once] :: GHC.Types.Int [LclId] = [i1_s6oy off0_s6oF len0_s6oJ] \u [] let { sat_s6oW [Occ=Once] :: GHC.Types.Int [LclId] = [i1_s6oy len0_s6oJ] \u [] GHC.Num.- GHC.Num.$fNumInt len0_s6oJ i1_s6oy; } in GHC.Num.+ GHC.Num.$fNumInt off0_s6oF sat_s6oW; } in case $cafter_r6mQ sat_s6oX ds_s6ow of t'_s6oY { __DEFAULT -> let { sat_s6oZ [Occ=Once] :: GHC.Types.Int [LclId] = [off0_s6oF len0_s6oJ] \u [] GHC.Num.+ GHC.Num.$fNumInt off0_s6oF len0_s6oJ; } in go_s6ou sat_s6oZ t'_s6oY xs_s6oC; }; } in let { sat_s6oV [Occ=Once] :: GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)) [LclId] = [ds_s6ow i1_s6oy x_s6oB] \u [] let { sat_s6oU [Occ=Once] :: (GHC.Types.Int, GHC.Types.Int) -> (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)) [LclId] = [ds_s6ow i1_s6oy] \r [pair_s6oN] case pair_s6oN of wild1_s6oO { (,) off_s6oP [Occ=Once] len_s6oQ [Occ=Once] -> let { sat_s6oT [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [ds_s6ow i1_s6oy off_s6oP len_s6oQ] \u [] let { sat_s6oR [Occ=Once] :: GHC.Types.Int [LclId] = [i1_s6oy off_s6oP] \u [] GHC.Num.- GHC.Num.$fNumInt off_s6oP i1_s6oy; } in let { sat_s6oS [Occ=Once] :: (GHC.Types.Int, GHC.Types.Int) [LclId] = NO_CCS (,)! [sat_s6oR len_s6oQ]; } in $cextract_r6nK sat_s6oS ds_s6ow; } in (,) [sat_s6oT wild1_s6oO]; }; } in GHC.Base.fmap GHC.Arr.$fFunctorArray sat_s6oU x_s6oB; } in : [sat_s6oV sat_s6p0]; }; }; } in let { sat_s6p2 [Occ=Once] :: [GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)] [LclId] = [regex_s6os source_s6ot] \u [] $cmatchAll_r6nU regex_s6os source_s6ot; } in let { sat_s6p1 [Occ=Once] :: GHC.Types.Int [LclId] = NO_CCS GHC.Types.I#! [0#]; } in go_s6ou sat_s6p1 source_s6ot sat_s6p2; $cmatchCount_r6nX :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> GHC.Types.Int [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6p3 s_s6p4] let { sat_s6pm [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray] [LclId] = [r_s6p3 s_s6p4] \u [] let { sat_s6pl [Occ=Once] :: GHC.Types.Char [LclId] = NO_CCS GHC.Types.C#! ['\n'#]; } in let { sat_s6pk [Occ=Once] :: Text.Regex.TDFA.Common.Position [LclId] = NO_CCS GHC.Types.I#! [0#]; } in let { sat_s6pj [Occ=Once] :: Text.Regex.TDFA.Common.Regex [LclId] = [r_s6p3] \u [] case r_s6p3 of wild_s6p5 { Text.Regex.TDFA.Common.Regex ds_s6p6 [Occ=Once] ds1_s6p7 [Occ=Once] ds2_s6p8 [Occ=Once] ds3_s6p9 [Occ=Once] ds4_s6pa [Occ=Once] ds5_s6pb [Occ=Once] ds6_s6pc [Occ=Once] ds7_s6pd [Occ=Once] ds8_s6pe [Occ=Once] _ [Occ=Dead] -> let { sat_s6pi [Occ=Once] :: Text.Regex.TDFA.Common.ExecOption [LclId] = [wild_s6p5] \u [] case Text.Regex.TDFA.Common.regex_execOptions wild_s6p5 of { Text.Regex.TDFA.Common.ExecOption _ [Occ=Dead] -> Text.Regex.TDFA.Common.ExecOption [GHC.Types.False]; }; } in Text.Regex.TDFA.Common.Regex [ds_s6p6 ds1_s6p7 ds2_s6p8 ds3_s6p9 ds4_s6pa ds5_s6pb ds6_s6pc ds7_s6pd ds8_s6pe sat_s6pi]; }; } in Text.Regex.TDFA.NewDFA.Engine.execMatch Data.Text.Lazy.uncons sat_s6pj sat_s6pk sat_s6pl s_s6p4; } in Data.Foldable.length Data.Foldable.$fFoldable[] sat_s6pm; $cmatchOnceText_r6nY :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> GHC.Base.Maybe (Data.Text.Internal.Lazy.Text, Text.Regex.Base.RegexLike.MatchText Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text) [GblId, Arity=2, Unf=OtherCon []] = [] \r [regex_s6pn source_s6po] let { sat_s6pL [Occ=Once] :: GHC.Base.Maybe (GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)) [LclId] = [regex_s6pn source_s6po] \u [] let { sat_s6pK [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray] [LclId] = [regex_s6pn source_s6po] \u [] let { sat_s6pJ [Occ=Once] :: GHC.Types.Char [LclId] = NO_CCS GHC.Types.C#! ['\n'#]; } in let { sat_s6pI [Occ=Once] :: Text.Regex.TDFA.Common.Position [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Text.Regex.TDFA.NewDFA.Engine.execMatch Data.Text.Lazy.uncons regex_s6pn sat_s6pI sat_s6pJ source_s6po; } in Data.Maybe.listToMaybe sat_s6pK; } in let { sat_s6pH [Occ=Once] :: GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int) -> (Data.Text.Internal.Lazy.Text, GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)), Data.Text.Internal.Lazy.Text) [LclId] = [source_s6po] \r [ma_s6pp] let { ds_s6pq :: (GHC.Types.Int, GHC.Types.Int) [LclId] = [ma_s6pp] \u [] let { sat_s6pr [Occ=Once] :: GHC.Types.Int [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Data.Array.Base.! Data.Array.Base.$fIArrayArraye GHC.Arr.$fIxInt ma_s6pp sat_s6pr; } in let { o_s6ps :: GHC.Types.Int [LclId] = [ds_s6pq] \u [] case ds_s6pq of { (,) o1_s6pu [Occ=Once] _ [Occ=Dead] -> o1_s6pu; }; } in let { sat_s6pG [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [source_s6po ds_s6pq o_s6ps] \u [] let { sat_s6pF [Occ=Once] :: GHC.Types.Int [LclId] = [ds_s6pq o_s6ps] \u [] let { sat_s6pE [Occ=Once] :: GHC.Types.Int [LclId] = [ds_s6pq] \u [] case ds_s6pq of { (,) _ [Occ=Dead] l_s6pD [Occ=Once] -> l_s6pD; }; } in GHC.Num.+ GHC.Num.$fNumInt o_s6ps sat_s6pE; } in $cafter_r6mQ sat_s6pF source_s6po; } in let { sat_s6pA [Occ=Once] :: GHC.Arr.Array GHC.Types.Int (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)) [LclId] = [source_s6po ma_s6pp] \u [] let { sat_s6pz [Occ=Once] :: (GHC.Types.Int, GHC.Types.Int) -> (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)) [LclId] = [source_s6po] \r [ol_s6px] let { sat_s6py [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [source_s6po ol_s6px] \u [] $cextract_r6nK ol_s6px source_s6po; } in (,) [sat_s6py ol_s6px]; } in GHC.Base.fmap GHC.Arr.$fFunctorArray sat_s6pz ma_s6pp; } in let { sat_s6pw [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [source_s6po o_s6ps] \u [] $cbefore_r6nJ o_s6ps source_s6po; } in (,,) [sat_s6pw sat_s6pA sat_s6pG]; } in GHC.Base.fmap GHC.Base.$fFunctorMaybe sat_s6pH sat_s6pL; Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText [InlPrag=NOUSERINLINE CONLIKE] :: Text.Regex.Base.RegexLike.RegexLike Text.Regex.TDFA.Common.Regex Data.Text.Internal.Lazy.Text [GblId[DFunId]] = NO_CCS Text.Regex.Base.RegexLike.C:RegexLike! [Text.Regex.TDFA.Text.Lazy.$fExtractText $cmatchOnce_r6nV $cmatchAll_r6nU $cmatchCount_r6nX $cmatchTest_r6nT $cmatchAllText_r6nW $cmatchOnceText_r6nY]; Text.Regex.TDFA.Text.Lazy.regexec :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> Data.Either.Either GHC.Base.String (GHC.Base.Maybe (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text])) [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6pM txt_s6pN] case $cmatchOnceText_r6nY r_s6pM txt_s6pN of { GHC.Base.Nothing -> Data.Either.Right [GHC.Base.Nothing]; GHC.Base.Just ds_s6pP [Occ=Once!] -> case ds_s6pP of { (,,) pre_s6pR [Occ=Once] mt_s6pS post_s6pT [Occ=Once] -> let { sat_s6pZ [Occ=Once] :: [Data.Text.Internal.Lazy.Text] [LclId] = [mt_s6pS] \u [] let { sat_s6pY [Occ=Once] :: [(Data.Text.Internal.Lazy.Text, (Text.Regex.Base.RegexLike.MatchOffset, Text.Regex.Base.RegexLike.MatchLength))] [LclId] = [mt_s6pS] \u [] let { sat_s6pX [Occ=Once] :: [(Data.Text.Internal.Lazy.Text, (Text.Regex.Base.RegexLike.MatchOffset, Text.Regex.Base.RegexLike.MatchLength))] [LclId] = [mt_s6pS] \u [] Data.Array.Base.elems Data.Array.Base.$fIArrayArraye GHC.Arr.$fIxInt mt_s6pS; } in GHC.List.tail sat_s6pX; } in GHC.Base.map Data.Tuple.fst sat_s6pY; } in let { sat_s6pW [Occ=Once] :: Data.Text.Internal.Lazy.Text [LclId] = [mt_s6pS] \u [] let { sat_s6pV [Occ=Once] :: (Data.Text.Internal.Lazy.Text, (Text.Regex.Base.RegexLike.MatchOffset, Text.Regex.Base.RegexLike.MatchLength)) [LclId] = [mt_s6pS] \u [] let { sat_s6pU [Occ=Once] :: GHC.Types.Int [LclId] = NO_CCS GHC.Types.I#! [0#]; } in Data.Array.Base.! Data.Array.Base.$fIArrayArraye GHC.Arr.$fIxInt mt_s6pS sat_s6pU; } in Data.Tuple.fst sat_s6pV; } in let { sat_s6q0 [Occ=Once] :: (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text]) [LclId] = NO_CCS (,,,)! [pre_s6pR sat_s6pW post_s6pT sat_s6pZ]; } in let { sat_s6q1 [Occ=Once] :: GHC.Base.Maybe (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text]) [LclId] = NO_CCS GHC.Base.Just! [sat_s6q0]; } in Data.Either.Right [sat_s6q1]; }; }; $cmatch_r6nZ :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text [GblId] = [] \u [] Text.Regex.Base.Impl.polymatch Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText; $cmatchM_r6o0 :: forall (m :: * -> *). GHC.Base.Monad m => Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> m Data.Text.Internal.Lazy.Text [GblId, Arity=1, Unf=OtherCon []] = [] \r [$dMonad_s6q2] Text.Regex.Base.Impl.polymatchM Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText $dMonad_s6q2; Text.Regex.TDFA.Text.Lazy.$fRegexContextRegexTextText [InlPrag=NOUSERINLINE CONLIKE] :: Text.Regex.Base.RegexLike.RegexContext Text.Regex.TDFA.Common.Regex Data.Text.Internal.Lazy.Text Data.Text.Internal.Lazy.Text [GblId[DFunId]] = NO_CCS Text.Regex.Base.RegexLike.C:RegexContext! [Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText $cmatch_r6nZ $cmatchM_r6o0]; Text.Regex.TDFA.Text.Lazy.execute :: Text.Regex.TDFA.Common.Regex -> Data.Text.Internal.Lazy.Text -> Data.Either.Either GHC.Base.String (GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray) [GblId, Arity=2, Unf=OtherCon []] = [] \r [r_s6q3 txt_s6q4] let { sat_s6q5 [Occ=Once] :: GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray [LclId] = [r_s6q3 txt_s6q4] \u [] $cmatchOnce_r6nV r_s6q3 txt_s6q4; } in Data.Either.Right [sat_s6q5]; }}} Which I think is strange, because `wnext2` doesn't look like anything GHC would autogenerate, but rather like a name a human would choose on purpose. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by mpickering): There is a function called `next` in `regex-tdfa` in `Text/Regex/TDFA/NewDFA` which is probably the one you are looking for? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Right, didn't think to try without the `w-` and the `-2`. On my setup, there's no `next` in `Text/Regex/TDFA/NewDFA`, but four similar modules underneath it (`Engine`, `Engine_FA`, `Engine_NC`, and `Engine_NC_FA`) do have such a function defined in a let binding. It seems that the `Lazy` module ends up using the one in `Engine` (without any suffix), via `execMatch`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Something is spitting out this line in the ticky dump {{{ 2250377760180010799840 0 2 Mi $wnext2{v reC2} (regex-tdfa- text-1.0.0.3-CIfFZ6rjdCoJI5EFpqTwBO:Text.Regex.TDFA.Text.Lazy) (fun) }}} So the string "$wnext2{v reC2}" must appear in some `.o` file. I'd do `strings *.o | grep ` to find it. It it easy to repro? I don't see precise instructions above. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by ntc2): @simonpj, you can reproduce by checking out the repo with the `slow-regex` test package (https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug), building it, and then running the `slow-regex` executable that produces with arguments "text-lazy" and the name of one of the test files. E.g. {{{ slow-regex text-lazy defs-10000.txt }}} The `defs-10000.txt` test file is included with the test package. With GHC 8.2.2, using new Cabal: {{{ git clone git@github.com:ntc2/ghc-8.2.1-regex-lazy-text-bug.git cd ghc-8.2.1-regex-lazy-text-bug.git cabal new-build slow-regex ./dist-newstyle/build/x86_64-linux/ghc-8.2.2/slow-regex-0.1.0.0/c/slow- regex/build/slow-regex/slow-regex text-lazy defs-10000.txt }}} With GHC 8.2.2, using Stack: {{{ git clone git@github.com:ntc2/ghc-8.2.1-regex-lazy-text-bug.git cd ghc-8.2.1-regex-lazy-text-bug.git stack build --stack-yaml stack-ghc-8.2.2.yaml slow-regex stack exec --stack-yaml stack-ghc-8.2.2.yaml slow-regex text-lazy defs-10000.txt }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers):
So the string "$wnext2{v reC2}" must appear in some .o file. I'd do strings *.o | grep to find it.
Using `strings -f **/*.o | grep wnext2`, I get: {{{ regex-tdfa-text-1.0.0.3/dist/dist-sandbox- 3f35acf6/build/Text/Regex/TDFA/Text/Lazy.o: $wnext2{v reyv} (regex-tdfa- text-1.0.0.3-CIfFZ6rjdCoJI5EFpqTwBO:Text.Regex.TDFA.Text.Lazy) (fun) }}} There is no identifier `wnext` or `next` in that module, but it does have this: {{{ {-# SPECIALIZE execMatch :: Regex -> Position -> Char -> L.Text -> [MatchArray] #-} execMatch :: Uncons text => Regex -> Position -> Char -> text -> [MatchArray] execMatch = Engine.execMatch }}} `execMatch` imported from `Text.Regex.TDFA.NewDFA.Engine`, in turn, defines local functions `next` and `next'`; the alias defined here seems to mainly exist in order to add the `SPECIALIZE` pragma for lazy `Text`. `execMatch` is monstrously large and quite complex, so I haven't managed to figure out entirely how it works, however there is a promising starting point for further digging: Whether or not `SPECIALIZE` triggers might depend on debugging and/or profiling build flags, which would explain why the problem disappears in a profiling build. The hypothesis would be that the specialized code does something that the non-specialized code doesn't do, and that something ends up being *less* efficient rather than *more*. The `execMatch` function is polymorphic over, among other things, the `text` type, with a typeclass constraint `Uncons text` (documentation [http://hackage.haskell.org/package/regex-tdfa-1.2.2/docs/Text-Regex-TDFA- NewDFA-Uncons.html here]), so we should expect `uncons` to inline for the specialized version, but not the un-specialized one (because in the specialized case we can statically resolve it to the non-polymorphic `uncons` from `Data.Text.Lazy`), and then, possibly maybe, the inlined uncons leads to a space leak. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers):
It it easy to repro? I don't see precise instructions above.
It's a bit elaborate; what I did is roughly this: 1. Set up a new cabal sandbox 2. Attempt to install `regex-tdfa-text` and its dependencies, using GHC HEAD, and with `-ticky` enabled. 3. Step 2 will fail due to several dependencies no longer compiling under GHC 8.2. Do whatever it takes to convince cabal otherwise: `--allow-newer` works for most, but for a few libraries, I had to check out the sources and install from the local source checkout. I started this was before the 8.2 release though, so it's possible that all the dependencies install cleanly by now. 4. Then inside the sandbox, compile and run an example that triggers the problematic behavior. I used a trimmed-down version of [https://ghc.haskell.org/trac/ghc/attachment/ticket/14519/Main.hs], feeding it the 30000-line example attached to this ticket. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
There is no identifier wnext or next in that module
That's fine -- it will have come from inlining. But I'd be surprised if there was no `wnext` in the `-ddump-simpl` or `-ddump-stg` for that module. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:23 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

That's fine -- it will have come from inlining. But I'd be surprised if
#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): there was no wnext in the -ddump-simpl or -ddump-stg for that module. Figured it out, I typo-ed my cabal command - `cabal install regex-tdfa- text` will of course not touch the local checkout of regex-tdfa-text, and so `-ddump-simpl -ddump-to-file` doesn't put dumps where I expect them, and grepping for `wnext` won't produce anything useful. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): - Compile without optimizations - Compile without the SPECIALIZE pragma -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:25 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Another line of approach.... Does the problem happen without `-O`? If not (and I bet it doesn't), then which modules must be compiled without `-O` to make the problem go away? And if compiling M without -O makes the problem go away, try with `-O` and `-fno-float-in` and other similar things to switch off various optimisations. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:26 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * milestone: 8.4.1 => 8.6.1 Comment: This likely won't be happening for 8.4. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:27 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Can confirm that the problem "goes away" when compiling with `-O0` (100 ms execution time vs. 50,000 for 30,000 lines of input). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:28 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers):
If not (and I bet it doesn't), then which modules must be compiled without -O to make the problem go away?
Any quick hints on how to do that? Can I somehow instruct cabal to use specific GHC flags selectively for some modules? Setting `-O0` for the main module alone doesn't make the problem go away in any case, while setting `-O0` for the entire regex-tdfa-text package does. But that doesn't tell us much of course. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:29 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by mpickering): Use an `{-# OPTIONS_GHC -O0 #-}` pragma at the top of the file you want to compile without optimisation. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:30 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers):
Use an {-# OPTIONS_GHC -O0 #-} pragma at the top of the file you want to compile without optimisation.
Hmm, I was hoping I could do it without changing the source files, that would be easier to script. But it'll have to do. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:31 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by mpickering): Right but you probably only need to add it to the module which creates the worker `$wnext`. Does compiling with `-fno-worker-wrapper` fix it? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:32 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers):
Does compiling with -fno-worker-wrapper fix it?
Haven't tried this one yet; `-fno-enable-rewrite-rules` seems to fix it, the other optimization flags listed as "implied by -O" on [https://downloads.haskell.org/~ghc/7.8.2/docs/html/users_guide/flag- reference.html#options-f-compact] don't seem to make a difference. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:33 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Specifically, running the minimal test code over defs-30000, with various flags, and an execution time limit of 1 second (enough to detect the "bad" behavior), gives us this list: {{{ unoptimized (-O0): Time: 101 ms no-float-in (-O -fno-float-in): Timeout exceeded no-strictness (-O -fno-strictness): Timeout exceeded no-full-laziness (-O -fno-full-laziness): Timeout exceeded no-specialise (-O -fno-specialise): Timeout exceeded no-do-eta-reduction (-O -fno-do-eta-reduction): Timeout exceeded no-cse (-O -fno-cse): Timeout exceeded no-case-merge (-O -fno-case-merge): Timeout exceeded no-enable-rewrite-rules (-O -fno-enable-rewrite-rules): Time: 88 ms no-worker-wrapper (-O -fno-worker-wrapper): Timeout exceeded }}} In other words, among these optimizations, `enable-rewrite-rules` seems to be the one that triggers the bad behavior. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:34 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by mpickering): You can try using `-ddump-rule-firings` to see which rules are firing. Be careful though, disabling rules will cripple other optimisation passes which rely on them to work properly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:35 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Disabling all optimizations (`-O0`) *only* on the `Text.Regex.TDFA.Text.Lazy`, with `-O` on all other modules, also produces "good" behavior. Removing the `SPECIALIZE` pragma on `execMatch` in that module however does *not* make a difference. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:36 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Compiling with -ddump-rule-firings gives us: {{{ Rule fired: Class op toEnum (BUILTIN) Rule fired: Class op toEnum (BUILTIN) Rule fired: Class op before (BUILTIN) Rule fired: Class op after (BUILTIN) Rule fired: LAZY TEXT drop -> fused (Data.Text.Lazy) Rule fired: LAZY TEXT take -> fused (Data.Text.Lazy) Rule fired: LAZY STREAM stream/unstream fusion (Data.Text.Internal.Lazy.Fusion) Rule fired: Class op makeRegexOptsM (BUILTIN) Rule fired: Class op $p1RegexMaker (BUILTIN) Rule fired: Class op makeRegexOptsM (BUILTIN) Rule fired: Class op defaultCompOpt (BUILTIN) Rule fired: Class op defaultExecOpt (BUILTIN) Rule fired: Class op makeRegexOptsM (BUILTIN) Rule fired: Class op $p1RegexMaker (BUILTIN) Rule fired: Class op makeRegexOpts (BUILTIN) Rule fired: Class op defaultCompOpt (BUILTIN) Rule fired: Class op defaultExecOpt (BUILTIN) Rule fired: unpack (GHC.Base) Rule fired: Class op show (BUILTIN) Rule fired: ++ (GHC.Base) Rule fired: fold/build (GHC.Base) Rule fired: Class op fmap (BUILTIN) Rule fired: Class op matchOnce (BUILTIN) Rule fired: Class op bounds (BUILTIN) Rule fired: Class op unsafeAt (BUILTIN) Rule fired: Class op numElements (BUILTIN) Rule fired: Class op index (BUILTIN) Rule fired: Class op before (BUILTIN) Rule fired: LAZY TEXT take -> fused (Data.Text.Lazy) Rule fired: Class op fmap (BUILTIN) Rule fired: Class op extract (BUILTIN) Rule fired: Class op after (BUILTIN) Rule fired: Class op + (BUILTIN) Rule fired: LAZY TEXT drop -> fused (Data.Text.Lazy) Rule fired: Class op length (BUILTIN) Rule fired: Class op matchAll (BUILTIN) Rule fired: length (GHC.List) Rule fired: unpack (GHC.Base) Rule fired: unpack (GHC.Base) Rule fired: unpack (GHC.Base) Rule fired: unpack (GHC.Base) Rule fired: Class op bounds (BUILTIN) Rule fired: Class op unsafeAt (BUILTIN) Rule fired: Class op numElements (BUILTIN) Rule fired: Class op index (BUILTIN) Rule fired: Class op fmap (BUILTIN) Rule fired: Class op extract (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op after (BUILTIN) Rule fired: Class op + (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: LAZY TEXT drop -> fused (Data.Text.Lazy) Rule fired: Class op + (BUILTIN) Rule fired: Class op matchAll (BUILTIN) Rule fired: Class op matchAll (BUILTIN) Rule fired: Class op matchOnceText (BUILTIN) Rule fired: Class op bounds (BUILTIN) Rule fired: Class op unsafeAt (BUILTIN) Rule fired: Class op numElements (BUILTIN) Rule fired: Class op index (BUILTIN) Rule fired: Class op numElements (BUILTIN) Rule fired: Class op unsafeAt (BUILTIN) Rule fired: map (GHC.Base) Rule fired: Class op matchOnce (BUILTIN) Rule fired: Class op matchOnceText (BUILTIN) Rule fired: Class op $p1Integral (BUILTIN) Rule fired: Class op $p1Real (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op $p2Real (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op toInteger (BUILTIN) Rule fired: smallIntegerToInt (BUILTIN) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: Class op $p1Integral (BUILTIN) Rule fired: Class op $p1Real (BUILTIN) Rule fired: Class op $p2Real (BUILTIN) Rule fired: Class op toInteger (BUILTIN) Rule fired: smallIntegerToInt (BUILTIN) Rule fired: Class op $p1Integral (BUILTIN) Rule fired: Class op $p1Real (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op $p2Real (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op toInteger (BUILTIN) Rule fired: smallIntegerToInt (BUILTIN) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy) Rule fired: Class op $p1Integral (BUILTIN) Rule fired: Class op $p1Real (BUILTIN) Rule fired: Class op $p2Real (BUILTIN) Rule fired: Class op toInteger (BUILTIN) Rule fired: smallIntegerToInt (BUILTIN) Rule fired: Class op $p1RegexLike (BUILTIN) Rule fired: Class op empty (BUILTIN) Rule fired: Class op matchOnceText (BUILTIN) Rule fired: Class op $p1RegexLike (BUILTIN) Rule fired: Class op matchOnceText (BUILTIN) Rule fired: Class op empty (BUILTIN) Rule fired: SPEC/Text.Regex.TDFA.Text.Lazy polymatch @ Regex @ Text (Text.Regex.TDFA.Text.Lazy) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: foldr/app (GHC.Base) Rule fired: unpack-append (GHC.Base) Rule fired: foldr/app (GHC.Base) Rule fired: unpack-append (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: foldr/app (GHC.Base) Rule fired: unpack-append (GHC.Base) Rule fired: foldr/app (GHC.Base) Rule fired: unpack-append (GHC.Base) Rule fired: lengthList (GHC.List) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: foldr/app (GHC.Base) Rule fired: unpack-append (GHC.Base) Rule fired: foldr/app (GHC.Base) Rule fired: unpack-append (GHC.Base) Rule fired: mapList (GHC.Base) Rule fired: unpack-list (GHC.Base) Rule fired: unpack-append (GHC.Base) Rule fired: Class op $p1Integral (BUILTIN) Rule fired: Class op $p1Real (BUILTIN) Rule fired: Class op $p2Real (BUILTIN) Rule fired: Class op $p1Integral (BUILTIN) Rule fired: Class op $p1Real (BUILTIN) Rule fired: Class op $p2Real (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op toInteger (BUILTIN) Rule fired: smallIntegerToInt (BUILTIN) Rule fired: Class op $p1Integral (BUILTIN) Rule fired: Class op $p1Real (BUILTIN) Rule fired: Class op $p2Real (BUILTIN) Rule fired: Class op toInteger (BUILTIN) Rule fired: smallIntegerToInt (BUILTIN) Rule fired: Class op $p1Integral (BUILTIN) Rule fired: Class op $p1Real (BUILTIN) Rule fired: Class op $p2Real (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op $p1Integral (BUILTIN) Rule fired: Class op $p1Real (BUILTIN) Rule fired: Class op $p2Real (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op <= (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: Class op - (BUILTIN) Rule fired: Class op fromInteger (BUILTIN) Rule fired: integerToInt (BUILTIN) Rule fired: *# (BUILTIN) Rule fired: *# (BUILTIN) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:37 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Evidence so far: - We already suspected that whatever we're doing in `uncons` for lazy `Text` might break things. - The `next` function, which uses `uncons` a lot, is where things blow up in ticky profiles - Problem disappears when disabling rewrite rules. - Problem also disappears in profiling builds; this may however be due to rewrites and other optimizations not being fully applied in the presence of explicit cost centres. - Some `RULES` wrt `Text` fusion appear in the dump. So, updated hypothesis: This may not actually be a bug in GHC itself, but in the `text` library, particularly the fusion-related rules. Things like [https://github.com/haskell/text/pull/200 this PR] support the idea that `text`'s fusion behavior may not be perfect yet. If this is the case, then disabling the relevant RULES in `text` should also make the problem disappear. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:38 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

This may not actually be a bug in GHC itself, but in the text library,
#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): particularly the fusion-related rules. Sounds plausible. Since there only very few such rules triggered in the trace you sent, you could disable them individually and see what happens. Then hand it off to the text library maintainers! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:39 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Neutering the following `RULES` in `Data.Text.Lazy` (from `text`) makes the problem go away: - LAZY TEXT take -> fused - LAZY TEXT take -> unfused - LAZY TEXT drop -> fused - LAZY TEXT drop -> unfused Will now try to isolate which one of these causes the trouble. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:40 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Neutering only these two rules also makes the problem go away: - LAZY TEXT take -> fused - LAZY TEXT drop -> fused Neutering only the first one (take -> fused), the problem persists. Neutering only the last one (drop -> fused), the problem disappears. In other words, it seems that the culprit is this rule, somewhere around line #1119 in Data.Text.Lazy.hs: {{{ "LAZY TEXT drop -> fused" [~1] forall n t. drop n t = unstream (S.drop n (stream t)) }}} Going to run a few more tests to make sure though. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:41 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Clean build with only this one rule neutered confirms. So this particular rule causes, or at least triggers, the bug. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:42 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): A smaller reproduction case that depends only on `text`: {{{ module Main where import qualified Data.Text.Lazy as LText import Data.Char import System.Environment type LText = LText.Text matches :: LText -> ([LText], [LText]) matches str = case LText.uncons str of Nothing -> ([], []) Just (c, str') -> let (upperMatches, lowerMatches) = matches $ LText.drop 1 str upper = isUpper c lower = isLower c match = LText.take 10 str upperMatches' = if upper then match:upperMatches else upperMatches lowerMatches' = if lower then match:lowerMatches else lowerMatches in (upperMatches', lowerMatches') main = do (arg0:args) <- getArgs input <- LText.pack <$> readFile arg0 let (upper, lower) = matches input putStrLn $ "Lowercase: " ++ show (take 1 lower) putStrLn $ "Uppercase: " ++ show (take 1 upper) print $ LText.take 10 input }}} This example program tries to roughly mimic the usage of lazy `Text`s in `regex-tdfa-text` without actually using any code from that. Specifically, it uses `drop` (which triggers the offending `RULE`), it snatches off characters from the front of the remaining string one by one, it passes the tail through a recursive loop, it holds on to chunks of text as it traverses the input, thus preventing them from being collected, and it eventually forces at least the first one of the accumulated chunks by printing them to the console. I'm not 100% sure whether preventing collection is absolutely necessary to trigger the bug, but in any case, the above example runs slower by about a factor 2 when compiled with optimizations, the dump shows that the offending `RULE` is being hit, and the ticky profiles hint at the same performance issue as well. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:43 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by tdammers): * Attachment "repro-opt.ticky" added. Minimal reproduction, unoptimized (`-O0`) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by tdammers): * Attachment "repro-opt.2.ticky" added. Minimal reproduction, optimized (`-O`) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by tdammers): * Attachment "repro-unopt.ticky" added. Minimal reproduction, unoptimized (`-O0`) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Filed issue with the text library: [https://github.com/haskell/text/issues/216] -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:44 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA -------------------------------------+------------------------------------- Reporter: ntc2 | Owner: tdammers Type: bug | Status: closed Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: wontfix | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Runtime | Test Case: performance bug | https://github.com/ntc2/ghc-8.2.1 | -regex-lazy-text- | bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 Blocked By: | Blocking: Related Tickets: #13745, #14564 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by tdammers): * status: new => closed * resolution: => wontfix Comment: Reported against `text` library probably not a GHC bug. We can always reopen if GHC itself turns out to be the problem after all. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14519#comment:45 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC