Re: [GHC] #11822: Pattern match checker exceeded (2000000) iterations

6 May 2016

      #11822: Pattern match checker exceeded (2000000) iterations
-------------------------------------+-------------------------------------
        Reporter:  j.waldmann        |                Owner:  gkaracha
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.0.1-rc3
      Resolution:                    |             Keywords:
                                     |  PatternMatchWarnings
Operating System:  Unknown/Multiple  |         Architecture:  x86_64
 Type of failure:  Compile-time      |  (amd64)
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by gkaracha):

 Apologies for the late (and long) response. You can find below some
 explanation about the behavior of the checker, which I hope to clear
 things out a bit.

 == Why does this example exceed the limit ==
 In principle, the match of `sameAttribute` gives rise to 161*161 = 25,921
 cases. Also, the counter takes into account the arguments too and most
 constructors have at least one.

 Yet, the most important reason that the match iterates >2000000 times is
 that **all** unmatched cases from the first clause are checked with the
 second, and the resulting ones are checked with the third, etc.

 {{{#!hs
 data T = A | B | C

 f :: T -> ()
 f A = ()
 f B = () -- (*)
 f C = ()
 }}}

 That being said, for `f` above, the second clause will be checked against
 missing cases `B` and `C`. This is a pretty small example, but in general
 this gives rise to (something like) a quadratic number of checks. That's
 why the iteration counter goes over the roof.

 == General Comments on Performance ==
 About the general performance of the algorithm, there are some things
 worth mentioning:

 1. In principle, the new checker is slower than the previous one on
 average. This makes sense because we check term and type constraints, that
 the previous checker didn't, so we have to pay something for that.
 Nevertheless, the main reason that it's slower on average is that the new
 checker checks uncovered matches eagerly, while the past checker did it
 lazily (whether the specific match has GADTs or not, the new checking
 function is in `TcM` also, to be able to call the type-checker).

 2. Neither I or Simon were happy about the above, but (that's why we
 opened #11528 but it still needs a lot of work to figure out how to
 change) the constraint-based approach grew really fast, unless we
 performed constraint solving as soon as possible, to remove useless cases
 (as I said above, what is generated from one clause is passed to the next
 one so this was devastating).

 3. The takeaway is that, in order to avoid extreme cases (like #11195
 which is really difficult for the checker), we chose to have predictable
 behavior for all (with/without guards or GADTs) but a bit slower. The
 other result is that since the checker is now strict, it had to be
 bounded, to avoid memory-issues.

 I dislike the exceeded-iterations-warning too, but I know no better
 solution at the moment. My original limit was 10000000, which even worked
 on #11195 and gave no such warning. However, as Herbert said, memory
 consumption can be too high, if we let it. The best solution, the way I
 see it would be to collect statistics (like the examples from Neil or this
 whole bug report) and fine-tune it. Maybe 10000000 was too high but
 2000000 seems to be too low. I do not have anything better to offer at the
 moment, but I am open to suggestions.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11822#comment:7
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler