[GHC] #10359: Tuple constraint synonym led to asymptotic performance lossage

28 Apr 2015

      #10359: Tuple constraint synonym led to asymptotic performance lossage
-------------------------------------+-------------------------------------
              Reporter:  axch        |             Owner:
                  Type:  bug         |            Status:  new
              Priority:  normal      |         Milestone:
             Component:  Compiler    |           Version:  7.6.3
              Keywords:              |  Operating System:  Linux
          Architecture:  x86_64      |   Type of failure:  Runtime
  (amd64)                            |  performance bug
             Test Case:              |        Blocked By:
              Blocking:              |   Related Tickets:
Differential Revisions:              |
-------------------------------------+-------------------------------------
 I encountered a situation where using a tuple constraint synonym
 produced an asymptotic runtime performance degradation, from linear to
 quadratic time.  The enclosed Bug.hs reproduces the potentially
 relevant features of the problem.  The function `test` there is
 quadratic, but should be linear.  The file also contains a definition
 of `test2`, which differs only in eschewing the constraint synonym,
 and is linear, as it should be.

 In more detail: the program tries to count to 20,000 in a rather
 contrived way.  It maintains a Box, which holds the current count and
 a function that can accept an increment and a count to produce a new
 count (namely, `(+)`).  The function `do_step` lifts incrementation to
 Boxes, using their contained function to update their stored count.

 The first tricky thing is that the incrementing function stored in the
 Box is polymorphic in both arguments separately: it can accept
 increments that are not the same type as the stored count (and
 converts them).  The second tricky thing is that I don't want to
 expose the type variable for the increment type (as opposed to the
 count type) to clients of the Box, so the Box's incrementing function
 is stored polymorphic (with an explicit forall).  Finally, I want to
 impose the constraint `(Fractional num, Real num)` on allowable
 increments (even though this example only needs `Real num`).

 This constellation of desiderata conspires to produce quadratic
 performance: doubling the parameter to `test` (but not `test2`)
 multiplies its runtime by 4.  Inspecting the heap profile indicates a
 growing accumulation of closures when running `test` (but not
 `test2`).  Inspecting the generated stg (enclosed) locates a potential
 culprit: apparently when `do_step` (but not `do_step2`) reconstructs
 the Box, it changes the `func` stored there to be a fresh closure that
 destructures the `Numerical` constraint tuple, constructs a fresh one,
 and calls the function that was in that slot previously with it.  I
 hypothesize that this accumulates as a chain that performs a linear
 number of such destructurings and restructurings at each increment,
 leading to the observed quadratic runtime and linear memory growth.

 Incidentally, I checked whether record wildcards were the issue (no:
 `test4`) and whether updating just the `obj` field solves it (yes:
 `test3`).  Sadly, the latter solution is not directly usable in my
 actual application because of "Record update for insufficiently
 polymorphic field".

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10359
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler