Re: [GHC] #10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>)

3 Jul 2015

      #10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into
<<loop>>)
-------------------------------------+-------------------------------------
        Reporter:  exio4             |                   Owner:
            Type:  bug               |                  Status:  new
        Priority:  normal            |               Milestone:
       Component:  Compiler          |                 Version:  7.10.1
      Resolution:                    |                Keywords:
Operating System:  Unknown/Multiple  |            Architecture:
 Type of failure:  Incorrect result  |  Unknown/Multiple
  at runtime                         |               Test Case:
      Blocked By:                    |                Blocking:
 Related Tickets:                    |  Differential Revisions:
-------------------------------------+-------------------------------------

Comment (by rwbarton):

 I think I've finally come up with at least part of a plausible
 explanation. Tell me if this seems right...

 Suppose I have a heap object whose initial value is
 {{{
   x = \u []  concat [[1],[]]   -- using "\u []" as syntax for "updatable
 thunk",
                                -- otherwise normal Core syntax (not STG)
 }}}
 and suppose first some thread evaluates `x` to WHNF. Using the definition
 of `concat` as `main_go` from above, this will allocate a single-entry
 thunk `y = \s []  concat []`, and then calling `(++)` will lead to
 {{{
   x = 1 : z
   y = \s []  concat []
   z = \u []  (++) [] y
 }}}
 At this point the heap has the following property

     (*) The heap contains a single-entry thunk (`y`) and a regular thunk
 (`z`) such that entering the regular thunk will cause the single-entry
 thunk to be entered as well.

 (The key point is that the single-entry thunk has already been allocated
 on the heap, in contrast to a situation in which entering a regular thunk
 would cause a ''new'' single-entry thunk to be allocated, possibly
 entered, and then become garbage, all before evaluation of that regular
 thunk is complete.)

 Now, consider the following execution path of two threads A and B.

 * Both threads enter `z` simultaneously (before either manages to
 overwrite it with a black hole, if eager blackholing was enabled when we
 compiled `(++)`; otherwise before either manages to update it to an
 indirection).

 * Thread A does the case analysis inside `(++)` and enters `y`, and
 overwrites it with a black hole before thread B does anything else.

 * Now thread B does the case analysis inside `(++)` and enters `y`, but
 `y` has been overwritten with a black hole so thread B blocks. But `y` is
 never going to be updated, so thread B will block forever. This is bad!

 * Finally thread A finishes evaluating `y` (to `[]`) and then updates `z`
 accordingly. But thread B is still blocking on the black hole and even if
 it could get unblocked by some mechanism (say in the GC) there's no longer
 enough information on the heap to recover the correct value of `y` and
 allow thread B to continue.

 This doesn't exactly explain why the programs in this ticket <<loop>>, but
 a thread becoming permanently blocked is equally bad behavior I think.

 Some extra evidence that something like this is going on is that if I
 inline the definition of `(++)` into par2.hs as well, so that it is
 compiled with eager blackholing enabled, the program <<loop>>s much less
 frequently, just a few percent of the time as opposed to over half the
 time. That would match up with a smaller window of simultaneity in the
 first step of the execution trace above.

 If this analysis is correct, then assuming we want to continue to allow
 threads to enter thunks in an unsynchronized way (to avoid prohibitive
 locking costs), it seems we have to ensure that the condition (*) never
 holds, at least when eager blackholing is enabled. Generating single-entry
 thunks is still okay as long as they never survive as live heap objects
 after the thunk that allocated them has been reduced to WHNF.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:21
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler