[Haskell-cafe] parallel package, rts ROOT vs WEAK policy and reachable sparks

17 May 2017

      Dear list

This mail is long, but only the 3 last paragraphs are questions, the
rest is context. Recently started reading the history of the parallel
package. On the changelog of version 2.2 says (I quote)

    "This module has been rewritten in version 2. The main change is to
    the 'Strategy a' type synonym, which was previously a -> Done and is
    now a -> Eval a. This change helps to fix the space leak described
    in "Runtime Support for Multicore Haskell". The problem is that the
    runtime will currently retain the memory referenced by all sparks,
    until they are evaluated. Hence, we must arrange to evaluate all the
    sparks eventually, just in case they aren't evaluated in parallel,
    so that they don't cause a space leak. This is why we must return a
    "new" value after applying a Strategy, so that the application can
    evaluate each spark created by the Strategy."

I know where are currently on version 3.3. Yet I decided to research a
little more on the paper cited. On it is discussed the issue of policy
for when to GC sparks. Around page 7 of the paper (I quote)

    ROOT: Treat (non-fizzled) sparks as roots for the garbage col-
    lector. That is, retain all such sparks and the graph they point to.

    WEAK: Only retain (non-fizzled) sparks that are reachable from the
    roots of the program.

    The problem is, neither of these policies is satisfactory. WEAK
    seems attractive, because it lets us discard sparks that are no
    longer required by the program. However, the WEAK policy is
    completely incompatible with strategies. Consider the parList
    strategy:

        parList :: Strategy a -> Strategy [a]
        parList strat []     = done
        parList strat (x:xs) = strat x ‘par‘ parList strat xs

    Each spark generated by parList is a thunk for the expression “
    strat x ”; this thunk is not shared, since it is created uniquely
    for the purposes of creating the spark, and hence can never fizzle.
    Hence, the WEAK policy will discard all sparks created by parList
    , which is obviously undesirable(*).

Question 1: Under WEAK policy, On (*) I understand the sparks cannot be
fizzle'd, but how are the parList spark not reachable from the roots of
the program? surely this function doesn't exist on its own. Somewhere in
the program I must be a function that is executed that has an

    let val = ... `using` parList ...
    in ... val ...

which would make the sparks reachable. Making it a *good* policy to
have IMHO. This probably has to do with my understanding of reachable
sparks.

Question 2: The ROOT policy is discuted to be better, because it
supports parList (and old parallel library versions). Yet, the old docs
of parallel recomend to that "if you are gonna spark, make sure you use
the result to avoid space leaks" ie filter you data before and only
spark on that. Is this intuition correct?

Question 3: What is the current situation of this? have weak pointers
helped in this regard?

-- Ruben

[Haskell-cafe] parallel package, rts ROOT vs WEAK policy and reachable sparks

Ruben Astudillo