
Colin [ccing GHC users in case there are other enthusiasts out there] | we briefly discussed the lack in | Concurrent Haskell of any way to set up a data-driven | thread -- one that works on a data structure D without | ever forcing its evaluation, only proceeding with the | computation over D as and when the needed parts get | evaluated by some other thread. | | An example application is assertion-checking in a program that depends | on laziness -- and indeed I am currently supervising a student | project on this topic. Concurrent Haskell looked like | an obvious basis for implementation, but we are stuck | without data-driven threads. | | You did say that it would not take too much work for someone | familiar with the Concurrent Haskell implementation, such as yourself, | to add a suitable primitive for the expression of data-driven threads. | Is there any possibility that you could indeed do this in the near future? It's not entirely obvious what you want here. Ideally you'd be able to say f xs = assert (length xs > 10) (...) and have the predicate (length xs > 10) evaluated only when xs itself was evaluated. But that means that the evaluations performed by length would have to be suspended somehow. That seems hard, if we are sharing a single copy of length. (Not *all* evaluations performed by these assertion-checking threads must be suspended... they may create thunks of their own.) Next idea is to have a function await :: a -> b -> b which is rather like 'seq' except that it waits for its first argument to be evaluated, whereas 'seq' evaluates it. Then you could write a new version of 'length' which used await before every case expression. Is that what you had in mind? A bit fragile, isn't it? How to implement 'await'. We thought of three ways: 1. Implement a primitive checkEval :: a -> Bool, which tells whether something is evaluated; if not, it yields to other threads, as well as returning False. So now you can have a busy-wait version of await thus: await x y = case checkEval x of True -> y False -> await x y 2. Implement await directly, by putting the thread in a pool of threads that the scheduler checks at regular intervals. They each just "watch" a thunk. Still a busy-waiting solution. 3. Do the "right thing": implement 'await' by copying the thunk, replacing it by a "smart indirection" which also points to the non-suspended thread. When the indirection is evaluated, it just puts the suspended thread in the runnable pool before entering the thunk. Actually, this is probably no harder than (2). Now, it's true that it would take Simon or I less time to implement one of these than it would take your student to do, but we have so *many* things to implement, and each takes time. And you will probably change your mind several times about what the right design should be (that's what research is like). So there's a lot to be said for the wading in yourself. What I can promise is that we'd turn around questions rapidly. Simon

Simon,
Next idea is to have a function await :: a -> b -> b which is rather like 'seq' except that it waits for its first argument to be evaluated, whereas 'seq' evaluates it. Then you could write a new version of 'length' which used await before every case expression. Is that what you had in mind? A bit fragile, isn't it?
Something like await would be enough I think. It would not be necessary to write new versions of all the functions to be data driven, just a single (overloaded) function to convert a demanding function into a data-driven one. Something like this: asEvaluated xs = await xs (case xs of (x:xs) -> asEvaluated x : asEvaluated xs [] -> []) dataDriven f = f . asEvaluated assert p xs = thread (... dataDriven p xs ...) xs
How to implement 'await'. We thought of three ways:
1. Implement a primitive checkEval :: a -> Bool, which tells whether something is evaluated; if not, it yields to other threads, as well as returning False. So now you can have a busy-wait version of await thus:
await x y = case checkEval x of True -> y False -> await x y
2. Implement await directly, by putting the thread in a pool of threads that the scheduler checks at regular intervals. They each just "watch" a thunk. Still a busy-waiting solution.
3. Do the "right thing": implement 'await' by copying the thunk, replacing it by a "smart indirection" which also points to the non-suspended thread. When the indirection is evaluated, it just puts the suspended thread in the runnable pool before entering the thunk. Actually, this is probably no harder than (2).
I agree that 2 seems the most straightfoward, and not quite so busy as 1. Unless I am misunderstanding 3 it could lead to problems if scheduling can occur before evaluation of the thunk reaches a normal form.
Now, it's true that it would take Simon or I less time to implement one of these than it would take your student to do, but we have so *many* things to implement, and each takes time. And you will probably change your mind several times about what the right design should be (that's what research is like). So there's a lot to be said for the wading in yourself.
The formula for implementation time is roughly initiationRites + inherentDifficulty / skillLevel and I am pretty sure we could get by with just the await primitive. So if there is any way you could find an early slot to do a basic implementation of await, it would be much appreciated.
What I can promise is that we'd turn around questions rapidly.
If we do end up trying to define await for ourselves, which source files should we expect to modify and are there any likely pitfalls in the subsequent rebuilding? Thanks Colin R
participants (2)
-
Colin Runciman
-
Simon Peyton-Jones