
I'm using a control structure that's a variation of a monad and I'm interested in whether - it's got a name - it deserves a name (!) - anything else similar is used elsewhere Please excuse the longer post... I have two programs that need to interact with the outside world, and I want to constrain the nature of these interactions. I don't want to just sprinkle IO throughout the code. In the first program, I am reading on-demand from a database - just reading, not making any changes. In the second, I am requesting computations to be evaluated externally, in order to take advantage of a grid of machines. In both of these cases, the external requests don't change the state of the world, and the programs can be consided "pure" as long as the world isn't being changed by some other means. Hence the requests can be reordered, performed in parallel, and optimised in various ways without affecting the result. To make this concrete, if an external request has a type like: a -> m b where a is the input to the request, b is the response, and m is some monad, probably IO. Then a data structure capturing calculations over these requests, with result type c can be: data SCalc a b c = SCResult c | SCStage { sc_req :: a, sc_calc :: b -> SCalc a b c } The idea is that we either have a result, or we need to make a external request, whose response is used to generate a new calculation. "Running" such a calculation is straightforward: runSC :: (Monad m ) => SCalc a b c -> (a -> m b) -> m c runSC (SCResult v) _ = return v runSC (SCStage a cf) reqf = do b <- reqf a runSC (cf b) reqf and calculations can be sequence by making a monad instance: instance Monad (SCalc a b) where return = SCResult (>>=) (SCResult v) cf = cf v (>>=) (SCStage a cf1) cf = SCStage a (\b -> cf1 b >>= cf) Where it gets interesting, however, is that when they don't depend on each other, we can run these calculations in parallel. We need the ability to merge requests, and split responses. Hence, class Req a where merge :: a -> a -> a class Resp b where split :: b -> (b,b) par :: (Req a, Resp b) => SCalc a b c -> SCalc a b d -> SCalc a b (c,d) The par primitive above can be used to define other parallel operations, such as parList :: (Req a, Resp b) => [SCalc a b c] -> SCalc a b [c] I found it worthwhile to try and visualise what's going on here. Let's say I have 4 calculations that I want to run in parallel. The first doesn't need a request; the second needs to make a single request (A1); the third needs to make two requests where the second (B2) depends on the result of the first (B1), etc. The resulting parallel operations will be done in 3 batches, looking like: batch1 batch2 batch3 result calc1 ------------------------> V0 calc2 A1 ----------------> V1 calc3 B1 B2 -------> V2 calc4 C1 C2 C3 --> V3 (excuse the ascii art). batch1 will consist of A1,B1,C1 merged together; batch2 of B2,C3 merged; etc. In practice, I've used the above data types to abstract out database access in some reporting code. It works quite well, as use of the parallel primitive above means that the haskell code talking to the database sees all of the information in each batch simultaneously, so it can optimise the queries, remove redundant requests etc. It also makes the reporting code pure despite the fact that information is being loaded on demand from the db (without any unsafe calls behind the scene). I guess the use of the term "pure" here should be qualified: the impure code has been factored out to a single function in a different module that has a limited and well defined interface. I haven't implemented the grid calculation example described above, though I see that it ought to be able to work similarly, potentially removing duplicate calculation requests, etc. So my questions are: does this sort of "monad allowing parallel evaluation" structure have a name? Is it an existing design pattern in fp somewhere that I haven't seen? thanks, Tim