
On January 15, 2011 22:43:32 Jan-Willem Maessen wrote:
For example, I find it relatively easy to understand >>= in the continuation monad, but have to spend a long time puzzling my way through join.
It's not that bad once you get used to thinking of it. Join simply merges an inner computation into an outer one. To put this in the continuation passing framework. You have a series of computations A, B, etc. strung together by continuation passing A--B--C--D--E ... For a predetermined string, you have a strictly applicative framework. If, say, you want to dynamically compute the next bit of the string of computations C1--C2--C3 based of A--B--C, you requires a monad (i.e., join). Join allows you to merge the C1--C2--C3 dynamically determined from A--B--C into the continuation passing string to get A--B--C--C1--C2--C3--D--E ... (C1--C2--C3 is determined by A--B--C) This is actually relatively clear from join's type join :: Monad m => m (m a) -> m a It takes a string of computations (as indicated by the outer m) that generates a string of computations generating an a (as indicated by the inner m a). It changes this into just a string of computations generating an a. How can it do this? It must set things up to first run the given string of computations in order to determine the string of computations that give a. Once it gets this, it must then run it in order to get a. join (CPS f) = CPS $ \k -> unCPS (f id) k That is, when invoked (i.e., when given a continuation k), invoke f with the continuation id. This will cause f to return the inner computation it computes. Then invoke this returned computation by passing it k. What would the actual haskell code look like for the string of computations? It's a bit messy in applicative style because it is sequence/flow orientated E <*> (D <*> join (C <*> (B <*> A))) - run A and feed the results in B, - run B and feed the results in C, - run the results of this (join) and feed them into D, and - run D and feed the results in E. Something a lot more friendly to the notation would be if A, B, and C were independent computations whose combined results were to determine another computation via a function g. This result, along with D and E, where then to be fed into yet another function f. This is simply expressed as f <$> join (g <$> A <*> B <*> C) <*> D <*> E - run A and bind the result as the first argument of g, - run B and bind the result as the second argument of g, - run C and bind the result as the third argument of g, - run g to dynamically determine the next thing to run, - run this (join) and bind its result as the first argument of f, - run D and bind the result as the second argument of f, and - run E and bind the result as the third argument of f. (run A, etc. may not do any more work than bind a thunk due to laziness) It is exactly what you would expect "f (g A B C) D E" to do. The difference is the interaction can be a lot more than just a pure application. You can have side effects, backtracking, outer joins, and so on. Although in the above I used pure functions (hence the <$>), this extends to functions too. The unfortunate pain you pay for this additional power is manually having to specify the application (<$> and <*>) and merging (join). If the compiler could figure this all out for you based on the underlying types, wow! Cheers! -Tyson PS: I hope the above example makes it clear exactly where join is required. This is precisely the additional power of a monad gives you over applicative.