Re: Making monadic code more concise

newer
[ANNOUNCE] Parallel Haskell...

older
Re: ANNOUNCE zeno 0.1.0

oleg＠okmij.org

16 Nov 2010 16 Nov '10

10:49 a.m.

Ling Yang wrote

...

I think a good question as a starting point is whether it's possible to do this 'monadic instance transformation' for any typeclass, and whether or not we were lucky to have been able to instance Num so easily (as Num, Fractional can just be seen as algebras over some base type plus a coercion function, making them unusually easy to lift if most typeclasses actually don't fit this description).

In general, what this seems to be is a transformation on functions that also depends explicitly on type signatures. For example in Num:

Things start to break down when we get to the higher order. In the first order, it is indeed easy to see that the monadification of the term Int -> Int -> Int should/could be m Int -> m Int -> m Int Indeed, liftM2 realizes this transformation. But what about (Int -> Int) -> Int ? Should it be (m Int -> m Int) -> m Int ? A moment of thought shows that there is no total function of the type Monad m => ((Int -> Int) -> Int) -> ((m Int -> m Int) -> m Int) because there is no way, in general, to get from (m Int -> m Int) to the pure function Int->Int. That is, we can't write Monad m => (m Int -> m Int) -> m (Int->Int) One may try tabulation (for finite domains) tf f = do vt <- f (return True) vf <- f (return False) return $ \x -> if x then vt else vf but that doesn't quite work: what if the resulting function is never invoked on the True argument. We have needlessly computed that value, vt. Worse, we have incurred the effect of computing vt; that effect could be failure. We have needlessly failed. One can say: we need runnable

...

class (Monad m) => Runnable m where exit : m a -> a

are there many monads that are members of that typeclass? For example, Maybe cannot be Runnable; otherwise, what is (exit Nothing)? Any Error or MonadPlus monad can't be Runnable.

Show replies by date

Ling Yang

16 Nov 16 Nov

8:36 p.m.

New subject: Making monadic code more concise

Thanks Oleg for the feedback. If I understand right, there is a hard (as in computability) limit for automatic instancing due to the general requirement of deriving functions off of the behavior of another. At this point, I wonder: How good are we at doing this? There's languages that reside in the more expressive corners of the lambda cube, such as Epigram. Some of the concepts have been translated to Haskell, such as Djinn. Are only 'trivial' results possible, or that the incomputability problems are just moved into type space? In any case, it's a good reason to limit the scope of autolifting. On Tue, Nov 16, 2010 at 2:49 AM, wrote:

...

Ling Yang wrote

...
I think a good question as a starting point is whether it's possible to do this 'monadic instance transformation' for any typeclass, and whether or not we were lucky to have been able to instance Num so easily (as Num, Fractional can just be seen as algebras over some base type plus a coercion function, making them unusually easy to lift if most typeclasses actually don't fit this description).

In general, what this seems to be is a transformation on functions that also depends explicitly on type signatures. For example in Num:

Things start to break down when we get to the higher order. In the first order, it is indeed easy to see that the monadification of the term Int -> Int -> Int should/could be m Int -> m Int -> m Int Indeed, liftM2 realizes this transformation. But what about (Int -> Int) -> Int ? Should it be (m Int -> m Int) -> m Int ? A moment of thought shows that there is no total function of the type

Monad m => ((Int -> Int) -> Int) -> ((m Int -> m Int) -> m Int)

because there is no way, in general, to get from (m Int -> m Int) to the pure function Int->Int. That is, we can't write Monad m => (m Int -> m Int) -> m (Int->Int) One may try tabulation (for finite domains)

tf f = do vt <- f (return True) vf <- f (return False) return $ \x -> if x then vt else vf

but that doesn't quite work: what if the resulting function is never invoked on the True argument. We have needlessly computed that value, vt. Worse, we have incurred the effect of computing vt; that effect could be failure. We have needlessly failed.

One can say: we need runnable

...
class (Monad m) => Runnable m where exit : m a -> a

are there many monads that are members of that typeclass? For example, Maybe cannot be Runnable; otherwise, what is (exit Nothing)? Any Error or MonadPlus monad can't be Runnable.

Alexander Solla

8:40 p.m.

New subject: Making monadic code more concise

On Nov 16, 2010, at 12:36 PM, Ling Yang wrote:

...

Are only 'trivial' results possible, or that the incomputability problems are just moved into type space?

That's typically the case, under Rice's theorem. A construct is derivable if it works for all cases (i.e., it's a "free theorem"), or if it works for none. If it works for some, you need to encode the differences between the cases yourself.

oleg＠okmij.org

17 Nov 17 Nov

8:08 a.m.

New subject: Making monadic code more concise

Let me point out another concern with autolifting: it makes it easy to overlook sharing. In the pure world, sharing is unobservable; not so when effects are involved. Let me take the example using the code posted in your first message:

...

t1 = let x = 1 + 2 in x + x

The term t1 is polymorphic and can be evaluated as an Int or as a distribution over Int:

...

t1r = t1 ::Int -- 6 t1p = t1 ::Prob Int -- Prob {getDist = [(6,1.0)]}

That looks wonderful. In fact, too wonderful. Suppose later on we modify the code to add a non-trivial choice:

...

t2 = let x = coin 0.5 + 1 in x + x -- Prob {getDist = [(4,0.25),(3,0.25),(3,0.25),(2,0.25)]}

The result isn't probably what one expected. Here, x is a shared computation rather than a shared value. Therefore, in (x + x) the two occurrences of 'x' correspond to two _independent_ coin flips. Errors like that are insidious and very difficult to find. There are no overt problems, no exceptions are thrown, and the results might just look right. Thus the original code had to be translated into monadic style more carefully: `let' should not have been translated as it was. We should have replaced let with bind, using either of the following patterns:

...

t2b1 = do x <- coin 0.5 + 1; return $ x + x -- Prob {getDist = [(4,0.5),(2,0.5)]}

...

t2b2 = coin 0.5 + 1 >>= \xv -> let x = return xv in x + x -- Prob {getDist = [(4,0.5),(2,0.5)]}

After all, let is bind (with the different order of arguments): see Moggi's original computational lambda-calculus. Our example points out that monadifying Int->Int function as m Int -> m Int can be quite dangerous. For example, suppose we have a pure function fi:

...

fi :: Int -> Int fi x = x + x

and we simple-mindedly monadified it:

...

fp :: Monad m => m Int -> m Int fp x = liftM2 (+) x x

We can use it as follows: after all, the function accepts arguments of the type m Int:

...

tfp = fp (coin 0.5) -- Prob {getDist = [(2,0.25),(1,0.25),(1,0.25),(0,0.25)]}

The result shows two independent coin flips. Most of the readers of the program will argue that in an expression (x + x), two occurrences of x should be _correlated_. After all, that's why we use the same name, 'x'. But they are not correlated in our translation. Furthermore, translating let as bind is suboptimal. Consider

...

t3b2 = coin 0.5 + 1 >>= \xv -> let x = return xv in (5 :: Prob Int) -- Prob {getDist = [(5,0.5),(5,0.5)]}

although we don't use the result of a coin flip, we performed the coin flip nevertheless, doubling the search space. We know that such doubling is disastrous even for toy probabilistic problems. These issues are discussed at great length in the paper with Sebastian Fischer and Chung-chieh Shan, ICFP 2009. http://okmij.org/ftp/Computation/monads.html#lazy-sharing-nondet

Ling Yang

10:33 a.m.

New subject: Making monadic code more concise

Thank you for highlighting these problems; I should really test my own code more thoroughly. After reading these most recent examples, the translation to existing monads is definitely too neat, and a lot of semantics of the monad are 'defaulted' on. In particular for the probability monad examples I see I had the mistaken impression that it would preserve the random-world semantics of the do-notation whereas autolifting actually imposes a random-evaluation semantics, which would not be how I envision an autolifted probabilistic DSL. Overall, I think I pretty much got caught up in how cool it was going to be to use <$>, <*>, join/enter/exit as primitives to make any monad-based DSL work 'concisely' in an environment of existing typeclasses. That is kind of the thing I want to do at a high level; implement DSLs like probabilistic programming languages as primitives that that play transparently with existing expressions. But now it seems clear to me that this autolifting approach will not be useful with any monad where it is important to control sharing and effects, which is critical in the probability monad (and all others I can think of); in fact it seems necessary to incur the do-notation 'overhead' (which I now see is not overhead at all!) to work with them meaningfully at all, no matter how 'pure' they look in other settings. Because of this we see that the Prob monad as it is defined here is mostly unusable with autolifting. Again, thanks for the examples; I think I now have a much better intuition for do/bind and why they are required. At this point, however, I would still want to see if it is possible to do the autolifting in a more useful manner, so that the user still has some control over effects. Things like the share combinator in the paper you linked will probably be very useful. I will definitely go over it in detail.

...

From my previous experience however, this might also be accomplished by inserting something between the autolifting and the target monad.

I think it would be more helpful now to talk more about where I'm coming from. Indeed, the probability monad examples feature heavily here because I'm coming off of implementing a probabilistic programming language in Python that worked through autolifting, so expressions in it looked like host language expressions. It preserved the random-world semantics because it was using a "quote"-like applicative functor to turn a function composition in the language into an expression-tree rep of the same. I am not sure yet how to express it in Haskell (as I need to get more comfortable with GADTs), but pure would take a term to an abstract version of it, and fmap would take a function and abstract term to an abstract term representing the answer. One would then have the call graph available after doing this on lifted functions. I think of this as an automatic way of performing the 'polymorphic embedding' referenced in [Hofer et al 2008] Polymorphic Embedding of DSLs. By keeping IDs on different abstract terms, expressions like X + X (where X = coin 0.5) would take the proper distributions under random-world semantics. In general for monads where the control of sharing is important, it can be seen as limiting the re-running of effects to one per name. Each occurence of a name using the same unwrapped value. I had the initial impression, now corrected, that I could just come up with an autolifting scheme in Haskell, use it with the usual probability monad and somehow get this random-world semantics for free. No; control of sharing and effects is in fact critical, but could be done through using the autolifting as a way to turn expressions into a form where control of them is possible. For now, though, it looks like I have a lot of things to read through. Again, thanks Oleg and everyone else for all the constructive feedback. This definitely sets a personal record for misconceptions corrected / ideas clarified per day. On Wed, Nov 17, 2010 at 12:08 AM, wrote:

...

Let me point out another concern with autolifting: it makes it easy to overlook sharing. In the pure world, sharing is unobservable; not so when effects are involved.

Let me take the example using the code posted in your first message:

...
t1 = let x = 1 + 2 in x + x

The term t1 is polymorphic and can be evaluated as an Int or as a distribution over Int:

...
t1r = t1 ::Int -- 6 t1p = t1 ::Prob Int -- Prob {getDist = [(6,1.0)]}

That looks wonderful. In fact, too wonderful. Suppose later on we modify the code to add a non-trivial choice:

...
t2 = let x = coin 0.5 + 1 in x + x -- Prob {getDist = [(4,0.25),(3,0.25),(3,0.25),(2,0.25)]}

The result isn't probably what one expected. Here, x is a shared computation rather than a shared value. Therefore, in (x + x) the two occurrences of 'x' correspond to two _independent_ coin flips. Errors like that are insidious and very difficult to find. There are no overt problems, no exceptions are thrown, and the results might just look right.

Thus the original code had to be translated into monadic style more carefully: `let' should not have been translated as it was. We should have replaced let with bind, using either of the following patterns:

...
t2b1 = do x <- coin 0.5 + 1; return $ x + x -- Prob {getDist = [(4,0.5),(2,0.5)]}

...
t2b2 = coin 0.5 + 1 >>= \xv -> let x = return xv in x + x -- Prob {getDist = [(4,0.5),(2,0.5)]}

After all, let is bind (with the different order of arguments): see Moggi's original computational lambda-calculus.

Our example points out that monadifying Int->Int function as m Int -> m Int can be quite dangerous. For example, suppose we have a pure function fi:

...
fi :: Int -> Int fi x = x + x

and we simple-mindedly monadified it:

...
fp :: Monad m => m Int -> m Int fp x = liftM2 (+) x x

We can use it as follows: after all, the function accepts arguments of the type m Int:

...
tfp = fp (coin 0.5) -- Prob {getDist = [(2,0.25),(1,0.25),(1,0.25),(0,0.25)]}

The result shows two independent coin flips. Most of the readers of the program will argue that in an expression (x + x), two occurrences of x should be _correlated_. After all, that's why we use the same name, 'x'. But they are not correlated in our translation.

Furthermore, translating let as bind is suboptimal. Consider

...
t3b2 = coin 0.5 + 1 >>= \xv -> let x = return xv in (5 :: Prob Int) -- Prob {getDist = [(5,0.5),(5,0.5)]}

although we don't use the result of a coin flip, we performed the coin flip nevertheless, doubling the search space. We know that such doubling is disastrous even for toy probabilistic problems.

These issues are discussed at great length in the paper with Sebastian Fischer and Chung-chieh Shan, ICFP 2009.

http://okmij.org/ftp/Computation/monads.html#lazy-sharing-nondet

5350

Age (days ago)

5351

Last active (days ago)

List overview

Download

4 comments

3 participants

participants (3)

Alexander Solla
Ling Yang
oleg＠okmij.org