Thank you for highlighting these problems; I should really test my own code

more thoroughly. After reading these most recent examples, the translation to

existing monads is definitely too neat, and a lot of semantics of the monad are

'defaulted' on. In particular for the probability monad examples I see I had

the mistaken impression that it would preserve the random-world semantics of

the do-notation whereas autolifting actually imposes a random-evaluation

semantics, which would not be how I envision an autolifted probabilistic DSL.

Overall, I think I pretty much got caught up in how cool it was going to be to

use <$>, <*>, join/enter/exit as primitives to make any monad-based DSL work

'concisely' in an environment of existing typeclasses. That is kind of the

thing I want to do at a high level; implement DSLs like probabilistic

programming languages as primitives that that play transparently with existing

But now it seems clear to me that this autolifting approach will not be useful

with any monad where it is important to control sharing and effects, which is

critical in the probability monad (and all others I can think of); in fact it

seems necessary to incur the do-notation 'overhead' (which I now see is not

overhead at all!) to work with them meaningfully at all, no matter how 'pure'

they look in other settings. Because of this we see that the Prob monad as it

is defined here is mostly unusable with autolifting. Again, thanks for the

examples; I think I now have a much better intuition for do/bind and why they

At this point, however, I would still want to see if it is possible to do the

autolifting in a more useful manner, so that the user still has some control

over effects. Things like the share combinator in the paper you linked will

probably be very useful. I will definitely go over it in detail.

From my previous experience however, this might also be accomplished by

inserting something between the autolifting and the target monad.

I think it would be more helpful now to talk more about where I'm coming from.

Indeed, the probability monad examples feature heavily here because I'm coming

off of implementing a probabilistic programming language in Python that worked

through autolifting, so expressions in it looked like host language

expressions. It preserved the random-world semantics because it was using a

"quote"-like applicative functor to turn a function composition in the language

into an expression-tree rep of the same. I am not sure yet how to express it in

Haskell (as I need to get more comfortable with GADTs), but pure would take a

term to an abstract version of it, and fmap would take a function and abstract

term to an abstract term representing the answer. One would then have the call

graph available after doing this on lifted functions. I think of this as an

automatic way of performing the 'polymorphic embedding' referenced in

[Hofer et al 2008] Polymorphic Embedding of DSLs.

By keeping IDs on different abstract terms, expressions like X + X (where X =

coin 0.5) would take the proper distributions under random-world semantics.

In general for monads where the control of sharing is important, it can be seen

as limiting the re-running of effects to one per name. Each occurence of a name

using the same unwrapped value.

I had the initial impression, now corrected, that I could just come up with an

autolifting scheme in Haskell, use it with the usual probability monad and

somehow get this random-world semantics for free. No; control of sharing and

effects is in fact critical, but could be done through using the autolifting as

a way to turn expressions into a form where control of them is possible.

For now, though, it looks like I have a lot of things to read through.

Again, thanks Oleg and everyone else for all the constructive feedback. This

definitely sets a personal record for misconceptions corrected / ideas

clarified per day.

On Wed, Nov 17, 2010 at 12:08 AM, <oleg@okmij.org> wrote:

Let me point out another concern with autolifting: it makes it easy to
overlook sharing. In the pure world, sharing is unobservable; not so
when effects are involved.

Let me take the example using the code posted in your first message:

> t1 = let x = 1 + 2 in x + x

The term t1 is polymorphic and can be evaluated as an Int or as a
distribution over Int:

> t1r = t1 ::Int -- 6
> t1p = t1 ::Prob Int -- Prob {getDist = [(6,1.0)]}

That looks wonderful. In fact, too wonderful. Suppose later on we
modify the code to add a non-trivial choice:

> t2 = let x = coin 0.5 + 1 in x + x
> -- Prob {getDist = [(4,0.25),(3,0.25),(3,0.25),(2,0.25)]}

The result isn't probably what one expected. Here, x is a shared
computation rather than a shared value. Therefore, in (x + x)
the two occurrences of 'x' correspond to two _independent_ coin flips.
Errors like that are insidious and very difficult to find. There are
no overt problems, no exceptions are thrown, and the results might
just look right.

Thus the original code had to be translated into monadic style more
carefully: `let' should not have been translated as it was. We should
have replaced let with bind, using either of the following patterns:

> t2b1 = do x <- coin 0.5 + 1; return $ x + x
> -- Prob {getDist = [(4,0.5),(2,0.5)]}

> t2b2 = coin 0.5 + 1 >>= \xv -> let x = return xv in x + x
> -- Prob {getDist = [(4,0.5),(2,0.5)]}

After all, let is bind (with the different order of arguments): see
Moggi's original computational lambda-calculus.

Our example points out that monadifying Int->Int function as
m Int -> m Int can be quite dangerous. For example, suppose we have
a pure function fi:

> fi :: Int -> Int
> fi x = x + x

and we simple-mindedly monadified it:

> fp :: Monad m => m Int -> m Int
> fp x = liftM2 (+) x x

We can use it as follows: after all, the function accepts arguments of
the type m Int:

> tfp = fp (coin 0.5)
> -- Prob {getDist = [(2,0.25),(1,0.25),(1,0.25),(0,0.25)]}

The result shows two independent coin flips. Most of the readers of
the program will argue that in an expression (x + x), two occurrences
of x should be _correlated_. After all, that's why we use the
same name, 'x'. But they are not correlated in our translation.

Furthermore, translating let as bind is suboptimal. Consider

> t3b2 = coin 0.5 + 1 >>= \xv -> let x = return xv in (5 :: Prob Int)
> -- Prob {getDist = [(5,0.5),(5,0.5)]}

although we don't use the result of a coin flip, we performed the coin
flip nevertheless, doubling the search space. We know that such
doubling is disastrous even for toy probabilistic problems.

These issues are discussed at great length in the paper with Sebastian
Fischer and Chung-chieh Shan, ICFP 2009.

http://okmij.org/ftp/Computation/monads.html#lazy-sharing-nondet