I've been wrestling with this for a while and I decided eventually to look for help. I've been hoping to design a domain specific embedded language in Haskell that would let me pipe a commutative monad throughout an expression written in the language. Special terms within the language will eventually have access to this monad. I've created a simplified version here to represent the main issues. 

Here is what I'd like from the language:

 * To use haskell syntax for substitution and pattern matching rather than implementing this myself.
 * To be able to express lambdas in my language.
 * To be able to embed any haskell terms including functions into the language.
 * I'd like the Haskell type checker to tell me about bad terms.
 * I'd like to thread a monad through the entire expression.

So here is the first implementation that I tried of this (full source here: http://lpaste.net/142959)

    data Exp m x where
       Val :: m x -> Exp m x  
       Lam1 :: m (a -> Exp m b)        -> Exp m (a -> b)
       Lam2 :: m (a -> Exp m (b -> c)) -> Exp m (a -> b -> c)

a function liftE allows me to lift a haskell term into an expression:

    liftE x = Val $ return x

Application is a function <@> so:

    (<@>) :: forall m a b. Monad m => Exp m (a -> b) -> Exp m a -> Exp m b
    (<@>) (Val  f) (Val x) = Val  $ f `ap` x
    (<@>) (Lam2 f) (Val x) = Lam1 $ f >>= \f' -> x >>= \x' -> unLam1 $ f' x'
    (<@>) (Lam1 f) (Val x) = Val $ f >>= \f' -> x >>= \x' -> unVal $ f' x'

Seems like it might work! In fact it does typecheck.

So the first test expressions I'd like to try are these:

    mapE :: Monad m => Exp m ((a -> b) -> [a] -> [b])
    mapE = Lam2 $ return $ \ f -> Lam1 $ return $ \ xxs ->
      case xxs of 
        [] -> liftE []
        (x:xs) -> liftE (:) <@> liftE (f x) <@> (mapE <@> liftE f <@> liftE xs)

    testExpression :: Monad m => Exp m [Int]
    testExpression = mapE <@> liftE (+10) <@> liftE [1,2,3,4]

and just to justify doing any of this I'll create a Monad called BindCounter that counts the number of times bind is called:

    newtype BindCounter a = BC (Int -> (Int, a))

    runBC (BC f) = f 0

    instance Functor (BindCounter) where
      fmap f (BC x) = BC $ \t -> let (t', x') = x t in (t', f x')

    instance Applicative (BindCounter) where
      pure x = BC (\t -> (t,x))
      f <*> x = f >>= \f' -> x >>= \x' -> return $ f' x' 

    instance Monad (BindCounter) where
      return = pure
      BC f >>= g = BC $ \old ->
                        let (new, val) = f (old + 1)
                            BC f' = g val
                        in f' new

now I can try a test 

    test = 
      let (count, result) = runBC (unVal testExpression) in
        putStrLn $ "Count: " ++ show count ++ " Result: " ++ show result 

The result in ghci is:
   Count: 36 Result: [11,12,13,14]

Ok so that's a lot. I was surprised I got this working. You can see from the code that my main gripe with this is I haven't found a way to remove the need to specify the number of embedded lambdas using Lam1 and Lam2 (we could easily add more) and I haven't found a way to apply a Lam to another Lam. I'm also curious if I am reinventing the wheel, I hadn't found a library yet that let's me do something similar.