
eval :: Expr -> Eval Value eval (Let decls body) = mdo let (names,exprs) = unzip decls let updateEnv env = foldr (uncurry M.insert) env $ zip names values (values,result) <- local updateEnv $ liftM2 (,) (mapM eval exprs) (eval
I am trying to write an interpreter for a very simple untyped functional language. I have a problem with mutually recursive let expressions, for which my interpreter loops :( This is a code snippet from the eval function: body)
return result
Module M is Data.Map, the environment is a simple map from strings to values. Values are defined as
data Value = Data String | Function (Value -> Eval Value)
The Eval monad is defined as
newtype Eval a = Eval { unEval :: ErrorT String (StateT Env (Writer [String])) a } deriving ( Monad, MonadFix, MonadWriter [String], -- for warnings & other messages MonadState Env, MonadError String )
instance MonadReader Env Eval where ask = get local tr act = do s <- get modify tr r <- act put s return r
When I test this with an extremely simple expression, something like "let x = 1 in x", the code above loops. I don't understand why, especially since in a previous version it worked. In the previous version I had a simpler monad stack that went
newtype Eval a = Eval { unEval :: ReaderT Env (Writer [String])) a } deriving ( Monad, MonadFix, MonadWriter [String], MonadReader Env )
(Replacing reader with state was done so I can add definitions to the environment at runtime. The ErrorT provides for errors, such as application of a non-function.) Expressions not involving let work fine. Also, if I replace the above definition by one which does not allow recursion (not using mdo, evaluating the defining expressions before the variable gets added to the environment), then non-recursive let-expressions (like the simple example above) work just fine. I am out of ideas as to what causes this problem. Does the addition of ErrorT make my monad too strict? How else can I implement mutual recursion? Cheers Ben

The following code works fine for me, so it seems you are missing some details that may help. {-# LANGUAGE RecursiveDo, GeneralizedNewtypeDeriving, TypeSynonymInstances, MultiParamTypeClasses #-} import Control.Monad import Control.Monad.State import Control.Monad.Error import Control.Monad.Writer import Control.Monad.Reader import Control.Monad.Fix import Data.Maybe import qualified Data.Map as M data Expr = Let [(String, Expr)] Expr | Const Int | Var String data Value = Data String | Function (Value -> Eval Value) instance Show Value where show (Data s) = s type Env = M.Map String Value example = Let [("x", Const 1)] (Var "x") eval :: Expr -> Eval Value eval (Const n) = return (Data (show n)) eval (Var x) = gets (fromJust . M.lookup x) eval (Let decls body) = mdo let (names,exprs) = unzip decls updateEnv env = foldr (uncurry M.insert) env $ zip names values (values,result) <- local updateEnv $ liftM2 (,) (mapM eval exprs) (eval body) return result newtype Eval a = Eval { unEval :: ErrorT String (StateT Env (Writer [String])) a } deriving ( Monad, MonadFix, MonadWriter [String], -- for warnings & other messages MonadState Env, MonadError String ) runEval :: Eval Value -> Either String Value runEval = fst . runWriter . flip evalStateT M.empty . runErrorT . unEval evaluate = runEval . eval instance MonadReader Env Eval where ask = get local tr act = do s <- get modify tr r <- act put s return r

Derek Elkins wrote:
The following code works fine for me, so it seems you are missing some details that may help. [...snip code...]
Thank you! Indeed I did simplify the code when writing the message -- because I thought that those other bits could not possibly be at fault... ;-) *trying out many changes to my own code and yours* Ok, I finally found it. What actually made the difference was the case for variables: Your version is
eval (Var x) = gets (fromJust . M.lookup x)
which is suitably lazy, whereas mine was (more or less)
eval e@(Var name) = do env <- ask case M.lookup name env of Nothing -> do -- undefined variable reference warning ("reference to undefined variable " ++ show name) let val = Data "" modify (M.insert name val) return val Just val -> return val
Note that whatever I do in the 'Nothing' case is irrelevant, your code with the Var case replaced by
eval e@(Var name) = do env <- ask case M.lookup name env of Just val -> return val
loops as well. My problem is that I still don't understand why this is so! I know of course that pattern matching is strict, but I thought this should be ok here, since I evaluate the declarations _before_ the body, so when evaluation of the body demands the variable, it will be defined. What am I missing? Cheers Ben

Ben Franksen wrote:
My problem is that I still don't understand why this is so! I know of course that pattern matching is strict, but I thought this should be ok here, since I evaluate the declarations _before_ the body, so when evaluation of the body demands the variable, it will be defined.
Another data point: It /has/ something to do with ErrorT. If I remove the ErrorT from the monad stack it works, even with the pattern matching in the variable lookup: newtype Eval a = Eval { unEval :: {- ErrorT String -} (StateT Env (Writer [String])) a } deriving ( Monad, MonadFix, MonadWriter [String], -- for warnings & other messages MonadState Env{- , MonadError String -} ) runEval :: Eval Value -> {- Either String -} Value runEval = fst . runWriter . flip evalStateT M.empty . {- runErrorT . -} unEval *Main> evaluate example 1 I am still lost as to how to make this work with ErrorT. Cheers Ben

On Wed, Nov 25, 2009 at 3:48 PM, Ben Franksen
Derek Elkins wrote:
The following code works fine for me, so it seems you are missing some details that may help. [...snip code...]
Thank you! Indeed I did simplify the code when writing the message -- because I thought that those other bits could not possibly be at fault... ;-)
*trying out many changes to my own code and yours*
Ok, I finally found it. What actually made the difference was the case for variables:
Your version is
eval (Var x) = gets (fromJust . M.lookup x)
which is suitably lazy, whereas mine was (more or less)
eval e@(Var name) = do env <- ask case M.lookup name env of Nothing -> do -- undefined variable reference warning ("reference to undefined variable " ++ show name) let val = Data "" modify (M.insert name val) return val Just val -> return val
Note that whatever I do in the 'Nothing' case is irrelevant, your code with the Var case replaced by
eval e@(Var name) = do env <- ask case M.lookup name env of Just val -> return val
loops as well.
My problem is that I still don't understand why this is so! I know of course that pattern matching is strict, but I thought this should be ok here, since I evaluate the declarations _before_ the body, so when evaluation of the body demands the variable, it will be defined.
What am I missing?
The problem is the liftM2 in the Let branch of eval. You are executing the body while making the bindings, so you are trying to look up x while you are still trying to bind it. One solution is to move the execution of the body after the binding as in: eval (Let decls body) = mdo let (names,exprs) = unzip decls updateEnv env = foldr (uncurry M.insert) env $ zip names values values <- local updateEnv $ mapM eval exprs local updateEnv $ eval body

Derek Elkins wrote:
On Wed, Nov 25, 2009 at 3:48 PM, Ben Franksen
What am I missing?
The problem is the liftM2 in the Let branch of eval. You are executing the body while making the bindings, so you are trying to look up x while you are still trying to bind it. One solution is to move the execution of the body after the binding as in:
eval (Let decls body) = mdo let (names,exprs) = unzip decls updateEnv env = foldr (uncurry M.insert) env $ zip names values values <- local updateEnv $ mapM eval exprs local updateEnv $ eval body
I already tried that :( It works for non-recursive expressions like 'example', but not for recursive ones; not even non-recursive ones that merely use a variable before it is defined like this one
example2 = Let [("y", Var "x"),("x", Const 1)] (Var "y")
which again makes eval loop. However, if I use your lazy version
eval (Var x) = gets (fromJust . M.lookup x)
_or_ remove the ErrorT from the monad stack (see my other message) eval does not loop, even with mutually recursive definitions. *some time later* Ok, it seems I have a version that does what I want. It is not very elegant, I have to manually wrap/unwrap the ErrorT stuff just for the Val case, but at least it seems to work. Here it goes:
eval (Var x) = Eval $ ErrorT $ do env <- get v <- case M.lookup x env of Just v -> return v Nothing -> do warning ("reference to undefined variable " ++ show x) let v = Data "" modify (M.insert x v) return v return (Right v)
warning s = tell $ ["Warning: " ++ s]
I suspect what is needed to avoid this is a combinator that convinces ErrorT that a computation is really going to succeed, no matter what. Hmm, now that I think about it this should be possible using catchError. I will investigate. Cheers Ben

Ben Franksen wrote:
Ok, it seems I have a version that does what I want. It is not very elegant, I have to manually wrap/unwrap the ErrorT stuff just for the Val case, but at least it seems to work. Here it goes:
eval (Var x) = Eval $ ErrorT $ do env <- get v <- case M.lookup x env of Just v -> return v Nothing -> do warning ("reference to undefined variable " ++ show x) let v = Data "" modify (M.insert x v) return v return (Right v)
warning s = tell $ ["Warning: " ++ s]
While this works for simple var=constant declarations, it breaks down again as soon as I add lambdas and application. Same symptoms as before: eval loops; and it works again if I remove the ErrorT (but then I get a pattern match failure if I apply a non-function which is of course what I wanted to avoid with the ErrorT). This is maddening! There must be some way to get mutual recursion to work while still allowing for clean handling of failure. What galls me the most is that it is so unpredictable whether the program will terminate with a given input or not. (The code is attached.) Cheers Ben

The problem is that ErrorT makes any argument to mdo *much* more
strict, and therefore much more likely to loop.
This is because each action needs to know whether the previous action
was successful before it can continue.
Notice that when you got it to work, you *always* return "Right v";
that is, you never actually have an error.
If you want to avoid introducing bottoms into your code, I would avoid
using fix/mdo except in cases where you can prove that the code is
non-strict. You keep running into cases where your code is more
strict than you would like which is introducing the bottoms.
To understand this better, consider the following function:
fixEither :: (a -> Either e a) -> Either e a
fixEither f = x where
x = f a
(Right a) = x
Here, f *cannot* attempt to do anything with its argument until it is
absolutely known that f is returning a "Right" value.
Perhaps there is a different way to write this interpreter that avoids
needing to tie the knot so tightly? Can you split recursive-let into
two stages, one where you construct the environment with dummy
variables and a second where you populate them with the results of
their evaluations? You might need some sort of mutable thunk that you
can store in the environment, which makes sense to me; in GHC Core,
"let" means "allocate a thunk on the heap".
-- ryan
On 11/27/09, Ben Franksen
Ben Franksen wrote:
Ok, it seems I have a version that does what I want. It is not very elegant, I have to manually wrap/unwrap the ErrorT stuff just for the Val case, but at least it seems to work. Here it goes:
eval (Var x) = Eval $ ErrorT $ do env <- get v <- case M.lookup x env of Just v -> return v Nothing -> do warning ("reference to undefined variable " ++ show x) let v = Data "" modify (M.insert x v) return v return (Right v)
warning s = tell $ ["Warning: " ++ s]
While this works for simple var=constant declarations, it breaks down again as soon as I add lambdas and application. Same symptoms as before: eval loops; and it works again if I remove the ErrorT (but then I get a pattern match failure if I apply a non-function which is of course what I wanted to avoid with the ErrorT).
This is maddening! There must be some way to get mutual recursion to work while still allowing for clean handling of failure. What galls me the most is that it is so unpredictable whether the program will terminate with a given input or not.
(The code is attached.)
Cheers Ben

Ryan Ingram wrote:
The problem is that ErrorT makes any argument to mdo *much* more strict, and therefore much more likely to loop.
This is because each action needs to know whether the previous action was successful before it can continue.
Thanks, you are confirming what I suspected. I just wasn't sure that I didn't do something stupid that could easily be fixed.
Notice that when you got it to work, you *always* return "Right v"; that is, you never actually have an error.
Yes, exactly. It helps in the simplest cases, but with function definitions even this is not enough.
If you want to avoid introducing bottoms into your code, I would avoid using fix/mdo except in cases where you can prove that the code is non-strict. You keep running into cases where your code is more strict than you would like which is introducing the bottoms.
To understand this better, consider the following function:
fixEither :: (a -> Either e a) -> Either e a fixEither f = x where x = f a (Right a) = x
Here, f *cannot* attempt to do anything with its argument until it is absolutely known that f is returning a "Right" value.
Interesting. I'll have to think about this one.
Perhaps there is a different way to write this interpreter that avoids needing to tie the knot so tightly? Can you split recursive-let into two stages, one where you construct the environment with dummy variables and a second where you populate them with the results of their evaluations? You might need some sort of mutable thunk that you can store in the environment, which makes sense to me; in GHC Core, "let" means "allocate a thunk on the heap".
Yea, this is how I would do it in an imperative language. I thought I could avoid mtuable cells by exploiting lazyness. I am not yet ready to give up, however. One thing I want to try is whether I can come up with a variation of ErrorT that is less strict. Another idea I just had is that maybe continuations might help, as they provide a possibility to 'jump' out of a calculation. Chers Ben

On Fri, Nov 27, 2009 at 1:40 PM, Ben Franksen
Thanks, you are confirming what I suspected. I just wasn't sure that I didn't do something stupid that could easily be fixed.
Well, lets unwrap your newtype into the actual functions: Eval (ErrorT String (StateT Env (Writer [String])) a) ErrorT e m a ~= m (Either e a) StateT s m a ~= s -> m (a,s) Writer w a ~= (a,w) So we have Eval a ~= ErrorT String (StateT Env (Writer [String])) a) ~= StateT Env (Writer [String])) (Either String a) ~= Env -> Writer [String] (Either String a, Env) ~= Env -> ((Either String a, Env), [String]) Also I notice you really only use Eval Value; everything else is just minor side effects. So this is pretty clear; we are given an environment, and we need to return another environment, a list of strings, and either an error message or a Value. Now the question is, what do you want to happen when given a malformed let expression? I am pretty sure that you need more complicated flow-control here in order to get the result. I believe you are on the right track with continuations. Here is a question; what should these expressions do?
let y = x; x = 1 in y let y = x x; x = 1 in x let x = x in x
The last one is quite telling; I can see three possible behaviors here: 1) Loop 2) return some simple undefined value 3) Give an error "blackhole" I will note that behavior (1) seems very difficult to achieve with your current monad stack; eval (Var x) terminates simply by looking up the value in the environment. I think you need to think hard about evaluation order and decide what you really want to happen. The simplest answer, if you want to stay with strict evaluation, is probably to only allow recursive *function* definitions. This way you can delay fully initializing the environment until after you've finished evaluating the functions themselves. Also, your definition of "Function" seems to have problems with scoping; unless you intended to make a dynamically scoped language, (Value -> Eval Value) seems very likely to get evaluated in the context it is called in. -- ryan

Hi Ryan, first, to get this out of the way, you wrote:
Also, your definition of "Function" seems to have problems with scoping; unless you intended to make a dynamically scoped language,
No, absolutely not! In fact, the whole exercise has been born out of frustration with certain ad-hoc extensions to an already evil domain-specific (macro substitution) language -- the extension being to add dynamically scoped local variables; and the basic evilness to allow substitution to occur in variable names (similar to make) as a poor man's substitute for functional abstraction. This makes for extremely cryptic programs whose result is very hard to predict. My aim is to show that there is a better way.
(Value -> Eval Value) seems very likely to get evaluated in the context it is called in.
Fortunately, this is not the case, as I explicitly capture the evironment at the definition site, ignoring the one at the call site: eval (Lam parm body) = do env <- ask return $ Function (\val -> local (\_ -> M.insert parm val env) (eval body)) Now to the interesting part:
Now the question is, what do you want to happen when given a malformed let expression? I am pretty sure that you need more complicated flow-control here in order to get the result. I believe you are on the right track with continuations.
My problem is that I have never really become comfortable with continuations; just couldn't wrap my head around all the nested lambdas involved. Is there a nice tutorial (preferably one of those functional perls, I love them) that explains how CPS actually works to produce those wonderful effects, like jumping around, fixing evaluation order and whatnot? I tried to follow the recent explanations by Jacques Carette and Oleg Kiselyov on this list but I must admit that I understood nought.
Here is a question; what should these expressions do?
let y = x; x = 1 in y let y = x x; x = 1 in x let x = x in x
The last one is quite telling; I can see three possible behaviors here:
1) Loop 2) return some simple undefined value 3) Give an error "blackhole"
I will note that behavior (1) seems very difficult to achieve with your current monad stack; eval (Var x) terminates simply by looking up the value in the environment.
I think you need to think hard about evaluation order and decide what you really want to happen. The simplest answer, if you want to stay with strict evaluation, is probably to only allow recursive *function* definitions. This way you can delay fully initializing the environment until after you've finished evaluating the functions themselves.
Thanks, Ryan. This got me thinking about the right questions. I found out that what I really want is a mixture of lazy and strict evaluation: I want variable definitions in a let expression to be lazy, but application of functions to be strict. (I don't know whether this kind of mixture has been used before.) Thus
let y = x; x = 1 in y
should evaluate to 1 . I want the meaning of declarations on the same level to be independent of their relative order. This is a purely functional language, after all, so why should it matter in which order things are defined?
let y = x x; x = 1 in x
Here y is never used, so again this evaluates to 1 .
let x = x in x
This should loop (or maybe better detected as a failure i.e. backhole), but only if and when x is used, either in an application or as the final result of the program. (In the former case it doesn't make a difference whether x is used in function or in argument position.) *********** Thinking about how to make it _explicit_ in my code that application is strict, whereas variables are lazy, I saw that this needs a change in the type of environments. It used to be a map from variable names to _values_, i.e. evaluated expressions. If I change this to a map from variable names to either thunks (i.e. unevaluated expressions) or (evaluated) values, then everything else falls smoothly into place; no need for mdo/mfix anymore, thus no need for fiddling with ErrorT internals to convince it that variable lookup always succeeds, and last not least all my examples behave as I expect them to do (see attached code). So, in a way I /have/ (finally) given up ;-) because variables are now (internally) mutable cells: when a variable is demanded (e.g. by an application) it gets mutated from thunk to value. Could as well revert to a Reader monad and use STRefs for efficiency. (Or maybe I will finally try to understand how to use continuations for stuff like this.) I have learned (at least) this: The problem with using the host language's lazyness for implementing lazyness in the defined language is that the former is not directly observable. Thus it works fine as long as you buy the whole package, i.e. either make sure that there can't be a failure, or else use not only the built-in evaluation order but also the built-in failure mode: error, pattern match failure, i.e. exceptions that can occur in pure code. This has the disadvantage that you can only handle such failures in the IO monad. Adding an explicit failure mode (as a value, i.e. Either, ErrorT, whatever) is not really compatible with relying on the implicit built-in lazyness: it makes the outcome hard to predict and the whole thing becomes fragile. This is interesting insofar as normally people stumble over the 'dual' problem: they find out that exceptions sometimes do not play nice with lazyness (e.g. exception handlers don't fire because some code gets evaluated only much later, etc). Cheers Ben
participants (3)
-
Ben Franksen
-
Derek Elkins
-
Ryan Ingram