Re: [Haskell-cafe] |> vs. $ (was: request for code review)

newer
A syntax proposal: data...

older
Re: Library survey results

Shannon -jj Behrens

6 Mar 2006 6 Mar '06

7:25 p.m.

By the way, thanks for everyone's comments so far! They're very helpful!

...

Also, most haskell programs use $ instead of |>

...
-- For convenience: currTokType :: ParseContext -> TokenType currTokType ctx = ctx |> currTok |> tokenType

this could be written as: tokenType $ currTok $ ctx

Concerning: -- |> is like a UNIX pipe. infixl 9 |> x |> f = f x I find "ctx |> currTok |> tokenType" to be more readable than "tokenType $ currTok $ ctx" because you're not reading the code in reverse. That's my primary complaint with "." and "$". That's especially the case when I'm spreading the code over multiple lines: -- Translate a C type declaration into English. translate :: String -> String translate s = s |> createParseContext |> readToFirstIdentifier |> dealWithDeclarator |> consolidateOutput I prefer |> for readability, but I understand that it can be bad to spring new constructs on programmers in the middle of a program, and I'm totally in favor of following standard Haskell idioms. Does anyone else have strong opinions one way or the other? It seems like most of my program is centered around pipes of "|>", so it's an important issue. Thanks! -jj

Show replies by date

Udo Stenzel

6 Mar 6 Mar

8:42 p.m.

New subject: |> vs. $ (was: request for code review)

Shannon -jj Behrens wrote:

...

I find "ctx |> currTok |> tokenType" to be more readable than "tokenType $ currTok $ ctx" because you're not reading the code in reverse. That's my primary complaint with "." and "$".

Seconded. That's why I'd like to see the arguments to (.) swapped, but it's too late for that. A useable replacement is (>>>) from the Arrow instance for (->), which gives:

...

translate :: String -> String translate = createParseContext >>> readToFirstIdentifier >>> dealWithDeclarator >>> consolidateOutput

It reads from left to right and it promotes a more functional style. The operator could be a bit easier on the eyes, though. Udo. -- Always call a spade a spade, except in classes that both dig holes and play bridge. -- a guideline for Eiffel programmers

Brian Hulley

8:43 p.m.

New subject: |> vs. $ (was: request for code review)

Shannon -jj Behrens wrote:

...

I find "ctx |> currTok |> tokenType" to be more readable than "tokenType $ currTok $ ctx" because you're not reading the code in reverse. That's my primary complaint with "." and "$". That's especially the case when I'm spreading the code over multiple lines:

-- Translate a C type declaration into English. translate :: String -> String translate s = s |> createParseContext |> readToFirstIdentifier |> dealWithDeclarator |> consolidateOutput

If you were wanting to be able to deal with exceptions/ errors etc during the translation process, you'd usually use a monad (Haskell's version of a pipe), in which case the operations could still be read left to right eg: translate :: (Monad m) => String -> m String translate = do createParseContext readToFirstIdentifier dealWithDeclarator consolidateOutput So while . and $ are useful for combining functions together in parts of a program, and are consistent with the idea that the function always comes first followed by its argument, the top level (at least) would usually be monadic and hence not require |> to get left-to-right readability. I've copied the links on monads below from a previous post by Jared: http://www.nomaware.com/monads/html/ http://haskell.org/hawiki/Monad Regards, Brian.

Brian Hulley

7 Mar 7 Mar

2:08 p.m.

New subject: |> vs. $ (was: request for code review)

Brian Hulley wrote:

...

translate :: (Monad m) => String -> m String translate = do createParseContext readToFirstIdentifier dealWithDeclarator consolidateOutput

The type signature above doesn't match the do block. It would either have to be changed to something like: translate :: Control.Monad.State.MonadState String m => m () (storing the string in the monad's state instead of using a monad which returns it) or the do block could be replaced with the >>= operator as below, to thread the returned string between the components of the "pipe": translate :: Monad m => String -> m String translate x = return x >>= createParseContext >>= readToFirstIdentifier >>= dealWithDeclarator >>= consolidateOutput

Shannon -jj Behrens

7:52 p.m.

New subject: |> vs. $ (was: request for code review)

I did think of using a monad, but being relatively new to Haskell, I was confused about a few things. Let's start by looking at one of my simpler functions: -- Keep pushing tokens until we hit an identifier. pushUntilIdentifier :: ParseContextTransformation pushUntilIdentifier ctx | currTokType ctx == Identifier = ctx | otherwise = let newStack = (currTok ctx) : (stack ctx) in (ctx {stack=newStack}) |> getToken |> pushUntilIdentifier The function itself is a ParseContextTransformation. It takes a context, transforms it, and returns it. Most of the pipelines in the whole application are ParseContextTransformations, and the |> (or $ or .) are ways of tying them together. My questions concerning Monads are in this example are: 1. Monads apply a strategy to computation. For instance, the list monad applies the strategy, "Try it with each of my members." What part of my code is the strategy? 2. Monads are containers that wrap a value. For instance, the Maybe monad can wrap any value, or it can wrap no value and just be Nothing. What part of my code is the thing being wrapped, and what part is "extra data" stored in the Monad itself? So I guess: 3. Is the ParseContext the monad or the thing being wrapped? 4. How do I divide the code between the functions on the right side of

...

...
= and the functions in the monad itself? The functions on the right side of >>= operate on the stuff inside the monad, and the functions in the monad itself operate on the stuff in the monad.

5. How does the ParseContextTransformation relate? It is because I did not understand the answers to these questions that I thought maybe a monad might not be appropriate. However, I surely could be wrong. Afterall, ParseContext, ParseContextTransformation, and |> are all *inspired* by what I knew about monads. Thanks for your help! -jj On 3/7/06, Brian Hulley wrote:

...

Brian Hulley wrote:

...
translate :: (Monad m) => String -> m String translate = do createParseContext readToFirstIdentifier dealWithDeclarator consolidateOutput

The type signature above doesn't match the do block. It would either have to be changed to something like:

translate :: Control.Monad.State.MonadState String m => m ()

(storing the string in the monad's state instead of using a monad which returns it) or the do block could be replaced with the >>= operator as below, to thread the returned string between the components of the "pipe":

translate :: Monad m => String -> m String translate x = return x >>= createParseContext >>= readToFirstIdentifier >>= dealWithDeclarator >>= consolidateOutput

Brian Hulley

10:07 p.m.

New subject: |> vs. $ (was: request for code review)

Shannon -jj Behrens wrote:

...

I did think of using a monad, but being relatively new to Haskell, I was confused about a few things. Let's start by looking at one of my simpler functions:

-- Keep pushing tokens until we hit an identifier. pushUntilIdentifier :: ParseContextTransformation pushUntilIdentifier ctx | currTokType ctx == Identifier = ctx | otherwise = let newStack = (currTok ctx) : (stack ctx) in (ctx {stack=newStack}) |> getToken |> pushUntilIdentifier

The function itself is a ParseContextTransformation. It takes a context, transforms it, and returns it. Most of the pipelines in the whole application are ParseContextTransformations, and the |> (or $ or .) are ways of tying them together. My questions concerning Monads are in this example are:

1. Monads apply a strategy to computation. For instance, the list monad applies the strategy, "Try it with each of my members." What part of my code is the strategy?

In the pipe in the 'otherwise' branch, at the moment you have to assume that each of the transformations can successfully be done. What happens if getToken can't get a token because there are no more tokens left? To solve this problem you could use a monad such as Maybe, to encapsulate the strategy "keep going as long as no problems have been encountered so far" eg: type ParseContextTransformation = ParseContext -> Maybe ParseContext pushUntilIdentifier :: ParseContextTransformation pushUntilIdentifier ctx | currTokType ctx == Identifier = Just ctx | otherwise = let newStack = (currTok ctx) : (stack ctx) in return ctx{stack=newStack} >>= getToken >>= pushUntilIdentifier -- Read the next token into currTok. getToken :: ParseContextTransformation getToken ctx@(ParseContext {input=s}) = let lstrip s = dropWhile isSpace s in case lexString (lstrip s) of (Just token, theRest) -> Just (ctx{currTok=token, input = theRest}) _ -> Nothing lexString :: String -> (Maybe Token, String) lexString s@(c:cs) | isAlphaNum c = let (tokString, theRest) = span isAlphaNum s token = classifyString tokString in (Just token, theRest) lexString ('*':cs) = (Just $ classifyString "*", cs) lexString (c:cs) = (Just $ classifyString (c:[]), cs) lexString [] = (Nothing, []) -- can now deal with this case lexString is itself a candidate for a monadic computation on a state monad where the state is the string and Maybe Token is the return type, but it depends on how much you want to "monadify" your code...

...

2. Monads are containers that wrap a value. For instance, the Maybe monad can wrap any value, or it can wrap no value and just be Nothing. What part of my code is the thing being wrapped, and what part is "extra data" stored in the Monad itself?

So I guess:

3. Is the ParseContext the monad or the thing being wrapped?

Using the Maybe monad as above, it is the monad's "return type". For any monad m, m a means "the monad m returning a value of type a" so Maybe ParseContext means "a Maybe monad returning a value of type ParseContext". I think "stored in the monad itself" would usually refer to the case where you use some sort of state monad where the ParseContext would be the state but AFAIK this wouldn't be the most natural way to structure this sort of application.

...

4. How do I divide the code between the functions on the right side of

...
...
= and the functions in the monad itself? The functions on the right side of >>= operate on the stuff inside the monad, and the functions in the monad itself operate on the stuff in the monad.

Using the Maybe monad you could access the result by: toplevel :: String -> IO () toplevel s = case translate s of Just s' -> putStrLn s' Nothing -> putStrLn "Error translating" where translate and each of its component functions are changed to return their results via the Maybe monad.

...

5. How does the ParseContextTransformation relate?

I just modified ParseContextTransformation so that the resulting ParseContext is returned via the Maybe monad to allow for failure in any of the transformation steps. You'd need to also change createParseContext to return Maybe ParseContext etc. There are more advanced ways of using monads, eg where you use Monad m => instead of hardcoding the Maybe monad into the result, but it probably makes more sense to understand monads using concrete examples first. The tutorials give more info on these advanced monadic ways (and are certainly far better than me at explaining them). Hope this helps, Brian.

Bulat Ziganshin

8 Mar 8 Mar

8:16 a.m.

New subject: Re[2]: |> vs. $ (was: request for code review)

Hello Shannon, Tuesday, March 7, 2006, 10:52:01 PM, you wrote: SjB> The function itself is a ParseContextTransformation. It takes a SjB> context, transforms it, and returns it. Most of the pipelines in the SjB> whole application are ParseContextTransformations, and the |> (or $ or SjB> .) are ways of tying them together. My questions concerning Monads SjB> are in this example are: SjB> 1. Monads apply a strategy to computation. For instance, the list SjB> monad applies the strategy, "Try it with each of my members." What SjB> part of my code is the strategy? SjB> 2. Monads are containers that wrap a value. For instance, the Maybe 1&2 is only possible variations, but they don't covers everything. in GENERAL, monad is the way to write code as high-order functions that then will be used in some special way. this allows to hide part of computation details in the rules of this internal processing what you need here, imho, is a state monad. your context will become a state and monad should contain operations to read/write this state. to be exact, you should use State monad here, which already contains appropriate operations in the State monad, each action has type: type State a = StateType -> (a, StateType) i.e. each action is higher order function which transforms state (having type StateType) and in addition can return value of type 'a' are you read http://www.nomaware.com/monads/monad_tutorial.zip ? it's comprehensive tutorial about monads -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Daniel Fischer

11:01 a.m.

New subject: |> vs. $ (was: request for code review)

Am Dienstag, 7. März 2006 20:52 schrieb Shannon -jj Behrens:

...

I did think of using a monad, but being relatively new to Haskell, I was confused about a few things. Let's start by looking at one of my simpler functions:

-- Keep pushing tokens until we hit an identifier. pushUntilIdentifier :: ParseContextTransformation pushUntilIdentifier ctx

| currTokType ctx == Identifier = ctx | otherwise =

let newStack = (currTok ctx) : (stack ctx) in (ctx {stack=newStack}) |> getToken |> pushUntilIdentifier

The function itself is a ParseContextTransformation. It takes a context, transforms it, and returns it. Most of the pipelines in the whole application are ParseContextTransformations, and the |> (or $ or .) are ways of tying them together. My questions concerning Monads are in this example are:

1. Monads apply a strategy to computation. For instance, the list monad applies the strategy, "Try it with each of my members." What part of my code is the strategy?

2. Monads are containers that wrap a value. For instance, the Maybe ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Some are, others embody computations that produce a value, yet others: ?

...

monad can wrap any value, or it can wrap no value and just be Nothing. What part of my code is the thing being wrapped, and what part is "extra data" stored in the Monad itself?

So I guess:

3. Is the ParseContext the monad or the thing being wrapped?

4. How do I divide the code between the functions on the right side of

...
...
= and the functions in the monad itself? The functions on the right

side of >>= operate on the stuff inside the monad, and the functions in the monad itself operate on the stuff in the monad.

5. How does the ParseContextTransformation relate?

It is because I did not understand the answers to these questions that I thought maybe a monad might not be appropriate. However, I surely could be wrong. Afterall, ParseContext, ParseContextTransformation, and |> are all *inspired* by what I knew about monads.

Thanks for your help!

-jj

I'd use a State-monad, say import Control.Monad.State type CDParser a = State ParseContext a -- or perhaps StateT ParseContext m a, where m is an appropriate monad, -- I haven't thought much about it then you'd have e.g. pushUntilIdentifier :: CDParser () pushUntilIdentifier = do tt <- gets currTokType case tt of Identifier -> return () _ -> do pushToken getToken pushUntilIdentifier okay, that doesn't look really better, but if you'd done it monadically from the start, you'd probably chosen a different design (I think, I'd leave the current token out of the ParseContext and have that returned by the appropriate actions). Alternatively, you could use Parsec with Parsecontext as user state (removing the input from ParseContext) and take advantage of the many provided combinators in Parsec. As another method, I've hacked up a translation by parsing a declaration and creating a customized Show-instance. It could be much improved, but for a quick hack, I can live with it. -- | Translate C-declarations to english, well, sort of module Translate where import Text.ParserCombinators.ReadPrec import qualified Text.ParserCombinators.ReadP as P import Text.Read import Data.Char (isAlpha, isAlphaNum) -- lift some operators from ReadP to ReadPrec, indicates that I -- should have originally worked with ReadP and lifted to -- ReadPrec afterwards. spaces = lift P.skipSpaces string = lift . P.string char = lift . P.char many p = lift $ P.many $ readPrec_to_P p 0 -- | list of known types, struct, union and enum don't really -- belong here, but since C is inherently sick, it doesn't matter typeNames :: [String] typeNames = [ "void", "char", "signed", "unsigned", "short", "int" , "long", "float", "double", "struct", "union", "enum"] -- | may this Char appear in a C-identifier? isIdLetter :: Char -> Bool isIdLetter c = c == '_' || isAlphaNum c -- | may this Char begin a C-identifier? isIdStart :: Char -> Bool isIdStart c = c == '_' || isAlpha c -- | the sort of types, we can handle data CType = Basic String -- ^ plain types like int, char... | Const CType -- ^ type with "const" | Ptr CType -- ^ pointer to type | Array [Maybe Int] CType -- ^ Array with dimensions -- | type synonym to check whether a variable is volatile type Volatile = Bool -- | the declarations we can parse, due to C's horrible syntax, -- we can't handle multiple variable declarations like -- -- > int *a, b[5], c; -- -- but the original programme couldn't either. data Decl = VarDecl CType Volatile String | FunDecl CType String [CType] ---------------------------------------------------------------------- -- Show Instances -- ---------------------------------------------------------------------- -- here we translate the declaration to english instance Show CType where showsPrec _ (Basic nm) = showString nm showsPrec _ (Const ty) = showString "read-only " . shows ty showsPrec _ (Ptr ty) = showString "pointer to " . shows ty showsPrec _ (Array dims ty) = showD dims . shows ty where showD [] = id showD (Just n:ds) = showString "array 0.." . shows (n-1) . showString " of " . showD ds showD (Nothing:ds) = showString "array of " . showD ds instance Show Decl where showsPrec _ (VarDecl ty vol nm) = showString nm . showString " is a " . showV vol . showString "variable of type " . shows ty where showV True = showString "volatile " showV False = id showsPrec _ (FunDecl rty nm atys) = showString nm . showString " is a function of " . showArgs atys . showString ", returning " . shows rty where showArgs [] = showString "no arguments" showArgs [t] = showString "one argument of type " . shows t showArgs tys@(t:ts) = shows (length tys) . showString " arguments of types " . shows t . showRest ts showRest [t] = showString " and " . shows t showRest (t:ts) = showString ", " . shows t . showRest ts ---------------------------------------------------------------------- -- Parsing -- ---------------------------------------------------------------------- lexeme :: String -> ReadPrec String lexeme str = do spaces string str rst <- look case rst of (c:_) | isIdLetter c -> pfail _ -> return str parseIdentifier :: ReadPrec String parseIdentifier = lift $ do P.skipSpaces c <- P.satisfy isIdStart cs <- P.many (P.satisfy isIdLetter) return (c:cs) parseBasic :: ReadPrec CType parseBasic = do tynam <- choice $ map lexeme typeNames return (Basic tynam) parseNoArray :: ReadPrec CType parseNoArray = (do ty <- parsePType lexeme "const" return (Const ty)) <++ parsePType parseConstT :: ReadPrec CType parseConstT = do lexeme "const" bs <- parseBasic return (Const bs) parsePrePtr :: ReadPrec CType parsePrePtr = parseConstT <++ parseBasic parsePType :: ReadPrec CType parsePType = do ty <- parsePrePtr complete ty where complete t = (do spaces char '*' complete (Ptr t)) <++ return t parseDim :: ReadPrec (Maybe Int) parseDim = (do spaces char '[' n <- readPrec spaces char ']' return (Just n)) <++ (spaces >> string "[]" >> return Nothing) parseVarDecl :: ReadPrec Decl parseVarDecl = do ty <- parseNoArray vol <- (lexeme "volatile" >> return True) <++ return False nam <- parseIdentifier ds <- many parseDim spaces char ',' <++ char ';' <++ escape let t = if null ds then ty else Array ds ty return (VarDecl t vol nam) where escape = do rst <- look case rst of (')':_) -> return '.' _ -> pfail parseFunDecl :: ReadPrec Decl parseFunDecl = do ty <- parsePType nam <- parseIdentifier ats <- parseFuncArgs spaces char ';' <++ char '{' return (FunDecl ty nam ats) parseFuncArgs :: ReadPrec [CType] parseFuncArgs = do char '(' vds <- many parseVarDecl char ')' return (map typ vds) where typ (VarDecl ty _ _) = ty instance Read Decl where readPrec = parseFunDecl <++ parseVarDecl readDecl :: String -> Decl readDecl = read translate :: String -> String translate = show . readDecl -------------------------------------------------------------------------- It's better than the original for some things: *CDecl> translate "const int * const a;" "a is read-only" *CDecl> :l Translate Compiling Translate ( Translate.hs, interpreted ) Ok, modules loaded: Translate. *Translate> translate "const int * const a;" "a is a variable of type read-only pointer to read-only int" but *CDecl> translate "union {int a, char b};" "a is int" *Translate> translate "union {int a, char b};" "*** Exception: Prelude.read: no parse Neither is really convincing. Cheers, Daniel -- "In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt." -- Blair P. Houghton

Shannon -jj Behrens

10:19 p.m.

New subject: |> vs. $ (was: request for code review)

First of all, thank you all so much for taking the time to help me with this exercise! My hope is that once I'm able to understand it, my understanding can come through in the article I write.

...

Brian Hulley: In the pipe in the 'otherwise' branch, at the moment you have to assume that each of the transformations can successfully be done. What happens if getToken can't get a token because there are no more tokens left? To solve this problem you could use a monad such as Maybe, to encapsulate the strategy "keep going as long as no problems have been encountered so far" eg:

I can see where you're going with the Maybe monad, and it does make sense. However, I see Maybe as a "hammer" for a nail I wasn't really all that interested in nailing ;) It's true that getToken might fail. Most of the program isn't prepared to handle errors gracefully, but neither was the C version. That's okay. If anything, handling it as an exception and printing out a generic error message would be more than enough. Aside from the better error handling, I fear the Maybe monad isn't buying much.

...

Bulat Ziganshin: what you need here, imho, is a state monad.

...

Danil Fischer I'd use a State-monad, say

I suspect you guys are right. I had always thought of states as being "isomorphic" to integers (i.e. you can be in state 0, state 1, ... state n), not as contexts (you have this input, that output, and this token stack), am I wrong? I suspect I need to spend more time trying to understand the state monad. I must admit that I baulked the last time I tried to squeeze it into my head. I'll just need to try again ;)

...

but if you'd done it monadically from the start, you'd probably chosen a different design

I specifically chose not to have a radically different design because I wanted to maintain the "nature" of the original C code. Naturally, if I were to try to do this from scratch, I'd use a powerful lexer and parser. However, the beauty of this code (i.e. the original C code) is that it works without *needing* to use or understand such powerful tools.

...

As another method, I've hacked up a translation by parsing a declaration and creating a customized Show-instance.

Yeah, I thought of that too, but decided against if for the reason given above. Think of camping--sometimes it's fun to "rough it". Sometimes it can be fun to solve this problem without powerful tools. Maybe I'm just being silly ;)

...

"In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt."

Yeah, yeah ;) I'm mostly a Python guy, so you're preaching to the choir ;) Thanks again, guys!!!

Jared Updike

9 Mar 9 Mar

1:52 a.m.

New subject: |> vs. $ (was: request for code review)

...

I suspect you guys are right. I had always thought of states as being "isomorphic" to integers (i.e. you can be in state 0, state 1, ... state n), not as contexts (you have this input, that output, and this token stack), am I wrong?

You're thinking of a state machine, I think, which is not quite what a state monad would do here. (I have nightmares of writing a state-machine parser in assembly like I did in an EE class once... ouch).

...

I suspect I need to spend more time trying to understand the state monad. I must admit that I baulked the last time I tried to squeeze it into my head. I'll just need to try again ;)

Here's the way I like to think about state in imperative programs---it's hard because it's not something you can get far away from enough to see, usually. In imperative programs, the value of a variable 'a' at one point is not always the value of the variable 'a' at another point later in the code. In some sense, each statement that gets executed is passed the entire state of the machine (the world) implicitly, and then when the statement ends, it passes the state of the world on to the next statement. If you want to access the value of the variable 'a', then 'a' gets looked up in the environment/state. In C/C++/Java/C#/Python/Perl, etc. this is done for you automatically and efficiently. It's just the way the machine works. But you don't have the choice to change this or, as someone put it, "overload the semicolon". In Haskell none of this variable-mutating, state-passing **can** occur, so it gets created from scratch, and voila, we have the State Monad. It makes it sound like a lot more work than it should be just to do something that comes for free in most other languages, but in these languages, you can't overload the semicolon! And if you could, who knows what could go wrong at runtime (imagine Perl with semicolon overloading... I bet some day they will do this just because they can...). In Haskell, everything is watched over by the type system, so the parts of your program that explicitly need to munge state are isolated with the some type tag, e.g. ParseContext, while the rest of your program is "normal" and pure and functional. The problem with monads is not that they are "advanced" but that they are so painfully and subtly abstract (I was going to say "subtly simple" but maybe they aren't for most working non-Haskell programmers...). (It just so happens that you **can** do amazing, convenient, efficient, magic and otherwise advanced things with them, especially with the libraries.) Another problem is that everyone has different ways of explaining them or trying to define what they are (a way of sequencing computation? or a type constructor? or a type class?). Of course, they are all those things, which makes it even more confusing. At a certain point, though, I think they just subconciously click and boom, now you get it. Anyway, if your goal is to get people to understand Haskell, then see if you can use monads to simplify things. If your goal is to do a straight translation of the C code, don't worry about monads. My 2c, Jared. -- http://www.updike.org/~jared/ reverse ")-:"

Shannon -jj Behrens

7:26 a.m.

New subject: |> vs. $ (was: request for code review)

On 3/8/06, Jared Updike wrote:

...

...
I suspect you guys are right. I had always thought of states as being "isomorphic" to integers (i.e. you can be in state 0, state 1, ... state n), not as contexts (you have this input, that output, and this token stack), am I wrong?

You're thinking of a state machine, I think, which is not quite what a state monad would do here. (I have nightmares of writing a state-machine parser in assembly like I did in an EE class once... ouch).

...
I suspect I need to spend more time trying to understand the state monad. I must admit that I baulked the last time I tried to squeeze it into my head. I'll just need to try again ;)

Here's the way I like to think about state in imperative programs---it's hard because it's not something you can get far away from enough to see, usually.

In imperative programs, the value of a variable 'a' at one point is not always the value of the variable 'a' at another point later in the code. In some sense, each statement that gets executed is passed the entire state of the machine (the world) implicitly, and then when the statement ends, it passes the state of the world on to the next statement. If you want to access the value of the variable 'a', then 'a' gets looked up in the environment/state. In C/C++/Java/C#/Python/Perl, etc. this is done for you automatically and efficiently. It's just the way the machine works. But you don't have the choice to change this or, as someone put it, "overload the semicolon".

In Haskell none of this variable-mutating, state-passing **can** occur, so it gets created from scratch, and voila, we have the State Monad. It makes it sound like a lot more work than it should be just to do something that comes for free in most other languages, but in these languages, you can't overload the semicolon! And if you could, who knows what could go wrong at runtime (imagine Perl with semicolon overloading... I bet some day they will do this just because they can...). In Haskell, everything is watched over by the type system, so the parts of your program that explicitly need to munge state are isolated with the some type tag, e.g. ParseContext, while the rest of your program is "normal" and pure and functional.

The problem with monads is not that they are "advanced" but that they are so painfully and subtly abstract (I was going to say "subtly simple" but maybe they aren't for most working non-Haskell programmers...). (It just so happens that you **can** do amazing, convenient, efficient, magic and otherwise advanced things with them, especially with the libraries.) Another problem is that everyone has different ways of explaining them or trying to define what they are (a way of sequencing computation? or a type constructor? or a type class?). Of course, they are all those things, which makes it even more confusing. At a certain point, though, I think they just subconciously click and boom, now you get it.

Anyway, if your goal is to get people to understand Haskell, then see if you can use monads to simplify things. If your goal is to do a straight translation of the C code, don't worry about monads.

Dude, that was a friggin' awesome email! I'm trying to figure out how I can just copy it wholesale into the article ;) I've been struggling with Haskell for long enough that my knowledge is now snowballing downhill. Everything you said made sense 100%. Nice! -jj

Shannon -jj Behrens

7:54 a.m.

New subject: |> vs. $ (was: request for code review)

On 3/8/06, Shannon -jj Behrens wrote:

...

On 3/8/06, Jared Updike wrote:

...
...
I suspect you guys are right. I had always thought of states as being "isomorphic" to integers (i.e. you can be in state 0, state 1, ... state n), not as contexts (you have this input, that output, and this token stack), am I wrong?

You're thinking of a state machine, I think, which is not quite what a state monad would do here. (I have nightmares of writing a state-machine parser in assembly like I did in an EE class once... ouch).

...
I suspect I need to spend more time trying to understand the state monad. I must admit that I baulked the last time I tried to squeeze it into my head. I'll just need to try again ;)

Here's the way I like to think about state in imperative programs---it's hard because it's not something you can get far away from enough to see, usually.

In imperative programs, the value of a variable 'a' at one point is not always the value of the variable 'a' at another point later in the code. In some sense, each statement that gets executed is passed the entire state of the machine (the world) implicitly, and then when the statement ends, it passes the state of the world on to the next statement. If you want to access the value of the variable 'a', then 'a' gets looked up in the environment/state. In C/C++/Java/C#/Python/Perl, etc. this is done for you automatically and efficiently. It's just the way the machine works. But you don't have the choice to change this or, as someone put it, "overload the semicolon".

In Haskell none of this variable-mutating, state-passing **can** occur, so it gets created from scratch, and voila, we have the State Monad. It makes it sound like a lot more work than it should be just to do something that comes for free in most other languages, but in these languages, you can't overload the semicolon! And if you could, who knows what could go wrong at runtime (imagine Perl with semicolon overloading... I bet some day they will do this just because they can...). In Haskell, everything is watched over by the type system, so the parts of your program that explicitly need to munge state are isolated with the some type tag, e.g. ParseContext, while the rest of your program is "normal" and pure and functional.

The problem with monads is not that they are "advanced" but that they are so painfully and subtly abstract (I was going to say "subtly simple" but maybe they aren't for most working non-Haskell programmers...). (It just so happens that you **can** do amazing, convenient, efficient, magic and otherwise advanced things with them, especially with the libraries.) Another problem is that everyone has different ways of explaining them or trying to define what they are (a way of sequencing computation? or a type constructor? or a type class?). Of course, they are all those things, which makes it even more confusing. At a certain point, though, I think they just subconciously click and boom, now you get it.

Anyway, if your goal is to get people to understand Haskell, then see if you can use monads to simplify things. If your goal is to do a straight translation of the C code, don't worry about monads.

Dude, that was a friggin' awesome email! I'm trying to figure out how I can just copy it wholesale into the article ;) I've been struggling with Haskell for long enough that my knowledge is now snowballing downhill. Everything you said made sense 100%.

Yes, having read more, I can see clearly that the State monad was what I was looking for. Consider: http://www.nomaware.com/monads/html/statemonad.html A pure functional language cannot update values in place because it violates referential transparency. A common idiom to simulate such stateful computations is to "thread" a state parameter through a sequence of functions...This approach works, but such code can be error-prone, messy and difficult to maintain. The State monad hides the threading of the state parameter inside the binding operation, simultaneously making the code easier to write, easier to read and easier to modify.

Jared Updike

8:39 a.m.

New subject: |> vs. $ (was: request for code review)

...

Dude, that was a friggin' awesome email! I'm trying to figure out how I can just copy it wholesale into the article ;)

Use what you need. Share and share alike.

...

I've been struggling with Haskell for long enough that my knowledge is now snowballing downhill.

I think I experienced that too. I like how Haskell is about the concepts (even things you didn't know you were already doing implicitly) and the language is just a very transparent vehicle for manipulating them. Haskell has really taught me a lot and helped me think a lot clearer---but now I think in Haskell! The danger is that since I've taken the red "Haskell" pill I have a hard time looking at the imperative world the same way. I have to resist trying to cram all these elegant approaches into a system that can't support them and instead program like a normal person. (The good news is that many languages are slowly soaking up features, e.g. C# delegates are higher-order functions and anonymous delegates are lambdas, C# 3.0 will have some kind of type inference, etc. but these features are many decades old.)

...

Everything you said made sense 100%.

To add a few more comments to this: in Haskell you don't overload the semicolon, instead you overload >>= ("bind") and create a new typeclass with this function (as well as a function called "return", where these two functions obey certain laws that keep things kosher, i.e. "return" is the identity, >>= is associative, and they interact as you would expect). It just so happens that the semicolons (or newlines) in do blocks get converted into >>=s and lambdas to scope your variables over the rest of your actions. The reason monads are cool and magical is that now that we've made this sequencing explicit in the type, we can swap out the machinery (what type of implicit state we're passing around or even the >>= function itself) and change a minimal amount of code, i.e. nothing... except adding code to use the new functionality (if all goes well). We can now get continuation passing, or true mutable variables (e.g. IORef), or a simple sort of non-determinism (e.g. List as a monad), or failure (Maybe) or exception handling, or any number of behaviors (or combinations) by changing the monad. BTW the original discussion was about composing functions and I gotta say since C programmers don't really think in terms of "f . g x = f (g x)" anyways, and since writing code in the order it executes is natural (think Python str.replace('\r', '').split('\n') ), I think the idea of explaining |> as a pipe is pretty cool. There might be a little too much overhead in explaining things if you drop the M-bomb (monads), but if you can make and keep monads clear and simple and use them there, you will have accomplished a great feat (and analogies like Unix pipes are a good thing). I'm curious to see how this evolves. Good luck. Jared. -- http://www.updike.org/~jared/ reverse ")-:"

Bulat Ziganshin

7:37 a.m.

New subject: Re[2]: |> vs. $ (was: request for code review)

Hello Shannon, Thursday, March 9, 2006, 1:19:39 AM, you wrote:

...

...
I'd use a State-monad, say

SjB> I suspect you guys are right. I had always thought of states as SjB> being "isomorphic" to integers (i.e. you can be in state 0, state 1, SjB> ... state n), not as contexts (you have this input, that output, and SjB> this token stack), am I wrong? I suspect I need to spend more time SjB> trying to understand the state monad. I must admit that I baulked SjB> the last time I tried to squeeze it into my head. I'll just need to SjB> try again ;) 1) read the "monad tutorial" 2) state is just a value what can be get/set. it's an "emulation" of global mutable variables present in other languages. because this state can be a tuple, you can emulate multiple vars. that's all as i said, in State monad each monadic computation is actually transition from old variable's contents to their new contents -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bill Wood

6 Mar 6 Mar

11:54 p.m.

New subject: |> vs. $ (was: request for code review)

On Mon, 2006-03-06 at 11:25 -0800, Shannon -jj Behrens wrote: . . .

...

I find "ctx |> currTok |> tokenType" to be more readable than "tokenType $ currTok $ ctx" because you're not reading the code in reverse. That's my primary complaint with "." and "$". That's especially the case when I'm spreading the code over multiple lines:

(Just my $0.02 worth; no flames please :-) I notice that all but the last of your four functions read like commands in English. It seems natural to write a sequence of commands left-to-right in the order they are to be performed, so "|>" seem natural and "$" and "." seem backward. However, if the chain "f $ g $ h x" is read something like "the result of applying f to the result of applying g to the result of applying h to x" then the usual order seems more natural. And my preferences differ from Udo's -- I would not like to see the order of args to "." and "$" reversed *unless* the arg was written to the left of the chain, ala "x h $ g $ f", as is done by some algebraists. It does seem that the whole controversy boils down to how the writer thinks of the chain -- as a sequence of actions or as the evolution of a value. Neither one is "the One True View"; they're just different. -- Bill Wood

7060

Age (days ago)

7063

Last active (days ago)

List overview

Download

14 comments

7 participants

participants (7)

Bill Wood
Brian Hulley
Bulat Ziganshin
Daniel Fischer
Jared Updike
Shannon -jj Behrens
Udo Stenzel