Parsec, state and/of my cluelessness

Hi everybody, for my first real 'learn some haskell'-project I decided upon a parser. The resulting application would be somewhat useful to me and armed with such a cool library as Parsec, how could I fail? I was going to be a haskell hacker in notime. Oh, the wine, the women and the fame. Didn't work out though. I'm going to parse a tree of controls. Each control can have values, (a tree of) properties and more controls as children: type Guid = String type Value = (String, String) data Property = MkProperty String (Maybe Guid) [Value] [Property] data Control = MkControl String String [Value] [Property] [Control] The top parser is along the lines of: ctrlParser :: Parser Control ctrlParser = do { reserved "Begin" ; ctrl <- identifier ; name <- identifier ; (vs, ps, cs) <- ctrlBodyParser ; reserved "End" ; return (MkControl ctrl name vs ps cs) } Now comes the tricky part for me. Since the control can have three different types of children I use a helper that parses the body of the control using other parsers, collecting their results in three lists: ctrlBodyParser :: CharParser ([Value], [Property], [Control]) ([Value], [Property], [Control]) ctrlBodyParser = do { c <- ctrlParser -- parse child control ; (vs, ps, cs) <- getState ; setState (vs, ps, (c : cs)) ; ctrlBodyParser } <|> do { p <- propParser -- parse child property ; (vs, ps, cs) <- getState ; setState (vs, (p : ps), cs) ; ctrlBodyParser } <|> do { v <- valueParser -- parse value ; (vs, ps, cs) <- getState ; setState ((v : vs), ps, cs) ; ctrlBodyParser } <|> do { getState } -- we're finished, return children GHC error: ---8<--- Couldn't match `()' against `(a, b, c)' Expected type: () Inferred type: (a, b, c) When checking the pattern: (vs, ps, cs) In a 'do' expression: (vs, ps, cs) <- getState --->8--- So, I have a problem with the state handling. Any hints? Some other questions: Do I need to supply the ctrlBodyParser with an initial state when I call it from the ctrlParser? Should I modify the last alternative in the ctrlBodyParser so that it resets the state before calling return with the collected results? Thanks, Niklas

[snip]
Now comes the tricky part for me. Since the control can have three different types of children I use a helper that parses the body of the control using other parsers, collecting their results in three lists:
ctrlBodyParser :: CharParser ([Value], [Property], [Control]) ([Value], [Property], [Control]) ctrlBodyParser = do { c <- ctrlParser -- parse child control ; (vs, ps, cs) <- getState ; setState (vs, ps, (c : cs)) ; ctrlBodyParser } <|>
[snip some alternatives]
<|> do { getState } -- we're finished, return children
I think you might do better to make it tail-recursive (sort of) by passing intermediate lists as parameters to ctrlBodyParser rather than using state. Parsec state (if I recall correctly) needs to have the same type throughout the parse, but here you really just want a bit of help accumulating some results in a section of the parse tree. Try this: ctrlBodyParser :: Parser ([Value],[Property],[Control]) ctrlBodyParser = ctrlBodyParser0 [] [] [] ctrlBodyParser0 :: [Value] -> [Property] -> [Control] -> Parser ([Value],[Property],[Control]) ctrlBodyParser0 vs ps cs = do { c <- ctrlParser; ctrlBodyBodyParser0 vs ps (c : cs) } <|> .... etc <|> do { return (vs,ps,cs) } Be aware that your lists will come out in the reverse order that they apper in the text. You can also use a single labeled record instead of the three list parameters and a tuple.

Hi Niklas,
ctrlBodyParser :: CharParser ([Value], [Property], [Control]) ([Value], [Property], [Control]) ctrlBodyParser = do { c <- ctrlParser -- parse child control ; (vs, ps, cs) <- getState ; setState (vs, ps, (c : cs)) ; ctrlBodyParser } <|> do { p <- propParser -- parse child property ; (vs, ps, cs) <- getState ; setState (vs, (p : ps), cs) ; ctrlBodyParser } <|> do { v <- valueParser -- parse value ; (vs, ps, cs) <- getState ; setState ((v : vs), ps, cs) ; ctrlBodyParser } <|> do { getState } -- we're finished, return children
Uhm, maybe I'm being clueless here but I never use state to pass around results. This looks like a place where you want to parse "many" subparsers: ctrlBodyParser = many parseOneCtrlBody parseOneCtrlBody = do { c <- ctrlParser; return (Control c)} <|> do { p <- propParser; return (Property p)} <|> do { v <- valueParser; return (Value v)} data CtrlBody = Control | Property | Value Of course, ctrlBodyParser then has type [CtrlBody] so if you want your triple of lists you have to postprocess the list. Anyway, I don't think parsec state is what you want to use here and explicit recursion of parsers can often be avoided using the many (pun intended) combinators of Parsec. Cheers, Arjan
participants (3)
-
Arjan van IJzendoorn
-
Niklas
-
robert dockins