
Kannan Goundan wrote:
I'm writing a parser with Parsec. In the input language, elements of a sequence are separated by commas:
[1, 2, 3]
However, instead of a comma, you can also use an EOL:
[1, 2
3]
Anywhere else, EOL is considered ignorable whitespace. So it's not as simple as just making EOL a token and looking for (comma | eol).
Hi Kannan,

let's construct the parser top-down. On the top level, you have opening and closing characters, '[' and ']'. Parsec has a function for that:

    between (char '[') (char ']')

And what's in between? A list of elements separated by something. Parsec provides a sepBy function for that:

    element `sepBy` separator

which parses a list of elements separated by separator. What's your separator? Well, it's either ',' or a newline, with optional spaces before and after it:

    mySpaces >> (newline <|> char ',') >> mySpaces -- [1]

Let's combine what we've got:

    myListOf :: (Parsec String () a) -> Parsec String () [a]
    myListOf elem = between (char '[') (char ']')
        (elem `sepBy` (mySpaces >> (newline <|> char ',') >> mySpaces))
      where mySpaces = many (oneOf (" \t"))

And test it in ghci:

    *Main> parseTest (myListOf anyChar) "[a , b, d ,d\np]"
    "abddp"

Hope this helps!

Stephan

PS: The important thing is that there are a lot of solutions for tricky situations (like yours) in Text.Parsec.Combinator (especially the sepBy and many families). Knowing them can save a lot of work :)

[1] I don't use Parsec's spaces function because it also accepts newline characters.
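For anyone who wants to load this into ghci directly, here is a self-contained sketch of the same idea, applied to the original [1, 2 <EOL> 3] example. It is a variation, not Stephan's exact code: the names hspaces, separator, number and listOf are made up for illustration, the element parser is assumed to be an unsigned integer, and the separator is wrapped in try so that blanks before the closing ']' do not cause a failure after input has been consumed.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Horizontal whitespace only. Parsec's own `spaces` would also consume
-- newlines, which must stay visible here because they act as separators.
hspaces :: Parser ()
hspaces = skipMany (oneOf " \t")

-- A separator is a comma or a newline with optional blanks on either side.
-- `try` (an addition to the original reply) lets the parser backtrack when
-- blanks are followed by neither, e.g. the blank before ']' in "[1 ,2 ]".
separator :: Parser ()
separator = try (hspaces >> (newline <|> char ',') >> hspaces)

-- Hypothetical element parser: an unsigned decimal integer.
number :: Parser Int
number = read <$> many1 digit

-- A bracketed list of elements, allowing blanks just inside the brackets.
listOf :: Parser a -> Parser [a]
listOf p = between (char '[' >> hspaces) (hspaces >> char ']')
                   (p `sepBy` separator)
```

In ghci, parse (listOf number) "" "[1, 2\n3]" should give Right [1,2,3]. This sketch still accepts only one newline between elements and does not treat EOL as ignorable whitespace anywhere else, so it covers the separator rule only.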
I've implemented this functionality in a hand-written parser (basically a hack that keeps track of whether the last read token was preceded by an EOL, without making EOL itself a token). Does anybody have ideas about how to do this with Parsec?
-- They used to say: I think, therefore I am. Today we know: it works without, too. - Dieter Nuhr