
On Tue, Sep 28, 2010 at 10:35 PM, Peter Schmitz wrote:
I am a new Parsec user, and having some trouble with a relatively simple parser.
The grammar I want to parse contains tags (not html) marked by angle brackets (e.g., "<some tag>"), with arbitrary text (no angle brackets allowed) optionally in between tags.
Tags may not nest, but the input must begin and end with a tag.
Whitespace may occur anywhere (beginning/end of input, inside/between tags, etc.), and is optional.
I think my problem may be a missing "try", but I'm not sure where it should go.
At runtime I get:
Error parsing file: "...\sampleTaggedContent.txt" (line 4, column 1):
unexpected end of input
expecting "<"
The input was:
<tag1>stuff
more stuff < tag 3 > even more <lastTag>

The code is below. (I'm using Parsec-2.1.0.1.) I don't really want to return anything meaningful yet; just parse okay.
Any advice about the error (or how to simplify or improve the code) would be appreciated.
Thanks much, -- Peter
-- Parsers:
taggedContent = do
    optionalWhiteSpace
    aTag
    many tagOrContent
    aTag
    eof
    return "Parse complete."

tagOrContent = aTag <|> someContent <?> "tagOrContent"

aTag = do
    tagBegin
    xs <- many (noneOf [tagEndChar])
    tagEnd
    optionalWhiteSpace
    return ()

someContent = do
    manyTill anyChar tagBegin
    return ()

optionalWhiteSpace = spaces  -- i.e., any of " \v\f\t\r\n"
tagBegin = char tagBeginChar
tagEnd = char tagEndChar

-- Etc:
tagBeginChar = '<'
tagEndChar = '>'
Here's something I put together:
http://hpaste.org/40201/parsec_question_new_user_un?pid=40201&lang_40201=Haskell

It doesn't have the whitespace handling you want.

The big difference in what I did is that the content parser needs to stop on EOF as well as on the signal char. Otherwise it won't allow the document to end :-)

Antoine
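
To illustrate the point about stopping on EOF as well as on the signal char, here is a minimal sketch. It is not the contents of the hpaste link above, just one way the idea could look. `manyTill anyChar tagBegin` only succeeds once it finds a '<', so it cannot succeed on input that has no further tag. The sketch instead reads content with `skipMany (noneOf "<>")`, which stops at an angle bracket or at end of input and leaves the bracket unconsumed. It assumes the compatibility module `Text.ParserCombinators.Parsec`. The wrapper `parseTagged` is a hypothetical helper added for illustration, and the parser names are reused from the question with simplified bodies.

```haskell
import Text.ParserCombinators.Parsec

-- A tag: '<', any run of non-bracket characters, '>', then optional whitespace.
aTag :: Parser ()
aTag = do
  _ <- char '<'
  skipMany (noneOf "<>")
  _ <- char '>'
  spaces

-- Free text between tags: zero or more characters that are not angle brackets.
-- It stops at '<', at '>', or at end of input, and never consumes the bracket.
someContent :: Parser ()
someContent = skipMany (noneOf "<>")

-- Whole input: optional whitespace, a tag, then any number of
-- (content, tag) pairs, then end of input.
taggedContent :: Parser String
taggedContent = do
  spaces
  aTag
  skipMany (someContent >> aTag)
  eof
  return "Parse complete."

-- Hypothetical convenience wrapper (not in the original post).
parseTagged :: String -> Either ParseError String
parseTagged = parse taggedContent "(input)"
```

With these definitions, `parseTagged "<a> text <b>"` should give `Right "Parse complete."`, while `parseTagged "<a> trailing"` should give a `Left` error, because the trailing text is not followed by a closing tag. No `try` is needed in this arrangement: when the last `someContent >> aTag` iteration runs at end of input it fails without consuming anything, so `skipMany` simply stops and `eof` can succeed.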