Parsec question (new user): unexpected end of input

I am a new Parsec user, and having some trouble with a relatively
simple parser.
The grammar I want to parse contains tags (not html) marked by
angle brackets (e.g., "<some tag>"), with arbitrary text (no angle
brackets allowed) optionally in between tags.
Tags may not nest, but the input must begin and end with a tag.
Whitespace may occur anywhere (beginning/end of input,
inside/between tags, etc.), and is optional.
I think my problem may be a lack of using "try", but I'm not sure
where.
At runtime I get:
Error parsing file: "...\sampleTaggedContent.txt" (line 4, column 1):
unexpected end of input
expecting "<"
The input was:
<tag1>stuff
-- Parsers:
taggedContent = do
    optionalWhiteSpace
    aTag
    many tagOrContent
    aTag
    eof
    return "Parse complete."

tagOrContent = aTag <|> someContent <?> "tagOrContent"

aTag = do
    tagBegin
    xs <- many (noneOf [tagEndChar])
    tagEnd
    optionalWhiteSpace
    return ()

someContent = do
    manyTill anyChar tagBegin
    return ()

optionalWhiteSpace = spaces   -- i.e., any of " \v\f\t\r\n"

tagBegin = char tagBeginChar
tagEnd = char tagEndChar

-- Etc:
tagBeginChar = '<'
tagEndChar = '>'
--------

On Tue, Sep 28, 2010 at 10:35 PM, Peter Schmitz wrote:
> [...]
> At runtime I get:
> Error parsing file: "...\sampleTaggedContent.txt" (line 4, column 1):
> unexpected end of input
> expecting "<"
>
> The input was:
> <tag1>stuff
> more stuff < tag 3 > even more <lastTag>
>
> The code is below. (I'm using Parsec-2.1.0.1.)
> I don't really want to return anything meaningful yet; just parse okay.
> Any advice about the error (or how to simplify or improve the code)
> would be appreciated.
> Thanks much,
> -- Peter
> [...]

Here's something I put together:

http://hpaste.org/40201/parsec_question_new_user_un?pid=40201&lang_40201=Haskell

It doesn't have the whitespace handling you want.

The big difference in what I did was that when parsing content, it
needs to stop on EOF as well as the signal char. Otherwise it won't
allow the document to end :-)

Antoine
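
Editorial sketch (not the hpaste code, which is not reproduced in this thread): one way to get a content parser that stops at the signal char and also at EOF is to read one or more non-bracket characters. The name `document` and the `Either`-tagged result are assumptions made for illustration, and the sketch does not enforce the "must begin and end with a tag" rule.

```haskell
import Text.ParserCombinators.Parsec

-- A tag: '<', any characters other than '>', then '>'.
-- Whitespace after the tag is skipped.
aTag :: Parser String
aTag = do
    _ <- char '<'
    name <- many (noneOf ">")
    _ <- char '>'
    spaces
    return name

-- Content: at least one character that is not an angle bracket.
-- This stops before the next '<' without consuming it, and it also
-- stops without error when the input runs out.
someContent :: Parser String
someContent = many1 (noneOf "<>")

-- Hypothetical top-level parser: tags and content in any order, then EOF.
document :: Parser [Either String String]
document = do
    spaces
    items <- many (fmap Left aTag <|> fmap Right someContent)
    eof
    return items
```

Unlike `manyTill anyChar tagBegin`, this content parser never consumes the '<' that opens the following tag, so `aTag` can still see it.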

On 29.09.2010 05:35, Peter Schmitz wrote:
> [...]
> Error parsing file: "...\sampleTaggedContent.txt" (line 4, column 1):
> unexpected end of input
> expecting "<"
>
> The input was:
> [...]
>
> -- Parsers:
> taggedContent = do
>     optionalWhiteSpace
>     aTag
>     many tagOrContent
>     aTag

"many tagOrContent" will consume all tags, so that no tag for the
following "aTag" will be left.

Cheers
Christian

>     eof
>     return "Parse complete."
> [...]
--------
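
Editorial sketch: the effect of a greedy `many` followed by the same parser can be reproduced in isolation. The parser name `greedy` is hypothetical, and `aTag` here is a simplified stand-in for the one in the original post.

```haskell
import Text.ParserCombinators.Parsec

-- Simplified tag parser: '<', any characters other than '>', then '>'.
aTag :: Parser String
aTag = do
    _ <- char '<'
    name <- many (noneOf ">")
    _ <- char '>'
    spaces
    return name

-- `many aTag` takes every tag it can, including the last one, and
-- Parsec does not give one back afterwards. The `aTag` that follows
-- therefore always meets the end of the tag sequence and fails.
greedy :: Parser String
greedy = do
    _ <- many aTag
    lastTag <- aTag
    eof
    return lastTag
```

Running `parse greedy "" "<a><b>"` yields a `Left` parse error, even though the input does end with a tag.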

On 29.09.2010 09:54, Christian Maeder wrote:
> On 29.09.2010 05:35, Peter Schmitz wrote:
> [...]
>> -- Parsers:
>> taggedContent = do
>>     optionalWhiteSpace
>>     aTag
>>     many tagOrContent
>>     aTag
>
> "many tagOrContent" will consume all tags, so that no tag for the
> following "aTag" will be left.

If you want to match a final tag, you could try:

    manyTill tagOrContent (try (aTag >> eof))

> Cheers
> Christian
>
>> [...]
>> aTag = do
>>     tagBegin
>>     xs <- many (noneOf [tagEndChar])

This also looks like "manyTill anyChar tagEnd".

C.
--------
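
Editorial sketch of the two suggestions in this message, put together into something that compiles. The definition of `someContent` (one or more non-bracket characters) is an assumption that differs from the original post, chosen so that content stops before '<' and at end of input.

```haskell
import Text.ParserCombinators.Parsec

-- Tag body written with manyTill: read characters up to and including '>'.
aTag :: Parser ()
aTag = do
    _ <- char '<'
    _ <- manyTill anyChar (char '>')
    spaces

-- Assumed content parser: one or more characters that are not brackets.
someContent :: Parser ()
someContent = skipMany1 (noneOf "<>")

tagOrContent :: Parser ()
tagOrContent = aTag <|> someContent <?> "tagOrContent"

-- Before each item, try "final tag followed by EOF". The `try` makes
-- that attempt backtrack when the tag turns out not to be the last one.
taggedContent :: Parser String
taggedContent = do
    spaces
    aTag
    _ <- manyTill tagOrContent (try (aTag >> eof))
    return "Parse complete."
```

With these definitions, an input such as "<a>text<b>" parses, while "<a>text" is rejected because no final tag precedes the end of input.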

On 29.09.2010 11:55, Christian Maeder wrote:
> On 29.09.2010 09:54, Christian Maeder wrote:
>> On 29.09.2010 05:35, Peter Schmitz wrote:
>> [...]
>>> -- Parsers:
>>> taggedContent = do
>>>     optionalWhiteSpace
>>>     aTag
>>>     many tagOrContent
>>>     aTag
>>
>> "many tagOrContent" will consume all tags, so that no tag for the
>> following "aTag" will be left.
>
> If you want to match a final tag, you could try:
>
>     manyTill tagOrContent (try (aTag >> eof))

Better yet, avoiding backtracking, return different things for aTag and
someContent and check whether the last entry is a tag:

    tagOrContent = fmap Left aTag <|> fmap Right someContent

    taggedContent = do
        spaces
        aTag
        l <- many tagOrContent
        eof
        case reverse l of
            Left _ : _ -> return ()
            _ -> fail "expected final tag before EOF"

C.

> [...]
--------
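
Editorial sketch: a self-contained version of the `Either`-based check. Two things here are assumptions rather than part of the thread: `someContent` is defined as one or more non-bracket characters, and `taggedContent` returns the parsed items instead of `()` so the result can be inspected.

```haskell
import Text.ParserCombinators.Parsec

-- A tag, returning its text; whitespace after the tag is skipped.
aTag :: Parser String
aTag = do
    _ <- char '<'
    name <- manyTill anyChar (char '>')
    spaces
    return name

-- Assumed content parser: one or more characters that are not brackets.
someContent :: Parser String
someContent = many1 (noneOf "<>")

-- Left marks a tag, Right marks content.
tagOrContent :: Parser (Either String String)
tagOrContent = fmap Left aTag <|> fmap Right someContent

-- Parse everything first, then check that the last item is a tag.
-- No `try` is needed because nothing has to be re-parsed.
taggedContent :: Parser [Either String String]
taggedContent = do
    spaces
    first <- aTag
    rest <- many tagOrContent
    eof
    case reverse rest of
        Left _ : _ -> return (Left first : rest)
        _          -> fail "expected final tag before EOF"
```

As in the code quoted above, a document consisting of a single tag is rejected, because the check looks only at the items after the opening tag.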

Antoine and Christian:

Many thanks for your help on this thread.
(I am still digesting it; much appreciated; will post when I get it
working.)

-- Peter

Participants (3): Antoine Latter, Christian Maeder, Peter Schmitz