Parsec is being weird at me

Anybody want to explain to me why this doesn't work?

GHC Interactive, version 6.6.1, for Haskell 98.
http://www.haskell.org/ghc/
Type :? for help.

Loading package base ... linking ... done.
Prelude> :m Text.ParserCombinators.Parsec
Prelude Text.ParserCombinators.Parsec> parseTest (endBy anyToken (char '#')) "abc#"
Loading package parsec-2.0 ... linking ... done.
parse error at (line 1, column 1):
unexpected "b"
expecting "#"

Andrew Coppin wrote:
Prelude> :m Text.ParserCombinators.Parsec
Prelude Text.ParserCombinators.Parsec> parseTest (endBy anyToken (char '#')) "abc#"
Loading package parsec-2.0 ... linking ... done.
parse error at (line 1, column 1):
unexpected "b"
expecting "#"

I read the doc and determined that it is perfectly correct behaviour. Hint: anyToken becomes anyChar because your input is [Char].

Andrew Coppin wrote:
Anybody want to explain to me why this doesn't work?
Loading package base ... linking ... done.
Prelude> :m Text.ParserCombinators.Parsec
Prelude Text.ParserCombinators.Parsec> parseTest (endBy anyToken (char '#')) "abc#"
Loading package parsec-2.0 ... linking ... done.
parse error at (line 1, column 1):
unexpected "b"
expecting "#"

anyToken is singular: it accepts a single token, in this case 'a'. Then endBy expects (char '#') to match and reads 'b' instead and gives the error message. So using (many anyToken) gets further:

Prelude Text.ParserCombinators.Parsec> parseTest (endBy (many anyToken) (char '#')) "abc#"
Loading package parsec-2.0 ... linking ... done.
parse error at (line 1, column 1):
unexpected end of input
expecting "#"

Here (many anyToken) reads all of "abc#", including the '#'. Then endBy wants to read (char '#') and gets the end of input instead. So the working version of endBy is thus:

Prelude Text.ParserCombinators.Parsec> parseTest (endBy (many (noneOf "#")) (char '#')) "abc#"
["abc"]

Or you may need to not use endBy...
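
The three attempts discussed above can be collected into one file for comparison. This is only a sketch: the names run, singleItem, greedyItem and segments are made up for illustration, and it assumes a Parsec version that still exports the Text.ParserCombinators.Parsec module.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical helper: run a parser on a String and drop the error details.
run :: Parser a -> String -> Maybe a
run p = either (const Nothing) Just . parse p ""

-- anyToken accepts exactly one token ('a'); '#' is then expected but 'b' follows.
singleItem :: Parser [Char]
singleItem = endBy anyToken (char '#')

-- many anyToken swallows the whole input, '#' included, so the terminator
-- parser only ever sees the end of input.
greedyItem :: Parser [String]
greedyItem = endBy (many anyToken) (char '#')

-- The item parser refuses '#', so it stops just before the terminator.
segments :: Parser [String]
segments = endBy (many (noneOf "#")) (char '#')

main :: IO ()
main = do
  print (run singleItem "abc#")     -- Nothing
  print (run greedyItem "abc#")     -- Nothing
  print (run segments "abc#")       -- Just ["abc"]
  print (run segments "abc#def#")   -- Just ["abc","def"]
```

The point of the last definition is that the item parser and the terminator parser never compete for the same character.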

ChrisK wrote:
Or you may need to not use endBy...
But hang on a minute...

"many" parses 0 or more occurrences of an item.
"sepBy" parses 0 or more occurrences of an item, separated by another item.
"endBy" parses 0 or more occurrences of an item, terminated by another item.
"sepEndBy" parses 0 or more occurrences of an item, separated *and* terminated by another item.

...except that "endBy" doesn't seem to be working right. :-S
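
For reference, here is a small sketch of how the four combinators listed above behave when the item parser cannot match the separator. The item parser (many1 digit), the ';' separator and the helper run are assumptions chosen for illustration; they do not come from the original problem. Note that sepEndBy treats the trailing separator as optional.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical helper: run a parser on a String and drop the error details.
run :: Parser a -> String -> Maybe a
run p = either (const Nothing) Just . parse p ""

-- Illustrative item parser: one or more digits. It can never match ';'.
item :: Parser String
item = many1 digit

semi :: Parser Char
semi = char ';'

main :: IO ()
main = do
  print (run (many item) "123")             -- Just ["123"]
  print (run (sepBy item semi) "1;22;333")  -- Just ["1","22","333"]
  print (run (endBy item semi) "1;22;")     -- Just ["1","22"]
  print (run (endBy item semi) "1;22")      -- Nothing: last item has no ';'
  print (run (sepEndBy item semi) "1;22")   -- Just ["1","22"]
  print (run (sepEndBy item semi) "1;22;")  -- Just ["1","22"]
```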

On Sat, Aug 25, 2007 at 08:18:29PM +0100, Andrew Coppin wrote:
...except that "endBy" doesn't seem to be working right. :-S
There is one other little bit of documented behavior. Parsec's normal combinators only parse LL(1) grammars. Consult any work on formal languages for the exact meaning and all the consequences. For this example, however, it serves to note that after seeing abc, the single character of lookahead '#' is not sufficient to determine the correct parse.

Stefan

Stefan O'Rear wrote:
There is one other little bit of documented behavior. Parsec's normal combinators only parse LL(1) grammars. Consult any work on formal languages for the exact meaning and all the consequences. For this example, however, it serves to note that after seeing abc, the single character of lookahead '#' is not sufficient to determine the correct parse.
Heh. Starting to wish I had a significantly higher IQ... I thought the whole *purpose* of the endBy combinator was to keep applying one parser until the other one succeeds? In the example I posted, the two parsers are quite trivial. But in the real problem I actually want to solve, they are very non-trivial...
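
If the goal is "keep applying one parser until another one succeeds", Parsec's manyTill combinator appears to be the closer match: it tries the end parser first at every step and only applies the item parser when the end parser fails. The sketch below is illustrative only. The names run, upToHash, upToArrow and chunks are made up, and the "-->" terminator is an assumed stand-in for a non-trivial end parser, wrapped in try so that a partial match is backtracked.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical helper: run a parser on a String and drop the error details.
run :: Parser a -> String -> Maybe a
run p = either (const Nothing) Just . parse p ""

-- Apply anyChar until char '#' succeeds. The '#' is consumed but not returned.
upToHash :: Parser String
upToHash = manyTill anyChar (char '#')

-- A multi-character terminator. try lets a partial match such as "-b"
-- be undone, so the '-' is then consumed as an ordinary item.
upToArrow :: Parser String
upToArrow = manyTill anyChar (try (string "-->"))

-- Zero or more terminated chunks in a row.
chunks :: Parser [String]
chunks = many upToHash

main :: IO ()
main = do
  print (run upToHash "abc#")      -- Just "abc"
  print (run upToArrow "a-b-->c")  -- Just "a-b"
  print (run chunks "ab#cd#")      -- Just ["ab","cd"]
```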

participants (4)
- Albert Y. C. Lai
- Andrew Coppin
- ChrisK
- Stefan O'Rear