Re: [Haskell-cafe] Token parsers in parsec consume trailing whitespace

14 Dec 2009

Hi Edward,
...
> 1. Is there a more elegant way of doing number parsing?  In
> particular, are there token parsers that don't consume trailing
> whitespace, or is there a better way to do this with the
> primitives.
Parsec defines a combinator called 'lexeme', and the tokenizer wraps
each of its parsers in it.  The purpose of the tokenizer is to give
you a set of parsing combinators that skip whitespace and comments,
and that handle some other chores such as checking for collisions
with reserved keywords.  Consuming the trailing whitespace is not a
bug; it's an abstraction layer, and Parsec is consistent about using
this abstraction only in the Token module.
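
To make the 'lexeme' idea concrete, here is a minimal sketch.  The
names myLexeme, rawDigits and withRest are made up for illustration;
Parsec's real 'lexeme' lives in Text.Parsec.Token and also skips
comments, which this simplified stand-in does not.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for Parsec's 'lexeme': run p, then skip any
-- trailing whitespace.  (The real one also skips comments.)
myLexeme :: Parser a -> Parser a
myLexeme p = p <* spaces

-- A raw parser: one or more digits, nothing else is consumed.
rawDigits :: Parser String
rawDigits = many1 digit

-- Illustration helper: return the result together with the input
-- that the parser left unconsumed.
withRest :: Parser a -> String -> Either ParseError (a, String)
withRest p = parse ((,) <$> p <*> getInput) "<example>"
```

Running withRest rawDigits on "42  rest" leaves "  rest" as the
remaining input, while withRest (myLexeme rawDigits) leaves "rest".
That difference is all the token layer adds on top of the raw parser
in this sketch.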

Unfortunately, the 'nat' function in Token is not also defined in
Parsec's Char module, so you need to either copy-paste that code or
roll your own.
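
Rolling your own can be quite short.  The following is only a sketch
built from the Char primitives, not a copy of the code in Token;
natNoSpace and dims are hypothetical names, and it handles plain
decimal digits only (no sign, no hex or octal).

```haskell
import Data.Char (digitToInt)
import Data.List (foldl')
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical natural-number parser: one or more decimal digits,
-- with no whitespace skipped before or after.
natNoSpace :: Parser Integer
natNoSpace = do
  ds <- many1 digit
  return (foldl' step 0 ds)
  where
    step acc d = 10 * acc + toInteger (digitToInt d)

-- Example use: a "WIDTHxHEIGHT" pair in which spaces are not allowed.
dims :: Parser (Integer, Integer)
dims = (,) <$> natNoSpace <* char 'x' <*> natNoSpace
```

With this, parse dims "" "12x34" succeeds, whereas "12 x34" is
rejected because nothing in natNoSpace swallows the space before
the 'x'.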
...
> It seems that the "token" approach of parsing lends itself
> to a different style of parsing than the one I'm doing
That's correct.  Sounds to me like you shouldn't bother creating a
tokenizer.  You might even be able to get away with using the regex
library instead of Parsec.

-Greg

Greg Fitzgerald