tokenizing a string and parsing the string

In combinator parsing with, say, Parsec, you don't tokenize the input
before parsing it - this is an instance of so-called "scannerless" parsing
(a slight exaggeration for the sake of simplicity).
If you need to tokenize and then parse, that is the model followed by
Alex and Happy.
On 12 October 2011 06:28, kolli kolli wrote:
Can anyone help with how to tokenize a string and parse it?
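
To make the "scannerless" style above concrete, here is a minimal sketch that parses characters directly with Parsec, with no separate token type. The grammar (integers joined by '+') and the names number, sumExpr and parseSum are made up for illustration; they are not from the thread.

```haskell
import Text.Parsec (ParseError, char, digit, eof, many1, parse, sepBy1)
import Text.Parsec.String (Parser)

-- One or more digits, read straight from the character stream.
number :: Parser Int
number = read <$> many1 digit

-- Numbers separated by '+'; the result is their sum.
-- eof makes the parser reject trailing garbage.
sumExpr :: Parser Int
sumExpr = sum <$> (number `sepBy1` char '+') <* eof

-- Hypothetical entry point: the input String is parsed as-is.
parseSum :: String -> Either ParseError Int
parseSum = parse sumExpr "<input>"
```

In GHCi, parseSum "1+20+300" evaluates to Right 321, while parseSum "1+" gives a Left with a parse error. Whitespace is not handled at all in this sketch.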

Stephen Tetley wrote:
In combinator parsing with, say, Parsec, you don't tokenize the input before parsing it - this is an instance of so-called "scannerless" parsing (a slight exaggeration for the sake of simplicity).
If you need to tokenize and then parse, that is the model followed by Alex and Happy.
It is actually possible to use alex to split the input into tokens and then use Parsec to parse the stream of tokens. Token parsers tend to run a bit faster than Char parsers.

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
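
A rough sketch of the two-stage approach Erik describes, under some loud assumptions: the tokenize function below is a hand-written stand-in for an Alex-generated scanner (Alex itself needs a separate .x file and a build step), and the Token type, the grammar and all function names are invented for illustration. The Parsec side uses tokenPrim to match on tokens instead of characters.

```haskell
import Data.Char (isDigit, isSpace)
import Text.Parsec (Parsec, eof, parse, sepBy1, tokenPrim)

-- Hypothetical token type for a tiny "1 + 2 + 3" language.
data Token = TNum Int | TPlus
  deriving (Eq, Show)

-- Stage 1: hand-written stand-in for an Alex-generated scanner.
tokenize :: String -> Either String [Token]
tokenize [] = Right []
tokenize s@(c:cs)
  | isSpace c = tokenize cs
  | c == '+'  = (TPlus :) <$> tokenize cs
  | isDigit c = let (ds, rest) = span isDigit s
                in (TNum (read ds) :) <$> tokenize rest
  | otherwise = Left ("unexpected character: " ++ [c])

-- Stage 2: a Parsec parser whose input stream is [Token], not String.
type TokParser = Parsec [Token] ()

-- Match a single token. Source positions are left unchanged for
-- brevity, so positions in error messages are not meaningful here.
satisfyTok :: (Token -> Maybe a) -> TokParser a
satisfyTok = tokenPrim show (\pos _ _ -> pos)

numTok :: TokParser Int
numTok = satisfyTok (\t -> case t of { TNum n -> Just n; _ -> Nothing })

plusTok :: TokParser ()
plusTok = satisfyTok (\t -> if t == TPlus then Just () else Nothing)

sumTokens :: TokParser Int
sumTokens = sum <$> (numTok `sepBy1` plusTok) <* eof

-- Run both stages; either stage can fail.
parseSum :: String -> Either String Int
parseSum s = do
  toks <- tokenize s
  either (Left . show) Right (parse sumTokens "<tokens>" toks)
```

Here parseSum " 1 + 20+300 " evaluates to Right 321. Note that whitespace is dealt with once, in the tokenizer, so the grammar rules never see it. A real scanner would also attach source positions to each token so that the position function passed to tokenPrim could report them.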

Despite the term "scannerless" parsing, you'll typically have "lexical rules" for the tokens (identifiers, numbers, separators, etc.) and normal parser/grammar rules. I recommend using Parsec as the scanner too (and avoiding a separate tokenizer). I don't think speed matters that much. The point is that after every token, the spaces or comments up to the start of the next token must be consumed from the input. (I call this "skipping"; Daan Leijen has a "lexeme" parser for this in his Parsec.Token module.)

HTH Christian

On 12.10.2011 10:39, Erik de Castro Lopo wrote:
It is actually possible to use alex to split the input into tokens and then use Parsec to parse the stream of tokens. Token parsers tend to run a bit faster than Char parsers.
Erik
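
Christian's "skipping" idea can be sketched by hand in a few lines. This is only an illustration: the lexeme helper below is defined locally and skips whitespace only (not comments), whereas the lexeme in Parsec.Token is a field of the token parser record built by makeTokenParser. The Binding type and the "name = number" grammar are invented for the example.

```haskell
import Text.Parsec (ParseError, char, digit, eof, letter, many1, parse, spaces)
import Text.Parsec.String (Parser)

-- Run a token-level parser, then skip whitespace up to the next token.
-- (Local helper; comment skipping is left out of this sketch.)
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- "Lexical rules": each one is wrapped in lexeme.
identifier :: Parser String
identifier = lexeme (many1 letter)

number :: Parser Int
number = lexeme (read <$> many1 digit)

symbol :: Char -> Parser Char
symbol = lexeme . char

-- "Grammar rule" built only from the lexical rules: name = number
data Binding = Binding String Int
  deriving (Eq, Show)

binding :: Parser Binding
binding = Binding <$> identifier <* symbol '=' <*> number

-- Leading whitespace is skipped once, up front; after that every
-- token cleans up the whitespace that follows it.
parseBinding :: String -> Either ParseError Binding
parseBinding = parse (spaces *> binding <* eof) "<input>"
```

With this, parseBinding "  x   =  42  " evaluates to Right (Binding "x" 42). The grammar rule never mentions whitespace, which is the point of wrapping every lexical rule in lexeme.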
participants (4)
- Christian Maeder
- Erik de Castro Lopo
- kolli kolli
- Stephen Tetley