Parsec - separating Parsing from Lexing - Haskell-Cafe

10 Nov 2009

Hello.

I'm currently implementing a MicroJava compiler for a college assignment
(the language to be compiled was specified, but the implementation language
was our choice).

I've successfully implemented the lexer using Parsec. It has the type String
-> Parser [MJVal], where MJVal is the type of all possible tokens.

However, I don't know how to implement the parser, or at least how to do it
while keeping it separate from the lexer.

For example, the MicroJava grammar specifies:

Program = "program" ident {ConstDecl | VarDecl | ClassDecl}
          "{" {MethodDecl} "}".

The natural solution (for me) would be:

program = do
  string "program"
  programName <- identifier
  ...

However, I can't do this because the file is already tokenized. What I have
is something like:
[Program_, identifier_ "testProgram", lBrace_, ...]
for the example program:

program testProgram {
...

How should I implement the parser separated from the lexer? That is, how
should I parse Tokens instead of Strings in the "Haskell way"?

Fernando Henrique Sanches
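
A minimal sketch of one common way to do what the question asks: Parsec's
tokenPrim builds a primitive parser over any token type, so the parser can
consume a [MJVal] stream instead of a String. The post does not show the
definition of MJVal, so the constructors below (TProgram, TIdent, TLBrace,
TRBrace) and the helper names (TokParser, satisfyTok, tok, parseProgram) are
hypothetical stand-ins. The sketch assumes the parsec 3 module layout
(Text.Parsec) and tokens that carry no source positions.

```haskell
import Text.Parsec (Parsec, ParseError, parse, tokenPrim, eof, (<?>))
import Text.Parsec.Pos (incSourceColumn)

-- Hypothetical token type; the real MJVal from the post is not shown.
data MJVal
  = TProgram
  | TIdent String
  | TLBrace
  | TRBrace
  deriving (Show, Eq)

-- A parser whose input stream is a list of tokens rather than a String.
type TokParser a = Parsec [MJVal] () a

-- Core primitive: consume one token if the test function returns Just.
satisfyTok :: (MJVal -> Maybe a) -> TokParser a
satisfyTok test = tokenPrim show nextPos test
  where
    -- Assumption: tokens carry no positions, so count each token as a column.
    nextPos pos _tok _rest = incSourceColumn pos 1

-- Match one specific token and discard it.
tok :: MJVal -> TokParser ()
tok expected =
  satisfyTok (\t -> if t == expected then Just () else Nothing)
    <?> show expected

-- Match any identifier token and return its name.
identifier :: TokParser String
identifier = satisfyTok getIdent <?> "identifier"
  where
    getIdent (TIdent name) = Just name
    getIdent _             = Nothing

-- Program = "program" ident ... "{" ... "}"
-- The declarations and method bodies are left out of this sketch.
program :: TokParser String
program = do
  tok TProgram
  name <- identifier
  tok TLBrace
  tok TRBrace
  eof
  return name

-- Run the token parser on the lexer's output.
parseProgram :: [MJVal] -> Either ParseError String
parseProgram = parse program "<tokens>"
```

With these definitions, the grammar rules read much like the String-based
version in the question, with tok TProgram taking the place of
string "program". If the lexer also records where each token came from, one
option is to pair every token with its SourcePos and have nextPos return the
position of the next token, so that error messages point into the original
file.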
