Keeping a symbol table with Parsec

Folks,

Are there any examples of keeping a symbol table with Parsec?

I'm translating a parser from OCaml and I do this

    OUTPUT COLON ID LP NUMERIC_SIMPLE RP { add $3 TypNumOut; SimpleOutputDec ($3, Number) }

Meaning that if the keyword Output is followed by ":", an identifier, and then "(NumericSimple)", the identifier is added to the symbol table as a Number and boxed in a constructor.

Then in my lexer I do a lookup to check whether I have seen this identifier, and if I have seen one of type TypNumOut I return the token NUM instead of ID. This ensures that I can have rules with the token NUM as opposed to ID everywhere.

How would I accomplish the same with Parsec, that is 1) update a symbol table and 2) check identifiers and "return a different token"?

Thanks, Joel

-- http://wagerlabs.com/

Section 2.12 of the Parsec manual[1] discusses "user state." It sounds
like that is what you are after.
Hope that helps,
Nick
[1] - http://www.cs.uu.nl/~daan/download/parsec/parsec.pdf
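
For readers without the manual at hand, here is a minimal sketch of what user state looks like, assuming the Text.ParserCombinators.Parsec interface (GenParser, getState, updateState, runParser). The parsers word and wordsAndCount are hypothetical names made up for this illustration; they simply count the words parsed and keep the running count in the user state.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical example: the user state is an Int counting words seen so far.
word :: GenParser Char Int String
word = do
  w <- many1 letter
  updateState (+ 1)   -- bump the counter held in the user state
  spaces              -- skip trailing whitespace
  return w

-- Parse all words up to end of input, then read the final count back.
wordsAndCount :: GenParser Char Int ([String], Int)
wordsAndCount = do
  ws <- many word
  eof
  n <- getState
  return (ws, n)

main :: IO ()
main = print (runParser wordsAndCount 0 "example" "foo bar baz")
```

The second argument of runParser is the initial state. A symbol table works the same way, with a set or map in place of the Int.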

On Apr 2, 2007, at 11:17 PM, Nicolas Frisby wrote:
> Section 2.12 of the Parsec manual[1] discusses "user state." It sounds like that is what you are after.

Yes, thanks. My question is mostly about how to "return a different token" when the lexer finds an identifier that's already in the symbol table.

-- http://wagerlabs.com/

Joel Reymont wrote:
> Meaning that if a keyword Output is followed by ":" and an identifier and then "(NumericSimple)" then add identifier to the symbol table as a Number and box it in a constructor.
> Then in my lexer I do a lookup to check if I have seen this identifier and if I have seen one of type TypeNumOut I return the token NUM instead of ID. This ensures that I can have rules with the token NUM as opposed to ID everywhere.

I use a set of strings for the symbol table (I don't record the types of the identifiers, but you can add that back). I don't allow for whitespace, but you can add that back too. The parser returns a string rather than a constructor wrapping a string; again, you can add that back.

It is necessary to fuse the lexer and the parser together so that they share state, but we can fuse them in a way that still leaves a recognizable boundary. In the code below, string "blah", ident, num, name, and numeric_simple are lexers (so when you add whitespace back you know who the suspects are), and p0 is a parser that calls the lexers and does the extra work.

The name lexer returns a sum type, so you can use its two cases to signify whether a name is in the table or not; ident and num then fail on the wrong case. (Alternatively, you can eliminate the sum type by copying the name code into the ident code and the num code.)

    import Text.ParserCombinators.Parsec
    import Control.Monad (mzero)
    import qualified Data.Set as Set

    main = do { input <- getLine
              ; print (runParser p0 Set.empty "stdin" input)
              }

    -- Parser: recognizes Output:<ident>(<digits>) and records the identifier.
    p0 = do { string "Output"
            ; string ":"
            ; i <- ident
            ; string "("
            ; numeric_simple
            ; string ")"
            ; updateState (Set.insert i)
            ; return i
            }

    numeric_simple = many digit

    -- A name, tagged by whether it is already in the symbol table.
    data Name = NUM String | ID String

    name = do { c0 <- letter
              ; cs <- many alphaNum
              ; let { n = c0 : cs }
              ; table <- getState
              ; return (if n `Set.member` table then NUM n else ID n)
              }

    -- Succeeds only on a name that is not yet in the table.
    ident = do { n <- name
               ; case n of { ID i -> return i ; _ -> mzero }
               }

    -- Succeeds only on a name that is already in the table.
    num = do { n <- name
             ; case n of { NUM i -> return i ; _ -> mzero }
             }
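
As a rough sketch of what "adding it back" could look like, the variant below records a type for each identifier in a Data.Map, skips whitespace after every lexer, and returns a constructor. The names Typ, Decl, SymTab, P, lexeme, symbol, outputDec, and program are assumptions made up for this illustration (only TypNumOut and SimpleOutputDec echo the original question), and the literal keyword "NumericSimple" is assumed in place of digits. One deliberate difference: ident and num are wrapped in try, because name has already consumed input by the time they decide to fail, and without try Parsec would not backtrack to let an alternative such as ident <|> num be attempted.

```haskell
import Text.ParserCombinators.Parsec
import qualified Data.Map as Map

-- Hypothetical types, loosely named after the constructors in the question.
data Typ = TypNumOut deriving (Eq, Show)
data Decl = SimpleOutputDec String deriving (Eq, Show)

type SymTab = Map.Map String Typ
type P = GenParser Char SymTab

-- Run a lexer, then skip trailing whitespace.
lexeme :: P a -> P a
lexeme p = do { x <- p; spaces; return x }

-- A fixed piece of text, treated as one token.
symbol :: String -> P String
symbol = lexeme . try . string

-- A raw name together with its symbol table entry, if any.
name :: P (String, Maybe Typ)
name = lexeme $ do
  c0 <- letter
  cs <- many alphaNum
  table <- getState
  let n = c0 : cs
  return (n, Map.lookup n table)

-- "ID" token: a name that has not been declared yet.
ident :: P String
ident = try $ do
  (n, t) <- name
  case t of
    Nothing -> return n
    Just _  -> unexpected ("declared name " ++ show n)

-- "NUM" token: a name previously declared with type TypNumOut.
num :: P String
num = try $ do
  (n, t) <- name
  case t of
    Just TypNumOut -> return n
    _              -> unexpected ("undeclared name " ++ show n)

-- Output : <ident> ( NumericSimple ), recording the identifier in the table.
outputDec :: P Decl
outputDec = do
  _ <- symbol "Output"
  _ <- symbol ":"
  i <- ident
  _ <- symbol "("
  _ <- symbol "NumericSimple"
  _ <- symbol ")"
  updateState (Map.insert i TypNumOut)
  return (SimpleOutputDec i)

-- Hypothetical top level: one declaration followed by one use of a NUM.
program :: P (Decl, String)
program = do
  spaces
  d <- outputDec
  n <- num
  eof
  return (d, n)

main :: IO ()
main = do
  print (runParser program Map.empty "demo" "Output: x (NumericSimple) x")
  print (runParser program Map.empty "demo" "Output: x (NumericSimple) y")
```

The first input should parse, since x is declared before it is used as a NUM; the second should be rejected, because y is not in the table when num looks it up.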
Participants (3): Albert Y. C. Lai, Joel Reymont, Nicolas Frisby