Parsing R5RS Scheme with Parsec

Hallo, For fun and learning I'm trying to parse R5RS Scheme with Parsec. The code to parse lists follows: -- -- Lists -- parseLeftList :: Parser [SchDatum] parseLeftList = do char '(' many parseDatum >>= return . filter (/= SchAtmosphere) parseDottedList :: [SchDatum] -> Parser SchDatum parseDottedList ls = do char '.' many1 parseAtmosphere d <- parseDatum many parseAtmosphere char ')' return $ SchDottedList ls d parseProperList :: [SchDatum] -> Parser SchDatum parseProperList ls = do char ')' return $ SchList ls parseList :: Parser SchDatum parseList = do ls <- parseLeftList (parseDottedList ls) <|> (parseProperList ls) I've factored out the common left sub-expression in parseLeftList. The problem is that "..." is a valid identifier so when inside the left of the list the parser sees a single dot, it tries to match it with "...", which fails. Can anybody give advice on how to rewrite these list parsing functions? Cheers, -- -alex http://www.ventonegro.org/

On Oct 2, 2007, at 9:52 , Alex Queiroz wrote:
(parseDottedList ls) <|> (parseProperList ls)
I've factored out the common left sub-expression in parseLeftList. The problem is that "..." is a valid identifier so when inside the left of the list the parser sees a single dot, it tries to match it with "...", which fails. Can anybody give advice on how to rewrite these list parsing functions?
try (parseDottedList ls) <|> parseProperList ls Overuse of try is a bad idea because it's slow, but sometimes it's the only way to go; it insures backtracking in cases like this. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Hallo,
On 10/2/07, Brandon S. Allbery KF8NH
On Oct 2, 2007, at 9:52 , Alex Queiroz wrote:
(parseDottedList ls) <|> (parseProperList ls)
I've factored out the common left sub-expression in parseLeftList. The problem is that "..." is a valid identifier so when inside the left of the list the parser sees a single dot, it tries to match it with "...", which fails. Can anybody give advice on how to rewrite these list parsing functions?
try (parseDottedList ls) <|> parseProperList ls
Overuse of try is a bad idea because it's slow, but sometimes it's the only way to go; it insures backtracking in cases like this.
This does not work. The parser chokes in parseLeftList, because it finds a single dot which is not the beginning of "...". Cheers, -- -alex http://www.ventonegro.org/

On Oct 2, 2007, at 10:36 , Alex Queiroz wrote:
This does not work. The parser chokes in parseLeftList, because it finds a single dot which is not the beginning of "...".
Sorry, just woke up and still not quite tracking right, so I modified the wrong snippet of code. The trick is to wrap parseLeftList in a try, so the parser retries the next alternative when it fails. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Hallo,
On 10/2/07, Brandon S. Allbery KF8NH
Sorry, just woke up and still not quite tracking right, so I modified the wrong snippet of code. The trick is to wrap parseLeftList in a try, so the parser retries the next alternative when it fails.
Since "..." can only appear at the end of a list, I removed "..." from the possible symbols and added a new function: parseThreeDottedList :: [SchDatum] -> Parser SchDatum parseThreeDottedList ls = do string "..." many parseAtmosphere char ')' return $ SchList $ ls ++ [SchSymbol "..."] parseList :: Parser SchDatum parseList = do ls <- parseLeftList try (parseThreeDottedList ls) <|> (parseDottedList ls) <|> (parseProperList ls) Thanks for the help. Cheers, -- -alex http://www.ventonegro.org/

I don't know if this applies to Scheme parsing, but I find it's often
helpful to introduce a tokenizer into the parser to centralize the use
of "try" to one place::
type Token = String
tokRaw :: Parser Token
tokRaw = {- implement this yourself depending on language spec -}
tok :: Parser Token
tok = do
t <- tokRaw
many spaces
return t
-- wrap your outside parser with this; it gets the tokenizer
-- started because we only handle eating spaces after tokens,
-- not before
startParser :: Parser a -> Parser a
startParser a = many spaces >> a
sat :: (Token -> Maybe a) -> Parser a
sat f = try $ do
t <- tok
case f t of
Nothing -> mzero
Just a -> return a
lit :: Token -> Parser Token
lit s = sat (test s) > s where
test s t = if (s == t) then Just s else Nothing
Now if you replace uses of "string" and "char" in your code with
"lit", you avoid the problem of parses failing because they consumed
some input from the "wrong" token type before failing.
On 10/2/07, Alex Queiroz
Hallo,
On 10/2/07, Brandon S. Allbery KF8NH
wrote: Sorry, just woke up and still not quite tracking right, so I modified the wrong snippet of code. The trick is to wrap parseLeftList in a try, so the parser retries the next alternative when it fails.
Since "..." can only appear at the end of a list, I removed "..." from the possible symbols and added a new function:
parseThreeDottedList :: [SchDatum] -> Parser SchDatum parseThreeDottedList ls = do string "..." many parseAtmosphere char ')' return $ SchList $ ls ++ [SchSymbol "..."]
parseList :: Parser SchDatum parseList = do ls <- parseLeftList try (parseThreeDottedList ls) <|> (parseDottedList ls) <|> (parseProperList ls)
Thanks for the help.
Cheers, -- -alex http://www.ventonegro.org/ _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Tue, Oct 02, 2007 at 11:36:52AM -0300, Alex Queiroz wrote:
Hallo,
On 10/2/07, Brandon S. Allbery KF8NH
wrote: On Oct 2, 2007, at 9:52 , Alex Queiroz wrote:
(parseDottedList ls) <|> (parseProperList ls)
I've factored out the common left sub-expression in parseLeftList. The problem is that "..." is a valid identifier so when inside the left of the list the parser sees a single dot, it tries to match it with "...", which fails. Can anybody give advice on how to rewrite these list parsing functions?
try (parseDottedList ls) <|> parseProperList ls
Overuse of try is a bad idea because it's slow, but sometimes it's the only way to go; it insures backtracking in cases like this.
This does not work. The parser chokes in parseLeftList, because it finds a single dot which is not the beginning of "...".
I suggest left-factoring. parseThingyOrEOL = (space >> parseThingyOrEOL) <|> (fmap Left parseAtom) <|> (char '.' >> parseThingyOrEOL >>= \(Left x) -> Right x) <|> (char ')' >> return (Right nil)) <|> (char '(' >> fmap Left (fix (\ plist -> do obj <- parseThingyOrEOL case obj of Left x -> fmap (Cons x) plist Right x -> return x))) etc. Stefan
participants (4)
-
Alex Queiroz
-
Brandon S. Allbery KF8NH
-
Ryan Ingram
-
Stefan O'Rear