
Hi, to solve this ICFP task http://www.cs.cornell.edu/icfp/task.htm I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer data GmlToken = -- structures FunctionToken TokenSequence | ArrayToken TokenSequence | -- control operators ApplyToken | IfToken | -- number operators AddiToken | AddfToken | ACosToken | ASinToken | ClampfToken | CosToken | DiviToken | DivfToken | EqiToken | EqfToken | FloorToken | FracToken | LessiToken | LessfToken | ModiToken | MuliToken | MulfToken | NegiToken | NegfToken | ToRealToken | SinToken | SqrtToken | SubiToken | SubfToken | -- points GetXToken | GetYToken | GetZToken | PointToken | -- arrays GetToken | LengthToken | -- environment IdentifierToken String | BinderToken String | -- types BoolToken Bool | IntToken Int | RealToken Double | StringToken String deriving Show And parseGml :: String -> [GmlToken] parseGml input = case parse parseList "gml" input of Left err -> error ("Parse error: " ++ (show err)) Right gml -> gml parseList = sepBy parseGml' spaces parseGml' = -- structures -- parseFunction -- <|> parseArray -- control operators parseControlOperators -- number operators <|> parseNumberOperators -- points <|> parsePointOperators -- arrays <|> parseArrayOperators -- types <|> parseBool <|> parseInteger <|> parseString -- environment <|> parseIdentifier <|> parseBinder parseArray = parseSequence '[' ']' parseFunction = parseSequence '{' '}' parseSequence start end = do char start array <- parseList char end return $ ArrayToken array parseControlOperators = parseApply <|> parseIf parseApply = do string "apply" return $ ApplyToken parseIf = do string "if" return $ IfToken parseNumberOperators = do string "addi" return $ AddiToken <|> do string "addf" return $ AddfToken <|> do string "acos" return $ ACosToken <|> do string "asind" return $ ASinToken <|> do string "clampf" return $ ClampfToken <|> do string "cos" return $ CosToken <|> do string "divi" return $ DiviToken <|> do string "divf" return $ DivfToken <|> do string "eqi" return $ EqiToken <|> do string "eqf" return $ EqfToken <|> do string "floor" return $ FloorToken <|> do string "frac" return $ FracToken <|> do string "lessi" return $ LessiToken <|> do string "lessf" return $ LessfToken <|> do string "modi" return $ ModiToken <|> do string "muli" return $ MuliToken <|> do string "mulf" return $ MulfToken <|> do string "negi" return $ NegiToken <|> do string "negf" return $ NegfToken <|> do string "real" return $ ToRealToken <|> do string "sin" return $ SinToken <|> do string "Sqrt" return $ SqrtToken <|> do string "subi" return $ SubiToken <|> do string "subf" return $ SubfToken parsePointOperators = do string "getx" return $ GetXToken <|> do string "gety" return $ GetYToken <|> do string "getz" return $ GetZToken <|> do string "point" return $ PointToken parseArrayOperators = do string "get" return $ GetToken <|> do string "length" return $ LengthToken parseInteger = liftM (IntToken . read) $ many1 digit parseBool = do string "true" return $ BoolToken True <|> do string "false" return $ BoolToken False parseString = do char '"' string <- many (noneOf "\"") char '"' return $ StringToken string parseIdentifier = do identifier <- many (noneOf " ") return $ IdentifierToken identifier parseBinder = do char '/' binder <- many (noneOf " ") return $ BinderToken binder parseComment = do char '%' many (noneOf "") newline return $ () spaces = skipMany1 space After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like: { /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact 12 fact fact apply * I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer: parseIdentifier = do identifier <- many (noneOf " ") return $ IdentifierToken identifier Please help me. Klaus

You should parse keywords using: keyword s = try (string s) >> notFollowedBy (letter <|> digit) C. Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
Hi,
to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_ I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
data GmlToken =
-- structures
FunctionToken TokenSequence |
ArrayToken TokenSequence |
-- control operators
ApplyToken |
IfToken |
-- number operators
AddiToken |
AddfToken |
ACosToken |
ASinToken |
ClampfToken |
CosToken |
DiviToken |
DivfToken |
EqiToken |
EqfToken |
FloorToken |
FracToken |
LessiToken |
LessfToken |
ModiToken |
MuliToken |
MulfToken |
NegiToken |
NegfToken |
ToRealToken |
SinToken |
SqrtToken |
SubiToken |
SubfToken |
-- points
GetXToken |
GetYToken |
GetZToken |
PointToken |
-- arrays
GetToken |
LengthToken |
-- environment
IdentifierToken String |
BinderToken String |
-- types
BoolToken Bool |
IntToken Int |
RealToken Double |
StringToken String deriving Show
And
parseGml :: String -> [GmlToken]
parseGml input = case parse parseList "gml" input of
Left err -> error ("Parse error: " ++ (show err))
Right gml -> gml
parseList = sepBy parseGml' spaces
parseGml' =
-- structures
-- parseFunction
-- <|> parseArray
-- control operators
parseControlOperators
-- number operators
<|> parseNumberOperators
-- points
<|> parsePointOperators
-- arrays
<|> parseArrayOperators
-- types
<|> parseBool
<|> parseInteger
<|> parseString
-- environment
<|> parseIdentifier
<|> parseBinder
parseArray = parseSequence '[' ']'
parseFunction = parseSequence '{' '}'
parseSequence start end = do char start
array <- parseList
char end
return $ ArrayToken array
parseControlOperators = parseApply <|> parseIf
parseApply = do string "apply"
return $ ApplyToken
parseIf = do string "if"
return $ IfToken
parseNumberOperators = do string "addi"
return $ AddiToken
<|> do string "addf"
return $ AddfToken
<|> do string "acos"
return $ ACosToken
<|> do string "asind"
return $ ASinToken
<|> do string "clampf"
return $ ClampfToken
<|> do string "cos"
return $ CosToken
<|> do string "divi"
return $ DiviToken
<|> do string "divf"
return $ DivfToken
<|> do string "eqi"
return $ EqiToken
<|> do string "eqf"
return $ EqfToken
<|> do string "floor"
return $ FloorToken
<|> do string "frac"
return $ FracToken
<|> do string "lessi"
return $ LessiToken
<|> do string "lessf"
return $ LessfToken
<|> do string "modi"
return $ ModiToken
<|> do string "muli"
return $ MuliToken
<|> do string "mulf"
return $ MulfToken
<|> do string "negi"
return $ NegiToken
<|> do string "negf"
return $ NegfToken
<|> do string "real"
return $ ToRealToken
<|> do string "sin"
return $ SinToken
<|> do string "Sqrt"
return $ SqrtToken
<|> do string "subi"
return $ SubiToken
<|> do string "subf"
return $ SubfToken
parsePointOperators = do string "getx"
return $ GetXToken
<|> do string "gety"
return $ GetYToken
<|> do string "getz"
return $ GetZToken
<|> do string "point"
return $ PointToken
parseArrayOperators = do string "get"
return $ GetToken
<|> do string "length"
return $ LengthToken
parseInteger = liftM (IntToken . read) $ many1 digit
parseBool = do string "true"
return $ BoolToken True
<|> do string "false"
return $ BoolToken False
parseString = do char '"'
string <- many (noneOf "\"")
char '"'
return $ StringToken string
parseIdentifier = do identifier <- many (noneOf " ")
return $ IdentifierToken identifier
parseBinder = do char '/'
binder <- many (noneOf " ")
return $ BinderToken binder
parseComment = do char '%'
many (noneOf "")
newline
return $ ()
spaces = skipMany1 space
After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like:
{ /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact
12 fact fact apply
* I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer:
parseIdentifier = *do* identifier <- many (noneOf " ") return $ *IdentifierToken* identifier
Please help me.
Klaus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Thanks Christian, I adapted the "keyword" parser and now "n" <-> "negi" does not occur. But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}". Any ideas? -----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 12:23 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules You should parse keywords using: keyword s = try (string s) >> notFollowedBy (letter <|> digit) C. Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
Hi,
to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_ I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
data GmlToken =
-- structures
FunctionToken TokenSequence |
ArrayToken TokenSequence |
-- control operators
ApplyToken |
IfToken |
-- number operators
AddiToken |
AddfToken |
ACosToken |
ASinToken |
ClampfToken |
CosToken |
DiviToken |
DivfToken |
EqiToken |
EqfToken |
FloorToken |
FracToken |
LessiToken |
LessfToken |
ModiToken |
MuliToken |
MulfToken |
NegiToken |
NegfToken |
ToRealToken |
SinToken |
SqrtToken |
SubiToken |
SubfToken |
-- points
GetXToken |
GetYToken |
GetZToken |
PointToken |
-- arrays
GetToken |
LengthToken |
-- environment
IdentifierToken String |
BinderToken String |
-- types
BoolToken Bool |
IntToken Int |
RealToken Double |
StringToken String deriving Show
And
parseGml :: String -> [GmlToken]
parseGml input = case parse parseList "gml" input of
Left err -> error ("Parse error: " ++ (show err))
Right gml -> gml
parseList = sepBy parseGml' spaces
parseGml' =
-- structures
-- parseFunction
-- <|> parseArray
-- control operators
parseControlOperators
-- number operators
<|> parseNumberOperators
-- points
<|> parsePointOperators
-- arrays
<|> parseArrayOperators
-- types
<|> parseBool
<|> parseInteger
<|> parseString
-- environment
<|> parseIdentifier
<|> parseBinder
parseArray = parseSequence '[' ']'
parseFunction = parseSequence '{' '}'
parseSequence start end = do char start
array <- parseList
char end
return $ ArrayToken array
parseControlOperators = parseApply <|> parseIf
parseApply = do string "apply"
return $ ApplyToken
parseIf = do string "if"
return $ IfToken
parseNumberOperators = do string "addi"
return $ AddiToken
<|> do string "addf"
return $ AddfToken
<|> do string "acos"
return $ ACosToken
<|> do string "asind"
return $ ASinToken
<|> do string "clampf"
return $ ClampfToken
<|> do string "cos"
return $ CosToken
<|> do string "divi"
return $ DiviToken
<|> do string "divf"
return $ DivfToken
<|> do string "eqi"
return $ EqiToken
<|> do string "eqf"
return $ EqfToken
<|> do string "floor"
return $ FloorToken
<|> do string "frac"
return $ FracToken
<|> do string "lessi"
return $ LessiToken
<|> do string "lessf"
return $ LessfToken
<|> do string "modi"
return $ ModiToken
<|> do string "muli"
return $ MuliToken
<|> do string "mulf"
return $ MulfToken
<|> do string "negi"
return $ NegiToken
<|> do string "negf"
return $ NegfToken
<|> do string "real"
return $ ToRealToken
<|> do string "sin"
return $ SinToken
<|> do string "Sqrt"
return $ SqrtToken
<|> do string "subi"
return $ SubiToken
<|> do string "subf"
return $ SubfToken
parsePointOperators = do string "getx"
return $ GetXToken
<|> do string "gety"
return $ GetYToken
<|> do string "getz"
return $ GetZToken
<|> do string "point"
return $ PointToken
parseArrayOperators = do string "get"
return $ GetToken
<|> do string "length"
return $ LengthToken
parseInteger = liftM (IntToken . read) $ many1 digit
parseBool = do string "true"
return $ BoolToken True
<|> do string "false"
return $ BoolToken False
parseString = do char '"'
string <- many (noneOf "\"")
char '"'
return $ StringToken string
parseIdentifier = do identifier <- many (noneOf " ")
return $ IdentifierToken identifier
parseBinder = do char '/'
binder <- many (noneOf " ")
return $ BinderToken binder
parseComment = do char '%'
many (noneOf "")
newline
return $ ()
spaces = skipMany1 space
After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like:
{ /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact
12 fact fact apply
* I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer:
parseIdentifier = *do* identifier <- many (noneOf " ") return $ *IdentifierToken* identifier
Please help me.
Klaus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Am 07.03.2011 13:48, schrieb Hauschild, Klaus (EXT):
Thanks Christian,
I adapted the "keyword" parser and now "n" <-> "negi" does not occur.
But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}".
Your redefinition of spaces (= skipMany1 space) is problematic and: parseList = sepBy parseGml' spaces Allow for the original (optional) spaces after parseGml': parseGml'' = liftM2 const parseGml' spaces parseList = many parseGml'' C. P.S. why do you call? many (noneOf "") manyTill anyChar newline or just: many (noneOf "\n") (a trailing newline will be skipped by spaces)
Any ideas?
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 12:23 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
You should parse keywords using:
keyword s = try (string s) >> notFollowedBy (letter <|> digit)
C.
Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
Hi,
to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_ I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
data GmlToken =
-- structures
FunctionToken TokenSequence |
ArrayToken TokenSequence |
-- control operators
ApplyToken |
IfToken |
-- number operators
AddiToken |
AddfToken |
ACosToken |
ASinToken |
ClampfToken |
CosToken |
DiviToken |
DivfToken |
EqiToken |
EqfToken |
FloorToken |
FracToken |
LessiToken |
LessfToken |
ModiToken |
MuliToken |
MulfToken |
NegiToken |
NegfToken |
ToRealToken |
SinToken |
SqrtToken |
SubiToken |
SubfToken |
-- points
GetXToken |
GetYToken |
GetZToken |
PointToken |
-- arrays
GetToken |
LengthToken |
-- environment
IdentifierToken String |
BinderToken String |
-- types
BoolToken Bool |
IntToken Int |
RealToken Double |
StringToken String deriving Show
And
parseGml :: String -> [GmlToken]
parseGml input = case parse parseList "gml" input of
Left err -> error ("Parse error: " ++ (show err))
Right gml -> gml
parseList = sepBy parseGml' spaces
parseGml' =
-- structures
-- parseFunction
-- <|> parseArray
-- control operators
parseControlOperators
-- number operators
<|> parseNumberOperators
-- points
<|> parsePointOperators
-- arrays
<|> parseArrayOperators
-- types
<|> parseBool
<|> parseInteger
<|> parseString
-- environment
<|> parseIdentifier
<|> parseBinder
parseArray = parseSequence '[' ']'
parseFunction = parseSequence '{' '}'
parseSequence start end = do char start
array <- parseList
char end
return $ ArrayToken array
parseControlOperators = parseApply <|> parseIf
parseApply = do string "apply"
return $ ApplyToken
parseIf = do string "if"
return $ IfToken
parseNumberOperators = do string "addi"
return $ AddiToken
<|> do string "addf"
return $ AddfToken
<|> do string "acos"
return $ ACosToken
<|> do string "asind"
return $ ASinToken
<|> do string "clampf"
return $ ClampfToken
<|> do string "cos"
return $ CosToken
<|> do string "divi"
return $ DiviToken
<|> do string "divf"
return $ DivfToken
<|> do string "eqi"
return $ EqiToken
<|> do string "eqf"
return $ EqfToken
<|> do string "floor"
return $ FloorToken
<|> do string "frac"
return $ FracToken
<|> do string "lessi"
return $ LessiToken
<|> do string "lessf"
return $ LessfToken
<|> do string "modi"
return $ ModiToken
<|> do string "muli"
return $ MuliToken
<|> do string "mulf"
return $ MulfToken
<|> do string "negi"
return $ NegiToken
<|> do string "negf"
return $ NegfToken
<|> do string "real"
return $ ToRealToken
<|> do string "sin"
return $ SinToken
<|> do string "Sqrt"
return $ SqrtToken
<|> do string "subi"
return $ SubiToken
<|> do string "subf"
return $ SubfToken
parsePointOperators = do string "getx"
return $ GetXToken
<|> do string "gety"
return $ GetYToken
<|> do string "getz"
return $ GetZToken
<|> do string "point"
return $ PointToken
parseArrayOperators = do string "get"
return $ GetToken
<|> do string "length"
return $ LengthToken
parseInteger = liftM (IntToken . read) $ many1 digit
parseBool = do string "true"
return $ BoolToken True
<|> do string "false"
return $ BoolToken False
parseString = do char '"'
string <- many (noneOf "\"")
char '"'
return $ StringToken string
parseIdentifier = do identifier <- many (noneOf " ")
return $ IdentifierToken identifier
parseBinder = do char '/'
binder <- many (noneOf " ")
return $ BinderToken binder
parseComment = do char '%'
many (noneOf "")
newline
return $ ()
spaces = skipMany1 space
After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like:
{ /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact
12 fact fact apply
* I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer:
parseIdentifier = *do* identifier <- many (noneOf " ") return $ *IdentifierToken* identifier
Please help me.
Klaus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi Christian, my version of parseList works currently strange. The input "{2\n2}" will be parsed fine, but something like "{ }" or so fails with 'expecting space or "}"'. The redefinition of space is not necessary, it was copied from another tutorial code. How I write a version of parseGml that get gml token separated by any white space (space, tab, newline)? Thanks -----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 14:18 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules Am 07.03.2011 13:48, schrieb Hauschild, Klaus (EXT):
Thanks Christian,
I adapted the "keyword" parser and now "n" <-> "negi" does not occur.
But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}".
Your redefinition of spaces (= skipMany1 space) is problematic and: parseList = sepBy parseGml' spaces Allow for the original (optional) spaces after parseGml': parseGml'' = liftM2 const parseGml' spaces parseList = many parseGml'' C. P.S. why do you call? many (noneOf "") manyTill anyChar newline or just: many (noneOf "\n") (a trailing newline will be skipped by spaces)
Any ideas?
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 12:23 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
You should parse keywords using:
keyword s = try (string s) >> notFollowedBy (letter <|> digit)
C.
Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
Hi,
to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_ I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
data GmlToken =
-- structures
FunctionToken TokenSequence |
ArrayToken TokenSequence |
-- control operators
ApplyToken |
IfToken |
-- number operators
AddiToken |
AddfToken |
ACosToken |
ASinToken |
ClampfToken |
CosToken |
DiviToken |
DivfToken |
EqiToken |
EqfToken |
FloorToken |
FracToken |
LessiToken |
LessfToken |
ModiToken |
MuliToken |
MulfToken |
NegiToken |
NegfToken |
ToRealToken |
SinToken |
SqrtToken |
SubiToken |
SubfToken |
-- points
GetXToken |
GetYToken |
GetZToken |
PointToken |
-- arrays
GetToken |
LengthToken |
-- environment
IdentifierToken String |
BinderToken String |
-- types
BoolToken Bool |
IntToken Int |
RealToken Double |
StringToken String deriving Show
And
parseGml :: String -> [GmlToken]
parseGml input = case parse parseList "gml" input of
Left err -> error ("Parse error: " ++ (show err))
Right gml -> gml
parseList = sepBy parseGml' spaces
parseGml' =
-- structures
-- parseFunction
-- <|> parseArray
-- control operators
parseControlOperators
-- number operators
<|> parseNumberOperators
-- points
<|> parsePointOperators
-- arrays
<|> parseArrayOperators
-- types
<|> parseBool
<|> parseInteger
<|> parseString
-- environment
<|> parseIdentifier
<|> parseBinder
parseArray = parseSequence '[' ']'
parseFunction = parseSequence '{' '}'
parseSequence start end = do char start
array <- parseList
char end
return $ ArrayToken array
parseControlOperators = parseApply <|> parseIf
parseApply = do string "apply"
return $ ApplyToken
parseIf = do string "if"
return $ IfToken
parseNumberOperators = do string "addi"
return $ AddiToken
<|> do string "addf"
return $ AddfToken
<|> do string "acos"
return $ ACosToken
<|> do string "asind"
return $ ASinToken
<|> do string "clampf"
return $ ClampfToken
<|> do string "cos"
return $ CosToken
<|> do string "divi"
return $ DiviToken
<|> do string "divf"
return $ DivfToken
<|> do string "eqi"
return $ EqiToken
<|> do string "eqf"
return $ EqfToken
<|> do string "floor"
return $ FloorToken
<|> do string "frac"
return $ FracToken
<|> do string "lessi"
return $ LessiToken
<|> do string "lessf"
return $ LessfToken
<|> do string "modi"
return $ ModiToken
<|> do string "muli"
return $ MuliToken
<|> do string "mulf"
return $ MulfToken
<|> do string "negi"
return $ NegiToken
<|> do string "negf"
return $ NegfToken
<|> do string "real"
return $ ToRealToken
<|> do string "sin"
return $ SinToken
<|> do string "Sqrt"
return $ SqrtToken
<|> do string "subi"
return $ SubiToken
<|> do string "subf"
return $ SubfToken
parsePointOperators = do string "getx"
return $ GetXToken
<|> do string "gety"
return $ GetYToken
<|> do string "getz"
return $ GetZToken
<|> do string "point"
return $ PointToken
parseArrayOperators = do string "get"
return $ GetToken
<|> do string "length"
return $ LengthToken
parseInteger = liftM (IntToken . read) $ many1 digit
parseBool = do string "true"
return $ BoolToken True
<|> do string "false"
return $ BoolToken False
parseString = do char '"'
string <- many (noneOf "\"")
char '"'
return $ StringToken string
parseIdentifier = do identifier <- many (noneOf " ")
return $ IdentifierToken identifier
parseBinder = do char '/'
binder <- many (noneOf " ")
return $ BinderToken binder
parseComment = do char '%'
many (noneOf "")
newline
return $ ()
spaces = skipMany1 space
After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like:
{ /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact
12 fact fact apply
* I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer:
parseIdentifier = *do* identifier <- many (noneOf " ") return $ *IdentifierToken* identifier
Please help me.
Klaus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Am 08.03.2011 12:30, schrieb Hauschild, Klaus (EXT):
Hi Christian,
my version of parseList works currently strange. The input "{2\n2}" will be parsed fine, but something like "{ }" or so fails with 'expecting space or "}"'.
You must skip (possible) spaces after "{", too. (Actually after every lexeme.) C.
The redefinition of space is not necessary, it was copied from another tutorial code. How I write a version of parseGml that get gml token separated by any white space (space, tab, newline)?
Thanks
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 14:18 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
Am 07.03.2011 13:48, schrieb Hauschild, Klaus (EXT):
Thanks Christian,
I adapted the "keyword" parser and now "n" <-> "negi" does not occur.
But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}".
Your redefinition of spaces (= skipMany1 space) is problematic and:
parseList = sepBy parseGml' spaces
Allow for the original (optional) spaces after parseGml':
parseGml'' = liftM2 const parseGml' spaces
parseList = many parseGml''
C.
P.S. why do you call? many (noneOf "")
manyTill anyChar newline
or just: many (noneOf "\n")
(a trailing newline will be skipped by spaces)
Any ideas?
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 12:23 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
You should parse keywords using:
keyword s = try (string s) >> notFollowedBy (letter <|> digit)
C.
Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
Hi,
to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_ I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
data GmlToken =
-- structures
FunctionToken TokenSequence |
ArrayToken TokenSequence |
-- control operators
ApplyToken |
IfToken |
-- number operators
AddiToken |
AddfToken |
ACosToken |
ASinToken |
ClampfToken |
CosToken |
DiviToken |
DivfToken |
EqiToken |
EqfToken |
FloorToken |
FracToken |
LessiToken |
LessfToken |
ModiToken |
MuliToken |
MulfToken |
NegiToken |
NegfToken |
ToRealToken |
SinToken |
SqrtToken |
SubiToken |
SubfToken |
-- points
GetXToken |
GetYToken |
GetZToken |
PointToken |
-- arrays
GetToken |
LengthToken |
-- environment
IdentifierToken String |
BinderToken String |
-- types
BoolToken Bool |
IntToken Int |
RealToken Double |
StringToken String deriving Show
And
parseGml :: String -> [GmlToken]
parseGml input = case parse parseList "gml" input of
Left err -> error ("Parse error: " ++ (show err))
Right gml -> gml
parseList = sepBy parseGml' spaces
parseGml' =
-- structures
-- parseFunction
-- <|> parseArray
-- control operators
parseControlOperators
-- number operators
<|> parseNumberOperators
-- points
<|> parsePointOperators
-- arrays
<|> parseArrayOperators
-- types
<|> parseBool
<|> parseInteger
<|> parseString
-- environment
<|> parseIdentifier
<|> parseBinder
parseArray = parseSequence '[' ']'
parseFunction = parseSequence '{' '}'
parseSequence start end = do char start
array <- parseList
char end
return $ ArrayToken array
parseControlOperators = parseApply <|> parseIf
parseApply = do string "apply"
return $ ApplyToken
parseIf = do string "if"
return $ IfToken
parseNumberOperators = do string "addi"
return $ AddiToken
<|> do string "addf"
return $ AddfToken
<|> do string "acos"
return $ ACosToken
<|> do string "asind"
return $ ASinToken
<|> do string "clampf"
return $ ClampfToken
<|> do string "cos"
return $ CosToken
<|> do string "divi"
return $ DiviToken
<|> do string "divf"
return $ DivfToken
<|> do string "eqi"
return $ EqiToken
<|> do string "eqf"
return $ EqfToken
<|> do string "floor"
return $ FloorToken
<|> do string "frac"
return $ FracToken
<|> do string "lessi"
return $ LessiToken
<|> do string "lessf"
return $ LessfToken
<|> do string "modi"
return $ ModiToken
<|> do string "muli"
return $ MuliToken
<|> do string "mulf"
return $ MulfToken
<|> do string "negi"
return $ NegiToken
<|> do string "negf"
return $ NegfToken
<|> do string "real"
return $ ToRealToken
<|> do string "sin"
return $ SinToken
<|> do string "Sqrt"
return $ SqrtToken
<|> do string "subi"
return $ SubiToken
<|> do string "subf"
return $ SubfToken
parsePointOperators = do string "getx"
return $ GetXToken
<|> do string "gety"
return $ GetYToken
<|> do string "getz"
return $ GetZToken
<|> do string "point"
return $ PointToken
parseArrayOperators = do string "get"
return $ GetToken
<|> do string "length"
return $ LengthToken
parseInteger = liftM (IntToken . read) $ many1 digit
parseBool = do string "true"
return $ BoolToken True
<|> do string "false"
return $ BoolToken False
parseString = do char '"'
string <- many (noneOf "\"")
char '"'
return $ StringToken string
parseIdentifier = do identifier <- many (noneOf " ")
return $ IdentifierToken identifier
parseBinder = do char '/'
binder <- many (noneOf " ")
return $ BinderToken binder
parseComment = do char '%'
many (noneOf "")
newline
return $ ()
spaces = skipMany1 space
After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like:
{ /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact
12 fact fact apply
* I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer:
parseIdentifier = *do* identifier <- many (noneOf " ") return $ *IdentifierToken* identifier
Please help me.
Klaus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

In http://code.google.com/p/hgmltracer/source/browse/trunk/hGmlTracer/src/Gml/P... I see "identifier <- many (noneOf " ")". You should at least consume one character by using "many1"! This also allows to call "many parseGml''" later. C. Am 08.03.2011 12:30, schrieb Hauschild, Klaus (EXT):
Hi Christian,
my version of parseList works currently strange. The input "{2\n2}" will be parsed fine, but something like "{ }" or so fails with 'expecting space or "}"'.
The redefinition of space is not necessary, it was copied from another tutorial code. How I write a version of parseGml that get gml token separated by any white space (space, tab, newline)?
Thanks
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 14:18 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
Am 07.03.2011 13:48, schrieb Hauschild, Klaus (EXT):
Thanks Christian,
I adapted the "keyword" parser and now "n" <-> "negi" does not occur.
But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}".
Your redefinition of spaces (= skipMany1 space) is problematic and:
parseList = sepBy parseGml' spaces
Allow for the original (optional) spaces after parseGml':
parseGml'' = liftM2 const parseGml' spaces
parseList = many parseGml''
C.
P.S. why do you call? many (noneOf "")
manyTill anyChar newline
or just: many (noneOf "\n")
(a trailing newline will be skipped by spaces)
Any ideas?
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 12:23 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
You should parse keywords using:
keyword s = try (string s) >> notFollowedBy (letter <|> digit)
C.
Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
Hi,
to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_ I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
data GmlToken =
-- structures
FunctionToken TokenSequence |
ArrayToken TokenSequence |
-- control operators
ApplyToken |
IfToken |
-- number operators
AddiToken |
AddfToken |
ACosToken |
ASinToken |
ClampfToken |
CosToken |
DiviToken |
DivfToken |
EqiToken |
EqfToken |
FloorToken |
FracToken |
LessiToken |
LessfToken |
ModiToken |
MuliToken |
MulfToken |
NegiToken |
NegfToken |
ToRealToken |
SinToken |
SqrtToken |
SubiToken |
SubfToken |
-- points
GetXToken |
GetYToken |
GetZToken |
PointToken |
-- arrays
GetToken |
LengthToken |
-- environment
IdentifierToken String |
BinderToken String |
-- types
BoolToken Bool |
IntToken Int |
RealToken Double |
StringToken String deriving Show
And
parseGml :: String -> [GmlToken]
parseGml input = case parse parseList "gml" input of
Left err -> error ("Parse error: " ++ (show err))
Right gml -> gml
parseList = sepBy parseGml' spaces
parseGml' =
-- structures
-- parseFunction
-- <|> parseArray
-- control operators
parseControlOperators
-- number operators
<|> parseNumberOperators
-- points
<|> parsePointOperators
-- arrays
<|> parseArrayOperators
-- types
<|> parseBool
<|> parseInteger
<|> parseString
-- environment
<|> parseIdentifier
<|> parseBinder
parseArray = parseSequence '[' ']'
parseFunction = parseSequence '{' '}'
parseSequence start end = do char start
array <- parseList
char end
return $ ArrayToken array
parseControlOperators = parseApply <|> parseIf
parseApply = do string "apply"
return $ ApplyToken
parseIf = do string "if"
return $ IfToken
parseNumberOperators = do string "addi"
return $ AddiToken
<|> do string "addf"
return $ AddfToken
<|> do string "acos"
return $ ACosToken
<|> do string "asind"
return $ ASinToken
<|> do string "clampf"
return $ ClampfToken
<|> do string "cos"
return $ CosToken
<|> do string "divi"
return $ DiviToken
<|> do string "divf"
return $ DivfToken
<|> do string "eqi"
return $ EqiToken
<|> do string "eqf"
return $ EqfToken
<|> do string "floor"
return $ FloorToken
<|> do string "frac"
return $ FracToken
<|> do string "lessi"
return $ LessiToken
<|> do string "lessf"
return $ LessfToken
<|> do string "modi"
return $ ModiToken
<|> do string "muli"
return $ MuliToken
<|> do string "mulf"
return $ MulfToken
<|> do string "negi"
return $ NegiToken
<|> do string "negf"
return $ NegfToken
<|> do string "real"
return $ ToRealToken
<|> do string "sin"
return $ SinToken
<|> do string "Sqrt"
return $ SqrtToken
<|> do string "subi"
return $ SubiToken
<|> do string "subf"
return $ SubfToken
parsePointOperators = do string "getx"
return $ GetXToken
<|> do string "gety"
return $ GetYToken
<|> do string "getz"
return $ GetZToken
<|> do string "point"
return $ PointToken
parseArrayOperators = do string "get"
return $ GetToken
<|> do string "length"
return $ LengthToken
parseInteger = liftM (IntToken . read) $ many1 digit
parseBool = do string "true"
return $ BoolToken True
<|> do string "false"
return $ BoolToken False
parseString = do char '"'
string <- many (noneOf "\"")
char '"'
return $ StringToken string
parseIdentifier = do identifier <- many (noneOf " ")
return $ IdentifierToken identifier
parseBinder = do char '/'
binder <- many (noneOf " ")
return $ BinderToken binder
parseComment = do char '%'
many (noneOf "")
newline
return $ ()
spaces = skipMany1 space
After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like:
{ /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact
12 fact fact apply
* I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer:
parseIdentifier = *do* identifier <- many (noneOf " ") return $ *IdentifierToken* identifier
Please help me.
Klaus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Am 08.03.2011 12:30, schrieb Hauschild, Klaus (EXT):
Hi Christian,
my version of parseList works currently strange. The input "{2\n2}" will be parsed fine, but something like "{ }" or so fails with 'expecting space or "}"'.
Also "}" (and "]", etc.) should be excluded as identifier letters. "}" is the second identifier in "{ }". (The first one is empty.) C.
The redefinition of space is not necessary, it was copied from another tutorial code. How I write a version of parseGml that get gml token separated by any white space (space, tab, newline)?
Thanks
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 14:18 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
Am 07.03.2011 13:48, schrieb Hauschild, Klaus (EXT):
Thanks Christian,
I adapted the "keyword" parser and now "n" <-> "negi" does not occur.
But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}".
Your redefinition of spaces (= skipMany1 space) is problematic and:
parseList = sepBy parseGml' spaces
Allow for the original (optional) spaces after parseGml':
parseGml'' = liftM2 const parseGml' spaces
parseList = many parseGml''
C.
P.S. why do you call? many (noneOf "")
manyTill anyChar newline
or just: many (noneOf "\n")
(a trailing newline will be skipped by spaces)
Any ideas?
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 12:23 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
You should parse keywords using:
keyword s = try (string s) >> notFollowedBy (letter <|> digit)
C.
Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
Hi,
to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_ I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
data GmlToken =
-- structures
FunctionToken TokenSequence |
ArrayToken TokenSequence |
-- control operators
ApplyToken |
IfToken |
-- number operators
AddiToken |
AddfToken |
ACosToken |
ASinToken |
ClampfToken |
CosToken |
DiviToken |
DivfToken |
EqiToken |
EqfToken |
FloorToken |
FracToken |
LessiToken |
LessfToken |
ModiToken |
MuliToken |
MulfToken |
NegiToken |
NegfToken |
ToRealToken |
SinToken |
SqrtToken |
SubiToken |
SubfToken |
-- points
GetXToken |
GetYToken |
GetZToken |
PointToken |
-- arrays
GetToken |
LengthToken |
-- environment
IdentifierToken String |
BinderToken String |
-- types
BoolToken Bool |
IntToken Int |
RealToken Double |
StringToken String deriving Show
And
parseGml :: String -> [GmlToken]
parseGml input = case parse parseList "gml" input of
Left err -> error ("Parse error: " ++ (show err))
Right gml -> gml
parseList = sepBy parseGml' spaces
parseGml' =
-- structures
-- parseFunction
-- <|> parseArray
-- control operators
parseControlOperators
-- number operators
<|> parseNumberOperators
-- points
<|> parsePointOperators
-- arrays
<|> parseArrayOperators
-- types
<|> parseBool
<|> parseInteger
<|> parseString
-- environment
<|> parseIdentifier
<|> parseBinder
parseArray = parseSequence '[' ']'
parseFunction = parseSequence '{' '}'
parseSequence start end = do char start
array <- parseList
char end
return $ ArrayToken array
parseControlOperators = parseApply <|> parseIf
parseApply = do string "apply"
return $ ApplyToken
parseIf = do string "if"
return $ IfToken
parseNumberOperators = do string "addi"
return $ AddiToken
<|> do string "addf"
return $ AddfToken
<|> do string "acos"
return $ ACosToken
<|> do string "asind"
return $ ASinToken
<|> do string "clampf"
return $ ClampfToken
<|> do string "cos"
return $ CosToken
<|> do string "divi"
return $ DiviToken
<|> do string "divf"
return $ DivfToken
<|> do string "eqi"
return $ EqiToken
<|> do string "eqf"
return $ EqfToken
<|> do string "floor"
return $ FloorToken
<|> do string "frac"
return $ FracToken
<|> do string "lessi"
return $ LessiToken
<|> do string "lessf"
return $ LessfToken
<|> do string "modi"
return $ ModiToken
<|> do string "muli"
return $ MuliToken
<|> do string "mulf"
return $ MulfToken
<|> do string "negi"
return $ NegiToken
<|> do string "negf"
return $ NegfToken
<|> do string "real"
return $ ToRealToken
<|> do string "sin"
return $ SinToken
<|> do string "Sqrt"
return $ SqrtToken
<|> do string "subi"
return $ SubiToken
<|> do string "subf"
return $ SubfToken
parsePointOperators = do string "getx"
return $ GetXToken
<|> do string "gety"
return $ GetYToken
<|> do string "getz"
return $ GetZToken
<|> do string "point"
return $ PointToken
parseArrayOperators = do string "get"
return $ GetToken
<|> do string "length"
return $ LengthToken
parseInteger = liftM (IntToken . read) $ many1 digit
parseBool = do string "true"
return $ BoolToken True
<|> do string "false"
return $ BoolToken False
parseString = do char '"'
string <- many (noneOf "\"")
char '"'
return $ StringToken string
parseIdentifier = do identifier <- many (noneOf " ")
return $ IdentifierToken identifier
parseBinder = do char '/'
binder <- many (noneOf " ")
return $ BinderToken binder
parseComment = do char '%'
many (noneOf "")
newline
return $ ()
spaces = skipMany1 space
After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like:
{ /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact
12 fact fact apply
* I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer:
parseIdentifier = *do* identifier <- many (noneOf " ") return $ *IdentifierToken* identifier
Please help me.
Klaus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi Christian, Thank you for your help. Now the current version of Parse.hs (http://code.google.com/p/hgmltracer/source/browse/trunk/hGmlTracer/src/Gml/P...) works well for the test file fact.gml. Now the last thing is parsing the different numbers (integer and float). If have a rule for parsing integers (parseInteger = liftM (IntToken . read) $ many1 digit) but if have currently no idea how to handle floats. -----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Dienstag, 8. März 2011 13:26 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules Am 08.03.2011 12:30, schrieb Hauschild, Klaus (EXT):
Hi Christian,
my version of parseList works currently strange. The input "{2\n2}" will be parsed fine, but something like "{ }" or so fails with 'expecting space or "}"'.
Also "}" (and "]", etc.) should be excluded as identifier letters. "}" is the second identifier in "{ }". (The first one is empty.) C.
The redefinition of space is not necessary, it was copied from another tutorial code. How I write a version of parseGml that get gml token separated by any white space (space, tab, newline)?
Thanks
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 14:18 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
Am 07.03.2011 13:48, schrieb Hauschild, Klaus (EXT):
Thanks Christian,
I adapted the "keyword" parser and now "n" <-> "negi" does not occur.
But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}".
Your redefinition of spaces (= skipMany1 space) is problematic and:
parseList = sepBy parseGml' spaces
Allow for the original (optional) spaces after parseGml':
parseGml'' = liftM2 const parseGml' spaces
parseList = many parseGml''
C.
P.S. why do you call? many (noneOf "")
manyTill anyChar newline
or just: many (noneOf "\n")
(a trailing newline will be skipped by spaces)
Any ideas?
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 12:23 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
You should parse keywords using:
keyword s = try (string s) >> notFollowedBy (letter <|> digit)
C.
Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
Hi,
to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_ I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
data GmlToken =
-- structures
FunctionToken TokenSequence |
ArrayToken TokenSequence |
-- control operators
ApplyToken |
IfToken |
-- number operators
AddiToken |
AddfToken |
ACosToken |
ASinToken |
ClampfToken |
CosToken |
DiviToken |
DivfToken |
EqiToken |
EqfToken |
FloorToken |
FracToken |
LessiToken |
LessfToken |
ModiToken |
MuliToken |
MulfToken |
NegiToken |
NegfToken |
ToRealToken |
SinToken |
SqrtToken |
SubiToken |
SubfToken |
-- points
GetXToken |
GetYToken |
GetZToken |
PointToken |
-- arrays
GetToken |
LengthToken |
-- environment
IdentifierToken String |
BinderToken String |
-- types
BoolToken Bool |
IntToken Int |
RealToken Double |
StringToken String deriving Show
And
parseGml :: String -> [GmlToken]
parseGml input = case parse parseList "gml" input of
Left err -> error ("Parse error: " ++ (show err))
Right gml -> gml
parseList = sepBy parseGml' spaces
parseGml' =
-- structures
-- parseFunction
-- <|> parseArray
-- control operators
parseControlOperators
-- number operators
<|> parseNumberOperators
-- points
<|> parsePointOperators
-- arrays
<|> parseArrayOperators
-- types
<|> parseBool
<|> parseInteger
<|> parseString
-- environment
<|> parseIdentifier
<|> parseBinder
parseArray = parseSequence '[' ']'
parseFunction = parseSequence '{' '}'
parseSequence start end = do char start
array <- parseList
char end
return $ ArrayToken array
parseControlOperators = parseApply <|> parseIf
parseApply = do string "apply"
return $ ApplyToken
parseIf = do string "if"
return $ IfToken
parseNumberOperators = do string "addi"
return $ AddiToken
<|> do string "addf"
return $ AddfToken
<|> do string "acos"
return $ ACosToken
<|> do string "asind"
return $ ASinToken
<|> do string "clampf"
return $ ClampfToken
<|> do string "cos"
return $ CosToken
<|> do string "divi"
return $ DiviToken
<|> do string "divf"
return $ DivfToken
<|> do string "eqi"
return $ EqiToken
<|> do string "eqf"
return $ EqfToken
<|> do string "floor"
return $ FloorToken
<|> do string "frac"
return $ FracToken
<|> do string "lessi"
return $ LessiToken
<|> do string "lessf"
return $ LessfToken
<|> do string "modi"
return $ ModiToken
<|> do string "muli"
return $ MuliToken
<|> do string "mulf"
return $ MulfToken
<|> do string "negi"
return $ NegiToken
<|> do string "negf"
return $ NegfToken
<|> do string "real"
return $ ToRealToken
<|> do string "sin"
return $ SinToken
<|> do string "Sqrt"
return $ SqrtToken
<|> do string "subi"
return $ SubiToken
<|> do string "subf"
return $ SubfToken
parsePointOperators = do string "getx"
return $ GetXToken
<|> do string "gety"
return $ GetYToken
<|> do string "getz"
return $ GetZToken
<|> do string "point"
return $ PointToken
parseArrayOperators = do string "get"
return $ GetToken
<|> do string "length"
return $ LengthToken
parseInteger = liftM (IntToken . read) $ many1 digit
parseBool = do string "true"
return $ BoolToken True
<|> do string "false"
return $ BoolToken False
parseString = do char '"'
string <- many (noneOf "\"")
char '"'
return $ StringToken string
parseIdentifier = do identifier <- many (noneOf " ")
return $ IdentifierToken identifier
parseBinder = do char '/'
binder <- many (noneOf " ")
return $ BinderToken binder
parseComment = do char '%'
many (noneOf "")
newline
return $ ()
spaces = skipMany1 space
After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like:
{ /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact
12 fact fact apply
* I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer:
parseIdentifier = *do* identifier <- many (noneOf " ") return $ *IdentifierToken* identifier
Please help me.
Klaus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Am 08.03.2011 13:35, schrieb Hauschild, Klaus (EXT):
Hi Christian,
Thank you for your help. Now the current version of Parse.hs (http://code.google.com/p/hgmltracer/source/browse/trunk/hGmlTracer/src/Gml/P...) works well for the test file fact.gml.
spaces char end The idea is to call spaces _after_ every token. So the keyword parser should be extended by ">> spaces" and also the number, char, string, binder and identifier parsers. Rather than using 'noneOf " ]}\n\t"' I would precisely define the identifier letters. isIdChar :: Char -> Bool isIdChar c = isLetter c || isDigit c || .... and use "satisfy isIdChar" also in keyword's notFollowedBy.
Now the last thing is parsing the different numbers (integer and float). If have a rule for parsing integers (parseInteger = liftM (IntToken . read) $ many1 digit) but if have currently no idea how to handle floats.
Look inside http://hackage.haskell.org/packages/archive/parsec2/1.0.0/doc/html/src/Text-... under "floating" and copy and adjust the code for you. C.
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Dienstag, 8. März 2011 13:26 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
Am 08.03.2011 12:30, schrieb Hauschild, Klaus (EXT):
Hi Christian,
my version of parseList works currently strange. The input "{2\n2}" will be parsed fine, but something like "{ }" or so fails with 'expecting space or "}"'.
Also "}" (and "]", etc.) should be excluded as identifier letters. "}" is the second identifier in "{ }". (The first one is empty.)
C.
The redefinition of space is not necessary, it was copied from another tutorial code. How I write a version of parseGml that get gml token separated by any white space (space, tab, newline)?
Thanks
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 14:18 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
Am 07.03.2011 13:48, schrieb Hauschild, Klaus (EXT):
Thanks Christian,
I adapted the "keyword" parser and now "n" <-> "negi" does not occur.
But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}".
Your redefinition of spaces (= skipMany1 space) is problematic and:
parseList = sepBy parseGml' spaces
Allow for the original (optional) spaces after parseGml':
parseGml'' = liftM2 const parseGml' spaces
parseList = many parseGml''
C.
P.S. why do you call? many (noneOf "")
manyTill anyChar newline
or just: many (noneOf "\n")
(a trailing newline will be skipped by spaces)
Any ideas?
-----Ursprüngliche Nachricht----- Von: Christian Maeder [mailto:Christian.Maeder@dfki.de] Gesendet: Montag, 7. März 2011 12:23 An: Hauschild, Klaus (EXT) Cc: haskell-cafe@haskell.org Betreff: Re: Overlaping Parsec rules
You should parse keywords using:
keyword s = try (string s) >> notFollowedBy (letter <|> digit)
C.
Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
Hi,
to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm currnetly working on the parser. With the hint from Thu (reading Phillip Wadlers monadic parser paper) and consulting _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_ I produce a first working version of the parser. After this great moment I completed the token definition and near all parsing rules. For the complete code have a look at _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
data GmlToken =
-- structures
FunctionToken TokenSequence |
ArrayToken TokenSequence |
-- control operators
ApplyToken |
IfToken |
-- number operators
AddiToken |
AddfToken |
ACosToken |
ASinToken |
ClampfToken |
CosToken |
DiviToken |
DivfToken |
EqiToken |
EqfToken |
FloorToken |
FracToken |
LessiToken |
LessfToken |
ModiToken |
MuliToken |
MulfToken |
NegiToken |
NegfToken |
ToRealToken |
SinToken |
SqrtToken |
SubiToken |
SubfToken |
-- points
GetXToken |
GetYToken |
GetZToken |
PointToken |
-- arrays
GetToken |
LengthToken |
-- environment
IdentifierToken String |
BinderToken String |
-- types
BoolToken Bool |
IntToken Int |
RealToken Double |
StringToken String deriving Show
And
parseGml :: String -> [GmlToken]
parseGml input = case parse parseList "gml" input of
Left err -> error ("Parse error: " ++ (show err))
Right gml -> gml
parseList = sepBy parseGml' spaces
parseGml' =
-- structures
-- parseFunction
-- <|> parseArray
-- control operators
parseControlOperators
-- number operators
<|> parseNumberOperators
-- points
<|> parsePointOperators
-- arrays
<|> parseArrayOperators
-- types
<|> parseBool
<|> parseInteger
<|> parseString
-- environment
<|> parseIdentifier
<|> parseBinder
parseArray = parseSequence '[' ']'
parseFunction = parseSequence '{' '}'
parseSequence start end = do char start
array <- parseList
char end
return $ ArrayToken array
parseControlOperators = parseApply <|> parseIf
parseApply = do string "apply"
return $ ApplyToken
parseIf = do string "if"
return $ IfToken
parseNumberOperators = do string "addi"
return $ AddiToken
<|> do string "addf"
return $ AddfToken
<|> do string "acos"
return $ ACosToken
<|> do string "asind"
return $ ASinToken
<|> do string "clampf"
return $ ClampfToken
<|> do string "cos"
return $ CosToken
<|> do string "divi"
return $ DiviToken
<|> do string "divf"
return $ DivfToken
<|> do string "eqi"
return $ EqiToken
<|> do string "eqf"
return $ EqfToken
<|> do string "floor"
return $ FloorToken
<|> do string "frac"
return $ FracToken
<|> do string "lessi"
return $ LessiToken
<|> do string "lessf"
return $ LessfToken
<|> do string "modi"
return $ ModiToken
<|> do string "muli"
return $ MuliToken
<|> do string "mulf"
return $ MulfToken
<|> do string "negi"
return $ NegiToken
<|> do string "negf"
return $ NegfToken
<|> do string "real"
return $ ToRealToken
<|> do string "sin"
return $ SinToken
<|> do string "Sqrt"
return $ SqrtToken
<|> do string "subi"
return $ SubiToken
<|> do string "subf"
return $ SubfToken
parsePointOperators = do string "getx"
return $ GetXToken
<|> do string "gety"
return $ GetYToken
<|> do string "getz"
return $ GetZToken
<|> do string "point"
return $ PointToken
parseArrayOperators = do string "get"
return $ GetToken
<|> do string "length"
return $ LengthToken
parseInteger = liftM (IntToken . read) $ many1 digit
parseBool = do string "true"
return $ BoolToken True
<|> do string "false"
return $ BoolToken False
parseString = do char '"'
string <- many (noneOf "\"")
char '"'
return $ StringToken string
parseIdentifier = do identifier <- many (noneOf " ")
return $ IdentifierToken identifier
parseBinder = do char '/'
binder <- many (noneOf " ")
return $ BinderToken binder
parseComment = do char '%'
many (noneOf "")
newline
return $ ()
spaces = skipMany1 space
After gluing all this together in my mind all worked well. But it doesn't. The test file for parsing looks like:
{ /self /n n 2 lessi { 1 } { n 1 subi self self apply n muli } if } /fact
12 fact fact apply
* I think there is a problem with overlaping rules. There is a parser rule consuming "negi" and resulting in the NegiToken. A single "n" is a valid identifier. For the example file my parser says: unexpected " ", expecting "negi" * I think the same problem is present for "parseInteger" and "parseReal" (currently no in code but looks like "parseReal = do a <- many1 digit \n char '.' \n b <- many1 digit \n return $ RealToken (read (a ++ "." ++ b)" * Something with "parseFunction" is going really wrong. * All parsig rules are designed with the condition that each "construct" is separated by whitespaces. For this before parsing the input will be preprocessed: insert spaces, removing whitespaces and so on. Now in the parsed result appears (IdentifierToken ""). I think my version is not the best way to parse a identifer:
parseIdentifier = *do* identifier <- many (noneOf " ") return $ *IdentifierToken* identifier
Please help me.
Klaus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Am 08.03.2011 13:50, schrieb Christian Maeder:
Am 08.03.2011 13:35, schrieb Hauschild, Klaus (EXT):
Hi Christian,
Thank you for your help. Now the current version of Parse.hs (http://code.google.com/p/hgmltracer/source/browse/trunk/hGmlTracer/src/Gml/P...) works well for the test file fact.gml.
spaces char end
The idea is to call spaces _after_ every token.
In other words "spaces" before "char end" is not needed.
So the keyword parser should be extended by ">> spaces" and also the number, char, string, binder and identifier parsers.
This is not necessary since you skip spaces after parseGml'. "skip (A <|> B)" is better than "skip A <|> skip B". C.

Am 09.03.2011 14:44, schrieb Christian Maeder:
Am 08.03.2011 13:50, schrieb Christian Maeder:
Am 08.03.2011 13:35, schrieb Hauschild, Klaus (EXT):
Hi Christian,
Thank you for your help. Now the current version of Parse.hs (http://code.google.com/p/hgmltracer/source/browse/trunk/hGmlTracer/src/Gml/P...) works well for the test file fact.gml.
spaces char end
The idea is to call spaces _after_ every token.
In other words "spaces" before "char end" is not needed.
But in your top-level parser you may want to skip leading spaces and check for a final "eof" to ensure that everything was consumed. spaces >> liftM2 const parseList eof (Instead of "liftM2 const" the symbol "<*" from "Control.Applicative" may be used: spaces *> parseList <* eof) I defined: infixl 1 << (<<) :: Monad m => m a -> m b -> m a (<<) = liftM2 const and would use: spaces >> parseList << eof C.

Hi Klaus Unless you have very specific white-space considerations, you really want to be using Parsec's Token and LanguageDef modules. There are examples in the "legacy" Parsec 2.0 distribution available from Daan Leijen's old website: http://legacy.cs.uu.nl/daan/parsec.html Best wishes Stephen

Am 07.03.2011 14:20, schrieb Stephen Tetley:
Hi Klaus
Unless you have very specific white-space considerations, you really want to be using Parsec's Token and LanguageDef modules.
Well, I would want to use Parsec's Token and LanguageDef modules just to get the white spaces (and comments to be ignored) right. Simply skip spaces and comments after every lexical token (by your own wrapper function like "Text.ParserCombinators.Parsec.Token.lexeme"). C.
There are examples in the "legacy" Parsec 2.0 distribution available from Daan Leijen's old website: http://legacy.cs.uu.nl/daan/parsec.html
Best wishes
Stephen

On Monday 07 March 2011 13:48:31, Hauschild, Klaus (EXT) wrote:
But there are still other problems to solve.
One thing that I saw which wasn't mentioned before (or I missed it) is that you need to try parsing binders before parsing identifiers. Currently, all binders are happily accepted as identifiers.

Hi Daniel, I actually corrected the order of these two. Thanks -----Ursprüngliche Nachricht----- Von: Daniel Fischer [mailto:daniel.is.fischer@googlemail.com] Gesendet: Montag, 7. März 2011 14:29 An: haskell-cafe@haskell.org Cc: Hauschild, Klaus (EXT) Betreff: Re: [Haskell-cafe] Overlaping Parsec rules On Monday 07 March 2011 13:48:31, Hauschild, Klaus (EXT) wrote:
But there are still other problems to solve.
One thing that I saw which wasn't mentioned before (or I missed it) is that you need to try parsing binders before parsing identifiers. Currently, all binders are happily accepted as identifiers.
participants (4)
-
Christian Maeder
-
Daniel Fischer
-
Hauschild, Klaus (EXT)
-
Stephen Tetley