
On 22.07.2012 17:21, C K Kashyap wrote:
I've updated the parser here - https://github.com/ckkashyap/LearningPrograms/blob/master/Haskell/Parsing/xm...
The whole thing is less than 100 lines and it can handle comments as well.
This code is still not nice: there is duplicate code in openTag and withoutExplictCloseTag.

The "toplevel-try" in

  try withoutExplictCloseTag <|> withExplicitCloseTag

should be avoided by factoring out the common prefix. Again, I would avoid notFollowedBy by using many1:

  tag <- try (char '<' >> many1 (letter <|> digit))

In quotedChar you want to escape not only the quote but at least the backslash, too. You could allow any character to be escaped by a backslash using:

  quotedChar c = try (char '\\' >> anyChar) <|> noneOf [c, '\\']

Writing a separate parser stripLeadingSpaces is overkill. Just use "spaces >> parseXML" (or apply "dropWhile isSpace" to the input string).

C.

[...]
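
To make the suggestions about many1, quotedChar and spaces concrete, here is a minimal sketch assuming Parsec 3 (Text.Parsec). Only quotedChar is taken from the message; quotedString, tagName and skipLeading are hypothetical names introduced for illustration, and parseXML from the linked code is not reproduced here.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- From the message: a character inside a string delimited by c.
-- Either a backslash followed by any character (the character after the
-- backslash is returned), or any character other than c and the backslash.
quotedChar :: Char -> Parser Char
quotedChar c = try (char '\\' >> anyChar) <|> noneOf [c, '\\']

-- Hypothetical helper: a complete string delimited by c on both sides.
quotedString :: Char -> Parser String
quotedString c = between (char c) (char c) (many (quotedChar c))

-- Hypothetical helper: '<' followed by a non-empty tag name.
-- many1 already fails on "<!--" and "</", so no notFollowedBy is needed;
-- try makes that failure consume no input.
tagName :: Parser String
tagName = try (char '<' >> many1 (letter <|> digit))

-- Hypothetical helper: skip leading whitespace with the built-in spaces
-- parser instead of a hand-written stripLeadingSpaces.
skipLeading :: Parser a -> Parser a
skipLeading p = spaces >> p
```

With these definitions, something like parse (skipLeading tagName) "" "  <a>" should run the tag-name parser after the leading blanks have been dropped, which is the same shape as "spaces >> parseXML".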