I was trying to whet my Haskell today by using Parsec to parse XML. I'd like some help with the "gettext" parser I've written: I had to put a dummy "char ' '" in there just to satisfy the "many" used in the "xml" parser. I'd appreciate it very much if someone could give me some feedback.
Here's the code I came up with:
import Text.ParserCombinators.Parsec

data XML = Node String [XML]
         | Body String deriving Show
gettext = do
  x <- many (letter <|> digit)
  if length x > 0
    then return (Body x)
    else char ' ' >> return (Body "")
xml :: Parser XML
xml = do {
    name <- openTag
  ; innerXML <- many innerXML
  ; endTag name
  ; return (Node name innerXML)
  }
innerXML = do
  x <- (try xml <|> gettext)
  return x
openTag :: Parser String
openTag = do
  char '<'
  content <- many (noneOf ">")
  char '>'
  return content
endTag :: String -> Parser String
endTag str = do
  char '<'
  char '/'
  string str
  char '>'
  return str
h1 = parse xml "" "<a>A</a>"
h2 = parse xml "" "<a><b>A</b></a>"
h3 = parse xml "" "<a><b><c></c></b></a>"
h4 = parse xml "" "<a><b></b><c></c></a>"
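
The dummy "char ' '" is there because Parsec's "many" raises an error when the parser it repeats succeeds without consuming any input, which is what "many (letter <|> digit)" does on an empty match. One possible way around it, offered as a sketch rather than the definitive fix, is to use "many1" in "gettext" so that it fails without consuming input when there is no text. The version below assumes text is still limited to letters and digits, as in the code above; the "Eq" instance and the renamed "children" binding are additions for illustration only.

```haskell
import Text.ParserCombinators.Parsec

-- Eq is derived here only so results can be compared; the original derives Show alone.
data XML = Node String [XML]
         | Body String
         deriving (Show, Eq)

-- many1 needs at least one character, so gettext fails (without consuming
-- input) when no text is present and never succeeds on an empty match.
-- That makes it safe to repeat with many, and the dummy char ' ' can go.
gettext :: Parser XML
gettext = Body <$> many1 (letter <|> digit)

xml :: Parser XML
xml = do
  name     <- openTag
  children <- many innerXML   -- zero or more child elements or text runs
  _        <- endTag name
  return (Node name children)

-- try lets a failed element parse backtrack, e.g. when "</a>" is next.
innerXML :: Parser XML
innerXML = try xml <|> gettext

openTag :: Parser String
openTag = do
  _ <- char '<'
  content <- many (noneOf ">")
  _ <- char '>'
  return content

endTag :: String -> Parser String
endTag str = do
  _ <- char '<'
  _ <- char '/'
  _ <- string str
  _ <- char '>'
  return str
```

With this version, parse xml "" "<a>A</a>" gives Right (Node "a" [Body "A"]), and an empty element such as "<c></c>" parses to Node "c" [] with no empty Body inside it.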