
Am 19.07.2012 14:53, schrieb C K Kashyap:
Dear gentle Haskellers,
I was trying to whet my Haskell by trying out Parsec today to try and parse out XML. Here's the code I cam up with -
I wanted some help with the "gettext" parser that I've written. I had to do a dummy "char ' ') in there just to satisfy the "many" used in the xml parser. I'd appreciate it very much if someone could give me some feedback.
You don't want empty bodies! So use many1 in gettext. gettext = fmap Body $ many1 $ letter <|> digit If you have spaces in your bodies, skip them or allow them with noneOf "<". HTH Christian
data XML = Node String [XML] | Body String deriving Show
gettext = do x <- many (letter <|> digit ) if (length x) > 0 then return (Body x) else (char ' ' >> (return $ Body ""))
xml :: Parser XML xml = do { name <- openTag ; innerXML <- many innerXML ; endTag name ; return (Node name innerXML) }
innerXML = do x <- (try xml <|> gettext) return x
openTag :: Parser String openTag = do char '<' content <- many (noneOf ">") char '>' return content
endTag :: String -> Parser String endTag str = do char '<' char '/' string str char '>' return str
h1 = parse xml "" "<a>A</a>" h2 = parse xml "" "<a><b>A</b></a>" h3 = parse xml "" "<a><b><c></c></b></a>" h4 = parse xml "" "<a><b></b><c></c></a>"
Regards, Kashyap
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe