
gettext = (many1 $ noneOf "><") >>= (return . Body)
works for your case.
On Thu, Jul 19, 2012 at 6:37 PM, Christian Maeder
Am 19.07.2012 14:53, schrieb C K Kashyap:
Dear gentle Haskellers,
I was trying to whet my Haskell by trying out Parsec today to try and parse out XML. Here's the code I cam up with -
I wanted some help with the "gettext" parser that I've written. I had to do a dummy "char ' ') in there just to satisfy the "many" used in the xml parser. I'd appreciate it very much if someone could give me some feedback.
You don't want empty bodies! So use many1 in gettext.
gettext = fmap Body $ many1 $ letter <|> digit
If you have spaces in your bodies, skip them or allow them with noneOf "<".
HTH Christian
data XML = Node String [XML] | Body String deriving Show
gettext = do x <- many (letter <|> digit ) if (length x) > 0 then return (Body x) else (char ' ' >> (return $ Body ""))
xml :: Parser XML xml = do { name <- openTag ; innerXML <- many innerXML ; endTag name ; return (Node name innerXML) }
innerXML = do x <- (try xml <|> gettext) return x
openTag :: Parser String openTag = do char '<' content <- many (noneOf ">") char '>' return content
endTag :: String -> Parser String endTag str = do char '<' char '/' string str char '>' return str
h1 = parse xml "" "<a>A</a>" h2 = parse xml "" "<a><b>A</b></a>" h3 = parse xml "" "<a><b><c></c></b></a>" h4 = parse xml "" "<a><b></b><c></c></a>"
Regards, Kashyap
______________________________**_________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/**mailman/listinfo/haskell-cafehttp://www.haskell.org/mailman/listinfo/haskell-cafe
______________________________**_________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/**mailman/listinfo/haskell-cafehttp://www.haskell.org/mailman/listinfo/haskell-cafe
-- I drink I am thunk.