
On Saturday 24 February 2007 21:22, Johan Tibell wrote:
So my question is. How can I write the function 'extractElementsIntoRecords' below. Or, perhaps HXT is the wrong tool for the job and I should be trying to walk the DOM tree instead?
module HCard where
import Text.XML.HXT.Arrow
data HCard = HCard { familyName :: String, givenName :: String org :: Maybe String url :: Maybe String } deriving Show
parseHCards xml = runX $ parseXml xml
parseXml xml = readString [(a_parse_html, v_1)] xml >>> deep (hasClassName "vcard") >>> extractElementsIntoRecords
extractElementsIntoRecords = undefined
Perhaps something like the following (which is likely to be wrong seen I'm adlibing): extractElementsIntoRecords = findFName <+> findGName <+> findOrg <+> findURL where findX c = deep (hasName "span" >>> hasAttrValue "class" (== c)) >>> getChildren >>> getText findFName = findX "family-name" >>> arr Just findGName = findX "given-name" >>> arr Just findOrg = (findX "org" >>> arr Just) `withDefault` Nothing findURL = (deep (hasName "a" >>> hasAttrValue "class" (== "url)) >>> getAttrValue "href" >>> arr Just) `withDefault` Nothing and use the following at an appropriate place: composeHCard (Just fn:Just gn:morg:murl:xs) = (HCard fn gn morg murl):(compose xs) composeHCard _ = [] There's several other possibilities for dealing with bad data and simplifications you could do of course. Daniel