hxt and pickler combinations

Hello,

I'm using HXT to write a Citation Style Language (http://xbiblio.sourceforge.net) implementation in Haskell, and I'm trying to use the hxt pickler library to parse XML data contained in elements that can be interleaved, that is to say, elements that can appear in any order within other elements.

For instance:

    <data>
      <string>ciao</string>
      <int>2</int>
    </data>

or

    <data>
      <int>2</int>
      <string>ciao</string>
    </data>

are both permitted.

I have not been able to write picklers that parse this kind of data. As far as I can tell, order-independent parsing does not work for interleaved elements, but it does work for attributes.

To make myself hopefully clearer, I have included some code below. Suppose we have a data structure like:

    data Term = T Int String deriving ( Show )

If the XML encoding does not respect the ordering (first the Int and then the String), the picklers seem to fail. But if I use attributes to store the values, this doesn't happen.

In other words, the xp1 pickler (taken from the example below) will fail with an XML document such as:

    <data>
      <string>ciao</string>
      <int>2</int>
    </data>

since it requires the 'int' element to appear before the 'string' element.

To test this behaviour, run the code below and see that:
- test doc1 xp1 will fail
- test doc2 xp1 will succeed
while:
- test doc3 xp2 will succeed
- test doc4 xp2 will succeed

What am I getting wrong? Am I just choosing the wrong combinator, or am I misunderstanding something more fundamental?

TIA.

Andrea

The code:

    import Text.XML.HXT.Arrow

    -- Parse the string t as XML and unpickle it with xp, printing the result.
    test :: String -> PU Term -> IO ()
    test t xp = do
      p <- runX ( constA t >>> xread >>> xunpickleVal xp )
      putStrLn (show p)

    data Term = T Int String deriving ( Show )

    xp1, xp2 :: PU Term
    -- xp1 stores the two values in child elements.
    xp1 = xpElem "data" $
          xpWrap (uncurry T, \(T i s) -> (i, s)) $
          xpPair (xpElem "int"    xpickle)
                 (xpElem "string" xpText )

    -- xp2 stores the two values in attributes.
    xp2 = xpElem "data" $
          xpWrap (uncurry T, \(T i s) -> (i, s)) $
          xpPair (xpAttr "int"    xpickle)
                 (xpAttr "string" xpText )

    doc1, doc2, doc3, doc4 :: String
    doc1 = "<data><string>ciao</string><int>2</int></data>"
    doc2 = "<data><int>2</int><string>ciao</string></data>"
    doc3 = ""
    doc4 = ""

On Thu, Jun 26, 2008 at 04:11:58PM +0200, Andrea Rossato wrote:
> Hello,
> I'm using HXT for writing a Citation Style Language (http://xbiblio.sourceforge.net) implementation in Haskell and I'm trying to use the hxt pickler library to parse XML data contained in elements that can be interleaved, that is to say, elements that can appear in any order within other elements.
> For instance:
>   <data> <string>ciao</string> <int>2</int> </data>
> or
>   <data> <int>2</int> <string>ciao</string> </data>
> are both permitted.
> I'm not able to write picklers able to parse such kind of data. I indeed noticed that this is not possible with interleaved elements, but it is possible with attributes.
Thanks to a suggestion from Uwe, I came up with this solution. (I contacted the HXT authors too, since I thought this might be unintended behaviour. It is in fact intended, in order to conform to standard DTD validation.) I'm leaving it here too, for the archives.

This is a pickler that searches the contents for the element and matches it without regard to the order of the elements:

    import Data.List  (elemIndex)
    import Data.Maybe (fromMaybe)
    -- XN, PU, St, emptySt, addCont and scElem come from the HXT pickler internals.

    xpElem' :: String -> PU a -> PU a
    xpElem' name pa
        = PU { appPickle   = ( \ (a, st) ->
                               let st' = appPickle pa (a, emptySt)
                               in addCont (XN.mkElement (mkName name) (attributes st') (contents st')) st )
             , appUnPickle = \ st -> fromMaybe (Nothing, st) (unpickleElement st)
             , theSchema   = scElem name (theSchema pa)
             }
        where
          -- Look the element up by name anywhere in the contents,
          -- unpickle it, and remove it from the remaining contents.
          unpickleElement st
              = do let t = contents st
                   n <- mapM XN.getElemName t
                   case elemIndex name (map qualifiedName n) of
                     Nothing -> fail "element name does not match"
                     Just i  -> do let cs = XN.getChildren (t !! i)
                                   al <- XN.getAttrl (t !! i)
                                   res <- fst . appUnPickle pa $ St {attributes = al, contents = cs}
                                   return (Just res, st {contents = take i t ++ drop (i + 1) t})

Andrea
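For readers who want to see the idea apart from the HXT internals, here is a minimal sketch of the same search-and-remove step, using only base. It is a toy model and not HXT's API: the names `Child`, `takeElem` and `readTerm` are hypothetical, and a child element is simplified to a (name, text) pair.

```haskell
import Data.List (findIndex)

-- Toy stand-in for an XML child element: (element name, text content).
-- This is an assumption made for illustration, not an HXT type.
type Child = (String, String)

-- Find the first child called `name`, wherever it sits in the list, and
-- return its content together with the remaining children (match removed).
takeElem :: String -> [Child] -> Maybe (String, [Child])
takeElem name cs = do
  i <- findIndex ((== name) . fst) cs
  let (before, rest) = splitAt i cs
  case rest of
    (_, content) : after -> Just (content, before ++ after)
    []                   -> Nothing

data Term = T Int String deriving (Show, Eq)

-- Read a Term from the children of <data>, accepting either order.
readTerm :: [Child] -> Maybe Term
readTerm cs0 = do
  (i, cs1) <- takeElem "int" cs0
  (s, _)   <- takeElem "string" cs1
  n <- case reads i of
         [(v, "")] -> Just v   -- the whole string must parse as an Int
         _         -> Nothing
  return (T n s)

main :: IO ()
main = do
  print (readTerm [("string", "ciao"), ("int", "2")])
  print (readTerm [("int", "2"), ("string", "ciao")])
```

Both calls in `main` yield the same Term, because each lookup is by name rather than by position, and the matched child is dropped before the next lookup.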