Capturing the parent element as I parse XML using parsec

Hi,
With the help of the cafe I've been able to write up the xml parser using parsec - https://github.com/ckkashyap/really-simple-xml-parser/blob/master/RSXP.hs
I am struggling with an idea though - How can I capture the parent element of each element as I parse? Is it possible or would I have to do a second pass to do the fixup?
Regards,
Kashyap

On Sun, Jul 29, 2012 at 1:21 AM, C K Kashyap wrote:
Hi,
With the help of the cafe I've been able to write up the xml parser using parsec - https://github.com/ckkashyap/really-simple-xml-parser/blob/master/RSXP.hs
I am struggling with an idea though - How can I capture the parent element of each element as I parse? Is it possible or would I have to do a second pass to do the fixup?
What are you trying to do? Maybe you could give an example of what you'd like to produce?
Generally speaking, having tree elements in a Haskell datatype point to both their parent and their children is asking for trouble: you can't change any part of the tree without rebuilding the entire tree (otherwise your parent pointers point to the parent in the old version of the tree).
If you're interested in complex traversals and transformations of XML trees, I like the cursor API here: http://hackage.haskell.org/packages/archive/xml/1.3.12/doc/html/Text-XML-Lig...
HaXml is also popular for whole-tree queries and transformations.
Antoine
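The cursor idea can be sketched in a few lines. This is not the API of the linked package, only a minimal zipper under stated assumptions: the `Node` type and the names `Crumb`, `Cursor`, `fromRoot`, `firstChild`, `right`, `parent`, `setFocus` and `toTree` are all made up for illustration. The cursor keeps the path back to the root next to the focused node, so the parent is always reachable without the tree storing parent pointers.

```haskell
-- A small tree type assumed for illustration (not the type from RSXP.hs).
data Node
  = Elem String [Node]
  | Txt String
  deriving (Eq, Show)

-- One step of the path back to the root: the parent's tag, the siblings
-- to the left of the focus (nearest first) and the siblings to the right.
data Crumb = Crumb String [Node] [Node]
  deriving (Eq, Show)

-- A cursor is the focused node together with the path to the root.
data Cursor = Cursor { focus :: Node, crumbs :: [Crumb] }
  deriving (Eq, Show)

fromRoot :: Node -> Cursor
fromRoot n = Cursor n []

-- Move to the first child, if there is one.
firstChild :: Cursor -> Maybe Cursor
firstChild (Cursor (Elem name (k : ks)) cs) =
  Just (Cursor k (Crumb name [] ks : cs))
firstChild _ = Nothing

-- Move to the next sibling on the right, if there is one.
right :: Cursor -> Maybe Cursor
right (Cursor n (Crumb name ls (r : rs) : cs)) =
  Just (Cursor r (Crumb name (n : ls) rs : cs))
right _ = Nothing

-- Move to the parent, rebuilding it from the crumb and the current focus.
parent :: Cursor -> Maybe Cursor
parent (Cursor n (Crumb name ls rs : cs)) =
  Just (Cursor (Elem name (reverse ls ++ (n : rs))) cs)
parent _ = Nothing

-- Replace the focused node; untouched subtrees are shared, not copied.
setFocus :: Node -> Cursor -> Cursor
setFocus n c = c { focus = n }

-- Walk back up to the root and return the whole (possibly edited) tree.
toTree :: Cursor -> Node
toTree c = maybe (focus c) toTree (parent c)
```

Loaded into GHCi, `fmap (toTree . setFocus (Txt "z")) (firstChild (fromRoot t) >>= right)` replaces the second child of `t` and rebuilds only the nodes on the path to the root.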

On 29/07/2012, at 6:21 PM, C K Kashyap wrote:
I am struggling with an idea though - How can I capture the parent element of each element as I parse? Is it possible or would I have to do a second pass to do the fixup?
Why do you *want* the parent element of each element?
One of the insanely horrible aspects of the Document Object Model is that every element is nailed in place by pointers everywhere, with the result that you cannot share elements, and even moving an element is painful. I still do a fair bit of SGML/XML processing in C using a "Document Value Model" library that uses hash consing, and it's so much easier it isn't funny.
While you are traversing a document tree it is useful to keep track of the path from the root. Given
    data XML
       = Element String [(String,String)] [XML]
       | Text String
you do something like
    traverse :: ([XML] -> [a] -> a) -> ([XML] -> String -> a) -> XML -> a
    traverse f g xml = loop [] xml
      where loop ancs (Text s) = g ancs s
            loop ancs e@(Element _ _ ks) = f ancs' (map (loop ancs') ks)
               where ancs' = e:ancs
(This is yet another area where Haskell's non-strictness pays off.)
If you do that, then you have the parent information available without it being stored in the tree.
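A possible use of this traversal, as a sketch rather than anything from the original message: pairing every text node with the name of its parent element. The function is renamed to `walkXML` here because newer versions of the Prelude export their own `traverse`; `parentName`, `textsWithParent` and `sample` are likewise names invented for the example. Note that the text callback sees the parent at the head of the ancestor list, while the element callback sees the element itself at the head and its parent second.

```haskell
-- Document type from the message above, with Eq/Show added for testing.
data XML
  = Element String [(String, String)] [XML]
  | Text String
  deriving (Eq, Show)

-- Same traversal as in the message, renamed to `walkXML` (a name chosen
-- here) so that it does not clash with the Prelude's `traverse`.
walkXML :: ([XML] -> [a] -> a) -> ([XML] -> String -> a) -> XML -> a
walkXML f g = loop []
  where
    loop ancs (Text s) = g ancs s
    loop ancs e@(Element _ _ ks) = f ancs' (map (loop ancs') ks)
      where ancs' = e : ancs

-- Hypothetical helper: tag name of the element at the head of the
-- ancestor list, if any.
parentName :: [XML] -> Maybe String
parentName (Element n _ _ : _) = Just n
parentName _ = Nothing

-- Pair every text node with the name of its parent element.
textsWithParent :: XML -> [(Maybe String, String)]
textsWithParent =
  walkXML (\_ results -> concat results)      -- elements: merge child results
          (\ancs s -> [(parentName ancs, s)]) -- text: head of ancs is the parent

-- A small document used only for illustration.
sample :: XML
sample =
  Element "doc" []
    [ Element "title" [] [Text "Hello"]
    , Element "body" [] [Text "one", Element "b" [] [Text "two"]]
    ]
```

In GHCi, `textsWithParent sample` evaluates to `[(Just "title","Hello"),(Just "body","one"),(Just "b","two")]`, with no parent pointer stored anywhere in the tree.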

Thank you Richard and Antoine.
I think I see the pointlessness of my ask.
Regards,
Kashyap
participants (3)
- Antoine Latter
- C K Kashyap
- Richard O'Keefe