
Hello, I'm trying to transform a fragment of XML using HXT, specifically I'm trying to strip out any disallowed content while retaining as much as possible. For example I might be given "<bar><evil>blah</evil></bar>". I want to get rid of the 'evil' element but retain it's content resulting in "<bar>blah</bar>", rather than just "<bar></bar>". My basic theory was to do a bottom up traversal replacing any invalid elements with their children. Unfortunately I've got myself thoroughly confused. The code below shows what I've tried so far (except the many random permutations of collapseInvalids I've tried!). I think one problem is that validElements is choosing based on the parent element but processing will just be given the child. Also I think some inputs will require multiple passes over the children during the traversal, eg given "...<baz>s6<bar>s7</bar></baz>..." I'd like "...<baz>s6s7</baz>..." but when processing the bar it is valid, it's only once you're processing the baz you see that you need to pull the s7 up. I'd really appreciate any advice, and I'm not wedded to the current code so feel free to throw it all out! :) Thanks P. ps: Is it possible to render an XmlTree as String without invoking IO? I've just been doing runLA (hread >>> collapseInvalids) frag1 and looking at the nodes. module Test where import Text.XML.HXT.Arrow --collapseInvalids = processBottomUp (processChildren ((getChildren `whenNot` validElements) `orElse` this)) -- <foo> can contain <foo>, <bar> or <baz>, not text validInFoo :: (ArrowXml a) => a XmlTree XmlTree validInFoo = foldr1 orElse ( map hasName ["foo", "bar", "baz"] ) -- <bar> can contain <foo>, <baz> or text, not <foo> validInBar :: (ArrowXml a) => a XmlTree XmlTree validInBar = foldr1 orElse ( isText : map hasName ["foo", "baz"] ) -- <baz> can only contain text validInBaz :: (ArrowXml a) => a XmlTree XmlTree validInBaz = foldr1 orElse ( isText : []) validElements :: (ArrowXml a) => a XmlTree XmlTree validElements = choiceA ( hasName "foo" :-> validInFoo : hasName "bar" :-> validInBar : hasName "baz" :-> validInBaz : isText :-> this : []) frag1 = concat ( "s1" : "<foo>" : "<bar>" : "s2" : "<evil>" : "s3" : "<baz>" : "s4" : "</baz>" : "s5" : "</evil>" : "s6" : "</bar>" : "</foo>" : "s7" : []) result1 = concat ( "s1" : "<foo>" : "<bar>" : "s2s3" : "<baz>" : "s4" : "</baz>" : "s5s6" : "</bar>" : "</foo>" : "s7" : []) frag2 = concat ( "s1" : "<foo>" : "s2" : "<evil>" : "s3" : "</evil>" : "<bar>" : "s4" : "<evil>" : "s5" : "<baz>" : "s6" : "<bar>" : "s7" : "</bar>" : "</baz>" : "s8" : "<bar>" : "s9" : "</bar>" : "</evil>" : "s10" : "</bar>" : "</foo>" : "s11" : []) result2 = concat ( "s1" : "<foo>" : "<bar>" : "s4" : "<evil>" : "s5" : "<baz>" : "s6" : "<bar>" : "s7" : "</bar>" : "</baz>" : "s8" : "<bar>" : "s9" : "</bar>" : "</evil>" : "s10" : "</bar>" : "</foo>" : "s11" : [])