
Hi all,
I'm new to Haskell and HaXml and I'm playing around with the latter to clean some (well-formed) 'legacy' html. This works fine except for the following cases. Some of the elements to be cleaned are:
<font size="4"><i>Hello World</i></font>
<i><font size="4">Hello World</font></i>
This should become:
<h1 class="subtitle">Hello World</h1>
I can build filters to find <font>, <font size="4"> or <i>, but the combination does not seem to work. From what I could gather from the documentation, it should be something like:
foldXml (txt ?> keep :> (attrval("size",AttValue[Left "4"]) `o` tag "font") /> tag "i" ?> replaceTag "h1" :> children)
This doesn't work. I am clearly missing something elementary here, so any hints are welcome.
Cheers,
K.

Koen.Roelandt@mineco.fgov.be writes:
> I'm new to Haskell and HaXml and I'm playing around with the latter to clean some (well-formed) 'legacy' html. This works fine except for the following cases. Some of the elements to be cleaned are:
> <font size="4"><i>Hello World</i></font>
> <i><font size="4">Hello World</font></i>
> This should become:
> <h1 class="subtitle">Hello World</h1>
> From what I could gather from the documentation, it should be something like:
> foldXml (txt ?> keep :> (attrval("size",AttValue[Left "4"]) `o` tag "font") /> tag "i" ?> replaceTag "h1" :> children)
Is the bracketing correct? I can't remember the precedence of the operators offhand, but perhaps it should be
foldXml (txt ?> keep :>
    (((attrval("size",AttValue[Left "4"]) `o` tag "font")
        /> tag "i") ?> replaceTag "h1" :> children))
Regards,
Malcolm
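One way to check the bracketing question without HaXml installed is a small stand-alone model of the combinators. The sketch below is not HaXml itself: the Content type, the two-argument attrval, and the names sample, unbracketed and bracketed are invented for illustration, and the fixity declarations are an assumption about what HaXml's Combinators module declares. Under those assumptions, the unbracketed filter and the explicitly bracketed one parse to the same expression.

```haskell
module Main where

-- Simplified stand-in for HaXml's content type (hypothetical, not the real one).
data Content = Elem String [(String, String)] [Content]
             | Text String
             deriving (Eq, Show)

type CFilter = Content -> [Content]

-- Assumed to mirror the fixities declared in HaXml's Combinators module.
infixr 5 `o`
infixl 5 />
infixr 3 ?>, :>

data ThenElse a = a :> a

keep, children, txt :: CFilter
keep c = [c]
children (Elem _ _ cs) = cs
children _             = []
txt c@(Text _) = [c]
txt _          = []

tag :: String -> CFilter
tag n c@(Elem m _ _) | n == m = [c]
tag _ _ = []

-- Simplified: attribute values are plain strings here, not AttValue.
attrval :: (String, String) -> CFilter
attrval kv c@(Elem _ as _) | kv `elem` as = [c]
attrval _ _ = []

replaceTag :: String -> CFilter
replaceTag n (Elem _ _ cs) = [Elem n [] cs]
replaceTag _ _             = []

-- Composition: apply g first, then f to each result.
o :: CFilter -> CFilter -> CFilter
f `o` g = concatMap f . g

-- "Interior": keep the children of f's results that pass g.
(/>) :: CFilter -> CFilter -> CFilter
f /> g = g `o` children `o` f

-- Conditional: if p yields anything, use the first branch, else the second.
(?>) :: CFilter -> ThenElse CFilter -> CFilter
p ?> (f :> g) = \c -> if null (p c) then g c else f c

-- The filter as originally posted (without foldXml).
unbracketed :: CFilter
unbracketed = txt ?> keep :> (attrval ("size", "4") `o` tag "font") /> tag "i" ?> replaceTag "h1" :> children

-- The explicitly bracketed variant.
bracketed :: CFilter
bracketed = txt ?> keep :> (((attrval ("size", "4") `o` tag "font") /> tag "i") ?> replaceTag "h1" :> children)

sample :: Content
sample = Elem "font" [("size", "4")] [Elem "i" [] [Text "Hello World"]]

main :: IO ()
main = do
  print (unbracketed sample)
  print (unbracketed sample == bracketed sample)
```

Because `/>` sits at level 5 and `?>` / `:>` at level 3 in this model, the extra parentheses change nothing. If the real library's fixities differ, the comparison would have to be repeated against HaXml directly.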

>> I'm new to Haskell and HaXml and I'm playing around with the latter to clean some (well-formed) 'legacy' html. This works fine except for the following cases. Some of the elements to be cleaned are:
>> <font size="4"><i>Hello World</i></font>
>> <i><font size="4">Hello World</font></i>
>> This should become:
>> <h1 class="subtitle">Hello World</h1>
>> From what I could gather from the documentation, it should be something like:
>> foldXml (txt ?> keep :> (attrval("size",AttValue[Left "4"]) `o` tag "font") /> tag "i" ?> replaceTag "h1" :> children)
> Is the bracketing correct? I can't remember the precedence of the
> operators offhand, but perhaps it should be
> foldXml (txt ?> keep :>
>     (((attrval("size",AttValue[Left "4"]) `o` tag "font")
>         /> tag "i") ?> replaceTag "h1" :> children))
Yes, the bracketing is correct, since the following code:
foldXml (txt ?> keep :>
    fontSize4 /> tag "em" ?> mkSubtitle :>
    children)
fontSize4 = (attrval("size",AttValue[Left "4"]) `o` tag "font")
mkSubtitle = mkElemAttr "h1" [("class", ("subtitle"!))]
    [children]
now transforms
<font size="4"><em>Hello World</em></font>
into
<h1 class="subtitle"><em>
participants (2)
- Koen.Roelandt@mineco.fgov.be
- Malcolm Wallace