HaXml: ampersand in attribute value

HaXml seems to choke on finding an ampersand in an attribute value. Is this normal? Is there any workaround? Cheers, Koen.

Koen.Roelandt@mineco.fgov.be wrote:
HaXml seems to choke on finding an ampersand in an attribute value. Is this normal? Is there any workaround?
Yes, it is expected. An ampersand indicates the start of a reference, e.g. &lt;. If there is no semicolon to indicate the end of the reference, then it is a parse error. The XML specification is quite clear that neither & nor < is valid as a standalone character in an attribute value. Regards, Malcolm
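As a possible workaround on the producing side, a literal ampersand can be written as &amp; before the document reaches the parser. The following is only a minimal sketch in plain Haskell; escapeAttr and renderAttr are hypothetical helper names, not part of HaXml's API, and the code assumes the attribute value is delimited by double quotes.

```haskell
-- Hypothetical helpers (not part of HaXml): escape raw text so that it
-- can be placed inside a double-quoted XML attribute value.
escapeAttr :: String -> String
escapeAttr = concatMap escapeChar
  where
    escapeChar '&' = "&amp;"   -- a bare & would start a reference
    escapeChar '<' = "&lt;"    -- < is not allowed in attribute values
    escapeChar '"' = "&quot;"  -- would otherwise end the attribute value
    escapeChar c   = [c]       -- everything else, including %, is kept

-- Render a single attribute as name="value", escaping the value.
renderAttr :: String -> String -> String
renderAttr name value = name ++ "=\"" ++ escapeAttr value ++ "\""
```

Note that this should be applied to raw text only: a value that already contains references such as &amp; would be escaped a second time.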

But speaking of HaXml bugs, I'm pretty sure HaXml doesn't handle % correctly. It seems to treat % specially everywhere, but I think it is only special inside DTDs. I have many XML files produced by other tools that the HaXml parser fails to process because of this. -- Lennart

Lennart Augustsson wrote:
But speaking of HaXml bugs, I'm pretty sure HaXml doesn't handle % correctly. It seems to treat % specially everywhere, but I think it is only special inside DTDs. I have many XML files produced by other tools that the HaXml parser fails to process because of this.
I believe I fixed at least one bug to do with % characters around version 1.14. But that is the development branch in darcs, not formally released yet. Nevertheless, if you know of such bugs, do report them; even better if you can send a small test case. Regards, Malcolm

Malcolm Wallace wrote:
I believe I fixed at least one bug to do with % characters around version 1.14. But that is the development branch in darcs, not formally released yet. Nevertheless, if you know of such bugs, do report them; even better if you can send a small test case.
Malcolm, did you come across the HaXml test harness I created based on a subset of W3C conformance tests? http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/test/ This covers all the parameter entity problems I fixed some time ago. #g -- Graham Klyne For email: http://www.ninebynine.org/#Contact

Graham Klyne wrote:
Did you come across the HaXml test harness I created based on a subset of W3C conformance tests? http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/test/ This covers all the parameter entity problems I fixed some time ago.
Indeed, and an excellent resource. I have been wondering how to merge it back into my version of HaXml ever since. Regards, Malcolm

Lennart Augustsson wrote:
But speaking of HaXml bugs, I'm pretty sure HaXml doesn't handle % correctly. It seems to treat % specially everywhere, but I think it is only special inside DTDs. I have many XML files produced by other tools that the HaXml parser fails to process because of this.
Indeed. This is an area that I found required a fair amount of work on the version of HaXml I was playing with, some time ago. The change log at the end of: http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/src/Text/XML/HaXm... has some clues to what I had to do. Notably:
[[
-- Revision 1.12 2004/06/04 21:59:13 graham
-- Work-in-progress: creating intermediate filter to handle parameter
-- entity replacement. Separated common features from parse module.
-- Created new module based on simplified use of parsing utilities
-- to detect and substitute PEs. The result is a modified token sequence
-- passed to the main XML parser.
]]
The parameter entity filter is defined by: http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/src/Text/XML/HaXm...
The parameter entity and general entity handling aspect of the code was not pretty, due mainly to the somewhat quirky nature of XML syntax, especially concerning parameter and general entities.
#g -- Graham Klyne For email: http://www.ninebynine.org/#Contact
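To illustrate the idea of a separate parameter-entity substitution pass, here is a much simplified sketch. It is not Graham's filter: it works on plain Strings rather than on a token sequence, it does not rescan replacement text, and it ignores the spec's rules about where PE references may appear. The names PEnv and substPEs are made up for this example.

```haskell
import qualified Data.Map as Map
import Data.Char (isAlphaNum)

-- Hypothetical environment: parameter entity name -> replacement text.
type PEnv = Map.Map String String

-- Replace each %name; reference whose name is defined in the environment.
-- A '%' that is not followed by a known name and ';' is left unchanged.
-- Simplification: replacement text is inserted as-is and not rescanned.
substPEs :: PEnv -> String -> String
substPEs env = go
  where
    go [] = []
    go ('%' : rest)
      | (name, ';' : after) <- span isNameChar rest
      , not (null name)
      , Just replacement <- Map.lookup name env
      = replacement ++ go after
    go (c : rest) = c : go rest

    -- Rough approximation of XML name characters.
    isNameChar c = isAlphaNum c || c `elem` "_.-:"
```

For example, with "inline" bound to "em | strong", the declaration `<!ELEMENT p (#PCDATA | %inline;)*>` becomes `<!ELEMENT p (#PCDATA | em | strong)*>`. Such a pass would only be run over DTD text; in document content a % is an ordinary character and should not be touched, which is the distinction Lennart raises above.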
participants (4)
- Graham Klyne
- Koen.Roelandt@mineco.fgov.be
- Lennart Augustsson
- Malcolm Wallace