
Further to my previous message [1], I've created a new snapshot [2] of
my developments to the HaXml package. Browseable source code is at [3].
It relies on a version of the Network functions, available at [4]. The
main feature of this release is the addition of a "filter" to perform
general entity substitution in the parsed XML, plus some fixes to the
parameter entity substitution.

[1] http://www.haskell.org//pipermail/libraries/2004-June/002243.html
[2] http://www.ninebynine.org/Software/HaskellUtils/20040617-Haxml-1.12.zip
[3] http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/
[4] http://www.ninebynine.org/Software/HaskellUtils/20040609-Network.zip

The additions have necessitated some further reorganization of the
handling of entity definitions, so that some of the more subtle
examples noted in the XML specification work as documented. The code
still needs some tidying up, but it works on all the test cases I've
assembled to date.

Next steps are:
- create a test suite for XML validation, based on the W3C conformance
  test suite, and make sure the validation functions (still) work.
- tidy up the code, in particular with a view to pruning out deadwood.
- create a filter to perform namespace processing (which is what got me
  started on all this in the first place).

...

There's one change I've made which I'm not entirely happy with: in
order to be able to collect diagnostic information from XML filter
(CFilter) processing, I've added an option CErr to the XML content
model. A cleaner solution would, I think, be to extend the return type
of a CFilter value, but this would be a significant change to the
package interface in an area that I assume is particularly heavily used
by applications.

...

I've created separate code paths for internal (non-IO) and external
(IO-using) entity processing, but currently external entities are read
using unsafePerformIO.
I've thought a little about trying to have the "external" interfaces
return an IO value, and so avoid using unsafePerformIO, but I couldn't
see how to do that without sacrificing the very high degree of code
sharing that is currently achieved.

As I write this, I think I've just realized how to do this. If all the
relevant shared code runs in some unspecified monad (i.e. is
polymorphic in a monadic return type), then the non-IO code can use an
identity monad and simply pick out the result as a pure value, but code
which depends on IO will be forced to return an IO value (or use
unsafe...). Does this sound plausible?

...

#g
--

------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact
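
A minimal sketch of the monad-polymorphic idea floated above, under stated assumptions. It is not taken from the HaXml snapshot: EntityName, Resolver, substEntities, internalResolver, externalResolver, substInternal and substExternal are all hypothetical names, and entity handling is reduced to replacing "&name;" in a plain string, with no recursive expansion. The point is only that one shared function, written against an arbitrary monad, can be run purely with the identity monad or in IO, without unsafePerformIO.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical, simplified stand-ins; none of these are HaXml names.
type EntityName = String

-- How to obtain the replacement text for an entity, in some monad m.
type Resolver m = EntityName -> m (Maybe String)

-- Shared code, written once for any monad: replace each "&name;" with the
-- text the resolver supplies. Unknown entities and stray '&' are kept as-is.
substEntities :: Monad m => Resolver m -> String -> m String
substEntities resolve = go
  where
    go [] = return []
    go ('&' : rest) =
      case break (== ';') rest of
        (name, ';' : after) -> do
          found <- resolve name
          remainder <- go after
          return (maybe ('&' : name ++ ";") id found ++ remainder)
        _ -> fmap ('&' :) (go rest) -- no closing ';': keep the '&' literally
    go (c : cs) = fmap (c :) (go cs)

-- Internal entities: a pure table lookup, run in the identity monad.
internalResolver :: [(EntityName, String)] -> Resolver Identity
internalResolver table name = return (lookup name table)

-- Pure interface: the identity monad is unwrapped to give a plain String.
substInternal :: [(EntityName, String)] -> String -> String
substInternal table = runIdentity . substEntities (internalResolver table)

-- External entities (assumption: each name maps to a local file path).
externalResolver :: [(EntityName, FilePath)] -> Resolver IO
externalResolver table name =
  case lookup name table of
    Nothing -> return Nothing
    Just path -> fmap Just (readFile path)

-- IO interface: the type forces callers to stay in IO.
substExternal :: [(EntityName, FilePath)] -> String -> IO String
substExternal table = substEntities (externalResolver table)
```

For example, substInternal [("who", "world")] "hello &who;!" evaluates to "hello world!" as a pure value, while substExternal can only be used from within IO. Both share substEntities unchanged, which is the code-sharing property the message is after.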