HaXml revisions (continued)

17 Jun 2004

Further to my previous message [1], I've created a new snapshot [2] of my 
developments to the HaXml package.  Browseable source code is at [3].  It 
relies on a version of the Network functions, available at [4].  The main 
feature of this release is the addition of a "filter" to perform general 
entity substitution in the parsed XML, along with some fixes to the 
parameter entity substitution.

[1] http://www.haskell.org//pipermail/libraries/2004-June/002243.html
[2] http://www.ninebynine.org/Software/HaskellUtils/20040617-Haxml-1.12.zip
[3] http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/
[4] http://www.ninebynine.org/Software/HaskellUtils/20040609-Network.zip

The additions have necessitated some further reorganization of the handling 
of entity definitions, in order that some of the more subtle examples noted 
in the XML specification work as documented.
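
As a rough illustration of what a general entity substitution filter 
involves, here is a minimal sketch.  It is not the code in the snapshot: 
the Content type is a cut-down stand-in for the HaXml content model, and 
CRef, EntityTable and substEntities are hypothetical names chosen for this 
example.  It assumes a filter maps one content item to a list of items, 
and that entity definitions are held in a simple lookup table.

```haskell
-- Cut-down stand-in for an XML content model (not HaXml's own definition).
data Content
    = CElem String [Content]   -- element name and children
    | CText String             -- character data
    | CRef String              -- general entity reference, e.g. &name;
    deriving (Show, Eq)

-- Assumed filter shape: one content item in, zero or more out.
type CFilter = Content -> [Content]

-- Hypothetical entity table: entity name and its replacement content.
type EntityTable = [(String, [Content])]

-- Replace each known entity reference by its replacement content,
-- expanding references nested inside replacements as well.
substEntities :: EntityTable -> CFilter
substEntities table = go []
  where
    go seen (CElem name kids) = [CElem name (concatMap (go seen) kids)]
    go seen ref@(CRef name)
        -- A reference already being expanded is left alone so that the
        -- sketch terminates; a real processor would report an error here.
        | name `elem` seen = [ref]
        | otherwise =
            case lookup name table of
                Just body -> concatMap (go (name : seen)) body
                Nothing   -> [ref]   -- undefined entity: leave as-is
    go _ other = [other]
```

The "seen" list is one simple way to stop a self-referencing entity from 
expanding forever; the subtler cases in the XML specification (such as the 
interaction with parameter entities) are not covered by this sketch.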

The code still needs some tidying up, but it works on all the test cases 
I've assembled to date.

Next steps are:
- create a test suite for XML validation, based on the W3C conformance test 
suite, and make sure the validation functions (still) work.
- tidy up the code, in particular with a view to pruning out deadwood.
- create a filter to perform namespace processing (that which got me 
started on all this in the first place).

...

There's one change I've made which I'm not entirely happy with:  in order 
to be able to collect diagnostic information from XML filter (CFilter) 
processing, I've added an option CErr to the XML content model.  A cleaner 
solution would, I think, be to extend the return type of a CFilter value, 
but this would be a significant change to the package interface in an area 
that I assume is particularly used by applications.
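
To make the trade-off concrete, here is a small sketch of the two 
approaches side by side.  The types are simplified stand-ins rather than 
the package's actual definitions, and CFilterD, checkText, checkTextD and 
diagnostics are hypothetical names invented for this comparison.

```haskell
-- Simplified stand-in for the content model, with an in-band error case.
data Content
    = CElem String [Content]
    | CText String
    | CErr String              -- diagnostic carried among the results
    deriving (Show, Eq)

-- Approach 1: the filter type is unchanged; diagnostics travel in-band.
type CFilter = Content -> [Content]

-- Approach 2 (hypothetical): the return type carries diagnostics separately.
type CFilterD = Content -> ([String], [Content])

-- Example filter that objects to empty text nodes, written both ways.
checkText :: CFilter
checkText (CText "") = [CErr "empty text node"]
checkText c          = [c]

checkTextD :: CFilterD
checkTextD (CText "") = (["empty text node"], [])
checkTextD c          = ([], [c])

-- With approach 1 the caller has to separate diagnostics from content.
diagnostics :: [Content] -> ([String], [Content])
diagnostics cs = ([msg | CErr msg <- cs], filter (not . isErr) cs)
  where
    isErr (CErr _) = True
    isErr _        = False
```

The first form leaves every existing filter and combinator type untouched, 
at the cost of every consumer having to know about CErr.  The second keeps 
the content model clean, but every existing filter would need to be adapted 
to the new type.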

...

I've created separate code paths for internal (non-IO) and external 
(IO-using) entity processing, but currently external entities are read 
using unsafePerformIO.  I've thought a little about having the "external" 
interfaces return an IO value, to avoid using unsafePerformIO, but I can't 
currently see how to do that without sacrificing the very high degree of 
code sharing that is currently achieved.

As I write this, I think I've just realized how to do this.  If all the 
relevant shared code runs in some unspecified monad (i.e. is polymorphic in 
a monadic return type), then the non-IO code can use an identity monad and 
simply pick out the resulting value as a pure value, but code which depends 
on IO will be forced to return an IO value (or use unsafe...).  Does this 
sound plausible?
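
A minimal sketch of that idea, under stated assumptions: EntityRef, 
expandWith, expandPure and expandIO are hypothetical names for this 
example, not functions in the snapshot, and "fetching" an external entity 
is reduced to reading a local file.  The shared code is written once, 
polymorphic in the monad, and takes the fetch action as a parameter.

```haskell
import Data.Functor.Identity (runIdentity)

-- Hypothetical stand-in for an entity whose text is to be expanded.
data EntityRef
    = Internal String      -- replacement text already known
    | External FilePath    -- replacement text must be fetched
    deriving (Show, Eq)

-- Shared code: works in any monad, given a way to fetch external entities.
expandWith :: Monad m => (FilePath -> m String) -> [EntityRef] -> m String
expandWith fetch refs = fmap concat (mapM expand refs)
  where
    expand (Internal txt) = return txt
    expand (External loc) = fetch loc

-- Non-IO path: run in the identity monad and pick out a pure value.
-- External entities are left as an unexpanded reference in this sketch.
expandPure :: [EntityRef] -> String
expandPure = runIdentity . expandWith (\loc -> return ("&" ++ loc ++ ";"))

-- IO path: the same shared code, but the result is forced into IO.
expandIO :: [EntityRef] -> IO String
expandIO = expandWith readFile
```

For example, expandPure [Internal "a", External "x", Internal "b"] 
evaluates to "a&x;b" without any IO, while expandIO on the same list 
would attempt to read the file "x".  The type of expandIO makes the 
dependence on IO visible, with no need for unsafePerformIO.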

...

#g
--

------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact