
On Friday, Oct 31, 2003, at 21:06 Europe/Amsterdam, Mark Carroll wrote:
Ralf Hinze and Simon Peyton-Jones wrote an interesting paper on generic programming and derivable type classes. It looked like maybe programmers would be able to write their own "deriving xml" stuff and whatever, which looked great because, if there's not already one out there, I'd love to derive some read/show analogue automatically for data in some encoding that's very efficient to write and parse (i.e. not XML (-:).
Johan Jeuring and I submitted a paper [1] to PLAN-X concerning this topic. In an earlier paper [2] we described a Haskell-XML data binding, that is, a type-safe translation scheme from (a sizeable subset of) XML Schema to Haskell. In [1] we describe a Generic Haskell program which automatically infers certain coercions between the translation of an XML Schema type, which is very large and ugly, and a user-defined Haskell datatype capable of representing values of the Schema type. The idea is to infer the function that transforms values of the ugly type picked by the translator to values of a traditional, Haskellish datatype picked by the user.

For example, our translator takes the Schema type doc (representing a bibliographic entry):

  <element name="doc" type="docType"/>
  <complexType name="docType">
    <sequence>
      <element ref="author" minOccurs="0" maxOccurs="unbounded"/>
      <element ref="title"/>
      <element ref="pubDate" minOccurs="0"/>
    </sequence>
    <attribute name="key" type="string"/>
  </complexType>
  <element name="author" type="string"/>
  <element name="title" type="string"/>
  <complexType name="pubDateType">
    <sequence>
      <element ref="year"/>
      <element ref="month"/>
    </sequence>
  </complexType>
  <element name="pubDate" type="pubDateType"/>
  <element name="year" type="int"/>
  <element name="month" type="int"/>

to a certain ugly datatype X. [2] defines generic functions:

  parse{|t|}   :: String -> Maybe t
  unparse{|t|} :: t -> Maybe String

(Well, we only describe parse, but unparse is very easy...)

Now say the user defines the following datatype in some module:
data Doc = Doc { key :: String, authors :: [String], title :: String, pubDate :: Maybe PubDate }
data PubDate = PubDate { year :: Integer, month :: Integer }
This is, IMO, the `ideal' translation of the Schema type. Now, although X /= Doc, there is in fact a `canonical' injection X -> Doc, determined by the types alone, which happens to do what one wants. In [1] we define generic functions:

  reduce{|t|} :: t -> Univ
  expand{|t|} :: Univ -> t

where Univ is a universal type which you don't need to know anything about. The program
expand{|T|} . reduce{|S|} :: S -> T
denotes the canonical function, which is inferred generically by inspecting the types S and T, relieving the user of the burden of writing it out themselves.

So now, say you want to write a GH program which reads in a document conforming to the Schema type `doc' from standard input, deletes all authors named "Dubya", and writes the result to standard output. Here it is:

< main    = interact work
< toE_doc = unparse{|E_doc|} . expand{|E_doc|} .
<           reduce{|Doc|}
< toDoc   = expand{|Doc|} . reduce{|E_doc|} .
<           parse{|E_doc|}
< work    = toE_doc .
<           (\d -> d { authors =
<                      filter (/= "Dubya") (authors d) }) .
<           toDoc

And that's it. All the messy stuff is inferred by GH and the translator.

OK, now the reason that I prepended this message with "FWIW": although we have an implementation of the translator and coercion inferencer, they're only prototypes and far from usable in practice. In fact, the translator doesn't read XML at all but rather operates on XML abstract syntax (a tree datatype).

Frankly, I don't think I will take the time to turn the prototype into anything releasable, but I wouldn't mind turning over the sources (such as they are :) to someone who has a serious interest. Take a look at the papers and see if it appeals to you.

Regards,
Frank

[1] @misc{AJ03,
      author = {Frank Atanassow and Johan Jeuring},
      title  = {Type isomorphisms simplify {XML} programming},
      year   = 2003,
      note   = {Submitted to PLAN-X 2004},
      url    = {http://www.cs.uu.nl/~franka/pub},
      urlpdf = {http://www.cs.uu.nl/~franka/planx04.pdf},
      pubcat = {journal},
    }

[2] @TechReport{ACJ03c,
      author      = {Atanassow, Frank and Clarke, Dave and Jeuring, Johan},
      title       = {Scripting {XML} with {G}eneric {H}askell},
      institution = {Utrecht University},
      year        = {2003},
      url         = {ftp://ftp.cs.uu.nl/pub/RUU/CS/techreps/CS-2003/2003-023.pdf},
      number      = "UU-CS-2003"
    }
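
P.S. For anyone without a GH compiler at hand, here is a rough plain-Haskell sketch of the reduce/expand idea described above. It is not the implementation from the papers. The Atom and Univ types, the Rep class, the nested-tuple type Ugly (standing in for whatever the translator really produces) and the sample data are all assumptions made up for illustration, and the instances that GH would infer from the types are written out by hand here.

```haskell
-- Hypothetical universal representation (NOT the Univ of the paper):
-- a value is reduced to a flat sequence of atoms, so the way products
-- are nested is forgotten.
data Atom
  = AStr String
  | AInt Integer
  | AList [[Atom]]          -- one flat sequence per list element
  | AOpt (Maybe [Atom])
  deriving (Eq, Show)

type Univ = [Atom]

class Rep t where
  reduce  :: t -> Univ
  -- Consume a prefix of the sequence and return the unused rest.
  expandP :: Univ -> Maybe (t, Univ)
  -- List hooks, so that String can be a single atom (the showList trick).
  reduceList :: [t] -> Univ
  reduceList xs = [AList (map reduce xs)]
  expandListP :: Univ -> Maybe ([t], Univ)
  expandListP (AList xss : rest) = do
    xs <- mapM expand xss
    Just (xs, rest)
  expandListP _ = Nothing

-- Expand a whole sequence; fails if any atoms are left over.
expand :: Rep t => Univ -> Maybe t
expand u = case expandP u of
  Just (x, []) -> Just x
  _            -> Nothing

instance Rep Char where
  reduce c = [AStr [c]]
  expandP (AStr [c] : rest) = Just (c, rest)
  expandP _                 = Nothing
  reduceList s = [AStr s]
  expandListP (AStr s : rest) = Just (s, rest)
  expandListP _               = Nothing

instance Rep Integer where
  reduce n = [AInt n]
  expandP (AInt n : rest) = Just (n, rest)
  expandP _               = Nothing

instance Rep a => Rep [a] where
  reduce  = reduceList
  expandP = expandListP

instance Rep a => Rep (Maybe a) where
  reduce m = [AOpt (fmap reduce m)]
  expandP (AOpt Nothing  : rest) = Just (Nothing, rest)
  expandP (AOpt (Just u) : rest) = do
    x <- expand u
    Just (Just x, rest)
  expandP _ = Nothing

-- Pairs are flattened: this is what makes differently nested
-- products reduce to the same sequence.
instance (Rep a, Rep b) => Rep (a, b) where
  reduce (a, b) = reduce a ++ reduce b
  expandP u = do
    (a, u1) <- expandP u
    (b, u2) <- expandP u1
    Just ((a, b), u2)

data PubDate = PubDate { year :: Integer, month :: Integer }
  deriving (Eq, Show)

data Doc = Doc { key :: String, authors :: [String]
               , title :: String, pubDate :: Maybe PubDate }
  deriving (Eq, Show)

-- Hand-written here; in GH these would come from the type structure.
instance Rep PubDate where
  reduce (PubDate y m) = reduce (y, m)
  expandP u = do
    ((y, m), rest) <- expandP u
    Just (PubDate y m, rest)

instance Rep Doc where
  reduce (Doc k auths t p) = reduce (k, (auths, (t, p)))
  expandP u = do
    ((k, (auths, (t, p))), rest) <- expandP u
    Just (Doc k auths t p, rest)

-- Hypothetical stand-in for the translator's ugly type X.
type Ugly = ((String, [String]), (String, Maybe (Integer, Integer)))

toDoc :: Ugly -> Maybe Doc
toDoc = expand . reduce

-- Made-up sample value.
sample :: Ugly
sample = (("k1", ["A. Author", "Dubya"]), ("Some title", Just (2003, 10)))

demo :: Maybe Doc
demo = fmap dropDubya (toDoc sample)
  where dropDubya d = d { authors = filter (/= "Dubya") (authors d) }
```

Loading this into GHCi and evaluating demo should give back a Doc in which only the first author is left. Unlike the GH signature above, expand here returns a Maybe, because in this simplified setting nothing stops you from asking for a target type that does not fit the reduced value.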