SYB and/or HList for XML, deserialization and collections

Having just done a major refactor of the HAppS HTTP API to make it much easier to use, I am now thinking about simplifying the current boilerplate associated with XML serialization and state deserialization.

In the case of XML, the developer currently must manually write ToElement for every type they want to output. HAppS provides functions that make doing this a lot easier if the types are Haskell records with field labels of the form:

    data Animal = Animal {key :: Int, name :: String, breed :: Breed, price :: Float}

Looking at the XML example from the SYB site, a shift to SYB means shifting from record types with field labels to identifying fields by their types, so e.g.

    data Animal = Animal Key Name Breed Price
    type Key = Int
    type Name = String
    data Breed = Cow | Sheep
    type Price = Float

This model seems OK, but adds some verbosity. It also entails a learning curve, because either you also have field labels or you end up with non-standard extra code to access fields.

In this context, HList provides good record functionality, and a little Template Haskell makes it easy to code, so we end up with e.g.

    $(hList_labels "Animal" "key name breed price")
    data Breed = Cow | Sheep

This seems superior from a programming perspective because we have good record functionality (and implicitly support for XML namespaces!), but we lose something on the XML side. In particular, we end up with two new problems:

1. We no longer have type names, so we can't automatically generate an XML element name from a default HList record. The solution may be to define something like this:

    typeName name item = typeLabel .=. name .*. item
    animal = typeName "animal"
    myAnimal = animal .*. breed .=. Cow

And since update and extension are different functions, if we hide "typeLabel" from the apps then this gets very safe (or we expose it and get a form of typesafe coerce).

2. We don't know how to deal with non-atomic field values, e.g.

    <animal><breed>Cow</breed></animal>

makes sense, but we probably don't want read/show for product types. Perhaps it is possible to use overlapping instances to do something like:

    instance ToElement (HCons a b) where ...
    instance ToElement a where toElement = show

But perhaps we can use SYB to allow us to distinguish between atomic and non-atomic types and handle them appropriately. I have to say I don't think I understand the SYB papers well enough to give an example.

Note: I can't find an actual example of generating XML from HLists in any of the HList docs, so it may be that it is not actually as easy as it looks. All of this may be an open issue in theory as well as practice.

== Deserialization ==

HAppS periodically checkpoints application state to disk. Developers may want to add or remove fields from their state types or from data types used by their state types. The current solution is to have the developer assign a version number to state. If state changes, then the developer provides dispatch to a deserialization function based on that version number.

It is not at all clear from either the SYB or the HList papers how to deserialize generally. That being said, the HList library provides a way to call functions with optional keyword arguments that looks like it would also generalize to schema transitions.

Does anyone have experience with this issue in the context of HList or SYB?

== Haskell Collections ==

Currently HAppS developers use e.g. Data.Set or Data.Map as collection types for their application. If we push developers to use HList for everything, then they are going to need a way of handling collections of HList items. My sense is that HList-style data structures can't be stored by Data.Map or Data.Set because they are not heterogeneous collection types (and will break simply from varying the order of field names).

If we use HList as the assumed record type, then I think we need to recode Data.Map and Data.Set for them.

Has anyone implemented more interesting collection types for HList objects?

-Alex-

PS Obviously people can continue to use regular Haskell types and implement stuff manually. My goal here is to support people who want more automation and are willing to learn HList or SYB in order to get it.
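On the question in the XML section of whether SYB can tell atomic from non-atomic field values: the following is only a rough sketch, not HAppS code, using the Data.Data module that ships with GHC's base library. The helper names (atom, toXml, tag, escape) and the "field" fallback label are made up for illustration, and the Animal record is the labelled form from the example above. Values are treated as atomic if they are strings, primitives, or nullary constructors; anything else becomes a nested element named after its constructor.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Char (toLower)
import Data.Data

data Breed = Cow | Sheep
  deriving (Show, Data)

data Animal = Animal { key :: Int, name :: String, breed :: Breed, price :: Float }
  deriving (Show, Data)

-- Text content for atomic values, Nothing for compound ones.
-- Assumption: "atomic" means a String, a primitive, or a nullary constructor.
atom :: Data a => a -> Maybe String
atom x
  | Just s <- cast x = Just (s :: String)
  | otherwise = case dataTypeRep (dataTypeOf x) of
      AlgRep _ | null (gmapQ (const ()) x) -> Just (showConstr (toConstr x))
               | otherwise                 -> Nothing
      _ -> Just (showConstr (toConstr x))  -- Int, Float, Char, ...

-- Render a value: atoms become escaped text; compound values become an
-- element named after the constructor, with one child element per field.
toXml :: Data a => a -> String
toXml x = case atom x of
  Just s  -> escape s
  Nothing -> tag (map toLower (showConstr c)) (concat (zipWith tag labels children))
  where
    c        = toConstr x
    children = gmapQ toXml x
    -- Record labels when present; hypothetical "field" fallback otherwise.
    labels   = constrFields c ++ repeat "field"

tag :: String -> String -> String
tag n body = "<" ++ n ++ ">" ++ body ++ "</" ++ n ++ ">"

escape :: String -> String
escape = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc ch  = [ch]
```

With these definitions, toXml (Animal 1 "Daisy" Cow 2.5) should produce an animal element containing key, name, breed and price children. This relies on record labels for child element names, so it does not by itself answer the element-naming question for HList records.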

== Deserialization ==
HAppS periodically checkpoints application state to disk. Developers may want to add or remove fields from their state types or from data types used by their state types. The current solution is to have the developer assign a version number to state. If state changes, then the developer provides dispatch to a deserialization function based on that version number.

Well, it is indeed a fact of life that apps get updated and data models change. I think Ruby on Rails has implemented this quite well, using migrations. Basically, when you're using a migration, you specify the differences between two datatypes (or, in their case, tables). If we found some way to determine those diffs automatically, and then required a function that maps the old types to the new types... but maybe I'm overcomplicating this. Just thinking out loud, really. We could encode the structure of the datatypes in use into the state, and let HAppS automatically check whether it needs to migrate...
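For concreteness, the version-number dispatch plus a migration function might look roughly like the sketch below. This is not the HAppS checkpoint format: the "version line followed by a show'd value" layout and all names (AnimalV1, AnimalV2, migrateV1, saveState, loadState) are assumptions made up for illustration, and it uses only Text.Read from base.

```haskell
import Text.Read (readMaybe)

-- Old and new versions of the same state type (hypothetical).
data AnimalV1 = AnimalV1 { v1Key :: Int, v1Name :: String }
  deriving (Show, Read, Eq)

data AnimalV2 = AnimalV2 { v2Key :: Int, v2Name :: String, v2Price :: Float }
  deriving (Show, Read, Eq)

-- The "migration": maps the old type to the new one, defaulting the new field.
migrateV1 :: AnimalV1 -> AnimalV2
migrateV1 (AnimalV1 k n) = AnimalV2 k n 0

-- Checkpoint layout assumed here: a version number on the first line,
-- then the show'd value.
saveState :: AnimalV2 -> String
saveState a = "2\n" ++ show a

-- Dispatch on the stored version number; unknown versions fail.
loadState :: String -> Maybe AnimalV2
loadState s = case break (== '\n') s of
  ("1", '\n' : body) -> migrateV1 <$> readMaybe body
  ("2", '\n' : body) -> readMaybe body
  _                  -> Nothing
```

The developer still writes migrateV1 by hand here; deriving it automatically from a diff of the two types is the open part.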
One of the concepts that I also like about RoR is that it has three modes: development, production and test. Three different databases are used for the three modes, so you're not messing with your production database. Maybe this is something to think about, too.

-chris

Hello S., Wednesday, December 27, 2006, 2:24:00 AM, you wrote:
Having just done a major refactor of the HAppS HTTP API to make it much much easier to use, I am now thinking about simplifying the current boilerplate associated with XML serialization and state deserialization.
Have you considered using Template Haskell to do it? At least, it is used for automatic generation of class instances for binary serialization.

--
Best regards,
Bulat
mailto:Bulat.Ziganshin@gmail.com

I'd really rather factor out the Template Haskell. It does not leave me feeling good.

At the specific level, Template Haskell doesn't solve the problem of getting good XML element names. For example, HList lets me annotate labels with information about whether they are attributes or elements.

At the usage level, it does not traverse other modules easily and forces you to do weird things with code order in order to compile.

At the concept level, I find it way too hard to think in terms of the abstract syntax of Haskell. I'd rather be thinking in terms of application semantics.

At the implementation level, it forces the generation of standard accessor names, e.g. withFoo for every foo, rather than supporting a general syntax for access or update.

-Alex-
participants (4): Alex Jacobson, Bulat Ziganshin, Chris Eidhof, S. Alexander Jacobson