
Hi Bruno (and all), Sorry for being so sluggish---but I just wanted to chime in with my comments on Bruno's summary. On Oct 3, 2006, at 7:47 AM, Bruno Oliveira wrote:
Hello all,
====================================================================== = Approach: A Lightweight Implementation of Generics and Dynamics
Required features: - Haskell 98 + existential types
Usage: - Library Writer: Responsible for the sums-of-products machinery, the representation(s) data type(s) to be defined in the library and defining the class and standard instances for "Representable". The library writer needs to be have a deep knowledge about generic programming. - Power User: Defines the boilerplate for new data types. Needs to be fairly familiar with the implementation. - User: defines generic functions by pattern matching on the structure of "Rep". This requires that he is comfortable with sums of products and isomorphisms. For convenience, the user should also provide a definition that uses the type class "Representable", which avoids the representations to be passed explicitly. - End User: If "Representable" is used, then the end can just use generic functions in the same way as any other ad-hoc (that is, type class overloaded) functions.
I'm trying to get these roles straight in my head. I think we're going to have a similar division for many of the library proposals. A Generic library has several important components: the "framework" or way of representing a particular type, instances of this framework for the primitive types, helper functions for creating instances for new types, and a base set of generic operations/support code to get things started. The "Library writer" is responsible for all of these. A user of the generic library may want to do two things: (a) define new type-indexed functions (b) ensure that their datatypes can be used with type-indexed functions In Bruno's terminology, - Power User: does both of these - User: does (b) but not (a) - End User: does neither, just calls existing type-indexed functions on existing datatypes. Note that we could imagine eliminating/automating (b) for some platforms with Template Haskell.
Extensibility: - Data types are non-extensible, however we can try to combine the approach with type classes to get some form of extensibility (RepLib proposes one solution like this).
I think I prefer the term "Specializability" (even though it is more of a mouthful.) Generic functions can be specialized at particular datatypes to specific operations. (LIGD does not support this behavior, but could be extended to do so by pushing RepLib through the GADT translation.)
Performance considerations: - Sums of products based and therefore involving considerable performance penalties.
Besides the runtime types and sums-of-products explosion, there is also the issue of applying the identity function type coercions (if they are not optimized away). Those at least would be eliminated by using a GADT. Aside: should we be developing a benchmark suite if we are actually going to be making decisions about performance? I don't want to guess here. Given that we will be collecting versions of several functions for each library, we should be able to run sort of "Library shootout". I also would like to know if there are some specific Haskell extensions that could dramatically improve performance. These extensions would not be portable, of course, but perhaps for specific platforms we could support optimized execution.
Expressibility: - Can do both producer and consumer functions. - Multiple representations needed for generic functions of different arities; - No local redefinitions.
I think we should separate expressibility of the generic functions and expressibility wrt to what type information can index such functions. For the former we talk about producer/consumer functions (read/show), functions of different arities (map,zip), functions defined by higher- order types (map, count, reduce), functions requiring type-indexed types (the zipper, bit-packed arrays, generalized tries). For the latter, we talk about the ability to use the library with types that contain records nested datatypes higher-kinded types (what kinds?) datatypes with existential components (both of kind type and higher kinds) datatypes with first-class polymorphic components (both of kind type and higher kinds) above with class constraints GADTs (simple, and those that subsume existentials...) datatypes with higher-kinded type arguments (I see that Johan also mentions this division in his email.) What do you mean by "local redefinitions"?
Boilerplate: - Isomorphisms between sums of products and data types - Instances of Representable - Smart constructors for representation of data types
Helpful extra features: - GADTs (already implemented in GHC) allow more direct definitions of generic functions, since there is no need to apply the type coercions explicitly;
- A boilerplate generation mechanism. This would effectively mean that power users would not be necessary.
- Open data types and functions in order for extensibility.
- Kind polymorphism, for eliminating the need for multiple representations;
I don't think that kind polymorphism would solve this problem---the multiple representations are needed for functions (like map) that have different arities of kind-indexed types. Kind polymorphism would help a little, but it wouldn't be completely useful without some form of kind-indexed types. (The representations of types with higher kinds is not parametric in those kinds, nor is the behavior of type-indexed operations on those representations.)
Discussion: An interesting approach which is quite lightweight and that is fairly easy to use (specially if we would not need to handle any of the boilerplate). It is a bit outdated because with GADTs, the use can be slightly simplified. However, this also takes us further away from Haskell 98 or even Haskell' (since GADTs will not be there).
I think the case for GADTs in Haskell' would be strengthened if there were other Haskell compilers than GHC that implement GADTs. Does anyone know of any?
I think that LIGD is still a reference for a generic programming library that uses a data type for type representations and sums of products as a family of data types. A drawback may be performance, since we need to convert a data type into its isomorphic sum of products representation. =============================================================
Cheers,
Bruno
Thanks, Stephanie