Re: [Hs-Generics] A Lightweight Implementation of Generics and Dynamics (LIGD)

19 Oct 2006

      Hi Bruno (and all),

Sorry for being so sluggish---but I just wanted to chime in with my  
comments on Bruno's summary.

On Oct 3, 2006, at 7:47 AM, Bruno Oliveira wrote:
...
Hello all,
====================================================================== 
=
Approach: A Lightweight Implementation of Generics and Dynamics
Required features:
   - Haskell 98 + existential types
Usage:
   - Library Writer: Responsible for the sums-of-products machinery,
     the representation(s) data type(s) to be defined in the  
library and
     defining the class and standard instances for "Representable".
     The library writer needs to be have a deep knowledge about  
generic
     programming.
   - Power User: Defines the boilerplate for new data types. Needs  
to be fairly
     familiar with the implementation.
   - User: defines generic functions by pattern matching on the  
structure of "Rep".
     This requires that he is comfortable with sums of products and  
isomorphisms.
     For convenience, the user should also provide a definition  
that uses the
     type class "Representable", which avoids the representations  
to be passed
     explicitly.
   - End User: If "Representable" is used, then the end can just  
use generic functions
     in the same way as any other ad-hoc (that is, type class  
overloaded) functions.
I'm trying to get these roles straight in my head. I think we're  
going to have a similar division for many of the library proposals.

A Generic library has several important components: the "framework"  
or way of representing a particular type, instances of this framework  
for the primitive types, helper functions for creating instances for  
new types, and a base set of generic operations/support code to get  
things started. The "Library writer" is responsible for all of these.

A user of the generic library may want to do two things:
   (a) define new type-indexed functions
   (b) ensure that their datatypes can be used with type-indexed  
functions
In Bruno's terminology,
   - Power User: does both of these
   - User: does (b) but not (a)
   - End User: does neither, just calls existing type-indexed  
functions on existing datatypes.

Note that we could imagine eliminating/automating (b) for some  
platforms with Template Haskell.
...
Extensibility:
   - Data types are non-extensible, however we can try to combine the
     approach with type classes to get some form of extensibility  
(RepLib
     proposes one solution like this).
I think I prefer the term "Specializability" (even though it is more  
of a mouthful.) Generic
functions can be specialized at particular datatypes to specific  
operations.
(LIGD does not support this behavior, but could be extended to do so  
by pushing RepLib through the GADT translation.)
...
Performance considerations:
   - Sums of products based and therefore involving considerable
     performance penalties.
Besides the runtime types and sums-of-products explosion, there is  
also the issue of applying the identity function type coercions (if  
they are not optimized away). Those at least would be eliminated by  
using a GADT.

Aside: should we be developing a benchmark suite if we are actually  
going to be making decisions about performance? I don't want to guess  
here. Given that we will be collecting versions of several functions  
for each library, we should be able to run sort of "Library shootout".

I also would like to know if there are some specific Haskell  
extensions that could dramatically improve performance. These  
extensions would not be portable, of course, but perhaps for specific  
platforms we could support optimized execution.
...
Expressibility:
   - Can do both producer and consumer functions.
   - Multiple representations needed for generic functions of  
different
     arities;
   - No local redefinitions.
I think we should separate expressibility of the generic functions  
and expressibility wrt to what type information can index such  
functions.

For the former we talk about producer/consumer functions (read/show),  
functions of different arities (map,zip), functions defined by higher- 
order types (map, count, reduce), functions requiring type-indexed  
types (the zipper, bit-packed arrays, generalized tries).

For the latter, we talk about the ability to use the library with  
types that contain
   records
   nested datatypes
   higher-kinded types (what kinds?)
   datatypes with existential components (both of kind type and  
higher kinds)
   datatypes with first-class polymorphic components (both of kind  
type and higher kinds)
   above with class constraints
   GADTs (simple, and those that subsume existentials...)
   datatypes with higher-kinded type arguments

(I see that Johan also mentions this division in his email.)

What do you mean by "local redefinitions"?
...
Boilerplate:
   - Isomorphisms between sums of products and data types
   - Instances of Representable
   - Smart constructors for representation of data types
Helpful extra features:
   - GADTs (already implemented in GHC) allow more direct definitions
     of generic functions, since there is no need to apply the type
     coercions explicitly;

...
- A boilerplate generation mechanism. This would effectively mean
     that power users would not be necessary.
...
- Open data types and functions in order for extensibility.
...
- Kind polymorphism, for eliminating the need for multiple
     representations;
I don't think that kind polymorphism would solve this problem---the  
multiple representations are needed for functions (like map) that  
have different arities of kind-indexed types.

Kind polymorphism would help a little, but it wouldn't be completely  
useful without some form of kind-indexed types. (The representations  
of types with higher kinds is not parametric in those kinds, nor is  
the behavior of type-indexed operations on those representations.)
...
Discussion:
    An interesting approach which is quite lightweight and that is  
fairly easy to
use (specially if we would not need to  handle any of the  
boilerplate). It is a bit
outdated because with GADTs, the use can be slightly simplified.  
However, this also
takes us further away from Haskell 98 or even Haskell' (since GADTs  
will not be there).
I think the case for GADTs in Haskell' would be strengthened if there  
were other Haskell compilers than GHC that implement GADTs. Does  
anyone know of any?
...
I think that LIGD is still a reference for a generic programming  
library that uses
a data type for type representations and sums of products as a  
family of data types.
   A drawback may be performance, since we need to convert a data  
type into its
isomorphic sum of products representation.
=============================================================
Cheers,
Bruno
Thanks,
Stephanie

Re: [Hs-Generics] A Lightweight Implementation of Generics and Dynamics (LIGD)

Stephanie Weirich