Re: [Haskell-cafe] Re: could we get a Data instance for Data.Text.Text?

24 Jan 2010

On Sat, Jan 23, 2010 at 4:57 PM, Jeremy Shaw wrote:
...
On Sat, Jan 23, 2010 at 7:57 AM, Neil Mitchell wrote:
...
No, that's definitely not correct, or even remotely scalable as we
increase the number of abstract types in disparate packages.
Yes.. happstack is facing another aspect of this scalability issue as well.
We have a class, Serialize, which is used to serialize and deserialize data.
It builds on the binary library, but adds the ability to version your data
types and migrate data from older versions to newer versions.
This has a serious scalability issue though, because it requires that each
type a user might want to serialize has a Serialize instance.
So do we:
  1. provide Serialize instances for as many data types from libraries on
hackage as we can, resulting in depending on a large number of packages that
people are required to install, even though they will only use a small
fraction of them.
  2. convince people that Serialize deserves the same status as Data, and
then convince authors to create Serialize instances for their type? It would
be nice, but authors will start complaining if they are asked to provide a
zillion other instances for their types as well. And they will be annoyed if
their library has to depend on a bunch of other libraries, just so they
can provide some instances that only a small fraction of their users might
use. So, this method does not scale as the number of 'interesting' classes
grows.
  3. let individual users define the Serialize instances as they need them.
Unfortunately, if two different library authors defined a Serialize instance
for Text in their libraries, you could not use both libraries in your
application because of the conflicting Serialize instances. So this method
does not scale when the number of libraries using the Serialize class grows.
Not really sure what the workaround is. #1 could work if there was some way
to just selectively install the pieces as you need them. But the only way to
do this now would be to create a lot of cabal packages which just defined a
single instance -- happstack-text, happstack-map, happstack-time,
happstack-etc. One for each package that has types we want to create a
serialization instance for...
Any other suggestions?
- jeremy
The only safe rule is: if you don't control the class, C, or you don't
control the type constructor, T, don't make instance C T.  Application
writers can often relax that rule as the set of dependencies for the
whole application is known and in many cases any reasonable instance
for a class C and constructor T is acceptable.  Under those
conditions, the worst-case scenario is that the application writer may
need to remove an instance declaration when migrating to new versions
of the dependencies.  When you control a class C, you should make as
many (relevant) type constructors instances of it as is reasonably
possible, i.e. without adding any extensive dependencies.  So at the
very least, all standard type constructors.  Similarly for those who
control a type constructor T.  This is for convenience.  These
correspond to solutions #1 and #2, only significantly weakened.
Definitely, making a package depend on tons of other packages just to
add instances is NOT the correct solution.

The library writers depending on a package for a class and another
package for a type are the problem case.  There are three potential
solutions in this case, which basically reduce the problem to one
of the above three cases.  Either introduce a new type and add it to a
class, introduce a new class and add the types to it, or try to push
the resolution of such things onto the application writer.  The first
two options have the benefit that they also protect you from the
upstream libraries introducing instances that won't work for you.
These two options have the drawback that they are usually less
convenient to use.  The last option has the benefit that it usually
corresponds to having a more flexible/generic library; in some cases
you can even go so far as to remove your dependence on the libraries
altogether.
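
To make the first option concrete, here is a minimal sketch of the
"introduce a new type" approach. Serialize and UpstreamText below are
hypothetical stand-ins, defined locally only so the example is
self-contained; in practice the class would live in one package and the
type in another. SerialText is the wrapper our own library controls, so
its instance cannot clash with anyone else's.

```haskell
-- Hypothetical stand-in for a class owned by some other package.
class Serialize a where
  serialize   :: a -> String
  deserialize :: String -> Maybe a

-- Hypothetical stand-in for a type owned by yet another package.
newtype UpstreamText = UpstreamText String
  deriving (Eq, Show)

-- Our own wrapper. We control this type constructor, so giving it an
-- instance follows the rule above and cannot conflict with an instance
-- that another library (or upstream itself) defines for UpstreamText.
newtype SerialText = SerialText { unSerialText :: UpstreamText }
  deriving (Eq, Show)

instance Serialize SerialText where
  serialize (SerialText (UpstreamText s)) = s
  deserialize = Just . SerialText . UpstreamText
```

The inconvenience mentioned above shows up at the use sites: callers
have to wrap with SerialText and unwrap with unSerialText wherever they
cross into the library.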

One solution to this problem, though it usually can't be done post hoc,
is to simply not use the class mechanism except as a convenience.
This has the benefit that it usually leads to more flexibility and it
helps to realize the third option above.  Using Monoid as an example,
one can provide functions of the form: f :: m -> (m -> m -> m) -> ...
and then also provide f' = f mempty mappend :: Monoid m => ...  The
parameters can be collected into a record as well.  You could even
systematize this into: class C a where getCDict :: CDict a, and then
write f :: CDict a -> ... and f' = f getCDict :: C a => ...
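
Spelled out, the scheme might look roughly like the sketch below. All
of the names (mconcatWith, MonoidDict, HasMonoidDict, sumDict) are made
up for illustration, and folding a list is just an arbitrary choice for
the "..." part of f.

```haskell
-- Class-free version: the caller passes the unit and the combining
-- function explicitly, so no instance is needed for m.
mconcatWith :: m -> (m -> m -> m) -> [m] -> m
mconcatWith unit combine = foldr combine unit

-- Convenience wrapper for types that do have a Monoid instance.
mconcatWith' :: Monoid m => [m] -> m
mconcatWith' = mconcatWith mempty mappend

-- The same parameters collected into a record (an explicit dictionary).
data MonoidDict m = MonoidDict
  { dictEmpty  :: m
  , dictAppend :: m -> m -> m
  }

mconcatDict :: MonoidDict m -> [m] -> m
mconcatDict d = foldr (dictAppend d) (dictEmpty d)

-- The systematized form: a class whose only job is to hand back the
-- dictionary.
class HasMonoidDict a where
  getMonoidDict :: MonoidDict a

instance HasMonoidDict [b] where
  getMonoidDict = MonoidDict [] (++)

mconcatDict' :: HasMonoidDict a => [a] -> a
mconcatDict' = mconcatDict getMonoidDict

-- An application writer can build a dictionary for a type they don't
-- control without declaring any instance at all.
sumDict :: MonoidDict Int
sumDict = MonoidDict 0 (+)
```

In GHCi, mconcatDict sumDict [1, 2, 3] evaluates to 6, and nothing had
to be added to any class to get there.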

Whatever one does, do NOT add instances of type constructors you don't
control to classes you don't control.  This can lead to cases where
two libraries can't be used together at all.

Derek Elkins