
[Probably libraries@haskell.org is the right list for this message, so I'm forwarding your message below, and will reply there.]

| -----Original Message-----
| From: haskell-cafe-bounces@haskell.org [mailto:haskell-cafe-bounces@haskell.org] On Behalf Of Benjamin Franksen
| Sent: 23 March 2007 22:56
| To: haskell-cafe@haskell.org
| Cc: haskell@haskell.org
| Subject: [Haskell-cafe] Can we do better than duplicate APIs? [was: Data.CompactString 0.3]
|
| [sorry for the somewhat longer rant, you may want to skip to the more
| technical questions at the end of the post]
|
| Twan van Laarhoven wrote:
| > I would like to announce version 0.3 of my Data.CompactString library.
| > Data.CompactString is a wrapper around Data.ByteString that represents a
| > Unicode string. This new version supports different encodings, as can be
| > seen from the data type:
| >
| > [...]
| >
| > Homepage: http://twan.home.fmf.nl/compact-string/
| > Haddock: http://twan.home.fmf.nl/compact-string/doc/html/
| > Source: darcs get http://twan.home.fmf.nl/repos/compact-string
|
| After taking a look at the Haddock docs, I was impressed by the amount of
| repetition in the APIs. Not only does Data.CompactString duplicate the whole
| Data.ByteString interface (~100 functions, adding some more for encoding
| and decoding), the whole interface is again repeated another four times,
| once for each supported encoding.
|
| Now, this is /not/ meant as a criticism of the compact-string package in
| particular. To the contrary, duplicating a fat interface for almost
| identical functionality is apparently state-of-the-art in Haskell library
| design, viz. the celebrated Data.ByteString, whose API is similarly
| repetitive (see Data.List, Data.ByteString.Lazy, etc.), as well as
| Map/IntMap/Set/IntSet etc. I greatly appreciate the effort that went into
| these libraries, and admire the elegance of the implementation as well as
| the stunning results wrt. efficiency gains etc. However, I fear that
| duplicating interfaces in this way will prove to be problematic in the long
| run.
|
| The problems I (for-)see are for maintenance and usability, which
| are of course two sides of the same coin. For the library implementer,
| maintenance will become more difficult, as ever more of such 'almost equal'
| interfaces must be maintained and kept in sync. One could use code
| generation or macro expansion to alleviate this, but IMO the necessity to
| use extra-language pre-processors points to a weakness in the language; it
| would be much less complicated and more satisfying to use a language
| feature that avoids the repetition instead of generating code to
| facilitate it. On the other side of the coin, usability suffers as one has
| to look up the (almost) same function in more and more different (but
| 'almost equal') module interfaces, depending on whether the string in
| question is Char vs. Byte, strict vs. lazy, packed vs. unpacked, encoded
| in X or Y or Z..., especially since there is no guarantee that the
| function is /really/ spelled the same everywhere and also really does what
| the user expects.(*)
|
| I am certain that most, if not all, people involved with these new libraries
| are well aware of these infelicities. Of course, type classes come to mind
| as a possible solution. However, something seems to prevent developers from
| using them to capture e.g. a common String or ListLike interface. Whatever
| this 'something' is, I think it should be discussed and addressed, before
| the number of 'almost equal' APIs becomes unmanageable for users and
| maintainers.
|
| Here are some raw ideas:
|
| One reason why I think type classes have not (yet) been used to reduce the
| amount of API repetition is that Haskell doesn't (directly) support
| abstraction over type constraints nor over the number of type parameters
| (polykinded types?). Often such 'almost equal' module APIs differ in
| exactly these aspects, i.e. one has an additional type parameter, while yet
| another one needs slightly different or additional constraints on certain
| types. Oleg K. has shown that some of these limitations can be overcome w/o
| changing or adding features to the language; however, these tricks are not
| easy to learn and apply.
|
| Another problem is the engineering question of how much to put into the
| class proper: there is a tension between keeping the class as simple as
| possible (few methods, many parametric functions) for maximum usability vs.
| making it large (many methods, fewer parametric functions) for maximum
| efficiency via specialized implementations. It is often hard to decide this
| question up front, i.e. before enough instances are available. (This has
| been stated as a cause for deferring the decision for a common interface to
| list-like values or strings). Since the type of a function doesn't reveal
| whether it is a normal function with a class constraint or a real class
| method, I imagine a language feature that (somehow) enables me to
| specialize such a function for a particular instance even if it is not a
| proper class member.
|
| Or maybe we have come to the point where Haskell's lack of a 'real' module
| system, like e.g. in SML, actually starts to hurt? Can associated types
| come to the rescue?
|
| Cheers
| Ben
| --
| (*) I know that strictly speaking a class doesn't guarantee any semantic
| conformance either, but at least there is a common place to document the
| expected laws that all implementations should obey. With duplicated module
| APIs there is no such single place.
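
To make the type-class idea in the message concrete, here is a minimal sketch of what a shared list-like interface could look like when the element type is an associated type. Everything in it is an assumption for illustration only: the class name `ListLike`, its methods, and the stand-in type `Bytes` are hypothetical and are not taken from Data.ByteString, Data.CompactString, or any existing package. It also shows the engineering tension described above: `len` is a class method with a default, so an instance may override it with a faster version, while `toList` and `fromList` are ordinary parametric functions written once against the class.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Word (Word8)
import qualified Data.List as L

-- Hypothetical class: one interface shared by several string-like containers.
class ListLike s where
  -- Associated type: the element type is chosen per instance, so
  -- containers with and without a type parameter fit the same class.
  type Elem s
  empty  :: s
  cons   :: Elem s -> s -> s
  uncons :: s -> Maybe (Elem s, s)

  -- A method with a default definition. Keeping it inside the class
  -- lets an instance replace it with a specialized implementation.
  len :: s -> Int
  len = maybe 0 (\(_, rest) -> 1 + len rest) . uncons

-- Ordinary functions with a class constraint: written once, but they
-- cannot be specialized per instance.
toList :: ListLike s => s -> [Elem s]
toList s = case uncons s of
  Nothing      -> []
  Just (x, s') -> x : toList s'

fromList :: ListLike s => [Elem s] -> s
fromList = foldr cons empty

-- Plain lists: a parameterized container, with 'len' overridden.
instance ListLike [a] where
  type Elem [a] = a
  empty  = []
  cons   = (:)
  uncons = L.uncons
  len    = length

-- Hypothetical stand-in for a packed byte string: no type parameter,
-- fixed element type, and 'len' left at its default.
newtype Bytes = Bytes [Word8] deriving (Eq, Show)

instance ListLike Bytes where
  type Elem Bytes = Word8
  empty = Bytes []
  cons w (Bytes ws) = Bytes (w : ws)
  uncons (Bytes [])       = Nothing
  uncons (Bytes (w : ws)) = Just (w, Bytes ws)
```

With these definitions, `toList (fromList [1, 2, 3] :: Bytes)` and `len "abc"` both go through the same class, although one container has a type parameter and the other does not. The sketch does not address the other point raised in the message, namely instances that need different or additional constraints on their element types.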