
Hi Antoine, thanks for your feedback.
2011/5/18 Antoine Latter
On Wed, May 18, 2011 at 12:32 PM, Simon Meier wrote:
Hello Haskell-Cafe,
There are many providers of Writes. Each bounded-length-encoding of a standard Haskell value is likely to have a corresponding Write. For example, encoding an Int32 as a big-endian, little-endian, and host-endian byte-sequence is currently achieved with the following three functions.
writeInt32BE :: Write Int32
writeInt32LE :: Write Int32
writeInt32HE :: Write Int32
I would like to avoid naming all these encodings individually, especially as the situation gets worse for more elaborate encodings such as hexadecimal ones. There, we encounter encodings like the UTF-8 encoding of the lower-case hexadecimal encoding of an Int32.
writeInt32HexLowerUtf8 :: Write Int32
I really don't like that. Therefore, I'm thinking about the following solution based on type-classes. We introduce a single typeclass
class Writable a where
  write :: Write a
and use a bunch of newtypes to denote our encodings.
newtype Ascii7 a = Ascii7 { unAscii7 :: a }
newtype Utf8 a = Utf8 { unUtf8 :: a }
newtype HexUpper a = HexUpper { unHexUpper :: a }
newtype HexLower a = HexLower { unHexLower :: a }
...
Assuming FlexibleInstances, we can write encodings like the above hex-encoding as instances
instance Writable (Utf8 (HexLower Int32)) where
  write = ...
This composes rather nicely and allows the implementations to exploit special properties of the involved data. For example, if we also had an HTML escaping marker
newtype Html a = Html { unHtml :: a }
Then the following instance could skip the HTML escaping altogether:
instance Writable (Utf8 (Html (HexLower Int32))) where
  write (Utf8 (Html (HexLower i))) = write (Utf8 (HexLower i))
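To make the newtype-marker idea above concrete, here is a minimal sketch. It is not the library's code: `Write` is modelled by a toy stand-in (a function from a value to a list of bytes), and the helper `hexDigitsLower` is a hypothetical name introduced only for this illustration. The `Html` instance reuses the plain hex instance, on the assumption that hexadecimal digits never need HTML escaping.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Int (Int32)
import Data.Word (Word8)

-- ASSUMPTION: a toy stand-in for the real Write type, which is a
-- low-level buffer abstraction. Here a Write just yields its bytes.
newtype Write a = Write { runWrite :: a -> [Word8] }

class Writable a where
  write :: Write a

-- Marker newtypes that select an encoding.
newtype Utf8 a     = Utf8     { unUtf8 :: a }
newtype Html a     = Html     { unHtml :: a }
newtype HexLower a = HexLower { unHexLower :: a }

-- Hypothetical helper: the eight lower-case hex digits of an Int32,
-- most significant nibble first. ASCII digits are also valid UTF-8.
hexDigitsLower :: Int32 -> [Word8]
hexDigitsLower i =
  [ digit ((i `shiftR` s) .&. 0xF) | s <- [28, 24 .. 0] ]
  where
    digit n = fromIntegral (ord ("0123456789abcdef" !! fromIntegral n))

-- Nested instance heads like this one need FlexibleInstances.
instance Writable (Utf8 (HexLower Int32)) where
  write = Write (hexDigitsLower . unHexLower . unUtf8)

-- Hex digits contain no characters that require escaping, so this
-- instance drops the Html marker and reuses the instance above.
instance Writable (Utf8 (Html (HexLower Int32))) where
  write = Write (\(Utf8 (Html h)) -> runWrite write (Utf8 h))
```

Loaded into GHCi, `runWrite write (Utf8 (Html (HexLower (255 :: Int32))))` yields the bytes of "000000ff", the same as the instance without the `Html` marker.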
If I were authoring the above code, I don't see why that code is any easier to write or easier to read than:
utf8HtmlHexLower i = utf8HexLower i
And if I were using the encoding functions, I would much prefer to see:
utf8HtmlHexLower magicNumber
in my code, instead of:
write $ Utf8 $ Html $ HexLower magicNumber
In addition, this would be difficult for me as a developer using the proposed library, because I would have no way to know which combinations of newtypes are valid from reading the haddocks.
Maybe I'm missing something fundamental, but this approach seems more cumbersome to me as a library author (more boilerplate) and as the user of the library (less clarity in the docs and in the resultant code).
Hmm, that's a valid point you raise here. The documentation issue bothers me especially.
The core problem that drove me towards this solution is the abundance of different IntX and WordX types, each of which requires a separate Write for the big-endian, little-endian, host-endian, lower-case-hex, and upper-case-hex encodings; i.e., currently, there are
int8BE :: Write Int8
int16BE :: Write Int16
int32BE :: Write Int32
...
hexLowerInt8 :: Write Int8
...
and so on. As you can see (http://hackage.haskell.org/packages/archive/blaze-builder/0.3.0.1/doc/html/B...) this approach clutters the public API quite a bit. Hence, I'm thinking of using a separate type-class for each encoding; i.e.,
class BigEndian a where
  bigEndian :: Write a
This collapses the big-endian encodings of all 10 bounded-size (signed and unsigned) integer types under a single name with well-defined semantics. Moreover, it's standard Haskell 98.
For the hex-encodings, I'm thinking about providing type-classes
class HexLower a where
  hexLower :: Write a
class HexLowerNoLead a where
  hexLowerNoLead :: Write a
...
for ASCII encoding and each of the standard Unicode encodings in a separate module. The user can then select the right ones using qualified imports. In most cases, he won't even need qualification, as mixing different character encodings is rare.
What do you think about such an interface? Is there another hidden catch that I'm not seeing?
BTW, note that Writes are a pure compile-time abstraction and are meant to be completely inlined. In typical use cases, there's no efficiency overhead stemming from these typeclasses.
best regards,
Simon
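For comparison, the class-per-encoding interface proposed above might look roughly like the following sketch. Again, `Write` is only a toy stand-in (an assumption, modelled as a function to a list of bytes), and only three of the integer types get instances here. No language extensions are needed for the class and its instances.

```haskell
import Data.Bits (shiftR)
import Data.Int (Int16, Int32)
import Data.Word (Word8, Word16)

-- ASSUMPTION: a toy stand-in for the real Write type; a Write here
-- simply yields the encoded bytes as a list.
newtype Write a = Write { runWrite :: a -> [Word8] }

-- One class per encoding: a single name covers every integer type.
class BigEndian a where
  bigEndian :: Write a

-- Most significant byte first; fromIntegral truncates to the low byte.
instance BigEndian Word16 where
  bigEndian = Write (\w -> [fromIntegral (w `shiftR` 8), fromIntegral w])

-- A signed type can reuse the unsigned instance of the same width.
instance BigEndian Int16 where
  bigEndian = Write (runWrite bigEndian . (fromIntegral :: Int16 -> Word16))

instance BigEndian Int32 where
  bigEndian = Write (\i -> [fromIntegral (i `shiftR` s) | s <- [24, 16, 8, 0]])
```

With this in scope, `runWrite bigEndian (0x1234 :: Word16)` gives `[18,52]`, and the same name `bigEndian` works for `Int16` and `Int32` with the type selecting the instance.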