
Hello Haskell-Cafe, my main question is whether requiring FlexibleInstances is a problem for code that aims to become part of the Haskell platform. The following explanation gives the context for this question. As some of you may know the blaze-builder library is now used in quite a few places. That's nice, but it doesn't mean that blaze-builder is a finished solution to the problem of providing an API for high-performance buffered output (creation of chunked representations) of sequences of bytes. In fact, one of my current goals with this work is to polish it such that it can be integrated into the 'bytestring' library. This has the benefit that functions that create lazy bytestrings (e.g., pack, map, unfoldr, filter) can be implemented such that they create well-sized chunks even if the argument bytestring is hugely fragmented. Moreover, this integration also establishes a single builder type as the output representation. Therefore, other creators of bytestrings (e.g., 'text', 'base16-bytestring', 'zlib') can provide results of type Builder, which enables O(1) appends of their results and the preservation of well-sizedness of the created chunks. As part of this goal, I'm currently working on the first of the following three points that are paramount to achieving great encoding performance: 1. Ensure that individual Haskell values are encoded with minimal overhead. 2. Ensure that concatenation of sequences of bytes is efficient. 3. Ensure that the average chunk size is large. The core principle used to tackle (1) is avoiding intermediate data structures. The core abstraction used is the one of a Write (see [1] for the corresponding library.) data Write a = Write Int (a -> Ptr Word8 -> IO (Ptr Word8)) A value `Write bound io :: Write a` denotes an encoding scheme for values of type `a` that uses at most `bound` bytes space. Given a values `x :: a` and a pointer `po` to the next free byte `io x po` encodes `x` to memory starting from `po` and returns the pointer to the next free byte after the encoding of `x`. In most cases Writes are used as an abstract datatype. They serve as an interface between implementors of the low-level bit-twiddling required to efficiently implement encodings like UTF-8 or Base16 and the providers of efficient traversal functions through streams of Haskell values. Hence, typical users of Writes are functions like fromWrite :: Write a -> a -> Builder fromWriteList :: Write a -> [a] -> Builder fromWriteUnfoldr :: Write b -> (a -> Maybe (b, a)) -> a -> Builder mapWriteByteString :: Write Word8 -> S.ByteString -> Builder They consume the given datastructure efficiently and wrap the Writes in the bounds checking code to detect when a buffer is full and request a new one. There are many providers of Writes. Each bounded-length-encoding of a standard Haskell value is likely to have a corresponding Write. For example, encoding an Int32 as a big-endian, little-endian, and host-endian byte-sequence is currently achieved with the following three functions. writeInt32BE :: Write Int32 writeInt32LE :: Write Int32 writeInt32HE :: Write Int32 I would like to avoid naming all these encodings individually. Especially, as the situation becomes worse for more elaborate encodings like hexadecimal encodings. There, we encounter encodings like the utf8-encoding of the hexadecimal-encoding with lower-case letters of an Int32. writeInt32HexLowerUtf8 :: Write Int32 I really don't like that. Therefore, I'm thinking about the following solution based on type-classes. We introduce a single typeclass class Writable a where write :: Write a and use a bunch of newtypes to denote our encodings. newtype Ascii7 a = Ascii7 { unAscii7 :: a } newtype Utf8 a = Utf8 { unUtf8 :: a } newtype HexUpper a = HexUpper { unHexUpper :: a } newtype HexLower a = HexLower { unHexLower :: a } ... Assuming FlexibleInstnaces, we can write encodings like the above hex-encoding as instances instance Write (Utf8 (HexLower Int32)) where write = ... This composes rather nicely and allows the implementations to exploit special properties of the involved data. For example, if we also had a HTML escaping marker newtype Html a = Html { unHtml :: a } Then, the instance instance Write (Utf8 (HTML (HexLower Int32))) where write (Utf8 (HTML (HexLower i))) = write (Utf8 (HexLower i)) exploits that no HTML escaping is required for a hex-number. Assuming FlexibleContexts, the user can also build abbreviations for builders using fixed encodings. utf8 :: Writable (Utf8 a) => a -> Builder utf8 = fromWrite write . Utf8 Note that, on the Builder level, a probably better way would be to have an analogous 'ToBuilder' typeclass to abstract the various encodings. Part of these instances then reuse the corresponding instances from Writable. I think this type-class based interface to select the correct efficient implementation of an encoding is rather nice. However, I don't know if 'FlexibleInstances' and 'FlexibleContexts' are fine to use in code that aims to become part of the Haskell platform. Moreover, I might well overlook some drawbacks of this design. I'm looking forward to your feedback. best regards, Simon [1] https://github.com/meiersi/system-io-write