Re: [Haskell-cafe] Policy for taking over a package on Hackage

25 May 2011


On 26 May 2011 08:49, wren ng thornton wrote:
...
On 5/25/11 1:03 PM, Bryan O'Sullivan wrote:
...
On Wed, May 25, 2011 at 5:59 AM, Ivan Lazar Miljenovic <ivan.miljenovic@gmail.com> wrote:
...
Well, using the Char8 version.
Just because you *could* do that, it doesn't mean that you *should*. It's a
bad idea to use bytestrings for manipulating text, yet the only plausible
reason to have wl-pprint handle bytestrings is so that they can be used as
text.
It's worth highlighting that even with the Char8 version of ByteStrings you
still run into encoding issues. Remember the days before Unicode came about?
True, 8-bit encodings are often ASCII-compatible and therefore the
representations of digits and whitespace are consistent regardless of
(ASCII-compatible) encoding, but that's still just begging for issues. What
are the semantics of the byte 0xA0 with respect to pretty-printing issues
like linewraps? Are they consistent among all extant 8-bit encodings? What
about bytes in 0x80..0x9F? What about 0x7F for that matter?
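
To make the 0xA0 question concrete, here is a minimal sketch of what the Char8 view does with those bytes. It assumes only the bytestring and base packages; the names nbspByte, viaChar8 and the three predicates are made up for illustration. Char8 maps each byte to the code point with the same number, which amounts to reading the bytes as Latin-1 whether or not that is what the producer meant:

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.Char (isControl, isSpace)

-- A single raw byte, 0xA0, with no encoding information attached.
nbspByte :: B.ByteString
nbspByte = B.pack [0xA0]

-- Char8 turns each byte into the Char with the same numeric value,
-- i.e. it silently assumes Latin-1.
viaChar8 :: B.ByteString -> String
viaChar8 = BC.unpack

-- Under that assumption 0xA0 is U+00A0 (no-break space), which
-- Data.Char.isSpace classifies as whitespace.
treatedAsSpace :: Bool
treatedAsSpace = all isSpace (viaChar8 nbspByte)

-- Bytes 0x80..0x9F come out as C1 control characters ...
c1AreControls :: Bool
c1AreControls = all isControl (viaChar8 (B.pack [0x80 .. 0x9F]))

-- ... and 0x7F is DEL, also a control character.
delIsControl :: Bool
delIsControl = all isControl (viaChar8 (B.pack [0x7F]))
```

If the bytes were really CP437, 0xA0 would be the letter á, and in Windows-1252 most of 0x80..0x9F are printable characters rather than controls, so a pretty-printer that breaks lines or measures width on the Char8 view would be guessing.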
I won't say that ByteStrings should never be used for text (there are plenty
of programs whose use of text involves only whitespace splitting and moving
around the resultant opaque blobs of memory). But at a bare minimum, the use
of ByteStrings for encoding text needs to be done via newtype wrapper(s)
which keep track of the encoding. Especially for typeclass instances.
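
One way such a wrapper could look, as a rough sketch only: the Encoded type, the ASCII and Latin1 tags, the smart constructors and the Decode class below are all hypothetical names, not part of bytestring or any existing package. A phantom type parameter records the encoding, so instances can be given per encoding rather than for bare ByteStrings:

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Hypothetical type-level tags naming the encoding of the bytes.
data ASCII
data Latin1

-- The wrapper: same bytes, but the type says how to read them.
newtype Encoded enc = Encoded { rawBytes :: B.ByteString }
  deriving (Eq, Show)

-- Smart constructor: accept only bytes that are valid 7-bit ASCII.
mkAscii :: B.ByteString -> Maybe (Encoded ASCII)
mkAscii bs
  | B.all (< 0x80) bs = Just (Encoded bs)
  | otherwise         = Nothing

-- Every byte sequence is valid Latin-1, so no check is needed.
mkLatin1 :: B.ByteString -> Encoded Latin1
mkLatin1 = Encoded

-- Decoding is defined per encoding, never for a bare ByteString.
class Decode enc where
  decode :: Encoded enc -> String

-- For both of these the Char8 view (byte n becomes code point n)
-- happens to be the correct decoding.
instance Decode ASCII where
  decode = BC.unpack . rawBytes

instance Decode Latin1 where
  decode = BC.unpack . rawBytes
```

A pretty-printing instance could then be written for Encoded ASCII alone, which would cover the "textual ByteStrings using just ASCII" case without committing to any meaning for bytes at or above 0x80.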
*shrug* this discussion on #haskell came about because lispy wanted to
generate textual ByteStrings (using just ASCII) and would prefer not
to have the overhead of Text.

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
IvanMiljenovic.wordpress.com