2011/3/30 Simon Marlow <marlowsd@gmail.com>

On 04/01/2011 14:50, Simon Marlow wrote:

On 27/12/2010 17:51, Ross Paterson wrote:

On Mon, Dec 27, 2010 at 09:04:41AM -0800, Mark Lentczner wrote:

On Dec 25, 2010, at 7:34 AM, Wolfgang Jeltsch wrote:

The documentation of hSetBinaryMode says:

This has the same effect as calling hSetEncoding with latin1,
together with hSetNewlineMode with noNewlineTranslation.

It seems that this sentence is wrong.
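
For reference, the two calls named in that documentation sentence can be written out as below. This is only a sketch: setLatin1NoTranslation is a hypothetical helper name, while hSetEncoding, latin1, hSetNewlineMode and noNewlineTranslation are the System.IO names the sentence mentions. Whether this combination really matches hSetBinaryMode h True is exactly what is being questioned here, so the sketch makes no claim either way.

```haskell
import System.IO

-- Hypothetical helper: applies the two settings that the documentation
-- of hSetBinaryMode claims are equivalent to binary mode.
setLatin1NoTranslation :: Handle -> IO ()
setLatin1NoTranslation h = do
  hSetEncoding h latin1                    -- text encoding: ISO 8859-1
  hSetNewlineMode h noNewlineTranslation   -- no "\r\n" <-> "\n" conversion
```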

It seems wrong to me in intent. When a handle is in "binary" mode, it
shouldn't have any encoding. If things were different, I'd want to
propose that doing String I/O to such handles should fail, and that
you should only be able to use ByteString with them. But I suppose
that isn't viable...

That sounds like a very good idea. Even better, flag this error at
compile time by having a different type for unencoded handles.

Good plan. I'll make a proposal to add System.IO.binary. A different
type for binary handles is the right thing, but it's a larger
undertaking so I don't plan to attack it right now (someone else is
welcome to do so).

As per the above discussion, I formally propose to add the following to System.IO:

-- | An encoding in which Unicode code points are translated to bytes
-- by taking the code point modulo 256. When decoding, bytes are
-- translated directly into the equivalent code point.
--
-- This encoding never fails in either direction. However, encoding
-- discards information, so encode followed by decode is not the
-- identity.
binary :: TextEncoding

Any objections?
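
To make the described semantics concrete, here is a minimal pure sketch of the two directions. It is an illustration only, not the proposed implementation: encodeBinary and decodeBinary are hypothetical names, and they work on lists rather than on Handle buffers.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: encode each character as its code point modulo 256.
-- This never fails, but it discards the high bits of larger code points.
encodeBinary :: String -> [Word8]
encodeBinary = map (fromIntegral . (`mod` 256) . ord)

-- Hypothetical helper: decode each byte to the code point of the same value.
decodeBinary :: [Word8] -> String
decodeBinary = map (chr . fromIntegral)
```

Under these definitions, decodeBinary . encodeBinary is the identity only for characters up to '\255'. For example, '\955' (code point 955) encodes to the byte 187, because 955 modulo 256 is 187, and that byte decodes to '\187'.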

In the 'bytestring' library, String handling based on this encoding is provided by the Data.ByteString.Char8 module and its lazy cousin. Therefore, I'd suggest naming this non-standard (or is there a standard?) encoding

char8 :: TextEncoding

Apart from historical reasons, I also prefer this name as it conveys more information about its semantics.

best regards,

Simon