Re: Data.ByteString candidate 3

25 Apr 2006

On Tue, Apr 25, 2006 at 02:34:20PM +0100, Simon Marlow wrote:
...
Duncan Coutts wrote:
...
How would we distinguish a full fixed-width 4-byte Unicode version?
Good point, and that's why using the Data.PackedString hierarchy was 
nice, because it accommodated various different character widths.  I 
quite like
  Data.ByteString
  Data.PackedString.Latin1
  Data.PackedString.UTF8
  Data.PackedString.UCS4
  etc.
Do we really need all of these? UCS4BE? UTF16? If you care intimately
about the underlying binary representation, then you should be using
ByteString directly, since you are working with binary data. If you just
want a fast string replacement, then you don't care about the internal
representation, you just want it to be fast.

We don't want issues where someone's library takes UTF8 strings but
someone else's takes UCS4 strings and you want them to play nice
together.

I think all we really need are

Data.ByteString
Data.PackedString

(Though, I suppose Latin1 could be useful)

But note: do the people who want Latin1 just need ASCII? If we have a
UTF8 PackedString, then we can make ASCII-specific access routines that
are just as fast as the ones in the Latin1 variety, without giving up
the ability to store full Unicode values in the string.
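
To make the ASCII point concrete, here is a rough sketch, not a proposed
API. It assumes a hypothetical PackedString that is just a newtype over a
UTF-8 encoded ByteString; the names pack, asciiByte, splitAscii and
countAscii are made up for illustration, and the small encoder is only
there to keep the example self-contained. The idea is that a byte below
0x80 never appears inside a multi-byte UTF-8 sequence, so an
ASCII-specific routine can scan raw bytes without decoding anything.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical UTF-8-backed packed string (an assumption for this sketch,
-- not an existing library type).
newtype PackedString = PS B.ByteString
  deriving (Eq, Show)

-- Encode one code point as UTF-8 bytes (no surrogate checks, for brevity).
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi s   = fromIntegral (n `shiftR` s)
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

pack :: String -> PackedString
pack = PS . B.pack . concatMap encodeChar

-- Just the byte for an ASCII character, Nothing for anything wider.
asciiByte :: Char -> Maybe Word8
asciiByte c
  | ord c < 0x80 = Just (fromIntegral (ord c))
  | otherwise    = Nothing

-- Split on an ASCII delimiter with a plain byte scan. This is safe because
-- bytes below 0x80 never occur inside a multi-byte UTF-8 sequence.
splitAscii :: Char -> PackedString -> Maybe [PackedString]
splitAscii c (PS bs) = fmap (\w -> map PS (B.split w bs)) (asciiByte c)

-- Count occurrences of an ASCII character, again without decoding.
countAscii :: Char -> PackedString -> Maybe Int
countAscii c (PS bs) = fmap (\w -> B.count w bs) (asciiByte c)
```

With this, splitAscii ':' on a string containing accented or Greek
characters still splits only at real colons, and the pieces remain valid
UTF-8. Both routines return Nothing for a non-ASCII argument, since that
case would need a real decoder and is outside this sketch.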

        John

-- 
John Meacham - ⑆repetae.net⑆john⑈