Re: [Haskell-cafe] PROPOSAL: New efficient Unicode string library.

24 Sep 2007

Johan Tibell wrote:
...
> Dear haskell-cafe,
> I would like to propose a new, ByteString-like Unicode string library
> which can be used where both efficiency (currently offered by
> ByteString) and i18n support (currently offered by vanilla Strings)
> are needed. I wrote a skeleton draft today but I'm a bit tired so I
> didn't get all the details. Nevertheless I think it is fleshed out enough
> for some initial feedback. If I can get the important parts nailed
> down before Hackathon I could hack on it there.
> Apologies for not getting everything we discussed on #haskell down in
> the first draft. It'll get in there eventually.
> Bring out your Unicode kung-fu!
> http://haskell.org/haskellwiki/UnicodeByteString

Have you looked at my CompactString library[1]? It essentially does 
exactly this, with one extension: the type is parameterized over the 
encoding. From the discussion on #haskell it would seem that some people 
consider this unforgivable, while others consider it essential.

In my opinion flexibility should be more important; you can always
restrict things later. For the common case where the encoding doesn't
matter there is Data.CompactString.UTF8, which provides an un-parameterized
type. I called this type 'CompactString' as well, which might be a bit
unfortunate. I don't like the name UnicodeString, since it suggests that
the normal string somehow doesn't support Unicode. This module could be
made more prominent: maybe Data.CompactString could be the specialized
type, while Data.CompactString.Parameterized supports different encodings.
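
To make the "parameterized over the encoding" idea concrete, here is a
minimal sketch of how such a type could look. It is not the actual
CompactString API: the class Encoding, the tags UTF8 and UTF32BE, the
functions pack, unpack and byteLength, and the alias UTF8String are all
hypothetical names chosen for illustration, and the UTF-8 decoder assumes
well-formed input and does no validation.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import qualified Data.ByteString as B
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Proxy (Proxy (..))
import Data.Word (Word8)

-- Hypothetical encoding tags; they have no values and exist only as types.
data UTF8
data UTF32BE

-- Hypothetical class: how one encoding turns characters into bytes and back.
class Encoding e where
  encodeChar  :: Proxy e -> Char -> [Word8]
  decodeChars :: Proxy e -> [Word8] -> String

instance Encoding UTF8 where
  encodeChar _ c
    | n < 0x80    = [fromIntegral n]
    | n < 0x800   = [0xC0 .|. hi 6, cont 0]
    | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
    | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
    where
      n      = ord c
      hi s   = fromIntegral (n `shiftR` s)
      cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

  -- Assumes well-formed UTF-8; malformed input is not detected.
  decodeChars _ = go
    where
      go [] = []
      go (b : bs)
        | b < 0x80  = chr (fromIntegral b) : go bs
        | b < 0xE0  = multi 1 (b .&. 0x1F) bs
        | b < 0xF0  = multi 2 (b .&. 0x0F) bs
        | otherwise = multi 3 (b .&. 0x07) bs
      -- Combine a lead byte's payload with k continuation bytes.
      multi k lead bs =
        let (cs, rest) = splitAt k bs
            step acc w = (acc `shiftL` 6) .|. fromIntegral (w .&. 0x3F)
        in chr (foldl step (fromIntegral lead) cs) : go rest

instance Encoding UTF32BE where
  -- Four bytes per character, most significant byte first.
  encodeChar _ c = [byte 24, byte 16, byte 8, byte 0]
    where byte s = fromIntegral (ord c `shiftR` s)

  decodeChars p (a : b : c : d : rest) =
    chr (foldl (\acc w -> (acc `shiftL` 8) .|. fromIntegral w) 0 [a, b, c, d])
      : decodeChars p rest
  decodeChars _ _ = []

-- The encoding is a phantom parameter: same bytes underneath, but the
-- type records how those bytes are to be interpreted.
newtype CompactString e = CS B.ByteString
  deriving (Eq)

pack :: forall e. Encoding e => String -> CompactString e
pack = CS . B.pack . concatMap (encodeChar (Proxy :: Proxy e))

unpack :: forall e. Encoding e => CompactString e -> String
unpack (CS bs) = decodeChars (Proxy :: Proxy e) (B.unpack bs)

byteLength :: CompactString e -> Int
byteLength (CS bs) = B.length bs

-- The "restrict things later" step: an un-parameterized type for the
-- common case is just the general type with the encoding fixed.
type UTF8String = CompactString UTF8
```

With these definitions, pack "abc" :: UTF8String and
pack "abc" :: CompactString UTF32BE hold the same text in different byte
layouts, and the type checker keeps the two from being mixed up. Code
that doesn't care about the encoding only ever mentions UTF8String.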

A word of warning: The library is still in the alpha stage of 
development. I don't fully trust it myself yet :)

[1] http://twan.home.fmf.nl/compact-string/

Twan

Twan van Laarhoven