Re: [Haskell-cafe] PROPOSAL: New efficient Unicode string library.

26 Sep 2007


On Wed, 2007-09-26 at 09:05 +0200, Johan Tibell wrote:
...
...
I'll look over the proposal more carefully when I get time, but the
most important issue is to not let the storage type leak into the
interface.
Agreed,
...
From an implementation point of view, UTF-16 is the most efficient
representation for processing Unicode. It's the native Unicode
representation for Windows, Mac OS X, and the ICU open source i18n
library. UTF-8 is not very efficient for anything except English. Its
most valuable property is compatibility with software that thinks of
character strings as byte arrays, and in fact that's why it was
invented.
If UTF-16 is what's used by everyone else (how about Java? Python?) I
think that's a strong reason to use it. I don't know Unicode well
enough to say otherwise.

I disagree.  I realize I'm a dissenter in this regard, but my position
is: excellent Unix support first, portability second, excellent support
for Win32/MacOS a distant third.  That seems to be the opposite of every
language's position.  Unix absolutely needs UTF-8 for backward
compatibility.
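
To make the trade-off concrete, here is a rough sketch (in Python, since it was mentioned above) that compares the encoded size of a few strings in UTF-8 and UTF-16, and checks for zero bytes, which terminate a C string and so break byte-oriented Unix interfaces. The sample strings and the helper names `encoded_sizes` and `has_nul_byte` are made up for illustration; this says nothing about processing speed, only about storage size and byte-level compatibility.

```python
# Illustrative sketch only: sample strings and helper names are hypothetical.

def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for text, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


def has_nul_byte(data):
    """True if the byte string contains a zero byte, which ends a C string."""
    return b"\x00" in data


samples = {
    "English": "hello",
    "Greek": "αβγδ",
    "Japanese": "こんにちは",
}

for name, text in samples.items():
    utf8_len, utf16_len = encoded_sizes(text)
    print(f"{name}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")

# ASCII text stays free of zero bytes in UTF-8 but not in UTF-16.
print(has_nul_byte("hello".encode("utf-8")))
print(has_nul_byte("hello".encode("utf-16-le")))
```

For these samples UTF-8 is smaller for the ASCII text, the same size for the Greek text, and larger for the Japanese text, while only the UTF-16 encoding of ASCII text contains zero bytes.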

jcc

Jonathan Cast