Re: [Haskell-cafe] Re: PROPOSAL: New efficient Unicode string library.

2 Oct 2007


On Tue, Oct 02, 2007 at 08:02:30AM -0700, Deborah Goldsmith wrote:
> ...
> UTF-16 is the type used in all the APIs. Everything else is considered an 
> encoding conversion.
> CoreFoundation uses UTF-16 internally except when the string fits entirely 
> in a single-byte legacy encoding like MacRoman or MacCyrillic. If any kind 
> of Unicode processing needs to be done to the string, it is first coerced 
> to UTF-16. If it weren't for backwards compatibility issues, I think we'd 
> use UTF-16 all the time as the machinery for switching encodings adds 
> complexity. I wouldn't advise it for a new library.

I do not believe that anyone was seriously advocating multiple blessed
encodings.  The main question is *which* encoding to bless.  99+% of
text I encounter is in US-ASCII, so I would favor UTF-8.  Why is UTF-16
better for me?
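
To put rough numbers on the size side of that question, here is a minimal
sketch that counts how many bytes a string would occupy in each encoding.
It uses only base; the names utf8Len, utf16Len and encodedSizes are made up
for illustration and are not part of any proposed library. It says nothing
about processing speed or API convenience, only storage.

```haskell
import Data.Char (ord)

-- Bytes needed to encode one code point in UTF-8 (1 to 4 bytes).
utf8Len :: Char -> Int
utf8Len c
  | n < 0x80    = 1   -- US-ASCII range
  | n < 0x800   = 2
  | n < 0x10000 = 3   -- rest of the Basic Multilingual Plane
  | otherwise   = 4
  where n = ord c

-- Bytes needed to encode one code point in UTF-16:
-- one 16-bit unit inside the BMP, a surrogate pair outside it.
utf16Len :: Char -> Int
utf16Len c
  | ord c < 0x10000 = 2
  | otherwise       = 4

-- (UTF-8 size, UTF-16 size) in bytes for a whole string.
encodedSizes :: String -> (Int, Int)
encodedSizes s = (sum (map utf8Len s), sum (map utf16Len s))

main :: IO ()
main = do
  print (encodedSizes "Hello, world")  -- 12 ASCII characters
  print (encodedSizes "日本語")         -- 3 CJK characters
```

For the ASCII sample this gives (12,24), and for the three CJK characters
(9,6): UTF-8 is half the size for ASCII-only text, while UTF-16 is the
smaller one for text made up mostly of BMP characters above U+07FF.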

Stefan

Stefan O'Rear