Re: [Haskell-cafe] Re: String vs ByteString

On Tue, Aug 17, 2010 at 13:29, Ketil Malde
Tako Schotanus
writes: Just like Char is capable of encoding any valid Unicode codepoint.
Unless a Char in Haskell is 32 bits (or at least more than 16 bits) it can NOT encode all Unicode code points.
And since it can encode (or rather, represent) any valid Unicode codepoint, it follows that it is 32 bits (and at least more than 16 bits).
:-)
(Char is basically a 32-bit value, limited to valid Unicode code points, so it corresponds to UCS-4/UTF-32.)
Yeah, I tried looking it up but I couldn't find the technical definition for Char, but in the end I found that "maxBound" was "0x10FFFF", making it basically 24 bits :) I know for example that Java uses only 16 bits for its Chars and therefore can NOT give you all Unicode code points with a single Char; with Strings you can, because of surrogate pairs.
-Tako
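To make the point about 16-bit chars concrete, here is a minimal sketch of how a code point above U+FFFF is split into two 16-bit units in UTF-16. The name `toUtf16` is a hypothetical helper written for this illustration, not a function from base or any library, and it does not reject Chars that are themselves lone surrogates.

```haskell
import Data.Char (ord)
import Data.Word (Word16)

-- Hypothetical helper: encode one Char as a list of UTF-16 code units.
-- Code points below 0x10000 fit in one unit; the rest need a surrogate pair.
toUtf16 :: Char -> [Word16]
toUtf16 c
  | n < 0x10000 = [fromIntegral n]
  | otherwise   = [ fromIntegral (0xD800 + m `div` 0x400)  -- high surrogate
                  , fromIntegral (0xDC00 + m `mod` 0x400)  -- low surrogate
                  ]
  where
    n = ord c
    m = n - 0x10000

main :: IO ()
main = do
  print (length (toUtf16 'A'))        -- one 16-bit unit
  print (length (toUtf16 '\x1D11E'))  -- two 16-bit units (a surrogate pair)
```

A Haskell Char holds U+1D11E directly, while a single 16-bit unit cannot, which is the difference being described above.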

On Tue, Aug 17, 2010 at 1:36 PM, Tako Schotanus
Yeah, I tried looking it up but I couldn't find the technical definition for Char, but in the end I found that "maxBound" was "0x10FFFF", making it basically 24 bits :)
I think that's enough to represent all the assigned Unicode code points. I also think the Unicode consortium (or whatever it is called) made some statement about the maximum number of bits they'll ever use.
-- Johan
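For anyone who wants to check the numbers, a small sketch that asks GHC for the upper bound of Char and counts the bits it needs. `bitsNeeded` is a hypothetical helper defined here, not a library function. It gives 21 bits for 0x10FFFF, so "24 bits" above is best read as "fits in three bytes".

```haskell
import Data.Char (ord)
import Numeric (showHex)

-- Hypothetical helper: number of bits needed to write a non-negative Int in binary.
bitsNeeded :: Int -> Int
bitsNeeded 0 = 0
bitsNeeded n = 1 + bitsNeeded (n `div` 2)

main :: IO ()
main = do
  let top = ord (maxBound :: Char)  -- largest code point a Char can hold
  putStrLn ("maxBound :: Char = 0x" ++ showHex top "")
  putStrLn ("bits needed      = " ++ show (bitsNeeded top))
```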

"Johan" == Johan Tibell
writes:
Johan> On Tue, Aug 17, 2010 at 1:36 PM, Tako Schotanus
Participants (3): Colin Paul Adams, Johan Tibell, Tako Schotanus