
Quoth Brandon S Allbery KF8NH
On 8/14/10 01:29, Kevin Jardine wrote:
I think that this kind of programming detail should be handled internally (even if necessary by switching automatically from UTF-8 to UTF-16 depending upon the language).
It seems like the right thing, described in the wrong words - wouldn't it be a more sensible ideal to simply `switch' depending on the character encoding?

I mean, to start with, you'd surely wish for some standardization, so that the difference between UTF-8 and UTF-16 is essentially internal, while you use the same API regardless of which one is underneath.

Second, a key requirement to effectively work with external data is support for multiple character encodings. E.g., if Text is internally UTF-16, it still must be able to input and output UTF-8, and presumably also UTF-16 where appropriate.

So given full support for _both_ encodings (for example, a Text implementation for `native' UTF-8), and support for input data of _either_ encoding as encountered at run time ... then the internal implementation choice should simply follow the external data. For Chinese inputs you'd be running UTF-16 functions, for French UTF-8.

	Donn Cave, donn@avvanta.com
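
To put rough numbers on the Chinese-versus-French remark, here is a small Python sketch that compares how many bytes the same string takes in each encoding. It is only an illustration, not anything from the Text library: the helper names (`encoded_sizes`, `smaller_encoding`) and the two sample strings are made up for this example, and picking the encoding by size is an assumption. The message above suggests following whatever encoding the external data already arrives in, which would usually lead to the same choice.

```python
def encoded_sizes(s: str) -> dict:
    """Byte length of s in UTF-8 and in UTF-16 (little-endian, no BOM)."""
    return {
        "utf-8": len(s.encode("utf-8")),
        "utf-16": len(s.encode("utf-16-le")),
    }


def smaller_encoding(s: str) -> str:
    """Hypothetical policy: pick whichever encoding is more compact.

    Ties go to UTF-8; that is an arbitrary choice for this sketch.
    """
    sizes = encoded_sizes(s)
    return "utf-8" if sizes["utf-8"] <= sizes["utf-16"] else "utf-16"


# Sample inputs (invented for illustration).
french = "Où est la bibliothèque ?"   # mostly ASCII, two accented letters
chinese = "图书馆在哪里？"              # all characters outside ASCII

for label, text in (("French", french), ("Chinese", chinese)):
    print(label, encoded_sizes(text), "->", smaller_encoding(text))
```

The French sample is 26 bytes in UTF-8 against 48 in UTF-16, while the Chinese sample is 21 bytes in UTF-8 against 14 in UTF-16, so the compact choice flips with the language of the data.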