
John Millikin writes:
The reason many Japanese and Chinese users reject UTF-8 isn't due to space constraints (UTF-8 and UTF-16 are roughly equal), it's because they reject Unicode itself.
Probably because they don't think it's complicated enough¹?
Shift-JIS and the various Chinese encodings both contain Han characters which are missing from Unicode, either due to the Han unification or because they simply were not considered important enough to include.
Surely there's enough space left? I seem to remember some Han characters outside of the BMP, so I would have guessed this is an argument from back in the UCS-2 days. (BTW, on a long train ride, I brought the Linear B alphabet, and practiced writing notes to my kids. So Linear B isn't entirely useless :-)
From casual browsing of Wikipedia, the current status in CJK-land seems to be something like this:
  China: GB2312 and its successor GB18030
  Taiwan, Macao, and Hong Kong: Big5
  Japan: Shift-JIS
  Korea: EUC-KR

It is interesting that some of these provide a lot fewer characters than Unicode. Another feature of several of them is that ASCII and e.g. kana scripts take up one byte, and ideograms take up two, which correlates with the expected width of the glyphs.

Several of the pages indicate that Unicode, and mainly UTF-8, is gradually taking over.

-k

¹ Those who remember Emacs in the MULE days will know what I mean.
-- 
If I haven't seen further, it is by standing in the footprints of giants
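
PS: To put rough numbers on the byte widths mentioned above, here is a small Python sketch. It is only an illustration: the sample characters, the helper names (byte_len, width_table), and the choice of encodings to compare are my own assumptions, and it relies on the codecs that ship with Python's standard library.

```python
# Illustrative sketch only: sample characters and encoding list are
# arbitrary choices, not taken from any of the Wikipedia pages.
SAMPLES = {
    "ASCII letter": "A",
    "half-width katakana": "\uff71",  # HALFWIDTH KATAKANA LETTER A
    "full-width hiragana": "\u3042",  # HIRAGANA LETTER A
    "Han ideogram": "\u6f22",         # the "kan" of "kanji"
}

ENCODINGS = ["shift_jis", "utf-8", "utf-16-le"]


def byte_len(text, encoding):
    """Return how many bytes `text` occupies when encoded as `encoding`."""
    return len(text.encode(encoding))


def width_table():
    """Map each sample name to its byte length under every encoding."""
    return {
        name: {enc: byte_len(ch, enc) for enc in ENCODINGS}
        for name, ch in SAMPLES.items()
    }


if __name__ == "__main__":
    for name, row in width_table().items():
        print(name, row)
```

With these samples, Shift-JIS uses one byte for the ASCII letter and the half-width katakana, and two bytes for the full-width hiragana and the ideogram, so the one-byte case for kana holds for the half-width forms only. The same ideogram takes three bytes in UTF-8 and two in UTF-16.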