
Sean Leather wrote:
So then, what is the standard? ...I also noticeably don't see UTF-16.
Right. From what I understand, there are a handful of language-specific 16-bit encodings that are popular.
So, if this is the case, then a similar question still arises for CJK text: What format/library to use for it (assuming one doesn't want a performance penalty for translating between Data.Text's internal format and the target format)? It appears that there are no ideal answers to such questions.
Right. If you know you'll be working in a specific encoding (UTF-8, Latin-1, one of the CJK encodings, or whatever), it might sometimes make sense to skip Data.Text, do the IO as raw bytes using ByteString, and encode/decode manually only when needed. Otherwise, Data.Text is probably the way to go.
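
To make the "raw bytes first, decode only when needed" idea concrete, here is a minimal sketch, assuming the data is UTF-8 and that the bytestring and text packages are available. The helper names countLines and lineAsText are made up for illustration. For a CJK encoding that Data.Text.Encoding does not cover, the decoding step would have to come from some other library.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Byte-level work: count newline bytes (0x0A) without decoding anything.
-- Assumption: the encoding never uses 0x0A inside a multi-byte character.
-- That holds for UTF-8 and Latin-1, but not for UTF-16.
countLines :: B.ByteString -> Int
countLines = B.count 10

-- Decode a single line (0-based index) only when its text is needed.
-- decodeUtf8' returns Left on invalid input instead of throwing,
-- which is mapped to Nothing here.
lineAsText :: Int -> B.ByteString -> Maybe T.Text
lineAsText n raw =
  case drop n (B.split 10 raw) of
    (l : _) -> either (const Nothing) Just (TE.decodeUtf8' l)
    []      -> Nothing

main :: IO ()
main = do
  -- In-memory sample bytes; real code would get them from B.readFile
  -- or B.getContents instead.
  let raw = TE.encodeUtf8 (T.pack "caf\233\nsecond line\n")
  print (countLines raw)
  print (lineAsText 0 raw)
```

Only the one line that is actually inspected pays the decoding cost; everything else stays as bytes from input to output.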