
(...) When it's phrased as "truncates to 8 bits" it sounds so simple; surely all we need to do is not truncate to 8 bits, right?
The problem is, what encoding should it pick? UTF-8, UTF-16, UTF-32, EBCDIC? (...)
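To make "truncates to 8 bits" concrete, here is a minimal sketch of what keeping only the low 8 bits of each code point does to a String. The names truncateChar and roundTrip are hypothetical helpers made up for illustration; this is not a claim about how any particular implementation does its I/O.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: keep only the low 8 bits of a code point.
-- fromIntegral from Int to Word8 wraps modulo 256.
truncateChar :: Char -> Word8
truncateChar = fromIntegral . ord

-- What comes back if every truncated byte is read as a Char again.
roundTrip :: String -> String
roundTrip = map (chr . fromIntegral . truncateChar)
```

Characters below U+0100 survive the round trip unchanged, while anything above is silently mapped to some unrelated character: 'λ' (U+03BB) comes back as U+00BB.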
One sensible suggestion many people have made is that H98 file IO should use the locale encoding and do Unicode/String <-> locale conversion. (...)
I'm really afraid of solutions where the behavior of your program changes with an environment variable that not everybody has configured properly, or even knows exists.
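One way to avoid depending on the environment is to name the encoding in the program itself. The sketch below assumes a GHC whose System.IO exports hSetEncoding and utf8; these are not part of H98, and writeFileUtf8 and readFileUtf8 are hypothetical helper names.

```haskell
import System.IO

-- Write a String as UTF-8, whatever the locale says.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path s = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8
  hPutStr h s

-- Read a file as UTF-8, whatever the locale says.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8
  s <- hGetContents h
  -- Force the whole string before withFile closes the handle.
  length s `seq` return s
```

With the encoding fixed like this, the same program should produce the same bytes on every machine. The price is that files written in the user's locale encoding are no longer read correctly unless that encoding happens to be UTF-8.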
Wouldn't it be sensible to stop using the H98 file I/O operations for binary files altogether? A Char represents a Unicode code point value and is not the right data type to use to represent a byte from a binary stream.
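As a sketch of what keeping bytes and characters apart could look like, the following reads and writes a file as Word8 values and never goes through Char. It assumes the bytestring package is available (it ships with GHC); writeBytes and readBytes are hypothetical helper names.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Write raw bytes; no text encoding or newline translation is involved.
writeBytes :: FilePath -> [Word8] -> IO ()
writeBytes path = BS.writeFile path . BS.pack

-- Read raw bytes back as Word8 values, never as Char.
readBytes :: FilePath -> IO [Word8]
readBytes path = fmap BS.unpack (BS.readFile path)
```

Because the element type is Word8, every value from 0 to 255 round-trips exactly, and the question of which encoding to pick only comes up when the program explicitly decodes those bytes into text.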
That seems nice; we would not have to create a "wide char" type just for Unicode.

This topic made me search the net for that nice quote: "Explanations exist: they have existed for all times, for there is always an easy solution to every problem: neat, plausible and wrong." (See: en.wikiquote.org/wiki/H._L._Mencken. That guy has many quotes worth reading.)

Strings as char lists is a very good example of that. It's simple and clean, but strings are not char lists in any reasonable sense.

Best,
Maurício