
On Tue, 2007-11-27 at 18:38 +0000, Paul Johnson wrote:
Brandon S. Allbery KF8NH wrote:
However, the IO system truncates [characters] to 8 bits.
Should this be considered a bug?
A design problem.
I presume that it's because
was defined in the days of ASCII-only strings, and the functions in System.IO are defined in terms of . But does this need to be the case in the future?
When it's phrased as "truncates to 8 bits" it sounds so simple; surely all we need to do is not truncate to 8 bits, right? The problem is, what encoding should it pick? UTF-8, UTF-16, UTF-32, EBCDIC? How would people specify that they really want to use a binary file? Whatever we change, it'll break programs that rely on the existing meanings.

One sensible suggestion many people have made is that H98 file IO should use the locale encoding and do Unicode/String <-> locale conversion. So those would all be text files. Then openBinaryFile would be used for binary files. Of course, we'd then need control over setting the encoding and over what to do on encountering encoding errors.

IMHO, someone should make a full proposal by implementing an alternative System.IO library that deals with all these encoding issues and implements H98 IO in terms of that. It doesn't have to be fast initially; it just has to get the API right and not design the API in a way that excludes the possibility of a fast implementation later.

Duncan
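To make the text/binary split concrete, here is a minimal sketch of what such an API can look like from the caller's side. It is not a proposal in itself: it assumes the `hSetEncoding` and `utf8` names that later versions of GHC's System.IO provide, it picks UTF-8 explicitly instead of the locale encoding so the result is predictable, and the helper names (`writeText`, `readText`, `readBytes`) and the file name are made up for illustration.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Text file: Chars go through an explicitly chosen encoding (UTF-8 here;
-- a locale-based design would use the locale's encoding instead).
writeText :: FilePath -> String -> IO ()
writeText path s = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8
  hPutStr h s

readText :: FilePath -> IO String
readText path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8
  s <- hGetContents h
  _ <- evaluate (length s)   -- force the lazy read before the handle closes
  return s

-- Binary file: no conversion at all, each byte comes back as a Char in 0..255.
readBytes :: FilePath -> IO [Int]
readBytes path = withBinaryFile path ReadMode $ \h -> do
  s <- hGetContents h
  _ <- evaluate (length s)
  return (map fromEnum s)

main :: IO ()
main = do
  let path = "encoding-demo.txt"   -- hypothetical file name
      text = "na\239ve \955"       -- 7 characters, two of them non-ASCII
  writeText path text
  back  <- readText path
  bytes <- readBytes path
  print (back == text)                -- True: the text round-trips
  print (length text, length bytes)   -- (7,9): 7 characters, 9 bytes on disk
```

The same file is seven characters when opened as text and nine bytes when opened as binary, which is exactly the distinction the H98 functions currently cannot express. What to do on a decoding error is left open in this sketch.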