
Simon Marlow wrote:
Lifting this constraint in GHC is a bit tricky because we currently use the OS's text<->binary translation to do I/O (doesn't matter on Unix, right now).
So what? If the handle is in text mode, you won't get the exact bytes as they are in the file when calling hGetWord8 (at least on some systems). But that is exactly what I would expect if the handle is in text mode. If I need the exact binary representation of a text file, I have to use binary mode, of course.
So if you open a file in text mode and read it with hGetWord8, what bytes do you expect to get, exactly? Perhaps each byte of the UCS-4 representation? Big endian or little endian order?
No. You would get the raw stream of bytes from the file, except that
\r\n (Windows) or \r (Mac) would be converted to \n.
That should be the only difference between binary and text modes.
Similarly, actual text (i.e. Unicode) I/O should work in both binary
and text modes, with the only difference being EOL conversion.
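
To make that concrete, here is a rough sketch of the semantics I have in mind, written as pure functions over a list of bytes. It is not GHC's implementation and not an existing library API: the names EOL, textModeDecode and binaryModeDecode are made up for illustration, and real I/O would of course work on buffers rather than lists.

```haskell
import Data.Word (Word8)

-- Hypothetical type: the line-terminator convention of the platform.
data EOL
  = LF    -- Unix: "\n"
  | CRLF  -- Windows: "\r\n"
  | CR    -- classic Mac: "\r"
  deriving (Eq, Show)

-- Hypothetical: what a byte-level read on a text-mode handle would yield.
-- The platform's line terminator becomes a single '\n' (byte 10);
-- every other byte passes through unchanged.
textModeDecode :: EOL -> [Word8] -> [Word8]
textModeDecode LF bs = bs
textModeDecode CR bs = map (\b -> if b == 13 then 10 else b) bs
textModeDecode CRLF bs = go bs
  where
    go (13 : 10 : rest) = 10 : go rest  -- "\r\n" collapses to "\n"
    go (b : rest)       = b : go rest   -- a lone '\r' is left alone
    go []               = []

-- Hypothetical: binary mode performs no translation at all.
binaryModeDecode :: [Word8] -> [Word8]
binaryModeDecode = id
```

Under these assumptions, textModeDecode CRLF applied to the bytes of "a\r\nb" gives the bytes of "a\nb", while binaryModeDecode returns them untouched. Decoding those bytes into Unicode characters would be a separate step layered on top, identical in both modes.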
--
Glynn Clements