
Johan Tibell wrote:
What *does* matter to the programmer is what encodings putStr and getLine use. AFAIK, they use "lower 8 bits of Unicode code point", which is almost functionally equivalent to Latin-1.
Which is terrible! You should have to be explicit about what encoding you expect. Python 3000 does it right.
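To make the "lower 8 bits" behaviour concrete, here is a minimal sketch that models it as a pure function. It is not taken from any compiler's source; the names truncateToByte, byteToChar and roundTrip are made up for illustration, and the model simply assumes the behaviour described above.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Model of the behaviour described above: keep only the low 8 bits
-- of each code point. Illustration only, not actual library source.
truncateToByte :: Char -> Word8
truncateToByte = fromIntegral . ord   -- Int -> Word8 wraps modulo 256

-- Reading a byte back as a character is the Latin-1 interpretation.
byteToChar :: Word8 -> Char
byteToChar = chr . fromIntegral

-- What a string looks like after being written and read back
-- under this model.
roundTrip :: String -> String
roundTrip = map (byteToChar . truncateToByte)
```

Under this model, text within Latin-1 (such as "café") survives the round trip, while anything above U+00FF is silently replaced by an unrelated character: U+03BB (Greek lambda) comes back as U+00BB. No error is reported, which is why being explicit about the encoding matters.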
Presumably there wasn't a sufficiently good answer available in time for Haskell 98.
Will there be one for Haskell Prime?
The I/O library needs an overhaul, but I'm not sure how to do this in a backwards-compatible manner, which would probably be required for inclusion in Haskell'. One could, like Python 3000, break backwards compatibility. I'm not sure about the implications of doing this. Maybe introducing a new System.IO.Unicode module would be an option.

There are already some libraries that attempt to create a new string and I/O library for Haskell, based on Unicode, with a separation of byte semantics and character semantics. See for example Streams [1] or CompactString [2].
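As a rough illustration of what "separation of byte semantics and character semantics" can mean, here is a small sketch using only the base library. It is not the API of Streams or CompactString; the Bytes and Encoding types and the encode function are hypothetical names invented for this example.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical wrapper: raw bytes get their own type, so they cannot
-- be confused with a String without an explicit conversion.
newtype Bytes = Bytes [Word8] deriving (Eq, Show)

-- Hypothetical encoding tag; a real library would support many more.
data Encoding = Latin1 | UTF8 deriving (Eq, Show)

-- Encode text under an explicitly chosen encoding. The result is
-- Nothing if some character cannot be represented in that encoding.
encode :: Encoding -> String -> Maybe Bytes
encode enc = fmap (Bytes . concat) . mapM (encodeChar enc)

encodeChar :: Encoding -> Char -> Maybe [Word8]
encodeChar Latin1 c
  | n < 0x100 = Just [fromIntegral n]
  | otherwise = Nothing                      -- outside Latin-1
  where n = ord c
encodeChar UTF8 c
  | n < 0x80                   = Just [fromIntegral n]
  | n < 0x800                  = Just [0xC0 .|. hi 6, cont 0]
  | n >= 0xD800 && n <= 0xDFFF = Nothing     -- surrogates are not encodable
  | n < 0x10000                = Just [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise                  = Just [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi s   = fromIntegral (n `shiftR` s)                      -- leading bits
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)  -- continuation byte
```

With this shape, encode UTF8 on U+03BB yields the two bytes 0xCE 0xBB, while encode Latin1 on the same character yields Nothing instead of silently dropping bits. The caller always has to name the encoding, which is the point being argued for above.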
Regards,
Reinier

[1]: http://haskell.org/haskellwiki/Library/Streams
[2]: http://twan.home.fmf.nl/compact-string/