
Alexander V Vershilov
The problem is that Prelude.getLine uses the current locale to decode characters. For example, if you have a UTF-8 locale, everything works out of the box:
$ runhaskell 1.hs
résumé 履歴書 резюме
résumé 履歴書 резюме
But if you change the locale, you get an error:
$ LANG="C" runhaskell 1.hs
résumé 履歴書 резюме
1.hs: <stdin>: hGetLine: invalid argument (invalid byte sequence)
That seems to be correct behaviour: the only way to know the meaning of the bits input by a user is what encoding the user says they are in. But in general this issue is an instance of inheriting sins from the OS: the meaning of the bit pattern in a file should be part of the file, but we are stuck with OSs that use a global variable (which should be anathema to Haskell). So if user A has locale set one way and inputs a file and sends the filename to user B on the same system, user B might well see something completely different to A when looking at the file.
To force Haskell to use UTF-8, you can read the input as a byte sequence and decode it as UTF-8 characters.
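A minimal sketch of that approach, assuming the bytestring and text packages that ship with GHC are available. It is not necessarily the code in 1.hs, which is not shown here, and the helper name decodeUtf8Line is made up for illustration. The line is read as raw bytes, so the locale is never consulted on input, and is then decoded as UTF-8 explicitly:

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (lenientDecode)
import System.IO (hSetEncoding, stdout, utf8)

-- Hypothetical helper: decode raw bytes as UTF-8.
-- Malformed sequences become U+FFFD instead of raising an exception.
decodeUtf8Line :: ByteString -> String
decodeUtf8Line = T.unpack . TE.decodeUtf8With lenientDecode

main :: IO ()
main = do
  -- Output also goes through the locale encoding by default,
  -- so force UTF-8 on stdout as well.
  hSetEncoding stdout utf8
  raw <- BC.getLine               -- raw bytes, no locale decoding
  putStrLn (decodeUtf8Line raw)
```

If converting by hand is not required, calling hSetEncoding stdin utf8 before Prelude.getLine should have a similar effect using only base. Either way, the program is assuming UTF-8 input rather than detecting it.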
But of course, the programmer can only hope that UTF-8 will work here. If the user is typing in KOI8-R, reading it as UTF-8 is going to be wrong.
-- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk