Fwd: How to input Unicode string in Haskell program?

---------- Forwarded message ----------
From: Semyon Kholodnov
The problem is that Prelude.getLine uses the current locale to decode characters. For example, if you have a UTF-8 locale, everything works out of the box:
$ runhaskell 1.hs
résumé 履歴書 резюме
résumé 履歴書 резюме
But if you change the locale, you get an error:
$ LANG="C" runhaskell 1.hs
résumé 履歴書 резюме
1.hs: <stdin>: hGetLine: invalid argument (invalid byte sequence)
To force Haskell to use UTF-8, you can read the input as a byte sequence and decode it as UTF-8 characters, for example:
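import qualified Data.ByteString as S
import qualified Data.Text.Encoding as T

main = do
  -- Read raw bytes and decode them as UTF-8, ignoring the locale.
  x <- fmap T.decodeUtf8 S.getLine
  -- Echo the text back as UTF-8 bytes, again bypassing the locale.
  S.putStr (T.encodeUtf8 x)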
Now the code works even with a different locale, and you read UTF-8 from the shell independently of the user's locale settings.
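One caveat with the approach above: decodeUtf8 throws an exception if the bytes are not valid UTF-8. A minimal sketch of a more defensive variant follows, assuming the text and bytestring packages; decodeLine is a hypothetical helper name introduced only for illustration, and decodeUtf8' is the Either-returning decoder from Data.Text.Encoding.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Decode raw bytes as UTF-8, reporting malformed input as a Left
-- instead of throwing. (Hypothetical helper, not from the original post.)
decodeLine :: ByteString -> Either String T.Text
decodeLine bytes =
  case TE.decodeUtf8' bytes of
    Left err   -> Left (show err)
    Right text -> Right text

main :: IO ()
main = do
  -- Read one line of raw bytes; no locale-based decoding happens here.
  raw <- BC.getLine
  case decodeLine raw of
    Left err   -> putStrLn ("input is not valid UTF-8: " ++ err)
    -- Write the text back as UTF-8 bytes so the output also ignores the locale.
    Right text -> BC.putStrLn (TE.encodeUtf8 text)
```

The output is written as bytes on purpose: printing the decoded text with putStrLn would go through the locale encoding again and could fail under LANG="C".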
-- Alexander
On 21 February 2013 13:58, Semyon Kholodnov wrote:
Imagine we have this simple program:
module Main(main) where

main = do
  x <- getLine
  putStrLn x
Now I want to run it somehow, enter "résumé 履歴書 резюме" and see this string printed back as "résumé 履歴書 резюме". The first problem is that my computer runs Windows, which means that I can't use ghci ":main" or the result of "ghc main.hs" to enter such an outrageous string: the Windows console is locked to one specific local code page, and no code page contains Latin-1, Cyrillic and Kanji symbols at the same time.
But there is also WinGHCi. So I do ":main" and copy-paste this string into the window (it works, because Windows has had Unicode for 20 years now), but the output is all messed up. In a rather curious way, actually: the input string is converted to a UTF-8 byte string, and its bytes are treated as characters from my local code page.
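The garbling described above can be reproduced in a few lines. This is only a sketch: Latin-1 stands in for the local code page, which is an assumption (the actual code page on the poster's machine is not stated), and misread is a hypothetical name.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Encode text as UTF-8, then misread every byte as one Latin-1 character.
-- Latin-1 is an assumed stand-in for the "local code page".
misread :: T.Text -> T.Text
misread = TE.decodeLatin1 . TE.encodeUtf8

main :: IO ()
main = do
  let original = T.pack "\233"        -- the single character é (U+00E9)
      garbled  = misread original     -- its two UTF-8 bytes, read as two characters
  print (T.length original, T.length garbled)
```

Each non-ASCII character turns into two or more unrelated characters, which matches the kind of output reported for WinGHCi.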
So it appears that I have no way to enter Unicode strings into my Haskell programs by hand, and I should read them from files. That's sad, and I refuse to think I am the first one with such a problem, so I assume there is a solution or workaround. Would someone please tell me this solution? Other than "Just stick to the 127 letters of ASCII", of course.
-- Alexander