Newbie unicode utf8 question

Hi there,

I want to read a Unicode file and print it to the screen. My terminal was unintentionally set to the POSIX locale, which was the reason it didn't work properly. When I added two hSetEncoding statements it worked fine, even under POSIX.

The test file uc.in contains a single line:

Möbius Château

Here is the small test program I tried:

<-----------------------------snip---------------------------->
module Main where

import System.IO
import Text.Printf

main :: IO ()
main = do
  h <- openFile "uc.in" ReadMode
  hSetEncoding h utf8
  hSetEncoding stdout utf8
  s <- hGetContents h
  print s
  putStrLn $ "File contains line: " ++ s
  printf "File contains line: %s\n" s
<-----------------------------snap---------------------------->

Both putStrLn and printf work fine. When I set a different locale, e.g. en_GB.iso88591, both putStrLn and printf work fine even if I haven't set the encoding to utf8 in the source.

Then I discovered System.IO.UTF8. However, when I use it as below, the output is garbled.

<-----------------------------snip---------------------------->
module Main where

import System.IO
import System.IO.UTF8 as U
import Text.Printf

main :: IO ()
main = do
  h <- openFile "uc.in" ReadMode
  hSetEncoding stdout utf8
  s <- U.hGetContents h
  U.print s
  U.putStrLn $ "File contains line: " ++ s
  printf "File contains line: %s\n" s
<-----------------------------snap---------------------------->

Any idea why this won't work?

--
Manfred

On Sun, Oct 23, 2011 at 08:52, Manfred Lotz wrote:
> Then I discovered System.IO.UTF8. However, when I use this as below then the output is garbled.

System.IO.UTF8 is intended for earlier versions of ghc which didn't support I/O encoding; if you use it with modern ghc you're likely to get things encoded twice, which will indeed be garbled.

--
brandon s allbery allbery.b@gmail.com
wandering unix systems administrator (available) (412) 475-9364 vm/sms
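
To illustrate what "encoded twice" means for the ö in Möbius, here is a minimal sketch using only base. It assumes the older library turns a String into a String holding one Char per UTF-8 byte, and that the handle then applies its own UTF-8 encoding to that result; that is a reading of the explanation above, not something confirmed against the library source. The names encodeChar and utf8Bytes are made up for this example.

```haskell
module Main where

import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Numeric (showHex)

-- Hypothetical helper: encode one code point as UTF-8 bytes (kept as Ints).
encodeChar :: Char -> [Int]
encodeChar c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = [ 0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12)
                  , cont (n `shiftR` 6), cont n ]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)

-- Hypothetical helper: UTF-8 encode a whole String.
utf8Bytes :: String -> [Int]
utf8Bytes = concatMap encodeChar

-- Render bytes as space-separated hex for display.
hex :: [Int] -> String
hex = unwords . map (\b -> showHex b "")

main :: IO ()
main = do
  let once  = utf8Bytes "\246"            -- "\246" is ö (U+00F6)
      -- Assumption: the bytes are handed on as Chars and encoded again.
      twice = utf8Bytes (map chr once)
  putStrLn ("encoded once:  " ++ hex once)   -- c3 b6
  putStrLn ("encoded twice: " ++ hex twice)  -- c3 83 c2 b6
```

A UTF-8 terminal shows the four bytes c3 83 c2 b6 as "Ã¶" instead of "ö", which is the kind of garbling described in the question.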

On Sun, 23 Oct 2011 09:01:13 -0400, Brandon Allbery wrote:
> On Sun, Oct 23, 2011 at 08:52, Manfred Lotz wrote:
> > Then I discovered System.IO.UTF8. However, when I use this as below then the output is garbled.
>
> System.IO.UTF8 is intended for earlier versions of ghc which didn't support I/O encoding; if you use it with modern ghc you're likely to get things encoded twice, which will indeed be garbled.

Aah, ok. Then I may safely ignore System.IO.UTF8.

--
Thanks,
Manfred