
2009/8/20 Ketil Malde
Stuart Cook writes:
GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
Loading package base ... linking ... done.
Prelude> map Data.Char.ord "饁"
[39233]    <== 0x9941
Prelude> putStrLn "饁"
A          <== 0x41
It seems that GHCi is clever enough to decode UTF-8 input, which only serves to confuse System.IO even more.
I get:
GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
Loading package base ... linking ... done.
Prelude> map Data.Char.ord "饁"
[39233]
and
Prelude> map Data.Char.ord "£"
[163]
but also:
% ghci -e 'map Data.Char.ord "饁"'
<interactive>:1:21:
    lexical error in string/character literal at character '\129'
but again:
% ghci -e 'map Data.Char.ord "£"'
[194,163]
So GHCi, used interactively, translates input from the terminal's UTF-8, but truncates output to eight bits. When executing a string with -e, it appears to read the input byte for byte (which I think was the original behavior at some point).
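For what it's worth, the numbers in the transcripts can be reproduced by hand. The following is only a minimal sketch: `truncateTo8Bits` and `utf8Bytes` are names made up for illustration, not anything from the base library, and the encoder is hand-rolled rather than what GHCi itself does. The character literals are written as escapes so that the result does not depend on how the source file is decoded.

```haskell
module Main where

import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Hypothetical helper: keep only the low eight bits of a code point,
-- which is what the putStrLn output above appears to do.
truncateTo8Bits :: Char -> Char
truncateTo8Bits c = chr (ord c .&. 0xFF)

-- Hypothetical helper: hand-rolled UTF-8 encoding of a single character,
-- for illustration only.
utf8Bytes :: Char -> [Int]
utf8Bytes c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = [ 0xF0 .|. (n `shiftR` 18)
                  , cont (n `shiftR` 12)
                  , cont (n `shiftR` 6)
                  , cont n ]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)

main :: IO ()
main = do
  print (ord (truncateTo8Bits '\x9941'))  -- 65, i.e. 'A' (0x41)
  print (utf8Bytes '\xA3')                -- [194,163]
  print (utf8Bytes '\x9941')              -- [233,165,129]
```

The last line gives 129 as the third UTF-8 byte of 0x9941, which would be consistent with the '\129' in the lexical error if -e is indeed handing the lexer one character per byte.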
-k
I get the same behaviour here. If you want to try Latin 1 (ISO-8859-1) then you can use a utility called Luit (maybe only Linux?):

luit -encoding ISO-8859-1 ghci

£ becomes £, but gives the same byte output as above.

Iain