
On Sun, Nov 28, 2010 at 8:53 AM, Yitzchak Gale wrote:
Michael Snoyman wrote:
Perhaps a silly question, but are you certain that the input file is valid UTF-8?
That is a very good point.
You could also try using the readFile from utf8-string... [or] read the contents as a lazy bytestring and then use the decode functions...
Those approaches are now both deprecated. Either keep doing what you are doing, which gives you conceptually simple strings as lists of Char, or, for better efficiency, use the text package:
    -- readFile and putStr for lazy Text live in Data.Text.Lazy.IO
    import qualified Data.Text.Lazy.IO as T

    main :: IO ()
    main = do
      text <- T.readFile "unicode.txt"
      T.putStr text
In any case, you still need to have the correct encoding set on the handles as before. (And the input needs to be valid for your selected encoding.)
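A minimal sketch of setting the encoding on the handles explicitly, assuming the file is UTF-8 and a GHC recent enough to provide hSetEncoding in System.IO. The helper name readUtf8File is hypothetical, and the strict Data.Text.IO functions are used here instead of the lazy ones so that the read finishes before the handle is closed:

```haskell
import System.IO
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Read a whole file as Text, decoding it as UTF-8 regardless of the locale.
-- (readUtf8File is an illustrative name, not part of any library.)
readUtf8File :: FilePath -> IO T.Text
readUtf8File path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8   -- override the locale-dependent default
    TIO.hGetContents h    -- strict read, completes before the handle closes

main :: IO ()
main = do
  hSetEncoding stdout utf8  -- the output handle needs an encoding too
  text <- readUtf8File "unicode.txt"
  TIO.putStr text
```

If the file is not valid UTF-8, the read is expected to fail with an exception rather than return garbage, which matches the point about the input needing to be valid for the selected encoding.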
Which is why I would actually recommend sticking with the bytestring/text combination when you know what the file encoding will be and it is not system-dependent. It's the approach that I use with Hamlet et al. for precisely that reason.

Michael
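A rough sketch of the bytestring/text combination, assuming the file is known to be UTF-8. It uses decodeUtf8' from Data.Text.Encoding, which is available in current versions of the text package; the helper name decodeFileBytes is hypothetical. Reading the raw bytes bypasses the handle encoding on input, and the decode step reports invalid UTF-8 instead of throwing:

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Data.Text.Encoding (decodeUtf8')

-- Decode raw bytes as UTF-8, returning an error message on invalid input.
-- (decodeFileBytes is an illustrative name, not part of any library.)
decodeFileBytes :: B.ByteString -> Either String T.Text
decodeFileBytes bytes =
  case decodeUtf8' bytes of
    Left err   -> Left (show err)
    Right text -> Right text

main :: IO ()
main = do
  bytes <- B.readFile "unicode.txt"   -- raw bytes, no locale involved
  case decodeFileBytes bytes of
    Left err   -> putStrLn ("not valid UTF-8: " ++ err)
    Right text -> TIO.putStr text     -- output still uses stdout's encoding
```

Note that only the input side is independent of the system here: TIO.putStr still encodes according to the encoding set on stdout.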