
Christian Maeder wrote:
Simon Marlow wrote:
Christian Maeder wrote:
I'm tempted to replace "ä" with "\228" in literals. What does haddock do with utf-8 in comments? Will DrIFT -- using read- and writeFile -- still work correctly?
The problem I fear is that writeFile does not produce a utf-8 encoded file:
writeFile "t.hs" "main = putStrLn \"äöüßÄÖÜ\""
Using "\228\246\252\223\196\214\220" instead of "äöüßÄÖÜ" only avoids conversion to utf-8 of the initial file l1.hs (attached), but the generated file t.hs is a latin-1 file in both cases.
Cheers, Christian
*Main> :l l1.hs
Compiling Main             ( l1.hs, interpreted )
Ok, modules loaded: Main.
*Main> main
*Main> :l t.hs
Compiling Main             ( t.hs, interpreted )
Ok, modules loaded: Main.
*Main> main
äöüßÄÖÜ
I'm not sure I see the problem - the I/O library doesn't do Unicode encoding/decoding, it always just takes the low 8 bits of each character, hence truncating Unicode to Latin-1. If you restrict yourself to Latin-1 characters in string literals, then I/O will work as expected (i.e. Latin-1 only).

If you need to do I/O in a different encoding, I'm afraid you'll have to code it up yourself right now, or use some other library (there are packed string libraries around that can do I/O in UTF-8, for example, and Bulat's new I/O library does char encodings).

Cheers,
Simon
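
For readers who want to see what "code it up yourself" might look like, here is a minimal sketch, not taken from the thread. The names truncateLatin1, encodeUtf8 and encodeChar are hypothetical helpers. truncateLatin1 only models the low-8-bits behaviour Simon describes; encodeUtf8 turns each character into one to four characters in the range 0-255, so that an I/O layer which keeps only the low 8 bits would emit valid UTF-8 bytes. The sketch assumes well-formed input and does not reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Model of the truncation described above: keep the low 8 bits of each Char.
truncateLatin1 :: String -> String
truncateLatin1 = map (chr . (.&. 0xFF) . ord)

-- Hypothetical helper: UTF-8 encode a String, one output Char per byte.
encodeUtf8 :: String -> String
encodeUtf8 = concatMap (map chr . encodeChar . ord)

-- UTF-8 byte values for a single code point (surrogates are not checked).
encodeChar :: Int -> [Int]
encodeChar c
  | c < 0x80    = [c]
  | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
  | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
  | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                  , cont (c `shiftR` 6), cont c ]
  where
    -- Continuation byte: 10xxxxxx carrying the low six bits.
    cont x = 0x80 .|. (x .&. 0x3F)

main :: IO ()
main = print (map ord (encodeUtf8 "\228"))  -- prints [195,164]
```

With the truncating I/O discussed in this thread, passing `encodeUtf8 s` to writeFile instead of `s` would then produce a UTF-8 file. On an implementation whose handles apply a text encoding of their own, the file would presumably have to be opened in binary mode for the same effect, otherwise the bytes get encoded a second time.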