
Char in Haskell represents a Unicode character. I don't know exactly what its size is, but it must be at least 16 bits and maybe more. String would then share those properties.

However, I'm usually accustomed to dealing with data in 8-bit words. So I have some questions:

* If I use hPutStr on a string, is it guaranteed that the number of 8-bit bytes written equals (length stringWritten)?
  + If no, what is the representation written? I'm assuming UTF-8. How could I find out how many bytes were actually written?
  + If yes, what happens to the upper 8 bits? Are they simply stripped off?
* If I run hGetChar, is it possible that it would consume more than one byte of input? How can I determine whether or not this has happened?
* Does Haskell treat the "this is a Unicode file" marker specially in any way?
* Same questions for withCString and related String<->CString conversions.

-- John
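
P.S. To make the first question concrete, here is a rough sketch of the comparison I have in mind. It assumes the bytes written are UTF-8, which is only my guess and not something I know about hPutStr. The names utf8Width and utf8Length are helpers I made up for illustration; they are not library functions.

```haskell
import Data.Char (ord)

-- Number of bytes one Char would occupy IF it were encoded as UTF-8.
-- (Assumption: UTF-8 output. Surrogate code points are not treated specially.)
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- Total UTF-8 byte count of a String (hypothetical helper).
utf8Length :: String -> Int
utf8Length = sum . map utf8Width

main :: IO ()
main = do
  -- "\239" is i-diaeresis (U+00EF), "\8364" is the euro sign (U+20AC).
  let s = "na\239ve \8364"
  putStrLn ("length:      " ++ show (length s))
  putStrLn ("UTF-8 bytes: " ++ show (utf8Length s))
```

For this string, length reports 7 while the UTF-8 count comes to 10, so under that assumption the two numbers would differ as soon as the text contains anything outside ASCII.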