
On Sep 12, 2007, at 10:18, Andrea Rossato wrote:
Suppose that, in a Linux system, in a UTF-8 locale, you create a file with non-ASCII characters. For instance: touch abèèè
Now, I would expect that the output of a shell command such as "ls ab*" would be a string/list of 5 chars. Instead I find it to be a list of 8 chars...;-)
That is expected. The low-level filesystem storage doesn't know about character sets; a filename is just a sequence of bytes, so non-ASCII filenames must be stored in some encoding, e.g. UTF-8. 8 characters is therefore correct: you get one Char per byte, 2 for "ab" plus 2 for each "è". You must do the UTF-8 decoding on input yourself, because Haskell does not do so automatically. This will also be true with getdirent() aka getDirectoryContents.

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH
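
To illustrate the decoding step described above, here is a minimal sketch that uses only the base library. It assumes the input String holds one byte (0..255) per Char, which is the situation described in this thread. The name decodeUtf8String is hypothetical, not a library function, and the decoder is deliberately simple: it does not reject overlong encodings.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)

-- | Hypothetical helper: decode a String in which every Char holds one
-- byte (0..255) of UTF-8 data. Malformed or truncated sequences, and any
-- Char above 0xF7, are replaced by U+FFFD. Overlong forms are not rejected.
decodeUtf8String :: String -> String
decodeUtf8String [] = []
decodeUtf8String (c:cs)
  | b < 0x80              = c : decodeUtf8String cs   -- plain ASCII byte
  | b >= 0xC0 && b < 0xE0 = multi 1 (b .&. 0x1F)      -- 2-byte sequence
  | b >= 0xE0 && b < 0xF0 = multi 2 (b .&. 0x0F)      -- 3-byte sequence
  | b >= 0xF0 && b < 0xF8 = multi 3 (b .&. 0x07)      -- 4-byte sequence
  | otherwise             = '\xFFFD' : decodeUtf8String cs
  where
    b = ord c
    -- Consume n continuation bytes and combine them with the lead bits.
    multi n lead =
      let (conts, rest) = splitAt n cs
          code          = foldl step lead conts
      in if length conts == n && all isCont conts && code <= 0x10FFFF
           then chr code : decodeUtf8String rest
           else '\xFFFD' : decodeUtf8String cs
    -- Continuation bytes have the bit pattern 10xxxxxx.
    isCont x   = ord x .&. 0xC0 == 0x80
    -- Shift in the low six bits of each continuation byte.
    step acc x = (acc `shiftL` 6) .|. (ord x .&. 0x3F)

main :: IO ()
main = do
  -- The bytes of "abèèè" in UTF-8: 'a', 'b', then 0xC3 0xA8 three times.
  let raw = map chr [0x61, 0x62, 0xC3, 0xA8, 0xC3, 0xA8, 0xC3, 0xA8]
  print (length raw)                     -- 8
  print (length (decodeUtf8String raw))  -- 5
```

Applied to the output of "ls ab*" or to each entry returned by getDirectoryContents, a function like this would turn the 8-Char list into the expected 5-Char string, provided the filenames really are UTF-8.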