
On Sunday 22 August 2010 19:23:03, Yves Parès wrote:
In fact the encoding problem is more general.
When I simply do 'readFile "bar/fooé"', then I'm told: *** Exception: bar/fooé: openFile: does not exist (No such file or directory)
Try

ghci> readFile (Data.ByteString.Char8.unpack (Data.ByteString.UTF8.fromString "fooé"))

(the same trick works for find). The problem is probably that readFile truncates each character of the FilePath to 8 bits, while the filepath on your system is UTF-8 encoded, so you have to give readFile a pseudo-UTF-8-encoded filepath. At least, that's how it works here, inconvenient though it is.
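If you need the conversion more than once, you could wrap it in a small helper. This is only a sketch, assuming the bytestring and utf8-string packages are installed; toPseudoUtf8 is just a name made up here:

import qualified Data.ByteString.Char8 as B8
import qualified Data.ByteString.UTF8 as UTF8

-- Encode the String as UTF-8 bytes, then view each byte as a Char again,
-- so that the 8-bit truncation applied to the FilePath reproduces the
-- byte sequence that actually sits on the disk.
toPseudoUtf8 :: FilePath -> FilePath
toPseudoUtf8 = B8.unpack . UTF8.fromString

main :: IO ()
main = readFile (toPseudoUtf8 "bar/fooé") >>= putStr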
How am I supposed to read files whose names contain non-ASCII characters? (I use GHC 6.12.3 under Ubuntu 10.04, 32-bit)
While the inconvenience lasts (people are thinking about how to handle the situation correctly), avoid non-ASCII characters in filepaths if possible.
My locale is fr_FR.utf8. For instance, with HSH: I have a 'bar' directory containing a file 'fooé'.
run $ "find bar" :: IO [String] returns me : ["bar", "bar/foo*\233*"]
That one is okay, 'é' is '\233' and the Show instance for Char escapes all characters > '\127'.
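A quick way to see that escaping at work, independent of any terminal encoding issues:

ghci> putStrLn (show '\233')
'\233'
ghci> putStrLn (show '\100')
'd'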
and run $ "find bar -name fooé" returns []
Maybe the same issue, try run $ "find bar -name foo\195\169"
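(\195 and \169 being the two bytes of the UTF-8 encoding of 'é'; with the utf8-string package exposed you can check, for example:

ghci> Data.ByteString.unpack (Data.ByteString.UTF8.fromString "\233")
[195,169]
)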
When I provoke an error by running run $ "find fooé", it says: find: "foo\351": No file or directory
On the other hand, if it now says \351, which is ş, there seems to be something else amiss.
So it is not the same encoding!