
On Thu, Sep 13, 2007 at 12:23:33AM +0000, Aaron Denney wrote:
> the characters read and written should correspond to the native environment notions and encodings. These are, under Unix, determined by the locale system.
Locales are fine for things like the language of error messages or the format used to display the time, but they are *not* a good solution for things like file names and file contents. Even a single Unix machine (without networking) can have *several* users. Using the locale to determine the charset of a file name won't work if these users use different locales. The same goes for file contents. The charset must be recorded somehow, either in the file itself (as XML does, for example) or in the metadata. Otherwise, there is no way to exchange files, or even to change one's own locale (if I switch from Latin1 to UTF-8, what becomes of my files?).
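
To make the file-name problem concrete, here is a small sketch in Python (my choice of language, nothing from this thread). The name "café" and the helper decode_as are made up for illustration. It only shows that the same bytes mean different things depending on which charset the reader's locale assumes.

```python
# Hypothetical file name, written with an escape so the source stays ASCII.
name = "caf\u00e9"  # "cafe" with an acute accent on the e

# The bytes each user would actually store on disk.
latin1_bytes = name.encode("latin-1")  # written by a Latin-1 user
utf8_bytes = name.encode("utf-8")      # written by a UTF-8 user


def decode_as(raw: bytes, charset: str) -> str:
    """Decode raw bytes the way a program trusting `charset` would."""
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return "<undecodable>"


# A UTF-8 user looks at the name created by the Latin-1 user.
seen_by_utf8_user = decode_as(latin1_bytes, "utf-8")

# A Latin-1 user looks at the name created by the UTF-8 user.
seen_by_latin1_user = decode_as(utf8_bytes, "latin-1")

# ascii() keeps the output printable whatever the terminal charset is.
print(ascii(seen_by_utf8_user))
print(ascii(seen_by_latin1_user))
```

In the first case the bytes are not valid UTF-8 at all. In the second they decode without error, but to a different string than the one that was written. Neither reader can tell what was meant unless the charset is recorded along with the name.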