
On Wed, 2005-03-16 at 13:09 +0000, Duncan Coutts wrote:
> On Wed, 2005-03-16 at 11:55 +0000, Ross Paterson wrote:
> > It doesn't affect functions added by the hierarchical libraries, i.e. those functions are safe only with the ASCII subset. (There is a vague plan to make Foreign.C.String conform to the FFI spec, which mandates locale-based encoding, and thus would change all those, but it's still up in the air.)
>
> Hmm. I'm not convinced that automatically converting to the current locale is the ideal behaviour (it'd certainly break all my programs!). Certainly a function for converting into the encoding of the current locale would be useful for many users, but it's important to be able to know the encoding with certainty. For example, some libraries (e.g. Gtk+) take all strings in UTF-8 irrespective of the current locale (Gtk+ does locale-dependent conversions on IO etc., but the internal representation is always UTF-8). We do the conversion to UTF-8 on the Haskell side and so produce a byte string, which we marshal using the FFI CString functions.
Silly me! There are C marshaling functions that are specified to do just this, but I never noticed them before! withCAString and similar functions treat Haskell Strings as byte strings.

Duncan
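
To illustrate the approach described above (encode to UTF-8 on the Haskell side, then marshal the result as a plain byte string), here is a minimal sketch. It is not code from this thread: the names encodeUtf8 and marshalledBytes are hypothetical, and only withCAString, peekArray0 and castPtr come from the standard FFI libraries. It assumes the input contains no NUL characters, since the marshalled C string is NUL-terminated.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)
import Foreign.C.String (withCAString)
import Foreign.Marshal.Array (peekArray0)
import Foreign.Ptr (castPtr)

-- Hypothetical helper: encode a String as UTF-8, representing each
-- output byte as a Char below 256 (a "byte string" in String form).
encodeUtf8 :: String -> String
encodeUtf8 = concatMap (map chr . encodeChar . ord)
  where
    encodeChar :: Int -> [Int]
    encodeChar c
      | c < 0x80    = [c]
      | c < 0x800   = [ 0xC0 .|. (c `shiftR` 6)
                      , 0x80 .|. (c .&. 0x3F) ]
      | c < 0x10000 = [ 0xE0 .|. (c `shiftR` 12)
                      , 0x80 .|. ((c `shiftR` 6) .&. 0x3F)
                      , 0x80 .|. (c .&. 0x3F) ]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18)
                      , 0x80 .|. ((c `shiftR` 12) .&. 0x3F)
                      , 0x80 .|. ((c `shiftR` 6) .&. 0x3F)
                      , 0x80 .|. (c .&. 0x3F) ]

-- Hypothetical helper: marshal the UTF-8 bytes with withCAString, which
-- passes each Char through as one byte, then read the bytes back from
-- the C buffer (up to the terminating NUL) to show what C code would see.
marshalledBytes :: String -> IO [Word8]
marshalledBytes s =
  withCAString (encodeUtf8 s) $ \p -> peekArray0 0 (castPtr p)

main :: IO ()
main = marshalledBytes "caf\233" >>= print
-- prints [99,97,102,195,169]
```

In a real binding the callback passed to withCAString would call the foreign function (for example a Gtk+ routine expecting UTF-8) instead of reading the buffer back.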