
FYI: I just released new versions of system-filepath and
system-fileio, which attempt to work around the changes in GHC 7.2.
On Wed, Nov 2, 2011 at 11:55, Max Bolingbroke wrote:
>> Maybe I'm misunderstanding, but it sounds like you're still trying to treat POSIX file paths as text. There should not be any iconv or locales or anything involved in looking up a POSIX file path.
> The thing is that on every non-Unix OS paths *can* be interpreted as text, and people expect them to be. In fact, even on Unix most programs/frameworks interpret them as text - e.g. IIRC Qt's QString class is used for filenames in that framework, and if you view filenames in an end-user app like Nautilus it obviously decodes them in the current locale for presentation.
There is a difference between how paths are rendered to users, and how they are handled by applications. Applications *must* use whatever the operating system says a path is. If a path is bytes, they must use bytes. If a path is text, they must use text. How they present paths to the user is a matter of user interface design. For what it's worth, on my Ubuntu system, Nautilus ignores the locale and just treats all paths as either UTF-8 or invalid. To me, this seems like the most reasonable option; the concept of "locale encoding" is entirely vestigial, and should only be used in certain specialized cases.
> Paths as text is just what people expect, and is grandfathered into the Haskell libraries themselves as "type FilePath = String". PEP-383 behaviour is (I think) a good way to satisfy this expectation while still not sacrificing the ability to deal with files that have names encoded in some way other than the locale encoding.
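For readers unfamiliar with PEP 383, here is a minimal sketch of the round-trip idea. It assumes a plain ASCII "locale encoding" to keep things short, and the names escapeDecode and escapeEncode are made up for this illustration; it is not the code used by GHC or by Python. Bytes that cannot be decoded are mapped to lone surrogates in U+DC80..U+DCFF, so the original bytes can be recovered later.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Decode bytes as ASCII. Any byte >= 0x80 is undecodable in ASCII, so it
-- is smuggled through as a lone surrogate in the range U+DC80..U+DCFF.
escapeDecode :: [Word8] -> String
escapeDecode = map toChar
  where
    toChar b
      | b < 0x80  = chr (fromIntegral b)
      | otherwise = chr (0xDC00 + fromIntegral b)

-- Inverse of escapeDecode: ASCII characters become their byte, escaped
-- surrogates become the original byte, and anything else is rejected
-- because this toy ASCII codec cannot represent it.
escapeEncode :: String -> Maybe [Word8]
escapeEncode = mapM toByte
  where
    toByte c
      | n < 0x80                   = Just (fromIntegral n)
      | n >= 0xDC80 && n <= 0xDCFF = Just (fromIntegral (n - 0xDC00))
      | otherwise                  = Nothing
      where
        n = ord c
```

With these definitions, escapeEncode (escapeDecode bs) gives back Just bs for any list of bytes, which is the property that lets a String-typed path still name a file whose bytes are not valid in the locale encoding.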
Paths as text is what *Windows* programmers expect. Paths as bytes is what's expected by programmers on non-Windows OSes, including Linux and OS X. I'm not saying one is inherently better than the other, but considering that various UNIX and UNIX-like operating systems have been using byte-based paths for near on forty years now, trying to abolish them by redefining the type is not a useful action.
> (Perhaps if Haskell had an abstract FilePath data type rather than FilePath = String we could do something different.
This is the general purpose of my system-filepath package, which provides a set of generic modifications, applicable to paths from various OS families.
> But it's not clear if we could, without also having ugliness like getArgs :: IO [Byte])
We *ought* to have getArgs :: IO [ByteString], at least on POSIX systems. It's totally OK if high-level packages like "directory" want to hide details behind some nice abstractions. But the low-level libraries, like "base" and "unix" and "Win32", must must must provide direct low-level access to the operating system's APIs. The only other option is to re-implement half of the standard library using FFI bindings, which is ugly (for file/directory manipulation) or nearly impossible (for opening handles).

If you're going to make all the System.IO stuff use text, at least give us an escape hatch. The "unix" package is ideally suited, as it's already inherently OS-specific. Something like this would be perfect:

------------------
System.Posix.File.openHandle :: CString -> IOMode -> IO Handle
System.Posix.File.rename :: CString -> CString -> IO ()
------------------
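
To give an idea of the FFI workaround for the file-manipulation case, here is a rough sketch of a byte-based rename on a POSIX system. The name renameBytes is hypothetical and is not part of "unix" or any other package; the sketch only assumes the C rename() function and the "bytestring" package. The handle-opening case is left out, since that is the one that is hard to do from outside "base".

```haskell
import qualified Data.ByteString as B
import Foreign.C.Error (throwErrnoIfMinus1_)
import Foreign.C.String (CString)
import Foreign.C.Types (CInt(..))

-- Direct binding to POSIX rename(); the path bytes reach the OS untouched,
-- with no locale decoding or encoding in between.
foreign import ccall "rename"
  c_rename :: CString -> CString -> IO CInt

-- Hypothetical helper: rename a file given the raw bytes of both paths.
-- Throws an IOError built from errno if the call returns -1.
renameBytes :: B.ByteString -> B.ByteString -> IO ()
renameBytes old new =
  B.useAsCString old $ \cOld ->
  B.useAsCString new $ \cNew ->
    throwErrnoIfMinus1_ "renameBytes" (c_rename cOld cNew)
```

B.useAsCString copies the bytes and appends a NUL terminator, so a path containing an embedded NUL byte would be silently truncated; POSIX paths cannot contain NUL anyway, but a real library would probably want to reject such input explicitly.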