RE: Filename handling

At 14:04 17/08/04 +0100, Simon Marlow wrote:
> On 17 August 2004 12:44, Graham Klyne wrote:
>> Anyway, I'd like to see the common library functions provide at least minimal capabilities to allow multi-platform applications to do the right thing when handling filenames. I'm pretty agnostic about what they actually look like, but as an example I've found the Python and/or Java libraries to be pretty usable in this respect.
>
> I think there's general agreement that this would be a good thing, but discussion never seems to reach a conclusion. Anyone like to whip up a concrete proposal?

This may be a bit radical, but I'll float it anyway:

    pathToUri :: String -> String
        -- convert filename to a file: URI according to local system conventions
    uriToPath :: String -> String
        -- convert a file: URI to filename according to local system conventions

Hmmm... to preserve referential transparency, I suppose that should be:

    pathToUri :: String -> IO String
    uriToPath :: String -> IO String

The rationale here is that these two functions can be used to get any filename on any system into a form with well-defined syntax and properties and back again, allowing the other filename processing requirements (splitting apart, putting together, relative path evaluations, etc.) to be performed with the common form.

Of course, this doesn't deal with operations that need to actually access the file system (directory scanning, etc.), but many of these seem pretty well catered for in any case (cf. Directory library functions).

...

Failing this, I'd say that Isaac's module [1] has some pretty reasonable functions. I'd pick out:

    splitLastComp :: FilePath -> (FilePath,FilePath)
    isAbsolute :: FilePath -> Bool
    splitExt :: FilePath -> (FilePath, String)

The next function would be useful, but I'd be reluctant to include it until we're confident of having consistent regex support on all platforms:

    matchPath :: String          -- ^RegExp
              -> IO [FilePath]   -- ^IO because it must look to see what exists

An alternative, avoiding regex dependence, might be:

    matchPath :: (FilePath -> Bool) -> IO [FilePath]

And a very important (IMO) function that I don't see in Isaac's module would be something like:

    relativeTo :: FilePath -> FilePath -> FilePath

In my URI processing code, I've also added a complementary function:

    relativeFrom :: FilePath -> FilePath -> FilePath

which returns a relative path such that:

    (path `relativeFrom` base) `relativeTo` base == path

noting that the result of relativeFrom is not always uniquely determined. Maybe it's better to leave this out.
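
To make the relativeTo/relativeFrom round-trip property concrete, here is a minimal sketch of how the two functions might behave. It is not taken from Isaac's module or from the URI processing code mentioned above. It assumes POSIX-style paths with '/' as the only separator, an absolute base, and a base that names a directory rather than a file. The helpers components and normalise are hypothetical names introduced only for this illustration.

```haskell
import Data.List (intercalate)

-- Hypothetical helper: split a '/'-separated path into non-empty components.
components :: FilePath -> [String]
components = filter (not . null) . go
  where
    go s = case break (== '/') s of
      (c, [])       -> [c]
      (c, _ : rest) -> c : go rest

-- Assumption: POSIX conventions only, so "absolute" means a leading '/'.
isAbsolute :: FilePath -> Bool
isAbsolute ('/' : _) = True
isAbsolute _         = False

-- Hypothetical helper: resolve "." and ".." purely syntactically.
-- The accumulator holds the components seen so far, most recent first.
normalise :: [String] -> [String]
normalise = reverse . foldl step []
  where
    step acc "."        = acc
    step (_ : acc) ".." = acc
    step [] ".."        = []      -- ".." above the root stays at the root
    step acc c          = c : acc

-- Resolve a (possibly relative) path against an absolute base directory.
relativeTo :: FilePath -> FilePath -> FilePath
rel `relativeTo` base
  | isAbsolute rel = render (normalise (components rel))
  | otherwise      = render (normalise (components base ++ components rel))
  where
    render cs = '/' : intercalate "/" cs

-- Produce one relative path that leads from the base directory to the path.
-- Other answers are possible (for example "./b" instead of "b").
relativeFrom :: FilePath -> FilePath -> FilePath
path `relativeFrom` base =
  case replicate (length bs) ".." ++ ps of
    [] -> "."
    cs -> intercalate "/" cs
  where
    (bs, ps) = dropCommon (normalise (components base))
                          (normalise (components path))
    dropCommon (x : xs) (y : ys) | x == y = dropCommon xs ys
    dropCommon xs ys = (xs, ys)
```

Under these assumptions, "/a/b/c.txt" `relativeFrom` "/a/d" gives "../b/c.txt", and resolving that result with `relativeTo` "/a/d" gives back "/a/b/c.txt". The sketch is pure because it fixes one path convention; a version that adapts to the local system would run into the referential transparency question raised above.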

I think that a function like:

    isDirectory :: FilePath -> IO Bool

may also be needed when performing directory scanning operations.

[1] http://www.syntaxpolice.org/darcs_repos/OS.Path/Path.hs

...

Some related questions to consider:

- should we take seriously the point I make above about using IO so that referential transparency is rigorously preserved? If so, all of the above functions should return IO values, as the result may vary depending on the environment in which the program runs.

- do we care about legacy operating systems like VAX/VMS? (that would require version number support, and doesn't work well with interfaces that assume a single path separator character).

- how does the interface work with forthcoming systems like Microsoft's Longhorn. I hear that the directory tree concept is being replaced by file "attributes". Which leads me to think of...

- how does the interface work with WebDAV, which builds a file system like interface over HTTP, and adds property lists to the resources identified.

#g

------------
Graham Klyne
For email: http://www.ninebynine.org/#Contact