
On Wed, 2009-09-23 at 14:17 +0100, Simon Marlow wrote:
So I'm happy with the first two changes, I'm less convinced about changing </> to elide "." on the left. Perhaps people who worry about leading "./" when the FilePath is displayed to the user should just use normalise. That's what they do now and we seem to get along ok.
So this subject was discussed between Ian and myself in the original thread, see e.g.
http://www.haskell.org/pipermail/libraries/2007-December/008776.html
The conclusion was that in the filesystem semantics "./foo" is equal to "foo", but the string passed to rawSystem (and execp()) is not a FilePath, it is something like Either FilePath String.
Right. The file system OS calls take complete file paths, relative to "/" or implicitly relative to ".". system/rawSystem (and some other libs) take either a complete filepath, or an incomplete one which they then complete relative to some search path. In my experimental typed filepath it'd be Either CompletePath IncompletePath. So I would consider both as file paths (though different types). System.FilePath is pretty good for manipulating incomplete (relative) paths so I think it'd be a shame to declare that what rawSystem takes is not a FilePath and thus we do not need to consider functions of its ilk.
However, there are some oddities with the current proposal. e.g.
    splitFileName "./foo" == ("./", "foo")
    "./" </> "foo" == "./foo"
The current proposal just about hangs together because the tiny bit of normalisation that </> does exactly undoes the creation of "." in splitFileName, and there's no other way that splitFileName can generate ".". That's a terribly fragile property.
So if we take the first bit of your proposal and not the second we have:

    splitFileName "foo" == ("./", "foo")
    "./" </> "foo" == "./foo"

and thus we do not have:

    uncurry (</>) (splitFileName x) == x

because we end up with "foo" and "./foo". However I think that's fine. The splitFileName function is asking for the directory part of a relative/incomplete filepath and expecting it to be a real directory. Thus we are interpreting the original filepath as a complete filepath that is relative to ".". So given that by applying splitFileName we are taking that interpretation, it's ok to get back "./foo", because in that context we really were interpreting "foo" as "./foo".
If we "fixed" </> to do more normalisation, then we would no longer have the property that

    uncurry (</>) (splitFileName x) == x    (*1)
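To make the round-trip property concrete, here is a small toy model of the two behaviours being compared. It is only a sketch: splitFileName', combinePlain, combineElide and roundTrips are hypothetical stand-ins written for this illustration (POSIX separators only), not the definitions in System.FilePath.

```haskell
module Main where

-- Toy model only: these are illustrative stand-ins, NOT the filepath
-- library's definitions. POSIX-style '/' separators are assumed.

-- Split at the last '/', substituting "./" when there is no directory part.
splitFileName' :: String -> (String, String)
splitFileName' path =
  case break (== '/') (reverse path) of
    (revName, "")     -> ("./", reverse revName)            -- no '/' in path
    (revName, revDir) -> (reverse revDir, reverse revName)  -- dir keeps its '/'

-- Plain combine: join with a single separator, no normalisation at all.
combinePlain :: String -> String -> String
combinePlain dir name
  | null dir        = name
  | last dir == '/' = dir ++ name
  | otherwise       = dir ++ "/" ++ name

-- Combine that elides a left operand of "." or "./".
combineElide :: String -> String -> String
combineElide dir name
  | dir `elem` [".", "./"] = name
  | otherwise              = combinePlain dir name

-- Property (*1), parameterised over the combine function being tested.
roundTrips :: (String -> String -> String) -> String -> Bool
roundTrips combine x = uncurry combine (splitFileName' x) == x

main :: IO ()
main = do
  print (splitFileName' "foo")           -- ("./","foo")
  print (roundTrips combineElide "foo")  -- True
  print (roundTrips combinePlain "foo")  -- False: we get "./foo" back
  print (roundTrips combinePlain "a/b")  -- True
```

In this model the property holds for "foo" only because the eliding combine undoes the "./" that the split introduced, which is the coupling described above.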
I don't actually care about the details here as long as we have a story that is reasonably consistent. My main concern is that
isValid x => isValid (takeDirectory x)
the lack of which is the main problem with the current formulation, as you (Duncan) mentioned above.
Right.
Perhaps we should drop the magic normalisation that </> does, and apply the normalise function to both sides of (*1). There would probably be a bunch more properties that would have to change too, I'm guessing that normalise would proliferate.
We don't want full normalise, it does too much. So the interpretation I quite like is the complete/incomplete one. Under that model splitFileName can be used for both complete and incomplete paths.

    splitFileName :: AnyFilePath path => path -> (path, IncompletePath)

For complete paths we get back a complete directory part and an incomplete relative path for the filename. For incomplete paths we get back another incomplete directory part. What you want though in the existing untyped interface is that takeDirectory gives you a complete path (so you can pass it to a system function), which means the input must also have been a complete path. I think this interpretation justifies:

    uncurry (</>) (splitFileName "foo") == "./foo"

In a typed version we could have both interpretations depending on the type and the property would hold in both I think. Does that help or just add more mud? :-)
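One possible reading of the complete/incomplete model, sketched in plain Haskell. Everything here is an assumption made for illustration: the newtypes, the class methods toRaw and fromDir, the names splitFileNameT and joinRaw, and in particular the choice of an empty directory part for an incomplete path with no '/'. It is not the experimental typed filepath library mentioned above.

```haskell
module Main where

-- Hypothetical types: a path that is complete (relative to "/" or ".")
-- versus one that still has to be completed against some search path.
newtype CompletePath   = CompletePath String   deriving (Eq, Show)
newtype IncompletePath = IncompletePath String deriving (Eq, Show)

-- Hypothetical class; the method names are invented for this sketch.
class AnyFilePath path where
  toRaw   :: path -> String
  fromDir :: String -> path  -- wrap a directory part ("" = none present)

instance AnyFilePath CompletePath where
  toRaw (CompletePath s) = s
  fromDir "" = CompletePath "./"  -- a complete directory must be a real one
  fromDir d  = CompletePath d

instance AnyFilePath IncompletePath where
  toRaw (IncompletePath s) = s
  fromDir = IncompletePath        -- assumption: the directory may be empty

-- Same shape as the signature quoted in the text.
splitFileNameT :: AnyFilePath path => path -> (path, IncompletePath)
splitFileNameT p = (fromDir (reverse revDir), IncompletePath (reverse revName))
  where (revName, revDir) = break (== '/') (reverse (toRaw p))

-- Rejoin the two halves as a raw string, with no normalisation.
joinRaw :: AnyFilePath path => path -> IncompletePath -> String
joinRaw dir (IncompletePath name) =
  case toRaw dir of
    "" -> name
    d | last d == '/' -> d ++ name
      | otherwise     -> d ++ "/" ++ name

main :: IO ()
main = do
  putStrLn (uncurry joinRaw (splitFileNameT (CompletePath "foo")))    -- ./foo
  putStrLn (uncurry joinRaw (splitFileNameT (IncompletePath "foo")))  -- foo
```

Under these assumptions the type of the input decides whether "foo" comes back as "./foo" or as "foo", which is the distinction the untyped interface cannot express.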
Incidentally, I dislike the way that trailing slashes are treated in the current filepath implementation. A trailing slash is significant in POSIX:
    $ ls -l foo
    lrwxrwxrwx 1 simonmar GHC 15 2009-09-23 14:04 foo -> /does/not/exist
    $ ls foo
    foo@
    $ ls foo/
    ls: cannot access foo/: No such file or directory
The path with the trailing slash dereferences a symbolic link, and may fail if the link points nowhere.
I think trailing slashes should be dropped by splitFileName.
I don't follow. Isn't it exactly because they are significant that they should be preserved? There are System.FilePath functions for testing for, adding and removing trailing slashes.

Duncan
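For reference, a short example of the trailing-slash functions referred to above, using the POSIX variant of the module so the result does not depend on the host platform. It assumes the filepath package is installed (it ships with GHC).

```haskell
module Main where

import System.FilePath.Posix
  ( addTrailingPathSeparator
  , dropTrailingPathSeparator
  , hasTrailingPathSeparator
  )

main :: IO ()
main = do
  -- Testing for a trailing slash.
  print (hasTrailingPathSeparator "foo/")   -- True
  print (hasTrailingPathSeparator "foo")    -- False
  -- Adding and removing one explicitly, so the caller decides
  -- whether the significant slash is kept.
  print (addTrailingPathSeparator "foo")    -- "foo/"
  print (dropTrailingPathSeparator "foo/")  -- "foo"
```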