
I tried writing a little command-line utility to find the relative path of one thing from another thing (with Unix-like systems in mind). For example,

  $ ./pathfromof /etc/init.d/ /etc/X11/XF86Config-4
  ../X11/XF86Config-4
  $ ./pathfromof /tmp/baz/ /tmp/foo/
  .
  $ ls -l /tmp/baz
  lrwxr-xr-x 1 markc markc 8 2005-01-20 12:01 /tmp/baz -> /tmp/foo

It turned out surprisingly complex, though, and doesn't feel very neat or tidy at all, nor is it very portable given that I couldn't find generic library functions for manipulating bits of filepaths. Anyhow, it's at http://www.chiark.greenend.org.uk/~markc/PathFromOf.hs and may yet have egregious bugs.

It seems to me like it could certainly be improved in various ways. If anyone has any thoughts, as to how I could improve my style, make more use of standard libraries, etc., I'd certainly appreciate them.

Thanks,
Mark
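For readers following along, here is a minimal sketch of the purely lexical part of the problem. It is not the code in PathFromOf.hs; the names splitComponents and relativeFromTo are made up for illustration, and both arguments are assumed to be absolute POSIX paths with no "." or ".." components. Because it never touches the filesystem, it does not resolve symlinks, so it would answer "../foo" rather than "." for the /tmp/baz example above.

```haskell
import Data.List (intercalate)

-- Split a POSIX-style path into its non-empty components,
-- so "/etc//init.d/" becomes ["etc", "init.d"].
splitComponents :: String -> [String]
splitComponents path = case break (== '/') path of
  (c, rest) ->
    let more = case rest of
          []      -> []
          (_ : r) -> splitComponents r
    in if null c then more else c : more

-- Relative path to `target` as seen from the directory `from`.
-- Assumption: both paths are absolute and contain no "." or "..".
relativeFromTo :: String -> String -> String
relativeFromTo from target =
  let (fs, ts) = dropCommon (splitComponents from) (splitComponents target)
      -- one ".." per leftover component of `from`, then the rest of `target`
      parts    = map (const "..") fs ++ ts
  in if null parts then "." else intercalate "/" parts
  where
    -- drop the longest common prefix of the two component lists
    dropCommon (x : xs) (y : ys) | x == y = dropCommon xs ys
    dropCommon xs ys = (xs, ys)
```

With these definitions, relativeFromTo "/etc/init.d/" "/etc/X11/XF86Config-4" gives "../X11/XF86Config-4", matching the first example. Handling symlinks would need the paths canonicalised first, which is where the complexity comes in.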

You might be interested in the new FilePath module that's in the works. There's been a lot of work to make these functions portable.

http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/libraries/base/System/File...

peace,
isaac

Isaac Jones
You might be interested in the new FilePath module that's in the works. There's been a lot of work to make these functions portable.
http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/libraries/base/System/File...
  splitFileExt "foo" = ("foo", "")
  splitFileExt "foo." = ("foo", "")

I think the second case should be changed to give ("foo.", "") so joinFileExt could undo splitFileExt. On Windows "foo" and "foo." are equivalent, but on Unix they are not.

What about splitFileExt "foo.bar."? ("foo", "bar.") or ("foo.bar.", "")?

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

Aaron Denney
What about splitFileExt "foo.bar."? ("foo", "bar.") or ("foo.bar.", "")?
The latter makes more sense to me, as an extension of the first case you give and splitting "foo.tar.gz" to ("foo.tar", "gz").
It's not that obvious: both choices are compatible with these.

The former is produced by rules:
- split the filename before the last dot *which is not the last character of the filename*, or at the end if there is no such dot
- remove the first character of the extension if it's non-empty (the character must have been a dot)

The latter is produced by rules:
- split the filename before the last dot, or at the end if there is no dot at all
- *if the extension is a sole dot, append a dot to the basename*
- remove the first character of the extension if it's non-empty (the character must have been a dot)

Special filenames of "." and ".." are treated separately, before these rules apply. Both choices are inverted by the same joinFileExt, which inserts a dot between the name and extension unless the extension is empty.

These rules agree on "foo", "foo." and "foo.tar.gz", yet disagree on "foo.bar."; I don't know which is more natural.

The difference influences the behavior of changeFileExt. These cases are the same with both choices:

  changeFileExt "foo.bar" "" = "foo"
  changeFileExt "foo.tar.gz" "" = "foo.tar"
  changeFileExt "foo." "" = "foo."
  changeFileExt "foo." "baz" = "foo..baz"

but these differ - first choice:

  changeFileExt "foo.bar." "" = "foo"
  changeFileExt "foo.bar." "baz" = "foo.baz"

or the second:

  changeFileExt "foo.bar." "" = "foo.bar."
  changeFileExt "foo.bar." "baz" = "foo.bar..baz"

?

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
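The two rule sets above can be sketched directly, which makes it easy to compare them on the disputed inputs. This is only an illustration of the rules as stated in this message, not the code in the CVS FilePath module; splitExtA, splitExtB, joinExt and changeExtWith are hypothetical names chosen here.

```haskell
import Data.List (elemIndices)

-- First rule set: split before the last dot that is not the final character.
splitExtA :: String -> (String, String)
splitExtA name
  | name `elem` [".", ".."] = (name, "")   -- special names handled up front
  | otherwise =
      case filter (< length name - 1) (elemIndices '.' name) of
        [] -> (name, "")
        is -> let (base, ext) = splitAt (last is) name
              in (base, drop 1 ext)         -- drop the leading dot

-- Second rule set: split before the last dot; a sole "." goes back on the base.
splitExtB :: String -> (String, String)
splitExtB name
  | name `elem` [".", ".."] = (name, "")
  | otherwise =
      case elemIndices '.' name of
        [] -> (name, "")
        is -> case splitAt (last is) name of
                (base, ".") -> (base ++ ".", "")
                (base, ext) -> (base, drop 1 ext)

-- Shared inverse: insert a dot unless the extension is empty.
joinExt :: String -> String -> String
joinExt base ""  = base
joinExt base ext = base ++ "." ++ ext

-- changeFileExt expressed over either splitting rule.
changeExtWith :: (String -> (String, String)) -> String -> String -> String
changeExtWith split name ext = joinExt (fst (split name)) ext
```

Under this sketch, splitExtA "foo.bar." is ("foo", "bar.") and splitExtB "foo.bar." is ("foo.bar.", ""), while both agree on "foo", "foo." and "foo.tar.gz", and joinExt undoes either split on these inputs.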

Marcin 'Qrczak' Kowalczyk wrote:
These rules agree on "foo", "foo." and "foo.tar.gz", yet disagree on "foo.bar."; I don't know which is more natural.
Filename extensions come from the DOS 8.3 format. In these kinds of names only one '.' is allowed. Unix does not have filename extensions, as '.' is just a normal filename character (with the exception of '.', '..', and filenames starting with a '.', which are hidden files).

As far as I know, unix utilities like gzip look for specific extensions like '.gz', so it would make more sense on a unix platform to just look for a filename ending in '.gz'... this applies recursively, so:

  fred.tar.gz

is a gzipped tar file, so the first ending is '.gz' and the next is '.tar'...

So as far as unix is concerned, "foo.bar." is just as it is... as would be any other combination, unless the extension matches one specifically used by your application...

So the most sensible approach would be to have a list of known extensions which can be recursively applied to the filenames, and leave any other filenames alone.

  [".gz",".tar",".zip"] ...

In other words, just splitting on a '.' seems the wrong operation. (Imagine gzipping a file called "a...": you get "a....gz", in other words simply an appended ".gz".)

Keean
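A rough sketch of the known-extensions idea, for comparison with the dot-splitting rules earlier in the thread. The names knownExts and stripKnownExts are invented for this example, and refusing to strip a suffix that would leave an empty name is an extra assumption not stated in the message.

```haskell
import Data.List (isSuffixOf)

-- The example list of recognised extensions from the message above.
knownExts :: [String]
knownExts = [".gz", ".tar", ".zip"]

-- Repeatedly peel recognised extensions off the end of a name.
-- Returns the remaining name and the extensions removed, outermost first.
-- Assumption: a suffix is never stripped if it would leave an empty name.
stripKnownExts :: [String] -> String -> (String, [String])
stripKnownExts exts name =
  case [e | e <- exts, e `isSuffixOf` name, length e < length name] of
    (e : _) ->
      let (base, rest) = stripKnownExts exts (take (length name - length e) name)
      in (base, e : rest)
    [] -> (name, [])
```

Here stripKnownExts knownExts "fred.tar.gz" yields ("fred", [".gz", ".tar"]), while "foo.bar." comes back untouched as ("foo.bar.", []), and "a....gz" loses only the appended ".gz".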

Isaac Jones wrote:
You might be interested in the new FilePath module that's in the works. There's been a lot of work to make these functions portable.
http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/libraries/base/System/File...
I didn't realize this was in CVS. IMHO this library is deeply broken, and should not be in GHC 6.4. We should be replacing ill-specified hacks with a carefully designed library, not an official collection of ill-specified hacks.

It took me only a few minutes to find a bunch of cases which the CVS code mishandles, ranging from simple bugs, to cases where the existing behavior might be okay if documented, to cases where I'm not convinced there's any sensible behavior consistent with the function's type.

(Win32)

  splitFileName "\\\\server\\share" ==> ("\\\\server","share")
    (should probably be ("\\\\server\\share",""))

  splitFileName "foo:xyz" ==> ("foo:.","xyz")
    (should be (".","foo:xyz") -- this refers to the named stream xyz of foo)

  joinPaths "c:\\" "\\foo" ==> "\\foo"
    (should be "c:\\foo". I realize that "cd c:\\" on Windows doesn't actually make "c:\\" the current directory, but ";" doesn't separate shell commands either.)

(Posix)

  splitFileName "/foo" ==> ("/","foo"),
  splitFileName "/foo/" ==> ("/foo","")
    (arguably makes sense, but why isn't it documented?)

  splitFileName "/foo/bar" ==> ("/foo","bar")
  splitFileName "/foo//bar" ==> ("/foo/","bar")
    (definitely a bug)

  pathParents "/foo///bar" ==> ["/","/foo","/foo","/foo","/foo/bar"]

  pathParents "foo/../bar" ==> [".","foo/../bar"]
    (what if foo doesn't exist and we wanted to create it?)

Add to those the fundamental problems with splitFileExt which were already mentioned on this thread.

I don't even think the broad approach taken by the library interface is right. Manipulating pathnames with FilePath->FilePath functions is like refactoring a Haskell module with String->String functions. There should be parsing and serialization functions which convert between the external FilePath representation and an internal ADT, and the manipulation should happen on the ADT.

Please, let's not ship this with the hierarchical libraries. It's not ready for prime time.

-- Ben
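To make the parse/manipulate/serialize suggestion concrete, here is one possible shape for the POSIX case only. It is a sketch and not a proposed library design: the types Root and Path and the functions parsePosix, renderPosix and splitLast are all hypothetical. It deliberately simplifies by discarding trailing slashes and repeated separators, and it keeps "." and ".." as ordinary components rather than collapsing them.

```haskell
import Data.List (intercalate)

-- Whether the path starts at the filesystem root.
data Root = Absolute | Relative
  deriving (Eq, Show)

-- Internal representation: a root marker plus non-empty components.
data Path = Path Root [String]
  deriving (Eq, Show)

-- Split a string on a separator, keeping empty chunks.
splitOn :: Char -> String -> [String]
splitOn sep str = case break (== sep) str of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- Parse the external String form; empty components (from "//" or a
-- trailing "/") are dropped, which is a simplifying assumption.
parsePosix :: String -> Path
parsePosix s = Path root (filter (not . null) (splitOn '/' s))
  where root = if take 1 s == "/" then Absolute else Relative

-- Serialize back to a String.
renderPosix :: Path -> String
renderPosix (Path Absolute cs) = "/" ++ intercalate "/" cs
renderPosix (Path Relative []) = "."
renderPosix (Path Relative cs) = intercalate "/" cs

-- Manipulation happens on the ADT: parent path and final component.
splitLast :: Path -> Maybe (Path, String)
splitLast (Path _ []) = Nothing
splitLast (Path r cs) = Just (Path r (init cs), last cs)
```

With this representation, splitting "/foo//bar" gives a parent that renders as "/foo" and a final component "bar", so the doubled separator cannot leak into the result. Windows paths (drives, UNC shares, named streams) would need a richer Root type than the two cases shown.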

At 15:17 20/01/05 -0500, Mark Carroll wrote:
I tried writing a little command-line utility to find the relative path of one thing from another thing (with Unix-like systems in mind). ...
FWIW, there's logic to do something like this in my URI module [1]. Bear in mind that there is not, in general, a unique solution (e.g. in extremis, the absolute path of the target might be a legitimate solution, regardless of the base).

[1] http://www.ninebynine.org/Software/HaskellUtils/Network/URI.hs

There's also a slightly later copy in the Haskell libraries CVS, which I believe is due to ship with the next GHC release. Look for function relativeFrom.

See also module URITest.hs [2], for examples of relative paths created by this algorithm (look for function testRelSplit).

[2] http://www.ninebynine.org/Software/HaskellUtils/Network/URITest.hs

#g
--

------------
Graham Klyne
For email: http://www.ninebynine.org/#Contact
participants (7)
- Aaron Denney
- Ben Rudiak-Gould
- Graham Klyne
- Isaac Jones
- Keean Schupke
- Marcin 'Qrczak' Kowalczyk
- Mark Carroll