
On 23 October 2004 10:01, Krasimir Angelov wrote:
One way to make Cabal really portable is to move the entire platform specific code out of it. One example is Distribution.Simple.GHCPackageConfig.localPackageConfig function, which uses HOME environment variable to determine the user's home directory, but this doesn't work under Windows. The CVS System.Directory module already provides the getAppUserDataDirectory function which is more suitable (and portable) in this case.
localPackageConfig is only for GHC 6.2: in 6.4, GHC and ghc-pkg will know where the local package conf lives, so Cabal won't need this knowledge built-in.
Another example is Distribution.Simple.Utils.findBinary. I recently updated it so that it is now portable, but I see that the runghc tool uses a similar function in its non-portable version. If we move this to the standard libraries then we will be able to use it in both cases. Most of the other #ifdefs are related to file path handling. It would be a great benefit to isolate this in a separate library. I already use the file path routines in VSHaskell. Similar functions are also used in the haddock, happy and alex tools. Maybe, if we agree on this, it is not too late to add it before the next GHC release.
Last time this came up I asked for a concrete proposal, but no-one came forward with one. I'd do it myself, but I'm kind of busy right now. Would someone care to whip up a list of functions & signatures? I'm inclined to just stick something into the libraries even if it's not perfect. The filename handling issue is one of those things where trying to define a perfectly portable library is hard if not impossible, but having *something* is going to be useful to a lot of people, and as Krasimir says it'll have a lot of duplicated code. Data.FilePath or System.Directory? I'm inclined towards the latter, since we clearly want functions that actually inspect the filesystem (findBinary, getExecutableFilePath) along with functions that just manipulate filenames, and it seems strange to split them up. Cheers, Simon

--- Simon Marlow
localPackageConfig is only for GHC 6.2: in 6.4, GHC and ghc-pkg will know where the local package conf lives, so Cabal won't need this knowledge built-in.
Good!
Last time this came up I asked for a concrete proposal, but no-one came forward with one. I'd do it myself, but I'm kind of busy right now. Would someone care to whip up a list of functions & signatures?
Ok. I will look at Python's OS.Path module, .Net's System.IO.Path class and the Utils module from Cabal and will summarize an API proposal together with some implementation.
I'm inclined to just stick something into the libraries even if it's not perfect. The filename handling issue is one of those things where trying to define a perfectly portable library is hard if not impossible, but having *something* is going to be useful to a lot of people, and as Krasimir says it'll have a lot of duplicated code.
Data.FilePath or System.Directory? I'm inclined towards the latter, since we clearly want functions that actually inspect the filesystem (findBinary, getExecutableFilePath) along with functions that just manipulate filenames, and it seems strange to split them up.
I prefer System.FilePath and System.Directory. Data.FilePath was my previous proposal but System is more suitable. In both Python and .Net there are separate namespaces for functions which do real IO operations and for FilePath parsing routines.
Cheers,
Krasimir

--- Simon Marlow
Last time this came up I asked for a concrete proposal, but no-one came forward with one. I'd do it myself, but I'm kind of busy right now. Would someone care to whip up a list of functions & signatures?
Ok. Here is one concrete proposal for System.FilePath. It contains nearly all useful FilePath functions from Cabal, .Net's System.IO.Path and Python's OS.Path. There are only a few exceptions:
- findBinary from Cabal isn't included here. I think that System.Directory is a more appropriate place.
- OS.Path provides the expanduser function, which replaces ~ and ~user with the right home directories. I am not sure how to do this in a platform-independent way. Maybe on Windows this function must be the identity.
- OS.Path provides the expandvar function, which replaces $var and ${var} with the value of the corresponding environment variable. I am not sure whether this function should be there, since it can be used to replace values in any text. Maybe System.Environment is a better place for it. Another issue is whether it must replace $var or %var% under Windows; %var% is more natural for Windows.
Some function names are slightly different from those in Cabal. Proposals for future extensions and better function names are welcome.
Cheers,
Krasimir

-- | Normalize the case of a file path. On Unix, this returns the path unchanged;
-- on case-insensitive filesystems, it converts the path to lowercase.
-- On Windows, it also converts forward slashes to backward slashes.
normalizeCase :: FilePath -> FilePath
I'm afraid I don't like this function. The case-insensitive file systems used on Windows and Mac OS X by default are _case-preserving_ [well, FAT32 is almost case-preserving], so if you use normalizeCase in any other situation than normalizeCase a == normalizeCase b, then it'll probably be wrong. Also, it has been discussed before that Mac OS X, Linux, and probably even Windows support mounting both case-sensitive and case-insensitive file systems. So whether a file name should be case sensitive really depends on where a file is. So maybe we need normalizeCase :: FilePath -> IO FilePath ... where the FilePath must refer to an existing file or directory. That's definitely not a simple path utility function any more. Are there enough situations where the simple but not quite correct pure normalizeCase function would be The Right Thing (or at least Sufficiently Close To The Right Thing)? Cheers, Wolfgang
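To make the comparison-only use concrete, here is a minimal sketch. The names normalizeCaseWindows and equalPathWindows are hypothetical, and the sketch assumes Windows-style conventions unconditionally, ignoring which file system a path actually lives on (which is exactly the weakness described above).

```haskell
import Data.Char (toLower)

-- Hypothetical Windows-flavoured normalizeCase: lowercases every character
-- and turns forward slashes into backslashes. This is an assumption about
-- what the proposed function would do, not the proposal's implementation.
normalizeCaseWindows :: FilePath -> FilePath
normalizeCaseWindows = map (toBackslash . toLower)
  where
    toBackslash '/' = '\\'
    toBackslash c   = c

-- The one use that seems safe: compare two names, then throw the
-- normalized forms away and keep using the originals.
equalPathWindows :: FilePath -> FilePath -> Bool
equalPathWindows a b = normalizeCaseWindows a == normalizeCaseWindows b
```

Passing the normalized string on to the file system would discard the case that a case-preserving file system stores, so the result is only meant for the equality test.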

Wolfgang Thaller writes:
On Windows, it also converts forward slashes to backward slashes.
normalizeCase :: FilePath -> FilePath
Are there enough situations where the simple but not quite correct pure normalizeCase function would be The Right Thing (or at least Sufficiently Close To The Right Thing)?
I think the use of FilePath is not a good idea to begin with. A file path should be a data type something like this, IMHO:
type Segment = String
data FilePath = Path [Segment] | RootedPath [Segment]
You shouldn't write slashes or backslashes which are converted later, you should use a, say :/: operator to build paths portably right from the start.
Peter
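A rough sketch of how such a structured path might look in practice. Several things here are assumptions and not part of the suggestion above: the type is called Path to avoid clashing with the Prelude's FilePath, the builder is written </> because operator names beginning with ':' are reserved for data constructors in Haskell, and render is a hypothetical helper that picks the separator only at the point where a String is needed.

```haskell
import Data.List (intercalate)

type Segment = String

-- Same shape as the suggested type, under a different name.
data Path = Path [Segment] | RootedPath [Segment]
  deriving (Eq, Show)

infixl 5 </>

-- Hypothetical stand-in for the ":/:" operator: append one segment.
(</>) :: Path -> Segment -> Path
Path ss       </> s = Path (ss ++ [s])
RootedPath ss </> s = RootedPath (ss ++ [s])

-- Convert to a String only at the boundary, choosing the separator there.
-- Drive letters and other platform-specific roots are ignored in this sketch.
render :: Char -> Path -> String
render sep (Path ss)       = intercalate [sep] ss
render sep (RootedPath ss) = sep : intercalate [sep] ss
```

For example, render '/' (RootedPath ["usr"] </> "local") gives "/usr/local", while the same value could be rendered with '\\' for a Windows-style string.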

--- Peter Simons
I think the use of FilePath is not a good idea to begin with. A file path should be a data type something like this, IMHO:
type Segment = String
data FilePath = Path [Segment] | RootedPath [Segment]
You shouldn't write slashes or backslashes which are converted later, you should use a, say :/: operator to build paths portably right from the start.
I don't think I will be happy with that. Most of the I/O functions in Haskell use FilePath as a String. If we use a data type instead of String, we need to pretty-print/parse the structure each time we use it for I/O.
Cheers,
Krasimir

normalizeCase is an equivalent of normcase from
OS.Path. I also wondered whether this function is
useful at all but decided to add it for completeness.
I tend to agree to remove it.
Cheers,
Krasimir
--- Wolfgang Thaller
-- | Normalize the case of a file path. On Unix, this returns the path unchanged;
-- on case-insensitive filesystems, it converts the path to lowercase.
-- On Windows, it also converts forward slashes to backward slashes.
normalizeCase :: FilePath -> FilePath
I'm afraid I don't like this function. The case-insensitive file systems used on Windows and Mac OS X by default are _case-preserving_ [well, FAT32 is almost case-preserving], so if you use normalizeCase in any other situation than normalizeCase a == normalizeCase b, then it'll probably be wrong.
Also, it has been discussed before that Mac OS X, Linux, and probably even Windows support mounting both case-sensitive and case-insensitive file systems. So whether a file name should be case sensitive really depends on where a file is. So maybe we need
normalizeCase :: FilePath -> IO FilePath
... where the FilePath must refer to an existing file or directory. That's definitely not a simple path utility function any more.
Are there enough situations where the simple but not quite correct pure normalizeCase function would be The Right Thing (or at least Sufficiently Close To The Right Thing)?
Cheers,
Wolfgang

On Tue, Oct 26, 2004 at 06:33:43AM -0700, Krasimir Angelov wrote:
Ok. Here is one concrete proposal for System.FilePath.
I want to reiterate my earlier comments. http://www.haskell.org/pipermail/libraries/2004-June/002283.html Andrew

--- Andrew Pimlott
On Tue, Oct 26, 2004 at 06:33:43AM -0700, Krasimir Angelov wrote:
Ok. Here is one concrete proposal for System.FilePath.
I want to reiterate my earlier comments.
http://www.haskell.org/pipermail/libraries/2004-June/002283.html
Andrew _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
From your comments:
> You don't seem to address the problem of volumes or
> drives (a:, c:). I don't think you can punt this.
The drives are handled implicitly. They are always
part of the directory name. If you think that it is useful
to make this explicit then you could suggest a
concrete API and possible use cases for it.
> Be very explicit in the documentation about what
> your model of a path is. Most libraries implicitly
> rely upon some form of unix conventions, without
> pinning them down. I find it hard to use File::Spec
> without trial and error.
The Haddock documentation can be improved.
> Finally, it's probably not acceptable for
> a "production library" to hard-code the path
> separator by platform. There will have to be a way
> to manipulate foreign paths.
It is not hard to split System.FilePath to
System.FilePath.Posix and System.FilePath.Windows. I
will do that if other people also agree on that.
Cheers,
Krasimir
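One way the Posix/Windows split could be approached is to write the routines once, parameterised by a description of the platform conventions, so that foreign paths can be handled on any host. The following is only a sketch under that assumption; PathStyle, posixStyle, windowsStyle, splitSegments and joinSegments are hypothetical names, and the sketch deliberately ignores roots and drive letters.

```haskell
import Data.List (intercalate)

-- Hypothetical description of one platform's conventions.
data PathStyle = PathStyle
  { isSeparator :: Char -> Bool  -- which characters separate segments
  , separator   :: Char          -- which character to use when joining
  }

posixStyle, windowsStyle :: PathStyle
posixStyle   = PathStyle (== '/') '/'
windowsStyle = PathStyle (`elem` "\\/") '\\'

-- Split a path into its non-empty segments under the given conventions.
splitSegments :: PathStyle -> FilePath -> [String]
splitSegments style = filter (not . null) . go
  where
    go s = case break (isSeparator style) s of
      (seg, [])       -> [seg]
      (seg, _ : rest) -> seg : go rest

-- Join segments with the style's preferred separator.
joinSegments :: PathStyle -> [String] -> FilePath
joinSegments style = intercalate [separator style]
```

System.FilePath.Posix and System.FilePath.Windows could then be thin wrappers that fix the style, with a native module re-exporting whichever one matches the host.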

On Tue, Oct 26, 2004 at 11:43:40AM -0700, Krasimir Angelov wrote:
--- Andrew Pimlott wrote:
You don't seem to address the problem of volumes or drives (a:, c:). I don't think you can punt this.
The drives are handled implicitly. They are always part of directory name.
Well, for example you couldn't write a file picker with this API, because you can't ask for all the roots. Maybe that's enough, because pathInits can be used to find the root for a given file (the first element of pathInits being the root).
Be very explicit in the documentation about what your model of a path is. Most libraries implicitly rely upon some form of unix conventions, without pinning them down. I find it hard to use File::Spec without trial and error.
The Haddock documentation can be improved.
This isn't mostly about the documentation, it's about having a model in mind when you design the library. What's your model of a path? For example, what should splitFileName "/" (unix) or splitFileName "." be, and why? I can't think of any answers that are consistent with the type (FilePath -> (String, String)). Andrew

--- Andrew Pimlott wrote:
On Tue, Oct 26, 2004 at 11:43:40AM -0700, Krasimir Angelov wrote:
--- Andrew Pimlott wrote:
You don't seem to address the problem of volumes or drives (a:, c:). I don't think you can punt this.
The drives are handled implicitly. They are always part of directory name.
Well, for example you couldn't write a file picker with this API, because you can't ask for all the roots. Maybe that's enough, because pathInits can be used to find the root for a given file (the first element of pathInits being the root).
The pathInits isn't suitable in this case. It is useful if you want to create some file but are not sure whether all directories in the path have been created. pathInits returns a list of all directories that must be created. Some examples:
pathInits "c:" == []
pathInits "c:\\" == []
pathInits "c:\\dir1" == ["c:\\dir1\\"]
pathInits "c:\\dir1\\dir2" == ["c:\\dir1\\", "c:\\dir1\\dir2"]
In the above examples c: and c:\ aren't included in the list because you can't create c:\ as a directory. The getPathRoot function is more appropriate in your case.
getPathRoot "c:\\dir" == "c:\\"
For example, what should splitFileName "/" (unix) or splitFileName "." be, and why? I can't think of any answers that are consistent with the type (FilePath -> (String, String)).
In the current implementation:
splitFileName "/" == ("/", "")
splitFileName "." == (".", ".")
This is consistent in the sense of the rules:
"/" `joinFileName` "" == "/"
"." `joinFileName` "." == "."
i.e. joinFileName is the inverse of splitFileName.
Cheers,
Krasimir
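The rules quoted in this message can be reproduced with a small Unix-only sketch. This is not the proposal's implementation: the treatment of bare names (returning "." as the directory) and every joinFileName clause are assumptions chosen so that the examples from the message hold.

```haskell
-- Unix-only sketch; '/' is the only separator considered.
splitFileName :: FilePath -> (String, String)
splitFileName p =
  case break (== '/') (reverse p) of
    -- No separator at all: the name is relative to ".".
    (revName, [])         -> (".", reverse revName)
    (revName, _ : revDir)
      -- The only separator was the leading one: the directory is the root.
      | null revDir       -> ("/", reverse revName)
      | otherwise         -> (reverse revDir, reverse revName)

joinFileName :: String -> String -> FilePath
joinFileName "." name = name               -- so "." `joinFileName` "." == "."
joinFileName dir ""   = dir                -- so "/" `joinFileName` "" == "/"
joinFileName "/" name = '/' : name
joinFileName dir name = dir ++ "/" ++ name
```

With these definitions joinFileName undoes splitFileName on "/", "." and "/foo/bar", but not on every input: "foo/" and "./foo" come back as "foo", for instance. That gap is one way to see why the edge cases are contentious.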

Krasimir Angelov wrote:
For example, what should splitFileName "/" (unix) or splitFileName "." be, and why? I can't think of any answers that are consistent with the type (FilePath -> (String, String)).
In the current implementation: splitFileName "/" == ("/", "") splitFileName "." == (".", ".")
The latter is OK, but not the former; "" isn't a valid filename.
splitFileName "/" == ("/", ".")
would be more reasonable, insofar as chdir()ing to the first element
then accessing the second will have the expected behaviour.
This leads to a more general question: what should
splitFileName "/foo/bar"
(where bar is a directory) equal?
Both:
splitFileName "/foo/bar" == ("/foo", "bar")
and:
splitFileName "/foo/bar" == ("/foo/bar", ".")
are defensible in the sense that reversing the operation with
joinFileName will refer to the same object as the original path.
The former is probably the "expected" result, although the latter is
consistent with the suggested handling of "/". Essentially, the former
assumes that the pathname refers to an object (file or directory)
which resides within a directory, which isn't really true of the root
directory.
--
Glynn Clements

On Wed, Oct 27, 2004 at 12:47:24AM -0700, Krasimir Angelov wrote:
--- Andrew Pimlott wrote:
Well, for example you couldn't write a file picker with this API, because you can't ask for all the roots. Maybe that's enough, because pathInits can be used to find the root for a given file (the first element of pathInits being the root).
The pathInits isn't suitable in this case. It is useful if you want to create some file but are not sure whether all directories in the path have been created. pathInits returns a list of all directories that must be created. Some examples:
pathInits "c:" == []
pathInits "c:\\" == []
pathInits "c:\\dir1" == ["c:\\dir1\\"]
pathInits "c:\\dir1\\dir2" == ["c:\\dir1\\", "c:\\dir1\\dir2"]
This is an example of why I find this library so unsatisfactory: Effectively, the only way to use this function is to know the specific purpose for which it was written. There's no model I can keep in my head to make sense of it. Even if you document it, I suspect it will confuse people. (In my case, when I saw this function, I immediately assumed it would return all parent directories, so I could eg search upward for a file.) If you really want it, call it parentDirectoriesYouMightNeedToCreateBeforeCreatingThisFile. ;-) I realize you're just collecting existing functions, and that they have all proven useful for some task, but that doesn't necessarily make a useful general-purpose library.
For example, what should splitFileName "/" (unix) or splitFileName "." be, and why? I can't think of any answers that are consistent with the type (FilePath -> (String, String)).
In the current implementation: splitFileName "/" == ("/", "")
What does an empty filename mean?
splitFileName "." == (".", ".")
This again runs contrary to my intuition: I expect to get back the parent, and the name of the file relative to the parent. All these weird cases make the library hard to use without reading the code, going by trial and error, or reading the documentation (assuming it is scrupulously complete) _very_ carefully. (Which, as I complained in the old message I cited, is also the case with every other path API I've used.)
Andrew

--- Andrew Pimlott
Some examples:
pathInits "c:" == []
pathInits "c:\\" == []
pathInits "c:\\dir1" == ["c:\\dir1\\"]
pathInits "c:\\dir1\\dir2" == ["c:\\dir1\\", "c:\\dir1\\dir2"]
This is an example of why I find this library so unsatisfactory: Effectively, the only way to use this function is to know the specific purpose for which it was written. There's no model I can keep in my head to make sense of it. Even if you document it, I suspect it will confuse people. (In my case, when I saw this function, I immediately assumed it would return all parent directories, so I could eg search upward for a file.) If you really want it, call it
parentDirectoriesYouMightNeedToCreateBeforeCreatingThisFile.
;-)
Another way is to assume the following rules for pathParents (pathInits is renamed to pathParents in the last version):
pathParents "/foo/bar" == ["/","/foo","/foo/bar"]
pathParents "foo/bar" == [".","foo","foo/bar"]
Note that for local paths the first element in the list is ".". This allows us to easily get the original behaviour:
oldPathParents = tail . pathParents
This looks more useful. You can easily search upward for a file. :-)
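A Unix-only sketch of pathParents along these lines, assuming that a relative input yields relative parents beginning with "." and that repeated or trailing slashes are ignored. The implementation is illustrative and is not taken from the proposal; searchOrder is a hypothetical helper.

```haskell
import Data.List (inits, intercalate)

-- Every ancestor of the path, outermost first, ending with the path itself.
pathParents :: FilePath -> [FilePath]
pathParents p = case p of
    '/' : rest -> "/" : prefixes "/" (segments rest)
    _          -> "." : prefixes ""  (segments p)
  where
    segments = filter (not . null) . splitOnSlash
    -- All non-empty prefixes of the segment list, re-joined with '/'.
    prefixes root segs =
      [ root ++ intercalate "/" pre | pre <- drop 1 (inits segs) ]
    splitOnSlash s = case break (== '/') s of
      (seg, [])       -> [seg]
      (seg, _ : rest) -> seg : splitOnSlash rest

-- The earlier behaviour, without the root (or ".") element.
oldPathParents :: FilePath -> [FilePath]
oldPathParents = drop 1 . pathParents

-- Order in which to look for a file when searching upward.
searchOrder :: FilePath -> [FilePath]
searchOrder = reverse . pathParents
```

Searching upward is then a matter of walking searchOrder dir and testing each candidate, which has to happen in IO.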
In the current implementation: splitFileName "/" == ("/", "")
What does an empty filename mean?
There already was suggested that in this case the result must be ("/", ".") but I am not sure whether this is right because in this case we really don't have any file name.
splitFileName "." == (".", ".")
This again runs contrary to my intuition: I expect to get back the parent, and the name of the file relative to the parent.
These are exactly the name of the parent and the name of the file. Together they form "./.", which is equivalent to ".". Actually:
"." `joinFileName` "." == "."
All these weird cases make the library hard to use without reading the code, going by trial and error, or reading the documentation (assuming it is scrupulously complete) _very_ carefully. (Which, as I complained in the old message I cited, is also the case with every other path API I've used.)
Andrew
Suggestions for improvements are always welcome.
Cheers,
Krasimir

Andrew Pimlott wrote:
splitFileName "." == (".", ".")
This again runs contrary to my intuition: I expect to get back the parent, and the name of the file relative to the parent.
I don't see what other result is possible in this case. It can't return the current directory as the first member without being in the IO monad. And in any case, that may not be what you want; there are reasons for wanting to manipulate paths without reference to the current state of the process or of the filesystem. You may be dealing with paths which existed at some point in the past (e.g. in a tar/zip/etc archive), or which will exist at some point in the future (i.e. once you've started performing the installation sequence which you're in the process of planning), or which exist (did exist, will exist) on another system, but which don't exist here and now.
All these weird cases make the library hard to use without reading the code, going by trial and error, or reading the documentation (assuming it is scrupulously complete) _very_ carefully. (Which, as I complained in the old message I cited, is also the case with every other path API I've used.)
It will probably be the case with every path API which you will ever
use. There are many difficult problems here.
Certainly, the predictive case ("what will be the effect of passing
this path to a given system call?") is undecidable on a multi-tasking
OS.
E.g. you can't reliably determine in advance where chdir("foo") will
take you. You have to execute the call then discover where you
actually ended up afterwards.
And this isn't just a theoretical problem; the bugtraq archives are
full of examples of exploiting race conditions between testing the
current state of the filesystem and making modifications based upon
those results.
--
Glynn Clements

On Thu, Oct 28, 2004 at 01:43:19AM +0100, Glynn Clements wrote:
Andrew Pimlott wrote:
splitFileName "." == (".", ".")
This again runs contrary to my intuition: I expect to get back the parent, and the name of the file relative to the parent.
I don't see what other result is possible in this case.
Well, that's what I meant to imply by "consistent with the type". If the type were FilePath -> Maybe (String, String) you could return Nothing. Or FilePath -> (Maybe String, Maybe String). To me, that's better than doing something odd, and makes the user aware of the special cases.
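As an illustration of the first alternative type, here is a Unix-only sketch that returns Nothing for the inputs that have no sensible (parent, name) answer. The name splitFileNameMaybe and the exact set of special cases ("", "/", ".", "..", and trailing slashes) are assumptions made for the example.

```haskell
-- Returns the parent directory and the entry name, or Nothing when the
-- input has no name component that can be separated purely syntactically.
splitFileNameMaybe :: FilePath -> Maybe (FilePath, String)
splitFileNameMaybe p
  | p `elem` ["", "/", ".", ".."] = Nothing
  | otherwise =
      case break (== '/') (reverse p) of
        (revName, [])        -> Just (".", reverse revName)
        (revName, _ : revDir)
          | null revName     -> Nothing                     -- trailing slash
          | null revDir      -> Just ("/", reverse revName) -- directly under root
          | otherwise        -> Just (reverse revDir, reverse revName)
```

The caller is forced to pattern-match on the result, which is the point: the special cases become visible in the type instead of being encoded as odd string pairs.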
All these weird cases make the library hard to use without reading the code, going by trial and error, or reading the documentation (assuming it is scrupulously complete) _very_ carefully. (Which, as I complained in the old message I cited, is also the case with every other path API I've used.)
It will probably be the case with every path API which you will ever use. There are many difficult problems here.
Well, it's messy, but I don't believe it's that hard. Of course, you'll never be correct outside the IO monad, but if you enumerate your assumptions about what a path means outside of IO, I think you can come up with a fairly coherent set of operations. Andrew

On Fri, 29 Oct 2004, Andrew Pimlott wrote:
Well, it's messy, but I don't believe it's that hard. Of course, you'll never be correct outside the IO monad, but if you enumerate your assumptions about what a path means outside of IO, I think you can come up with a fairly coherent set of operations.
Okay, lemme try...

-- Given a directory path "dir" and a file/directory path "rel",
-- returns a merged path "full" with the property that
-- (cd dir; do_something_with rel) is equivalent to
-- (do_something_with full). (XXX is a Nothing return ever needed?)
mergePath :: FilePath -> FilePath -> FilePath

-- Given a file/directory path "full", return a path "dir"
-- referring to a directory which contains that file/dir as an
-- entry, and a path "rel" such that "mergePath dir rel" refers
-- to the same file/dir as "full". Returns NeedIO if the answer
-- cannot be determined without IO operations, and NoParent if
-- there cannot be an answer. (XXX hideous return type)
splitPath :: FilePath -> SplitPathResult

data SplitPathResult = NeedIO | NoParent | Split FilePath FilePath

-- Given path referring to a file or directory, returns a
-- canonicalized path, with the intent that two paths referring
-- to the same file/directory will map to the same canonicalized
-- path. Note that it is impossible to guarantee that the
-- implication (same file/dir <=> same canonicalizedPath) holds
-- in either direction: this function can make only a best-effort
-- attempt. If the path does not refer to an existing file or
-- directory, returns Nothing. (XXX not ideal, but what else can
-- you do?)
canonicalizePath :: FilePath -> IO (Maybe FilePath)

-- Does as much as possible to simplify and regularize a path
-- in a meaning-preserving manner, without IO monad operations.
-- This function is idempotent, and is the identity on all
-- FilePaths returned by FilePath library functions.
-- (XXX there's not much this can do -- it's unsafe to convert
-- "a/b/../c" to "a/c", for example, though I think "a/b/./c"
-- to "a/b/c" is safe.)
normalizePath :: FilePath -> FilePath

-- Returns True if this path's meaning is independent of any OS
-- "working directory", False if it isn't. (XXX would probably be
-- better to make FilePath an abstract type and make it always
-- independent of working directories.)
isAbsolutePath :: FilePath -> Bool

-- Returns True if the path always denotes a directory
-- regardless of the current filesystem state. (XXX is this
-- the right thing for eliminating . and .. from directory
-- listings, or is another function needed?)
isSpecialDirectory :: FilePath -> Bool

-- Returns True if the path denotes a root. This implies
-- isSpecialDirectory and isAbsolutePath, and a NoParent return
-- from splitPath.
isRoot :: FilePath -> Bool

-- getRoots returns a list of paths denoting directories which exist
-- and satisfy isRoot. Can't guarantee to return all of them, since
-- they may include e.g. "\\\\foo\\bar" on Win32.
getRoots :: IO [FilePath]

Omitted:
* functions which operate on file names rather than paths
* functions which can be implemented portably in terms of the above primitives
* isRootedPath -- I can't see a use for this
* various other stuff

-- Ben
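To get a feel for how mergePath and splitPath might behave, here is a Unix-only sketch. The signatures and the SplitPathResult type come from the message above; the bodies are assumptions. In particular, the sketch treats any path whose last component is "", "." or ".." as NeedIO, ignores trailing slashes, and returns "." as the directory of a bare name.

```haskell
data SplitPathResult = NeedIO | NoParent | Split FilePath FilePath
  deriving (Eq, Show)

-- An absolute "rel" wins; otherwise glue the two with a single slash.
mergePath :: FilePath -> FilePath -> FilePath
mergePath dir rel
  | take 1 rel == "/" = rel
  | null dir          = rel
  | last dir == '/'   = dir ++ rel
  | otherwise         = dir ++ "/" ++ rel

splitPath :: FilePath -> SplitPathResult
splitPath p
  | not (null p) && all (== '/') p = NoParent   -- the root has no parent
  | name `elem` ["", ".", ".."]    = NeedIO     -- depends on cwd or on links
  | otherwise = case revRest of
      []         -> Split "." name              -- bare name, relative to "."
      _ : revDir
        | all (== '/') revDir -> Split "/" name -- entry directly under the root
        | otherwise -> Split (reverse (dropWhile (== '/') revDir)) name
  where
    trimmed            = dropWhile (== '/') (reverse p)  -- drop trailing slashes
    (revName, revRest) = break (== '/') trimmed
    name               = reverse revName
```

For the Split cases the intended property is that mergePath dir rel names the same object as the original path; the sketch does not attempt the Win32 cases at all.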

On Tue, 26 Oct 2004, Krasimir Angelov wrote:
It is not hard to split System.FilePath to System.FilePath.Posix and System.FilePath.Windows. I will do that if other people also agree on that.
That makes life... irritating for those of us trying to write genuinely portable code, insofar as there's no System.FilePath.DoTheNativeThing. It'd be nice to be able to do things like get a list of the files, just the files[1], in a given directory without having to use a preprocessor or similar for portability. [1] As opposed to directory entries like ".." -- flippa@flippac.org
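For the directory-listing case mentioned here, a minimal sketch using getDirectoryContents from System.Directory, which returns every entry including "." and "..". The helper names dropSpecial and listEntries are hypothetical, and hard-coding "." and ".." is itself a Unix/Windows assumption of the kind this thread is about.

```haskell
import System.Directory (getDirectoryContents)

-- Remove the special "." and ".." entries from a directory listing.
dropSpecial :: [FilePath] -> [FilePath]
dropSpecial = filter (`notElem` [".", ".."])

-- List the real entries of a directory (files and subdirectories alike).
listEntries :: FilePath -> IO [FilePath]
listEntries dir = fmap dropSpecial (getDirectoryContents dir)
```

Separating files from subdirectories would need a further IO check per entry, for example with doesFileExist from the same module.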

There is an updated proposal. The new things are:
- The normalizeCase function is removed.
- The absolutePath function is renamed to absolutizePath; this is more consistent with the normalizePath function. As Simon suggested, it might be moved to System.Directory.
- The pathInits and commonInit functions are renamed to pathParents and commonParent. I think that these are more relevant names.
- Added detailed Haddock comments. The comments also contain examples for some functions.
- Fixed some bugs in the implementation.
Cheers,
Krasimir

Here is the third version.
- FilePath is added to the export list.
- pathParents is modified and now returns the root directory (see the examples in the comments). You can easily get the previous behaviour:
  oldPathParents = tail . pathParents
- The getPathRoot function is removed. It can be easily emulated with pathParents:
  getPathRoot path = case pathParents path of
    (".":path) -> Nothing
    (root:path) -> Just root
  I can't see any reason to keep this function.
Cheers,
Krasimir

Hello,
Apologies if someone already noticed this, but the 'normalize' function does not seem quite correct (at least on Unix-like machines). The problem is that ".." behaves weirdly when links are used. When you get somewhere via a link, and then use "..", you end up above the directory that the link pointed to. This is probably best illustrated with an example:
-- Suppose that I am in my home directory:
pwd /home/diatchki
-- Now I create a link that points to /usr/local
ln -s /usr/local test
-- This makes a link in my current directory that points to /usr/local -- Now I type the following command:
cd test/..
Now if test was not a link, this does nothing and I should still be in my home directory. However since 'test' is a link, I actually end up in /usr. I happen to think that this is not very nice, but I think most utilities probably conform to this rule, so perhaps so should we. -Iavor
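The difference can be seen from the other side with a purely syntactic normalizer. The sketch below is Unix-only and hypothetical (normalizeSyntactic is not the proposal's normalizePath); it collapses "." and ".." by looking at the string alone, which is exactly the step that goes wrong when a component is a symbolic link.

```haskell
import Data.List (intercalate)

-- Collapse "." and ".." segments without consulting the file system.
normalizeSyntactic :: FilePath -> FilePath
normalizeSyntactic p
  | absolute  = '/' : body
  | null body = "."
  | otherwise = body
  where
    absolute = take 1 p == "/"
    body     = intercalate "/" (reverse (foldl step [] (segments p)))

    -- The accumulator holds the segments kept so far, innermost first.
    step acc "."                    = acc
    step (s : acc) ".." | s /= ".." = acc       -- cancel the previous segment
    step acc ".."       | absolute  = acc       -- "/.." stays at the root
    step acc s                      = s : acc   -- keep names and leading ".."

    segments = filter (not . null) . splitOnSlash
    splitOnSlash s = case break (== '/') s of
      (seg, [])       -> [seg]
      (seg, _ : rest) -> seg : splitOnSlash rest
```

Here normalizeSyntactic "/home/diatchki/test/.." is "/home/diatchki", whereas in the scenario above the operating system would resolve the same path to /usr because test is a link to /usr/local.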

I guess this begs the question of whether the path manipulation functions should be purely syntactic, or should also take account of the file system as-is (in which case, they should be in the IO monad, no?). I have my preference for purely syntactic processing, but the issue needs to be determined by actual needs. #g -- At 13:38 28/10/04 -0700, diatchki@cse.ogi.edu wrote:
Hello,
Apologies if someone already noticed this, but the 'normalize' function does not seem quite correct (at least on Unix-like machines).
The problem is that ".." behaves weird when links are used. When you get somewhere via a link, and then use ".." you end up above the directory that the link pointed to. This is probably best illustrated with an example:
-- Suppose that I am in my home directory:
pwd /home/diatchki
-- Now I create a link that points to /usr/local
ln -s /usr/local test
-- This makes a link in my current directory that points to /usr/local -- Now I type the following command:
cd test/..
Now if test was not a link, this does nothing and I should still be in my home directory. However since 'test' is a link, I actually end up in /usr.
I happen to think that this is not very nice, but I think most utilities probably conform to this rule, so perhaps so should we.
-Iavor
------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

Good point. In order to do proper normalization we need IO monad. The absolutizePath function is already in IO. I propose to remove normalizePath function and to keep only absolutizePath function (in System.Directory). On Windows it can be implemented on top of GetFullPathName and on Unix there is realpath function. Cheers, Krasimir --- diatchki@cse.ogi.edu wrote:
Hello,
Apologies if someone already noticed this, but the 'normalize' function does not seem quite correct (at least on Unix-like machines).
The problem is that ".." behaves weird when links are used. When you get somewhere via a link, and then use ".." you end up above the directory that the link pointed to. This is probably best illustrated with an example:
-- Suppose that I am in my home directory:
pwd /home/diatchki
-- Now I create a link that points to /usr/local
ln -s /usr/local test
-- This makes a link in my current directory that points to /usr/local -- Now I type the following command:
cd test/..
Now if test was not a link, this does nothing and I should still be in my home directory. However since 'test' is a link, I actually end up in /usr.
I happen to think that this is not very nice, but I think most utilities probably conform to this rule, so perhaps so should we.
-Iavor

On 2004-10-29, Krasimir Angelov
Good point. In order to do proper normalization we need IO monad. The absolutizePath function is already in IO. I propose to remove normalizePath function and to keep only absolutizePath function (in System.Directory). On Windows it can be implemented on top of GetFullPathName and on Unix there is realpath function.
They're both useful functions, and I think both should be kept, and the differences documented. (See also zsh's "chase_dots" and "chase_links" options.) -- Aaron Denney -><-

A remark on naming: I don't like names like "absolutizePath". The "Path" suffix in fact expresses type (resp. module) information, and I would think that it is typical for languages *without* a proper type resp. module system to force programmers to simulate them by mere naming conventions. So I would suggest using qualified names (Path.absolutize). Of course that seems to be a general problem, for instance, with insertFM, lookupFM. But then, the FiniteMap module is much older than hierarchical namespaces. But for Paths, the point is designing a new library, so we should use the best techniques available. In fact it's only one technique, qualified names - the other candidate would be a type system that allows static overloading...
Best regards,
J. W.

On Fri, 29 Oct 2004, Johannes Waldmann wrote:
A remark on naming: I don't like names like "absolutizePath".
The "Path" suffix in fact expresses type (resp. module) information, and I would think that it is typical for languages *without* proper type resp. module system to force programmers to simulate them by mere naming conventions.
yep
So I would suggest to use qualified names (Path.absolutize).
*me too*
Of course that seems to be a general problem, for instance, with insertFM, lookupFM. But then, the FiniteMap module is much older than hierarchical namespaces. But for Paths, the point is designing a new library, so we should use the best techniques available.
I really advise to take a look at Modula-3. There, not only every function name omits the information of the type, even more each module has the name of the type it describes and the main type has the name T. In Modula-3 this convention is essential for template modules. This doesn't introduce new problems because module names have to be unique. So the user of a library can choose if he wants to use a short unqualified function name, some abbreviated module name for qualification or the full module name. I started to use this style for my Haskell modules, too, e.g. http://cvs.haskell.org/darcs/numericprelude/physunit/PhysicalValue.hs I'm still uncertain what identifier to use for constructors and type classes.
In fact it's only one technique, qualified names - the other candidate would be a type system that allows static overloading...
If you don't propose the way C++ does it ... :-) But how can overloading be cleaner than type classes?

Henning Thielemann wrote:
I really advise to take a look at Modula-3. There, not only every function name omits the information of the type, even more each module has the name of the type it describes and the main type has the name T.
Indeed this is a useful convention. If we mainly use unqualified imports, then we probably want

    module Foo where data Foo = ...

but indeed larger projects sort of require qualified imports. But this leads to strange code like

    x :: Foo.Foo

where it really looks better to have

    module Foo where data Type = ..

as this allows

    import qualified Foo ; x :: Foo.Type

You know where this leads to? Like, Java. There, a module (class) automatically contains a data declaration, and from the outside it is just accessed by module name (i.e., omitting the ".Type" suffix from the above example). Not a bad thing, IMHO.

It's generally a good idea to structure the code (functions) according to the structure of data, and to keep data declarations separate from each other (i.e., each in their own module).

Best regards,
-- -- Johannes Waldmann, Tel/Fax: (0341) 3076 6479 / 6480 -- ------ http://www.imn.htwk-leipzig.de/~waldmann/ ---------

diatchki@cse.ogi.edu writes:
pwd
/home/diatchki
ln -s /usr/local test
cd test/..
Now if test was not a link, this does nothing and I should still be in my home directory. However since 'test' is a link, I actually end up in /usr.
On Linux, there's no such difference, the example leaves me in my home directory. I believe Solaris has the behaviour you describe. -kzm -- If I haven't seen further, it is by standing in the footprints of giants

Ketil Malde wrote:
diatchki@cse.ogi.edu writes:
pwd
/home/diatchki
ln -s /usr/local test
cd test/..
Now if test was not a link, this does nothing and I should still be in my home directory. However since 'test' is a link, I actually end up in /usr.
On Linux, there's no such difference, the example leaves me in my home directory. I believe Solaris has the behaviour you describe.
That's a feature of most current shells, not a feature of the OS: If you strace your shell in question (e.g. bash), you'll see that it calls chdir() only with absolute paths in that scenario, probably using the PWD environment variable as a helper. If you use e.g. ftp, you'll see the behaviour Iavor describes, the "real" OS behaviour. Cheers, S.

Doing a quick scan... Generally it looks pretty good, but without actually trying to use it I can't be sure. I have some small questions:

1. How are multiple extensions handled? (e.g. foo.tar.gz) I think this is catered for, with the final extension being picked off.
2. How does path joining work in this case: joinPaths C:/root/path/file.ext altPath/file2.ext2 i.e. is "file.ext" discarded?
3. What is the purpose of commonInit?

Also, a very small point: I forgot that FilePath is defined in the prelude... maybe a comment to this effect on the module export list?
...
[later]
I tend to agree with the comments about case normalization. Maybe better to have a path comparison function that takes system-dependent equivalences into account?
I'm rather taken by the approach that Peter Simons suggests [1] (abstracting the path composition), but I suspect there are more details to be worked out there. It also makes my earlier suggestion [2] of using a URI-based representation as a common abstraction seem slightly less radical. I also note Simon's comments on this (don't do it _yet_) are persuasive. I'll reserve my campaign to unify filenames and URIs for the "ultimate" FilePath abstraction ;-)
[1] http://www.haskell.org//pipermail/libraries/2004-October/002593.html
[2] http://www.haskell.org//pipermail/libraries/2004-August/002416.html
#g
--
At 06:33 26/10/04 -0700, Krasimir Angelov wrote:
--- Simon Marlow wrote:
Last time this came up I asked for a concrete proposal, but no-one came forward with one. I'd do it myself, but I'm kind of busy right now. Would someone care to whip up a list of functions & signatures?
Ok. Here is one concrete proposal for System.FilePath. It contains nearly all useful FilePath functions from Cabal, .Net's System.IO.Path and Python's os.path. There are only a few exceptions:
- findBinary from Cabal isn't included here. I think that System.Directory is a more appropriate place.
- os.path provides the expanduser function, which replaces ~ and ~user with the right home directories. I am not sure how to do this in a platform independent way. Maybe on Windows this function must be the identity.
- os.path provides the expandvars function, which replaces $var and ${var} with the value of the corresponding environment variable. I am not sure whether this function should be there since it can be used to replace values in any text. Maybe System.Environment is a better place for it. Another issue is whether it must replace $var or %var% under Windows. %var% is more natural for Windows.
Some function names are slightly different from those in Cabal. Proposals for future extensions and better function names are welcome.
Cheers, Krasimir
------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

--- Graham Klyne wrote:
1. How are multiple extensions handled? (e.g. foo.tar.gz) I think this is catered for, with the final extension being picked off.
splitFileExt always returns the last extension.
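A minimal sketch of that behaviour, assuming the extension is returned without its dot and that a leading dot (as in ".bashrc") does not start an extension. The name splitFileExtSketch is hypothetical and this is not the code from the proposal.

```haskell
-- Treat both '/' and '\\' as directory separators (an assumption).
isSep :: Char -> Bool
isSep c = c == '/' || c == '\\'

-- Split off only the last extension of the final path component.
-- Dots inside directory names are ignored.
splitFileExtSketch :: FilePath -> (String, String)
splitFileExtSketch path =
  case break (== '.') (reverse name) of
    (_, [])                -> (path, "")   -- no dot in the file name
    (revExt, _ : revBase)
      | null revBase       -> (path, "")   -- leading dot only, e.g. ".bashrc"
      | otherwise          -> (dir ++ reverse revBase, reverse revExt)
  where
    (revName, revDir) = break isSep (reverse path)
    name = reverse revName
    dir  = reverse revDir
```

Applied to "foo.tar.gz", this picks off "gz" and leaves "foo.tar" as the base name.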
2. How does path joining work in this case:
joinPaths C:/root/path/file.ext altPath/file2.ext2
i.e. is "file.ext" discarded?
No. You will get: C:/root/path/file.ext/altPath/file2.ext2
3. What is the purpose of commonInit?
commonInit is renamed to commonParent in the latest version. It returns the largest path that is common to all paths in the list.
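One way to read "largest common path" is as the longest common prefix of path components rather than of characters. The sketch below assumes '/' as the only separator and a Maybe result for "nothing in common". The names splitOnSlash, commonPrefix and commonParentSketch are hypothetical and the signature may differ from the proposal.

```haskell
import Data.List (intercalate)

-- Split a path on '/'; an absolute path yields a leading "" component.
splitOnSlash :: FilePath -> [String]
splitOnSlash s = case break (== '/') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : splitOnSlash rest

-- Longest common prefix of two lists.
commonPrefix :: Eq a => [a] -> [a] -> [a]
commonPrefix (x : xs) (y : ys) | x == y = x : commonPrefix xs ys
commonPrefix _ _ = []

-- Longest run of leading components shared by every path in the list.
commonParentSketch :: [FilePath] -> Maybe FilePath
commonParentSketch [] = Nothing
commonParentSketch ps =
  case foldr1 commonPrefix (map splitOnSlash ps) of
    []   -> Nothing
    [""] -> Just "/"                      -- only the root is shared
    cs   -> Just (intercalate "/" cs)
```

Working on components matters: for "/usr/lib/a" and "/usr/local/b" a character-wise prefix would give "/usr/l", whereas the component-wise result is "/usr".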
Also, a very small point: I forgot that FilePath is defined in the prelude... maybe a comment to this effect on the module export list?
I will add FilePath to the export list (Simon's suggestion).
...
[later]
I tend to agree with the comments about case normalization. maybe better to have a path comparison function that takes system-dependent equivalences into account?
The normalizeCase function was removed from the proposal.
I'm rather taken by the approach that Peter Simons suggests [1] (abstracting the path composition), but I suspect there are more details to be worked out there. It also makes my earlier suggestion [2] of using a URI-based representation as a common abstraction seem slightly less radical. I also note Simon's comments on this (don't do it _yet_) are persuasive. I'll reserve my campaign to unify filenames and URIs for the "ultimate" FilePath abstraction ;-)
It would be great to have filePathToURI and uriToFilePath functions in Network.URI. There we can also have URI manipulation functions, but this doesn't mean that we don't need to have System.FilePath in addition.
Cheers, Krasimir

Krasimir,
Thanks for your work on this, and your responses...
At 13:05 27/10/04 -0700, Krasimir Angelov wrote:
--- Graham Klyne wrote:
1. How are multiple extensions handled? (e.g. foo.tar.gz) I think this is catered for, with the final extension being picked off.
splitFileExt always returns the last extension.
Sounds good to me.
2. How does path joining work in this case:
joinPaths C:/root/path/file.ext altPath/file2.ext2
i.e. is "file.ext" discarded?
No. You will get:
C:/root/path/file.ext/altPath/file2.ext2
Hmmm. I can't claim it's wrong, but I'm uneasy about that. It is, for example, different from the way that URI path joining works.
What about:
joinPaths C:/root/path/ altPath/file2.ext2
does this give:
C:/root/path/altPath/file2.ext2
or
C:/root/path//altPath/file2.ext2
?
3. What is the purpose of commonInit?
commonInit is renamed to commonParent in the latest version. It returns the largest path that is common to all paths in the list.
Ah, I see.
Also, a very small point: I forgot that FilePath is defined in the prelude... maybe a comment to this effect on the module export list?
I will add FilePath to the export list (Simon's suggestion).
With a comment that it is defined in the Prelude?
I'm rather taken by the approach that Peter Simons suggests [1] (abstracting the path composition), but I suspect there are more details to be worked out there. It also makes my earlier suggestion [2] of using a URI-based representation as a common abstraction seem slightly less radical. I also note Simon's comments on this (don't do it _yet_) are persuasive. I'll reserve my campaign to unify filenames and URIs for the "ultimate" FilePath abstraction ;-)
It would be great to have filePathToURI and uriToFilePath functions in Network.URI. There we can also have URI manipulation functions, but this doesn't mean that we don't need to have System.FilePath in addition.
I've been toying with ideas for creating scheme-specific URI functions in sub-modules, e.g. Network.URI.http Network.URI.file etc. When the IETF decides what to do about the revised file: URI specification, I might give this a try, lifting some code I wrote for XML processing. #g ------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

--- Graham Klyne wrote:
2. How does path joining work in this case:
joinPaths C:/root/path/file.ext altPath/file2.ext2
i.e. is "file.ext" discarded?
No. You will get:
C:/root/path/file.ext/altPath/file2.ext2
Hmmm. I can't claim it's wrong, but I'm uneasy about that. It is, for example, different from the way that URI path joining works.
What do you expect to get? file.ext is a valid directory name. By the way joinPaths "c:/dir1/file.ext" "c:/dir2" == "c:/dir2" because the second path is absolute.
What about:
joinPaths C:/root/path/ altPath/file2.ext2
does this give: C:/root/path/altPath/file2.ext2 or C:/root/path//altPath/file2.ext2
?
The first one. "//" is equal to "/", isn't it?
Cheers, Krasimir
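The joining rules described in this message can be sketched as follows. This is an illustration under stated assumptions, not the proposal's code. The names isSep, isAbsoluteSketch and joinPathsSketch are hypothetical, and "absolute" is approximated as a leading separator or a drive letter followed by a separator.

```haskell
import Data.Char (isAlpha)

isSep :: Char -> Bool
isSep c = c == '/' || c == '\\'

-- Rough test for an absolute path: "/x", "\\x" or "c:/x".
isAbsoluteSketch :: FilePath -> Bool
isAbsoluteSketch ('/' : _)         = True
isAbsoluteSketch ('\\' : _)        = True
isAbsoluteSketch (d : ':' : s : _) = isAlpha d && isSep s
isAbsoluteSketch _                 = False

-- An absolute second path replaces the first. Otherwise the two are
-- joined with exactly one separator, and nothing is dropped from the
-- first path even if its last component looks like a file name.
joinPathsSketch :: FilePath -> FilePath -> FilePath
joinPathsSketch base rel
  | isAbsoluteSketch rel = rel
  | null base            = rel
  | isSep (last base)    = base ++ rel
  | otherwise            = base ++ "/" ++ rel
```

Under these rules, joining "C:/root/path/" with "altPath/file2.ext2" produces a single "/" at the seam, and "file.ext" is kept as a directory component when it appears at the end of the first path.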

At 06:52 28/10/04 -0700, Krasimir Angelov wrote:
--- Graham Klyne wrote:
2. How does path joining work in this case:
joinPaths C:/root/path/file.ext altPath/file2.ext2
i.e. is "file.ext" discarded?
No. You will get:
C:/root/path/file.ext/altPath/file2.ext2
Hmmm. I can't claim it's wrong, but I'm uneasy about that. It is, for example, different from the way that URI path joining works.
What do you expect to get? file.ext is a valid directory name.
Indeed it is. This is, I believe, an ambiguity with the way Unix-like-systems present directory and file names, which cannot be disambiguated without actually consulting the live file system. Windows has a similar issue. The URI path-joining algorithm would drop the final "filename" element 'file.ext' (not being terminated by '/'), and give: C:/root/path/altPath/file2.ext2
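For comparison, the URI behaviour mentioned here corresponds to the "merge" step of relative reference resolution (RFC 3986, section 5.2.3). The sketch below covers only that step and skips dot-segment removal. The name uriMergeSketch is hypothetical.

```haskell
-- Keep everything in the base path up to and including its right-most
-- '/', then append the relative path. A base without any '/' is
-- dropped entirely.
uriMergeSketch :: String -> String -> String
uriMergeSketch base rel = reverse (dropWhile (/= '/') (reverse base)) ++ rel
```

Because the rule is purely syntactic, a trailing "/" on the base is what distinguishes a directory from a file; no filesystem lookup is involved.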
By the way
joinPaths "c:/dir1/file.ext" "c:/dir2" == "c:/dir2"
because the second path is absolute.
Good. That's as I'd expect.
What about:
joinPaths C:/root/path/ altPath/file2.ext2
does this give: C:/root/path/altPath/file2.ext2 or C:/root/path//altPath/file2.ext2
?
The first one. "//" is equal to "/", isn't it?
Is it? Is it definitely prohibited to have a zero length path segment? What is the correct interpretation of this:
C:/root/path//../altPath/file2.ext2
?
(I'm not saying you're wrong, just that the answer isn't entirely clear-cut.)
...
BTW, whatever the final answer, I think it would be a great idea to have a unit test module (preferably using HUnit) with all these as test cases. If only to definitively settle questions from awkward people like me.
#g
------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

Graham Klyne writes:
The first one. "//" is equal to "/", isn't it?
Is it? Is it definitely prohibited to have a zero length path segment?
FWIW, on Linux, the system calls (stat, open, etc), and thus the fileutils, seem to treat it as above, a strace of ls gives me:
stat64("/tmp///////foo//////", {st_mode=S_IFDIR|0755, st_size=60,...}) = 0
open("/tmp///////foo//////", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
OTOH, emacs treats foo//bar as /bar
-kzm
-- If I haven't seen further, it is by standing in the footprints of giants
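If a library wanted to mirror the system-call behaviour shown in the strace output, it could collapse runs of slashes before comparing or joining paths. This is only a sketch with a hypothetical name, collapseSlashes. It also collapses a leading "//", which POSIX allows implementations to treat specially, so that case is an assumption.

```haskell
import Data.List (group)

-- Replace every run of consecutive '/' characters with a single '/'.
collapseSlashes :: FilePath -> FilePath
collapseSlashes = concatMap squeeze . group
  where
    squeeze ('/' : _) = "/"
    squeeze run       = run
```

The emacs convention, where foo//bar means /bar, would need a different rule, so a portable library has to pick one interpretation and document it.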
participants (16)
- Aaron Denney
- Andrew Pimlott
- Ben Rudiak-Gould
- diatchki@cse.ogi.edu
- Glynn Clements
- Graham Klyne
- Graham Klyne
- Henning Thielemann
- Johannes Waldmann
- Ketil Malde
- Krasimir Angelov
- Peter Simons
- Philippa Cowderoy
- Simon Marlow
- Sven Panne
- Wolfgang Thaller