Re: Abstract FilePath Proposal

On 28 June 2015 at 16:34, Sven Panne wrote:
2015-06-28 12:03 GMT+02:00 Boespflug, Mathieu:
why does the proposal *not* include normalization? [...]
I think this is intentional, because otherwise we are in the IO monad for basically all operations. What's the normalized representation of "foo/bar/../baz"?
Notice that the kind of normalization I'm talking about, specified in the link I provided, does not include this kind of normalization, because that kind requires the IO monad to perform correctly, and only works on real paths.
Here is the link again:
https://hackage.haskell.org/package/filepath-1.1.0.2/docs/System-FilePath-Po...
Full canonicalization of paths, stripping out redundant ".." and whatnot, should certainly be done in a separate function, in IO.
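For readers following along, the distinction can be sketched with the pure `normalise` from `System.FilePath.Posix` in the filepath package. The binding names below are illustrative only, and the results noted in the comments follow the documented examples for `normalise`; treat this as a sketch, not a specification.

```haskell
import System.FilePath.Posix (normalise)

-- Redundant separators and "." components are removed purely syntactically.
-- Each of these is expected to normalise to "a/b".
collapsed :: [FilePath]
collapsed = map normalise ["a//b", "./a/b", "a/./b"]

-- ".." is left alone: resolving it correctly depends on the real
-- filesystem (for example symlinks), which is why that step belongs in IO.
-- This is expected to stay "foo/bar/../baz".
untouched :: FilePath
untouched = normalise "foo/bar/../baz"
```

Loading this in GHCi and evaluating `collapsed` and `untouched` shows the two behaviours side by side.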

2015-06-28 16:47 GMT+02:00 Boespflug, Mathieu:
Notice that the kind of normalization I'm talking about, specified in the link I provided, does not include this kind of normalization. Because it requires the IO monad to perform correctly, and only on real paths.
Here is the link again:
https://hackage.haskell.org/package/filepath-1.1.0.2/docs/System-FilePath-Po... [...]
OK, then I misunderstood what you meant by "normalizing". But a question remains then: What is a use case for having equality modulo "normalise"? It throws together a few more paths which plain equality on strings would consider different, but it is still not semantic equality in the sense of "these 2 paths refer to the same dir/file". So unless there is a compelling use case (which I don't see), I don't see a point in always doing "normalise". Or do I miss something?
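For context, the existing library already exposes one form of equality modulo `normalise`: `equalFilePath` in `System.FilePath`. The sketch below uses the Posix variant; the binding names are illustrative. The function compares strings only and never consults a filesystem, so it does not give the semantic equality asked about here.

```haskell
import System.FilePath.Posix (equalFilePath)

-- Syntactic variants of the same relative path compare equal.
sameSpelling :: Bool
sameSpelling = equalFilePath "a//b" "a/b" && equalFilePath "foo" "foo/"

-- No filesystem is consulted, so ".." is not resolved and these two
-- compare unequal, even where both would name the same file on disk.
notSemantic :: Bool
notSemantic = equalFilePath "foo/bar/../baz" "foo/baz"
```

Here `sameSpelling` evaluates to `True` and `notSemantic` to `False`, which is the gap between "equal after normalise" and "refers to the same file".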

On Sun, Jun 28, 2015 at 12:21 PM, Sven Panne wrote:
OK, then I misunderstood what you meant by "normalizing". But a question remains then: What is a use case for having equality modulo "normalise"? It throws together a few more paths which plain equality on strings would consider different, but it is still not semantic equality in the sense of "these 2 paths refer to the same dir/file". So unless there is a compelling use case (which I don't see), I don't see a point in always doing "normalise". Or do I miss something?
In my experience, the number one reason for (me to bother with) normalizing paths is because I'm doing metaprogramming and want to generate clean scripts. This includes both the IO and non-IO notions of normalization. It's all about producing hygienic output, and it would be nice to have all the OS-dependent things programmed away in a library somewhere so I don't have to rewrite it.

IMO, normalization has nothing to do with equality. Then again, I consider the notion of equality on file paths[1] to be as dubious as the notion of equality on IEEE-754 floats.

[1] path *components* do have a well-formed notion of equality; it's just that entire paths do not.

--
Live well,
~wren

When talking about "filepath" specifically as something you prod at a
specific file system, you can think about normalisation, provided you
have a specific drive in mind. However, if you're going to do that, on
Linux it's almost better to use an inode number. I have two specific
problems with normalisation:
* You might hope for the property that after normalisation equality of
file locations is equality of FilePath values. That's not true. On
Linux you have symlinks. On Windows you have case sensitivity, which
is a property of the filesystem, not the OS. You might be able to
assume a == b ==> same a b, but you can never assume a /= b ==> not
(same a b), so normalisation doesn't really change the guarantee.
* I believe some Emacs user once told me that a//b is not the same as
a/b in some circumstances. On Windows, there are lots of programs that
require / or \. Some programs require ./foo and some require foo. The
exact path you pass to some programs gets baked into their output,
which is visible in the final release. While paths might be equal for
the purpose of asking a file system to get some bytes back, they are
often different for the other things people want to do with them, like
pass them to other programs that use the paths.
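The one-directional guarantee in the first bullet can be illustrated without touching a filesystem. This is a sketch using `System.FilePath.Windows`, which can be imported on any platform. It assumes `normalise` leaves the case of path components alone, which matches its documented examples, and the example paths are made up.

```haskell
import qualified System.FilePath.Windows as W

-- Two spellings that a case-insensitive filesystem would treat as one file.
upper, lower :: FilePath
upper = W.normalise "Docs/Readme.txt"
lower = W.normalise "docs\\readme.txt"

-- The separators are unified, but the values still differ by case, so
-- a /= b says nothing about whether the two locations differ.
stillDistinct :: Bool
stillDistinct = upper /= lower
```

Whether the two names really denote the same file depends on the filesystem in use, which a pure function cannot know.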
_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

On Sun, Jun 28, 2015 at 5:09 PM, Neil Mitchell wrote:
* I believe some Emacs user once told me that a//b is not the same as a/b in some circumstances.
Only when typing to an interactive path prompt; it lets you reset to / without erasing the existing path prefix. This should never happen in elisp, for example.

--
brandon s allbery kf8nh
sine nomine associates
allbery.b@gmail.com
ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad
http://sinenomine.net

Right - given that filepath is a path manipulation library, not tied
to any particular filesystem, and indeed one in which arbitrary
imaginary paths can be manipulated, any normalization under
consideration can only be that which is independent of global system
state. What does happen in practice is that people compare paths
without taking into account that a user might have written "a//b"
where you have an entry for say "a/b" in your database. Normalization
avoids these errors, but I can see the appeal of not interpreting the
byte sequences at all, and the kind of normalization that can be done
without appeal to global system state ends up being smaller than
expected anyways.
So +1 for the proposal as-is.
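The database mismatch described above can be sketched as follows, assuming a hypothetical in-memory table keyed by path. `Db`, `insertPath` and `lookupPath` are illustrative names, not part of filepath or of the proposal; only `normalise` and the `Data.Map.Strict` functions are real library calls.

```haskell
import qualified Data.Map.Strict as Map
import System.FilePath.Posix (normalise)

-- Hypothetical database keyed by normalised paths.
type Db = Map.Map FilePath String

-- Normalise on the way in ...
insertPath :: FilePath -> String -> Db -> Db
insertPath p = Map.insert (normalise p)

-- ... and on the way out, so spelling variants meet at the same key.
lookupPath :: FilePath -> Db -> Maybe String
lookupPath p = Map.lookup (normalise p)

db :: Db
db = insertPath "a/b" "entry" Map.empty

-- A plain lookup on the raw user input misses the entry;
-- the normalising lookup finds it.
rawHit, normalisedHit :: Maybe String
rawHit = Map.lookup "a//b" db
normalisedHit = lookupPath "a//b" db
```

This only covers spelling differences that can be settled without global system state; symlinks and case folding remain out of reach, as discussed earlier in the thread.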
participants (5)
- Boespflug, Mathieu
- Brandon Allbery
- Neil Mitchell
- Sven Panne
- wren romano