
Hello *,
What?
=====
We (see From: & CC: headers) propose, plain and simple, to turn the currently defined type-synonym
type FilePath = String
into an abstract/opaque data type instead.
Why/How/When?
=============
For details (including motivation and a suggested transition scheme) please consult
https://ghc.haskell.org/trac/ghc/wiki/Proposal/AbstractFilePath
Suggested discussion period: 4 weeks

On Fri, 26 Jun 2015, Herbert Valerio Riedel wrote:
What? =====
We (see From: & CC: headers) propose, plain and simple, to turn the currently defined type-synonym
type FilePath = String
into an abstract/opaque data type instead.
Has someone else tried the pathtype package? http://hackage.haskell.org/package/pathtype I found its idea great, but in practice I could not use it because it always tries to canonicalize paths on initialization.

On Fri, 26 Jun 2015, Henning Thielemann wrote:
Has someone else tried the pathtype package? http://hackage.haskell.org/package/pathtype
Hm, your last point, "Decisions assumed by this Proposal", seems to mean that you want to leave more specialised types out of this proposal. That is, a dir/file distinction might be defined on top of the new FilePath type.

On 2015-06-26 at 18:22:16 +0200, Henning Thielemann wrote: [...]
type FilePath = String
into an abstract/opaque data type instead.
Has someone else tried the pathtype package? http://hackage.haskell.org/package/pathtype
Hm, your last point "Decisions assumed by this Proposal" seems to mean, that you want to leave out more specialised types from this proposal. That is, dir/file distinction might be defined on top of the new FilePath type.
Yes, because the proposal is meant to make only the smallest incremental change needed (i.e. change the FilePath datatype and provide conversion functions) to achieve the primary goals (reduce space/time overhead and make it a distinct type from String) in a way suitable for a future Haskell Report, while staying close enough that you can still write code that works with both the H2010 and an AFPP definition of `FilePath`.
Trying to redesign the FilePath type to also include a dir/file distinction seemed too daunting, as there is quite some additional design space to explore (do drive letters deserve a separate type? do we use DataKinds? which invariants can/shall be represented at the type level? which errors are caught at the type level, and which at runtime? etc.), parts of which may require type-system extensions, while just having a KISS-style opaque FilePath evades this.
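To make "change the FilePath datatype, provide conversion functions" concrete, here is a minimal sketch. It assumes a byte-based representation and borrows the conversion names toFilePath/fromFilePath used elsewhere in this thread; the representation, the Latin-1-only conversion and everything else here are illustrative assumptions, not the proposal's actual definitions.

```haskell
import Prelude hiding (FilePath)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Assumption: an opaque wrapper around raw bytes. In a real library the
-- constructor would not be exported, so users could only go through the
-- conversion functions below.
newtype FilePath = FilePath B.ByteString
  deriving (Eq, Ord)

-- Assumption: a deliberately naive conversion. Char8.pack truncates every
-- Char to 8 bits, so this is only faithful for characters below U+0100.
-- A real implementation would use a platform-dependent encoding instead.
toFilePath :: String -> FilePath
toFilePath = FilePath . BC.pack

-- Inverse of toFilePath for the ASCII/Latin-1 range.
fromFilePath :: FilePath -> String
fromFilePath (FilePath b) = BC.unpack b

instance Show FilePath where
  show = show . fromFilePath
```

With definitions along these lines, existing String-based code keeps working by inserting toFilePath/fromFilePath at the boundaries, while the type checker now rejects passing an arbitrary String where a path is expected.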

One other point, while still maintaining +0 on the proposal itself:
This proposal is likely to break lots and lots of code a la the FTP, and we all remember the unpleasant surprise that the community had with that one.
If libraries@ agrees this is a worthwhile change, I think we need to let the community vote on it and its deprecation cycle.
Tom
_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

-1 on this proposal in all its variations right now. There seems to be no
strong consensus on the positive nature of any of the proposed changes at
this point, and folks seem to disagree on the scope and content of the
changes. We should leave well enough alone until someone has a proposal
most of us can agree is awesome.

+1 on the original proposal, sans variations. The advantage is that it provides a very clear and smooth upgrade path that, with correct signaling, can be nearly seamless. As argued, it isn’t about enforcing the “right” semantics, which can be overlaid with further proposals or even in userland libraries. Rather, the current datatype is just plain wrong for capturing the various weird things that filepaths can really be in practice, so to even have the possibility of putting in place semantics that are “more correct” we need to have the right sort of opaque type to begin with. Since this proposal is designed to be the smoothest way to get us over that hump, I support it.
gershom
On June 28, 2015 at 7:38:06 PM, Bart Massey (bart.massey@gmail.com) wrote:
-1 on this proposal in all its variations right now. There seems to be no strong consensus on the positive nature of any of the proposed changes at this point, and folks seem to disagree on the scope and content of the changes. We should leave well enough alone until someone has a proposal most of us can agree is awesome.

Hi,
On Friday, 26.06.2015, at 18:08 +0200, Herbert Valerio Riedel wrote:
Why/How/When? =============
For details (including motivation and a suggested transition scheme) please consult
https://ghc.haskell.org/trac/ghc/wiki/Proposal/AbstractFilePath
unless proven wrong, I assume that this breaks lots of code. So by default, I am doubtful towards that proposal.
Also, the argument about inefficiency does not convince me. In a Prelude where "type String = [Char]", this is – as bad as it may be – consistent and expected. Just like you manually have to reach out for dedicated string types if you want efficient Haskell, you can manually reach out for dedicated FilePath libraries. I doubt that it is a good idea to fix the String problem in small, isolated chunks.
However, the argument that, depending on the system, "type FilePath = String" is simply wrong (because the semantics of file paths are not unicode points), is one that I buy in. On such systems, the Prelude is, in a moral sense, wrongly typed, and the situation is comparable to a hypothetical "type String = [Word8]" – which is just wrong in a world of unicode. So for me, the faithful representation of system filepaths makes this proposal interesting. I like it when Haskell “gets it right”.
So if the transition can be made somewhat smooth, I’m leaning towards supporting it.
Greetings,
Joachim
--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org

I am strong -1 on this proposal, due to the arguments of Joachim, unless we are getting the semantics right. Are we?
By semantics, I mean just the lowest-level definition of what a path is made of.
On POSIX, it's a raw string of bytes, with interpretation in any human-readable way delegated to the application.
On Mac OS X, it's normalized Unicode. The important point is *normalized* - if you create a FilePath from two different Unicode strings that have the same normalized form, the result FilePaths must be equal on Mac OS X.
On Windows, it's UTF-16. Basically, modulo the quirks.
Correct semantics means:
1. There are different FilePath types for each platform, because the semantics of a path are vastly different for each.
2. By default, the FilePath for the current platform is used. But the FilePaths for other platforms are also available, and there is a simple API for best-effort conversion between them.
Will this proposal provide that? If not, don't do this to us. Please.
Joachim Breitner wrote:
unless proven wrong, I assume that this breaks lots of code. So by default, I am doubtful towards that proposal.
Also, the argument about inefficiency does not convince me. In a Prelude where "type String = [Char]", this is – as bad as it may be – consistent and expected. Just like you manually have to reach out for dedicated string types if you want efficient Haskell, you can manually reach out for dedicated FilePath libraries. I doubt that it is a good idea to fix the String problem in small, isolated chunks.
However, the argument that, depending on the system, "type FilePath = String" is simply wrong (because the semantics of file paths are not unicode points), is one that I buy in. On such systems, the Prelude is, in a moral sense, wrongly typed, and the situation is comparable to a hypothetical "type String = [Word8]" – which is just wrong in a world of unicode. So for me, the faithful representation of system filepaths makes this proposal interesting. I like it when Haskell “gets it right”.
The problem is that I'm not sure this proposal "gets it right". And if not, I agree with your arguments that this proposal would not be worth the pain.
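The two requirements listed above (one path type per platform, a default for the current platform, plus best-effort conversion) might be sketched roughly as follows. All names here (PosixPath, WindowsPath, NativePath, posixToWindows) are hypothetical, and the conversion is intentionally conservative: it only handles pure-ASCII paths and fails otherwise.

```haskell
{-# LANGUAGE CPP #-}
import Data.Word (Word8, Word16)

-- Assumption: POSIX paths are raw bytes with no interpretation attached.
newtype PosixPath = PosixPath [Word8]
  deriving (Eq, Show)

-- Assumption: Windows paths are sequences of 16-bit units.
newtype WindowsPath = WindowsPath [Word16]
  deriving (Eq, Show)

-- The path type of the platform the program is compiled for.
-- GHC defines mingw32_HOST_OS when compiling on Windows.
#if defined(mingw32_HOST_OS)
type NativePath = WindowsPath
#else
type NativePath = PosixPath
#endif

-- Best-effort, partial conversion: succeeds only when every byte is ASCII,
-- and maps the separator 0x2F (slash) to 0x5C (backslash).
posixToWindows :: PosixPath -> Maybe WindowsPath
posixToWindows (PosixPath bs)
  | all (< 0x80) bs = Just (WindowsPath (map conv bs))
  | otherwise       = Nothing
  where
    conv b
      | b == 0x2F = 0x5C
      | otherwise = fromIntegral b
```

Returning Maybe makes the partiality of the conversion explicit, so callers have to decide what to do with paths that have no counterpart on the other platform.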

On Sat, Jun 27, 2015 at 4:50 PM, Yitzchak Gale wrote:
On Mac OS X, it's normalized Unicode. The important point is *normalized* - if you create a FilePath from two different Unicode strings that have the same normalized form, the result FilePaths must be equal on Mac OS X.
This is only true for higher level OS X APIs. ghc normally operates in the BSD layer, which mostly follows POSIX semantics; in particular, filesystem paths are bytestrings in the BSD layer, and only normalized in Cocoa APIs. (Which, among other things, means you can make a GUI application dump core by trying to use a file dialog in a directory containing a filename created using the BSD API which does not use a UTF8 encoding.)
--
brandon s allbery kf8nh sine nomine associates
allbery.b@gmail.com ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

Hi,
I'm +1 on the general idea of this proposal. Using String for filenames has
caused me all sorts of trouble, particularly when I've had to deal with a
bunch of files whose names don't all use the same encoding.
However, be careful about the exact semantics of filenames on Windows.
Quoting MSDN:
There is no need to perform any Unicode normalization on path and file name
strings for use by the Windows file I/O API functions because *the file
system treats path and file names as an opaque sequence of WCHARs*. Any
normalization that your application requires should be performed with this
in mind, external of any calls to related Windows file I/O API functions.
(from
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).a...,
emphasis mine)
Thus FilePath = String (or Text) doesn't really seem correct on Windows
either (although it'll be pretty close as long as you stay within the BMP).
By my reckoning, when you get down to brass tacks, all filesystems on all
platforms name files with sequences of bytes. There are various interesting
ways to represent these bytes to human beings as sequences of characters,
but aiming for FilePath = ByteString everywhere and dealing with the
conversion to characters elsewhere seems more correct.
Cheers,
David
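A small sketch of why "an opaque sequence of WCHARs" is not the same thing as a Unicode string: the encoder below turns a String into UTF-16 code units, and the strict decoder rejects any sequence containing an unpaired surrogate, even though such a sequence is still a perfectly good list of 16-bit units. The function names are made up for this illustration.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (chr, ord)
import Data.Word (Word16)

-- Encode a String as UTF-16 code units. Characters above U+FFFF
-- (outside the BMP) become a high/low surrogate pair.
encodeUtf16 :: String -> [Word16]
encodeUtf16 = concatMap enc
  where
    enc c
      | n < 0x10000 = [fromIntegral n]
      | otherwise   = [ fromIntegral (0xD800 + (m `shiftR` 10))
                      , fromIntegral (0xDC00 + (m .&. 0x3FF)) ]
      where
        n = ord c
        m = n - 0x10000

-- Strict decoder: returns Nothing as soon as it meets a surrogate
-- that is not part of a well-formed high/low pair.
decodeUtf16 :: [Word16] -> Maybe String
decodeUtf16 [] = Just []
decodeUtf16 (w : ws)
  | isHigh w = case ws of
      (l : rest) | isLow l -> (chr (combine w l) :) <$> decodeUtf16 rest
      _                    -> Nothing
  | isLow w   = Nothing
  | otherwise = (chr (fromIntegral w) :) <$> decodeUtf16 ws
  where
    isHigh x = x >= 0xD800 && x <= 0xDBFF
    isLow  x = x >= 0xDC00 && x <= 0xDFFF
    combine h l =
      0x10000 + (fromIntegral h - 0xD800) * 0x400 + (fromIntegral l - 0xDC00)
```

A unit sequence such as [0x0061, 0xD800] has no strict decoding, which is the kind of name that a character-based FilePath cannot represent faithfully.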

OK, based on what David and Brandon wrote, I guess
that representing paths as bytestrings does make
some low-level sense on all platforms. Although
for Windows we would still need some way to deal
with the requirement that the bytestring have an even
length.
We will need platform-dependent coercions of
paths to and from String/Text. Those might sometimes
be partial functions. We need a notion of the coercions
for the current platform, and we also need it to be
possible to access the coercions for all platforms.

On 28 June 2015 at 02:02, Yitzchak Gale wrote:
OK, based on what David and Brandon wrote, I guess that representing paths as bytestrings does make some low-level sense on all platforms. Although for Windows we would still need some way to deal with the requirement that the bytestring have an even length.
I would guess this could just be done by making the type abstract so you
can't easily get to the underlying bytes. Windows will only ever give you
even-length bytestrings (in directory listings or similar) and all the
other ways of synthesizing paths from strings could be set up to preserve
the evenness.
If you end up passing an odd-length bytestring to Windows as a path then
Bad Things could certainly happen, but no worse than mucking around with
other unsafe APIs like Data.ByteString.Internal.
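A minimal sketch of the idea of keeping the type abstract so that evenness is preserved by construction. The type and function names (WinPath, fromBytes, fromUnits, append, toBytes) are hypothetical, and the little-endian byte order is an assumption of this example.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftR, (.&.))
import Data.Word (Word16)

-- Assumption: in a real module the WinPath constructor would be hidden,
-- leaving the functions below as the only way to build a value.
newtype WinPath = WinPath B.ByteString
  deriving (Eq, Show)

-- Checked constructor from raw bytes: odd lengths are rejected.
fromBytes :: B.ByteString -> Maybe WinPath
fromBytes b
  | even (B.length b) = Just (WinPath b)
  | otherwise         = Nothing

-- Building from 16-bit units always yields an even number of bytes
-- (low byte first, i.e. little-endian).
fromUnits :: [Word16] -> WinPath
fromUnits = WinPath . B.pack . concatMap split
  where
    split w = [fromIntegral (w .&. 0xFF), fromIntegral (w `shiftR` 8)]

-- Appending two even-length paths gives an even-length path.
append :: WinPath -> WinPath -> WinPath
append (WinPath a) (WinPath b) = WinPath (B.append a b)

-- Read-only access to the underlying bytes.
toBytes :: WinPath -> B.ByteString
toBytes (WinPath b) = b
```

Since every exported way of producing a WinPath keeps the length even, code outside the module never has to re-check the invariant.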

On 27/06/2015, Yitzchak Gale wrote:
OK, based on what David and Brandon wrote, I guess that representing paths as bytestrings does make some low-level sense on all platforms.
Not quite: in 9p, for example, each component is a byte string, and a path is a list of components. 9p forbids null bytes in a component, though, so one could use a null byte as the separator in FilePath. That said, this is an edge case and (file path ~ byte string) may be good enough.
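The null-byte-separator encoding described here can be sketched as below. The names packComponents and unpackComponents are invented for this example, and it relies only on the property stated above (components never contain a null byte).

```haskell
import Data.List (intercalate)
import Data.Word (Word8)

-- A single path component as raw bytes.
type Component = [Word8]

-- Flatten a list of components into one byte string, using 0 as the
-- separator. Fails if any component itself contains a 0 byte.
packComponents :: [Component] -> Maybe [Word8]
packComponents cs
  | any (elem 0) cs = Nothing
  | otherwise       = Just (intercalate [0] cs)

-- Split a flat byte string back into components at every 0 byte.
-- This inverts packComponents for any non-empty list of components.
unpackComponents :: [Word8] -> [Component]
unpackComponents bs = case break (== 0) bs of
  (c, [])       -> [c]
  (c, _ : rest) -> c : unpackComponents rest
```

For instance, the components [1,2] and [3] flatten to [1,2,0,3], and splitting that at the null byte recovers the original two components.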

I wrote:
On Mac OS X, it's normalized Unicode.
Brandon Allbery wrote:
This is only true for higher level OS X APIs. ghc normally operates in the BSD layer, which mostly follows POSIX semantics; in particular, filesystem paths are bytestrings in the BSD layer, and only normalized in Cocoa APIs.
If I happen to be writing a program that only interacts with the BSD layer, then I can use the POSIX-style FilePath. But as an application writer, most of the time I'll be dealing with Mac OS X paths that need to work correctly with Cocoa.

Brandon Allbery wrote:
On Sat, Jun 27, 2015 at 4:50 PM, Yitzchak Gale wrote:
On Mac OS X, it's normalized Unicode. The important point is *normalized* - if you create a FilePath from two different Unicode strings that have the same normalized form, the result FilePaths must be equal on Mac OS X.
This is only true for higher level OS X APIs. ghc normally operates in the BSD layer, which mostly follows POSIX semantics; in particular, filesystem paths are bytestrings in the BSD layer, and only normalized in Cocoa APIs.
Normalization here is a property of the filesystem, not the API. So it will be normalized in the POSIX layer too.

Hi!
Instead of trying to minimally patch the existing API and still breaking loads of code, why not make a new API that doesn't have to compromise, and deprecate the old one?
Niklas
----- Original message -----
From: "Herbert Valerio Riedel"

Because a new API already exists in libraries, but FilePath from base is
still being used, which makes things worse (now your programs have all
those conversions all over).
I like the idea of a gradual deprecation warning, but it's not clear whether
it's feasible to implement.
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Hi Niklas,
The function writeFile takes a FilePath. We could fork base or tell
everyone to use writeFile2, but in practice everyone will keep using
writeFile, and thus String for FilePath. This approach is the only thing we
could figure out that made sense.
Henning: we do not propose normalisation on initialisation. For ASCII
strings fromFilePath . toFilePath will be id. It might also be for unicode
on some/all platforms. Of course, you can write your own FilePath creator
that does normalisation on construction.
Thanks, Neil

Hi Neil,
why does the proposal *not* include normalization?
There are four advantages that I see to making FilePath a datatype:
1. it makes it possible to implement the correct semantics for some
systems (including POSIX),
2. it allows for information hiding, which in turn helps modularity,
3. the type is distinct from any other type, hence static checks are stronger,
4. it becomes possible to quotient values over some arbitrary set of
identities that makes sense. E.g. in the case of FilePath, arguably
"foo/bar//baz" *is* "foo/bar/baz" *is* "foo//bar/baz" for all intents
and purposes, so it is not useful to distinguish these three ways of
writing down the same path (and in fact in practice distinguishing
them leads to subtle bugs). That is, the Eq instance compares
FilePaths modulo a few laws.
Do you propose to forgo (4)? If so, why?
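A minimal sketch of what point (4) could look like, assuming a hypothetical wrapper type `QPath` and a single identity (collapsing repeated slashes). Neither the type nor `collapseSlashes` comes from the proposal; they only illustrate an Eq instance that compares paths modulo a law while leaving the stored spelling untouched.

```haskell
import Data.Function (on)

-- Hypothetical stand-in for an abstract FilePath; not the proposal's type.
newtype QPath = QPath String

-- Collapse runs of '/' so that "foo//bar" and "foo/bar" map to the same key.
-- This is only one of many possible identities, chosen for illustration.
collapseSlashes :: String -> String
collapseSlashes ('/' : '/' : rest) = collapseSlashes ('/' : rest)
collapseSlashes (c : rest)         = c : collapseSlashes rest
collapseSlashes []                 = []

-- Equality modulo the law above; the wrapped string itself is not rewritten.
instance Eq QPath where
  (==) = (==) `on` (\(QPath s) -> collapseSlashes s)

-- The original spelling remains recoverable.
rawPath :: QPath -> String
rawPath (QPath s) = s
```

With these definitions `QPath "foo/bar//baz" == QPath "foo/bar/baz"` holds, while `rawPath` still returns whatever was stored. Whether a leading "//" should be collapsed is platform-dependent, so a real set of laws would need more care than this sketch takes.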
If we're going through a deprecation process, could we do so once, by
getting the notion of path equality we want right the first time?
Contrary to type indexing FilePath, it seems to me that the design
space for path identities is much smaller. Essentially, exactly the
ones here: https://hackage.haskell.org/package/filepath-1.1.0.2/docs/System-FilePath-Po....
Best,
Mathieu

Hi,
I think it'd be more robust to handle normalisation when converting from
String/Text to FilePath (and combining things with (</>) and so on) rather
than in the underlying representation.
It's absolutely crucial that you can ask the OS for a filename (which it
gives you as a sequence of bytes) and then pass that exact same sequence of
bytes back to the OS without any normalisation or other useful alterations
having taken place.
You can do some deeply weird stuff in Windows by starting an absolute path
with \\?\, including apparently using '.' and '..' as the name of a
filesystem component:
Because it turns off automatic expansion of the path string, the "\\?\"
prefix also allows the use of ".." and "." in the path names, which can be
useful if you are attempting to perform operations on a file with these
otherwise reserved relative path specifiers as part of the fully qualified
path.
(from
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).a...
)
I don't fancy shaking all the corner cases out of this. An explicit
'normalise' function seems ok, but baking normalisation into the type
itself seems bad.
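A rough sketch of that separation, assuming a hypothetical `RawPath` type that stores the bytes exactly as received and a hypothetical opt-in `normaliseSlashes` function. None of these names come from the proposal or from the filepath package; a list of `Word8` is used only to keep the example dependency-free.

```haskell
import Data.Word (Word8)

-- Hypothetical opaque path: exactly the bytes the OS handed over.
newtype RawPath = RawPath [Word8]
  deriving (Eq, Show)

-- Wrapping and unwrapping never alter the bytes.
fromOSBytes :: [Word8] -> RawPath
fromOSBytes = RawPath

toOSBytes :: RawPath -> [Word8]
toOSBytes (RawPath bs) = bs

-- Explicit, opt-in normalisation: collapse repeated '/' (byte 47).
-- Callers that need the OS's exact spelling simply never call this.
normaliseSlashes :: RawPath -> RawPath
normaliseSlashes (RawPath bs) = RawPath (go bs)
  where
    go (47 : 47 : rest) = go (47 : rest)
    go (b : rest)       = b : go rest
    go []               = []
```

Here `toOSBytes . fromOSBytes` is the identity on any byte sequence, and normalisation only happens where the caller asks for it.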
Cheers,
David

Worse, there are situations where you absolutely _have_ to be able to use
the \\?\ encoding of a path on Windows to read, modify or delete files with
"impossible names" that were created by other means.
E.g. files named like AUX, which had traditional roles under DOS and cause
weird interactions, or files that were created with "impossibly long names" --
which can happen in the wild when you move directories around, etc.
I'm weakly in favor of the proposal precisely because it is the first
version of this concept that I've seen that DOESN'T try to get too clever
with regard to adding all sorts of normalization. This proposal seems
to be the simplest move that would enable us to do something correctly in
the future, regardless of what that correct thing winds up being.
-Edward

Normalization is a very hairy issue, which is not just platform-specific
but also filesystem-specific. Mac OS X is probably the worst of all worlds
in that respect, where HFS+ will do NFD normalization and may or may not
have case sensitivity depending on how that partition was formatted.
Network file shares and disk images may or may not have case sensitivity
and can use either NFD or NFC normalization based on mount options.
Contrary to statements earlier in the thread, NFD normalization happens on
HFS+ filesystems (the default) regardless of whether you're using POSIX
APIs or not. It's easy to prove this to yourself by creating a file with
U+00E9 (LATIN SMALL LETTER E WITH ACUTE) in the name (from any of the APIs)
and you'll see it come back out (e.g. from readdir) as two code points: 'e'
and then U+0301 (COMBINING ACUTE ACCENT). It'll also do some weird
transformations to file names that contain byte sequences that are not
valid UTF-8.
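To make the difference concrete, here is a small sketch of the two spellings as Haskell strings. The names `nfcName` and `nfdName` and the example file name are made up for illustration; the sketch shows only that the two forms are different code point sequences, not what any particular filesystem does with them.

```haskell
-- Precomposed form (NFC): ends in U+00E9 LATIN SMALL LETTER E WITH ACUTE.
nfcName :: String
nfcName = "caf\x00E9"

-- Decomposed form (NFD): 'e' followed by U+0301 COMBINING ACUTE ACCENT.
nfdName :: String
nfdName = "cafe\x0301"

-- The two spellings render alike but differ as code point sequences.
sameCodePoints :: Bool
sameCodePoints = nfcName == nfdName
```

`sameCodePoints` is False, and the two strings have different lengths, so a name written in one form and read back in the other will not compare equal under plain String equality.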

On Fri, Jun 26, 2015 at 9:08 AM, Herbert Valerio Riedel
We (see From: & CC: headers) propose, plain and simple, to turn the currently defined type-synonym
type FilePath = String
into an abstract/opaque data type instead.
Why/How/When?
I've had success with a slightly different "How":
Phase 1: Replace FilePath with a type class, with instances for the old FilePath (i.e. String) and the new implementation.
Phase 2: Wait until a suitable amount of Hackage builds without the String instance.
Phase 3: Deprecate the String instance - move it to an old-filepath package.
Phase 4: Replace the type class with the new implementation.
This way the new implementation is available immediately, packages can begin converting at once, benefits can be assessed.

On 2015-06-27 at 14:56:33 +0200, David Fox wrote: [...]
I've had success with a slightly different "How":
What was your concrete use-case btw?
Phase 1: Replace FilePath with a type class, with instances for the old FilePath (i.e. String) and the new implementation.
What would that comprise in the FilePath case? I assume adding a transitional class whose methods are not exposed (and whose typeclass name is exported from some GHC-specific internal-marked module)? i.e.
    class IsFilePath a where
      privateToFilePath   :: a -> FilePath
      privateFromFilePath :: FilePath -> a
    instance IsFilePath FilePath where
      privateToFilePath   = id
      privateFromFilePath = id
    instance IsFilePath [Char] where
      privateToFilePath   = System.IO.toFilePath
      privateFromFilePath = System.IO.fromFilePath
?
as well as changing a lot of type-sigs in base & filepath from e.g.
    writeFile    :: FilePath -> String -> IO ()
    openTempFile :: FilePath -> String -> IO (FilePath, Handle)
to
    writeFile    :: IsFilePath a => a -> String -> IO ()
    openTempFile :: IsFilePath a => a -> String -> IO (a, Handle)
?
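A self-contained variant of this guess, written so that it compiles on its own. It assumes a hypothetical opaque type `AbsFilePath` in place of the future abstract FilePath, and a made-up function `describePath` in place of real base functions such as writeFile; the class and method names follow the ones used in this message.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical opaque type standing in for the proposed abstract FilePath.
newtype AbsFilePath = AbsFilePath String
  deriving (Eq, Show)

-- Transitional class with one instance per accepted path representation.
class IsFilePath a where
  privateToFilePath   :: a -> AbsFilePath
  privateFromFilePath :: AbsFilePath -> a

-- The new implementation converts to itself.
instance IsFilePath AbsFilePath where
  privateToFilePath   = id
  privateFromFilePath = id

-- Legacy instance for String, the one that Phase 3 would deprecate.
instance IsFilePath [Char] where
  privateToFilePath                   = AbsFilePath
  privateFromFilePath (AbsFilePath s) = s

-- A generalised signature in the style of `IsFilePath a => a -> ...`.
describePath :: IsFilePath a => a -> String
describePath p = case privateToFilePath p of
  AbsFilePath s -> "path:" ++ s
```

Both `describePath "a/b"` and `describePath (AbsFilePath "a/b")` type-check, which is the property the transition relies on: old call sites keep compiling while new code can already use the opaque type.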
Phase 2: Wait until a suitable amount of hackage builds without the string instance.
I can see Stackage helping with that by using a custom GHC which lacks the legacy `IsFilePath [Char]`-instance. So I'd be optimistic that Phase2 could be accomplished within one year for the Stackage-subset of Hackage.
Phase 3: Deprecate the String instance - move it to an old-filepath package.
Phase 4: Replace the type class with the new implementation
I assume this means getting rid again of the typeclass, and changing the type-sigs back to, i.e.,
    writeFile    :: FilePath -> String -> IO ()
    openTempFile :: FilePath -> String -> IO (FilePath, Handle)
(but now with the new opaque `FilePath`)?
This way the new implementation is available immediately, packages can begin converting at once, benefits can be assessed.
This scheme seems feasible at first glance, as long as the typeclass doesn't start spreading across packages and find its way into type-sigs (in which case it'd become more disruptive to get rid of it again). OTOH, I'm not sure (assuming I understood how your scheme works) it can be avoided to have the typeclass spread, since if not every API that now has `FilePath` arguments in their type-sigs gets generalised to have `IsFilePath a => a` arguments instead, we can't reach the goal of "Phase 2".
But I suspect that I didn't fully understand how your proposed transition scheme works exactly... so please correct me where I got it wrong!
Cheers, hvr

On Sat, Jun 27, 2015 at 6:37 AM, Herbert Valerio Riedel wrote:
[...]
But I suspect that I didn't fully understand how your proposed transition scheme works exactly... so please correct me where I got it wrong!
You are right, your approach is more appropriate for use by a community. I missed some of the problems that would arise.

On 06/27/2015 09:37 AM, Herbert Valerio Riedel wrote:
On 2015-06-27 at 14:56:33 +0200, David Fox wrote:
[...]
I assume adding a transitional class whose methods are not exposed (and whose typeclass name is exported from some GHC-specific internal-marked module)? i.e.
    class IsFilePath a where
      privateToFilePath   :: a -> FilePath
      privateFromFilePath :: FilePath -> a
    instance IsFilePath FilePath where
      privateToFilePath   = id
      privateFromFilePath = id
    instance IsFilePath [Char] where
      privateToFilePath   = System.IO.toFilePath
      privateFromFilePath = System.IO.fromFilePath
It's probably better to not export the class at all, only its two instances and its methods. See below.
[...]
This way the new implementation is available immediately, packages can begin converting at once, benefits can be assessed.
This scheme seems feasible at first glance, as long as the typeclass doesn't start spreading across packages and find its way into type-sigs (in which case it'd become more disruptive to get rid of it again). Otoh, I'm not sure (assuming I understood how your scheme works) it can be avoided to have the typeclass spread, since if not every API that now has `FilePath` arguments in their type-sigs gets generalised to have `IsFilePath a => a` arguments instead, we can't reach the goal of "Phase 2".
As long as the typeclass itself is not exported, this is not a big problem. Every *explicit* type signature would have to contain either a String or a FilePath. Yes, the transition to the FilePath-only state would probably be slower, but that is the intended feature, not a bug.
GHC could do more to help with the transition if it allowed the DEPRECATED pragma on class instances:
    instance IsFilePath String {-# DEPRECATED "Use the FilePath type instead" #-} where ...
The warning would be reported wherever the compiler can statically determine that the deprecated instance is used.

On Sat, 27 Jun 2015, David Fox wrote:
I've had success with a slightly different "How":
Phase 1: Replace FilePath with a type class, with instances for the old FilePath (i.e. String) and the new implementation.
There could be more instances for platform-specific FilePath types, e.g. for creating Windows file paths on Unix.
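One way this could look, as a sketch only: two hypothetical newtypes, `PosixPath` and `WindowsPath`, and a made-up class `PlatformPath` whose single method joins components with the platform's separator. These names are assumptions for illustration and are not part of the proposal.

```haskell
import Data.List (intercalate)

-- Hypothetical platform-specific path types.
newtype PosixPath   = PosixPath String   deriving (Eq, Show)
newtype WindowsPath = WindowsPath String deriving (Eq, Show)

-- Made-up class: build a relative path from its components.
class PlatformPath p where
  fromComponents :: [String] -> p

instance PlatformPath PosixPath where
  fromComponents = PosixPath . intercalate "/"

-- Usable on any host, so Windows paths can be built on Unix.
instance PlatformPath WindowsPath where
  fromComponents = WindowsPath . intercalate "\\"
```

The result type selects the flavour, so `fromComponents ["a", "b"] :: WindowsPath` yields a backslash-separated path regardless of the machine the program runs on.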

I think this proposal is currently underspecified. For example, it's
not clear to me what the semantics of a FilePath are. I have the
feeling that `toFilePath` should return a Maybe, for example, but it's
hard to say without knowing what it's converting to, exactly.
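As an illustration of what a partial conversion might look like, here is a sketch with a hypothetical `CheckedPath` type and a hypothetical `toFilePathMaybe`. The validity rule (reject empty strings and strings containing NUL) is an assumption picked for the example; the proposal does not specify one.

```haskell
-- Hypothetical validated path type; not the proposal's FilePath.
newtype CheckedPath = CheckedPath String
  deriving (Eq, Show)

-- Assumed rule: a path must be non-empty and must not contain NUL.
toFilePathMaybe :: String -> Maybe CheckedPath
toFilePathMaybe s
  | null s        = Nothing
  | '\0' `elem` s = Nothing
  | otherwise     = Just (CheckedPath s)
```

Under this rule `toFilePathMaybe "foo/bar"` succeeds and `toFilePathMaybe "foo\0bar"` fails, which shows why the result type depends on what exactly the target representation is allowed to hold.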
I also worry about the immense breakage this will cause. This is not
just an issue of causing a lot of work for maintainers, but also of
lots of unmaintained libraries, printed code, etc. breaking. I feel that
there is not enough gain in this proposal relative to the amount of
breakage.
Has any thought been given to introducing new modules for this type, and
leave the old ones in place?
Erik

Regarding underspecified: I think that's appropriate at this phase. The
main proposal is: make FilePath an abstract type. It will take multiple
GHC releases before we switch over fully, with plenty of time to hash out
details of how the filepath package should work, and the opportunity to
experiment with different wrappers around a core abstract type.
Having used an alternate FilePath type for a while (via system-filepath), I
can say that it doesn't give the same benefit of just fixing the central
FilePath type. Having to convert between types all over the place is
tedious, defeats a lot of the performance benefits we're going for, and
hurts type safety.
As someone who typically is very much opposed to breaking changes in core
libraries: I think this one is well worth it.

On Mon, Jun 29, 2015 at 10:46 AM, Michael Snoyman wrote:
> Regarding underspecified: I think that's appropriate at this phase. The main proposal is: make FilePath an abstract type. It will take multiple GHC releases before we switch over fully, with plenty of time to hash out details of how the filepath package should work, and the opportunity to experiment with different wrappers around a core abstract type.
But changing the semantics of an established newtype is very tricky business, since the resulting breakage won't be indicated by the types!
> Having used an alternate FilePath type for a while (via system-filepath), I can say that it doesn't give the same benefit of just fixing the central FilePath type. Having to convert between types all over the place is tedious, defeats a lot of the performance benefits we're going for, and hurts type safety.
Why would you have to convert 'all over the place'? If the alternative library also provides the basic IO functions, the only places you'd have to convert are interfaces with other libraries, and things read from e.g. a config file, neither of which happens a lot.
> As someone who typically is very much opposed to breaking changes in core libraries: I think this one is well worth it.
Do you have any insight into the amount of breakage this will cause? I have a gut feeling that it's a lot more than any of the previous changes we've had, and those have already caused a lot of grumbling. But the only way to be sure is to run the builds on hackage (or stackage, but that's a smaller sample size).
Erik

On Mon, Jun 29, 2015 at 12:07 PM Erik Hesselink wrote:
> On Mon, Jun 29, 2015 at 10:46 AM, Michael Snoyman wrote:
>> Regarding underspecified: I think that's appropriate at this phase. The main proposal is: make FilePath an abstract type. It will take multiple GHC releases before we switch over fully, with plenty of time to hash out details of how the filepath package should work, and the opportunity to experiment with different wrappers around a core abstract type.
> But changing the semantics of an established newtype is very tricky business, since the resulting breakage won't be indicated by the types!
My suggestion isn't to roll out one breaking change and then another silent semantics change later. Rather, my point is: getting FilePath to be an abstract type is the meat of the proposal, and what we need to agree on. Working out the exact semantics of how the filepath package interacts with that is important, but not urgent. Let's get to an agreement that an abstract type is an improvement, and then we can figure out exactly how it should behave. After all, we'll have about 2 years to figure that out.
>> Having used an alternate FilePath type for a while (via system-filepath), I can say that it doesn't give the same benefit of just fixing the central FilePath type. Having to convert between types all over the place is tedious, defeats a lot of the performance benefits we're going for, and hurts type safety.
> Why would you have to convert 'all over the place'? If the alternative library also provides the basic IO functions, the only places you'd have to convert are interfaces with other libraries, and things from e.g. config file, both of which don't happen a lot.
By having two different types, we know that not everyone will convert over. In fact, the very argument for having two types is so that not everyone will need to convert. Especially if Prelude continues to export the current `type FilePath = [Char]`, it will be difficult to get all libraries to use the new type.
>> As someone who typically is very much opposed to breaking changes in core libraries: I think this one is well worth it.
> Do you have any insight in the amount of breakage this will cause? I have a gut feeling that it's a lot more than any of the previous changes we've had, and those have already caused a lot of grumbling. But the only way to be sure is to run the builds on hackage (or stackage, but that's a smaller sample size).
I agree, this is going to be a big one. It does not lend itself to elegant migrations like FTP did, for instance. But the scope of the current problem is also large, which is why I believe this breakage is warranted. Doing it gradually with a deprecation plan will hopefully let us make the migration as easy as possible.
> Erik

One tiny amendment to a comment(!) in the non-normative(!) code in Phase 3:
data WindowsFilePath = WFP ByteArray# -- UTF16 data
If a Windows file path is valid UTF-16 then it is displayed as such in the
GUI, but if not it's still a legal file path. It really is just wchar_t[]
data:
data WindowsFilePath = WFP ByteArray# -- wchar_t[] data as passed to syscalls
This seems to be the source of some confusion.
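
The distinction drawn above, between well-formed UTF-16 and an arbitrary sequence of 16-bit code units, can be illustrated with a small checker. This is a sketch only: modelling the path as a `[Word16]` list is an assumption made for readability, and the names `CodeUnits` and `isValidUtf16` are hypothetical.

```haskell
import Data.Word (Word16)

-- A Windows path modelled as raw 16-bit code units (wchar_t[]).
-- The list representation is an assumption for illustration.
type CodeUnits = [Word16]

isHighSurrogate, isLowSurrogate :: Word16 -> Bool
isHighSurrogate w = w >= 0xD800 && w <= 0xDBFF
isLowSurrogate  w = w >= 0xDC00 && w <= 0xDFFF

-- True when every surrogate is part of a well-formed high/low pair.
isValidUtf16 :: CodeUnits -> Bool
isValidUtf16 [] = True
isValidUtf16 (w : ws)
  | isHighSurrogate w = case ws of
      (l : rest) | isLowSurrogate l -> isValidUtf16 rest
      _                             -> False
  | isLowSurrogate w  = False
  | otherwise         = isValidUtf16 ws
```

A sequence such as a lone `0xD800` fails this check, yet per the point above it can still be a legal file path, which is why describing the payload as wchar_t[] data is the more accurate comment.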
Cheers,
David

+1 for the first two phases of the original proposal. I always wished it was
not a type alias.
No strong opinion on phase 3; I have probably never run into sophisticated
enough issues to fully get the picture. I doubt we'll be able to
craft an ideal cross-platform API, but I like the spirit of the original
proposal.
-- *Λ\ois* http://twitter.com/aloiscochard http://github.com/aloiscochard

Hi David,
> One tiny amendment to a comment(!) in the non-normative(!) code in Phase 3:
> data WindowsFilePath = WFP ByteArray# -- UTF16 data
> If a Windows file path is valid UTF-16 then it is displayed as such in the GUI, but if not it's still a legal file path. It really is just wchar_t[] data.
Thanks for bringing this up. It's tricky - I think in practice:
toFilePath x = WFP (encodeStringAsUTF16 x)
But the data in WFP will be treated as UCS2 (aka wchar_t) when passing to the API calls, so it's really both. On Windows NT it really was UCS2, but on Windows 7 it's always treated as UTF16 in the GUI, so that seems to be consistent with what people expect and ensures we don't throw away information when converting to/from FilePath. Given it seems you are quite knowledgeable in this area, please shout if that seems misguided!
To all the people who are worried about breakage, I can guarantee this will cause breakage. It's a sad fact, and certainly the main negative to this proposal. I was on the fence initially when hvr suggested this change to me, but was convinced by performance and correctness. Whether the Haskell community as a whole thinks that makes it worth it is why it's a proposal. If anything, I'm concerned by the lack of people saying -1, please don't break my code...
Thanks, Neil
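
For readers unfamiliar with the encoding step, a helper like the `encodeStringAsUTF16` named above might look as follows. This is a sketch under assumptions: the real helper's type and behaviour are not given in the thread, and a `[Word16]` list is used here instead of a `ByteArray#` to keep the example self-contained.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word16)

-- Hypothetical stand-in for the encodeStringAsUTF16 helper:
-- encode each Char as one or two UTF-16 code units.
encodeStringAsUTF16 :: String -> [Word16]
encodeStringAsUTF16 = concatMap encodeChar
  where
    encodeChar c
      | n < 0x10000 = [fromIntegral n]  -- BMP: one code unit
      | otherwise   =
          [ fromIntegral (0xD800 + (m `shiftR` 10))  -- high surrogate
          , fromIntegral (0xDC00 + (m .&. 0x3FF))    -- low surrogate
          ]
      where
        n = ord c
        m = n - 0x10000
```

In this sketch a Char that is itself a surrogate code point passes through as a single unit, so unpaired surrogates are not discarded by the encoder.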

On Tue, Jun 30, 2015 at 11:25 AM, Neil Mitchell wrote:
> To all the people who are worried about breakage, I can guarantee this will cause breakage. It's a sad fact, and certainly the main negative to this proposal. I was on the fence initially when hvr suggested this change to me, but was convinced by performance and correctness. Whether the Haskell community as a whole thinks that makes it worth it is why it's a proposal. If anything, I'm concerned by the lack of people saying -1, please don't break my code...
I'm not convinced by the performance argument. Most people don't need performance from the small amount of FilePath usage they have. Those who do can switch to a different package. Now correctness would be a good argument, but this proposal doesn't really add that much in that respect, it seems.
I'm still on the fence, but leaning towards -1, but I'm not saying please don't break my code. My code will be fine, I'm around to fix it. I'm more worried about other people's code (that I might rely on), maintainers that have left, or aren't that responsive, newcomers reading old tutorials, people getting angry about needing more CPP/fixing more code on new GHC releases, etc. We're still breaking code on every new GHC release, and it seems the amount of breakage is only increasing.
Erik

In an ideal world, FilePath would be an abstract type. I think nearly
everyone can agree on that.
However, it seems every major ghc release includes some major breaking
changes. I've spent a lot of time fixing the fallout from them, and this
looks much more significant than any we've had in years.
In particular, I'm quite scared that people attempted to gauge the fallout
by building hackage, but it was too much work. Also consider that private
codebases are likely to be impacted significantly (at least the ones I've
seen will be).
I think it's likely this will cause a major break in the ecosystem, with
most packages only supporting old or new style FilePath.
I guess my point is, I don't think this proposal should go ahead unless
there's significant buy-in from the community (not merely silence or a
small majority in favor). I'm not doing much Haskell these days so I'm
pretty neutral on it.
John L.

So this goes back to a valid question:
What fraction of currently buildable Hackage breaks with such an API
change, and how complex will fixing those breaks be?
This should be evaluated, along with the extent to which the appropriate
migrations can be mechanically assisted.
Would some of this breakage be mitigated by changing ++ to be monoid or
semigroup merge?
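
As a rough illustration of the ++ question, code written against `(<>)` would compile unchanged for both representations. This is a sketch only: `AbstractFilePath` and `addSuffix` are hypothetical names, the wrapped String is a placeholder representation, and the example assumes a base version in which `Semigroup` is exported from the Prelude.

```haskell
-- Hypothetical abstract path type wrapping the old representation.
newtype AbstractFilePath = AbstractFilePath String
  deriving (Eq, Show)

-- Appending paths delegates to the underlying representation.
instance Semigroup AbstractFilePath where
  AbstractFilePath a <> AbstractFilePath b = AbstractFilePath (a <> b)

instance Monoid AbstractFilePath where
  mempty = AbstractFilePath ""

-- Written with (<>) instead of (++), this helper works for String
-- and for the abstract type alike.
addSuffix :: Semigroup p => p -> p -> p
addSuffix base suffix = base <> suffix
```

This would only cover code that concatenates paths; code that pattern-matches on the String or passes string literals would still need other changes.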

2015-07-04 4:28 GMT+02:00 Carter Schonwald:
> [...] What fraction of currently buildable Hackage breaks with such an API change, and how complex will fixing those breaks be? [...]
I think it is highly irrelevant how complex fixing the breakage is, it will probably almost always be trivial, but that's not the point: Think e.g. about a package which didn't really need any update for a few years, its maintainer is inactive (nothing to do recently, so that's OK), and which is a transitive dependency of a number of other packages. This will effectively mean lots of broken packages for weeks or even longer. Fixing breakage from the AMP or FTP proposals was trivial, too, but nevertheless a bit painful.
> This should be evaluated. And to what extent can the appropriate migrations be mechanically assisted. Would some of this breakage be mitigated by changing ++ to be monoid or semigroup merge?
To me the fundamental question which should be answered before any detail question is: Should we go on and continuously break minor things (i.e. basically give up any stability guarantees) or should we collect a bunch of changes first (leaving vital things untouched for that time) and release all those changes together, in longer intervals? That's IMHO a tough question which we have somehow avoided answering up to now. I would like to see a broader discussion like this first, both approaches have their pros and cons, and whatever we do, there should be some kind of consensus behind it.
Cheers, S.
P.S.: Just for the record: I'm leaning towards the "lots-of-changes-after-a-longer-time" approach, otherwise I see a flood of #ifdefs and tons of failing builds coming our way... :-P

On Sat, Jul 4, 2015 at 3:26 PM, Sven Panne wrote:
> To me the fundamental question which should be answered before any detail question is: Should we go on and continuously break minor things (i.e. basically give up any stability guarantees) or should we collect a bunch of changes first (leaving vital things untouched for that time) and release all those changes together, in longer intervals? That's IMHO a tough question which we somehow avoided to answer up to now. I would like to see a broader discussion like this first, both approaches have their pros and cons, and whatever we do, there should be some kind of consensus behind it.
I recall suggesting something along the lines of stable vs. research ghc releases a few months back. This seems like it would fit in fairly well; the problem is getting buy-in from certain parts of the ecosystem that seem to prefer to build production-oriented packages from research/"unstable" releases.
--
brandon s allbery kf8nh sine nomine associates
allbery.b@gmail.com ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On Jul 4, 2015, at 15:26, Sven Panne wrote:
> 2015-07-04 4:28 GMT+02:00 Carter Schonwald:
>> [...] What fraction of currently buildable Hackage breaks with such an API change, and how complex will fixing those breaks be? [...]
> I think it is highly irrelevant how complex fixing the breakage is, it will probably almost always be trivial, but that's not the point: Think e.g. about a package which didn't really need any update for a few years, its maintainer is inactive (nothing to do recently, so that's OK), and which is a transitive dependency of a number of other packages. This will effectively mean lots of broken packages for weeks or even longer. Fixing breakage from the AMP or FTP proposals was trivial, too, but nevertheless a bit painful.
>> This should be evaluated. And to what extent can the appropriate migrations be mechanically assisted. Would some of this breakage be mitigated by changing ++ to be monoid or semigroup merge?
> To me the fundamental question which should be answered before any detail question is: Should we go on and continuously break minor things (i.e. basically give up any stability guarantees) or should we collect a bunch of changes first (leaving vital things untouched for that time) and release all those changes together, in longer intervals? That's IMHO a tough question which we somehow avoided to answer up to now.
I'd argue that Haskell and GHC's history clearly shows we've answered that question and that overall we value frequent small breaking changes over giant change roadblocks like Perl's or Python's.
Still +0 on the proposal though.
Tom
> I would like to see a broader discussion like this first, both approaches have their pros and cons, and whatever we do, there should be some kind of consensus behind it.
> Cheers, S.
> P.S.: Just for the record: I'm leaning towards the "lots-of-changes-after-a-longer-time" approach, otherwise I see a flood of #ifdefs and tons of failing builds coming our way... :-P

2015-07-04 22:48 GMT+02:00
> I'd argue that Haskell and GHC's history clearly shows we've answered that question and that overall we value frequent small breaking changes over giant change roadblocks like Perl's or Python's. [...]
I'm not sure that "value" is the right word. My impression is more that this somehow happened accidentally and was not the result of careful planning or broad consensus. And even if in the past this might have been the right thing, I consider today's state of affairs as something totally different: In the past it was only GHC, small parts of the language or a handful of packages (or even just a few modules, in the pre-package times). Today every change resonates through thousands of packages on Hackage and elsewhere. IMHO some approach similar to e.g. C++03 => C++11 => C++14 makes more sense in a world like this than a constantly fluctuating base, but others might see this differently.
My fear is that this will inevitably lead to the necessity of having an autoconf-like feature detection machinery to compile a package, and looking at a few packages, we are already halfway there. :-/

On 04/07/2015 at 21:26:31 +0200, Sven Panne wrote:
> To me the fundamental question which should be answered before any detail question is: Should we go on and continuously break minor things (i.e. basically give up any stability guarantees) or should we collect a bunch of changes first (leaving vital things untouched for that time) and release all those changes together, in longer intervals? That's IMHO a tough question which we somehow avoided to answer up to now. I would like to see a broader discussion like this first, both approaches have their pros and cons, and whatever we do, there should be some kind of consensus behind it.
Potentially we ought to await Backpack [0], which should make such transitions easier.
[0] http://plv.mpi-sws.org/backpack/
participants (26)
- Alois Cochard
- amindfv@gmail.com
- Antonio Nikishaev
- Bart Massey
- Bob Ippolito
- Boespflug, Mathieu
- Brandon Allbery
- Carter Schonwald
- David Fox
- David Turner
- Edward Kmett
- Erik Hesselink
- Gershom B
- Henning Thielemann
- Herbert Valerio Riedel
- Herbert Valerio Riedel
- Joachim Breitner
- John Lato
- Kostiantyn Rybnikov
- M Farkas-Dyck
- Mario Blažević
- Michael Snoyman
- Neil Mitchell
- Niklas Larsson
- Sven Panne
- Yitzchak Gale