Abstract FilePath Proposal

Hello *,
What?
=====
We (see From: & CC: headers) propose, plain and simple, to turn the currently defined type-synonym
  type FilePath = String
into an abstract/opaque data type instead.
Why/How/When?
=============
For details (including motivation and a suggested transition scheme) please consult
  https://ghc.haskell.org/trac/ghc/wiki/Proposal/AbstractFilePath
Suggested discussion period: 4 weeks

On 06/26/2015 06:08 PM, Herbert Valerio Riedel wrote:
Hello *,
What? =====
[--snip--]
I am *entirely* behind this in principle and, if it doesn't break too much of Hackage, also in practice, but...
... how much of Hackage *does* this break?
The reason that I'm in favor in principle is that paths really *are* opaque things -- platforms have entirely different conventions. AFAICT the only thing that they seem to agree on is that there is a "hierarchy" of some sort. (And not much else, including such things as case (in-)sensitivity or character sets.)
For example, in POSIX they're just strings of bytes without any specified encoding, and I'd love it if they could be made to work like that when dealing with files in Haskell.
Regards,

On 2015-06-27 at 03:30:56 +0200, Bardur Arantsson wrote: [...]
I am *entirely* behind this in principle and, if it doesn't break too much of Hackage, also in practice, but...
... how much of Hackage *does* this break?
This won't be for free: I expect most packages which currently do more than just opaquely pass around FilePaths to require fixes. Some examples:
 - `writeFile "doo/foo.bar" ...`
   `_ <- readFile ("doo" </> "foo" <.> "bar")`
   This will break unless -XOverloadedStrings happens to be enabled
 - Unless we generalise (++) to (<>), all cases where `FilePath`s are concatenated via (++) will break.
 - Code that uses Data.List rather than the `filepath` package for FilePath manipulation will need fixups (simplest fix: explicitly convert to/from String for the manipulation)
 - Some code, like e.g.
     fnames <- System.Environment.getArgs
     forM fnames $ \fn -> print =<< readFile fn
   will inevitably need to insert explicit conversions to/from FilePaths
I tried to simulate the effect on Hackage, but this turned out to be more time-demanding than I hoped for and I had to abort. But the above is what I encountered in my attempt.
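The breakage cases listed above can be tried out with a stand-in type. This is only a minimal sketch: `OpaquePath`, `toOpaque` and `fromOpaque` are hypothetical names invented for the illustration, and the String-backed representation is an assumption, not something the proposal specifies.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- Hypothetical stand-in for an abstract FilePath. The String payload is
-- an assumption made only so that the sketch is self-contained.
newtype OpaquePath = OpaquePath String
  deriving (Eq, Show)

-- Explicit conversions between String and the opaque type.
toOpaque :: String -> OpaquePath
toOpaque = OpaquePath

fromOpaque :: OpaquePath -> String
fromOpaque (OpaquePath s) = s

-- With this instance string literals keep working, but only in modules
-- that enable OverloadedStrings.
instance IsString OpaquePath where
  fromString = toOpaque

-- A Semigroup instance is what lets (<>) stand in for (++).
instance Semigroup OpaquePath where
  OpaquePath a <> OpaquePath b = OpaquePath (a ++ b)

-- A literal used at the opaque type: rejected without OverloadedStrings.
literalPath :: OpaquePath
literalPath = "doo/foo.bar"

-- List-based manipulation has to round-trip through String explicitly.
dropExtensionNaive :: OpaquePath -> OpaquePath
dropExtensionNaive = toOpaque . takeWhile (/= '.') . fromOpaque

-- Values that arrive as String (e.g. from getArgs) need a conversion.
argsToPaths :: [String] -> [OpaquePath]
argsToPaths = map toOpaque
```

Removing the pragma makes `literalPath` fail to type-check, which is the first bullet in miniature; the other two functions show where the explicit conversions would end up in existing code.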
The reason that I'm in favor in principle is that paths really *are* opaque things -- platforms have entirely different conventions. AFAICT the only thing that they seem to agree on is that there is a "hierarchy" of some sort. (And not much else, including such things as case (in-)sensitivity or character sets.)
For example, in POSIX they're just strings of bytes without any specified encoding, and I'd love it if they could be made to work like that when dealing with files in Haskell.
Yes, if you look e.g. at http://hackage.haskell.org/package/unix you see a lot of API duplication, which wouldn't have been needed if FilePath was an opaque type w/ lossless conversion to/from ByteString.

Hi!
Instead of trying to minimally patch the existing API and still breaking loads of code, why not make a new API that doesn't have to compromise and deprecate the old one?
Niklas
----- Original message -----
From: "Herbert Valerio Riedel"

Because a new API already exists in libraries, but the FilePath from base is
still being used, which makes things worse (now your programs have all
those conversions all over).
I like the idea of a gradual deprecation warning, but it's not clear whether
it's feasible to implement.
On 27 June 2015 at 12:33, "Niklas Larsson" wrote:
[...]

Hi Niklas,
The function writeFile takes a FilePath. We could fork base or tell
everyone to use writeFile2, but in practice everyone will keep using
writeFile, and thus String for FilePath. This approach is the only thing we
could figure out that made sense.
Henning: we do not propose normalisation on initialisation. For ASCII
strings fromFilePath . toFilePath will be id. It might also be for Unicode
on some/all platforms. Of course, you can write your own FilePath creator
that does normalisation on construction.
Thanks, Neil
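The distinction between a non-normalising round trip and an opt-in normalising constructor can be sketched as follows. The names `OpaquePath`, `toOpaque`, `fromOpaque` and `toOpaqueNormalised` are hypothetical, the String payload is an assumption, and collapsing duplicate '/' is just one example of a normalisation someone might choose.

```haskell
-- Hypothetical stand-in for the opaque type; the String payload is an
-- assumption of this sketch.
newtype OpaquePath = OpaquePath String
  deriving (Eq, Show)

-- Plain constructor and accessor: no normalisation takes place, so
-- fromOpaque . toOpaque gives back its input unchanged.
toOpaque :: String -> OpaquePath
toOpaque = OpaquePath

fromOpaque :: OpaquePath -> String
fromOpaque (OpaquePath s) = s

-- Collapse runs of '/' into a single '/'. Deliberately simplistic.
collapseSeparators :: String -> String
collapseSeparators ('/' : rest@('/' : _)) = collapseSeparators rest
collapseSeparators (c : rest)             = c : collapseSeparators rest
collapseSeparators []                     = []

-- An opt-in constructor that normalises on construction, written on
-- top of the plain one rather than built into the type.
toOpaqueNormalised :: String -> OpaquePath
toOpaqueNormalised = toOpaque . collapseSeparators
```

Callers that want the exact spelling they were given use `toOpaque`; callers that prefer a canonical spelling pick `toOpaqueNormalised`, and neither choice is forced on the other.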
On Saturday, 27 June 2015, Niklas Larsson wrote:
[...]

Hi Neil,
why does the proposal *not* include normalization?
There are four advantages that I see to making FilePath a datatype:
1. it makes it possible to implement the correct semantics for some
systems (including POSIX),
2. it allows for information hiding, which in turn helps modularity,
3. the type is distinct from any other type, hence static checks are stronger,
4. it becomes possible to quotient values over some arbitrary set of
identities that makes sense. i.e. in the case of FilePath, arguably
"foo/bar//baz" *is* "foo/bar/baz" *is* "foo//bar/baz" for all intents
and purposes, so it is not useful to distinguish these three ways of
writing down the same path (and in fact in practice distinguishing
them leads to subtle bugs). That is, the Eq instance compares
FilePaths modulo a few laws.
Do you propose to forgo (4)? If so, why?
If we're going through a deprecation process, could we do so once, by
getting the notion of path equality we want right the first time?
Contrary to type indexing FilePath, it seems to me that the design
space for path identities is much smaller. Essentially, exactly the
ones here: https://hackage.haskell.org/package/filepath-1.1.0.2/docs/System-FilePath-Po....
Best,
Mathieu
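Point 4 can be prototyped in a few lines. This is a sketch under assumptions: `QuotientPath` and `canonical` are hypothetical names, the payload is a plain String, and the only law applied is "runs of '/' are equivalent to a single '/'".

```haskell
import Data.Function (on)

-- Hypothetical path type that remembers the spelling it was given.
newtype QuotientPath = QuotientPath String

-- Show the stored spelling, not the canonical one.
instance Show QuotientPath where
  show (QuotientPath s) = show s

-- Canonical representative used only for comparison: runs of '/' are
-- collapsed to a single '/'.
canonical :: QuotientPath -> String
canonical (QuotientPath s) = collapse s
  where
    collapse ('/' : rest@('/' : _)) = collapse rest
    collapse (c : rest)             = c : collapse rest
    collapse []                     = []

-- Equality modulo the "duplicate separators are irrelevant" law.
instance Eq QuotientPath where
  (==) = (==) `on` canonical
```

With this instance `QuotientPath "foo/bar//baz" == QuotientPath "foo//bar/baz"` evaluates to `True`, while the stored spelling is left as it was written. Whether such a law holds on every platform and filesystem is a separate question that the sketch does not settle.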
On 27 June 2015 at 12:12, Neil Mitchell wrote:
[...]

Hi,
I think it'd be more robust to handle normalisation when converting from
String/Text to FilePath (and combining things with (</>) and so on) rather
than in the underlying representation.
It's absolutely crucial that you can ask the OS for a filename (which it
gives you as a sequence of bytes) and then pass that exact same sequence of
bytes back to the OS without any normalisation or other useful alterations
having taken place.
You can do some deeply weird stuff in Windows by starting an absolute path
with \\?\, including apparently using '.' and '..' as the name of a
filesystem component:
Because it turns off automatic expansion of the path string, the "\\?\"
prefix also allows the use of ".." and "." in the path names, which can be
useful if you are attempting to perform operations on a file with these
otherwise reserved relative path specifiers as part of the fully qualified
path.
(from
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).a...
)
I don't fancy shaking all the corner cases out of this. An explicit
'normalise' function seems ok, but baking normalisation into the type
itself seems bad.
Cheers,
David
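An explicit normalise that stays away from `\\?\` paths might look roughly like this. Everything here is an assumption made for illustration: the names `isVerbatim`, `splitOn` and `normaliseExplicit` are hypothetical, paths are modelled as String rather than as the bytes the OS hands over, and dropping "." components is only a toy normalisation.

```haskell
import Data.List (intercalate, isPrefixOf)

-- True for Windows paths that start with the \\?\ prefix, which turns
-- off the automatic expansion of the path string.
isVerbatim :: String -> Bool
isVerbatim = ("\\\\?\\" `isPrefixOf`)

-- Split a string on a separator character, keeping empty components.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- A toy, opt-in normalisation: drop "." components from a
-- backslash-separated path. Verbatim paths are returned untouched,
-- because "." may be a real component name there.
normaliseExplicit :: String -> String
normaliseExplicit p
  | isVerbatim p = p
  | otherwise    = intercalate "\\" (filter (/= ".") (splitOn '\\' p))
```

Because the function is separate from the type, a path obtained from the OS can be passed back exactly as received simply by not calling it.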
On 28 June 2015 at 11:03, Boespflug, Mathieu wrote:
[...]

Worse, there are situations where you absolutely _have_ to be able to use
\\?\ encoding of a path on Windows to read, modify or delete files with
"impossible names" that were created by other means.
E.g. filenames like AUX, which had traditional roles under DOS and cause weird
interactions, or files that were created with "impossibly long names" -- which
can happen in the wild when you move directories around, etc.
I'm weakly in favor of the proposal precisely because it is the first
version of this concept that I've seen that DOESN'T try to get too clever
with regard to adding all sorts of normalization. This proposal seems
to be the simplest move that would enable us to do something correctly in
the future, regardless of what that correct thing winds up being.
-Edward
On Sun, Jun 28, 2015 at 8:09 AM, David Turner wrote:
[...]

Normalization is a very hairy issue, which is not just platform specific
but also filesystem specific. Mac OS X is probably the worst of all worlds
in that respect, where HFS+ will do NFD normalization and may or may not
have case sensitivity depending on how that partition was formatted.
Network file shares and disk images may or may not have case sensitivity
and can use either NFD or NFC normalization based on mount options.
Contrary to statements earlier in the thread, NFD normalization happens on
HFS+ filesystems (the default) regardless of whether you're using POSIX
APIs or not. It's easy to prove this to yourself by creating a file with
U+00e9 (LATIN SMALL LETTER E WITH ACUTE) in the name (from any of the APIs)
and you'll see it come back out (e.g. from readdir) as two code points: 'e'
and then U+0301 (COMBINING ACUTE ACCENT). It'll also do some weird
transformations to file names that contain byte sequences that are not
valid UTF-8.
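The practical consequence for a String-based FilePath can be seen without touching a filesystem. The sketch below only compares the two spellings of "é" as Haskell strings; the names are made up for the illustration and no normalisation library is involved.

```haskell
-- "é" as a single precomposed code point, U+00E9 (the NFC spelling).
precomposed :: String
precomposed = "\x00E9"

-- "é" as 'e' followed by U+0301 COMBINING ACUTE ACCENT (the NFD
-- spelling).
decomposed :: String
decomposed = "e\x0301"

-- Plain String equality compares code points, so the two spellings of
-- the same visible name are different values.
sameAsStrings :: Bool
sameAsStrings = precomposed == decomposed

-- Number of code points in each spelling.
lengths :: (Int, Int)
lengths = (length precomposed, length decomposed)
```

Here `sameAsStrings` is `False` and `lengths` is `(1, 2)`, so a name written in one form and read back in the other no longer compares equal to the original.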
On Sun, Jun 28, 2015 at 12:05 PM, Edward Kmett wrote:
[...]

On 06/27/2015 11:33 AM, Niklas Larsson wrote:
Hi!
Instead of trying to minimally patch the existing API and still breaking loads of code, why not make a new API that doesn't have to compromise and deprecate the old one?
This is a good idea in theory, but it's how we end up in situations like https://xkcd.com/927/ :)
Regards,

On Fri, Jun 26, 2015 at 9:08 AM, Herbert Valerio Riedel
We (see From: & CC: headers) propose, plain and simple, to turn the currently defined type-synonym
type FilePath = String
into an abstract/opaque data type instead.
Why/How/When?
I've had success with a slightly different "How":
Phase 1: Replace FilePath with a type class, with instances for the old FilePath (i.e. String) and the new implementation.
Phase 2: Wait until a suitable amount of hackage builds without the string instance.
Phase 3: Deprecate the String instance - move it to an old-filepath package.
Phase 4: Replace the type class with the new implementation
This way the new implementation is available immediately, packages can begin converting at once, benefits can be assessed.

On 2015-06-27 at 14:56:33 +0200, David Fox wrote: [...]
I've had success with a slightly different "How":
What was your concrete use-case btw?
Phase 1: Replace FilePath with a type class, with instances for the old FilePath (i.e. String) and the new implementation.
what would that comprise in the FilePath case? I assume adding a transitional class whose methods are not exposed (and whose typeclass name is exported from some GHC-specific internal-marked module)? i.e.
    class IsFilePath a where
      privateToFilePath   :: a -> FilePath
      privateFromFilePath :: FilePath -> a
    instance IsFilePath FilePath where
      privateToFilePath   = id
      privateFromFilePath = id
    instance IsFilePath [Char] where
      privateToFilePath   = System.IO.toFilePath
      privateFromFilePath = System.IO.fromFilePath
? as well as changing a lot of type-sigs in base & filepath from e.g.
    writeFile    :: FilePath -> String -> IO ()
    openTempFile :: FilePath -> String -> IO (FilePath, Handle)
to
    writeFile    :: IsFilePath a => a -> String -> IO ()
    openTempFile :: IsFilePath a => a -> String -> IO (a, Handle)
?
Phase 2: Wait until a suitable amount of hackage builds without the string instance.
I can see Stackage helping with that by using a custom GHC which lacks the legacy `IsFilePath [Char]`-instance. So I'd be optimistic that Phase2 could be accomplished within one year for the Stackage-subset of Hackage.
Phase 3: Deprecate the String instance - move it to an old-filepath package.
Phase 4: Replace the type class with the new implementation
I assume this means getting rid again of the typeclass, and changing the type-sigs back to i.e.
    writeFile    :: FilePath -> String -> IO ()
    openTempFile :: FilePath -> String -> IO (FilePath, Handle)
(but now with the new opaque `FilePath`)?
This way the new implementation is available immediately, packages can begin converting at once, benefits can be assessed.
This scheme seems feasible at first glance, as long as the typeclass doesn't start spreading across packages and find its way into type-sigs (in which case it'd become more disruptive to get rid of it again). Otoh, I'm not sure (assuming I understood how your scheme works) it can be avoided to have the typeclass spread, since if not every API that now has `FilePath` arguments in their type-sigs gets generalised to have `IsFilePath a => a` arguments instead, we can't reach the goal of "Phase 2". But I suspect that I didn't fully understand how your proposed transition scheme works exactly... so please correct me where I got it wrong! Cheers, hvr
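To make the transitional-class idea discussed above concrete, here is a minimal sketch. It is only an illustration under stated assumptions: `AbstractFilePath` is a hypothetical stand-in for the new opaque type (its real representation is left open in this thread), and `writeFileP` is a made-up name for a generalised `writeFile`. Nothing here is the proposal's actual API.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import qualified System.IO as IO

-- Hypothetical stand-in for the proposed opaque type. Wrapping a String
-- is an assumption made only to keep the sketch short.
newtype AbstractFilePath = AbstractFilePath String
  deriving (Eq, Show)

-- Transitional class: in the scheme sketched in the thread, the methods
-- would stay private so that users cannot depend on them.
class IsFilePath a where
  privateToFilePath   :: a -> AbstractFilePath
  privateFromFilePath :: AbstractFilePath -> a

-- The new type is trivially a file path.
instance IsFilePath AbstractFilePath where
  privateToFilePath   = id
  privateFromFilePath = id

-- Legacy instance for String; this is the one Phase 3 would deprecate.
instance IsFilePath [Char] where
  privateToFilePath                       = AbstractFilePath
  privateFromFilePath (AbstractFilePath s) = s

-- A generalised writeFile accepting either representation.
writeFileP :: IsFilePath a => a -> String -> IO ()
writeFileP path contents =
  case privateToFilePath path of
    AbstractFilePath raw -> IO.writeFile raw contents
```

With this in scope, both `writeFileP "out.txt" "hi"` and `writeFileP (AbstractFilePath "out.txt") "hi"` type-check, which is the property Phase 1 relies on. It also shows the concern raised above: every signature that mentions a path has to carry the `IsFilePath a =>` constraint for Phase 2 to be measurable.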

On Sat, Jun 27, 2015 at 6:37 AM, Herbert Valerio Riedel
This scheme seems feasible at first glance, as long as the typeclass doesn't start spreading across packages and find its way into type-sigs (in which case it'd become more disruptive to get rid of it again). Otoh, I'm not sure (assuming I understood how your scheme works) it can be avoided to have the typeclass spread, since if not every API that now has `FilePath` arguments in their type-sigs gets generalised to have `IsFilePath a => a` arguments instead, we can't reach the goal of "Phase 2".
But I suspect that I didn't fully understand how your proposed transition scheme works exactly... so please correct me where I got it wrong!
You are right, your approach is more appropriate for use by a community. I missed some of the problems that would arise.

+1.
-- Alexander alexander@plaimi.net https://secure.plaimi.net/~alexander

I think this proposal is currently underspecified. For example, it's
not clear to me what the semantics of a FilePath are. I have the
feeling that `toFilePath` should return a Maybe, for example, but it's
hard to say without knowing what it's converting to, exactly.
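As an illustration of what a partial conversion might look like, here is a small sketch. The names `AbstractFilePath` and `toFilePathMaybe` are hypothetical, and the validity rule (reject any string containing NUL, which a POSIX file name cannot contain) is an assumption chosen only to have something concrete to check; the proposal itself does not specify the semantics.

```haskell
-- Hypothetical opaque path type; the representation is an assumption.
newtype AbstractFilePath = AbstractFilePath String
  deriving (Eq, Show)

-- Partial conversion: returns Nothing for strings that cannot name a
-- file under the assumed rule (a NUL character anywhere in the string).
toFilePathMaybe :: String -> Maybe AbstractFilePath
toFilePathMaybe s
  | '\NUL' `elem` s = Nothing
  | otherwise       = Just (AbstractFilePath s)
```

Whether a `Maybe` result is appropriate depends on exactly the open question raised here: what the target representation is and which strings it can hold.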
I also worry about the immense breakage this will cause. This is not
just an issue of causing a lot of work for maintainers, but also of
lots of unmaintained libraries, printed code etc breaking. I feel that
there is not enough gain in this proposal relative to the amount of
breakage.
Has any thought been given to introduce new modules for this type, and
leave the old ones in place?
Erik

Regarding underspecified: I think that's appropriate at this phase. The
main proposal is: make FilePath an abstract type. It will take multiple
GHC releases before we switch over fully, with plenty of time to hash out
details of how the filepath package should work, and the opportunity to
experiment with different wrappers around a core abstract type.
Having used an alternate FilePath type for a while (via system-filepath), I
can say that it doesn't give the same benefit of just fixing the central
FilePath type. Having to convert between types all over the place is
tedious, defeats a lot of the performance benefits we're going for, and
hurts type safety.
As someone who typically is very much opposed to breaking changes in core
libraries: I think this one is well worth it.

On Mon, Jun 29, 2015 at 10:46 AM, Michael Snoyman
Regarding underspecified: I think that's appropriate at this phase. The main proposal is: make FilePath an abstract type. It will take multiple GHC releases before we switch over fully, with plenty of time to hash out details of how the filepath package should work, and the opportunity to experiment with different wrappers around a core abstract type.
But changing the semantics of an established newtype is very tricky business, since the resulting breakage won't be indicated by the types!
Having used an alternate FilePath type for a while (via system-filepath), I can say that it doesn't give the same benefit of just fixing the central FilePath type. Having to convert between types all over the place is tedious, defeats a lot of the performance benefits we're going for, and hurts type safety.
Why would you have to convert 'all over the place'? If the alternative library also provides the basic IO functions, the only places you'd have to convert are interfaces with other libraries, and things from e.g. config file, both of which don't happen a lot.
As someone who typically is very much opposed to breaking changes in core libraries: I think this one is well worth it.
Do you have any insight in the amount of breakage this will cause? I have a gut feeling that it's a lot more than any of the previous changes we've had, and those have already caused a lot of grumbling. But the only way to be sure is to run the builds on hackage (or stackage, but that's a smaller sample size). Erik

On Mon, Jun 29, 2015 at 12:07 PM Erik Hesselink
On Mon, Jun 29, 2015 at 10:46 AM, Michael Snoyman wrote:
Regarding underspecified: I think that's appropriate at this phase. The main proposal is: make FilePath an abstract type. It will take multiple GHC releases before we switch over fully, with plenty of time to hash out details of how the filepath package should work, and the opportunity to experiment with different wrappers around a core abstract type.
But changing the semantics of an established newtype is very tricky business, since the resulting breakage won't be indicated by the types!
My suggestion isn't to roll out one breaking change and then another silent semantics change later. Rather, my point is: getting FilePath to be an abstract type is the meat of the proposal, and what we need to agree on. Working out the exact semantics of how the filepath package interacts with that is important, but not urgent. Let's get to an agreement that an abstract type is an improvement, and then we can figure out exactly how it should behave. After all, we'll have about 2 years to figure that out.
Having used an alternate FilePath type for a while (via system-filepath), I can say that it doesn't give the same benefit of just fixing the central FilePath type. Having to convert between types all over the place is tedious, defeats a lot of the performance benefits we're going for, and hurts type safety.
Why would you have to convert 'all over the place'? If the alternative library also provides the basic IO functions, the only places you'd have to convert are interfaces with other libraries, and things from e.g. config file, both of which don't happen a lot.
By having two different types, we know that not everyone will convert over. In fact, the very argument for having two types is so that not everyone will need to convert. Especially if Prelude continues to export the current `type FilePath = [Char]`, it will be difficult to get all libraries to use the new type.
As someone who typically is very much opposed to breaking changes in core libraries: I think this one is well worth it.
Do you have any insight in the amount of breakage this will cause? I have a gut feeling that it's a lot more than any of the previous changes we've had, and those have already caused a lot of grumbling. But the only way to be sure is to run the builds on hackage (or stackage, but that's a smaller sample size).
I agree, this is going to be a big one. It does not lend itself to elegant migrations like FTP did, for instance. But the scope of the current problem is also large, which is why I believe this breakage is warranted. Doing it gradually with a deprecation plan will hopefully make it possible for us to make it as easy as possible.

One tiny amendment to a comment(!) in the non-normative(!) code in Phase 3:
data WindowsFilePath = WFP ByteArray# -- UTF16 data
If a Windows file path is valid UTF-16 then it is displayed as such in the
GUI, but if not it's still a legal file path. It really is just wchar_t[]
data:
data WindowsFilePath = WFP ByteArray# -- wchar_t[] data as passed to syscalls
This seems to be the source of some confusion.
Cheers,
David
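The distinction drawn above, between "valid UTF-16" and "any array of 16-bit units", can be sketched as a small check. This is only an illustration: `isValidUtf16` is a hypothetical helper, and a plain list of `Word16` stands in for the `ByteArray#` payload.

```haskell
import Data.Word (Word16)

-- Surrogate ranges as defined by UTF-16.
isHighSurrogate, isLowSurrogate :: Word16 -> Bool
isHighSurrogate w = w >= 0xD800 && w <= 0xDBFF
isLowSurrogate  w = w >= 0xDC00 && w <= 0xDFFF

-- True when every surrogate belongs to a well-formed high/low pair.
isValidUtf16 :: [Word16] -> Bool
isValidUtf16 [] = True
isValidUtf16 (w : ws)
  | isHighSurrogate w = case ws of
      (l : rest) | isLowSurrogate l -> isValidUtf16 rest
      _                             -> False
  | isLowSurrogate w  = False
  | otherwise         = isValidUtf16 ws
```

A sequence such as `[0x0061, 0xD800]` fails this check because of the unpaired surrogate. By the argument in this message it can still be a legal Windows file path, which is why the comment describes the payload as `wchar_t[]` data and not as UTF-16.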

+1 for the first two phases of the original proposal. I always wished it was
not a type alias.
No strong opinion on phase 3; I have probably never run into sophisticated
enough issues to fully get the picture... but I doubt we'll be able to
craft an ideal cross-platform API. I like the spirit of the original
proposal.
-- *Λ\ois* http://twitter.com/aloiscochard http://github.com/aloiscochard

Hi David,
One tiny amendment to a comment(!) in the non-normative(!) code in Phase 3:
data WindowsFilePath = WFP ByteArray# -- UTF16 data
If a Windows file path is valid UTF-16 then it is displayed as such in the GUI, but if not it's still a legal file path. It really is just wchar_t[] data.
Thanks for bringing this up. It's tricky - I think in practice:
    toFilePath x = WFP (encodeStringAsUTF16 x)
But the data in WFP will be treated as UCS2 (aka wchar_t) when passing to the API calls, so it's really both. On Windows NT it really was UCS2, but on Win 7 it's always treated as UTF16 in the GUI, so that seems to be consistent with what people expect and ensures we don't throw away information when converting to/from FilePath. Given it seems you are quite knowledgeable in this area, please shout if that seems misguided!
To all the people who are worried about breakage, I can guarantee this will cause breakage. It's a sad fact, and certainly the main negative to this proposal. I was on the fence initially when hvr suggested this change to me, but was convinced by performance and correctness. Whether the Haskell community as a whole thinks that makes it worth it is why it's a proposal. If anything, I'm concerned by the lack of people saying -1, please don't break my code...
Thanks, Neil
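For readers unfamiliar with the encoding step mentioned above, here is a minimal sketch of what a function like `encodeStringAsUTF16` could do. The name comes from the message; the body below is an assumption for illustration, producing a list of `Word16` code units where the real code would presumably fill a `ByteArray#`.

```haskell
import Data.Char (ord)
import Data.Word (Word16)

-- Encode a String as UTF-16 code units.
encodeStringAsUTF16 :: String -> [Word16]
encodeStringAsUTF16 = concatMap encodeChar

-- Code points below 0x10000 become one unit; this includes lone
-- surrogate code points, which a Haskell Char is able to hold.
-- Everything else becomes a high/low surrogate pair.
encodeChar :: Char -> [Word16]
encodeChar c
  | n < 0x10000 = [fromIntegral n]
  | otherwise   = [ fromIntegral (0xD800 + m `div` 0x400)
                  , fromIntegral (0xDC00 + m `mod` 0x400) ]
  where
    n = ord c
    m = n - 0x10000
```

For example, `encodeStringAsUTF16 "a\128512"` yields `[0x61, 0xD83D, 0xDE00]`. Because single units are passed through unchanged, no information is dropped on the way in, which matches the stated goal of not throwing information away when converting to and from FilePath.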

On Tue, Jun 30, 2015 at 11:25 AM, Neil Mitchell
To all the people who are worried about breakage, I can guarantee this will cause breakage. It's a sad fact, and certainly the main negative to this proposal. I was on the fence initially when hvr suggested this change to me, but was convinced by performance and correctness. Whether the Haskell community as a whole thinks that makes it worth it is why it's a proposal. If anything, I'm concerned by the lack of people saying -1, please don't break my code...
I'm not convinced by the performance argument. Most people don't need performance from the small amount of FilePath usage they have. Those who do can switch to a different package. Now correctness would be a good argument, but this proposal doesn't really add that much in that respect, it seems. I'm still on the fence, but leaning towards -1, but I'm not saying please don't break my code. My code will be fine, I'm around to fix it. I'm more worried about other people's code (that I might rely on), maintainers that have left, or aren't that responsive, newcomers reading old tutorials, people getting angry about needing more CPP/fixing more code on new GHC releases, etc. We're still breaking code on every new GHC release, and it seems the amount of breakage is only increasing. Erik

In an ideal world, FilePath would be an abstract type. I think nearly
everyone can agree on that.
However, it seems every major ghc release includes some major breaking
changes. I've spent a lot of time fixing the fallout from them, and this
looks much more significant than any we've had in years.
In particular, I'm quite scared that people attempted to gauge the fallout
by building hackage, but it was too much work. Also consider that private
codebases are likely to be impacted significantly (at least the ones I've
seen will be).
I think it's likely this will cause a major break in the ecosystem, with
most packages only supporting old or new style FilePath.
I guess my point is, I don't think this proposal should go ahead unless
there's significant buy-in from the community (not merely silence or a
small majority in favor). I'm not doing much Haskell these days so I'm
pretty neutral on it.
John L.

So this goes back to a valid question:
What fraction of currently buildable Hackage breaks with such an API
change, and how complex will fixing those breaks be?
This should be evaluated. And to what extent can the appropriate
migrations be mechanically assisted?
Would some of this breakage be mitigated by changing ++ to be monoid or
semigroup merge?
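A rough sketch of the idea behind that last question: if the opaque type has `Semigroup` and `Monoid` instances, code that concatenates paths can use `<>` where it used `++` before. The names `AbstractFilePath` and `addExtensionP` are hypothetical, and the sketch uses `<>` because the Prelude's `++` only works on lists; generalising `++` itself would be a separate change to base.

```haskell
-- Hypothetical opaque path type; the representation is an assumption.
newtype AbstractFilePath = AbstractFilePath String
  deriving (Eq, Show)

-- Concatenation of the underlying data, mirroring what ++ does today.
instance Semigroup AbstractFilePath where
  AbstractFilePath a <> AbstractFilePath b = AbstractFilePath (a ++ b)

instance Monoid AbstractFilePath where
  mempty = AbstractFilePath ""

-- Old style would be: base ++ "." ++ ext
addExtensionP :: AbstractFilePath -> AbstractFilePath -> AbstractFilePath
addExtensionP base ext = base <> AbstractFilePath "." <> ext
```

This only covers code that appends paths. Code that pattern-matches on characters, or passes a path to a function expecting `String`, would still need an explicit conversion.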
On Friday, July 3, 2015, John Lato
In an ideal world, FilePath would be an abstract type. I think nearly everyone can agree on that.
However, it seems every major ghc release includes some major breaking changes. I've spent a lot of time fixing the fallout from them, and this looks much more significant than any we've had in years.
In particular, I'm quite scared that people attempted to gauge the fallout by building hackage, but it was too much work. Also consider that private codebases are likely to be impacted significantly (at least the ones I've seen will be).
I think it's likely this will cause a major break in the ecosystem, with most packages only supporting old or new style FilePath.
I guess my point is, I don't think this proposal should go ahead unless there's significant buy-in from the community (not merely silence or a small majority in favor). I'm not doing much Haskell these days so I'm pretty neutral on it.
John L.
On Tue, Jun 30, 2015, 2:25 AM Neil Mitchell
javascript:_e(%7B%7D,'cvml','ndmitchell@gmail.com');> wrote: Hi David,
One tiny amendment to a comment(!) in the non-normative(!) code in Phase 3:
data WindowsFilePath = WFP ByteArray# -- UTF16 data
If a Windows file path is valid UTF-16 then it is displayed as such in the GUI, but if not it's still a legal file path. It really is just wchar_t[] data.
Thanks for bringing this up. It's tricky - I think in practice:
toFilePath x = WPF (encodeStringAsUTF16 x)
But the data in WPF will be treated as UCS2 (aka wchar_t) when passing to the API calls, so it's really both. While on Windows NT it really was UCS2, but Win 7 it's always treated as UTF16 in the GUI, so that seems to be consistent with what people expect and ensures we don't throw away information when converting to/from FilePath. Given it seems you are quite knowledgeable in this area, please shout if that seems misguided!
To all the people who are worried about breakage, I can guarantee this will cause breakage. It's a sad fact, and certainly the main negative to this proposal. I was on the fence initially when hvr suggested this change to me, but was convinced by performance and correctness. Whether the Haskell community as a whole thinks that makes it worth it is why it's a proposal. If anything, I'm concerned by the lack of people saying -1, please don't break my code...
Thanks, Neil

2015-07-04 4:28 GMT+02:00 Carter Schonwald:
[...] What fraction of currently buildable Hackage breaks with such an API change, and how complex will fixing those breaks be? [...]
I think it is highly irrelevant how complex fixing the breakage is, it will probably almost always be trivial, but that's not the point: Think e.g. about a package which didn't really need any update for a few years, its maintainer is inactive (nothing to do recently, so that's OK), and which is a transitive dependency of a number of other packages. This will effectively mean lots of broken packages for weeks or even longer. Fixing breakage from the AMP or FTP proposals was trivial, too, but nevertheless a bit painful.
This should be evaluated. And to what extent can the appropriate migrations be mechanically assisted? Would some of this breakage be mitigated by changing ++ to be monoid or semigroup merge?
To me the fundamental question which should be answered before any detail question is: Should we go on and continuously break minor things (i.e. basically give up any stability guarantees) or should we collect a bunch of changes first (leaving vital things untouched for that time) and release all those changes together, in longer intervals? That's IMHO a tough question which we somehow avoided to answer up to now. I would like to see a broader discussion like this first, both approaches have their pros and cons, and whatever we do, there should be some kind of consensus behind it. Cheers, S. P.S.: Just for the record: I'm leaning towards the "lots-of-changes-after-a-longer-time" approach, otherwise I see a flood of #ifdefs and tons of failing builds coming our way... :-P

On Sat, Jul 4, 2015 at 3:26 PM, Sven Panne wrote:
To me the fundamental question which should be answered before any detail question is: Should we go on and continuously break minor things (i.e. basically give up any stability guarantees) or should we collect a bunch of changes first (leaving vital things untouched for that time) and release all those changes together, in longer intervals? That's IMHO a tough question which we somehow avoided to answer up to now. I would like to see a broader discussion like this first, both approaches have their pros and cons, and whatever we do, there should be some kind of consensus behind it.
I recall suggesting something along the lines of stable vs. research ghc releases a few months back. This seems like it would fit in fairly well; the problem is getting buy-in from certain parts of the ecosystem that seem to prefer to build production-oriented packages from research/"unstable" releases. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 07/04/2015 09:38 PM, Brandon Allbery wrote:
On Sat, Jul 4, 2015 at 3:26 PM, Sven Panne wrote:
To me the fundamental question which should be answered before any detail question is: Should we go on and continuously break minor things (i.e. basically give up any stability guarantees) or should we collect a bunch of changes first (leaving vital things untouched for that time) and release all those changes together, in longer intervals? That's IMHO a tough question which we somehow avoided to answer up to now. I would like to see a broader discussion like this first, both approaches have their pros and cons, and whatever we do, there should be some kind of consensus behind it.
I recall suggesting something along the lines of stable vs. research ghc releases a few months back. This seems like it would fit in fairly well; the problem is getting buy-in from certain parts of the ecosystem that seem to prefer to build production-oriented packages from research/"unstable" releases.
But isn't that effectively just the same as saying: "In our organization we'll be staying with GHC 7.8.x until GHC 7.12.x comes out"? (Or similar, I'm sure you get the point.) Yes, the rest of the ecosystem may move along and use the latest new shiny, but then you can always use the packages that worked with GHC 7.8.x thanks to version ranges. Am I missing something? Regards,

On Sat, Jul 4, 2015 at 6:17 PM, Bardur Arantsson wrote:
Yes, the rest of the ecosystem may move along and use the latest new shiny, but then you can always use the packages that worked with GHC 7.8.x thanks to version ranges.
Am I missing something?
Updates needed to fix e.g. security issues, which otherwise might not be backported if others are staying close to current. This is why Stackage has both LTS and Nightly; LTS only works if there's a *commitment* to it, at the level of the community for a community resource or at the level of the provider for something like ghc or Stackage. Note that GHC HQ's response was that they have had problems finding people to keep multiple versions active at the same time; it's a significant job given that backporting (say) a fix to a type system issue allowing unexpectedly unsafe code (say, https://ghc.haskell.org/trac/ghc/ticket/9858) can mean a complete redesign of the patch, if the one in HEAD relies on other changes that can't be sensibly backported. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 07/05/2015 12:27 AM, Brandon Allbery wrote:
On Sat, Jul 4, 2015 at 6:17 PM, Bardur Arantsson wrote:
Yes, the rest of the ecosystem may move along and use the latest new shiny, but then you can always use the packages that worked with GHC 7.8.x thanks to version ranges.
Am I missing something?
Updates needed to fix e.g. security issues, which otherwise might not be backported if others are staying close to current. This is why Stackage has both LTS and Nightly; LTS only works if there's a *commitment* to it, at the level of the community for a community resource or at the level of the provider for something like ghc or Stackage.
How often have security issues with GHC (or the base libraries) itself been a problem? (In practice, I mean.) In my hypothetical scenario, there's nothing to prevent a release of GHC 7.8.(x+1) while GHC 7.12.x is the new thing. Nor does anything prevent library releases of my-library-1.2.x (security patch) while my-library-1.6.x is the hot new thing.
Note that GHC HQ's response was that they have had problems finding people to keep multiple versions active at the same time; it's a significant job given that backporting (say) a fix to a type system issue allowing unexpectedly unsafe code (say, https://ghc.haskell.org/trac/ghc/ticket/9858) can mean a complete redesign of the patch, if the one in HEAD relies on other changes that can't be sensibly backported.
Yes, there's a man-power problem... but that isn't going to be solved unless some people/companies step up to the plate. Preferably the people who are actually using/relying on those old versions. This is no different from e.g. RHEL/Ubuntu LTS/Debian in the Linux world. (Well, except RHEL actually has a revenue stream that means that they can and do support old versions of various things.) Regards,

On Sun, Jul 5, 2015 at 2:25 PM, Bardur Arantsson wrote:
How often have security issues with GHC (or the base libraries) itself been a problem? (In practice, I mean.)
Not that often, but consider one real example: aeson was found to have a DDoS bug which was fixed by making it depend on a package which IIRC needed a newer base, so the fix couldn't be backported to versions of aeson compatible with older base. The necessary fix for those would have been substantially more complicated. (There are other examples, but the primary one that actually involves something shipped with ghc is never going to be fixed until it destroys someone's system, and I bet even then we'll get another load of HOMG MUST NEVER CHANGE API ONLY DOCUMENT AS BAD from the maintainer. I'm still waiting for one of the Linux distributions to notice and CVE it.) -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 07/05/2015 08:40 PM, Brandon Allbery wrote:
On Sun, Jul 5, 2015 at 2:25 PM, Bardur Arantsson wrote:
How often have security issues with GHC (or the base libraries) itself been a problem? (In practice, I mean.)
Not that often, but consider one real example: aeson was found to have a DDoS bug which was fixed by making it depend on a package which IIRC needed a newer base, so the fix couldn't be backported to versions of aeson compatible with older base. The necessary fix for those would have been substantially more complicated.
(There are other examples, but the primary one that actually involves something shipped with ghc is never going to be fixed until it destroys someone's system, and I bet even then we'll get another load of HOMG MUST NEVER CHANGE API ONLY DOCUMENT AS BAD from the maintainer. I'm still waiting for one of the Linux distributions to notice and CVE it.)
Oh, yeah, that's a valid point... but is this something that should drive design? Further, I don't think the aeson DDoS problem was predicated on an old/obsolete "base" library? Maybe I'm wrong about that, and I'm sure y'all will be happy to point out where and why. :) Regards,

On Sun, Jul 5, 2015 at 3:27 PM, Bardur Arantsson wrote:
Further, I don't think the aeson DDoS problem was predicated on an old/obsolete "base" library? Maybe I'm wrong about that, and I'm sure y'all will be happy to point out where and why. :)
I may be misremembering, but I thought one of the complaints about aeson adding scientific to its dependencies (to fix the DDoS) was that it required a newer ghc? -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 04/07/2015 at 21:26:31 +0200, Sven Panne wrote:
To me the fundamental question which should be answered before any detail question is: Should we go on and continuously break minor things (i.e. basically give up any stability guarantees) or should we collect a bunch of changes first (leaving vital things untouched for that time) and release all those changes together, in longer intervals? That's IMHO a tough question which we somehow avoided to answer up to now. I would like to see a broader discussion like this first, both approaches have their pros and cons, and whatever we do, there should be some kind of consensus behind it.
Potentially we ought to await Backpack [0], which should make such transitions easier.
[0] http://plv.mpi-sws.org/backpack/
participants (20)
- Alexander Berntsen
- Alois Cochard
- Bardur Arantsson
- Bob Ippolito
- Boespflug, Mathieu
- Brandon Allbery
- Carter Schonwald
- David Fox
- David Turner
- Edward Kmett
- Erik Hesselink
- Herbert Valerio Riedel
- Herbert Valerio Riedel
- John Lato
- Kostiantyn Rybnikov
- M Farkas-Dyck
- Michael Snoyman
- Neil Mitchell
- Niklas Larsson
- Sven Panne