
We seem to be circling ever closer to consensus here! Yay!
Indeed! Good :-)
However, I’m not getting the bit about API changing vs non-API changing.
Firstly I don’t know which APIs are intended. The GHC API is essentially GHC itself, so it changes daily. Maybe you mean the base package? Or what?
I suspect you mean that a “non-API-changing” release absolutely guarantees to compile any package that compiled with the previous version. If so, that is a very strong constraint indeed. We do observe it for patch releases for GHC (e g 7.6.2 should compile anything that 7.6.1 compiles). But I think it would be difficult to guarantee for anything beyond a patch release. Every single commit (and the commit rate is many/day) would have to be evaluated against this criterion. And if it failed the criterion, it would have to go on a API-breaking HEAD. In effect we’d have two HEADs. I can’t see us sustaining this. And I don’t yet really see why it’s necessary. If you don’t want an API-breaking change, stick with the patch releases.
So, we have a channel for non-API-breaking changes already: the patch releases. So that means we already have all three channels!
· Haskell Platform
· Patch-level releases
· New releases
if that’s so, all we need is better signposting. And I’m all for that!
Have I got this right?
Simon
From: Mark Lentczner [mailto:mark.lentczner@gmail.com]
Sent: 09 February 2013 17:48
To: Simon Marlow; Manuel M T Chakravarty; Johan Tibell; Simon Peyton-Jones; Mark Lentczner; andreas.voellmy@gmail.com; Carter Schonwald; kostirya@gmail.com; Edsko de Vries; ghc-devs@haskell.org; glasgow-haskell-users
Subject: Re: GHC 7.8 release?
We seem to be circling ever closer to consensus here! Yay!
I think the distinction of non-API breaking and API breaking release is very important. Refining SPJ's trifecta:
Haskell Platform comes out twice a year. It is based on very stable version of GHC, and intention is that people can just assume things on Hackage work with it. These are named for the year and sequence of the release: 2013.2, 2013.2.1, 2013.4,...
Non-API breaking releases can come out as often as desired. However, the version that is current as of mid-Feb. and mid-Aug. will be the ones considered for HP inclusion. By non-API breaking we mean the whole API surface including all the libraries bundled with GHC, as well as the operation of ghc, cabal, ghc-pkg, etc. Additions of features that must be explicitly enabled are okay. Additions of new APIs into existing modules are discouraged: Much code often imports base modules wholesale, and name clashes could easily result. These should never bump the major revision number: 7.4.1, 7.4.2...
API breaking releases happen by being released into a separate channel when ready for library owners to look at them. This channel should probably go through several stages: Ready for core package owners to work with, then HP package owners, then all package owners. I'd imagine this is a several month process. At the end of which, the release can go into the main channel. Such a merge shouldn't happen more than once a year... I think even once every two years is fine (!) To avoid confusion, I'd suggest that while in the separate channel, these release be named with odd number: 7.9, 7.11,..., and when moved to the main channel renamed to even: 7.10, 7.12...
This idea of three channels needs to be much more clearly communicated. The warning on the download page is a failure: Googling "ghc" takes you to the home page of GHC which immediately trumpets the "Lastest News" of a release of GHC 7.6.2. Once a user has read that and decided to download, then "STOP!" box is a) going to be skipped as they scan for the download link, and b) if read and followed, causes the "WTF? Why is HP so back rev?" So we need to change the front page so that the three channels are clearly communicated and targeted at the right users.
- Mark
(BTW: The first few links on the GHC web site are out of date: The second nav link is to a survey that is 7 years old. The License page is 8 years out of date. The FAQ is over a year old.)
On Sat, Feb 9, 2013 at 8:24 AM, Ian Lynagh
As a straw man, let's suppose we want to do annual API releases in September, with intermediate non-API releases in February.
That's a non-API release 5 months after the API release. 6.10.2 was 5 months after 6.10.1 (.3 was 1 month later, .4 a further 2) 6.12.2 was 4 months after 6.12.1 (.3 was 2 months later) 7.0.2 was 3.5 months after 7.0.1 (.3 was 1 month later, .4 a further 3) 7.2.2 was 3 months after 7.2.1 7.4.2 was 4 months after 7.4.1 7.6.2 was 4.5 months after 7.6.2 so if we do non-API releases, then perhaps it would make sense to stop doing minor releases (unless a release turns out to just be broken). Thanks Ian

On 10/02/13 15:36, Simon Peyton-Jones wrote:
We seem to be circling ever closer to consensus here! Yay!
Indeed! Good :-)
However, I’m not getting the bit about API changing vs non-API changing.
Firstly I don’t know which APIs are intended. The GHC API is essentially GHC itself, so it changes daily. Maybe you mean the base package? Or what?
I suspect you mean that a “non-API-changing” release absolutely guarantees to compile any package that compiled with the previous version. If so, that is a very strong constraint indeed. We do observe it for patch releases for GHC (e g 7.6.2 should compile anything that 7.6.1 compiles). But I think it would be difficult to guarantee for anything beyond a patch release. Every single commit (and the commit rate is many/day) would have to be evaluated against this criterion. And if it failed the criterion, it would have to go on a API-breaking HEAD. In effect we’d have two HEADs. I can’t see us sustaining this. And I don’t yet really see why it’s necessary. If you don’t want an API-breaking change, stick with the patch releases.
So, we have a channel for non-API-breaking changes already: the patch releases. So that means we already have all three channels!
Mark is asking for major GHC releases every year at the most, preferably less frequently. That means major GHC releases in the sense that we do them now, where libraries change, and a wave of package updates are required to get everything working. Johan, Manuel and Carter are saying that they want releases that add features but don't break code, i.e. a non-API-breaking release, as a way to get the new bits into the hands of the punters sooner. This is something that we don't do right now, and it would entail a change to our workflow and release schedule. It doesn't mean no API changes at all - we would have to allow APIs to be extended, because many feature additions come with new primops, or new supporting code in the ghc-prim or base packages. The package version policy states precisely what it means to extend an API (http://www.haskell.org/haskellwiki/Package_versioning_policy) and most third-party packages will still work so long as we only bump the minor versions of the packages that come with GHC. The GHC package itself would have to be exempt, because it contains every module in GHC, and hence would be impossible to keep stable if we are modifying the compiler to add new features. Of course it's not practical to maintain an extra branch of GHC for non-API-breaking development - two branches is already plenty. So there would need to be an API-freeze for a while between the major release and the non-API-breaking release, during which time people developing API changes would need to work on branches. Is it workable? I'm not sure, but I think it's worth a try. I wouldn't want to see this replace the patchlevel bugfix releases that we already do, and as Ian points out, there isn't a lot of room in the release schedule for more releases, unless we stretch out the timescales, doing major releases less frequently. Cheers, Simon
·Haskell Platform
·Patch-level releases
·New releases
if that’s so, all we need is better signposting. And I’m all for that!
Have I got this right?
Simon
*From:*Mark Lentczner [mailto:mark.lentczner@gmail.com] *Sent:* 09 February 2013 17:48 *To:* Simon Marlow; Manuel M T Chakravarty; Johan Tibell; Simon Peyton-Jones; Mark Lentczner; andreas.voellmy@gmail.com; Carter Schonwald; kostirya@gmail.com; Edsko de Vries; ghc-devs@haskell.org; glasgow-haskell-users *Subject:* Re: GHC 7.8 release?
We seem to be circling ever closer to consensus here! Yay!
I think the distinction of non-API breaking and API breaking release is very important. Refining SPJ's trifecta:
*Haskell Platform* comes out twice a year. It is based on very stable version of GHC, and intention is that people can just assume things on Hackage work with it. These are named for the year and sequence of the release: 2013.2, 2013.2.1, 2013.4,...
*Non-API breaking releases* can come out as often as desired. However, the version that is current as of mid-Feb. and mid-Aug. will be the ones considered for HP inclusion. By non-API breaking we mean the whole API surface including all the libraries bundled with GHC, as well as the operation of ghc, cabal, ghc-pkg, etc. Additions of features that must be explicitly enabled are okay. Additions of new APIs into existing modules are discouraged: Much code often imports base modules wholesale, and name clashes could easily result. These should never bump the major revision number: 7.4.1, 7.4.2...
*API breaking releases* happen by being released into a separate channel when ready for library owners to look at them. This channel should probably go through several stages: Ready for core package owners to work with, then HP package owners, then all package owners. I'd imagine this is a several month process. At the end of which, the release can go into the main channel. Such a merge shouldn't happen more than once a year... I think even once every two years is fine (!) To avoid confusion, I'd suggest that while in the separate channel, these release be named with odd number: 7.9, 7.11,..., and when moved to the main channel renamed to even: 7.10, 7.12...
This idea of three channels needs to be much more clearly communicated. The warning on the download page is a failure: Googling "ghc" takes you to the home page of GHC which immediately trumpets the "Lastest News" of a release of GHC 7.6.2. Once a user has read that and decided to download, then "STOP!" box is a) going to be skipped as they scan for the download link, and b) if read and followed, causes the "WTF? Why is HP so back rev?" So we need to change the front page so that the three channels are clearly communicated and targeted at the right users.
- Mark
(BTW: The first few links on the GHC web site are out of date: The second nav link is to a survey that is 7 years old. The License page is 8 years out of date. The FAQ is over a year old.)
On Sat, Feb 9, 2013 at 8:24 AM, Ian Lynagh
mailto:ian@well-typed.com> wrote: On Sat, Feb 09, 2013 at 12:06:12PM +0000, Simon Marlow wrote:
As a straw man, let's suppose we want to do annual API releases in September, with intermediate non-API releases in February.
That's a non-API release 5 months after the API release.
6.10.2 was 5 months after 6.10.1 (.3 was 1 month later, .4 a further 2) 6.12.2 was 4 months after 6.12.1 (.3 was 2 months later) 7.0.2 was 3.5 months after 7.0.1 (.3 was 1 month later, .4 a further 3) 7.2.2 was 3 months after 7.2.1 7.4.2 was 4 months after 7.4.1 7.6.2 was 4.5 months after 7.6.2
so if we do non-API releases, then perhaps it would make sense to stop doing minor releases (unless a release turns out to just be broken).
Thanks Ian

What I am still missing is this:
| Mark is asking for major GHC releases every year at the most, preferably
| less frequently. That means major GHC releases in the sense that we do
| them now, where libraries change, and a wave of package updates are
| required to get everything working.
What causes the "wave of package updates"? Just because GHC 7.8 (say) comes out, no package author need lift a finger. The Haskell Platform sets the pace for package updates. When the Haskell Platform comes out, now THAT is indeed a trigger for a wave of updates. Authors of packages in HP are forced to act; authors of other packages want their packages to work with the next HP.
But there is no reason why package authors should respond to GHC releases, provided we signpost it accurately.
You may ask what use is a GHC release that doesn't cause a wave of updates? And hence that doesn't work with at least some libraries. Well, it's a very useful forcing function to get new features actually out and tested. Of course we could just not do that, and say "build from source", but a release brings a welcome discipline. But under this story, release or not-release would be a Little Deal, not a Big Deal. The benefits are modest; the costs are modest.
In short, I'm continuing to propose that we stick to the current story, but signpost it better. If it ain't broke, don't fix it --- or at least fix only the bits that are broken, which is the signposting.
Simon
|
| Johan, Manuel and Carter are saying that they want releases that add
| features but don't break code, i.e. a non-API-breaking release, as a way
| to get the new bits into the hands of the punters sooner. This is
| something that we don't do right now, and it would entail a change to
| our workflow and release schedule.
|
| It doesn't mean no API changes at all - we would have to allow APIs to
| be extended, because many feature additions come with new primops, or
| new supporting code in the ghc-prim or base packages. The package
| version policy states precisely what it means to extend an API
| (http://www.haskell.org/haskellwiki/Package_versioning_policy) and most
| third-party packages will still work so long as we only bump the minor
| versions of the packages that come with GHC.
|
| The GHC package itself would have to be exempt, because it contains
| every module in GHC, and hence would be impossible to keep stable if we
| are modifying the compiler to add new features.
|
| Of course it's not practical to maintain an extra branch of GHC for
| non-API-breaking development - two branches is already plenty. So there
| would need to be an API-freeze for a while between the major release and
| the non-API-breaking release, during which time people developing API
| changes would need to work on branches.
|
| Is it workable? I'm not sure, but I think it's worth a try. I wouldn't
| want to see this replace the patchlevel bugfix releases that we already
| do, and as Ian points out, there isn't a lot of room in the release
| schedule for more releases, unless we stretch out the timescales, doing
| major releases less frequently.
|
| Cheers,
| Simon
|
|
| > ·Haskell Platform
| >
| > ·Patch-level releases
| >
| > ·New releases
| >
| >
| > if that’s so, all we need is better signposting. And I’m all for that!
| >
| > Have I got this right?
| >
| >
| > Simon
| >
| > *From:*Mark Lentczner [mailto:mark.lentczner@gmail.com]
| > *Sent:* 09 February 2013 17:48
| > *To:* Simon Marlow; Manuel M T Chakravarty; Johan Tibell; Simon
| > Peyton-Jones; Mark Lentczner; andreas.voellmy@gmail.com; Carter
| > Schonwald; kostirya@gmail.com; Edsko de Vries; ghc-devs@haskell.org;
| > glasgow-haskell-users
| > *Subject:* Re: GHC 7.8 release?
| >
| > We seem to be circling ever closer to consensus here! Yay!
| >
| > I think the distinction of non-API breaking and API breaking release is
| > very important. Refining SPJ's trifecta:
| >
| > *Haskell Platform* comes out twice a year. It is based on very
| > stable version of GHC, and intention is that people can just assume
| > things on Hackage work with it. These are named for the year and
| > sequence of the release: 2013.2, 2013.2.1, 2013.4,...
| >
| > *Non-API breaking releases* can come out as often as desired.
| > However, the version that is current as of mid-Feb. and mid-Aug.
| > will be the ones considered for HP inclusion. By non-API breaking we
| > mean the whole API surface including all the libraries bundled with
| > GHC, as well as the operation of ghc, cabal, ghc-pkg, etc. Additions
| > of features that must be explicitly enabled are okay. Additions of
| > new APIs into existing modules are discouraged: Much code often
| > imports base modules wholesale, and name clashes could easily
| > result. These should never bump the major revision number: 7.4.1,
| > 7.4.2...
| >
| > *API breaking releases* happen by being released into a separate
| > channel when ready for library owners to look at them. This channel
| > should probably go through several stages: Ready for core package
| > owners to work with, then HP package owners, then all package
| > owners. I'd imagine this is a several month process. At the end of
| > which, the release can go into the main channel. Such a merge
| > shouldn't happen more than once a year... I think even once every
| > two years is fine (!) To avoid confusion, I'd suggest that while in
| > the separate channel, these release be named with odd number: 7.9,
| > 7.11,..., and when moved to the main channel renamed to even: 7.10,
| > 7.12...
| >
| > This idea of three channels needs to be much more clearly communicated.
| > The warning on the download page is a failure: Googling "ghc" takes you
| > to the home page of GHC which immediately trumpets the "Lastest News" of
| > a release of GHC 7.6.2. Once a user has read that and decided to
| > download, then "STOP!" box is a) going to be skipped as they scan for
| > the download link, and b) if read and followed, causes the "WTF? Why is
| > HP so back rev?" So we need to change the front page so that the three
| > channels are clearly communicated and targeted at the right users.
| >
| > - Mark
| >
| > (BTW: The first few links on the GHC web site are out of date: The
| > second nav link is to a survey that is 7 years old. The License page is
| > 8 years out of date. The FAQ is over a year old.)
| >
| > On Sat, Feb 9, 2013 at 8:24 AM, Ian Lynagh

On Sun, Feb 10, 2013 at 09:02:18PM +0000, Simon Peyton-Jones wrote:
You may ask what use is a GHC release that doesn't cause a wave of updates? And hence that doesn't work with at least some libraries. Well, it's a very useful forcing function to get new features actually out and tested.
But the way you test new features is to write programs that use them, and programs depend on libraries. Thanks Ian

| > You may ask what use is a GHC release that doesn't cause a wave of updates? | And hence that doesn't work with at least some libraries. Well, it's a very useful | forcing function to get new features actually out and tested. | | But the way you test new features is to write programs that use them, | and programs depend on libraries. That is of course ideal, but the ideal carries costs. A half way house is a release whose library support will be patchy. Not such good testing, but much lower costs. But still (I think) a lot more testing than "compile HEAD" gives us. Simon

On Sun, Feb 10, 2013 at 09:30:23PM +0000, Simon Peyton-Jones wrote:
| > You may ask what use is a GHC release that doesn't cause a wave of updates? | And hence that doesn't work with at least some libraries. Well, it's a very useful | forcing function to get new features actually out and tested. | | But the way you test new features is to write programs that use them, | and programs depend on libraries.
That is of course ideal, but the ideal carries costs. A half way house is a release whose library support will be patchy.
But that's not what happens. GHC 7.8 is released. Someone installs it in order to try to use TypeHoles when developing their program. But their program depends on text, so they send Bryan a mail saying that text doesn't build with 7.8. And so the wave of updates begins. Thanks Ian

On 10/02/2013 21:43, Ian Lynagh wrote:
On Sun, Feb 10, 2013 at 09:30:23PM +0000, Simon Peyton-Jones wrote:
| > You may ask what use is a GHC release that doesn't cause a wave of updates? | And hence that doesn't work with at least some libraries. Well, it's a very useful | forcing function to get new features actually out and tested. | | But the way you test new features is to write programs that use them, | and programs depend on libraries.
That is of course ideal, but the ideal carries costs. A half way house is a release whose library support will be patchy.
But that's not what happens. GHC 7.8 is released. Someone installs it in order to try to use TypeHoles when developing their program. But their program depends on text, so they send Bryan a mail saying that text doesn't build with 7.8. And so the wave of updates begins.
As the maintainer of a low-level package (HTTP), I certainly see this kind of pressure starting even before a GHC release - e.g. https://github.com/haskell/HTTP/issues/36 As one of the maintainers of a high-level tool (darcs) that aims to always build against the current HP, I generate this kind of pressure myself: once GHC is released, I expect it to be in the HP within 3-6 months, so I need to get started quickly. I can't even check darcs itself until the dependencies work. I don't think there are any easy answers :-/ Ganesh

Yes, exactly this.
A release where the versions of base, and all other baked in libraries are
only minor version bumps and where breaking changes are localized to
relatively experimental language features / extensions and GHC specific
APIs would ideal.
Eg: I'm OK having to patch ghc-mod so it works with a new intermediate GHC
release. (Esp since it uses GHC internal apis)
The new scheduler improvements, the recent doh work , the great SIMD work
/ code generator improvments, the type level nats, ordered type families,
the overloaded list syntax,
All of these extensions and associated API augmentations should not break
stable libraries, at least on minor version bumps, cause
1) maybe experimental new APIs can be included in minor releases as
separate explicitly experimental modules (this gives a way of including new
functionality in a minor release)
2) generally type checking / inference on stable code that doesn't enable
new features stays the same.
I'm probably overlooking some pieces or. Details, and I'm largely restating
what Johan and Manuel have communicated.
-Carter
On Feb 10, 2013 4:43 PM, "Ian Lynagh"
On Sun, Feb 10, 2013 at 09:30:23PM +0000, Simon Peyton-Jones wrote:
| > You may ask what use is a GHC release that doesn't cause a wave of updates? | And hence that doesn't work with at least some libraries. Well, it's a very useful | forcing function to get new features actually out and tested. | | But the way you test new features is to write programs that use them, | and programs depend on libraries.
That is of course ideal, but the ideal carries costs. A half way house is a release whose library support will be patchy.
But that's not what happens. GHC 7.8 is released. Someone installs it in order to try to use TypeHoles when developing their program. But their program depends on text, so they send Bryan a mail saying that text doesn't build with 7.8. And so the wave of updates begins.
Thanks Ian
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

On Sun, Feb 10, 2013 at 3:30 PM, Simon Peyton-Jones
| > You may ask what use is a GHC release that doesn't cause a wave of updates? | And hence that doesn't work with at least some libraries. Well, it's a very useful | forcing function to get new features actually out and tested. | | But the way you test new features is to write programs that use them, | and programs depend on libraries.
That is of course ideal, but the ideal carries costs. A half way house is a release whose library support will be patchy. Not such good testing, but much lower costs. But still (I think) a lot more testing than "compile HEAD" gives us.
Simon
Fitting: http://xkcd.com/1172/ (sorry, I couldn't resist) -- Gaby

Simon Peyton-Jones
| > You may ask what use is a GHC release that doesn't cause a wave of updates? | And hence that doesn't work with at least some libraries. Well, it's a very useful | forcing function to get new features actually out and tested. | | But the way you test new features is to write programs that use them, | and programs depend on libraries.
That is of course ideal, but the ideal carries costs. A half way house is a release whose library support will be patchy. Not such good testing, but much lower costs. But still (I think) a lot more testing than "compile HEAD" gives us.
I don't think so. In my experience, library support is not patchy, but virtually non-existent as some of the very widely used libraries (like Text) break, and everything else depends on them in one way or another. If we don't make sure that the commonly used libraries work with these "preview" releases, I don't think those releases are worth the effort. I understand that we can't really guarantee API backwards compatibility for the GHC API (but that's ok as few packages depend on that). Critical are all the libraries bundled with GHC. Adding to them is fine, but no API definitions should change or be removed. Manuel

On Sun, Feb 10, 2013 at 3:16 PM, Ian Lynagh
On Sun, Feb 10, 2013 at 09:02:18PM +0000, Simon Peyton-Jones wrote:
You may ask what use is a GHC release that doesn't cause a wave of updates? And hence that doesn't work with at least some libraries. Well, it's a very useful forcing function to get new features actually out and tested.
But the way you test new features is to write programs that use them, and programs depend on libraries.
Thanks Ian
Releasing GHC early and often (possibly with API breakage) isn't really the problem. The real problem is how to coordinate with library authors (e.g. Haskell Platform), etc. I suspect GHC should continue to offer a platform for research and experiments. That is much harder if you curtail the ability to release GHC early and often. -- Gaby

Well said. Having a more aggressive release cycle is another interesting
perspective.
On Feb 10, 2013 6:21 PM, "Gabriel Dos Reis"
On Sun, Feb 10, 2013 at 3:16 PM, Ian Lynagh
wrote: On Sun, Feb 10, 2013 at 09:02:18PM +0000, Simon Peyton-Jones wrote:
You may ask what use is a GHC release that doesn't cause a wave of
updates? And hence that doesn't work with at least some libraries. Well, it's a very useful forcing function to get new features actually out and tested.
But the way you test new features is to write programs that use them, and programs depend on libraries.
Thanks Ian
Releasing GHC early and often (possibly with API breakage) isn't really the problem. The real problem is how to coordinate with library authors (e.g. Haskell Platform), etc.
I suspect GHC should continue to offer a platform for research and experiments. That is much harder if you curtail the ability to release GHC early and often.
-- Gaby
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

While I'm notionally in favor of decoupling API-breaking changes from non-API breaking changes, there are two major difficulties: GHC.Prim and Template Haskell. Should a non-API-breaking change mean that GHC.Prim is immutable? If so, this greatly restricts GHC's development. If not, it means that a large chunk of hackage will become unbuildable due to deps on vector and primitive. With Template Haskell the situation is largely similar, although the deps are different. What I would like to see are more patch-level bugfix releases. I suspect the reason we don't have more is that making a release is a lot of work. So, Ian, what needs to happen to make more frequent patch releases feasible? On Mon, Feb 11, 2013 at 7:42 AM, Carter Schonwald < carter.schonwald@gmail.com> wrote:
Well said. Having a more aggressive release cycle is another interesting perspective. On Feb 10, 2013 6:21 PM, "Gabriel Dos Reis"
wrote: On Sun, Feb 10, 2013 at 3:16 PM, Ian Lynagh
wrote: On Sun, Feb 10, 2013 at 09:02:18PM +0000, Simon Peyton-Jones wrote:
You may ask what use is a GHC release that doesn't cause a wave of
updates? And hence that doesn't work with at least some libraries. Well, it's a very useful forcing function to get new features actually out and tested.
But the way you test new features is to write programs that use them, and programs depend on libraries.
Thanks Ian
Releasing GHC early and often (possibly with API breakage) isn't really the problem. The real problem is how to coordinate with library authors (e.g. Haskell Platform), etc.
I suspect GHC should continue to offer a platform for research and experiments. That is much harder if you curtail the ability to release GHC early and often.
-- Gaby
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

On Mon, Feb 11, 2013 at 10:09:56AM +0800, John Lato wrote:
What I would like to see are more patch-level bugfix releases. I suspect the reason we don't have more is that making a release is a lot of work. So, Ian, what needs to happen to make more frequent patch releases feasible?
Well, * actually making a release takes time * I assume that you also want more patches merged from the HEAD, rather than just putting intermediate releases out, in which case merging those patches also takes time. And most of the small fixes already get merged, so you'd be looking at merging bigger changes which are more likely to require resolving conflicts, and so take even more time so basically to make more releases we'd need to spend less time doing other stuff (or for someone else to look after the branch). Bigger changes, and especially changes that need altering for the stable branch, are also more likely to introduce bugs. Thanks Ian

Hi, one remedy to the problem could be better infrastructure: * More automated test-building of packages on hackage (including test suites) with various GHC releases, and better display of the results. This way, library authors would not have to manually build their library to see if they work with the latest compiler. * Automatic test building of packages with explicit relaxation of the dependencies, to suggest dependency relaxation that would work without code changes (from my work at Debian, in most cases only meta-data changes are required to support newer versions of dependencies, including core libs). * A more liberal attitude to changing other peoples packages’s metadata to fix the dependencies – either to exclude broken combinations or to expand the range. Preferably online, similar to the edit button in github, and checked by the above CI infrastructure. This way, it would be easier for libraries to support newer GHC releases without having to give up supporting older releases. But I know that development of Hackage is not something that seems to be happening a lot right now, and as I don’t help, I’m not complaining. So consider this a side notice and carry on discussing :-) Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Mon, Feb 11, 2013 at 2:46 AM, Joachim Breitner
Hi,
one remedy to the problem could be better infrastructure: * More automated test-building of packages on hackage (including test suites) with various GHC releases, and better display of the results. This way, library authors would not have to manually build their library to see if they work with the latest compiler. * Automatic test building of packages with explicit relaxation of the dependencies, to suggest dependency relaxation that would work without code changes (from my work at Debian, in most cases only meta-data changes are required to support newer versions of dependencies, including core libs). * A more liberal attitude to changing other peoples packages’s metadata to fix the dependencies – either to exclude broken combinations or to expand the range. Preferably online, similar to the edit button in github, and checked by the above CI infrastructure.
This way, it would be easier for libraries to support newer GHC releases without having to give up supporting older releases.
But I know that development of Hackage is not something that seems to be happening a lot right now, and as I don’t help, I’m not complaining. So consider this a side notice and carry on discussing :-)
Greetings, Joachim
The issue is largely "social" problem that I suspect technological solutions won't solve. Yes, it would be nice to have better infrastructure. However, at the end of the day, when the automated tests say an API or ABI set has been broken, a non-technical decision needs to be taken (should it be part of the next release, or should it be delayed or removed?) And you will be back to square one. Part of the problem is what happens to successful programming languages: you get users, and users -- believe it or not -- like stability (few people like fiddling with function names and arguments in an otherwise working program or library) and, yes they will take new improvements as long as their working programs continue working. Researchers on the other hand need room to experiment with ideas; they need to get their implementations in the wild to get a sense of what scales, what doesn't. These facts are in tension; you can't solve it by more automated testing. If you want GHC to compete with GCC or Clang, then you know what to do: stop experimenting with the language. So far, most of the work on GHC has been done by researchers who are not paid to do "development work" -- unless academia changes its standards... -- Gaby

On 10 February 2013 13:02, Simon Peyton-Jones
What I am still missing is this:
| Mark is asking for major GHC releases every year at the most, preferably | less frequently. That means major GHC releases in the sense that we do | them now, where libraries change, and a wave of package updates are | required to get everything working.
What causes the "wave of package updates"? Just because GHC 7.8 (say) comes out, no package author need lift a finger. The Haskell Platform sets the pace for package updates. When the Haskell Platform comes out, now THAT is indeed a trigger for a wave of updates. Authors of packages in HP are forced to act; authors of other packages want their packages to work with the next HP.
But there is no reason why package authors should respond to GHC releases, provided we signpost it accurately.
You may ask what use is a GHC release that doesn't cause a wave of updates? And hence that doesn't work with at least some libraries. Well, it's a very useful forcing function to get new features actually out and tested. Of course we could just not do that, and say "build from source", but a release brings a welcome discipline. But under this story, release or not-release would be a Little Deal, not a Big Deal. The benefits are modest; the costs are modest.
In short, I'm continuing to propose that we stick to the current story, but signpost it better. If it ain't broke, don't fix it --- or at least fix only the bits that are broken, which is the signposting.
My understanding of the proposed changes (which I'm also supportive of) is to separate GHC improvements that break existing libraries (or perhaps even simply add language level features), and those that are improvements under-the-hood (e.g., bug fixes, performance improvements). So rather than 7.8 be a huge single release containing new type level features, SIMD, the new code-generator. There would be two releases, one containing just say the new-code-generator, improvements to the IO manager, potentially also DPH... another release would containing new language level improvements. So then HP can benefit from improvements to the existing language and API without having to also pull in breaking (or just extending) changes... It's the issue of a research compiler Vs. and industrial compiler and managing that more explicitly in the release model. Cheers, David
Simon
| | Johan, Manuel and Carter are saying that they want releases that add | features but don't break code, i.e. a non-API-breaking release, as a way | to get the new bits into the hands of the punters sooner. This is | something that we don't do right now, and it would entail a change to | our workflow and release schedule. | | It doesn't mean no API changes at all - we would have to allow APIs to | be extended, because many feature additions come with new primops, or | new supporting code in the ghc-prim or base packages. The package | version policy states precisely what it means to extend an API | (http://www.haskell.org/haskellwiki/Package_versioning_policy) and most | third-party packages will still work so long as we only bump the minor | versions of the packages that come with GHC. | | The GHC package itself would have to be exempt, because it contains | every module in GHC, and hence would be impossible to keep stable if we | are modifying the compiler to add new features. | | Of course it's not practical to maintain an extra branch of GHC for | non-API-breaking development - two branches is already plenty. So there | would need to be an API-freeze for a while between the major release and | the non-API-breaking release, during which time people developing API | changes would need to work on branches. | | Is it workable? I'm not sure, but I think it's worth a try. I wouldn't | want to see this replace the patchlevel bugfix releases that we already | do, and as Ian points out, there isn't a lot of room in the release | schedule for more releases, unless we stretch out the timescales, doing | major releases less frequently. | | Cheers, | Simon | | | > ·Haskell Platform | > | > ·Patch-level releases | > | > ·New releases | > | > | > if that’s so, all we need is better signposting. And I’m all for that! | > | > Have I got this right? | > | > | > Simon | > | > *From:*Mark Lentczner [mailto:mark.lentczner@gmail.com] | > *Sent:* 09 February 2013 17:48 | > *To:* Simon Marlow; Manuel M T Chakravarty; Johan Tibell; Simon | > Peyton-Jones; Mark Lentczner; andreas.voellmy@gmail.com; Carter | > Schonwald; kostirya@gmail.com; Edsko de Vries; ghc-devs@haskell.org; | > glasgow-haskell-users | > *Subject:* Re: GHC 7.8 release? | > | > We seem to be circling ever closer to consensus here! Yay! | > | > I think the distinction of non-API breaking and API breaking release is | > very important. Refining SPJ's trifecta: | > | > *Haskell Platform* comes out twice a year. It is based on very | > stable version of GHC, and intention is that people can just assume | > things on Hackage work with it. These are named for the year and | > sequence of the release: 2013.2, 2013.2.1, 2013.4,... | > | > *Non-API breaking releases* can come out as often as desired. | > However, the version that is current as of mid-Feb. and mid-Aug. | > will be the ones considered for HP inclusion. By non-API breaking we | > mean the whole API surface including all the libraries bundled with | > GHC, as well as the operation of ghc, cabal, ghc-pkg, etc. Additions | > of features that must be explicitly enabled are okay. Additions of | > new APIs into existing modules are discouraged: Much code often | > imports base modules wholesale, and name clashes could easily | > result. These should never bump the major revision number: 7.4.1, | > 7.4.2... | > | > *API breaking releases* happen by being released into a separate | > channel when ready for library owners to look at them. This channel | > should probably go through several stages: Ready for core package | > owners to work with, then HP package owners, then all package | > owners. I'd imagine this is a several month process. At the end of | > which, the release can go into the main channel. Such a merge | > shouldn't happen more than once a year... I think even once every | > two years is fine (!) To avoid confusion, I'd suggest that while in | > the separate channel, these release be named with odd number: 7.9, | > 7.11,..., and when moved to the main channel renamed to even: 7.10, | > 7.12... | > | > This idea of three channels needs to be much more clearly communicated. | > The warning on the download page is a failure: Googling "ghc" takes you | > to the home page of GHC which immediately trumpets the "Lastest News" of | > a release of GHC 7.6.2. Once a user has read that and decided to | > download, then "STOP!" box is a) going to be skipped as they scan for | > the download link, and b) if read and followed, causes the "WTF? Why is | > HP so back rev?" So we need to change the front page so that the three | > channels are clearly communicated and targeted at the right users. | > | > - Mark | > | > (BTW: The first few links on the GHC web site are out of date: The | > second nav link is to a survey that is 7 years old. The License page is | > 8 years out of date. The FAQ is over a year old.) | > | > On Sat, Feb 9, 2013 at 8:24 AM, Ian Lynagh
mailto:ian@well-typed.com> wrote: | > | > On Sat, Feb 09, 2013 at 12:06:12PM +0000, Simon Marlow wrote: | > > | > > As a straw man, let's suppose we want to do annual API releases in | > > September, with intermediate non-API releases in February. | > | > That's a non-API release 5 months after the API release. | > | > 6.10.2 was 5 months after 6.10.1 (.3 was 1 month later, .4 a further 2) | > 6.12.2 was 4 months after 6.12.1 (.3 was 2 months later) | > 7.0.2 was 3.5 months after 7.0.1 (.3 was 1 month later, .4 a further 3) | > 7.2.2 was 3 months after 7.2.1 | > 7.4.2 was 4 months after 7.4.1 | > 7.6.2 was 4.5 months after 7.6.2 | > | > so if we do non-API releases, then perhaps it would make sense to stop | > doing minor releases (unless a release turns out to just be broken). | > | > | > Thanks | > Ian | > _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Sun, Feb 10, 2013 at 4:02 PM, Simon Peyton-Jones
What causes the "wave of package updates"? Just because GHC 7.8 (say) comes out, no package author need lift a finger. The Haskell Platform sets the pace for package updates. When the Haskell Platform comes out, now THAT is indeed a trigger for a wave of updates. Authors of packages in HP are forced to act; authors of other packages want their packages to work with the next HP.
(a) There are packages which tend to track GHC's latest version instead of the HP (yesod used to do this, which was a source of much pain). (b) There are linux distributions which always track the latest everything, often in a rolling-release fashion (notably Arch). They are actively hostile to the Platform, and a source of even greater pain. Many package authors update because Arch users demand it and openly insult anyone who points them to the Platform or any policy which suggests that anything other then the absolutely latest version is acceptable. You *might* be able to control expectations with respect to (a); (b) is not subject to any variety of reason. It will produce as much pressure as it has users, plus multiply that pressure by the number of package authors who are also users. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

(a) There are packages which tend to track GHC's latest version instead of the HP (yesod used to do this, which was a source of much pain).
(b) There are linux distributions which always track the latest everything, often in a rolling-release fashion (notably Arch). They are actively hostile to the Platform, and a source of even greater pain. Many package authors update because Arch users demand it and openly insult anyone who points them to the Platform or any policy which suggests that anything other then the absolutely latest version is acceptable.
These must be social questions (what I was earlier calling “signposting”) rather than technical ones. For example, you say that (b) is not subject to any variety of reason, and yet no linux distribution tracks HEAD, does it? They don’t openly insult anyone who points to a release just because HEAD has new cool stuff! No, they track things we call “releases”. Very well, maybe we should call them “previews” instead, and only dignify it as a “release” when, and only when a preview is picked by HP as worthy of incorporation in the next HP.
Or something. I’m just looking for a way to reconcile
· Release early, release often
· Stability for the Haskell Platform
It seems to me that such a reconciliation is within reach, and is actually very close to what we do, if we only signpost what is what far more vigorously and clearly than we do now. But maybe I’m wrong.
Simon
From: Brandon Allbery [mailto:allbery.b@gmail.com]
Sent: 11 February 2013 01:15
To: Simon Peyton-Jones
Cc: Simon Marlow; Mark Lentczner; Manuel M T Chakravarty; kostirya@gmail.com; glasgow-haskell-users; ghc-devs@haskell.org; Edsko de Vries
Subject: Re: GHC 7.8 release?
On Sun, Feb 10, 2013 at 4:02 PM, Simon Peyton-Jones

Hi, Am Montag, den 11.02.2013, 22:31 +0000 schrieb Simon Peyton-Jones:
No, they track things we call “releases”. Very well, maybe we should call them “previews” instead, and only dignify it as a “release” when, and only when a preview is picked by HP as worthy of incorporation in the next HP.
wording does indeed help a lot. I remember that 7.2 was dubbed a technology review with the effect that, at least Debian, never included it in the main repository. It was packaged and provided in experimental, for easy access to those who want to play with it, but not with additional library packages. From what I can remember, there was no push towards supporting 7.2 properly, and I believe this would have been different if 7.2 was just “the next regular GHC release”. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

Agreed.
having relatively bug free "technology preview" releases, which (perhaps
ideally) have new functionality included in a way that keeps the breakage
overhead lowish, on a regular basis, is ideal.
one thought on the api hacking front:
the main concern we're hitting is that we want to not "pin" internal GHC
apis, yet we want to make the breakage rate on libraries people may want to
use that might depend on say GHC.Prim or GHC.TH to be minimal.
Is a possible solution that on preview releases we have the changed bits of
API for a module M to be exported in a module M.Experimental?
eg, new ghc primops in a tech preview release maybe are exported by
GHC.Prim.Experimental
(or something of this sort?)
just throwing out one possible point in the design space.
cheers
-Carter
On Mon, Feb 11, 2013 at 5:31 PM, Simon Peyton-Jones
(a) There are packages which tend to track GHC's latest version instead of the HP (yesod used to do this, which was a source of much pain).****
****
(b) There are linux distributions which always track the latest everything, often in a rolling-release fashion (notably Arch). They are actively hostile to the Platform, and a source of even greater pain. Many package authors update because Arch users demand it and openly insult anyone who points them to the Platform or any policy which suggests that anything other then the absolutely latest version is acceptable.****
** **
These must be *social* questions (what I was earlier calling “signposting”) rather than technical ones. For example, you say that (b) is not subject to any variety of reason, and yet no linux distribution tracks HEAD, does it? They don’t openly insult anyone who points to a release just because HEAD has new cool stuff! No, they track things we call “releases”. Very well, maybe we should call them “previews” instead, and only dignify it as a “release” when, and only when a preview is picked by HP as worthy of incorporation in the next HP. ****
** **
Or something. I’m just looking for a way to reconcile****
**· **Release early, release often****
**· **Stability for the Haskell Platform****
It seems to me that such a reconciliation is within reach, and is actually very close to what we do, if we only signpost what is what far more vigorously and clearly than we do now. But maybe I’m wrong.****
** **
Simon****
** **
*From:* Brandon Allbery [mailto:allbery.b@gmail.com] *Sent:* 11 February 2013 01:15 *To:* Simon Peyton-Jones *Cc:* Simon Marlow; Mark Lentczner; Manuel M T Chakravarty; kostirya@gmail.com; glasgow-haskell-users; ghc-devs@haskell.org; Edsko de Vries
*Subject:* Re: GHC 7.8 release?****
** **
On Sun, Feb 10, 2013 at 4:02 PM, Simon Peyton-Jones
wrote:**** What causes the "wave of package updates"? Just because GHC 7.8 (say) comes out, no package author need lift a finger. The Haskell Platform sets the pace for package updates. When the Haskell Platform comes out, now THAT is indeed a trigger for a wave of updates. Authors of packages in HP are forced to act; authors of other packages want their packages to work with the next HP.****
** **
(a) There are packages which tend to track GHC's latest version instead of the HP (yesod used to do this, which was a source of much pain).****
****
(b) There are linux distributions which always track the latest everything, often in a rolling-release fashion (notably Arch). They are actively hostile to the Platform, and a source of even greater pain. Many package authors update because Arch users demand it and openly insult anyone who points them to the Platform or any policy which suggests that anything other then the absolutely latest version is acceptable.****
** **
You *might* be able to control expectations with respect to (a); (b) is not subject to any variety of reason. It will produce as much pressure as it has users, plus multiply that pressure by the number of package authors who are also users.****
** **
-- ****
brandon s allbery kf8nh sine nomine associates****
allbery.b@gmail.com ballbery@sinenomine.net****
unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net****
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Hi, I think reducing breakages is not necessarily, and maybe not even primarily, an issue of releases. It's more about realizing that the cost of breaking things (e.g. changing library APIs) has gone up as the Haskell community and ecosystem has grown. We need to be conscious of that and carefully consider if making a breaking change (e.g. changing a function instead of adding a new function) is really necessary. Many platforms (e.g. Java and Python) rarely, if ever, make breaking changes. If you look at compiler projects (e.g. LLVM and GCC) you never see intentional breakages, even in major releases*. Here's a question I think we should be asking ourselves: why is the major version of base bumped with every release? Is it really necessary to make breaking changes this often? How about aiming for having GHC 7.10 be a release where we only add new stuff and improve old stuff? -- Johan * A major GCC release usually signifies that some large change to the code generator was made.

On Mon, 11 Feb 2013 15:03:25 -0800 Johan Tibell
Many platforms (e.g. Java and Python) rarely, if ever, make breaking changes. If you look at compiler projects (e.g. LLVM and GCC) you never see intentional breakages, even in major releases*.
Those are very mature platforms, hundreds of millions of people
use them indirectly each day, it's hard to compare GHC to them.
If anything, Haskell is very young for its age, and should rather move
faster. Bad mistakes, accidents and partial or inefficient
implementations proliferate in standard libraries for decades,
tampering GHC's growth as a serious production language.
On Mon, 11 Feb 2013 22:31:53 +0000 Simon Peyton-Jones
no linux distribution tracks HEAD, does it?
Gentoo packages HEAD just fine.

On Mon, Feb 11, 2013 at 5:03 PM, Johan Tibell
Hi,
I think reducing breakages is not necessarily, and maybe not even primarily, an issue of releases. It's more about realizing that the cost of breaking things (e.g. changing library APIs) has gone up as the Haskell community and ecosystem has grown. We need to be conscious of that and carefully consider if making a breaking change (e.g. changing a function instead of adding a new function) is really necessary. Many platforms (e.g. Java and Python) rarely, if ever, make breaking changes. If you look at compiler projects (e.g. LLVM and GCC) you never see intentional breakages, even in major releases*. Here's a question I think we should be asking ourselves: why is the major version of base bumped with every release? Is it really necessary to make breaking changes this often? How about aiming for having GHC 7.10 be a release where we only add new stuff and improve old stuff?
-- Johan
* A major GCC release usually signifies that some large change to the code generator was made.
I have some experience with GCC releases -- having served as a GCC Release Manager for several years. In fact, the release scheme we currently have has gone through several iterations -- usually after many "existential" crisis. Yes, we don't break GCC ABI lightly, mostly because GCC isn't a research compiler and most "research works" are done on forgotten branches that nobody cares about anymore. Implementing new standards (e.g. moving from C++03 to C++11 that has several mandated API and ABI breakage) is a royal pain that isn't worth replicating in GHC -- at least if you want GHC to remain a research compiler. Concerning your question about release number, I would venture that there is a certain "marketing" aspect to it. I can tell you that we, the GCC community, are very poor at that -- otherwise, we would have been at version 26 or something :-) -- Gaby

On Mon, Feb 11, 2013 at 4:34 PM, Gabriel Dos Reis < gdr@integrable-solutions.net> wrote:
I have some experience with GCC releases -- having served as a GCC Release Manager for several years. In fact, the release scheme we currently have has gone through several iterations -- usually after many "existential" crisis. Yes, we don't break GCC ABI lightly, mostly because GCC isn't a research compiler and most "research works" are done on forgotten branches that nobody cares about anymore. Implementing new standards (e.g. moving from C++03 to C++11 that has several mandated API and ABI breakage) is a royal pain that isn't worth replicating in GHC -- at least if you want GHC to remain a research compiler.
Concerning your question about release number, I would venture that there is a certain "marketing" aspect to it. I can tell you that we, the GCC community, are very poor at that -- otherwise, we would have been at version 26 or something :-)
Thanks for sharing! My perspective is of course as a user. I don't think I've ever run into a case where the compiler broken a previous work e.g. C++ program. On the other hand I have to make a release of most of the libraries I maintain with every GHC release (to bump cabal version constraints to accept the new base version, if nothing else). -- Johan

On Mon, Feb 11, 2013 at 6:37 PM, Johan Tibell
On Mon, Feb 11, 2013 at 4:34 PM, Gabriel Dos Reis
wrote: I have some experience with GCC releases -- having served as a GCC Release Manager for several years. In fact, the release scheme we currently have has gone through several iterations -- usually after many "existential" crisis. Yes, we don't break GCC ABI lightly, mostly because GCC isn't a research compiler and most "research works" are done on forgotten branches that nobody cares about anymore. Implementing new standards (e.g. moving from C++03 to C++11 that has several mandated API and ABI breakage) is a royal pain that isn't worth replicating in GHC -- at least if you want GHC to remain a research compiler.
Concerning your question about release number, I would venture that there is a certain "marketing" aspect to it. I can tell you that we, the GCC community, are very poor at that -- otherwise, we would have been at version 26 or something :-)
Thanks for sharing! My perspective is of course as a user. I don't think I've ever run into a case where the compiler broken a previous work e.g. C++ program. On the other hand I have to make a release of most of the libraries I maintain with every GHC release (to bump cabal version constraints to accept the new base version, if nothing else).
-- Johan
I understand. Concerning GCC, it is true that the shear size of the user base and the audience of the compiler ("industrial strength") calls for a very conservative approach to ABI or API breaking. On the hand, that means that there are certain welcomed, beneficial changes that we cannot effect. For example, because libstdc++ has been an early adopter of a reference counted-based implementation of std::string, we could not switch to more efficient and more multithread-friendly implementation. That was has been contributed for years but has been sequestered in some branches and namespaces integrated with the rest of the library. That is a shame, but one that is unavoidable given the expectation of the GCC audience. It is not clear to me that GHC is ready for that kind of constraints. We are still describing the C++11 implementation as "experimental" because we "fear" that doing otherwise might commit ourselves to an ABI and API that we may need to break later -- possibly because of undetected bugs or because we have found implementations we like better. Of course, that causes some distress in the community because people would like to say "GCC implements C++11." Finally, we do break API, you have just been lucky :-) http://gcc.gnu.org/onlinedocs/libstdc++/manual/api.html But, we have also developed very elaborate scheme for symbol versioning and namespace associations to help users digest API breakages without tears. Symbol versioning is a very messy business. I am still of the opinion that the current issue with GHC and HP is social, and it can be resolved through communication and coordination between the two communities for the great good of the Haskell community. -- Gaby

Let me strongly support Gaby's many points. Simon has it right: we need a way to support 'users' in a stable way, without adding enormous inertia to the development of GHC. I has lived through the slow death of a system from being rapidly innovative to having 'innovations' which exist only because a marketer says that the raft of boring, incremental improvements in each new release are innovative on some What's New blurb. Haskell is extremely useful, and GHC's flavour of Haskell is quite exciting because of the continual evolution of the language. Please don't kill the excitement! The Platform does seem to be an ideal mechanism to provide stability for users who value stability over bleeding-edge. But for that to be successful, there are to be strong community commitment [especially from library maintainers] to tracking that evolution. If the social mechanisms are not strong enough to fully support the Platform, I would say that that is the most important thing to fix. Jacques

On Mon, Feb 11, 2013 at 7:37 PM, Johan Tibell
Thanks for sharing! My perspective is of course as a user. I don't think I've ever run into a case where the compiler broken a previous work e.g. C++ program. On the other hand I have to make a release of most of the libraries I maintain with every GHC release (to bump cabal version constraints to accept the new base version, if nothing else).
gcc has had its moments occasionally, although usually major gcc issues aren't called by gcc itself but by packagers ("gcc 2.96", anyone?) or fugly politics (egcs). -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 11/02/13 23:03, Johan Tibell wrote:
Hi,
I think reducing breakages is not necessarily, and maybe not even primarily, an issue of releases. It's more about realizing that the cost of breaking things (e.g. changing library APIs) has gone up as the Haskell community and ecosystem has grown. We need to be conscious of that and carefully consider if making a breaking change (e.g. changing a function instead of adding a new function) is really necessary. Many platforms (e.g. Java and Python) rarely, if ever, make breaking changes. If you look at compiler projects (e.g. LLVM and GCC) you never see intentional breakages, even in major releases*. Here's a question I think we should be asking ourselves: why is the major version of base bumped with every release? Is it really necessary to make breaking changes this often?
One reason for the major version bumps is that base is a big conglomeration of modules, ranging from those that hardly ever change (Prelude) to those that change frequently (GHC.*). For example, the new IO manager that is about to get merged in will force a major bump of base, because it changes GHC.Event. The unicode support in the IO library was similar: although it only added to the external APIs that most people use, it also changed stuff inside GHC.* that we expose for a few clients. The solution to this would be to split up base further, but of course doing that is itself a major upheaval. However, having done that, it might be more feasible to have non-API-breaking releases. Of course we do also make well-intentioned changes to libraries, via the library proposal process, and some of these break APIs. But it wouldn't do any harm to batch these up and defer them until the next API-changing release. It would be great to have a list of the changes that had gone into base in the last few major releases, any volunteers? Cheers, Simon

On 02/12/2013 09:37 AM, Simon Marlow wrote:
On 11/02/13 23:03, Johan Tibell wrote:
Hi,
Of course we do also make well-intentioned changes to libraries, via the library proposal process, and some of these break APIs. But it wouldn't do any harm to batch these up and defer them until the next API-changing release.
Indeed. It might even be preferable to just have one "huge" breakage every year than having lots of minor breakages which require updating packages (and dependencies). At least you'll feel you're getting your money's worth.

On 2/12/13 3:37 AM, Simon Marlow wrote:
One reason for the major version bumps is that base is a big conglomeration of modules, ranging from those that hardly ever change (Prelude) to those that change frequently (GHC.*). For example, the new IO manager that is about to get merged in will force a major bump of base, because it changes GHC.Event. The unicode support in the IO library was similar: although it only added to the external APIs that most people use, it also changed stuff inside GHC.* that we expose for a few clients.
The solution to this would be to split up base further, but of course doing that is itself a major upheaval. However, having done that, it might be more feasible to have non-API-breaking releases.
While it will lead to much wailing and gnashing of teeth in the short term, if it's feasible to break GHC.* off into its own package, then I think we should. The vast majority of base seems quite stable or else is rolling along at a reasonable pace. And yet, every time a new GHC comes out, there's a new wave of fiddling the knobs on cabal files because nothing really changed. On the other hand, GHC.* moves rather quickly. Nevertheless, GHC.* is nice to have around, so we don't want to just hide that churning. The impedance mismatch here suggests that they really should be separate packages. I wonder whether GHC.* should be moved in with ghc-prim, or whether they should remain separate... But again, this depends on how feasible it would be to actually split the packages apart. Is it feasible? -- Live well, ~wren

On 13/02/13 07:06, wren ng thornton wrote:
On 2/12/13 3:37 AM, Simon Marlow wrote:
One reason for the major version bumps is that base is a big conglomeration of modules, ranging from those that hardly ever change (Prelude) to those that change frequently (GHC.*). For example, the new IO manager that is about to get merged in will force a major bump of base, because it changes GHC.Event. The unicode support in the IO library was similar: although it only added to the external APIs that most people use, it also changed stuff inside GHC.* that we expose for a few clients.
The solution to this would be to split up base further, but of course doing that is itself a major upheaval. However, having done that, it might be more feasible to have non-API-breaking releases.
While it will lead to much wailing and gnashing of teeth in the short term, if it's feasible to break GHC.* off into its own package, then I think we should. The vast majority of base seems quite stable or else is rolling along at a reasonable pace. And yet, every time a new GHC comes out, there's a new wave of fiddling the knobs on cabal files because nothing really changed. On the other hand, GHC.* moves rather quickly. Nevertheless, GHC.* is nice to have around, so we don't want to just hide that churning. The impedance mismatch here suggests that they really should be separate packages. I wonder whether GHC.* should be moved in with ghc-prim, or whether they should remain separate...
But again, this depends on how feasible it would be to actually split the packages apart. Is it feasible?
So I think we'd need to add another package, call it ghc-base perhaps. The reason is that ghc-prim sits below the integer package (integer-simple or integer-gmp). It's feasible to split base, but to a first approximation what you end up with is base renamed to ghc-base, and then the new base contains just stub modules that re-export stuff from ghc-base. In fact, maybe you want to do it exactly like this for simplicity - all the code goes in ghc-base. There would be some impact on compilation time, as we'd have twice as many interfaces to read. I believe Ian has done some experiments with splitting base further, so he might have more to add here. Cheers, Simon

* Simon Marlow
It's feasible to split base, but to a first approximation what you end up with is base renamed to ghc-base, and then the new base contains just stub modules that re-export stuff from ghc-base.
It would be great to have a portable base, without any GHC-specific stuff in it. After all, modules like Control.Monad or Data.Foldable are pure Haskell2010. The only obstacle I see is that ghc-base, as you call it, uses some amount of base definitions, and so we have a loop. How hard would it be to break this loop? That is, either make GHC.* modules self-contained, or make base not to depend on GHC.*? Roman

Hi, Am Mittwoch, den 13.02.2013, 11:34 +0200 schrieb Roman Cheplyaka:
It would be great to have a portable base, without any GHC-specific stuff in it. After all, modules like Control.Monad or Data.Foldable are pure Haskell2010.
while you are considering to split base, please also consider separating IO out. We can expect compiling Haskell to, say JavaScript or other targets that are not processes in the usual sense. For these IO might not make sense. Having something below base that provides the pure stuff (common data structures etc.) would enable libraries to easily say: „My algorithm can be used in normal programs as well as in programs that are compiled to JS“ by not depending on base, but on, say, pure-base. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

Somebody claiming to be Roman Cheplyaka wrote:
* Simon Marlow
[2013-02-13 09:00:15+0000] It's feasible to split base, but to a first approximation what you end up with is base renamed to ghc-base, and then the new base contains just stub modules that re-export stuff from ghc-base.
It would be great to have a portable base, without any GHC-specific stuff in it. After all, modules like Control.Monad or Data.Foldable are pure Haskell2010.
+1 -- Stephen Paul Weber, @singpolyma See http://singpolyma.net for how I prefer to be contacted edition right joseph

Perhaps looking at what base-compat has to offer ("*A compatibility layer
for base*") is relevant to this discussion. See:
https://github.com/sol/base-compat#readme
http://hackage.haskell.org/package/base-compat
I didn't use it but it looks as promising approach to base API stability.
All best,
Krzysztof Skrzętnicki
On Wed, Feb 13, 2013 at 10:00 AM, Simon Marlow
On 13/02/13 07:06, wren ng thornton wrote:
On 2/12/13 3:37 AM, Simon Marlow wrote:
One reason for the major version bumps is that base is a big conglomeration of modules, ranging from those that hardly ever change (Prelude) to those that change frequently (GHC.*). For example, the new IO manager that is about to get merged in will force a major bump of base, because it changes GHC.Event. The unicode support in the IO library was similar: although it only added to the external APIs that most people use, it also changed stuff inside GHC.* that we expose for a few clients.
The solution to this would be to split up base further, but of course doing that is itself a major upheaval. However, having done that, it might be more feasible to have non-API-breaking releases.
While it will lead to much wailing and gnashing of teeth in the short term, if it's feasible to break GHC.* off into its own package, then I think we should. The vast majority of base seems quite stable or else is rolling along at a reasonable pace. And yet, every time a new GHC comes out, there's a new wave of fiddling the knobs on cabal files because nothing really changed. On the other hand, GHC.* moves rather quickly. Nevertheless, GHC.* is nice to have around, so we don't want to just hide that churning. The impedance mismatch here suggests that they really should be separate packages. I wonder whether GHC.* should be moved in with ghc-prim, or whether they should remain separate...
But again, this depends on how feasible it would be to actually split the packages apart. Is it feasible?
So I think we'd need to add another package, call it ghc-base perhaps. The reason is that ghc-prim sits below the integer package (integer-simple or integer-gmp).
It's feasible to split base, but to a first approximation what you end up with is base renamed to ghc-base, and then the new base contains just stub modules that re-export stuff from ghc-base. In fact, maybe you want to do it exactly like this for simplicity - all the code goes in ghc-base. There would be some impact on compilation time, as we'd have twice as many interfaces to read.
I believe Ian has done some experiments with splitting base further, so he might have more to add here.
Cheers, Simon
______________________________**_________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.**org
http://www.haskell.org/**mailman/listinfo/glasgow-**haskell-usershttp://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Wed, Feb 13, 2013 at 09:00:15AM +0000, Simon Marlow wrote:
I believe Ian has done some experiments with splitting base further, so he might have more to add here.
There are some sensible chunks that can be pulled out, e.g. Foreign.* can be pulled out into a separate package fairly easily IIRC. Part of the problem is that it's hard to see what's possible without actually doing it, because base is so large, and has so many imports and import loops. IMO by far the easiest way to improve base would be to start off by breaking it up into lots of small packages (some of which will probably be single-module packages, others may contain an entire hierarchy like Foreign.*, and others may contain an odd mixture of modules due to dependency problems). Then we can look at which packages ought to be split up, which ought to be coalesced, and which ought to be moved higher up or lower down the dependency tree, and then look at which module imports are preventing what we want to do and see if there's a way to fix them (either by moving a definition and reversing an import, or by changing an import to import something lower down the dependency tree instead). If we go this route, then we would probably want to end up without a package called 'base', and then to make a new package called 'base' that just re-exports modules from all the new packages. I imagine the first release would let people use the new base without warnings, a year later new base would give deprecated warnings, and the following year we'd remove it. We could do this process slower, but the sooner packages move off of base, the sooner they benefit from fewer major version bumps. The advantages would be: * the new packages would be easier to maintain than base is * we could more easily make other improvements we'd like to make, e.g. we could move the unix and Win32 packages further down the tree without having to do it in one big leap, and without having to put them below the whole of base * if one module causes a major version bump, then only libraries using that functionality would need to relax their dependencies, rather than every single package * some targets (JS, JVM, .NET, etc) or other implementations might want to do things like IO, concurrency, etc, completely differently. This way they'd just use a different io/concurrency package, rather than having to have a different implementation of parts of base * it would be nice if pure libraries could simply not depend on the io package etc, and thus clearly do no IO The disadvantage is that, at some point between the first release and the release that removes base, each package will have to have its dependencies updated. Thanks Ian

I would like to see base split up, but I am reluctant to invest the effort to do it. I'm not going to do it myself, and Ian's time is so precious on many fronts. My guess is that it'll need someone else to take the lead, with advice from HQ. Simon | -----Original Message----- | From: ghc-devs-bounces@haskell.org [mailto:ghc-devs-bounces@haskell.org] On | Behalf Of Ian Lynagh | Sent: 13 February 2013 13:58 | To: Simon Marlow | Cc: wren ng thornton; glasgow-haskell-users; ghc-devs@haskell.org | Subject: Re: GHC 7.8 release? | | On Wed, Feb 13, 2013 at 09:00:15AM +0000, Simon Marlow wrote: | > | > I believe Ian has done some experiments with splitting base further, | > so he might have more to add here. | | There are some sensible chunks that can be pulled out, e.g. Foreign.* | can be pulled out into a separate package fairly easily IIRC. | | Part of the problem is that it's hard to see what's possible without | actually doing it, because base is so large, and has so many imports and | import loops. IMO by far the easiest way to improve base would be to | start off by breaking it up into lots of small packages (some of which | will probably be single-module packages, others may contain an entire | hierarchy like Foreign.*, and others may contain an odd mixture of | modules due to dependency problems). | | Then we can look at which packages ought to be split up, which ought to | be coalesced, and which ought to be moved higher up or lower down the | dependency tree, and then look at which module imports are preventing | what we want to do and see if there's a way to fix them (either by | moving a definition and reversing an import, or by changing an import to | import something lower down the dependency tree instead). | | If we go this route, then we would probably want to end up without a | package called 'base', and then to make a new package called 'base' that | just re-exports modules from all the new packages. I imagine the first | release would let people use the new base without warnings, a year later | new base would give deprecated warnings, and the following year we'd | remove it. We could do this process slower, but the sooner packages move | off of base, the sooner they benefit from fewer major version bumps. | | The advantages would be: | * the new packages would be easier to maintain than base is | * we could more easily make other improvements we'd like to make, e.g. | we could move the unix and Win32 packages further down the tree | without having to do it in one big leap, and without having to put | them below the whole of base | * if one module causes a major version bump, then only libraries using | that functionality would need to relax their dependencies, rather than | every single package | * some targets (JS, JVM, .NET, etc) or other implementations might want | to do things like IO, concurrency, etc, completely differently. This | way they'd just use a different io/concurrency package, rather than | having to have a different implementation of parts of base | * it would be nice if pure libraries could simply not depend on the io | package etc, and thus clearly do no IO | | The disadvantage is that, at some point between the first release and | the release that removes base, each package will have to have its | dependencies updated. | | | Thanks | Ian | | | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://www.haskell.org/mailman/listinfo/ghc-devs

Hi, Am Mittwoch, den 13.02.2013, 13:58 +0000 schrieb Ian Lynagh:
If we go this route, then we would probably want to end up without a package called 'base', and then to make a new package called 'base' that just re-exports modules from all the new packages.
can you transparently re-export a module from another package? I.e. if base depends on io, IO provides System.IO, is there a way for base to tell ghc to pretend that System.IO is in base, but that there is no conflict if io happens to be un-hidden as well. It seems that something like this would be required to move modules from base to something below it without breaking existing code. Also, if it works that smooth, this would not have to be one big reorganization, but could be done piece by piece.
The disadvantage is that, at some point between the first release and the release that removes base, each package will have to have its dependencies updated.
Why remove base? If it is just a list of dependencies and list of modules to be re-exported, then keeping it (but advocate that it should not be used) should not be too much a burden. (This is assuming that the reorganizing should not change existing module names. If your plan was to give the modules new names, this problem does not exist, but I’d rather prefer the less intrusive approach.) Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Wed, Feb 13, 2013 at 06:28:22PM +0100, Joachim Breitner wrote:
Am Mittwoch, den 13.02.2013, 13:58 +0000 schrieb Ian Lynagh:
If we go this route, then we would probably want to end up without a package called 'base', and then to make a new package called 'base' that just re-exports modules from all the new packages.
can you transparently re-export a module from another package? I.e. if base depends on io, IO provides System.IO, is there a way for base to tell ghc to pretend that System.IO is in base, but that there is no conflict if io happens to be un-hidden as well.
No. But there are currently no packages that depend on both base and io, and anyone adding a dependency on io would remove the base dependency at the same time.
It seems that something like this would be required to move modules from base to something below it without breaking existing code.
I don't see why that's necessary. base would end up containing a load of modules that look something like {-# LANGUAGE PackageImports #-} module System.IO (module X) where import "io" System.IO as X
Also, if it works that smooth, this would not have to be one big reorganization, but could be done piece by piece.
It's tricky to do it piece by piece. It's hard to remove individual sensible pieces in the first place, and it means that you can't subsequently move modules between packages later without breaking code depending on the new packages.
The disadvantage is that, at some point between the first release and the release that removes base, each package will have to have its dependencies updated.
Why remove base? If it is just a list of dependencies and list of modules to be re-exported, then keeping it (but advocate that it should not be used) should not be too much a burden.
* Any package using it doesn't benefit from the reduced version bumps, so we do actually want packages to move away from it * Even though base (probably) wouldn't require a lot of work at any one time, it would require a little work every now and again, and that adds up to a lot of work * Any time a module is added to one of the new packages, either we'd have to spend time adding it to base too, or packages continuing to use base wouldn't (easily) be able to use that new module.
(This is assuming that the reorganizing should not change existing module names. If your plan was to give the modules new names, this problem does not exist, but I’d rather prefer the less intrusive approach.)
The odd module might be renamed, and there will probably be a handful of definitions that move from one module to another, but for the most part I was envisaging that we'd end up with the same modules exporting the same things. Thanks Ian

Hi, I have started a wikipage with the list of all modules from base, for a first round of shuffling, grouping and brainstorming: http://hackage.haskell.org/trac/ghc/wiki/SplitBase Am Mittwoch, den 13.02.2013, 18:09 +0000 schrieb Ian Lynagh:
On Wed, Feb 13, 2013 at 06:28:22PM +0100, Joachim Breitner wrote:
Am Mittwoch, den 13.02.2013, 13:58 +0000 schrieb Ian Lynagh:
If we go this route, then we would probably want to end up without a package called 'base', and then to make a new package called 'base' that just re-exports modules from all the new packages.
can you transparently re-export a module from another package? I.e. if base depends on io, IO provides System.IO, is there a way for base to tell ghc to pretend that System.IO is in base, but that there is no conflict if io happens to be un-hidden as well.
No. But there are currently no packages that depend on both base and io, and anyone adding a dependency on io would remove the base dependency at the same time.
hmm, that reminds me of how haskell98 was handled, and it was slightly annoying when haskell98 and base eventually were made to conflict, and we had to patch some unmaintained packages. Ok, in this case io would be introduced with the intention of being used exclusive from base. So as long as we make sure that the set of modules exported by base is always the union of all modules provided by package that have any module in common with base this would be fine. (Why this condition? Imagine io adding IO.GreatModule without base also providing the module. Then a program that still uses base cannot use IO.GreatModule without fixing the dependencies _now_ (or using package imports everywhere). It would be nice if library authors allowed to do the change whenever convenient.)
Also, if it works that smooth, this would not have to be one big reorganization, but could be done piece by piece.
It's tricky to do it piece by piece. It's hard to remove individual sensible pieces in the first place, and it means that you can't subsequently move modules between packages later without breaking code depending on the new packages.
Agreed.
The disadvantage is that, at some point between the first release and the release that removes base, each package will have to have its dependencies updated.
Why remove base? If it is just a list of dependencies and list of modules to be re-exported, then keeping it (but advocate that it should not be used) should not be too much a burden.
* Any package using it doesn't benefit from the reduced version bumps, so we do actually want packages to move away from it
We want them to do so. We should not force them (most surely will...)
* Even though base (probably) wouldn't require a lot of work at any one time, it would require a little work every now and again, and that adds up to a lot of work
Hopefully it is just updating the set of modules to be exported, sounds like it could be automated, given a list of packages.
* Any time a module is added to one of the new packages, either we'd have to spend time adding it to base too, or packages continuing to use base wouldn't (easily) be able to use that new module.
Hence we should add them; shouldn’t be too much work. After every larger change to base I am forced to touch old, hardly maintained code that I do not know, to get the packages working in Debian again. Hence my plea for staying compatible as long as feasible. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

Somebody signing messages as Joachim Breitner wrote:
I have started a wikipage with the list of all modules from base, for a first round of shuffling, grouping and brainstorming:
Looks like a good start! Here's an idea: why not use the `haskell2010` package as one of the groupings? It seems like this sort of reorganisation could help solve the problem we currently have where one cannot using any of the features of `base` along with the `haskell2010` modules. -- Stephen Paul Weber, @singpolyma See http://singpolyma.net for how I prefer to be contacted edition right joseph

On Wed, Feb 13, 2013 at 4:32 PM, Joachim Breitner
No. But there are currently no packages that depend on both base and io, and anyone adding a dependency on io would remove the base dependency at the same time.
hmm, that reminds me of how haskell98 was handled, and it was slightly annoying when haskell98 and base eventually were made to conflict, and we had to patch some unmaintained packages.
Ok, in this case io would be introduced with the intention of being used exclusive from base. So as long as we make sure that the set of modules exported by base is always the union of all modules provided by package that have any module in common with base this would be fine.
(Why this condition? Imagine io adding IO.GreatModule without base also providing the module. Then a program that still uses base cannot use IO.GreatModule without fixing the dependencies _now_ (or using package imports everywhere). It would be nice if library authors allowed to do the change whenever convenient.)
There should also be a condition stating that base should only re-export modules, and that those re-exports need to have the same name on another package. This condition guarantees that the only thing you need to change is the import list, and even this change could be (at least partially) automated via a tool take took all your imports and decided which new packages export them. -- Felipe.

On Wed, Feb 13, 2013 at 07:32:06PM +0100, Joachim Breitner wrote:
I have started a wikipage with the list of all modules from base, for a first round of shuffling, grouping and brainstorming:
Great, thanks for taking the lead on this!
The disadvantage is that, at some point between the first release and the release that removes base, each package will have to have its dependencies updated.
Why remove base? If it is just a list of dependencies and list of modules to be re-exported, then keeping it (but advocate that it should not be used) should not be too much a burden.
* Any package using it doesn't benefit from the reduced version bumps, so we do actually want packages to move away from it
We want them to do so. We should not force them (most surely will...)
A lot of packages won't react until something actually breaks. (and I suspect many are unmaintained and unused, and won't react even once it does break).
* Even though base (probably) wouldn't require a lot of work at any one time, it would require a little work every now and again, and that adds up to a lot of work
Hopefully it is just updating the set of modules to be exported, sounds like it could be automated, given a list of packages.
* Any time a module is added to one of the new packages, either we'd have to spend time adding it to base too, or packages continuing to use base wouldn't (easily) be able to use that new module.
Hence we should add them; shouldn’t be too much work.
I realised that there's actually no reason that the new 'base' package has to come with GHC (even immediately after the break-up); it can just be a package on Hackage (and, if desired, in the Haskell Platform). So it could easily be maintained by someone else, and thus be not much work for you, and 0 work for me :-) Thanks Ian

Hi, Am Donnerstag, den 14.02.2013, 02:21 +0000 schrieb Ian Lynagh:
On Wed, Feb 13, 2013 at 07:32:06PM +0100, Joachim Breitner wrote:
I have started a wikipage with the list of all modules from base,
for a
first round of shuffling, grouping and brainstorming:
Great, thanks for taking the lead on this!
lets see how far that leads goes... Yesterday, I experimented a bit with base’s code, first beginning with as few modules as possible and adding what’s required; then starting with the whole thing and trying to remove e.g. IO. But clearly it is not easy: 1. Int requires throw DivideByZero; Monad requires error. That pulls in Exceptions. 2. Exceptions require Typeable. 3. Typeable is implemented with GHC.Fingerprint. 4. GHC.Fingerprint is implemented with Foreign and IO. 5. Foreign clearly needs Int and the like. so it is not clear how to pull out a pure base without IO and Foreign. Are there any good tricks how to break these interdependencies? Maybe it is possible to have a pure base without the Monad class and without some of the operations on Int that throw exceptions, but that wold hardly be useable. There are other issues, some avoidable (such as the hard-coded base:Num constraint on literals); I collected a list on http://hackage.haskell.org/trac/ghc/wiki/SplitBase Maybe the proper is to reverse the whole approach: Leave base as it is, and then build re-exporting smaller packages (e.g. a base-pure) on top of it. The advantage is: * No need to rewrite the tightly intertwined base. * Libraries still have the option to have tighter dependencies. * Base can evolve with lots of breaking changes, as long as they do not affect the API by the smaller packages. * Development of this collection can happen outside the GHC tree. * Alternative implementations for (some of) these packages can be created, if the reason why they could not be moved out of base is one of implementation, not of API How does that sound? Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Thu, Feb 14, 2013 at 03:48:51PM +0100, Joachim Breitner wrote:
Yesterday, I experimented a bit with base’s code, first beginning with as few modules as possible and adding what’s required; then starting with the whole thing and trying to remove e.g. IO.
But clearly it is not easy: 1. Int requires throw DivideByZero; Monad requires error. That pulls in Exceptions. 2. Exceptions require Typeable. 3. Typeable is implemented with GHC.Fingerprint. 4. GHC.Fingerprint is implemented with Foreign and IO. 5. Foreign clearly needs Int and the like.
so it is not clear how to pull out a pure base without IO and Foreign. Are there any good tricks how to break these interdependencies?
We'll probably end up with a package (let's say ghc-bottom for now) right at the bottom of the hierarchy which contains a minimal set of definitions for Typeable, throw, etc, in modules like GHC.Typeable, GHC.IO (or, if it's small and the dependencies are circular, might be easier to have it all in a single module). It might be possible to merge this into ghc-prim later, but don't worry about that for now. These definitions would then be re-exported by Data.Typeable, Control.Exception etc higher up. Other libraries would be discouraged from depending on ghc-bottom, either socially or by needing to jump through extra hoops in a .cabal file fo allow it. However, this is the hard end to start from, because the modules right at the "bottom" depend on things much higher up, with cyclic imports. It'll be a lot easier to pull modules off the top instead, as nothing imports them. They also tend to be less magical (e.g. don't have wired-in names). Also, don't worry about necessarily pulling out /all/ the modules you want for a package all at once. Once you have smaller chunks it'll be easier to see what changes are necessary to move modules up/down. You may also need to leave e.g. some of IO lower down than the io package in a GHC.* module, and then to re-export it from io. Once the top has been pulled at as much as possible, it'll be a lot easier to see what's going on with the remainder, and work out what needs to be done to break the biggest loops.
There are other issues, some avoidable (such as the hard-coded base:Num constraint on literals); I collected a list on http://hackage.haskell.org/trac/ghc/wiki/SplitBase
Right, for things with built-in syntax you'll have to update GHC's wired-in names as you go. This will mostly just mean changing the modules, e.g. gHC_ENUM = mkBaseModule (fsLit "GHC.Enum") in compiler/prelude/PrelNames.lhs might become gHC_ENUM = mkGhcPureModule (fsLit "GHC.Enum") with mkGhcPureModule defined analogously to mkBaseModule, but occasionally you might want to move a definition to another module. If you get stuck, just yell.
Maybe the proper is to reverse the whole approach: Leave base as it is, and then build re-exporting smaller packages (e.g. a base-pure) on top of it. The advantage is: * No need to rewrite the tightly intertwined base. * Libraries still have the option to have tighter dependencies. * Base can evolve with lots of breaking changes, as long as they do not affect the API by the smaller packages. * Development of this collection can happen outside the GHC tree. * Alternative implementations for (some of) these packages can be created, if the reason why they could not be moved out of base is one of implementation, not of API
Disadvantages: * No-one would use the new packages unless they come with GHC; e.g. not a perfect analogy, but compare the number of rev-deps according to http://packdeps.haskellers.com/reverse of the various *prelude* packages vs base: 4831 base 6 basic-prelude 8 classy-prelude 4 classy-prelude-conduit 2 custom-prelude 1 general-prelude 1 modular-prelude 17 numeric-prelude 2 prelude-extras * If it comes with GHC, it would mean us permanently maintaining the two levels * base would still be an opaque blob, with too many modules and cyclic imports, which makes development tricky Thanks Ian

Hi, I made a little progress after crippling GHC.Fingerprint: The package at https://github.com/nomeata/packages-base/tree/base-pure (Branch base-pure) builds and contains just these modules: ./Control/Applicative.hs ./Control/Arrow.hs ./Control/Category.hs ./Control/Monad/Fix.hs ./Control/Monad.hs ./Data/Bits.hs ./Data/Bool.hs ./Data/Either.hs ./Data/Eq.hs ./Data/Foldable.hs ./Data/Function.hs ./Data/Functor.hs ./Data/Int.hs ./Data/List.hs ./Data/Maybe.hs ./Data/Monoid.hs ./Data/Ord.hs ./Data/Ratio.hs ./Data/Traversable.hs ./Data/Tuple.hs ./Data/Typeable.hs ./Data/Typeable.hs-boot ./Data/Typeable/Internal.hs ./Data/Typeable/Internal.hs-boot ./dist/build/autogen/Paths_base_pure.hs ./GHC/Base.lhs ./GHC/Char.hs ./GHC/Enum.lhs ./GHC/Err.lhs ./GHC/Err.lhs-boot ./GHC/Exception.lhs ./GHC/Fingerprint.hs ./GHC/Fingerprint/Type.hs ./GHC/Int.hs ./GHC/List.lhs ./GHC/Num.lhs ./GHC/Real.lhs ./GHC/Show.lhs ./GHC/Word.hs ./Prelude.hs (contains just $!) ./Unsafe/Coerce.hs The crippling can be seen here: https://github.com/nomeata/packages-base/blob/base-pure/GHC/Fingerprint.hs So if there were a magic way of getting a working GHC.Fingerprint module without pulling in Foreign, this would be a good start for a base free of any trace of * Foreign * IO * Floating point arithmetic * Read * ST * Array Alternative, a magic way of providing the functions error and divZeroError without having to define Exceptions would do well. I guess it is not possible to define the "data ErrorCall" without the Exception class and somehow call the primop raise# in a way that the error can be caught, as catch will expect something of type "SomeException" that has the Exception dictionary bundled. Any idea how to achieve that, without having resort to ghc-bottom as suggested by Ian (which would be ok, but not as nice as genuine small packages to start with)? Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Thu, Feb 14, 2013 at 8:45 AM, Joachim Breitner
./Control/Applicative.hs ./Control/Arrow.hs ./Control/Category.hs ./Control/Monad/Fix.hs ./Control/Monad.hs ./Data/Bits.hs ./Data/Bool.hs ./Data/Either.hs ./Data/Eq.hs ./Data/Foldable.hs ./Data/Function.hs ./Data/Functor.hs ./Data/Int.hs ./Data/List.hs ./Data/Maybe.hs ./Data/Monoid.hs ./Data/Ord.hs ./Data/Ratio.hs ./Data/Traversable.hs ./Data/Tuple.hs ./Data/Typeable.hs ./Data/Typeable.hs-boot ./Data/Typeable/Internal.hs ./Data/Typeable/Internal.hs-boot ./dist/build/autogen/Paths_base_pure.hs ./GHC/Base.lhs ./GHC/Char.hs ./GHC/Enum.lhs ./GHC/Err.lhs ./GHC/Err.lhs-boot ./GHC/Exception.lhs ./GHC/Fingerprint.hs ./GHC/Fingerprint/Type.hs ./GHC/Int.hs ./GHC/List.lhs ./GHC/Num.lhs ./GHC/Real.lhs ./GHC/Show.lhs ./GHC/Word.hs ./Prelude.hs (contains just $!) ./Unsafe/Coerce.hs
That's great. I'm curious I was under the impression that it was hard to split out a pure subset as functions might call 'error' (e.g. due to incomplete pattern matches) and that would pull in the whole I/O subsystem. How did you avoid that? -- Johan

Hi, Am Donnerstag, den 14.02.2013, 13:19 -0800 schrieb Johan Tibell:
That's great. I'm curious I was under the impression that it was hard to split out a pure subset as functions might call 'error' (e.g. due to incomplete pattern matches) and that would pull in the whole I/O subsystem. How did you avoid that?
as mentioned before: By crippling GHC.Fingerprint. error foo just calls raise (ErrorCall foo), which calls "raise# (SomeException (ErrorCall foo)". The problem is that the definition of SomeException pulls in the Exception type class, which pulls in Typeable, which pulls in GHC.Fingerprint, which uses FFI and IO code to to fingerprinting... Looking at the code it seems that the FFI is only required for MD5. Maybe a pure implementation there is worth it, if it is not performance-critical code. Or, another work-around, would be primops for these commands that can be used without the FFI types and IO. I have also removed GHC.Unicode and functions like lines that require the FFI calls there. These seem to be the largest open issues. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Thu, Feb 14, 2013 at 2:53 PM, Joachim Breitner
Hi,
Am Donnerstag, den 14.02.2013, 13:19 -0800 schrieb Johan Tibell:
That's great. I'm curious I was under the impression that it was hard to split out a pure subset as functions might call 'error' (e.g. due to incomplete pattern matches) and that would pull in the whole I/O subsystem. How did you avoid that?
as mentioned before: By crippling GHC.Fingerprint. error foo just calls raise (ErrorCall foo), which calls "raise# (SomeException (ErrorCall foo)". The problem is that the definition of SomeException pulls in the Exception type class, which pulls in Typeable, which pulls in GHC.Fingerprint, which uses FFI and IO code to to fingerprinting...
I don't think having FFI far down the stack is a problem. There are lots of pure data types we'd like in the "pure data" layer (e.g. bytestring) that uses FFI. As long as the I/O layer itself (System.IO, the I/O manager, etc) doesn't get pulled in there's no real problem in depending on the FFI.

On Thursday, February 14, 2013 at 8:14 PM, Johan Tibell wrote:
On Thu, Feb 14, 2013 at 2:53 PM, Joachim Breitner
wrote: I don't think having FFI far down the stack is a problem. There are lots of pure data types we'd like in the "pure data" layer (e.g. bytestring) that uses FFI. As long as the I/O layer itself (System.IO, the I/O manager, etc) doesn't get pulled in there's no real problem in depending on the FFI. Doesn't the FFI pull in some part of the I/O layer, though? In particular threaded programs are going to end up using forkOS?
-- brandon s allbery kf8nh Sent with Sparrow (http://www.sparrowmailapp.com/?sig)

Hi, Am Donnerstag, den 14.02.2013, 21:41 -0500 schrieb brandon s allbery kf8nh:
On Thursday, February 14, 2013 at 8:14 PM, Johan Tibell wrote:
On Thu, Feb 14, 2013 at 2:53 PM, Joachim Breitner
wrote: I don't think having FFI far down the stack is a problem. There are lots of pure data types we'd like in the "pure data" layer (e.g. bytestring) that uses FFI. As long as the I/O layer itself (System.IO, the I/O manager, etc) doesn't get pulled in there's no real problem in depending on the FFI.
I think it would be nice, also to other Haskell implementations that might have not have FFI, to separate the really basic stuff from pure-but-impurely-implemented stuff. At least as long as it does not caues trouble. GHC.Fingerprint does not need to be crippled when it is going to use a pure hashing; I quickly added some simple fingerpriting found via Wikipedia that was easier than MD5. https://github.com/nomeata/packages-base/commit/b7f80066a03fd296950e0cafa227... The choice is of course not final, maybe something with more bits is desirable.
Doesn't the FFI pull in some part of the I/O layer, though? In particular threaded programs are going to end up using forkOS?
Another good reason to try to have a pure ground library. Based on base-pure, the next step would be to check if FFI can be provided without IO (but I doubt that is difficult), so there would be an base-io package on top of base-pure, and then bytestring can depend on that base-io and base-pure, while users of bytestring of course don’t have to depend on base-io (as long as they are not using the IO-related functions of bytestring). Questions: * Does anyone have a tool to compare package APIs? It would be interesting to diff base’s API with the combined APIs of the package we are creating right now. * Currently, base-pure exports lots of modules that should not be part of its “pure” API (all the GHC.* packages). But I assume they need to be exported for the benefit of packages like base-io built on top. Should we provide another package that re-exports those that are for public consumption and is likely to have a more stable API? Again I feel the need for packages re-exporting modules without incurring a conflict. * What to do with Prelude. Should it be in base-io (which is potentially the lowest package containing everything needed) or rather in a package of its own? Or should it live in a package of its own? Or can we use the haskell2010 package for that somehow? * Should base-io provide just the IO monad and all, say, file-related stuff in a separate package or is this going too far? Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

| > Doesn't the FFI pull in some part of the I/O layer, though? In | > particular threaded programs are going to end up using forkOS? | | Another good reason to try to have a pure ground library. Remember that we have UNSAFE ffi calls and SAFE ones. The SAFE ones may block, cause GC etc. They involve a lot of jiggery pokery and I would not be surprised if that affected the I/O manager. But UNSAFE ones are, by design, no more than "fat machine instructions" that are implemented by taking an out-of-line call. They should not block. They should not cause GC. Nothing. Think of 'sin' and 'cos' for example. Fingerprinting is a classic example, I would have thought. So my guess is that it should not be hard to allow UNSAFE ffi calls in the core (non-IO-ish) bits, leaving SAFE calls for higher up the stack. Simon

On 15/02/13 09:36, Simon Peyton-Jones wrote:
| > Doesn't the FFI pull in some part of the I/O layer, though? In | > particular threaded programs are going to end up using forkOS? | | Another good reason to try to have a pure ground library.
Remember that we have UNSAFE ffi calls and SAFE ones.
The SAFE ones may block, cause GC etc. They involve a lot of jiggery pokery and I would not be surprised if that affected the I/O manager.
But UNSAFE ones are, by design, no more than "fat machine instructions" that are implemented by taking an out-of-line call. They should not block. They should not cause GC. Nothing. Think of 'sin' and 'cos' for example.
Fingerprinting is a classic example, I would have thought.
So my guess is that it should not be hard to allow UNSAFE ffi calls in the core (non-IO-ish) bits, leaving SAFE calls for higher up the stack.
Actually as far as the Haskell-level API goes, there's no difference between safe and unsafe FFI calls, the difference is all in the codegen. I don't think safe calls cause any more difficulties for splitting up the base. Cheers, Simon

Hi, more progress: On top of base-pure, I created base-io involving GHC/IO and everything required to build it (which pulled in ST, some of Foreign and unfortunately some stuff related to Handles and Devices, because it is mentioned in IOException). This is the list of modules: Foreign.C.Types, Foreign.ForeignPtr, Foreign.ForeignPtr.Imp, Foreign.ForeignPtr.Safe, Foreign.ForeignPtr.Unsafe, Foreign.Ptr, Foreign.Storable, GHC.ForeignPtr, GHC.IO.BufferedIO, GHC.IO.Buffer, GHC.IO.Device, GHC.IO.Encoding.Types, GHC.IO.Exception, GHC.IO.Handle.Types, GHC.IO, GHC.IORef, GHC.MVar, GHC.Ptr, GHC.Stable, GHC.ST, GHC.Storable, GHC.STRef It is on a different branch on my github repo: https://github.com/nomeata/packages-base/tree/base-io GHC would complain that the CInt type is not valid in a ffi call (probably due to the different package name), so I replaced foreign declarations by regular ones defined using “undefined” – after all I’m just trying to discover how things can be split at all and just work towards building stuff. ST can probably be pulled below this package, after all it is quite pure. Either a package of its own, or in base-pure. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On 15/02/13 12:22, Joachim Breitner wrote:
Hi,
more progress: On top of base-pure, I created base-io involving GHC/IO and everything required to build it (which pulled in ST, some of Foreign and unfortunately some stuff related to Handles and Devices, because it is mentioned in IOException). This is the list of modules:
Foreign.C.Types, Foreign.ForeignPtr, Foreign.ForeignPtr.Imp, Foreign.ForeignPtr.Safe, Foreign.ForeignPtr.Unsafe, Foreign.Ptr, Foreign.Storable, GHC.ForeignPtr, GHC.IO.BufferedIO, GHC.IO.Buffer, GHC.IO.Device, GHC.IO.Encoding.Types, GHC.IO.Exception, GHC.IO.Handle.Types, GHC.IO, GHC.IORef, GHC.MVar, GHC.Ptr, GHC.Stable, GHC.ST, GHC.Storable, GHC.STRef
You have a random collection of modules here :) I think you want to have the IO *monad* (GHC.IO) live in a lower layer, separate from the IO *library* (GHC.IO.Device and so on). Every Haskell implementation will need the IO monad, but they might want to replace the IO library with something else. Things like GHC.IORef, GHC.MVar can all live in a low-down layer because they're just wrappers over the primops. Cheers, Simon

Hi, Am Freitag, den 15.02.2013, 14:50 +0000 schrieb Simon Marlow:
On 15/02/13 12:22, Joachim Breitner wrote:
Hi,
more progress: On top of base-pure, I created base-io involving GHC/IO and everything required to build it (which pulled in ST, some of Foreign and unfortunately some stuff related to Handles and Devices, because it is mentioned in IOException). This is the list of modules:
You have a random collection of modules here :)
I think you want to have the IO *monad* (GHC.IO) live in a lower layer, separate from the IO *library* (GHC.IO.Device and so on). Every Haskell implementation will need the IO monad, but they might want to replace the IO library with something else.
Things like GHC.IORef, GHC.MVar can all live in a low-down layer because they're just wrappers over the primops.
Right, that is my aim, and I started with GHC.IO. But unfortunately, the IO monad calls failIO, which is an IOError which has a field of type ioe_handle :: Maybe Handle (and one of type CInt) which pulls in all the rest there, and so far I did not have a good idea how to untangle that. What would break if fail would not raise an IOError, but a separate exception type, e.g. IOFailError? Probably too much, as users expect to catch the exception raised by fail with an exception handler that matches IOError. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

Hi, Am Freitag, den 15.02.2013, 23:00 +0100 schrieb Joachim Breitner:
Am Freitag, den 15.02.2013, 14:50 +0000 schrieb Simon Marlow:
On 15/02/13 12:22, Joachim Breitner wrote:
more progress: On top of base-pure, I created base-io involving GHC/IO and everything required to build it (which pulled in ST, some of Foreign and unfortunately some stuff related to Handles and Devices, because it is mentioned in IOException). This is the list of modules:
You have a random collection of modules here :)
I think you want to have the IO *monad* (GHC.IO) live in a lower layer, separate from the IO *library* (GHC.IO.Device and so on). Every Haskell implementation will need the IO monad, but they might want to replace the IO library with something else.
Things like GHC.IORef, GHC.MVar can all live in a low-down layer because they're just wrappers over the primops.
Right, that is my aim, and I started with GHC.IO. But unfortunately, the IO monad calls failIO, which is an IOError which has a field of type ioe_handle :: Maybe Handle (and one of type CInt) which pulls in all the rest there, and so far I did not have a good idea how to untangle that.
What would break if fail would not raise an IOError, but a separate exception type, e.g. IOFailError? Probably too much, as users expect to catch the exception raised by fail with an exception handler that matches IOError.
I’m still stuck at the problem of separating the definition of IO and Monad IO from all file related stuff, which is prevented by the "Maybe Handle" field in the IOError data type. Given that IOError is abstract to the „regular“ user (i.e. not to base-io-file), I guess we can afford to be a little bit hacky here. Two ideas come to my mind: 1. Instead of declaring the field as "Maybe Handle", we define a pseudo-handle datatype data NotYetAHandle = NotYetAHandle and use "Maybe NotYetAHandle" in IOError, with the documented convention that only code in base-io-file (and code further down the tree) may use this field, and only after unsafeCoerce’ing it to a Maybe Handle. This way, the base-io package does not have to include the definition, but the IOError data type still has place for it. If the NotYetAHandle constructor is not exported, and base-io-file defines functions NotYetAHandle -> Handle and Handle -> NotYetAHandle via unsafeCoerce, then the unsafeness is very local and nobody can break the code without also using unsafeCoerce. 2. A little safer (and with a little more overhead) wold be to include Data.Dynamic in base and change the field to a "Maybe Dynamic". Same procedure as above, only that a violation of the convention might be caught without crashes. Is having a package that provides io without providing file-related definition worth this kludge? Greetings, Joachimh -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

Hi, Am Mittwoch, den 20.02.2013, 14:57 +0100 schrieb Joachim Breitner:
I’m still stuck at the problem of separating the definition of IO and Monad IO from all file related stuff, which is prevented by the "Maybe Handle" field in the IOError data type.
re-reading „An Extensible Dynamically-Typed Hierarchy of Exceptions“ helped me to come up with this somewhat neat solution: The Monad IO instance uses an exception different from IOError: $ git show HEAD | filterdiff -i \*.cabal -i \*Fail\* -i \*/GHC/IO.hs --- a/GHC/IO.hs +++ b/GHC/IO.hs @@ -46,8 +46,7 @@ import GHC.ST import GHC.Exception import GHC.Show import Data.Maybe - -import {-# SOURCE #-} GHC.IO.Exception ( userError ) +import GHC.IO.Fail -- --------------------------------------------------------------------------- -- The IO Monad @@ -79,7 +78,7 @@ liftIO :: IO a -> State# RealWorld -> STret RealWorld a liftIO (IO m) = \s -> case m s of (# s', r #) -> STret s' r failIO :: String -> IO a -failIO s = IO (raiseIO# (toException (userError s))) +failIO s = IO (raiseIO# (toException (IOFail s))) -- --------------------------------------------------------------------------- -- Coercions between IO and ST --- /dev/null +++ b/GHC/IO/Fail.hs @@ -0,0 +1,20 @@ +{-# LANGUAGE NoImplicitPrelude #-} +module GHC.IO.Fail where + +import GHC.Base +import GHC.Exception +import Data.Typeable +import GHC.Show + + +-- | This exception is thrown by the 'fail' method of the 'Monad' 'IO' instance. +-- +-- The Exception instance of IOException will also catch this, converting the +-- IOFail to a UserError, for compatibility and consistency with the Haskell +-- report +data IOFail = IOFail String + +instance Typeable IOFail -- deriving does not work without package +instance Show IOFail -- name changes to GHC +instance Exception IOFail + After this change, exposed-modules: GHC.IO.Fail, GHC.IO, GHC.IORef, GHC.ST, GHC.STRef is possible (and of course ST can be moved away as well). So far so good, but this breaks user code. So the solution is to make sure that to everyone who tries to catch an IOException (which will likely be part of some base-io-file), an IOFail will look like a IOError of type UserError: $ git show HEAD|filterdiff -i \*Exception.hs --- a/GHC/IO/Exception.hs +++ b/GHC/IO/Exception.hs @@ -45,9 +45,10 @@ import GHC.Show import GHC.Exception import Data.Maybe import GHC.IO.Handle.Types +import GHC.IO.Fail import Foreign.C.Types -import Data.Typeable ( Typeable ) +import Data.Typeable ( Typeable, cast ) -- ------------------------------------------------------------------------ -- Exception datatypes and operations @@ -222,7 +223,11 @@ data IOException } instance Typeable IOException -instance Exception IOException +instance Exception IOException where + toException = SomeException + fromException e = case cast e of + Just (IOFail s) -> Just (userError s) + Nothing -> cast e instance Eq IOException where (IOError h1 e1 loc1 str1 en1 fn1) == (IOError h2 e2 loc2 str2 en2 fn2) = Neat, isn’t it? Now I can proceed separating some of the Foreign stuff from the IO stuff. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On 20/02/13 15:40, Joachim Breitner wrote:
+-- | This exception is thrown by the 'fail' method of the 'Monad' 'IO' instance. +-- +-- The Exception instance of IOException will also catch this, converting the +-- IOFail to a UserError, for compatibility and consistency with the Haskell +-- report +data IOFail = IOFail String + +instance Typeable IOFail -- deriving does not work without package +instance Show IOFail -- name changes to GHC +instance Exception IOFail +
I like the idea of making IOFail a separate exception type.
-instance Exception IOException +instance Exception IOException where + toException = SomeException + fromException e = case cast e of + Just (IOFail s) -> Just (userError s) + Nothing -> cast e
I think that should be
+ fromException (SomeException e) = case cast e of + Just (IOFail s) -> Just (userError s) + Nothing -> cast e
Otherwise it will typecheck but not work (hurrah for dynamic typing). The trick is indeed neat, but only if it is possible to make IOFail completely invisible. If it isn't possible to make it completely invisible, then I would prefer IOFail to be a first-class exception type without the special trick to coerce it to IOException, and accept the API breakage. I don't think it's a good idea to have special magic in the exception hierarchy - other people would start doing it too, then we'd have a mess. Cheers, Simon

Hi, Am Donnerstag, den 21.02.2013, 09:42 +0000 schrieb Simon Marlow:
The trick is indeed neat, but only if it is possible to make IOFail completely invisible. If it isn't possible to make it completely invisible, then I would prefer IOFail to be a first-class exception type without the special trick to coerce it to IOException, and accept the API breakage. I don't think it's a good idea to have special magic in the exception hierarchy - other people would start doing it too, then we'd have a mess.
great – for the purposes of splitting base I don’t care which one is taken, as long as I know that there is some way. I have now changed my setup so that I can modify the files without removing others or moving them to different branches or symlinking. This way, a simple "git diff ghc-7.6 base-split" lists all changes. I also created a script to check whether every module is in exactly one of the new packages. See https://github.com/nomeata/packages-base/blob/base-split/README.md for a description of the setup and an explanation of the changes applied. Contributions welcome! Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

Hi, another status report about my base-splitting experiment: The list of packages has grown to * base-pure: Basic stuff without `IO`, `Foreign` or floating point arithmetic. * base-st: The `ST` monad, uses base-pure. * base-array: Arrays, uses base-st. * base-float: Floating point types, uses base-array internally. * base-io: The `IO` monad, uses base-st. * base-concurrent: Currently just `MVar`, uses base-io. * base-foreign: Everything related to `Foreign`, uses base-io and base-float (for the `Storable` instances for `Double` and `Float`). Of the (exactly!) 200 modules in base, 83 have not yet been sorted into these or other packages: Control/Concurrent.hs Control/Concurrent/Chan.hs Control/Concurrent/QSem.hs Control/Concurrent/QSemN.hs Control/Concurrent/SampleVar.hs Control/Exception.hs Control/Exception/Base.hs Control/Monad/Instances.hs Data/Data.hs Data/Dynamic.hs Data/Fixed.hs Data/HashTable.hs Data/Unique.hs Data/Version.hs Debug/Trace.hs GHC/Conc.lhs GHC/Conc/IO.hs GHC/Conc/Signal.hs GHC/Conc/Sync.lhs GHC/Conc/Windows.hs GHC/ConsoleHandler.hs GHC/Constants.hs GHC/Desugar.hs GHC/Environment.hs GHC/Event.hs GHC/Event/Array.hs GHC/Event/Clock.hsc GHC/Event/Control.hs GHC/Event/EPoll.hsc GHC/Event/IntMap.hs GHC/Event/Internal.hs GHC/Event/KQueue.hsc GHC/Event/Manager.hs GHC/Event/PSQ.hs GHC/Event/Poll.hsc GHC/Event/Thread.hs GHC/Event/Unique.hs GHC/Exts.hs GHC/GHCi.hs GHC/Generics.hs GHC/Handle.hs GHC/IO.hs-boot GHC/IO/Device.hs GHC/IO/Exception.hs GHC/IO/Exception.hs-boot GHC/IO/FD.hs GHC/IO/Handle.hs GHC/IO/Handle.hs-boot GHC/IO/Handle/FD.hs GHC/IO/Handle/FD.hs-boot GHC/IO/Handle/Internals.hs GHC/IO/Handle/Text.hs GHC/IO/Handle/Types.hs GHC/IO/IOMode.hs GHC/IOArray.hs GHC/IOBase.hs GHC/IP.hs GHC/PArr.hs GHC/Pack.lhs GHC/Stack.hsc GHC/Stats.hsc GHC/TopHandler.lhs GHC/TypeLits.hs GHC/Windows.hs NHC/PosixTypes.hsc NHC/SizedTypes.hs Numeric.hs Prelude.hs System/CPUTime.hsc System/Console/GetOpt.hs System/Environment.hs System/Environment/ExecutablePath.hsc System/Exit.hs System/IO.hs System/IO/Error.hs System/Info.hs System/Mem.hs System/Mem/StableName.hs System/Mem/Weak.hs System/Posix/Types.hs System/Timeout.hs Text/Printf.hs Inspired by a similar graph by Herbert Valerio Riedel, I tried to visualize the current state and came up with this: https://github.com/nomeata/packages-base/blob/base-split/graph.pdf?raw=true It is not completely accurate due to Prelude not included in -ddump-minimal-imports (but that shouldn’t matter as most interesting functions of the Prelude are in base-pure). The script to generate the dot file from *.imports is included in the branch at https://github.com/nomeata/packages-base/tree/base-split Next I’d need to see how entangled the system-close stuff is (File IO, concurrency, GHC.Event.*). Of course with too much splitting one runs in the Bane of the Orphaned Instances – neither should base-foreign require base-float nor the other way around, but "Storable Double" needs to be define somewhere... And the same question will arise if Data.Date should go to a package of its own. Also, I notice that there is an issue with “internal” modules (mostly GHC.something) that should not be part of some stable API, but needed to implement packages further down. Should they just not be considered part of the “public” (and PVP-relevant) API? Or should there be two packages, e.g. base-pure-internal and base-pure, where the latter re-exports those modules that are meant for public consumption? So, what is the general opinion? Is this a way worth pursuing? Or are we fine with the huge base and I can do different things again ;-)? Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

Hi Joachim. Glad to see you're making progress on this. Once we're done exploring how fine-grained we can make the division we might want to pull back a bit and consider what logical groupings makes sense. For example, even if the float functionality can be split from the int functionality, I don't think that makes for a very logical grouping. In addition, I don't think we want to say that e.g. pure data structures can't depend on the FFI. While their current implementation might not use the FFI, what if we want to use it in the future. We'd have to reshuffle the packages again. Just my 2 cents. -- Johan

Hi, Am Freitag, den 22.02.2013, 11:38 -0800 schrieb Johan Tibell:
In addition, I don't think we want to say that e.g. pure data structures can't depend on the FFI. While their current implementation might not use the FFI, what if we want to use it in the future. We'd have to reshuffle the packages again.
right, there is a tension between having just independent APIs and having also independent implementations. My main goal is to allow packages to specify their imports more precisely, to require less changes as not-so-common stuff in base evolves and to make it easier for alternative compiler/targets to implement parts of base; this would just require providing better grouped APIs. But if we want that while retaining the freedom to have an entangled implementation, we are back at the "large base + specific re-exporting packages" approach, which wasn’t particularly well received here. Greetings, Joachim PS: Even with the currently explored split stuff in base-pure can use the FFI; it could just not use the modules from the Foreign.* structure. This may or may not be a problem. It was for the GHC.Fingeprint implementation, as it was marshalling arrays. -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Fri, Feb 22, 2013 at 3:34 PM, Joachim Breitner
right, there is a tension between having just independent APIs and having also independent implementations. My main goal is to allow packages to specify their imports more precisely, to require less changes as not-so-common stuff in base evolves and to make it easier for alternative compiler/targets to implement parts of base; this would just require providing better grouped APIs.
But if we want that while retaining the freedom to have an entangled implementation, we are back at the "large base + specific re-exporting packages" approach, which wasn’t particularly well received here.
I don't know about entangled implementations. But I'd like to have a base package (e.g. your base-pure) that has a consistent set of basic data types e.g. Int, Word, Float, Double, Char, String, ByteString, Text, [a], Maybe, Either, and so forth. These are logically at the same layer. I think splitting them according to how they happen to be implemented at the moment is a misstake. It would give us a illogical and unstable layering in the long run.

On Fri, Feb 22, 2013 at 11:38:04AM -0800, Johan Tibell wrote:
Glad to see you're making progress on this. Once we're done exploring how fine-grained we can make the division we might want to pull back a bit
I definitely agree with "Once we're done". Once we have made all the splits we might want to make, it'll be a lot easier to see the big picture and merge packages we decide should be merged. It's a lot harder to work the other way, as it's tricky to see what is or isn't possible.
In addition, I don't think we want to say that e.g. pure data structures can't depend on the FFI. While their current implementation might not use the FFI, what if we want to use it in the future. We'd have to reshuffle the packages again.
I think the issue is what a package exports, rather than what it depends on. Changing the package dependencies later won't affect users of the package. Thanks Ian

It's great work, thanks Joachim. I'd like to be very clear about goals, though. I have not been following this thread closely enough, but is there a Wiki page that explains what the goals of the base-package break-up is? I believe that the driving goal is: * to allow changes to internals without forcing a version-bump on 'base', on which every package depends But that goal needs a bit of unpacking. Suppose we divided base into six, base1, base2, base3, etc, but each was a vertical silo and every other package depended on all six. Then nothing would be gained; bumping any of them would cause a ripple of bumps down the line. Alternatively, suppose we split it into just two: 'base' and 'base-internal', where the former is just a wrapper around the latter, but presenting a stable API. Now you'd get the stability you want. I'm sure this is a vastly simplistic analysis, and those of you who have been thinking hard about this can say it more precisely than I, based on practical experience. But I would find it very helpful to have a crystal-clear articulation of the actual goals, with concrete examples, to help orient the debate, and guide choices. I assume that a non-goal is * split base into as many packages as possible. though a superficial reading of the thread might suggest just that. Indeed other things being equal, a goal should be * split base into as FEW packages as possible, consistent with meeting the other goals As Johan points out, a split now could paint us into a corner later, so we should not gratuitously split things up. Many thanks for working on this. Simon From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-bounces@haskell.org] On Behalf Of Johan Tibell Sent: 22 February 2013 19:38 To: Joachim Breitner Cc: glasgow-haskell-users@haskell.org Subject: Re: base package Hi Joachim. Glad to see you're making progress on this. Once we're done exploring how fine-grained we can make the division we might want to pull back a bit and consider what logical groupings makes sense. For example, even if the float functionality can be split from the int functionality, I don't think that makes for a very logical grouping. In addition, I don't think we want to say that e.g. pure data structures can't depend on the FFI. While their current implementation might not use the FFI, what if we want to use it in the future. We'd have to reshuffle the packages again. Just my 2 cents. -- Johan

* Simon Peyton-Jones
I believe that the driving goal is:
* to allow changes to internals without forcing a version-bump on 'base', on which every package depends
This is a good goal. Another goal could be to make the packages more meaningful. I think this goal is good by itself (judging from common engineering principles), and also would make it easier to share the code between different implementations. (Right now JHC has its own version of many base modules, because it cannot reuse base.) What is common between the following modules? Control.Applicative Control.Concurrent Data.Foldable Data.STRef Foreign.C GHC.Exts System.Environment Text.ParserCombinators.ReadP If we ignore for a second historical incidents and tricky inter-dependencies, I think we'd all agree that by today's standard they all logically belong to different packages. In other words, if they didn't exist today, and a package came out that contained two or more of the above modules, it would look really strange. Roman

Hi, Am Samstag, den 23.02.2013, 10:27 +0000 schrieb Simon Peyton-Jones:
I’d like to be very clear about goals, though. I have not been following this thread closely enough, but is there a Wiki page that explains what the goals of the base-package break-up is?
I added a Goals section to http://hackage.haskell.org/trac/ghc/wiki/SplitBase and besides
I believe that the driving goal is:
· to allow changes to internals without forcing a version-bump on ‘base’, on which every package depends
added these two goals, which I find worthwhile to pursue: To allow packages to be explictly about what they need A library that does not use the IO monad could communicate that just by not depending on some base-io package. Similar with the Foreign Function Interface or unsafe operations. To allow alternative implementations/targets A Haskell-to-Javascript compiler will not support File IO, or maybe not even IO at all. It would be desirable such an implementation has a chance to at least provide a complete and API compatible base-pure package, and that one can hence reasonably assume that packages and libraries depending only on base-pure will indeed work without modification. This might be subsumed by fulfilling the previous goal. Just by looking at the goals, the variant with a big base package that uses all kinds of “uglyness” (FFI for pure stuff, cyclic dependencies between API-wise unrelated stuff, lots of GHC internals used) and re-exporting packages that have a more specific and possibly more stable API sounds like it can provide the mentioned goals. Iain commented this idea earlier this thread¹ with three points:
* No-one would use the new packages unless they come with GHC; e.g. not a perfect analogy, but compare the number of rev-deps according to http://packdeps.haskellers.com/reverse of the various *prelude* packages vs base: 4831 base 6 basic-prelude 8 classy-prelude 4 classy-prelude-conduit 2 custom-prelude 1 general-prelude 1 modular-prelude 17 numeric-prelude 2 prelude-extras
Hopefully the problem here (often-changing base) is big enough and the alternative (more purpose-oriented and more stable) packages are attractive enough to make people use the new set.
* If it comes with GHC, it would mean us permanently maintaining the two levels
True. I’m not convinced that it will be too much a burden, at least if the reexporting packages do that on the module level.
* base would still be an opaque blob, with too many modules and cyclic imports, which makes development tricky
Does it really make development tricky? I’d rather expect it to make development easier, because you _can_ mix, say, IO and Foreign stuff easily and even use some of that in lower levels. If it were less tricky to separate it, then my experiment at https://github.com/nomeata/packages-base/tree/base-split would have been less work... In any case there is still the problem: What and where is the Prelude... but maybe let’s postpone this. Greetings, Joachim ¹ http://www.haskell.org/pipermail/glasgow-haskell-users/2013-February/023774.... -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

| I added a Goals section to | http://hackage.haskell.org/trac/ghc/wiki/SplitBase Thanks. But the first goal, which is the dominant one, is very unclear to me as my comments mentioned. A description of what the problem is, and why a simple "API wrapper" approach would not solve it, would be useful. SImon | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell- | users-bounces@haskell.org] On Behalf Of Joachim Breitner | Sent: 25 February 2013 13:32 | To: glasgow-haskell-users@haskell.org | Subject: Re: base package -- goals | | Hi, | | Am Samstag, den 23.02.2013, 10:27 +0000 schrieb Simon Peyton-Jones: | > I’d like to be very clear about goals, though. I have not been | > following this thread closely enough, but is there a Wiki page that | > explains what the goals of the base-package break-up is? | | I added a Goals section to | http://hackage.haskell.org/trac/ghc/wiki/SplitBase | | and besides | | > I believe that the driving goal is: | > | > · to allow changes to internals without forcing a version-bump | > on ‘base’, on which every package depends | | added these two goals, which I find worthwhile to pursue: | | To allow packages to be explictly about what they need | | A library that does not use the IO monad could communicate that | just by not depending on some base-io package. Similar with the | Foreign Function Interface or unsafe operations. | | To allow alternative implementations/targets | | A Haskell-to-Javascript compiler will not support File IO, or | maybe not even IO at all. It would be desirable such an | implementation has a chance to at least provide a complete and | API compatible base-pure package, and that one can hence | reasonably assume that packages and libraries depending only on | base-pure will indeed work without modification. This might be | subsumed by fulfilling the previous goal. | | | Just by looking at the goals, the variant with a big base package that | uses all kinds of “uglyness” (FFI for pure stuff, cyclic dependencies | between API-wise unrelated stuff, lots of GHC internals used) and re- | exporting packages that have a more specific and possibly more stable | API sounds like it can provide the mentioned goals. | | Iain commented this idea earlier this thread¹ with three points: | | > * No-one would use the new packages unless they come with GHC; | > e.g. not a perfect analogy, but compare the number of rev-deps | > according to http://packdeps.haskellers.com/reverse of the various | > *prelude* packages vs base: | > 4831 base | > 6 basic-prelude | > 8 classy-prelude | > 4 classy-prelude-conduit | > 2 custom-prelude | > 1 general-prelude | > 1 modular-prelude | > 17 numeric-prelude | > 2 prelude-extras | | Hopefully the problem here (often-changing base) is big enough and the | alternative (more purpose-oriented and more stable) packages are | attractive enough to make people use the new set. | | > * If it comes with GHC, it would mean us permanently maintaining the | two | > levels | | True. I’m not convinced that it will be too much a burden, at least if | the reexporting packages do that on the module level. | | > * base would still be an opaque blob, with too many modules and cyclic | > imports, which makes development tricky | | Does it really make development tricky? I’d rather expect it to make | development easier, because you _can_ mix, say, IO and Foreign stuff | easily and even use some of that in lower levels. If it were less tricky | to separate it, then my experiment at | https://github.com/nomeata/packages-base/tree/base-split would have been | less work... | | | In any case there is still the problem: What and where is the Prelude... | but maybe let’s postpone this. | | Greetings, | Joachim | | ¹ http://www.haskell.org/pipermail/glasgow-haskell-users/2013- | February/023774.html | | | -- | Joachim "nomeata" Breitner | Debian Developer | nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C | JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Mon, Feb 25, 2013 at 02:25:03PM +0000, Simon Peyton-Jones wrote:
| I added a Goals section to | http://hackage.haskell.org/trac/ghc/wiki/SplitBase
Thanks. But the first goal, which is the dominant one, is very unclear to me as my comments mentioned. A description of what the problem is, and why a simple "API wrapper" approach would not solve it, would be useful.
On the wiki page you say: SPJ: But that goal needs a bit of unpacking. Suppose we divided base into six, base1, base2, base3, etc, but each was a vertical silo and every other package depended on all six. Then nothing would be gained; bumping any of them would cause a ripple of bumps down the line. but even if we did just divide base up into vertical silos then I don't think most packages would depend on them all; for example, most packages would probably not depend on file-io or concurrency. But in any case, I'd hope we would also make some horizontal cuts, and I expect very few packages would need to depend on ghc-io-manager etc. Thanks Ian

On Mon, Feb 25, 2013 at 02:31:56PM +0100, Joachim Breitner wrote:
Hopefully the problem here (often-changing base) is big enough and the alternative (more purpose-oriented and more stable) packages are attractive enough to make people use the new set.
I'm pretty confident that most packages won't do more than the minimal base bumping while importing base continues to work.
* base would still be an opaque blob, with too many modules and cyclic imports, which makes development tricky
Does it really make development tricky? I’d rather expect it to make development easier, because you _can_ mix, say, IO and Foreign stuff easily and even use some of that in lower levels. If it were less tricky to separate it, then my experiment at https://github.com/nomeata/packages-base/tree/base-split would have been less work...
It's tricky to make changes to the core modules, because that generally requires changing imports, and it's hard to see how to actually do that without making an import loop (or without making more import loops than are necessary). In general there's actually a fair amount of flexibility in which way module imports go (e.g. you can move a "Cl Ty" instance from the Cl module to the Ty module or vice-versa), but in base it's hard to see how to structure things best: there are approaching 200 modules, half of which are tied up in a recursive knot, with 13 hs-boot modules (2 of which import other hs-boot modules).
In any case there is still the problem: What and where is the Prelude... but maybe let’s postpone this.
I'd put it in its own package for now, and then look at whether/what it should be merged with later. I'm in 2 minds about it. On the one hand, I'm sure that lots of people won't like a single-module package that essentially everything depends on. But on the other hand, Prelude is both magic and broad, so it would make some sense to have it in its own package. Thanks Ian

Somebody claiming to be Ian Lynagh wrote:
On Mon, Feb 25, 2013 at 02:31:56PM +0100, Joachim Breitner wrote:
In any case there is still the problem: What and where is the Prelude... but maybe let’s postpone this.
I'd put it in its own package for now, and then look at whether/what it should be merged with later.
Why shouldn't Prelude (and other really stable, standard modules) just live in the `haskell2010` package? -- Stephen Paul Weber, @singpolyma See http://singpolyma.net for how I prefer to be contacted edition right joseph

Somebody claiming to be Roman Cheplyaka wrote:
* Stephen Paul Weber
[2013-02-25 11:29:42-0500] Why shouldn't Prelude (and other really stable, standard modules) just live in the `haskell2010` package?
Because then we can't make changes to it without producing a new language standard.
That sounds like a good thing. Very in line with the goal of making stable modules more stable. -- Stephen Paul Weber, @singpolyma See http://singpolyma.net for how I prefer to be contacted edition right joseph

On Mon, Feb 25, 2013 at 11:29:42AM -0500, Stephen Paul Weber wrote:
Somebody claiming to be Ian Lynagh wrote:
On Mon, Feb 25, 2013 at 02:31:56PM +0100, Joachim Breitner wrote:
In any case there is still the problem: What and where is the Prelude... but maybe let’s postpone this.
I'd put it in its own package for now, and then look at whether/what it should be merged with later.
Why shouldn't Prelude (and other really stable, standard modules) just live in the `haskell2010` package?
If we did that then every package would depend on haskell2010, which is fine until haskell2013 comes along and they all need to be changed (or miss out on any improvements that were made). Even the really stable modules change, incidentally. For example, since Haskell 2010, the Show superclass of Prelude.Num was removed, Prelude no longer exports catch, and Data.List gained a function dropWhileEnd. Thanks Ian

Ian Lynagh
If we did that then every package would depend on haskell2010, which is fine until haskell2013 comes along and they all need to be changed (or miss out on any improvements that were made).
...wouldn't there also be the danger of type(class)-incompatible (e.g. the superclass breakages for startes) changes between say haskell2010 and haskell2013, that would cause problems when trying to mix libraries depending on different haskell20xx library versions?

On Mon, Feb 25, 2013 at 06:38:46PM +0100, Herbert Valerio Riedel wrote:
Ian Lynagh
writes: [...]
If we did that then every package would depend on haskell2010, which is fine until haskell2013 comes along and they all need to be changed (or miss out on any improvements that were made).
...wouldn't there also be the danger of type(class)-incompatible (e.g. the superclass breakages for startes) changes between say haskell2010 and haskell2013, that would cause problems when trying to mix libraries depending on different haskell20xx library versions?
I think that actually, for the Num/Show change, the hasell98/haskell2010 packages just incorrectly re-export the new class. Personally, I don't think the language report should be specifying the content of libraries at all, and I doubt anyone really uses the haskell* packages. A separate library specification, perhaps based on the Haskell Platform, would make more sense IMO. But that's another debate :-) Thanks Ian

On 25/02/13 18:05, Ian Lynagh wrote:
On Mon, Feb 25, 2013 at 06:38:46PM +0100, Herbert Valerio Riedel wrote:
Ian Lynagh
writes: [...]
If we did that then every package would depend on haskell2010, which is fine until haskell2013 comes along and they all need to be changed (or miss out on any improvements that were made).
...wouldn't there also be the danger of type(class)-incompatible (e.g. the superclass breakages for startes) changes between say haskell2010 and haskell2013, that would cause problems when trying to mix libraries depending on different haskell20xx library versions?
I think that actually, for the Num/Show change, the hasell98/haskell2010 packages just incorrectly re-export the new class.
Personally, I don't think the language report should be specifying the content of libraries at all,
It's not that straightforward, because the language report refers to various library functions, types and classes. For example, integer literals give rise to a constraint on Num, so we have to say what Num is. Guards depend on Bool, the translation of list comprehensions refers to "map", and so on. It could be whittled down certainly (we actually removed a few libraries in Haskell 2010), but there's still a core that is tied to the language definition. Cheers, Simon

On Wed, Feb 27, 2013 at 04:54:35PM +0000, Simon Marlow wrote:
On 25/02/13 18:05, Ian Lynagh wrote:
Personally, I don't think the language report should be specifying the content of libraries at all,
It's not that straightforward, because the language report refers to various library functions, types and classes. For example, integer literals give rise to a constraint on Num, so we have to say what Num is. Guards depend on Bool, the translation of list comprehensions refers to "map", and so on.
It could be whittled down certainly (we actually removed a few libraries in Haskell 2010), but there's still a core that is tied to the language definition.
Yes, OK, my language was a bit strong: s/at all/any more than necessary/ Thanks Ian

Hi all, Let me add the goals I had in mind last time I considered trying to split base. 1. I'd like to have text Handles use the Text type and binary Handles use the ByteString type. Right now we have this somewhat awkward setup where the I/O APIs are spread out and bundled with pure types. Splitting base would let us fix this and write a better I/O layer. 2. The I/O manager currently has a copy of IntMap inside its implementation because base cannot use containers. Splitting base would let us get rid of this code duplication. I'm less interested in having super fine-grained dependencies in my libraries. More packages usually means more busy-work managing dependencies. Taken to its extreme you could imagine having base-maybe, base-bool, and whatnot. I don't think this is an improvement. Splitting base into perhaps 3-5 packages (e.g. GHC.*, IO, pure types) should let us get a bunch of benefits without too many downsides. -- Johan

Hi, Am Montag, den 25.02.2013, 11:25 -0800 schrieb Johan Tibell:
1. I'd like to have text Handles use the Text type and binary Handles use the ByteString type. Right now we have this somewhat awkward setup where the I/O APIs are spread out and bundled with pure types. Splitting base would let us fix this and write a better I/O layer.
2. The I/O manager currently has a copy of IntMap inside its implementation because base cannot use containers. Splitting base would let us get rid of this code duplication.
added to http://hackage.haskell.org/trac/ghc/wiki/SplitBase It would be interesting to see if Text and Bytestring (without the file IO parts) can be compiled using the base-foreign package extracted here https://github.com/nomeata/packages-base/tree/base-split/base-foreign. Looking at the imports, it seems possible. Then a base-io-system package can indeed depend on text and bytestring, and provide the appropriately typed file IO functions there. The containers package looks like a good example for a package that sits on base-pure: No IO, no FFI. So with the split suggested in my branch, having the ghc-io-system depend on containers seems possible.
I'm less interested in having super fine-grained dependencies in my libraries. More packages usually means more busy-work managing dependencies. Taken to its extreme you could imagine having base-maybe, base-bool, and whatnot. I don't think this is an improvement. Splitting base into perhaps 3-5 packages (e.g. GHC.*, IO, pure types) should let us get a bunch of benefits without too many downsides.
This is basically the goal added by SPJ: Split base into as FEW packages as possible, consistent with meeting the other goals. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

I think it would be vv helpful to have all these goals articulated on the wiki page. http://hackage.haskell.org/trac/ghc/wiki/SplitBase For the "avoiding version bump" goal, I still don't see why having a simple "shim" package on top whose API is stable, and whose version number changes seldom, would not do the job. Simon From: Johan Tibell [mailto:johan.tibell@gmail.com] Sent: 25 February 2013 19:25 To: Joachim Breitner Cc: glasgow-haskell-users@haskell.org; Simon Peyton-Jones Subject: Re: base package -- goals Hi all, Let me add the goals I had in mind last time I considered trying to split base. 1. I'd like to have text Handles use the Text type and binary Handles use the ByteString type. Right now we have this somewhat awkward setup where the I/O APIs are spread out and bundled with pure types. Splitting base would let us fix this and write a better I/O layer. 2. The I/O manager currently has a copy of IntMap inside its implementation because base cannot use containers. Splitting base would let us get rid of this code duplication. I'm less interested in having super fine-grained dependencies in my libraries. More packages usually means more busy-work managing dependencies. Taken to its extreme you could imagine having base-maybe, base-bool, and whatnot. I don't think this is an improvement. Splitting base into perhaps 3-5 packages (e.g. GHC.*, IO, pure types) should let us get a bunch of benefits without too many downsides. -- Johan

Hi, Am Dienstag, den 26.02.2013, 10:08 +0000 schrieb Simon Peyton-Jones:
I think it would be vv helpful to have all these goals articulated on the wiki page.
well, all the goals are there (or are they not sufficiently well explained)? I also tried to compare the two approaches – the re-exporting API-specifying packages and the actual split of base – by listing their advantages (I skipped the disadvantages, these would just be the negation of the other advantages list...) at http://hackage.haskell.org/trac/ghc/wiki/SplitBase#Approaches I don’t feel in the position to actually make a call here, or even to cast a vote with confidence, but I’m happy to continue summarizing the discussion until a consensus is found. If there is more discussion, that is. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

| I don’t feel in the position to actually make a call here, or even to cast a vote with | confidence, but I’m happy to continue summarizing the discussion until a | consensus is found. If there is more discussion, that is. I've updated the page http://hackage.haskell.org/trac/ghc/wiki/SplitBase a bit. It seems to me that there are two kinds of goals A Better for external clients (stable APRs, few version bumps) B Better for internal implementation (eg using containers or bytestring in base) On the Wiki page, (G1) and (G2) are to do with group (A). And this seems best handled by defining client-facing shim packages that export just what you want. However (G4) and later seem to be concerned with (B), and they clearly do require re-structuring the code. It seems entirely possible to me that BOTH will be needed. That is, the structure to achieve (B) will not be precisely what is wanted for (A). So it seems to be me that both approaches are valid and we might want to do both. We can do them in either order, or simultaneously. (Mind you, doing (A) first would make (B) less disruptive for our users.) In short, less of an either/or, more of a both/and. Simon | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users- | bounces@haskell.org] On Behalf Of Joachim Breitner | Sent: 07 March 2013 20:22 | To: glasgow-haskell-users@haskell.org | Subject: Re: base package -- goals | | Hi, | | Am Dienstag, den 26.02.2013, 10:08 +0000 schrieb Simon Peyton-Jones: | > I think it would be vv helpful to have all these goals articulated on | > the wiki page. | > | > http://hackage.haskell.org/trac/ghc/wiki/SplitBase | > | | well, all the goals are there (or are they not sufficiently well explained)? | | I also tried to compare the two approaches – the re-exporting API-specifying | packages and the actual split of base – by listing their advantages (I skipped the | disadvantages, these would just be the negation of the other advantages list...) at | http://hackage.haskell.org/trac/ghc/wiki/SplitBase#Approaches | | I don’t feel in the position to actually make a call here, or even to cast a vote with | confidence, but I’m happy to continue summarizing the discussion until a | consensus is found. If there is more discussion, that is.| | Greetings, | Joachim | | -- | Joachim "nomeata" Breitner | Debian Developer | nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C | JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Mon, Mar 11, 2013 at 4:45 PM, Simon Peyton-Jones
B Better for internal implementation (eg using containers or bytestring in base)
Note that this also means better code for external clients, as we can offer e.g. a better System.IO that lets people use Handles to read Text and ByteStrings in a more natural way (i.e. the I/O functions will be in one place, not spread out throughout N data structure libraries).

Hi, Am Montag, den 11.03.2013, 23:45 +0000 schrieb Simon Peyton-Jones:
| I don’t feel in the position to actually make a call here, or even to cast a vote with | confidence, but I’m happy to continue summarizing the discussion until a | consensus is found. If there is more discussion, that is.
I've updated the page http://hackage.haskell.org/trac/ghc/wiki/SplitBase a bit.
thanks.
It seems to me that there are two kinds of goals
A Better for external clients (stable APRs, few version bumps) B Better for internal implementation (eg using containers or bytestring in base)
On the Wiki page, (G1) and (G2) are to do with group (A). And this seems best handled by defining client-facing shim packages that export just what you want.
However (G4) and later seem to be concerned with (B), and they clearly do require re-structuring the code.
It seems entirely possible to me that BOTH will be needed. That is, the structure to achieve (B) will not be precisely what is wanted for (A). So it seems to be me that both approaches are valid and we might want to do both. We can do them in either order, or simultaneously. (Mind you, doing (A) first would make (B) less disruptive for our users.)
In short, less of an either/or, more of a both/and.
from reading between the lines I get the impression that you’d prefer (A) to happen first, in order to do (B) more easily. If (A) was successful, we even have to worry less about bad decisions while doing (B), as it would be relatively easy to fix these. So if we do (A), we still have the problem that Ian pointed out: Why should anyone use these shim packages, when they can continue to use base? This is especially true when the shim packages are less simple to use, due to the handling of Prelude. I see some options (assuming, just for demonstration, we have to shim packages base-pure and base-io): 1. Prelude stays in base, packages wanting to use the shim packages exclusively have to use {-# LANGUAGE NoImplicitPrelude #-} everywhere and import stuff explicitly. base-pure would probably provide its subset of the Prelude in Prelude.Pure, to be imported explicitly. 2. Prelude goes to a separate shim package, base-prelude. Extra dependency required, packages that want to only depend on base-pure have no Prelude to use, same problem as in 1. 3. Every shim package has a Prelude module. Good for those who depend on base-pure, but those who require both base-pure and base-io have no an ambiguous import. 4. Separate packages base-io-prelude and base-pure-prelude providing Prelude’s containing stuff of base-* (+ dependencies). Packages can pull in precisely the Prelude they want, but yet more packages. None of these look particularly appealing. Here some ideas to make it more convenient for the programmer that require changes to GHC and how it treats packages: I. It automatically imports _all_ visible Prelude modules. So base-pure provides the pure Prelude and base-io adds the IO functions. II. Same, but a bit more explicit: We extend the package configuration by a "prelude-modules" field. Every module listed in that field of every visible package is treated as a Prelude and automatically imported (unless {-# LANGUAGE NoImplicitPreldue #-} is active.) III. Ambiguous module imports do not cause an error if, among the modules, there is one that is a superset of all others, i.e. reexports all other modules. I personally find I. quite elegant and appealing, as it is clearly the behavior expected by library authors if they have their code build-depend on their selection of shim packages, and requires no extra prelude packages. One might see it as a disadvantages that now arbitrary packages can add to the “virtual prelude”, although I don’t think it would be abused without good reason, and who knows what clever things people will do with that feature. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Tue, Mar 12, 2013 at 09:47:21AM +0100, Joachim Breitner wrote:
This is especially true when the shim packages are less simple to use, due to the handling of Prelude.
Just to make sure I am following you, I think you are saying: Everything would work fine if there was a Prelude in base (used by packages that still use base, and not the shims) and a Prelude in one of the shim packages (used by packages that exclusively use the shims, and not base). The problem is that Prelude doesn't fit in any of the sensible shim packages (as it contains, for example, both pure and file IO functions), but having a shim package purely for the Prelude seems excessive. This is a problem regardless of whether we do A or B or both. I think we should avoid getting bogged down in one small detail at this stage. If we make the bulk of the changes now then we still have a few months to polish the result before it gets effectively frozen by being released. If you don't like the idea of putting it in its own package, then I think either the file-io package (as that's the "worst" thing it contains) or the pure package (as that's the package most likely to be depended on anyway) would make most sense. Thanks Ian

Hi, Am Dienstag, den 12.03.2013, 14:35 +0000 schrieb Ian Lynagh:
I think we should avoid getting bogged down in one small detail at this stage. If we make the bulk of the changes now then we still have a few months to polish the result before it gets effectively frozen by being released.
I’m not sure it is just a small detail: The handling of Prelude will influence how practical it is to use the shim package, and how practical it is to use just some of the shim packages.
If you don't like the idea of putting it in its own package, then I think either the file-io package (as that's the "worst" thing it contains) or the pure package (as that's the package most likely to be depended on anyway) would make most sense.
Both have issues: Putting it in file-io will cause everyone to depend on file-io, subverting „To allow packages to be explict about what they need (G2)“. Putting it in pure will make pure not pure any more, as the Prelude would still have to contain writeFile etc. But you are right that this discussing this should not prevent us from deciding between A, B and both, and then actually doing it. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Tue, Mar 12, 2013 at 03:58:28PM +0100, Joachim Breitner wrote:
Both have issues: Putting it in file-io will cause everyone to depend on file-io
If it ended up there, then we'd presumably encourage people to use NoImplicitPrelude and import e.g. list functions from Data.List rather than Prelude. Thanks Ian

On 13-03-12 04:47 AM, Joachim Breitner wrote:
... None of these look particularly appealing. Here some ideas to make it more convenient for the programmer that require changes to GHC and how it treats packages:
I. It automatically imports _all_ visible Prelude modules. So base-pure provides the pure Prelude and base-io adds the IO functions.
I like this proposal, but it goes further than necessary for your goal, and it could lead to some messy conflicts. All you need is
2. Prelude goes to a separate shim package, base-prelude. Extra dependency required, packages that want to only depend on base-pure have no Prelude to use, same problem as in 1.
plus the ability to specify, per *user* package, from which package it wants to import Prelude. The default would be base-prelude, but a single-line change could switch that to base-pure-prelude or awesome-alternative-prelude. Presumably ghc-pkg would need a new command-line option to specify the Prelude package as well.

| > In short, less of an either/or, more of a both/and. | | from reading between the lines I get the impression that you’d prefer | (A) to happen first, in order to do (B) more easily. If (A) was | successful, we even have to worry less about bad decisions while doing | (B), as it would be relatively easy to fix these. Yes, that's right. (A) sounds like a low-cost way of meeting some goals. (B) is a higher-cost way of meeting some further goals. Your follow-on remarks (appended below) about which implicit Prelude you get if you (say) import only `base-pure` are well-taken, but they apply equally to (B). Worth adding a section to the Wiki page to discuss this? My gut feel: for the minority who do not want to depend on enough base-X packages to get the Haskell-98 Prelude, use NoImplicitPrelude (since indeed you don't depend on enough to get the H98 Prelude) and import what you want explicitly. Most people won't care and will continue to depend on enough to get Prelude. Simon | So if we do (A), we still have the problem that Ian pointed out: Why | should anyone use these shim packages, when they can continue to use | base? | | This is especially true when the shim packages are less simple to use, | due to the handling of Prelude. I see some options (assuming, just for | demonstration, we have to shim packages base-pure and base-io): | | 1. Prelude stays in base, packages wanting to use the shim packages | exclusively have to use {-# LANGUAGE NoImplicitPrelude #-} | everywhere and import stuff explicitly. base-pure would probably | provide its subset of the Prelude in Prelude.Pure, to be | imported explicitly. | 2. Prelude goes to a separate shim package, base-prelude. Extra | dependency required, packages that want to only depend on | base-pure have no Prelude to use, same problem as in 1. | 3. Every shim package has a Prelude module. Good for those who | depend on base-pure, but those who require both base-pure and | base-io have no an ambiguous import. | 4. Separate packages base-io-prelude and base-pure-prelude | providing Prelude’s containing stuff of base-* (+ dependencies). | Packages can pull in precisely the Prelude they want, but yet | more packages. | | None of these look particularly appealing. Here some ideas to make it | more convenient for the programmer that require changes to GHC and how | it treats packages: | | I. It automatically imports _all_ visible Prelude modules. So | base-pure provides the pure Prelude and base-io adds the IO | functions. | II. Same, but a bit more explicit: We extend the package | configuration by a "prelude-modules" field. Every module listed | in that field of every visible package is treated as a Prelude | and automatically imported (unless {-# LANGUAGE | NoImplicitPreldue #-} is active.) | III. Ambiguous module imports do not cause an error if, among the | modules, there is one that is a superset of all others, i.e. | reexports all other modules. | | I personally find I. quite elegant and appealing, as it is clearly the | behavior expected by library authors if they have their code build- | depend on their selection of shim packages, and requires no extra | prelude packages. One might see it as a disadvantages that now arbitrary | packages can add to the “virtual prelude”, although I don’t think it | would be abused without good reason, and who knows what clever things | people will do with that feature. | | Greetings, | Joachim | | | -- | Joachim "nomeata" Breitner | Debian Developer | nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C | JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

Hi, Am Mittwoch, den 13.03.2013, 14:04 +0000 schrieb Simon Peyton-Jones:
Your follow-on remarks (appended below) about which implicit Prelude you get if you (say) import only `base-pure` are well-taken, but they apply equally to (B). Worth adding a section to the Wiki page to discuss this?
Sure, done, including stated opinions so far: http://hackage.haskell.org/trac/ghc/wiki/SplitBase#HandlingPrelude I also noticed an advantage of (P2) (No Prelude in any of the shim packages, but in a separate base-prelude package): It allows programmers to mix conveniently the shim packages with packages that provide a non-standard prelude (classy-prelude comes to my mind) without any use of NoImplicitPrelude. (Just stating that for completeness, my preference is still option (P4)+(I1), i.e. multiple partial Prelude modules which all automatically imported.) Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

On Wed, Mar 13, 2013 at 7:04 AM, Simon Peyton-Jones
Most people won't care and will continue to depend on enough to get Prelude.
Let me just put this out here so keep it in the back of our heads: most people don't care about this whole thing (splitting base) so lets make sure there's still a base package to import that gives people what they have before. :) Other than that I'm quite excited about the prospects of splitting up base a bit. -- Johan

On 25/02/13 19:25, Johan Tibell wrote:
Hi all,
Let me add the goals I had in mind last time I considered trying to split base.
1. I'd like to have text Handles use the Text type and binary Handles use the ByteString type. Right now we have this somewhat awkward setup where the I/O APIs are spread out and bundled with pure types. Splitting base would let us fix this and write a better I/O layer.
2. The I/O manager currently has a copy of IntMap inside its implementation because base cannot use containers. Splitting base would let us get rid of this code duplication.
I'm less interested in having super fine-grained dependencies in my libraries. More packages usually means more busy-work managing dependencies. Taken to its extreme you could imagine having base-maybe, base-bool, and whatnot. I don't think this is an improvement. Splitting base into perhaps 3-5 packages (e.g. GHC.*, IO, pure types) should let us get a bunch of benefits without too many downsides.
+1 to all that. I'd like to add one other thing that we've been wanting to clean up: the unix/Win32 packages should sit low down in the dependency hierarchy, so that the IO library can depend on them. Right now we have bits and pieces of unix/Win32 in the base package, some of which have to be re-exported via internal modules in base to unix/Win32 proper (System.Posix.Internals). I seem to recall the situation with signal handlers being a bit of a mess: the code to handle signals is in base, but the API is in unix. Glancing at the code in GHC.Conc.Signals it looks like I even had to use Dynamic to get around the dependency problems (shhh!). Cleaning up things like this is a win. But I'm with Johan in that having fine-grained packages imposes a cost on the clients (where the clients in this case includes everyone), so there should be significant tangible benefits (e.g. more stability). Cheers, Simon

On Fri, Feb 22, 2013 at 07:52:00PM +0100, Joachim Breitner wrote:
Of course with too much splitting one runs in the Bane of the Orphaned Instances – neither should base-foreign require base-float nor the other way around, but "Storable Double" needs to be define somewhere...
This is no different than the question of whether the instance should be in Foreign.Storable or GHC.Types. One has to import/depend-on the other, and it doesn't matter which: it's an implementation issue, and doesn't affect people importing/depending-on the modules/packages. In this case, GHC.Types.Double is in ghc-prim, so you will presumably need to leave the instance in Foreign.Storable in base-foreign.
Also, I notice that there is an issue with “internal” modules (mostly GHC.something) that should not be part of some stable API, but needed to implement packages further down. Should they just not be considered part of the “public” (and PVP-relevant) API? Or should there be two packages, e.g. base-pure-internal and base-pure, where the latter re-exports those modules that are meant for public consumption?
If it's easy to split out the GHC.* modules, then that's probably better. If not (e.g. because Public.A imports GHC.B, which import Public.C, which imports GHC.D) then it's probably not worth the bother.
So, what is the general opinion?
Looks good to me! Thanks Ian

On 15/02/13 08:36, Joachim Breitner wrote:
Hi,
Am Donnerstag, den 14.02.2013, 21:41 -0500 schrieb brandon s allbery kf8nh:
On Thursday, February 14, 2013 at 8:14 PM, Johan Tibell wrote:
On Thu, Feb 14, 2013 at 2:53 PM, Joachim Breitner
wrote: I don't think having FFI far down the stack is a problem. There are lots of pure data types we'd like in the "pure data" layer (e.g. bytestring) that uses FFI. As long as the I/O layer itself (System.IO, the I/O manager, etc) doesn't get pulled in there's no real problem in depending on the FFI. I think it would be nice, also to other Haskell implementations that might have not have FFI, to separate the really basic stuff from pure-but-impurely-implemented stuff. At least as long as it does not caues trouble.
GHC.Fingerprint does not need to be crippled when it is going to use a pure hashing; I quickly added some simple fingerpriting found via Wikipedia that was easier than MD5. https://github.com/nomeata/packages-base/commit/b7f80066a03fd296950e0cafa227... The choice is of course not final, maybe something with more bits is desirable.
Remember that fingerprinting is not hashing. For fingerprinting we need to have a realistic expectation of no collisions. I don't think FNV is suitable. I'm sure it would be possible to replace the C md5 code with some Haskell. Performance *is* important here though - Typeable is in the inner loop of certain generic programming libraries, like SYB. Cheers, Simon

As a super obvious aside, we could just switch code paths based on platform/environment. That lets you keep the fast code path and have a pure fallback for the javascripters. Just propagate an FFI_AVAILABLE flag into the compilation unit. We're going to have a number of these points which force a tension between generality And speed as we go, and it'd nice to have the ability to gracefully fall back.
On Feb 15, 2013, at 9:45 AM, Simon Marlow
On 15/02/13 08:36, Joachim Breitner wrote:
Hi,
Am Donnerstag, den 14.02.2013, 21:41 -0500 schrieb brandon s allbery kf8nh:
On Thursday, February 14, 2013 at 8:14 PM, Johan Tibell wrote:
On Thu, Feb 14, 2013 at 2:53 PM, Joachim Breitner
wrote: I don't think having FFI far down the stack is a problem. There are lots of pure data types we'd like in the "pure data" layer (e.g. bytestring) that uses FFI. As long as the I/O layer itself (System.IO, the I/O manager, etc) doesn't get pulled in there's no real problem in depending on the FFI. I think it would be nice, also to other Haskell implementations that might have not have FFI, to separate the really basic stuff from pure-but-impurely-implemented stuff. At least as long as it does not caues trouble.
GHC.Fingerprint does not need to be crippled when it is going to use a pure hashing; I quickly added some simple fingerpriting found via Wikipedia that was easier than MD5. https://github.com/nomeata/packages-base/commit/b7f80066a03fd296950e0cafa227... The choice is of course not final, maybe something with more bits is desirable.
Remember that fingerprinting is not hashing. For fingerprinting we need to have a realistic expectation of no collisions. I don't think FNV is suitable.
I'm sure it would be possible to replace the C md5 code with some Haskell. Performance *is* important here though - Typeable is in the inner loop of certain generic programming libraries, like SYB.
Cheers, Simon
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Fri, Feb 15, 2013 at 02:45:19PM +0000, Simon Marlow wrote:
Remember that fingerprinting is not hashing. For fingerprinting we need to have a realistic expectation of no collisions. I don't think FNV is suitable.
I'm sure it would be possible to replace the C md5 code with some Haskell. Performance *is* important here though - Typeable is in the inner loop of certain generic programming libraries, like SYB.
We currently just compare hash(str) for equality, right? Could we instead compare (hash str, str) ? That would be even more correct, even if a bad/cheap hash function is used, and would only be slower for the case where the types match (unless you're unlucky and get a hash collision). In fact, we may be able to arrange it so that in the equal case the strings are normally exactly the same string, so we can do a cheap pointer equality test (like ByteString already does) to make the equal case fast too (falling back to actually checking the strings are equal, if they aren't the same string). Thanks Ian

Comparing hash, ptr, str gives you a pretty good acceptance/rejection test.
hash for the quick rejection, ptr for quick acceptance, str for accuracy.
Especially since the particular fingerprints for Typeable at least are
usually made up of 3 bytestrings that were just stuffed in and forgotten
about.
That said, this would seem to bring ByteString or at least Ptr in at a
pretty low level for the level of generality folks seem to be suddenly
seeking.
-Edward
On Wed, Feb 20, 2013 at 12:12 PM, Ian Lynagh
On Fri, Feb 15, 2013 at 02:45:19PM +0000, Simon Marlow wrote:
Remember that fingerprinting is not hashing. For fingerprinting we need to have a realistic expectation of no collisions. I don't think FNV is suitable.
I'm sure it would be possible to replace the C md5 code with some Haskell. Performance *is* important here though - Typeable is in the inner loop of certain generic programming libraries, like SYB.
We currently just compare hash(str) for equality, right? Could we instead compare (hash str, str) ? That would be even more correct, even if a bad/cheap hash function is used, and would only be slower for the case where the types match (unless you're unlucky and get a hash collision).
In fact, we may be able to arrange it so that in the equal case the strings are normally exactly the same string, so we can do a cheap pointer equality test (like ByteString already does) to make the equal case fast too (falling back to actually checking the strings are equal, if they aren't the same string).
Thanks Ian
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On 20/02/13 17:12, Ian Lynagh wrote:
On Fri, Feb 15, 2013 at 02:45:19PM +0000, Simon Marlow wrote:
Remember that fingerprinting is not hashing. For fingerprinting we need to have a realistic expectation of no collisions. I don't think FNV is suitable.
I'm sure it would be possible to replace the C md5 code with some Haskell. Performance *is* important here though - Typeable is in the inner loop of certain generic programming libraries, like SYB.
We currently just compare hash(str) for equality, right? Could we instead compare (hash str, str) ? That would be even more correct, even if a bad/cheap hash function is used, and would only be slower for the case where the types match (unless you're unlucky and get a hash collision).
In fact, we may be able to arrange it so that in the equal case the strings are normally exactly the same string, so we can do a cheap pointer equality test (like ByteString already does) to make the equal case fast too (falling back to actually checking the strings are equal, if they aren't the same string).
So it's not a single string, a TypeRep consists of a TyCon applied to some arguments, which themselves are TypeReps etc. You could do pointer equality, and maybe that would work for the common cases. But I don't see why we have to do that when fingerprinting works well and we already have it. Why add a potential performance pitfall when we don't have to? One other thing: it's useful to be able to use the fingerprint as an identifier for the contents, e.g. when sending Typeable values across the network. If you can't do this with the fingerprint, then you need another unique Id, which is the situation we used to have before fingerprints. Cheers, Simon

On Thu, Feb 14, 2013 at 03:48:51PM +0100, Joachim Breitner wrote:
Yesterday, I experimented a bit with base’s code, [...]
Maybe the proper is to reverse the whole approach: Leave base as it is, and then build re-exporting smaller packages (e.g. a base-pure) on top of it. The advantage is: * No need to rewrite the tightly intertwined base. * Libraries still have the option to have tighter dependencies. * Base can evolve with lots of breaking changes, as long as they do not affect the API by the smaller packages. * Development of this collection can happen outside the GHC tree. * Alternative implementations for (some of) these packages can be created, if the reason why they could not be moved out of base is one of implementation, not of API
How does that sound?
Essentially good to me... One might consider instead (as has been proposed before, I believe), to rename the current ``base'' to something like ``ghc-base'' which is not intended to be depended on by packages not shipped with GHC (that is, by default ``hidden'' in ghc-pkg), and instead export: base with a very stable interface io with a very stable interface GHC with a probably rapidly evolving interface. * possibly other packages giving access to internals Most packages that currently depend on ``base'' would then depend only on ``base'' and possibly ``io'', and by virtue of the stability of these two interfaces would therefore not be affected by most GHC releases. This would effectively be ``splitting the interfaces GHC and io out from base'' instead of ``deprecating base and replacing it with the three new interfaces base-pure, io, and GHC''. That choice is possibly mostly a matter of taste --- I think that the name ``base'' is good for a user-facing interface, and the name ``ghc-base'' more indicative of its implementation-dependent character. Wolfram

On Thu, Feb 14, 2013 at 6:48 AM, Joachim Breitner
Maybe the proper is to reverse the whole approach: Leave base as it is, and then build re-exporting smaller packages (e.g. a base-pure) on top of it. The advantage is: * No need to rewrite the tightly intertwined base. * Libraries still have the option to have tighter dependencies. * Base can evolve with lots of breaking changes, as long as they do not affect the API by the smaller packages. * Development of this collection can happen outside the GHC tree. * Alternative implementations for (some of) these packages can be created, if the reason why they could not be moved out of base is one of implementation, not of API
How does that sound?
I'm not in favor of this approach as it precludes pushing any data types down the stack. In particular, we want text and bytestring to be below the I/O layer, so we can defined Handles that work with those in base itself.

wren ng thornton
But again, this depends on how feasible it would be to actually split the packages apart. Is it feasible?
Some time last year, I tried to extract the module import inter-dependencies in the base package (with the help of GHC's -ddump-minimal-imports), and came up with the following relations: The main problem I see are the cyclic dependencies between groups of modules; e.g. consider the import-deps between the Prelude module and the GHC.* modules: GHC.ConsoleHandler : Prelude GHC.Constants : Prelude GHC.Environment : Prelude GHC.Event : Prelude GHC.Exts : Prelude GHC.Handle : Prelude GHC.IOBase : Prelude GHC.PArr : Prelude GHC.Stack : Prelude GHC.Stats : Prelude Prelude : GHC.Base Prelude : GHC.Enum Prelude : GHC.Err Prelude : GHC.Float Prelude : GHC.Num Prelude : GHC.Real Prelude : GHC.Show Moreover, the GHC.* modules often also import other standard-library modules such as Data.* for the types and helpers. Also, cyclic dependencies between CABAL packages aren't supported afaik... cheers, hvr
participants (26)
-
Bardur Arantsson
-
Brandon Allbery
-
brandon s allbery kf8nh
-
Carter Schonwald
-
David Terei
-
Edward A Kmett
-
Edward Kmett
-
Felipe Almeida Lessa
-
Gabriel Dos Reis
-
Ganesh Sittampalam
-
Herbert Valerio Riedel
-
Ian Lynagh
-
Jacques Carette
-
Joachim Breitner
-
Johan Tibell
-
John Lato
-
Krzysztof Skrzętnicki
-
kudah
-
Manuel M T Chakravarty
-
Mario Blažević
-
Roman Cheplyaka
-
Simon Marlow
-
Simon Peyton-Jones
-
Stephen Paul Weber
-
Wolfram Kahl
-
wren ng thornton