Hi there!
So a few more discussions have come up, and they have mainly centered
around the question of quality assurance. Cutting GHC releases is time
consuming and not trivial. And those people would need to take ownership
of those releases and stand by them. How do we ensure that backports do
not inadvertently break a working compiler? I'm completely against preventing
new contributors from helping with releases on the grounds that things can
go wrong. This would inevitably just end up preventing people from even
trying, and how do you get good at something if you can't even try to get good
at it?
So the question then is: what can we do to improve/ensure the quality of releases?
We certainly have the test-suite, but that might have holes, and backporting the
test-suite will only get us so far. Tests affected by language features that change
stdout/stderr will inevitably be updated in newer test-suites to accommodate newer
compilers, but will then not work with older compilers.
However, we have a large body of public libraries on Hackage, and a curated
set of packages per compiler in the form of Stackage LTS sets. We have something
slightly similar for HEAD with the head.hackage overlay. For older compilers
we can rely on something more mature!
Thus, we could build some automation to test a compiler against an existing set
of packages, and run their test-suites. There will inevitably be failures, but we'd
be interested in looking at the derivative only anyway, that is, the change in
failures relative to the previous compiler. If the same set of tests fail
that previous compilers failed at, I don't think that should be much of a concern. If
fewer tests fail, it would indicate something might have been fixed, or the test
now surfaces some new behaviour that we might want to look at. The worst case
would be new tests that fail, but didn't before. This should raise red flags: either
there is a *very* good argument for why the backport is still the right thing to
do and the test failures are actually faulty tests, or the backport should just not
be performed.
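
To make the comparison concrete, here is a rough sketch of what looking at the
derivative could amount to. This is only an illustration, not an existing tool:
the function names, the "package:test-suite" identifiers and the three verdicts
are all made up for the example, and a real harness would first have to produce
the failure lists.

```python
def classify(baseline_failures, candidate_failures):
    """Compare failing tests of the previous compiler against the candidate.

    Both arguments are iterables of hypothetical "package:test-suite" names.
    """
    baseline = set(baseline_failures)
    candidate = set(candidate_failures)
    return {
        # Failed before and still fails: not much of a concern.
        "still_failing": sorted(baseline & candidate),
        # Failed before, passes now: maybe fixed, worth a look.
        "newly_passing": sorted(baseline - candidate),
        # Passed before, fails now: red flag for the backport.
        "newly_failing": sorted(candidate - baseline),
    }


def verdict(delta):
    """Map a classification to a (hypothetical) release decision."""
    if delta["newly_failing"]:
        return "block"   # needs a very good argument, or drop the backport
    if delta["newly_passing"]:
        return "review"  # something changed, probably for the better
    return "ok"          # same failures as the previous compiler


if __name__ == "__main__":
    # Made-up example data, not real test results.
    old = ["foo:tests", "bar:spec"]
    new = ["bar:spec", "baz:unit"]
    delta = classify(old, new)
    print(delta)
    print(verdict(delta))
```

Here a single new failure is enough to flag the backport, which matches the
higher priority on not regressing.
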
In the end it will be about striking a balance between fixing bugs and not
regressing, with a higher priority on not regressing. However, if we can't
detect that we regress, we have to assume we don't, as we'd otherwise be unable
to even make any releases.
I'd be happy to discuss this further, and set up some Nix-based test harness for
this, as time permits (with Windows tests being run through some cross-compilation
and Wine-based setup).
Cheers,
Moritz