Continuous Integration and Cross Compilation

Hello all,

I've seen quite a few comments on the list and elsewhere lamenting the time it takes to compile and validate ghc. It's troublesome not only because it's inconvenient, but, more seriously, because people are holding off on sending patches in, which stifles development. I would like to propose a solution:

1. Implement proper cross-compilation, such that build and host may be different -- e.g. a Linux x86_64 machine can build a ghc that runs on Windows x86. What sort of work would this entail?

2. Batch cross-compiled builds for all OSs/archs on a continuous integration service (e.g. Travis CI) or cloud service, then package up the binaries with the test suite.

3. Send the package to our buildbots, and run the test suite.

4. (optional) If using a CI service, have the buildbots send results back to the CI. This could be useful if we were to use GitHub for pulls in the future *.

Cheers,
Will

* I realize vanilla GitHub currently has certain annoying limitations, though some of them are pretty easy to solve via github-services and/or webhooks. I don't think this conflicts with the desire to use Phabricator, either, so I'll send details and motivations to that thread.

Hi William,
Thanks for the email. Here're some things to consider.
For one, cross compilation is a hot topic, but it is going to be a
rather large amount of work to fix and it won't be easy. The primary
problem is that we need to make Template Haskell cross-compile, but in
general this is nontrivial: TemplateHaskell must load and run object
code on the *host* platform, but the compiler must generate code for
the *target* platform. There are ways around some of these problems;
for one, we could compile every module twice, once for the host, and
once for the target. Upon requesting TH, the Host GHC would load Host
Object Code, but the final executable would link with the Target
Object Code.
There are many, many subtle points to consider if we go down this
route - what happens for example if I cross compile from a 64bit
machine to a 32bit one, but TemplateHaskell wants some knowledge like
what "sizeOf (undefined :: CLong)" is? The host code sees a 64-bit
quantity while the target actually will deal with a 32bit one. This
could later explode horribly. And this isn't limited to different
word sizes either - it applies to the ABI in general. 64-bit Linux ->
64-bit Windows would be just as problematic in this exact case, as
one uses the LP64 data model, while the other uses LLP64.
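To make the hazard concrete, here is a minimal, hypothetical splice of
the kind that goes wrong - my own illustration, not code from GHC or
from any particular package, and the module and function names are
made up. The sizeOf call is evaluated by the compiler, i.e. on the
*host*, so the literal baked into the program reflects the host's ABI
rather than the target's.

{-# LANGUAGE TemplateHaskell #-}
module LongSize where

import Foreign.C.Types (CLong)
import Foreign.Storable (sizeOf)
import Language.Haskell.TH (ExpQ, integerL, litE)

-- The splice body runs inside the compiler, i.e. on the *host*
-- platform.  On a 64-bit Linux host this bakes the literal 8 into
-- the program; a 32-bit (or LLP64 Windows) *target* would actually
-- want 4 at run time.
cLongSize :: ExpQ
cLongSize = litE (integerL (fromIntegral (sizeOf (undefined :: CLong))))

A user module would then say something like bufferBytes = $(cLongSize)
* 1024 and silently inherit the host's notion of CLong.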
So #1 by itself is a very, very non-trivial amount of work, and IMO I
don't think it's necessary for better builds. There are other routes
possible for cross compilation perhaps, but I'd speculate they are all
equally as non-trivial as this one.
Finally, the remainder of the scheme - shipping builds to remote
machines and having them tested there - sounds a bit more complicated,
and I'm wondering what the advantages are. In particular it seems like
this merely exposes more failure points in the CI system, because now
all CI depends on cross compilation working properly, on being able to
ship reports back and forth, and more. Depending on cross compilation
in particular sounds like a huge burden: it makes it hard to
distinguish whether a failure was caused by a cross-compilation bug or
by a committer's changeset, which widens the scope of what we need to
consider. A CI system should be absolutely as predictable as possible,
and this adds a *lot* of variables to the mix. Cross compilation is
really not just one big task - there will be many *small* bugs lying
in wait after that, the pain of a thousand cuts.
Really, we need to distinguish between two needs:
1) Continuous integration.
2) Nightly builds.
These two systems have very different needs in practice:
1) A CI system needs to be *fast*, and it needs to have dedicated
resources to respond to changes quickly. This means we need to
*minimize* the amount of time for developer turnaround to see
results. That includes minimizing the needed configurations. Shipping
builds to remote machines just for CI would greatly complicate this
and likely make it far longer on its own, not to mention it increases
with every system we add.
2) A nightly build system is under nowhere near the same time
constraints, although it also needs to be dedicated. If an ARM/Linux
machine takes 6 hours to build (perhaps it's shared or something, or
just really wimpy), that's totally acceptable. These can then report
nightly about the results and we can reasonably blame
people/changesets based on that.
Finally, both of these become more complicated by the fact GHC is a
large project that has a highly variable number of configurations we
have to keep under control: static, dynamic, static+dynamic,
profiling, LLVM builds, builds where GHC itself is profiled, as well
as the matrix of those combinations: LLVM+GHC Profiled, etc etc etc.
Each of these configurations exposes bugs in its own right.
Unfortunately doing #1 with all these configurations would be
ludicrous: it would explode the build times for any given system, and
it also drastically multiplies the hardware resources we'd need for CI
if we wanted them to respond quickly to any given changeset, because
you not only have to *build* them, you must run them. And now you have
to run a lot of them. A nightly build system is more reasonable for
these problems, because taking hours and hours is expected. These
problems would still be true even with cross compilation, because it
multiplies the amount of work every CI run must do no matter what.
We actually already have both of these: Joachim Breitner, for
example, has set up a Travis-CI[1] configuration for us, while Gábor
Páli has set up nightly builds[2]. Travis-CI does the job of fast CI,
but it's not good for a few reasons:
1) We have literally zero visibility into it for reports. Essentially
we only know when it explodes because Joachim yells at us (normally at
me :) This is because GitHub is not our center-of-the-universe,
despite how much people yearn for it to be so.
2) The time limit is unacceptable. Travis-CI for example actually
cannot do dynamic builds of GHC because it takes too long. Considering
GHC is shipping dynamically on major platforms now, that's quite a
huge loss for a CI system to miss (and no, a separate build matrix
configuration doesn't work here - GHC builds statically and
dynamically at the same time, and ships both - there's no way to have
"only static" and "only dynamic" entries.)
3) It has limited platform support - only recently did it have OS X,
and Windows is not yet in sight. Ditto for FreeBSD. These are crucial
for CI as well, as they encompass all our Tier-1 platforms. This could
be fixed with cross compilation, but again, that's a big, big project.
And finally, on the GitHub note, as I said in the prior thread about
Phabricator, I don't actually think it offers us anything useful at
this point in time - literally almost nothing other than "other
projects use GitHub", which is not an advantage, it's an appeal to
popularity IMO. Webhooks still cannot do things like ban tabs,
trailing whitespace, or enforce submodule integrity. We have to have
our own setup for all of that. I'm never going to hit the 'Merge
Button' for PRs - validation is 100% mandatory on behalf of the
merger, and again, Travis-CI cannot provide coherent coverage even if
we could use it for that. And because of that there's no difference
between GitHub and any other code hosting site - I have to pull the
branch manually and test it myself, which I could do with any random
git repository in the world.
The code review tools are worse than Phabricator. Finally, if we are
going to accept patches from people, we need to have a coherent,
singular way to do it - mixing GitHub PRs, Phabricator, and uploading
patches to Trac is just a nightmare, and not just for me, even though
I do most of the patch work - it puts the burden on *every* person who
wants to review code to now do so in many separate places. And we need
to make code review *easier*, not harder! If
anything, we should be consolidating on a single place (obviously, I'd
vote for Phabricator), not adding more places to make changes that we
all have to keep up with, when we don't even use the service itself!
That's why I proposed Phabricator: because it is coherent and a
singular place to go to, and very good at what it does, and does not
attempt to 'take over' GHC itself. GitHub is a fairly all-or-nothing
proposition if you want any benefits it delivers, if you ask me (I say
this as someone who likes GitHub for smaller projects). I just don't
think their tools are suitable for us.
So, back to the topic. I think the nightly builds are actually in an
OK state at the moment, since we do get reports from them, and
builders do check in regularly. The nightly builders also cover a more
diverse set of platforms than our CI will. But the CI and turnaround
could be *greatly* improved, I think, because ghc-complete is
essentially ignored or unknown by many people.
So I'll also make a suggestion: let's get something in place that
will pull GHC's repo every 10 minutes or so, do a build, and then
email ghc-devs *only* if failures pop up. In fact, we could just
re-use the existing nightly build infrastructure for this, and just
make it check very regularly, and just run standard amd64/Linux and
Windows builds upon changes. I could provide hardware for this. This
would increase the visibility of reports, not require *any* new code,
and already works.
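Purely to illustrate the loop being described (the point above stands:
the existing builder infrastructure's continuous mode already covers
this, so no new code would actually be needed), here is a rough
Haskell sketch. The ten-minute interval comes from the suggestion
above; the assumption that we run inside an existing checkout, the
./validate invocation, and the failure.log stand-in for emailing
ghc-devs are all illustrative assumptions.

module Main where

import Control.Concurrent (threadDelay)
import Control.Monad (when)
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- Every ten minutes: pull ghc, and if HEAD moved, run ./validate and
-- record any failure.
main :: IO ()
main = loop ""
  where
    loop lastSha = do
      _ <- readProcessWithExitCode "git" ["pull", "--recurse-submodules"] ""
      (_, sha, _) <- readProcessWithExitCode "git" ["rev-parse", "HEAD"] ""
      when (sha /= lastSha) $ do
        (code, out, err) <- readProcessWithExitCode "./validate" [] ""
        case code of
          ExitSuccess   -> return ()
          ExitFailure _ -> notify (out ++ err)
      threadDelay (10 * 60 * 1000000)  -- ten minutes
      loop sha

    -- Stand-in for "email ghc-devs only on failure".
    notify buildLog = writeFile "failure.log" buildLog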
Overall, I will absolutely help you in every possible way, because
this really is a problem for newcomers, and existing developers, when
we catch dumb failures later than we should. But I think the proposed
solution here is extraordinarily complex in comparison to what we
actually need right now.
... I will say, however, that if you *did* fix cross compilation to
work with TH, you would be a hero to many people - myself included -
continuous integration aside! :)
[1] https://github.com/nomeata/ghc-complete
[2] http://haskell.inf.elte.hu/builders/
-- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/

Great and detailed response, Austin. Thank you.

William, I'm happy to help in any way I can. I run SmartOS x86 and x86_64 builds of GHC HEAD on my own equipment using the GHC Builder Ian Lynagh developed:

https://ghc.haskell.org/trac/ghc/wiki/Builder
https://github.com/haskell/ghc-builder

I'm also currently working on small tweaks to the ghc-builder and getting the GHC testsuite to pass on Illumos (and indirectly Solaris). I follow Gábor's lead on the GHC Builder priorities, and Carter Schonwald acts as a Pull Request gatekeeper for changes.

Best,
Alain

2014-06-19 1:53 GMT+02:00 Austin Seipp
We actually already do have both of these already, too: Joachim Breitner for example has set us up a Travis-CI[1] setup, while Gabor Pali has set us up nightly builds[2]. Travis-CI does the job of fast CI, but it's not good for a few reasons: [..] 3) It has limited platform support - only recently did it have OS X, and Windows is not yet in sight. Ditto for FreeBSD. These are crucial for CI as well, as they encompass all our Tier-1 platforms. This could be fixed with cross compilation, but again, that's a big, big project.
Regarding FreeBSD, I am fine with having only the nightly builds for it. Fortunately, it is seldom the case that something breaks due to a platform-specific setting.
So I'll also make a suggestion: just to actually get something that will pull GHC's repo every 10 minutes or so, do a build, and then email ghc-devs *only* if failures pop up.
Yeah, this could be done by the nightly builders. They have a "Continuous" build mode (thanks to Ian), which probably means that they will start the same process over as soon as the current one has finished. I wrote "probably" because I have never tried it, but I saw it in the sources :-) I think sending mails only in case of failures could also be done somehow, but that may require some changes to the sources.

Hi Austin,

Thank you for the quick and thorough reply! There are a lot of points to cover, so I'll respond in a few sections.

*** The CI Scheme

I realize the vast majority of the work would be in #1, but I just want to highlight the idea that there is a real benefit to be had. To address the latter part of your email, I suggested splitting the test suite from the build for a few reasons:

1. We have a pretty good spread of buildbots, but as far as I know there aren't very many of them. Running only the test suite would increase their utility by roughly 5x (from looking at the buildbot time breakdowns [1]).

2. Building ghc is time and resource intensive, which makes it hard for people to host buildbots. Even though my machines are relatively new, I can't usually host one because it would interfere with my other work. I would be more tempted to if it were limited to just the test suite, and perhaps others would be as well.

3. Cloud computing would enable very fast builds, such that we could conceivably build (and then test on the buildbots) automatically for every patch set submission / pull request. I believe that sort of streamlining would make ghc development both more accessible to others and more enjoyable for all.

*** Cross Compilation and Template Haskell

Now on to the meat of the problem! I'm not too familiar with the really scary bits of TH, but I'll start with:
TemplateHaskell must load and run object code on the *host* platform, but the compiler must generate code for the *target* platform.
As you pointed out, this is a big deal. How clear a delineation does TH have between what runs on each platform? I believe this segregation is fundamental, as I'll explain below.
There are ways around some of these problems; for one, we could compile every module twice, once for the host, and once for the target.
I don't think that's necessary (or maybe I'm misunderstanding and we're saying the same thing). Consider the following:

1. TH splices are compiled to object code.
2. That object code is run on the build machine, which generates Haskell AST.
3. "Regular" GHC compiles the Haskell AST to object code.

Currently, the notions of build, host, and target are sort of mashed together with the assumption that build and host will be the same. It seems like "all" we have to do is tell the TH part of GHC to target the build arch, and the rest of GHC to target the host arch. But then there's this...
There are many, many subtle points to consider if we go down this route - what happens for example if I cross compile from a 64bit machine to a 32bit one, but TemplateHaskell wants some knowledge like what "sizeOf (undefined :: CLong)" is?
This comes back to the line between build and host in TH - there needs to be one. Perhaps there should be buildSizeOf and hostSizeOf for TH to use, and similar variants for other machine-specific stuff? (A rough sketch of what this might look like for a TH user appears at the end of this section.) I think the messiest part of this is that existing packages assume build == host. Their maintainers would have to be prodded to respect the build/host division, and the packages would have to be updated. Actually, one advantage of adding build and host variations of machine-specific functions is that we can just deprecate the unsegregated versions and not break anyone's stuff. Using one of the deprecated functions in a cross-compile would simply spit out an error and terminate the build, or perhaps instead fall back to double compilation.

Regarding the many subtle points to consider: if the sort of path I describe is at all sane (please tell me if not!), I can open a Trac ticket so we can chip away at them.

*** Cross Compilation, Redux

There is one more part to this story, however. Ultimately, a single build of ghc should be able to have multiple targets (or, in other words, one build of ghc should be able to target multiple hosts). LLVM allows us to do this, but ghc's notion of a cross compiler is limited. Here is the current setup [2]:

Stage 0:
  • built on: ---
  • runs on: host
  • targets: host
Libs Boot:
  • built on: build
  • runs on: build
  • targets: ---
Stage 1:
  • built on: build
  • runs on: host
  • targets: target
Libs Install:
  • built on: build
  • runs on: target
  • targets: ---
Stage 2:
  • built on: build
  • runs on: target
  • targets: target

What I propose is the following (Stage 0 and Libs Boot are unchanged):

Stage 1:
  • built on: build
  • runs on: build
  • targets: targets
Libs Toolchain Host:
  • built on: build
  • runs on: host
  • targets: ---
Libs Toolchain Target-x:
  • built on: build
  • runs on: target-x
  • targets: ---
Libs Toolchain Target-y:
  • built on: build
  • runs on: target-y
  • targets: ---
Libs Toolchain Target-z:
  • built on: build
  • runs on: target-z
  • targets: ---
Stage 2:
  • built on: build
  • runs on: host
  • targets: host, target-x, target-y, target-z

Most people will only want targets == host, in which case only the host toolchain will be built, so "regular" builds should be exactly the same as they are now. One may also produce a specialized cross-compiler (i.e. no host toolchain and one target toolchain), which is equivalent to how ghc currently builds a cross compiler. Or, one may choose to produce a compiler that targets whatever combination of targets one desires (currently impossible). For a build of ghc that runs in the cloud (as proposed above), one might have host = linux-x86_64 and targets = the whole shebang. For the compilers produced by the cloud, one would have targets == host, simply because we just want to be able to run the test suite on a given machine.
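To make the buildSizeOf / hostSizeOf idea above slightly more concrete, here is a rough illustration of how it might look from a TH user's perspective. This is my own sketch, not an existing GHC or template-haskell API: the module name, the signatures, and the toy hostABI table are all made up, and a real implementation would get the host ABI from the compiler's target-platform settings rather than from a hard-coded list.

{-# LANGUAGE TemplateHaskell #-}
module PlatformSplices where

import Foreign.Storable (Storable, sizeOf)
import Language.Haskell.TH (Exp, Q, integerL, litE)

-- Evaluated by the running compiler itself, so this reflects the
-- *build* machine's ABI.  This is effectively what TH does today.
buildSizeOf :: Storable a => a -> Q Exp
buildSizeOf x = litE (integerL (fromIntegral (sizeOf x)))

-- Stand-in for the *host* (where the produced program will run).
-- A toy lookup table pretends the host is LLP64 Windows; a real
-- implementation would query the compiler's target description.
hostSizeOf :: String -> Q Exp
hostSizeOf ty =
  case lookup ty hostABI of
    Just n  -> litE (integerL n)
    Nothing -> fail ("hostSizeOf: unknown type " ++ ty)
  where
    hostABI = [("CLong", 4), ("CInt", 4), ("Ptr", 8)]

A package could then write, say, bufBytes = $(hostSizeOf "CLong") * 1024 and get a constant matching the machine the compiled program will actually run on, while $(buildSizeOf (undefined :: CLong)) would keep today's behaviour.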
*** Build Machine Needs

Really, we need to distinguish between two needs:
1) Continuous integration.
2) Nightly builds.
These two systems have very different needs in practice:
1) A CI system needs to be *fast*, and it needs to have dedicated resources to respond to changes quickly. This means we need to *minimize* the amount of time for developer turn around to see results. That includes minimizing the needed configurations. Shipping builds to remote machines just for CI would greatly complicate this and likely make it far longer on its own, not to mention it increases with every system we add.
2) A nightly build system is under nowhere near the same time constraints, although it also needs to be dedicated. If an ARM/Linux machine takes 6 hours to build (perhaps it's shared or something, or just really wimpy), that's totally acceptable. These can then report nightly about the results and we can reasonably blame people/changesets based on that.
I totally agree on the distinction you’ve drawn here, though I don’t think the CI proposed above would increase build times. On the contrary, I think it would greatly reduce build times (assuming we use fast cloud compute nodes). I’ll try to collect some stats (and costs) to back that up.
I didn’t realize Travis CI has a build time limit, so thanks for pointing that out. Fifty minutes, though! Not enough for us, certainly.
I've read a fair amount about Jenkins CI [3], which is very actively developed, has zillions of plugins, and integrates with all sorts of sites. It's also open source and locally installable, which means we could set it up to email, generate online reports, tell Phabricator (or GitHub*) that a patch set is bogus, dispense coffee, etc. It might warrant more investigation as a possible replacement for the buildbots.
* I agree with most of your points about mixing so many tools, each with their own methodologies. Although I'd get a warm and fuzzy feeling from being able to fork and send a pull request that gets automatically validated, it probably doesn't make sense to pursue that right now.
*** Oh my!
This response has gotten pretty long! Apologies if I missed something, or otherwise misunderstood. Anyway, if there’s a path here that seems sensible, I’ll have a go at it.
Will
[1] http://haskell.inf.elte.hu/builders/
[2] https://ghc.haskell.org/trac/ghc/wiki/CrossCompilation
[3] http://jenkins-ci.org

| This response has gotten pretty long! Apologies if I missed something,
| or otherwise misunderstood. Anyway, if there's a path here that seems
| sensible, I'll have a go at it.
William, I am not qualified to comment on the details, but thank you for offering to help. I do urge you to pick some initial tasks that *don't* involve solving the full cross-compilation problem, desirable as it is. I fear that it is a swamp from which you will not emerge soon, and it'd be better to have some successes to encourage you, and some experience to build on, before diving into it.
Everyone: no responses yet to my email below. Suppose Austin plays secretary: would people like to volunteer to be part of the GHC Nightly-Build/Continuous-Integration Task Force?
Simon
-----Original Message-----
From: Simon Peyton Jones
Sent: 18 June 2014 23:48
To: Simon Peyton Jones; Páli Gábor János; Alain O'Dea
Cc: ghc-devs@haskell.org; William Knop; Karel Gardas
Subject: RE: Offering GHC builder build slaves
Back in April I said:
| Seriously, I advertised a couple of weeks ago for help with our
| nightly- build infrastructure. Quite a few people responded -- thank
| you very much.
|
| So we have willing horsepower. But at the moment we lack leadership.
| Alain rightly says "I don't know what the process is" because we don't
| *have* a process. We need a mechanism for creating a process, taking
| decisions, etc.
|
| I think what is needed is:
|
| * A group of people willing to act as a kind of committee. That
| could be everyone who replied. You could create a mailing list,
| or (initially better) just chat on ghc-devs. But it would be
| useful to have a list of who is involved.
|
| * Someone (or a couple of people) to play the role of chair.
| That doesn't mean an autocrat... it means someone who gently pushes
| discussions to a conclusion, and says "I propose that we do X".
|
| * Then the group can formulate a plan and proceed with it.
| For example, should Pali's efforts be "blessed"? I don't
| know enough to know, but you guys do.
|
| In my experience, people are often unwilling to put themselves forward
| as chair, not because they are unwilling, but because they feel it'd
| be "pushy". So I suggest this: if you think (based on the traffic
| you've
| seen) that X would be a chair you'd trust, suggest them.
|
| In short: power to the people! GHC is your compiler.
Since then various people have done various things, but so far as I know we don't have any of the three "*" items above. The people who seem in principle willing to help include Joachim Breitner, Herbert Valerio Riedel, Páli Gábor János, Karel Gardas, Alain O'Dea, William Knop, and Austin Seipp. There may well be others! I sense that the problem is not willingness but simply that no one feels accredited to take the lead. Please, I would love someone to do so!

I was reminded of this by William Knop's recent message below, in which he implicitly offers to help (thanks William). But his offer will fall on deaf ears unless that little group exists to welcome him in.

In hope, and with thanks,
Simon

I thought Alain already replied? He and Páli are running some
ghc-builder boxes, and I'm helping with code review for patches into
ghc-builder.

Hello William,
2014-06-20 0:50 GMT+02:00 William Knop
1. We have a pretty good spread of buildbots, but as far as I know there aren’t very many of them. Running only the test suite would increase their utility by roughly 5x (from looking at the buildbot time breakdowns [1]).
How would this increase their utility? I naively believe the purpose of CI is to rebuild and test the source code after each changeset to see whether it introduces regressions. Running only the test suite does not seem to convey this. Many regressions can be observed at build time, which means the safest bet would be to rebuild and test everything on the very same platform.
2. Building ghc is time and resource intensive, which makes it hard for people to host buildbots. Even though my machines are relatively new, I can’t usually host one because it would interfere with my other work. I would be more tempted to if it was limited to just the test suite, and perhaps others would as well.
My buildbots complete the steps (git clone, full build, testing) in about 1 hour 40 minutes (with about 1 hour 15 minutes spent in the compilation phase), and they run in parallel with a shift of about an hour. They run on the same machine, together with the coordination server. This is just a 3.4-GHz 4-core Intel Core i5 with a couple of GBs of RAM; I would not call it a high-end box, though. Note that it is on purpose that the builders do not use -j for builds, meaning that they do not parallelize the invoked make(1) subprocesses, which automatically makes the builds longer. Perhaps it would be worth experimenting with incremental builds and allowing parallel builds, as they could cut down on the build times more efficiently.

Hi Pali and all,
Sorry for the delayed replies; a bunch of things came up and I probably won’t be able to respond properly for two days or so. I am very interested in progressing with this as soon as I can. Many apologies!
Will

Hi Pali,
Apologies for the delayed response.
I treated cloud compilation as “free” in the context of the buildbots. If we can cross-compile (on Amazon EC2 or the like) ghcs which run on each arch we have for buildbots, the buildbots themselves will have 1/5 the load. I came to that figure from the buildbot page, where it looked like the average compile time was around 80 minutes, and the average test suite run was around 20 minutes.
I see your point about cloud cross compilation and buildbot testing not covering all cases of regressions. I think this is where the CI vs. nightly builds distinction applies well. Cloud compilation and buildbot testing may be fast enough to do CI on every patch set, while total regression coverage could be provided by nightly builds. Jenkins CI allows us to roll our own CI with our own machines, cloud compute services, and loads of other content/auditing/workflow services.
That said, while I think it would be nice to have quick CI in addition to nightly builds, I don't know if it's sensible/desired for ghc. Since Jenkins CI is stable yet very actively developed, it seems it at least wouldn't incur too much maintenance on our part. Of course, the devil is in the details, so I'd be happy to set it up on a few of my machines to investigate.
Will

Hi again,
I think I may have been too brief in my reply. To recap previous discussion, it seems there are a few pieces which can be approached separately:
1) arbitrary/discretionary cross compilation
2) continuous integration for all patchsets
3) nightly builds
The first, as has been pointed out, is a lot of nontrivial work. The second either requires the first and a cloud service, or a lot of hardware (though it was mentioned that the buildbots can work in a CI mode). The third, we already have, thanks to the buildbots and those who have set them up.
I think using Jenkins may be a step in the right direction for a few reasons:
• there are hundreds of supported plugins [1] which cover notifications, code review [2], cloud computing services, and so on
• there is quite a lot of polish as far as generated reports go [3]
• it seems easy/nice to use out of the box (from a few minutes’ fiddling on my part)
Now, I don’t have much experience with buildbots, so I may be unfairly elevating Jenkins here. If buildbots can be easily extended to do exactly what we need, I’m all for it, and in that case I’d volunteer to help in that regard.
Will
[1] https://wiki.jenkins-ci.org/display/JENKINS/Plugins
[2] http://www.dctrwatson.com/2013/01/jenkins-and-phabricator/
[3] https://ci.jenkins-ci.org

On 2014-07-07 at 03:40:17 +0200, William Knop wrote: [...]
I think using Jenkins may be a step in the right direction for a few reasons:
• there are hundreds of supported plugins [1] which cover notifications, code review [2], cloud computing services, and so on
• there is quite a lot of polish as far as generated reports go [3]
• it seems easy/nice to use out of the box (from a few minutes’ fiddling on my part)
Now, I don’t have much experience with buildbots, so I may be unfairly elevating Jenkins here. If buildbots can be easily extended to do exactly what we need, I’m all for it, and in that case I’d volunteer to help in that regard.
Btw, one feature I don't know how to achieve with Jenkins (yet):

- try to build/test every single commit, while
- prioritizing the latest commits, and
- working its way back through older commits during idle time

(which is more or less what http://bitten.edgewall.org/ does). For GHC, since it's properly submoduled now, it would suffice to test each single commit in ghc.git.

Having easily accessible metrics for each single commit (on various configurations) would be very useful. For instance, knowing the last-known-working commit is especially important if the latest commit fails to build (or exhibits some other significant metric regression). In some cases this can save developers the time to git-bisect manually, as the answer is already in plain sight.

Does anyone have any idea how to set something like that up with Jenkins?

Cheers,
hvr
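Just to illustrate the scheduling policy being asked for - my own sketch, not something Jenkins or bitten provides as-is, and all names here are made up - the idea is: among untested commits, always pick anything newer than what has already been tested (tip-most first), and only backfill older commits when the tip is covered.

module Schedule where

import Data.List  (partition, sortBy)
import Data.Maybe (listToMaybe)
import Data.Ord   (Down (..), comparing)

-- A commit: its position in history (higher = newer) and its hash.
type Commit = (Int, String)

-- Pick the next commit to build: anything newer than the newest
-- commit already tested comes first (tip-most first); otherwise
-- backfill, working backwards through older untested commits
-- during idle time.
nextToBuild :: Int -> [Commit] -> Maybe Commit
nextToBuild lastTested untested = listToMaybe (newer ++ older)
  where
    newestFirst    = sortBy (comparing (Down . fst)) untested
    (newer, older) = partition ((> lastTested) . fst) newestFirst

For example, nextToBuild 100 [(98,"aa1"), (102,"bb2"), (101,"cc3")] picks (102,"bb2") first, then (101,"cc3"), and only then backfills (98,"aa1").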

2014-07-07 3:40 GMT+02:00 William Knop
I think using Jenkins may be a step in the right direction for a few reasons: [..] Now, I don’t have much experience with buildbots, so I may be unfairly elevating Jenkins here. If buildbots can be easily extended to do exactly what we need, I’m all for it, and in that case I’d volunteer to help in that regard.
I do not see any problem if you decide to go with Jenkins. I volunteered to maintain the buildbots because I felt it useful for maintaining the FreeBSD port, and because it did not require more than a working Haskell software stack, which the compilation would require anyway. To be honest, I do not really want to fiddle with Jenkins and cloud services, and I feel it would be overkill to turn the buildbots into a fully-fledged CI service. This is a home-brew solution and probably has no chance of competing with Jenkins, and it does not want to. I like that it is implemented in the functional programming domain, that is all.

For what it is worth, I am planning to extend the buildbots with more long-term testing instead. For example, it would be nice to add steps for building cabal-install and then building Stackage on every available platform to provide some more real-world load. As a side effect, we could also provide up-to-date snapshots for users who do not want to build the sources themselves, which may help with testing. I also want to add clang-based validators to see if everything works with Clang as well. And, of course, to work out heuristics for spotting valid errors in the logs without much human intervention.

Note that the aforementioned 80 minutes of build time and 20 minutes of testing time are due to the single-threaded build and the testing being done from scratch. Obviously, with incremental builds and more threads, things would get quicker, but -- as I wrote previously -- they are disabled for clarity/correctness. What we could do is launch builds for every commit, so they could preserve this invariant while utilizing the underlying hardware more. But that is where I feel this would be in vain; Jenkins is probably a much better solution, especially as it integrates nicely with the recently introduced Phabricator. Unfortunately, I cannot offer any experience in that regard.

I see the diversity of testing implementations as an advantage; I naively believe different solutions can peacefully co-exist at the same time, helping each other.
Participants (7): Alain O'Dea, Austin Seipp, Carter Schonwald, Herbert Valerio Riedel, Páli Gábor János, Simon Peyton Jones, William Knop