
Karel Gardas writes:

> On 3/17/21 4:16 PM, Andreas Klebinger wrote:
>> Now that isn't really an issue anyway, I think. The question is rather: is 2% a large enough regression to worry about? 5%? 10%?
>
> 5-10% is still within system noise, even on a lightly loaded workstation. I am not sure whether CI runs on shared cloud resources, where the noise may be even higher.
I think when we say "performance" we should be clear about what we are referring to. Currently, GHC does not measure instructions/cycles/time. We only measure allocations and residency. These are significantly more deterministic than time measurements, even on cloud hardware. I do think that eventually we should start to measure a broader spectrum of metrics, but this is something that can be done on dedicated hardware as a separate CI job.
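For concreteness, both counters are exposed by the RTS itself; below is a minimal sketch of a program reading them (assuming it is run with +RTS -T, using GHC.Stats from base):

    -- Minimal sketch: read the allocation and residency counters
    -- mentioned above.  Requires running with +RTS -T, otherwise the
    -- RTS does not collect statistics.
    import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

    main :: IO ()
    main = do
      enabled <- getRTSStatsEnabled   -- False unless +RTS -T was passed
      if not enabled
        then putStrLn "Run with +RTS -T to enable RTS statistics."
        else do
          stats <- getRTSStats
          putStrLn ("total allocated bytes: " ++ show (allocated_bytes stats))
          putStrLn ("max residency (bytes): " ++ show (max_live_bytes stats))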
> I've done a simple experiment of pinning ghc while compiling ghc-cabal, and I was able to "speed" it up by 5-10% on a W-2265.
Do note that once we switch to Hadrian, ghc-cabal will vanish entirely (since Hadrian implements its functionality directly).
> Also, following this CI/performance-regression discussion, I'm not entirely sure that this isn't just a witch-hunt that mostly hurts the most active GHC developers. Another idea would be to give up on performance-regression testing in CI altogether and to invest the saved resources into a proper investigation of GHC/Haskell program performance. I'm not sure whether that would be more beneficial in the longer term.
I don't think this would be beneficial. It's much easier to prevent a regression from getting into the tree than it is to find and characterise it after it has been merged.
> Just one random number thrown into the ring: Linux's perf claims that nearly every second L3 cache access in the example above ends in a cache miss. Is that a good number or a bad one? See the stats below (from `perf stat -d` on ghc with `+RTS -T -s -RTS`).
It is very hard to tell; it sounds bad, but it is not easy to know why, or whether it can be improved. This is one of the reasons I have recently been trying to improve sharing within GHC; reducing residency should improve cache locality. Nevertheless, the difficulty of interpreting architectural events is why I generally only use `perf` for differential measurements: comparing the same workload before and after a change rather than trying to read absolute counts.

Cheers,

- Ben
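P.S. To make the sharing point concrete, here is a toy sketch of interning (my illustration of the general technique, not GHC's actual code): equal values are looked up in a table so that duplicates become pointers to one shared copy, which reduces residency without changing observable results.

    -- Toy interning: an illustration of sharing, not GHC's actual code.
    import           Data.Map.Strict (Map)
    import qualified Data.Map.Strict as M

    -- The table maps a value to its canonical (shared) copy.
    intern :: Ord a => a -> Map a a -> (a, Map a a)
    intern x table =
      case M.lookup x table of
        Just shared -> (shared, table)           -- reuse the existing copy
        Nothing     -> (x, M.insert x x table)   -- first occurrence becomes canonical

    -- Intern a whole list: equal elements end up sharing one heap copy.
    internAll :: Ord a => [a] -> [a]
    internAll = go M.empty
      where
        go _ []       = []
        go t (x : xs) = let (x', t') = intern x t in x' : go t' xs

    main :: IO ()
    main =
      -- The printed output is unchanged; the benefit is that the repeated
      -- occurrences of "a" and "b" now share single heap objects.
      print (internAll (words "a b a c b a"))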