Hello all,

Thank you for the comments and apologies that it took a bit of time to get back to you. My colleague Michael and I have taken a stab at a package demonstrating the slowdown. Our hypothesis was that the presence of type declarations, large values/literals, typeclass instantiations and engagement of TH caused at least some of the slowdown we were seeing in our app code. It turns out this was only partially true:

I'd be happy to create a ticket if this is sufficient support material. It's hard to say these conclusively isolate *all* culprits for the full slowdown we observe in our codebase, but are hopefully at least a significant part.

Here's the package: https://dl.dropboxusercontent.com/u/58525/ghc-slowdown.tar.gz

Best,
Oz



On Tue, Apr 7, 2015 at 3:39 PM, Simon Marlow <marlowsd@gmail.com> wrote:
Just a comment on this one point:

On 05/04/2015 19:01, Ozgun Ataman wrote:
  * With O0 and -j=N the overall wall-clock time is 40% higher, but the
    CPU time spent is a staggering 7.5X higher (!)

When looking at Haskell programs executing in parallel, it's normal to see a high CPU time, because the GC threads spin looking for work.  The extra CPU time is just time spent spinning, it doesn't imply that the compiler was doing extra work.  What matters is the overall wall-clock time - a worse wall-clock time indicates that something has gone wrong.

We should look at compile-time regression independently of performance issues with -j.  Parallel compilation is a new feature and mostly likely needs a lot of tuning.  Note that cabal's -j feature is different from GHC's (and is likely to give much more reliable improvements) because it runs multiple compiler instances in parallel.

Cheers,
Simon


  * With -O0, "cabal repl" load times have gone up by a staggering 2.5X (!)
      o 7.8.4: 3 min. 35 seconds
      o 7.6.3: 1 min. 2 seconds
  * Measuring compile times for individual modules, we see that those
    that are heavily loaded with lots of types, TH-facilitated type and
    typeclass derivations and those that contain large "blobs" of values
    directly at the top level now take much longer to compile:
      o We have 10 modules that each take over 10 seconds
      o We have 3 modules that each take over 35 seconds
  * Sidenote observation: In general, parallel builds with -j appear to
    cause a very significant overhead under the "system" part of timing:
      o Example with 7.6.3, O0: cabal build  140.57s user 13.25s system
        100% cpu 2:33.70 total
      o Example with 7.8.4, O0: cabal build  507.83s user 655.43s system
        549% cpu 3:31.59 total


The main hurt here has been the infeasibility of using "cabal repl /
ghci"  fast-feedback style development (via emacs, vim, command-line,
etc.), since:

  * Unlike "cabal build", "cabal repl" re-compiles from scratch with
    every invocation ":l App.Foo.Bar", which itself is the necessary
    first step when starting to hack on a module (we can then use :r to
    somewhat help the situation)
  * The slowdown is about 2.5X as stated above: What used to take a
    minute now takes 3.5 minutes.
  * cabal repl does not seem to benefit from -j at all
  * cabal repl 7.8.4 appears to be hurt particularly by the heavier modules
  * The heavier modules are often at the top of the compile tree (types,
    derivations, etc) and are practically loaded on the critical path
    all the time

Let me know if I'm missing anything here and any/all feedback is much
appreciated!

Cheers,
Oz


_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs