Did you ever make any progress on this, Saurabh?

We made progress in some sense, by introducing a separate `stack build -j1` step in our CI pipeline for compiling packages that are known to use a lot of memory.
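
For illustration, a minimal sketch of what such a two-step build could look like. The package names and the `ci_build` function are hypothetical placeholders, so treat this as an outline rather than an exact copy of a real pipeline:

```shell
# Hypothetical package names: substitute the packages that exhaust memory.
HEAVY_PACKAGES="heavy-package-a heavy-package-b"

ci_build() {
  # Step 1: build the memory-hungry packages with a single job.
  # $HEAVY_PACKAGES is left unquoted on purpose so that each name
  # becomes a separate build target.
  stack build -j1 $HEAVY_PACKAGES || return 1

  # Step 2: build everything else with stack's default parallelism.
  stack build
}
```

Calling `ci_build` from the CI script runs both steps in order and stops if the first one fails.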
 

 * -j just tells GHC to parallelise compilation across modules. This can
    increase the maximum heap size needed by the compiler.


From the docs, it wasn't very clear to me how -j interacts with -M when both options are passed to the GHC process. Is -M the max heap size across all builds, or per build?

 
 * -M is a bit tricky to define. For one, it defines the maximum heap
   size beyond which we will terminate. However, we also use it in
   the garbage collector to make various decisions about GC scheduling.
   I'll admit that I'm not terribly familiar with the details here.

Note that -M does not guarantee that GHC will find a way to keep your
program under the limit that you provide. It merely ensures that the
program doesn't exceed the given size, aborting if necessary.


Quoting from https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/runtime_control.html#rts-flag--M:

> The maximum heap size also affects other garbage collection parameters: when the amount of live data in the heap exceeds a certain fraction of the maximum heap size, compacting collection will be automatically enabled for the oldest generation, and the -F parameter will be reduced in order to avoid exceeding the maximum heap size.

That makes it sound as though the RTS will adjust the GC algorithm, and how often the GC runs, to avoid crossing the heap limit. However, I've found the GHC process easily consuming more memory than what is specified with the -M flag (as reported by top).

-- Saurabh.