
By dumping metrics, I mean essentially the same as the ghc-events-analyze
annotations, but with any extra information that is useful for the
investigation. In particular, if you have a message id, include it. You
may also want to annotate thread names with GHC.Conc.labelThread, and to
add further annotations to drill down if you uncover a problem area.
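As an illustration of the kind of annotation meant here, a minimal sketch
using the START/STOP user-event convention that ghc-events-analyze picks up
(the MsgId type, handleMessage, and the thread name are hypothetical
stand-ins for whatever the application actually uses):

    import Control.Concurrent (myThreadId)
    import GHC.Conc (labelThread)
    import Debug.Trace (traceEventIO)

    -- Hypothetical message type and handler; replace with the real ones.
    type MsgId = Int

    handleMessage :: MsgId -> IO ()
    handleMessage _ = return ()

    worker :: String -> MsgId -> IO ()
    worker name msgId = do
      -- Label the thread so it is identifiable in ThreadScope and in the
      -- text dump of the eventlog.
      tid <- myThreadId
      labelThread tid name
      -- START/STOP user events in the format ghc-events-analyze understands,
      -- tagged with the message id so individual outliers can be found later.
      traceEventIO ("START message " ++ show msgId)
      handleMessage msgId
      traceEventIO ("STOP message " ++ show msgId)

    main :: IO ()
    main = worker "message-worker" 42

Remember that the program needs to be built with -eventlog and run with
+RTS -l for any of this to end up in the eventlog at all.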
If I were investigating, I would take e.g. the five largest outliers, then
look in the (text) eventlog for those message ids, and see what happened
between the start and stop. You'll likely want to track the thread states
(which is why I suggested you annotate the thread names).
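Once the eventlog has been dumped to text (the ghc-events executable from
the ghc-events package can do this), pulling out everything that mentions a
particular message id is just text filtering; a throwaway helper along
these lines (the file name and id format are whatever your annotations
used) is usually enough:

    import Data.List (isInfixOf)
    import System.Environment (getArgs)

    -- Print every line of a text eventlog dump that mentions a given string
    -- (e.g. the message id of one of the outliers).
    main :: IO ()
    main = do
      [dumpFile, needle] <- getArgs
      contents <- readFile dumpFile
      mapM_ putStrLn (filter (needle `isInfixOf`) (lines contents))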
I'm not convinced it's entirely the GC: the latencies are larger than I
would expect from a GC pause (although lots of factors can affect that). I
suspect that either you have something causing abnormal GC spikes, or
there's a different cause entirely.
On Tue, Sep 29, 2015 at 04:15, Will Sewell wrote:
Thanks for the reply, John. I will have a go at doing that. What do you mean exactly by dumping metrics? Do you mean measuring the latency within the program and dumping it if it exceeds a certain threshold?
And from the answers I'm assuming you believe it is the GC that is most likely causing these spikes. I've never profiled Haskell code, so I'm not used to seeing what the effects of the GC actually are.
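As a rough sketch of the threshold idea Will describes here: time each
message handler with wall-clock timestamps and only emit an eventlog
annotation when the latency crosses a cutoff. The wrapper name and the
10 ms threshold below are made up for illustration; only the
getCurrentTime/traceEventIO calls matter.

    import Data.Time.Clock (diffUTCTime, getCurrentTime)
    import Debug.Trace (traceEventIO)

    -- Wrap a handler and dump a user event into the eventlog only when its
    -- latency exceeds a threshold (10 ms here, purely as an example).
    timedHandle :: Show msgId => (msgId -> IO ()) -> msgId -> IO ()
    timedHandle handler msgId = do
      start <- getCurrentTime
      handler msgId
      end <- getCurrentTime
      let elapsedSecs = realToFrac (diffUTCTime end start) :: Double
      if elapsedSecs > 0.010
        then traceEventIO ("slow message " ++ show msgId ++ ": "
                           ++ show elapsedSecs ++ "s")
        else return ()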
On 28 September 2015 at 19:31, John Lato wrote:
Try Greg's recommendations first. If you still need to do more investigation, I'd recommend looking at some samples with either ThreadScope or a text dump of the eventlog. I really like ghc-events-analyze, but it doesn't provide quite the same level of detail. You may also want to dump some of your metrics into the eventlog, because then you'll be able to see exactly how high-latency episodes line up with GC pauses.
On Mon, Sep 28, 2015 at 1:02 PM, Gregory Collins wrote:
On Mon, Sep 28, 2015 at 9:08 AM, Will Sewell wrote:
If it is the GC, then is there anything that can be done about it?
- Increase the value of -A (the default is too small) -- the best value for this is the L3 cache size of the chip
- Increase the value of -H (total heap size) -- this will use more RAM, but you'll run GC less often
- This will sound flip, but: generate less garbage. Frequency of GC runs is proportional to the amount of garbage being produced, so if you can lower the mutator allocation rate then you will also increase net productivity. Built-up thunks can transparently hide a lot of allocation, so fire up the profiler and tighten those up (there's an 80-20 rule here; see the sketch after this list). Reuse output buffers if you aren't already, etc.
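To illustrate the thunk point above: a lazy left fold quietly allocates a
chain of unevaluated thunks (hidden allocation the GC then has to deal
with), while the strict variant keeps the accumulator evaluated. A minimal,
self-contained comparison, not taken from the original thread:

    import Data.List (foldl')

    -- Lazy accumulation: builds a long chain of (+) thunks before anything
    -- is forced, i.e. a lot of hidden allocation.
    sumLazy :: [Int] -> Int
    sumLazy = foldl (+) 0

    -- Strict accumulation: forces the running total at each step, so the
    -- accumulator stays a single evaluated Int and allocation stays low.
    sumStrict :: [Int] -> Int
    sumStrict = foldl' (+) 0

    main :: IO ()
    main = do
      let xs = [1 .. 1000000] :: [Int]
      print (sumStrict xs)
      print (sumLazy xs)

A heap profile (compile with -prof, run with +RTS -hc) makes this kind of
build-up easy to spot. Note that with -O2 GHC's strictness analysis may
rescue the lazy version in this toy case; real code with more complex
accumulators is less forgiving.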
G
-- Gregory Collins