
The original docs are here: https://downloads.haskell.org/ghc/latest/docs/users_guide/runtime_control.ht...

They raised more questions than answers for me, so I took a stab at rewriting them, and sprinkled in some questions where I need better understanding. Anyone care to fill in my blanks?

-----------------

Set the amount of idle time which must pass before an idle GC is performed. Setting -I0 disables the idle GC. The idle GC only affects the threaded and SMP versions of the RTS (see -threaded, Options affecting linking).

When the idle GC is enabled, a major GC is automatically performed if the runtime has been idle (i.e., no Haskell computation has been running) for the specified period of time.

For an interactive application, it is probably a good idea to use the idle GC, because this will allow finalizers to run and deadlocked threads to be detected in the idle time when no Haskell computation is happening. [Why is this a good thing? What happens when the idle GC is disabled?]

Also, it will mean that a GC is less likely to happen when the application is busy, so application responsiveness may be improved. However, if the amount of live data in the heap is particularly large, then the idle GC can cause a significant penalty to responsiveness. [Why? Is it because the idle GC was delayed by waiting for some idle time, and thus has more work to do?]

Conversely, too small of an interval could adversely affect interactive responsiveness [How? And how is this worse than having idle GC disabled? What is the actual behavior when it's disabled, anyway?]

The idle period timer only resets after some activity by a Haskell thread. Therefore, once an idle GC is triggered, another one won't be scheduled until more work is performed.

This is an experimental feature. Please let us know if it causes problems and/or could benefit from further tuning.
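To make the effect of -I concrete, here is a minimal sketch (not taken from the GHC docs) that asks the running RTS whether the idle GC is switched on. It assumes the doIdleGC field of GCFlags in GHC.RTS.Flags from base; describeIdleGC is a hypothetical helper added only for illustration.

```haskell
import GHC.RTS.Flags (GCFlags (doIdleGC), getGCFlags)

-- | Hypothetical helper: turn the RTS's idle GC switch into a readable message.
describeIdleGC :: Bool -> String
describeIdleGC True  = "idle GC enabled"
describeIdleGC False = "idle GC disabled"

main :: IO ()
main = do
  -- Read the GC-related flags the RTS was started with.
  flags <- getGCFlags
  putStrLn (describeIdleGC (doIdleGC flags))
```

Built with -threaded -rtsopts, running the program with +RTS -I0 should report the idle GC as disabled, and running it with a positive interval such as +RTS -I2 should report it as enabled.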

Hi Bryan, Thanks for improving this documentation! I've often found these flags to be quite confusing.
For an interactive application, it is probably a good idea to use the idle GC, because this will allow finalizers to run and deadlocked threads to be detected in the idle time when no Haskell computation is happening. [Why is this a good thing? What happens when the idle GC is disabled?]
So there are basically three ways to trigger a major GC, as far as I know:

1. Heap overflow: when we last performed a major GC, we checked how much live data there was and set a variable so that we do another major GC when the heap grows to live * F.
2. Idle GC.
3. Manually triggering a GC using the interface in System.Mem.

When the idle GC is disabled, GC will happen less often. One of the other two may still trigger a GC.

A key difference is that both of those are only activated by the mutator running code: either through allocation or by calling a GC directly. The idle GC, on the other hand, can be triggered when the mutator isn't running. So, if you want to ensure that finalizers get called promptly, the idle GC can help, especially if your application is idle for long periods of time.

The other key benefit of the idle GC is that it can reduce the prevalence of heap overflow GCs. These can only happen when your application is allocating, and hence running code, so a heap overflow GC is quite likely to tank the response time for the request your application is serving at the time. And since idle GCs free some memory, they make it less likely that you reach the limit that would trigger a heap overflow GC.

With idle GCs, if you are lucky, major GCs will only run while your application isn't meant to be responding to requests at all, which makes them basically free.
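As a rough illustration of the third trigger and of why a major GC matters for finalizers, here is a small sketch. It assumes performMajorGC from System.Mem and mkWeakIORef from Data.IORef; finalizerRanAfterMajorGC is a hypothetical name, and the RTS makes no hard promise about when (or whether) a finalizer runs, so treat the outcome as typical rather than guaranteed.

```haskell
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.IORef (mkWeakIORef, newIORef)
import Data.Maybe (isJust)
import System.Mem (performMajorGC)
import System.Timeout (timeout)

-- | Hypothetical helper: attach a finalizer to a value, drop the value,
-- force a major GC by hand, and report whether the finalizer ran
-- within one second.
finalizerRanAfterMajorGC :: IO Bool
finalizerRanAfterMajorGC = do
  done <- newEmptyMVar
  ref  <- newIORef (0 :: Int)
  _    <- mkWeakIORef ref (putMVar done ())
  -- 'ref' is not used again, so a major GC may treat it as garbage.
  performMajorGC
  -- Finalizers run in their own thread after the GC, so wait briefly.
  isJust <$> timeout 1000000 (takeMVar done)

main :: IO ()
main = do
  ran <- finalizerRanAfterMajorGC
  putStrLn $
    if ran
      then "finalizer ran after the manual major GC"
      else "finalizer did not run within one second"
```

Without the performMajorGC call, a program that allocates little afterwards has nothing left to trigger a major GC except the idle GC, which is the case described above.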
Also, it will mean that a GC is less likely to happen when the application is busy, so application responsiveness may be improved. However, if the amount of live data in the heap is particularly large, then the idle GC can cause a significant penalty to responsiveness. [Why? Is it because the idle GC was delayed by waiting for some idle time, and thus has more work to do?].
This can happen because the time a major GC takes is proportional to the amount of live data in the heap. So, if the pause required by the GC overlaps with time when you'd like the application to be working on a response, response times will regress. For instance, if an idle GC takes 100ms and a request comes in just after the GC has started, processing that request has to wait until the GC is over.
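One way to see this relationship on your own machine is the following sketch, which keeps lists of different sizes alive across a forced major GC and prints what the RTS reports for that collection. It assumes the GHC.Stats interface from base (getRTSStats, and the gc, gcdetails_live_bytes and gcdetails_elapsed_ns fields) and needs to be run with +RTS -T; nsToMs and measureMajorGC are hypothetical helpers, and the list sizes are arbitrary.

```haskell
import Control.Exception (evaluate)
import Data.Int (Int64)
import GHC.Stats
  ( GCDetails (gcdetails_elapsed_ns, gcdetails_live_bytes)
  , RTSStats (gc)
  , getRTSStats
  , getRTSStatsEnabled
  )
import System.Mem (performMajorGC)

-- | Hypothetical helper: convert nanoseconds to milliseconds.
nsToMs :: Int64 -> Double
nsToMs ns = fromIntegral ns / 1e6

-- | Hypothetical helper: keep a list of n Ints alive, force a major GC,
-- and print the live bytes and elapsed time of that collection.
measureMajorGC :: Int -> IO ()
measureMajorGC n = do
  let xs = [1 .. n] :: [Int]
  _ <- evaluate (length xs)       -- build the list so it counts as live data
  performMajorGC
  details <- gc <$> getRTSStats   -- details of the most recent GC
  putStrLn $
    show (gcdetails_live_bytes details) ++ " live bytes, "
      ++ show (nsToMs (gcdetails_elapsed_ns details)) ++ " ms"
  _ <- evaluate (sum xs)          -- use xs afterwards so it stays live across the GC
  pure ()

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if enabled
    then mapM_ measureMajorGC [250000, 1000000, 4000000]
    else putStrLn "Run with +RTS -T to enable RTS statistics."
```

The exact numbers depend on the machine, the GHC version and the optimisation level; the point is only that the reported pause should grow with the amount of live data.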
Conversely, too small of an interval could adversely affect interactive responsiveness [How? And how is this worse than having idle GC disabled? What is the actual behavior when it's disabled, anyway?]
The smaller the interval, the more time you spend running idle GCs, and the more likely it becomes that one will overlap with time you want to be doing something else. This is similar to the long GC case above due to large heaps.

Another reason you might not want to run it too often is that you are unlikely to free much memory.

I think this documentation was written before the non-moving GC was added. It would also be important to add that the savings in terms of responsiveness don't really apply if the non-moving GC is enabled, as it runs concurrently with the mutator anyway. So the main advantage would just be more prompt finalization, deadlock detection, etc.

I hope that helps; let me know if you'd like anything clarified.

Cheers,
Teo

Thanks! Let's see if I can find the time to make a patch for this.
On Thu, 2 Nov 2023 at 17:17, Teofil Camarasu

I've now drafted some new documentation at
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/11655. Reviews welcome.
On Mon, 6 Nov 2023 at 13:13, Bryan Richter