
Hi!
I'm trying to understand the interaction between the -A and -H RTS flags. The documentation at http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/runtime-control.html says that if you use -H (with or without an argument) it implicitly implies some value of -A. However, it's not clear to me what value -A will get and how that value is related to the value of -H. For example, if I set the suggested heap size to 1G, using -H1G, surely the size of the nursery (-A) won't be "whatever is left over," but something more reasonable e.g. the size of the L2 cache?
Perhaps it would make sense to document the actual algorithm used to set -A given -H (with and without argument.)
Cheers,
Johan

On 25/02/2012 16:51, Johan Tibell wrote:
Hi!
I'm trying to understand the interaction between the -A and -H RTS flags. The documentation at
http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/runtime-control.html
says that if you use -H (with or without an argument) it implicitly implies some value of -A. However, it's not clear to me what value -A will get and how that value is related to the value of -H. For example, if I set the suggested heap size to 1G, using -H1G, surely the size of the nursery (-A) won't be "whatever is left over," but something more reasonable e.g. the size of the L2 cache?
Perhaps it would make sense to document the actual algorithm used to set -A given -H (with and without argument.)
Hmm, I took a look at the docs and to me it seems clear (but it would do, since I wrote the docs :-)
"Think of -Hsize as a variable -A option. It says: I want to use at least size bytes, so use whatever is left over to increase the -A value."
Doesn't that describe exactly what it means?
Well, actually it's a bit more complicated than that, and there are some heuristics involved. But the basic idea is to use all of the memory granted by -H for the heap, by increasing -A to fill any free space.
The complications arise because we don't know how much of the nursery will need to be copied at the next GC, and the worst case (all of it) very often leaves a lot of memory unused, so we make an approximation. Sometimes this is an underestimate, and we end up using more memory than the -H value for a while.
Whether -H is a good idea is not clear. When I added it, it was for backwards compatibility with the previous GC, which had a fixed-size heap and required that you specify the heap size with -H.
Cheers,
Simon
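
The "left over" idea in the message above can be put as a toy calculation. This is only a minimal sketch under stated assumptions, not the RTS's actual code: the function name, its parameters, and the single survival-fraction estimate are hypothetical stand-ins for the heuristics Simon mentions.

```python
def suggested_nursery_bytes(heap_hint, min_nursery, non_nursery_needed, est_survival):
    """Toy model of "-H as a variable -A" (illustrative, not GHC's code).

    heap_hint          -- the -H value, in bytes
    min_nursery        -- the -A value, used as a lower bound, in bytes
    non_nursery_needed -- memory the rest of the heap is assumed to need
    est_survival       -- assumed fraction (0.0 to 1.0) of the nursery that
                          will be copied at the next GC
    """
    left_over = heap_hint - non_nursery_needed
    if left_over <= 0:
        # Nothing is left over, so fall back to the plain -A size.
        return min_nursery
    # A nursery of size N also needs about N * est_survival bytes to copy
    # its survivors into, so choose N with N * (1 + est_survival) == left_over.
    nursery = int(left_over / (1.0 + est_survival))
    return max(min_nursery, nursery)


MIB = 1024 * 1024

# Hypothetical numbers: -H1G, -A512K, 200 MiB assumed in use elsewhere,
# and 10% of the nursery assumed to survive the next GC.
print(suggested_nursery_bytes(1024 * MIB, MIB // 2, 200 * MIB, 0.10) // MIB)
```

With est_survival set to 1.0 (the worst case, where the whole nursery is copied) the nursery gets only half of what is left over. A more optimistic estimate gives a larger nursery, and if more data survives than assumed the total can overshoot the -H value for a while.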

Hi Simon,
On Mon, Feb 27, 2012 at 12:25 AM, Simon Marlow wrote:
"Think of -Hsize as a variable -A option. It says: I want to use at least size bytes, so use whatever is left over to increase the -A value."
Doesn't that describe exactly what it means?
Maybe. Let me start with the mental model I approach this with: the allocation area (i.e. the nursery) should have a size in the order of megabytes, often around the size of the L2 cache.
Given this model, I read the above as:
* if you set e.g. -H1G, you'll get an allocation area which is in the order of 1Gb large. That makes no sense to me.
* The suggested size of the total heap (-H) has something to do with the size of the allocation area (-A). This makes no sense to me either.
So either I do understand what -H does, but it makes no sense to me, or I don't understand what -H does, but what it does makes sense.
Perhaps the confusion lies in the phrase "left over." Left over from what?
Cheers,
Johan

On 27/02/2012 17:34, Johan Tibell wrote:
Hi Simon,
On Mon, Feb 27, 2012 at 12:25 AM, Simon Marlow wrote:
"Think of -Hsize as a variable -A option. It says: I want to use at least size bytes, so use whatever is left over to increase the -A value."
Doesn't that describe exactly what it means?
Maybe. Let me start with the mental model I approach this with: the allocation area (i.e. the nursery) should have a size in the order of megabytes, often around the size of the L2 cache.
Ah, so I see where your confusion arises - this assumption is not true in general. Just discard the assumption, and I think everything will make more sense.
Picking a size for -A around the L2 cache is often a good idea, but not always. GHC defaults to -A512K, but programs that benefit from much larger sizes are quite common. For more about the tradeoff, see my SO answer here:
http://stackoverflow.com/questions/3171922/ghcs-rts-options-for-garbage-coll...
Given this model, I read the above as:
* if you set e.g. -H1G, you'll get an allocation area which is in the order of 1Gb large. That makes no sense to me.
Right - see above. In fact there's no problem with a 1GB nursery.
* The suggested size of the total heap (-H) has something to do with the size of the allocation area (-A). This makes no sense to me either.
So either I do understand what -H does, but it makes no sense to me, or I don't understand what -H does, but what it does makes sense.
Perhaps the confusion lies in the phrase "left over." Left over from what?
Left over after the memory required by the non-nursery parts of the heap has been deducted.
Cheers,
Simon

On Tue, Feb 28, 2012 at 12:57 AM, Simon Marlow wrote:
Ah, so I see where your confusion arises - this assumption is not true in general. Just discard the assumption, and I think everything will make more sense.
Picking a size for -A around the L2 cache is often a good idea, but not always. GHC defaults to -A512K, but programs that benefit from much larger sizes are quite common. For more about the tradeoff, see my SO answer here:
http://stackoverflow.com/questions/3171922/ghcs-rts-options-for-garbage-coll...
Thanks for the explanation.
One has to be very careful in selecting the size of the allocation area in benchmarks. If the allocation area is large enough the GC might not need to run at all for the duration of the benchmark, while in a real program it would run.
-- Johan

On 28/02/2012 15:59, Johan Tibell wrote:
On Tue, Feb 28, 2012 at 12:57 AM, Simon Marlow wrote:
Ah, so I see where your confusion arises - this assumption is not true in general. Just discard the assumption, and I think everything will make more sense.
Picking a size for -A around the L2 cache is often a good idea, but not always. GHC defaults to -A512K, but programs that benefit from much larger sizes are quite common. For more about the tradeoff, see my SO answer here:
http://stackoverflow.com/questions/3171922/ghcs-rts-options-for-garbage-coll...
Thanks for the explanation.
One has to be very careful in selecting the size of the allocation area in benchmarks. If the allocation area is large enough the GC might not need to run at all for the duration of the benchmark, while in a real program it would run.
It is a problem, yes. You also have to be careful when comparing two benchmark runs that one of them didn't do an extra GC, because that can skew the results against it.
I'm fairly sure the GC community have looked into this problem, but I don't know of any references offhand. Trawling Richard Jones' GC bibliography might turn up something:
http://www.cs.kent.ac.uk/people/staff/rej/gcbib/
Cheers,
Simon
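
The benchmarking concern raised in this exchange can be illustrated with a deliberately crude model. It is a sketch under stated assumptions only: it assumes one minor collection each time the nursery fills and ignores major collections entirely, and the function name and workload figure are hypothetical.

```python
def minor_gc_count(total_allocated, nursery_size):
    """Toy model: one minor GC each time the nursery fills up."""
    if nursery_size <= 0:
        raise ValueError("nursery_size must be positive")
    return total_allocated // nursery_size


MIB = 1024 * 1024
ALLOCATED = 100 * MIB  # hypothetical benchmark that allocates 100 MiB in total

for label, nursery in [("-A512K", MIB // 2), ("-A1G", 1024 * MIB)]:
    print(label, minor_gc_count(ALLOCATED, nursery))
# prints:
# -A512K 200
# -A1G 0
```

In this model the short benchmark never collects with the large nursery, which is the situation Johan describes. On a real program, the summary printed by +RTS -s reports how many collections actually ran, so two runs can be checked against each other.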