
Folks,

There is some disagreement over how the GC options should be specified for Haskell programs. I've identified a couple of issues below; comments and opinions are greatly appreciated. If there's a consensus that things should be changed, then I'll make the changes for the next release.

Issue 1: should the maximum heap size be unbounded by default?

Currently the maximum heap size is bounded at 64M. Arguments for: this stops programs with a space leak eating all your swap space. Arguments against: it's annoying to have to raise the limit when you legitimately need more space.

Options:
  1. remove the default limit altogether
  2. raise the default limit
  3. no change
(any others?)

Issue 2: Should -M be renamed to -H, and -H renamed to something else?

The argument for this change is that GHC's -M option is closer to the traditional meaning of -H.

Issue 3: (suggestion from Julian S.) Perhaps there should be two options to specify "optimise for memory use" or "optimise for performance", which have the effect of setting the defaults for various GC options to appropriate values. Optimising for memory use might enable the compacting collector all the time, for instance. Optimising for performance is hard - we may be able to change some of the defaults to trade space for time, but it's unlikely to be entirely reliable (e.g. turning on the compacting collector sometimes improves performance, sometimes reduces it).
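(For reference on Issue 1: raising the limit currently means passing the RTS flag on every single run, e.g.

    ./myprog +RTS -M256m -RTS

where myprog stands for any compiled Haskell program.)

Cheers,
Simon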
-----Original Message-----
From: Julian Seward (Intl Vendor)
Sent: Monday, August 06, 2001 10:54 AM
To: Simon Marlow; Marcin 'Qrczak' Kowalczyk; cvs-ghc@haskell.org
Subject: RE: cvs commit: fptools/ghc/compiler/stranal DmdAnal.lhs
| It appears that no-one understands the GC options except me.
| Perhaps they should be redesigned, but here's the current story:
|
| -H<size> is the *minimum* heap size.  The heap will be grown
| as necessary, starting with the minimum, and up to the
| maximum heap size.
|
| -M<size> is the *maximum* heap size, by default 64M.  The
| heap is not allowed to grow beyond this size [...]
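So at the moment one writes, for example,

    ./myprog +RTS -H16m -M128m -RTS

to start with a 16M heap and let it grow to at most 128M (program name illustrative).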
In days long since gone (before the 4.XX RTS) -H simply set the heap size, and because the heap was fixed-size, that was the max heap size too.
How about:
* Renaming current -M to -H, and current -H to -HS.
* Fixing up the sizing calculations a bit so that the max heap size is
  more closely observed.
Result: -H means what it meant originally, but you can still set an initial heap size with -HS if you want. That means for the most part we can forget about -M and use -H instead.
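Under that scheme one would write (hypothetical flags, since this is only a proposal):

    ./myprog +RTS -H128m -RTS           (cap the heap at 128M)
    ./myprog +RTS -H128m -HS16m -RTS    (the same, but start from 16M)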
Also having an auto-fallback to compacting collection when heap gets full. Overall aim is to reduce, ideally to zero, the number of flags users have to give to the RTS in order to get reasonable performance yet efficient use of memory. People simply won't use the compacting collector if you have to ask for it specially.
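The fallback decision itself could be quite simple; a sketch (the names and the factor-of-two threshold are invented for illustration - the point is just that copying collection needs to-space headroom and compacting doesn't):

    data Collector = Copying | Compacting

    -- Once the live data could no longer be copied within the heap
    -- limit, switch from copying to compacting collection.
    chooseCollector :: Int -> Int -> Collector
    chooseCollector liveBytes maxHeap
      | 2 * liveBytes > maxHeap = Compacting
      | otherwise               = Copying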
J

On Mon, 6 Aug 2001, Simon Marlow wrote:
Issue 1: should the maximum heap size be unbounded by default? Currently the maximum heap size is bounded at 64M. Arguments for: this stops programs with a space leak eating all your swap space. Arguments against: it's annoying to have to raise the limit when you legitimately need more space.
Options:
1. remove the default limit altogether
2. raise the default limit
3. no change
(any others?)
I think that if there should be a default limit, it would be nice to be able to set it at compile time. This is something that I've wanted for quite some time. If I know that the program I am compiling is likely to need 100M of heap space, it feels silly having to give the RTS parameter to the program each time I run it. It would be much more convenient to just tell the compiler where I want the limit.
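(Something close to this may already be possible, if I read the user's guide right: you can bake in default RTS options by linking in a small C file that defines the ghc_rts_opts hook:

    /* rts-defaults.c: baked-in RTS defaults.
       Assumes GHC's ghc_rts_opts hook; check your version's docs. */
    char *ghc_rts_opts = "-M100m";

and compiling with something like "ghc Main.hs rts-defaults.c". But a proper compiler flag would be nicer.)

/Josef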

Simon Marlow wrote:
Folks,
There is some disagreement over how the GC options should be specified for Haskell programs.
Something that I think would be very convenient, would help alleviate some of the problems discussed, and would still be very easy to implement, is support for setting run-time system options from an environment variable, GHCRTS say. That way, you wouldn't have to specify them on the command line *every time* you run a program. It would allow users to run a shell script at login time to set the heap size limit to some suitable value, in some platform-specific way, for example taking into account the amount of available RAM.

An additional benefit is that if you call Haskell programs from shell scripts, and switch between different Haskell compilers (like I often do), you don't have to change your scripts to pass the right RTS options: they could automatically be taken from the right environment variable (GHCRTS, HBCRTS, HUGSRTS, NHC98RTS, etc.)

In the fudget library, we use the following flexible scheme (*): The /value/ of a parameter called /name/ is taken from

1. the command line, if -/name/ /value/ is present, else
2. the environment variable FUD_/prog/_/name/ (where /prog/ is the name of the program), if set, else
3. the environment variable FUD_/name/, if set, else
4. a builtin default (indicated in the tables above).

This allows users to set global defaults as well as defaults for particular programs.
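For concreteness, here is a minimal sketch of that lookup order in Haskell (the names are invented for illustration; this is not the actual Fudgets code):

    import System.Environment (getArgs, getProgName, lookupEnv)
    import Data.Maybe (catMaybes)

    -- Find "-name value" among the command-line arguments.
    fromArgs :: String -> [String] -> Maybe String
    fromArgs name (flag : value : rest)
      | flag == '-' : name = Just value
      | otherwise          = fromArgs name (value : rest)
    fromArgs _ _ = Nothing

    -- Resolve a parameter: command line first, then FUD_<prog>_<name>,
    -- then FUD_<name>, then the built-in default.
    lookupParam :: String -> String -> IO String
    lookupParam name def = do
      args    <- getArgs
      prog    <- getProgName
      perProg <- lookupEnv ("FUD_" ++ prog ++ "_" ++ name)
      global  <- lookupEnv ("FUD_" ++ name)
      return (head (catMaybes [fromArgs name args, perProg, global] ++ [def]))

A heap-size parameter would then be lookupParam "heap" "64M", overridable globally or per program.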
Issue 2: Should -M be renamed to -H, and -H renamed to something else?
HBC calls these flags -h and -H. I am sure you can figure out which is which!
Issue 3: (suggestion from Julian S.) Perhaps there should be two options to specify "optimise for memory use" or "optimise for performance",
Clever automatic GC tuning would of course be nice. The current solution seems to set the limit on how much can be allocated before the next GC based on heap residency. This lowers the performance of programs with low residency and a fast allocation rate. Taking the ratio between GC time and mutator time into account could perhaps help? (A sketch of what I mean is below.)

Regarding the maximum heap size: to avoid letting the heap grow too large, you could perhaps take into account the number of page faults that occur during garbage collection, or the ratio between CPU time and real time...
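Something along these lines (the 0.2 threshold and all names are made up for illustration; this is not GHC's actual policy):

    -- Pick the next allocation-area size from the time spent in GC
    -- versus the time spent in the mutator since the last collection.
    nextAllocArea :: Double -> Double -> Int -> Int
    nextAllocArea gcTime mutTime area
      | mutTime > 0 && gcTime / mutTime > 0.2 = area * 2  -- GC dominates: grow
      | otherwise                             = area      -- ratio acceptable: keep

Regards,
Thomas Hallgren

(*) http://www.cs.chalmers.se/Cs/Research/Functional/Fudgets/userguide.html#para...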

Hi Simon.
Issue 1: should the maximum heap size be unbounded by default? Currently the maximum heap size is bounded at 64M. Arguments for: this stops programs with a space leak eating all your swap space. Arguments against: it's annoying to have to raise the limit when you legitimately need more space.
Options:
Remove the default limit altogether - because you don't always know how much data a temporo-spatially remote end user of a finished product might need to use.
(any others?)
with the option to set a limit during debugging, in the event that a space leak is proving annoying and troublesome to track down during development.
Issue 2: Should -M be renamed to -H, and -H renamed to something else? The argument for this change is that GHC's -M option is closer to the traditional meaning of -H.
I suggest removing -H and -M, to avoid legacy semantic confusion, and introducing something like "--minimum-heap" (-Hmin) and "--maximum-heap" (-Hmax), to get away from the unintuitive "M".
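That is, something like (hypothetical flags, purely to illustrate the proposal):

    ./myprog +RTS -Hmin16m -Hmax256m -RTS
    ./myprog +RTS --minimum-heap=16m --maximum-heap=256m -RTS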
Issue 3: (suggestion from Julian S.) Perhaps there should be two options to specify "optimise for memory use" or "optimise for performance", which have the effect of setting the defaults for various GC options to appropriate values. Optimising for memory use might enable the compacting collector all the time, for instance. Optimising for performance is hard - we may be able to change some of the defaults to trade space for time, but it's unlikely to be entirely reliable (eg. turning on the compacting collector sometimes improves performance, sometimes reduces it).
Sounds sensible.
How about:
* Renaming current -M to -H, and current -H to -HS.
Don't like this because it's not intuitive and could cause legacy mixup problems.
* Fixing up the sizing calculations a bit so that the max heap size is more closely observed.
Depends on time cost and perceived run-time GC activity - anything which minimises runtime pauses is best.
Also having an auto-fallback to compacting collection when heap gets full. Overall aim is to reduce, ideally to zero, the number of flags users have to give to the RTS in order to get reasonable performance yet efficient use of memory. People simply won't use the compacting collector if you have to ask for it specially.
Agree with all of this paragraph.

Cheers,
Mike Thomas

In local.glasgow-haskell-users, you wrote:
Issue 1: should the maximum heap size be unbounded by default? Currently the maximum heap size is bounded at 64M. Arguments for: this stops programs with a space leak eating all your swap space. Arguments against: it's annoying to have to raise the limit when you legitimately need more space.
I'm for boundless wasting of memory. If I'd cared, I'd have set the ulimits accordingly. You really don't want to get to work on Monday morning and find that your favourite prime-finder stopped on Friday, five minutes after you left :-)
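(Setting the cap from the shell is a one-liner, e.g. in bash:

    ulimit -v 1048576    # limit virtual memory to 1 GB for subsequent commands

so the RTS needn't duplicate it by default.)

--
News from Genoa? http://germany.indymedia.org/2001/07/4866.html
Volker Stolz * stolz@i2.informatik.rwth-aachen.de * PGP + S/MIME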