Re: [Haskell-cafe] An interesting paper on VM-friendly GC

16 Oct 2010


On 16 October 2010 10:35, Andrew Coppin wrote:
...
 On 15/10/2010 11:50 PM, Gregory Crosswhite wrote:
...
 On 10/15/2010 03:15 PM, Andrew Coppin wrote:
...
On the other hand, their implementation uses a modified Linux kernel, and
no sane person is going to recompile their OS kernel with a custom patch
just to run Haskell applications, so we can't do quite as well as they did.
But still, an interesting read...
Ah, but you are missing an important fact about the article:  it is not
about improving garbage collection for Haskell, it is about improving
collection for *Java*, which is a language in heavy use on servers.  If this
performance gain really is such a big win, then I bet that it would highly
motivate people to make this extension as part of the standard Linux kernel,
at which point we could use it in the Haskell garbage collector.
Mmm, that's interesting. The paper talks about "Jikes", but I have no idea
what that is. So it's a Java implementation then?
Jikes is a virtual machine used for research; it actually has a decent
just-in-time compiler.  Its memory management toolkit (MMTk) also
makes it quite easy to experiment with new GC designs.
...
Also, it's news to me that Java finds heavy use anywhere yet. (Then again,
if they run Java server-side, how would you tell?)
Oh, it's *very* heavily used.  Many commercial products run on Java,
both server-side and client-side.
...
It seems to me that most operating systems are designed with the assumption
that all the code being executed will be C or C++ with manual memory
management. Ergo, however much memory the process has requested, it actually
*needs* all of it. With GC, this assumption is violated. If you ask the GC
nicely, it may well be able to release some memory back to you. It's just
that the OS isn't designed to do this, so the GC has no idea whether it's
starving the system of memory, or whether there's plenty spare.
I know the GC engine in the GHC RTS just *never* releases memory back to the
OS. (I imagine that's a common choice.) It means that if the amount of truly
live data fluctuates up and down, you don't spend forever allocating and
freeing memory from the OS. I think we could probably do better here.
(There's an [ancient] feature request ticket for it somewhere on the
Trac...) At a minimum, I'm not even sure how much notice the current GC
takes of memory page boundaries and cache effects...
Actually that's been fixed in GHC 7.
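
To make "releasing memory back to the OS" concrete, here is a minimal C
sketch of the kind of call a runtime could use. It is not taken from the
GHC RTS or from the paper; release_and_probe is a hypothetical helper, and
the zero-fill behaviour it checks is specific to Linux private anonymous
mappings.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper (not from GHC or the paper): map `len` bytes, dirty
   them, then tell the kernel the contents are no longer needed.
   Returns the first byte read back afterwards, or -1 on error.
   Assumption: Linux, where MADV_DONTNEED on a private anonymous mapping
   drops the pages and later reads see fresh zero-filled pages. */
int release_and_probe(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);            /* force the pages to be backed by RAM */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int first = p[0];                /* address range is still mapped */
    munmap(p, len);
    return first;
}
```

The address range stays valid after the madvise call, so a collector could
keep its heap layout and simply let the kernel reclaim the physical pages
behind regions it knows to be free.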
...
GC languages are not exactly rare, so maybe we'll see some OSes start adding
new system calls to allow the OS to ask the application whether there's any
memory it can cheaply hand back. We'll see...
I wouldn't be surprised if some OS kernels already have some
undocumented features to aid VM-friendly GC.  I think it's probably
going to have to be the other way around, though.  Rather than the OS
asking for its memory back, the application should ask for the page
access bits and then decide for itself (as done in the paper).  I don't
know how that interacts with the VM paging strategy, though.
Microkernels such as L4 already support these things (e.g., through L4's
UNMAP system call).  Xen and co. probably have something similar.
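
As a rough illustration of an application asking the kernel about its own
pages, here is a small C sketch using mincore(2). Note the caveat: mincore
reports whether pages are resident in memory, not their access bits, so it
is only an approximation of what the paper gets from its modified kernel.
resident_pages is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: count how many pages in [addr, addr + len) are
   currently resident in physical memory. `addr` must be page-aligned.
   Returns -1 on error. This reports residency only; stock mincore does
   not expose the per-page access bits discussed above. */
long resident_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    size_t n = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(n);      /* one status byte per page */
    if (vec == NULL)
        return -1;

    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }

    long count = 0;
    for (size_t i = 0; i < n; i++)
        count += vec[i] & 1;             /* low bit set: page is resident */

    free(vec);
    return count;
}
```

A collector could compare this count against the size of its heap to guess
whether parts of it have been paged out before deciding what to trace.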


-- 
Push the envelope. Watch it bend.

Thomas Schilling