
On Wed, Jul 11, 2007 at 11:14:20AM +0100, Tony Finch wrote:
registers or on the stack) and a case analysis branch, but a normal function return (predictable by the CPU) is replaced by a less-predictable indirect jump. Does anyone have references to a paper that discusses an
GHC does not use calls and returns. To quote a recent paper on this subject: 7.1 Using call/return instructions As we mentioned in Section 2, GHC generates code that manages the Haskell stack entirely separately from the system-supported C stack. As a result, a case expression must explicitly push a return address, or continuation, onto the Haskell stack; and the "return" takes the form of an indirect jump to this address. There is a lost op- portunity here, because every processor has built-in CALL and RET instructions that help the branch-prediction hardware make good predictions: a RET instruction conveys much more information than an arbitrary indirect jump. Nevertheless, for several tiresome reasons, GHC cannot readily make use of these instructions: * The Haskell stack is allocated in the heap. GHC generates code to check for stack overflow, and relocates the stack if necessary. In this way GHC can support zillions of little stacks (one per thread), each of which may be only a few hundred bytes long. However, operating systems typically take signals on the user stack, and do no limit checking. It is often possible to arrange that signals are executed on a separate stack, however. * The code for a case continuation is normally preceded by an info table that describes its stack frame layout. This arrangement is convenient because the stack frame looks just like a heap closure, which we described in Section 2. The garbage collector can now use the info table to distinguish the point- ers from non-pointers in the stack frame closure. This changes if the scrutinee is evaluated using a CALL instruction: when the called procedure is done, it RETurns to the instruction right after the call. This means that the info table can no longer be placed before a continuation. Thus the possible benefits of a CALL/RET scheme must outweigh the performance penalty of abandoning the current (efficient) info table layout. Stefan