
On Mon, Feb 12, 2007 at 04:45:47PM +1100, Matt Roberts wrote:
Hi all,
I am trying to get a deeper understanding of core's role in GHC and the compilation of functional languages in general. So far I have been through
- The hackathon videos,
- "A transformation-based optimiser for Haskell",
- "An External Representation for the GHC Core Language (DRAFT for GHC5.02)", and
- "Secrets of the Glasgow Haskell Compiler inliner".
and I am still a bit hazy on a few points: - What role do *the semantics* of core play (i.e. how and where are they taken advantage of)?
They aren't, at least not directly by the compiler - the semantics are what give people named Simon the courage to implement counterintuitive optimizations without losing sleep. (since they know formally nothing can go wrong)
- Exactly what are the operational and denotational semantics of core?
- The headline reasons (and any other arguments that emerge) for having core *and* stg as separate definitions.
I have an intuition on a few of these points, but I would love something concrete to latch on to.
(disclaimer: I am not a GHC hacker, nor have I ever gotten around to actually writing an optimizer.)

It's a simple question of expediency and tradeoffs.

Core has lots of nice mathematical properties. For instance, Core's call-by-name properties allow the compiler to simply prune unreferenced expressions without worrying about changing termination behavior. Core expressions can be rearranged by nice pattern matches, and as a strongly typed calculus Core is relatively immune to misoptimization.

On the other hand, Core is rather far from the machine, and much is still implicit - invisible and unoptimizable. If GHC only used Core, you would get all the nice large-scale optimizations (fusion comes immediately to mind), but you would pay full price for every forced closure etc - wasted effort could not be optimized away.

STG is much closer to the machine. If GHC's desugarer produced STG and Core were removed completely, GHC would still be able to produce nearly perfect code. Every optimization that could be performed at Core can in principle be performed at STG. In practice things are a bit different. STG, as an impure, strict, untyped language, is missing the nice properties of Core, and many optimizations that are a no-brainer to write at the Core level are riddled with side conditions at the lower level.

So, to summarize: We have Core because Simon lacks the patience to solve the halting problem and properly perform effects analysis on STG. We have STG because Simon lacks the patience to wait for the 6.6 Simplifier to finish naively graph-reducing every time.

(now let's hope that MY intuitions are on the right track!)

Stefan
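
The point about pruning unreferenced expressions can be made concrete with a toy sketch. Everything below is hypothetical: `Expr`, `freeVars` and `pruneDead` are illustrative names for a tiny lazy lambda calculus, not GHC's actual Core data type or simplifier. It only shows why dropping an unused binding needs no side conditions when bindings are not evaluated until demanded.

```haskell
-- Toy expression language with non-recursive let (an assumption for
-- illustration; real Core also has types, case, letrec, and more).
data Expr
  = Var String
  | Lit Int
  | Lam String Expr
  | App Expr Expr
  | Let String Expr Expr   -- let x = rhs in body
  deriving (Eq, Show)

-- Free variables of an expression (duplicates are harmless here).
freeVars :: Expr -> [String]
freeVars (Var x)          = [x]
freeVars (Lit _)          = []
freeVars (Lam x b)        = filter (/= x) (freeVars b)
freeVars (App f a)        = freeVars f ++ freeVars a
freeVars (Let x rhs body) = freeVars rhs ++ filter (/= x) (freeVars body)

-- Drop every let whose variable is never referenced in its body.
-- Under call-by-name/need the right-hand side would never have been
-- evaluated, so removing it cannot change termination behavior.
pruneDead :: Expr -> Expr
pruneDead (Lam x b) = Lam x (pruneDead b)
pruneDead (App f a) = App (pruneDead f) (pruneDead a)
pruneDead (Let x rhs body)
  | x `elem` freeVars body' = Let x (pruneDead rhs) body'
  | otherwise               = body'
  where
    body' = pruneDead body
pruneDead e = e
```

For example, `pruneDead (Let "unused" (App (Var "loop") (Lit 0)) (Lit 42))` evaluates to `Lit 42`, even if `loop` stands for a diverging function. In a strict language the same rewrite would first need a proof that the right-hand side terminates and has no effects, which is the kind of side condition mentioned above for STG.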