
| My plan is to gather up | the answers on a wiki page. Excellent -- please do that! My replies are below Simon | This mail basically asks just one question: what is the order of | optimizations pefromed on Core? It's entirely defined by SimplCore.getCoreToDo :: DynFlags -> [CoreToDo] The code there should be reasonably self-explanatory. The type signature is very descriptive. | 1. What is the difference between a "simplifier iteration" and | "simplifier phase"? Roughtly, a complete run of the simplifier means "run the simplifier repeatedly until nothing further happens". The iterations are the successive iterations of this loop. Currently there's a (rather arbitrary) limit of four such iterations before we give up and declare victory. | 1a. What is the default maximum iterations count? User documentation does | not specify that. Four | 2. How can I observe the effects of `-ddump-simpl-phases`. I tried | compiling several different | programs and this flag seems to have no effect (ie. nothing gets | printed). I had to read the source code. I think you say "-ddump-simpl-phases=A,B,C" to dump the output of phases called A,B,C. But no, it seems that it only affects output of simplifier statistics (see simplifyPgmIO). I have never used this flag. Maybe it can go. Looks strange to me. | 3. Cardinality anlaysis and inlining: cardinality analysis can determine | that a let binding is | used exactly once. Can the inliner re-use this information from the | cardinality analysis or does | it recompute it per [3], section 3.1? Cardinality analysis determines that something is *demanded* once. The occurrence analyser determines when it *occurs* once. For example if .. then x else x+1 x occurs twice, but is demanded once. So they are different. The inliner uses occurrence information. | 4a. The first phase is "Desugar (after optimization)". What optimizations | are performed during | desugaring? Just a few basic ones; see CoreSubst.simpleOptPgm. It implements the "Very Simple Optimiser" which is only a page or two of code. Reading it and writing a Note that enumerates what optimisations it does would be a Good Thing. | 4b. I'm not sure whether I'm looking at a single iteration of core2core | transformation or at | multiple ones. Some passes are performed several times (Float out, Float | inwards), which suggests | that there might be many iterations here. On the other hand simplifier | phases are decreasing | towards 0, which looks as if it was one core2core iteration. My | assumption here is that every | time a new core2core iteration starts the simplifier phases are counted | anew from 2 towards 0. Is | that correct? No. getCoreToDo produces a list of CoreToDos. Each specifies a stage in the pipeline. One such stage is a run of the simplifier. Such a run has a "phase" number, which is set in getCoreToDo. This phase number is used (only) to control INLINE pragmas and RULES (see extensive documentation in the user manual e.g http://www.haskell.org/ghc/docs/latest/html/users_guide/pragmas.html#phase-c...) | 4c. Why are there several 0 phases of the Simplifier? I find it | confusing. We need to run the simplifier several times to propagate the effects of (say) strictness analysis or let-floating. But by that stage the need for staging RULES and INLINE pragmas is over. | 4d. I understand that some passes can be enabled or disabled using | command-line options. Can the | decission to run some passes be made dynamically by the compiler (eg. to | run extra simplifier | passes)? Yes. Plug-ins do precisely that, by manipulating the [CoreToDo] pipeline list. | 4e. Are there more phases that could appear here, ie. they were ommited | with -O? Could appear where? | 4f. "Levels added" pass before the "Float out" pass: my guess is that | this is preparation for the | full laziness transform. So, is full laziness performed as part of "Float | out" pass? The full laziness pass is simply a combination of set-levels (an analysis) followed by float out (which transforms the program) | A general note is that I am confused by many Simplifier passes being | interleaved with other | passes. I expected that simplifier phases will grouped into a single | pass, as speculated in | question 1. Many passes produce output that can readily be simplified. Rather than require them to perform those simplifications, we delegate it to the simplifier. | 5. What optimizations *exactly* are performed by the Simplifier? I assume | that most of what's | described in chapter 3 of [2]: beta reduction, let elimination, case | elimination, case floating, | constant folding and eta expansion. I'm not sure about floating let | outwards and inwards - [1], | pg. 7, says these are in a pass separate from the simplifier. `-dverbose- | core2core` seems to | confirm that since it reveals separate "Float out" and "Float inwards" | passes. I don't have an exhaustive list, I'm afraid. The general rule is: if it can be done with local knowledge, the simplifier should do it. Making a full list would be another good exercise. If you start I could add more. Examples: - constant foldings - applying RULES - inlining (a really important one) - case of case - case of known constructor - eta expansion and eta reduction - combining adjacent casts - pushing a cast out of the way of an application e.g. (f |> k) a ==> f (a |> k1) |> k2 for suitable k1, k2 and there are probably a lot more. Look for SimplifierTicks; every time the simplifier does something significant it bumps a tick count. (or should do so.) | 6. [4], pg. 31, mentions the Deforestation optimisation. Is everything | described in | that "Deforestation" section subsumed by cardinality analysis ([5], end | of section 2.1 and | section 7.1)? If not then when is deforestation performed? Deforestation is simply the application of rewrite RULES; that's done by the simplifier. | 7. [5], section 6.1 says: "We run the analysis twice, once in the middle | of the optimisation | pipeline and once near the end". When exactly in the middle of the | pipeline? Between which | passes? This does not show up with `-dverbose-core2core` (or at least it | is not explicitly | named). I'm not actually certain that cardinality analysis *is* run twice in HEAD. Maybe it was only in the version for the paper. Ilya can tell us | 8. How does the rules rewriting fit into the picture? It's done by the simplifier