
On Wed, 16 Jan 2008, Roman Leshchinskiy wrote:
Henning Thielemann wrote:
Reading various papers and the Wiki about GHC optimizer rules, I got the impression that there are not many properties I can rely on, and I wonder how I can write a reliable fusion framework under this constraint.
That depends on your definition of reliable. You can't have a framework which fuses everything that can be fused but then, I don't think that's even theoretically possible.
At least I expect that it fuses greedily and does not stop while rules are still applicable. When I introduce intermediate fusible function replacements, I want to be sure that the rules which translate them back are invoked, so that my optimization attempts cannot make things worse.
I read about the strategy of replacing functions early by fusible implementations and translating them back to fast low-level implementations if fusion was not possible. However, can I rely on the back-translation if I have no guarantee that the corresponding rule is applied? Is there some guarantee that rules are applied as long as applicable rules are available, or is the optimizer free to decide that it has worked enough for today? I see several phases with a fixed number of iterations in the output of -ddump-simpl-iterations. Is there some idea behind these phases, or are their structure and number rather arbitrary? If there is only a fixed number of simplifier runs, how can I rely on complete fusion of arbitrarily large expressions?
In general, you can't.
To give a precise example: if I have a sequence of 'map's

  map f0 . map f1 . ... . map fn

then is there some length at which this is no longer collapsed to a single 'map'? But then I wonder how it is possible to make the compiler go into an infinite loop with the rule

  "loop" forall x,y. f x y = f y x

as stated in the GHC manual: http://haskell.org/ghc/docs/latest/html/users_guide/rewrite-rules.html
I'm still uncertain how much is done in one iteration of one phase, since there seem to be several rules that can fire in one iteration.
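For concreteness, a map-collapsing rule of the kind discussed here can be written with GHC's RULES pragma roughly as follows. This is only a sketch: the rule name "map/map" and the helper 'bump' are invented for illustration, and in practice such a rule only fires if the simplifier sees the nested applications syntactically (e.g. after inlining of (.)).

```haskell
-- Hypothetical sketch: collapse two adjacent maps into one.
-- (GHC's own list fusion does this differently, via foldr/build.)
{-# RULES
  "map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
  #-}

-- A pipeline of maps that repeated application of the rule above
-- would collapse into a single traversal.
bump :: [Int] -> [Int]
bump = map (+1) . map (*2) . map (subtract 1)
```

Each application of the rule removes one intermediate list, so a chain of n maps needs n-1 rule firings, which is where the question about the iteration budget comes from.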
You can control the number of simplifier phases with -fsimplifier-phases (in the HEAD only) and the number of iterations in each phase with -fmax-simplifier-iterations.
Good to know.
That said, there are other things that break fusion (such as code getting between two functions you want to fuse). Again, you can only try to make your framework good enough; it'll never be perfect.
It would be nice to have a flag which randomly alters the compiler's rule application order, in order to see whether a fusion framework implicitly relies on a particular behaviour of the current compiler version.
Another text passage says that simplification traverses expressions inside-out. Such a fixed direction would make the design of rules definitely easier. Having both directions, maybe alternating between runs of the simplifier, would also be nice. I could then design transformations of the kind:

  toFastStructure . slowA . slowB . slowC . slowWithNoFastCounterpart
  fastA . toFastStructure . slowB . slowC . slowWithNoFastCounterpart
  fastA . fastB . toFastStructure . slowC . slowWithNoFastCounterpart
  fastA . fastB . fastC . toFastStructure . slowWithNoFastCounterpart
  fastA . fastBC . toFastStructure . slowWithNoFastCounterpart
  fastABC . toFastStructure . slowWithNoFastCounterpart
Again, I don't think you really want to rely on the order of simplification. For your example, you only need the following rules:
  toFastStructure (slow{A|B|C} x) = fast{A|B|C} (toFastStructure x)
  fastB (fastC x) = fastBC x
  fastA (fastBC x) = fastABC x
They do not require any specific traversal order.
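Spelled out as a module, the three rule shapes above might look like this. Everything except the rule left-hand sides is invented for illustration: the Fast type and the list-based implementations are placeholders, and the NOINLINE pragmas stand in for whatever inlining control a real framework would need so the rules can actually match.

```haskell
-- Hypothetical "fast structure" standing in for whatever the real
-- fusion framework uses internally.
newtype Fast = Fast [Int] deriving Eq

toFastStructure :: [Int] -> Fast
toFastStructure = Fast
{-# NOINLINE toFastStructure #-}

-- Placeholder slow implementations.
slowA, slowB, slowC :: [Int] -> [Int]
slowA = map (+1)
slowB = map (*2)
slowC = reverse
{-# NOINLINE slowA #-}
{-# NOINLINE slowB #-}
{-# NOINLINE slowC #-}

-- Fast counterparts, plus the fused combinations.
fastA, fastB, fastC, fastBC, fastABC :: Fast -> Fast
fastA (Fast xs) = Fast (map (+1) xs)
fastB (Fast xs) = Fast (map (*2) xs)
fastC (Fast xs) = Fast (reverse xs)
fastBC  = fastB . fastC
fastABC = fastA . fastBC
{-# NOINLINE fastB #-}
{-# NOINLINE fastC #-}
{-# NOINLINE fastBC #-}

{-# RULES
  "toFast/slowA"  forall x. toFastStructure (slowA x) = fastA (toFastStructure x)
  "toFast/slowB"  forall x. toFastStructure (slowB x) = fastB (toFastStructure x)
  "toFast/slowC"  forall x. toFastStructure (slowC x) = fastC (toFastStructure x)
  "fastB/fastC"   forall x. fastB (fastC x) = fastBC x
  "fastA/fastBC"  forall x. fastA (fastBC x) = fastABC x
  #-}
```

No matter in which order the simplifier visits the pipeline, each of these rules rewrites a subterm independently of the others, which is why no particular traversal order is needed.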
Ok, this was a bad example. Try this one:

  project . project . foo

with the rules

  project (project x) = project x
  project (foo x) = projectFoo x

Both rules can be applied to the expression, but you get one more fusion if you apply the first one first. Let me guess: in order to solve that, I should restrict the second rule to a later phase than the first one. Thanks for the detailed answer, and thanks to the busy people who have created the optimizer and who have written all the papers and Wiki pages that make this feature usable. I don't know of another language where it is possible to control the optimizer in this way.
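For what it's worth, the phase restriction guessed at above can be written with GHC's phase annotations on rules; phases count down towards 0, so "[1]" means "active from phase 1 onwards". The functions project, foo and projectFoo are of course hypothetical here, with made-up list implementations just so the module compiles.

```haskell
-- Hypothetical implementations; only the rule shapes come from the
-- discussion above.
project :: [Int] -> [Int]
project = map abs
{-# NOINLINE project #-}

foo :: [Int] -> [Int]
foo = map negate
{-# NOINLINE foo #-}

projectFoo :: [Int] -> [Int]
projectFoo = project . foo
{-# NOINLINE projectFoo #-}

{-# RULES
  -- Active in every phase: collapse repeated projections first.
  "project/project"      forall x. project (project x) = project x
  -- Delayed to phase 1 and later, after project/project has had
  -- a chance to fire on the outer pair.
  "project/foo"    [1]   forall x. project (foo x) = projectFoo x
  #-}
```

With this phasing, project . project . foo first collapses to project . foo and then to projectFoo, whereas firing "project/foo" on the inner pair first would leave project . projectFoo stuck.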