
Reading various papers and the Wiki about GHC optimizer rules, I got the impression that there are not many properties I can rely on, and I wonder how I can write a reliable fusion framework under this constraint.

I read about the strategy of replacing functions early with fusable implementations and replacing them back with fast low-level implementations if fusion was not possible. However, can I rely on the back-translation if I have no guarantee that the corresponding rule is applied? Is there some guarantee that rules are applied as long as applicable rules are available, or is the optimizer free to decide that it has worked enough for today?

I see several phases with a fixed number of iterations in the output of -ddump-simpl-iterations. Is there some idea behind these phases, or are the structure and number rather arbitrary? If there is only a fixed number of simplifier runs, how can I rely on complete fusion of arbitrarily large expressions?

Somewhere I read that the order in which rules are applied is arbitrary. I would like to have some guarantee that more special rules are applied before more general rules. That is, if rule X is applicable wherever Y is applicable, then Y shall be tried before X. This is currently not assured, right?

Another text passage says that simplification proceeds inside-out through expressions. Such a direction would definitely make the design of rules easier. Having both directions, maybe alternating between runs of the simplifier, would also be nice. I could then design transforms of this kind, where each line is the result of one rewrite step applied to the line before:

    toFastStructure . slowA . slowB . slowC . slowWithNoFastCounterpart
    fastA . toFastStructure . slowB . slowC . slowWithNoFastCounterpart
    fastA . fastB . toFastStructure . slowC . slowWithNoFastCounterpart
    fastA . fastB . fastC . toFastStructure . slowWithNoFastCounterpart
    fastA . fastBC . toFastStructure . slowWithNoFastCounterpart
    fastABC . toFastStructure . slowWithNoFastCounterpart

On the one hand, the body of a function may not be available to fusion if the INLINE pragma is omitted. As far as I know, inlining may also take place without the INLINE pragma, but I have no guarantee. Can I rely on functions being inlined when they carry an INLINE pragma? Somewhere I read that functions are not inlined if there is still an applicable rule that uses the function on its left-hand side. Altogether I'm uncertain how inlining is interleaved with rule application. It was said that rules are just alternative function definitions. In this sense, a function definition with INLINE is a more aggressively used simplifier rule, right?

On the other hand, if I set the INLINE pragma, then the body of the function is not fused. If it were fused, I could guide the optimizer to fuse certain sub-expressions before others. Say, doubleMap f g = map f . map g could be fused to doubleMap f g = map (f . g), and then this fused version could be fused further in the context of the caller. The current situation seems to be that {-# INLINE doubleMap #-} switches off local fusion and allows global fusion, whereas omitting the INLINE pragma switches on local fusion and disallows global fusion. How can I have both of them?
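
To make the last question concrete, here is a minimal sketch of the situation I have in mind. The names myMap and the rule "myMap/myMap" are hypothetical stand-ins (I use my own map so the rule does not interfere with GHC's built-in list fusion), and the phase number in INLINE [1] is only my guess at how phase control might be used to delay inlining. I do not know whether the body of doubleMap is actually fused before it is inlined; that is exactly what I am asking.

```haskell
-- Hypothetical stand-in for a library 'map'. It is recursive, so I
-- assume GHC will not inline it away before the rule below can match.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []
myMap f (x:xs) = f x : myMap f xs

-- The fusion rule: two traversals become one. Rules only fire with -O.
{-# RULES
"myMap/myMap" forall f g xs. myMap f (myMap g xs) = myMap (f . g) xs
  #-}

-- Local fusion would turn the body into 'myMap (f . g)'.
-- Global fusion needs doubleMap to be inlined at the call site.
-- INLINE [1] asks GHC not to inline before simplifier phase 1;
-- whether that leaves room for local fusion first is an assumption.
doubleMap :: (b -> c) -> (a -> b) -> [a] -> [c]
doubleMap f g = myMap f . myMap g
{-# INLINE [1] doubleMap #-}

-- A caller where global fusion could combine three traversals into one.
tripleUse :: [Int] -> [Int]
tripleUse = myMap negate . doubleMap (+ 1) (* 2)
```

Compiling this with -O -ddump-rule-firings should show whether and when "myMap/myMap" fires, but the output alone does not tell me which of these firings I am entitled to rely on.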