Re: Recursive functions and constant parameter closures (inlining/strictness analyzer question)

25 Jun 2008

      2008/6/23 Dan Doel :
...
On Monday 23 June 2008, Isaac Dupree wrote:
...
there's no chance for the lower-level near code generation to
reverse-SAT to eliminate the heap usage? (which would obviously be a
different optimization that might be useful in other ways too, if it
could be made to work) (did someone say that -fvia-C with gcc -O3
managed to do that sometimes?)
I said something similar, but I don't actually know the details. What I have
is a SAT and non-SAT version of a benchmark:
In the SAT version, the native code generator is noticeably slower
than -fvia-c -optc-O3, so the latter is doing something right. In the non-SAT
version, -fvia-c is somewhat faster than the SAT version, but the NCG is just
as fast. I don't know what's going on on the C end in the first case that
makes it better, though (heap allocation is the same (low) in all cases;
via-c doesn't fix large heap allocation when it happens in my experience).
I've tested this on my own machine (i386-apple-darwin) and I can't
replicate it. For me (with 6.8.2), the SATed version is faster by
almost 40%, and there is no essentially no difference between the NCG
and GCC (GCC is maybe a fraction faster).

In any event GHC will not automatically SAT this function because it
has only one static argument. Indeed, I'm not quite sure why it should
be faster with SAT at all, and my assembly-fu is still too weak to be
able to work it out from the code generator output.
...
As an aside, I finally got around to compiling 6.9 from a day or so ago, and
noticed that one of the functions I'd mentioned earlier does get SATed to the
bad version (which causes lots of heap allocation in the program) because it
has two static arguments.
I devoted yesterday entirely to SAT and came up with some refinements
to my earlier heuristics which shave a couple of percent of
allocations and runtime off of nofib compared with the existing ones,
and also resolve this problem. I found that my tail call detection
from the day before was a bit broken. Now that it is fixed I use it to
avoid SATing functions which make tail calls, but with the important
wrinkle that if the function has static function parameters we should
SAT it anyway because this lets us do much more inlining and is a
bigger win than avoiding allocation. This criterion avoids the heap
allocation you are experiencing in your particular example.

I've also added rule generation to rewrite instances of functions that
are SATed purely because they have static function parameters back
into the non-SATed versions if they do not experience inlinig. This
prevents the worst cases that I got before when always SATing
functions with static function parameters.

What's more, I've found that most if not all SAT opportunities are
found early in the pipeline by these new criteria, so we can get away
with just one SAT pass instead of the two I reported earlier.

All in all, these developments are pretty promising and I suspect I'll
be able to integrate them into HEAD in time for 6.10, which will
hopefully alleviate some of your worries about the efficacy of SAT.

Cheers,
Max

Re: Recursive functions and constant parameter closures (inlining/strictness analyzer question)

Max Bolingbroke