
Hi,

On Sunday, 27.07.2014, at 15:29 -0400, David Feuer wrote:
I think I finally figured out the deeper problem behind an issue I was having getting some things to fuse fully. I'm hoping someone has a brilliant idea for getting around it. Of course, it may be that someone else has thoroughly studied the matter and determined that there's no good solution. Suppose we write
crazy :: T -> [U]
crazy x = some $ long $ fusion $ pipeline
oops f = map f $ crazy x
uhOh c n = foldr c n $ crazy Pig
ohMy = take bignum $ crazy amigo
When GHC compiles crazy, it rewrites the pieces of the pipeline to build/foldr forms, fuses them, and produces
crazy x = build someBigFunction
It then inlines build, producing
crazy x = anotherBiggy
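(For reference, this is roughly the machinery involved, essentially as it is defined in GHC.Base, with the phase details simplified:)

{-# LANGUAGE RankNTypes #-}

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []
{-# INLINE [1] build #-}

{-# RULES
"fold/build"  forall k z (g :: forall b. (a -> b -> b) -> b -> b) .
              foldr k z (build g) = g k z
  #-}

As long as a consumer's foldr still sees a literal build, the rule fires and the intermediate list is never constructed.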
So far, this looks reasonably sensible, but it's likely bad.
It is good for the uses of crazy where no further fusion happens. For the other cases, I believe GHC will rather try to inline the original definition. Maybe "some $ long $ fusion $ pipeline" was already deemed too big to be inlined – in that case an {-# INLINEABLE crazy #-} could help.
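Concretely, that would just be (reusing the made-up names from above):

{-# INLINEABLE crazy #-}
crazy :: T -> [U]
crazy x = some $ long $ fusion $ pipeline

This keeps the original right-hand side around (also in the interface file), so at a use site like uhOh GHC can still inline crazy and let the foldr there fuse with the pipeline.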
The problem is that GHC will (rightly) conclude that `build someBigFunction` is too big to inline, and therefore the fusion will break at that boundary and we'll produce intermediate lists in the functions that use crazy. Now if we were playing the part of the compiler by hand, we would likely factor out someBigFunction and then *refrain from inlining build*. That is, we would get
{-# NOINLINE crazyB #-}
crazyB x = someBigFunction
crazy x = nonInliningBuild (crazyB x)
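Here nonInliningBuild is not something that exists in base; it could be nothing more than build's definition carrying a NOINLINE pragma, e.g.:

{-# LANGUAGE RankNTypes #-}

-- hypothetical: the same definition as GHC.Base.build, but never inlined
nonInliningBuild :: (forall b. (a -> b -> b) -> b -> b) -> [a]
nonInliningBuild g = g (:) []
{-# NOINLINE nonInliningBuild #-}

plus a "foldr/nonInliningBuild" rule analogous to "fold/build", so that consumers can still fuse with it.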
Since we've factored out someBigFunction into crazyB, we can now safely inline crazy itself, allowing the pipeline to continue beyond it. The problem, of course, is that when we *don't* fuse beyond, there is some performance penalty (I have not tried to measure it yet) to passing in (:) and [] at runtime instead of fixing them at compile time.
There is another downside to this: This way, with fusion, you will get rid of the intermediate list, but you will have calls from someBigFunction to unknown functions, which is slow. The nice thing about fusion is not just getting rid of the list, but also the local optimizations that happen when the new “cons” and “nil” come into contact with the code in someBigFunction.

So the approach taken seems to be: either inline "some $ long $ fusion $ pipeline" and do all the transformation at the use site, or call "anotherBiggy", but nothing in between.

Greetings,
Joachim

-- 
Joachim “nomeata” Breitner
  mail@joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata@debian.org