On Jul 27, 2014 3:48 PM, "Joachim Breitner" <mail@joachim-breitner.de> wrote:
> Am Sonntag, den 27.07.2014, 15:29 -0400 schrieb David Feuer:
> It is good for the uses of crazy where no further fusion happens.
Yes, I know that.
> For the other cases, I believe GHC will rather try to get the original
> definition inlined. Maybe alredy "some $ long $ fusion $ pipeline" was
> deemed to big to be inlined – in that case an {-# INLINEABLE crazy #-}
> could help.
Right, I *could* do that, but then there's a potential for "code explosion" from inlining something too big at a lot of call sites.
> > it. The problem, of course, is that when we *don't* fuse beyond, there
> > is some performance penalty (I have not tried to measure it yet) to
> > passing in (:) and [] at runtime instead of fixing them at compile
> > time.
>
> There is another downside to this: This way, with fusion, you will get
> rid of the intermediate list, but you will have calls from
> someBigFunction to unknown functions, which is slow.
Yes, that was included by reference.
> The nice thing
> about fusion is not just getting rid of the list, but also the local
> optimizations that happen when the new “cons” and “nil” get in touch
> with the code in someBigFunction.
Yes, that's definitely a benefit. My point is that when you *have* to break things up (necessarily sacrificing opportunities for further analysis), it's often better to break between the build and its argument than before the build. Either way, you're giving up the opportunities you describe, but breaking before the build you also get the allocation.
Context:
I was doing some simple testing of my proposed replacement of unfoldr with an inlinable, fusable version. I managed to hit one of those wonky little cutoffs. When I wrote something in the general nature of
f :: Int -> [Int]
f n = map (+1) $ unfoldr go 0
where
go k | k <= n = Just (k, k+1)
| otherwise = Nothing
g :: Int -> Int
g n = foldr (*) 1 $ f n
Then when f was exported, it was not inlined, and didn't fuse with the foldr in g, so there was a ton of allocation running g, whereas when f was not exported, it fused completely.