Thank you for looking at this!


On Tue, Jan 14, 2014 at 1:27 AM, Simon Peyton Jones <simonpj@microsoft.com> wrote:

I’ve hesitated to reply, because I have lots of questions but no time to investigate in.  I’m looking at your wiki page https://github.com/takano-akio/ww-fusion

 

·         Does your proposed new fold’ run faster than the old one?  You give no data.


No, it runs just equally fast as the old one. At the Core level they are the same. I ran some criterion benchmarks:

source: https://github.com/takano-akio/ww-fusion/blob/master/benchmarks.hs
results: http://htmlpreview.github.io/?https://github.com/takano-akio/ww-fusion/blob/master/foldl.html

The point was not to make foldl' faster, but to make it fuse well with good producers.
 

·         The new foldl’ is not a “good consumer” in the foldr/build sense, which a big loss.  What if you say fold’ k z [1..n]; you want the intermediate list to vanish.

For my idea to work, enumFromTo and all other good producers need to be redefined in terms of buildW, which fuses with foldrW. The definition of buildW and the relevant rules are here:

https://github.com/takano-akio/ww-fusion/blob/master/WWFusion.hs

 

·         My brain is too small to truly understand your idea.  But since foldrW is non-recursive, what happens if you inline foldrW into fold’, and then simplify?  I’m betting you get something pretty similar to the old foldl’.  Try in by hand, and with GHC and let’s see the final optimised code.

I checked this and I see the same code as the old foldl', modulo order of arguments. This is what I expected.
 

·         Under “motivation” you say “GHC generates something essentially like…” and then give some code.  Now, if GHC would only eta-expand ‘go’ with a second argument, you’d get brilliant code. And maybe that would help lots of programs, not just this one.  It’s a slight delicate transformation but I’ve often thought we should try it; c.f #7994, #5809

I agree that it would be generally useful if GHC did this transformation. However I don't think it's good enough for this particular goal of making foldl' fuse well.

Consider a function that flattens a binary tree into a list:

data Tree = Tip {-# UNPACK #-} !Int | Bin Tree Tree

toList :: Tree -> [Int]
toList tree = build (toListFB tree)
{-# INLINE toList #-}

toListFB :: Tree -> (Int -> r -> r) -> r -> r
toListFB root cons nil = go root nil
  where
    go (Tip x) rest = cons x rest
    go (Bin x y) rest = go x (go y rest)

Let's say we want to eliminate the intermediate list in the expression (sum (toList t)). Currently sum is not a good consumer, but if it were, after fusion we'd get something like:

sumList :: Tree -> Int
sumList root = go0 root id 0

go0 :: Tree -> (Int -> Int) -> Int -> Int
go0 (Tip x) k = (k $!) .  (x+)
go0 (Bin x y) k = go0 x (go0 y k)

Now, merely eta-expanding go0 is not enough to get efficient code, because the function will still build a partial application every time it sees a Bin constructor. For this recursion to work in an allocation-free way, it must be rather like:

go1 :: Tree -> Int -> Int
go1 (Tip x) n = x + n
go1 (Bin x y) n = go1 y (go1 x n)

And this is what we get if we define foldl' and toList in terms of foldrW and buildW.

I think a similar problem arises whenever you define a good consumer that traverses a tree-like structure, and you want to use a strict fold to consume a list produced by that producer.

Thank you,
Takano Akio

 

Simon

 

From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Akio Takano
Sent: 09 January 2014 13:25
To: ghc-devs
Subject: Re: Extending fold/build fusion

 

Any input on this is appreciated. In particular, I'd like to know: if I implement the idea as a patch to the base package, is there a chance it is considered for merge?


-- Takano Akio

On Fri, Jan 3, 2014 at 11:20 PM, Akio Takano <tkn.akio@gmail.com> wrote:

Hi,

I have been thinking about how foldl' can be turned into a good consumer, and I came up with something that I thought would work. So I'd like to ask for opinions from the ghc devs: if this idea looks good, if it is a known bad idea, if there is a better way to do it, etc.

The main idea is to have an extended version of foldr:

-- | A mapping between @a@ and @b@.
data Wrap a b = Wrap (a -> b) (b -> a)

foldrW
  :: (forall e. Wrap (f e) (e -> b -> b))
  -> (a -> b -> b) -> b -> [a] -> b
foldrW (Wrap wrap unwrap) f z0 list0 = wrap go list0 z0
  where
    go = unwrap $ \list z' -> case list of
      [] -> z'
      x:xs -> f x $ wrap go xs z'

This allows the user to apply an arbitrary "worker-wrapper" transformation to the loop.

Using this, foldl' can be defined as

newtype Simple b e = Simple { runSimple :: e -> b -> b }

foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f initial xs = foldrW (Wrap wrap unwrap) g id xs initial
  where
    wrap (Simple s) e k a = k $ s e a
    unwrap u = Simple $ \e -> u e id
    g x next acc = next $! f acc x

The wrap and unwrap functions here ensure that foldl' gets compiled into a loop that returns a value of 'b', rather than a function  'b -> b', effectively un-CPS-transforming the loop.

I put preliminary code and some more explanation on Github:

https://github.com/takano-akio/ww-fusion

Thank you,
Takano Akio