An attempt at foldr/build fusion for zip

Hi, I think I may have found a way to get zip & friends to fuse with *both* of their input lists. I am not a heavy ghc hacker, though, so I may be missing something important that makes this unworkable. I have no idea what kind of code this would actually end up creating. Anyway, here's my attempt; it ties in with the current foldr2 scheme. I eagerly any await comments or questions, especially from the foldr/build gurus. -- foldr2_both.lhs: Attempting to fuse zip with both input lists. We seem to be forced to use a recursive datatype to accomplish this. We're using newtype, so there should be no overhead from construction/deconstruction of this type, right? \begin{code} newtype BuildZip a b = BZ ((a -> (BuildZip a b) -> b) -> b) bz :: (forall b. (a->b->b)->b->b) -> b -> BuildZip a b bz f n = f (\x xs -> BZ (\c -> c x xs)) (BZ (\_ -> n)) {-# INLINE bz #-} foldr2_both :: (a->b->c->c) -> BuildZip a c -> BuildZip b c -> c foldr2_both k (BZ xs) (BZ ys) = xs (\x xs' -> ys (\y ys' -> k x y (foldr2_both k xs' ys') ) ) {-# RULES "foldr2/both" forall k n (f::forall z.(a->z->z)->z->z) (g::forall z.(b->z->z)->z->z) . foldr2 k n (build f) (build g) = foldr2_both k (bz f n) (bz g n) #-} \end{code} -- END foldr2_both.lhs Best regards, Matt Harden

Hi Matt,
I think I may have found a way to get zip & friends to fuse with *both* of their input lists. ... I have no idea what kind of code this would actually end up creating.
However, that is the important point. The goal of deforestation/fusion is to optimise a program. Removing data structures is not the final goal. In principal you can replace all algebraic data types by higher order functions (at least with the second order types that ghc allows). You just don't gain anything by doing it. I'm sorry that I don't have the time to look into your definition in detail. But basically you replace the intermediate list by higher order functions. Your foldr2_both does about the same amount of work as foldr2. Fusion of foldr2 with both arguments should give an expression without any recursively defined function (i.e. any foldr variant). Btw: The real benefit of deforestation does not come from saving the time for constructing and destructing the intermediate list. The real benefit comes from moving the code for the construction of an element next to the code for destructing the element which usally enables many further optimisations. A solution to the zip fusion problem is presented by John Launchbury, Sava Krstic, and Tim Sauerwein in: http://www.cse.ogi.edu/PacSoft/publications/phaseiiiq13papers/zipfusion.pdf I haven't yet looked into it in detail. Some problems with this approach are mentioned in the paper and I suppose they are the reason why the approach is not used in ghc.
\begin{code} newtype BuildZip a b = BZ ((a -> (BuildZip a b) -> b) -> b)
bz :: (forall b. (a->b->b)->b->b) -> b -> BuildZip a b bz f n = f (\x xs -> BZ (\c -> c x xs)) (BZ (\_ -> n)) {-# INLINE bz #-}
foldr2_both :: (a->b->c->c) -> BuildZip a c -> BuildZip b c -> c foldr2_both k (BZ xs) (BZ ys) = xs (\x xs' -> ys (\y ys' -> k x y (foldr2_both k xs' ys') ) )
{-# RULES "foldr2/both" forall k n (f::forall z.(a->z->z)->z->z) (g::forall z.(b->z->z)->z->z) . foldr2 k n (build f) (build g) = foldr2_both k (bz f n) (bz g n) #-} \end{code}
Olaf -- OLAF CHITIL, Dept. of Computer Science, University of York, York YO10 5DD, UK. URL: http://www.cs.york.ac.uk/~olaf/ Tel: +44 1904 434756; Fax: +44 1904 432767

Sun, 22 Apr 2001 08:45:45 -0500, Matt Harden
I think I may have found a way to get zip & friends to fuse with *both* of their input lists.
I tried to put in PrelList, changed foldr2_both to use a local recursive function which doesn't pass k around which allows to inline k, and a test shows that it's unfortunately slightly slower than the original. -- __("< Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/ \__/ ^^ SYGNATURA ZASTÊPCZA QRCZAK
participants (3)
-
Matt Harden
-
Olaf Chitil
-
qrczak@knm.org.pl