Re[2]: [GHC] #876: stack overflow on 'length . filter odd $ [0 .. 999999999]'

Hello GHC, Tuesday, August 29, 2006, 10:54:07 PM, you wrote:
It makes sense to me that the above behaviour is seen: length is now a good consumer, but it generates 1+(1+(1+(1+... as it is consuming, and this causes a stack overflow. I don't think we can fix this while staying with fold/build fusion, so it looks to me like the patch should be reverted and the whole problem looked at post-6.6.
In general, for any function we can implement either (1) a good consumer, based on foldr, or (2) a tail-recursive version. Do you agree?

In particular, filterM, replicateM and the other *M list operations (whose problems were mentioned earlier) can be implemented in the first way or the second, but not both. Maybe this can be fixed with some smart rules which select the first or the second implementation depending on context?

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
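The dichotomy described here can be sketched concretely (an illustrative sketch, not code from this thread; the names sumFoldr and sumTail are invented for illustration):

```haskell
{-# LANGUAGE BangPatterns #-}

-- Two implementations of the same function: a foldr-based good
-- consumer versus a tail-recursive accumulator loop.

-- (1) Good consumer: fuses with build, but evaluates as
--     1 + (1 + (1 + ...)), growing the stack on long lists.
sumFoldr :: [Int] -> Int
sumFoldr = foldr (+) 0

-- (2) Tail recursion: constant stack (with a strict accumulator),
--     but not expressed via foldr, so foldr/build fusion cannot see it.
sumTail :: [Int] -> Int
sumTail = go 0
  where
    go !acc []     = acc
    go !acc (x:xs) = go (acc + x) xs
```

On a list of many millions of elements, the foldr form can exhaust the stack while the tail-recursive form runs in constant stack space; only the foldr form is eligible for foldr/build fusion.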

Bulat Ziganshin wrote:
It makes sense to me that the above behaviour is seen: length is now a good consumer, but it generates 1+(1+(1+(1+... as it is consuming, and this causes a stack overflow. I don't think we can fix this while staying with fold/build fusion, so it looks to me like the patch should be reverted and the whole problem looked at post-6.6.
In general, for any function we can implement either (1) a good consumer based on foldr, or (2) a tail-recursive version.
Do you agree?
I'd like to point out that, if you are prepared to abandon foldr/build fusion in favour of hylo fusion, you can code good consumers using either foldr or foldl. (Length is a foldl, because it uses an accumulator.) Foldl is tail-recursive, but this does not prevent it being a good consumer, provided your framework of fusion laws is sufficiently flexible. Regards, Malcolm
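As an illustration of Malcolm's point (a sketch; lengthAcc is an invented name): length written as a strict left fold is tail-recursive and runs in constant stack, and a fusion framework flexible enough to handle left folds, such as the hylo fusion he mentions, could still treat it as a good consumer.

```haskell
import Data.List (foldl')

-- length as a strict left fold: tail-recursive, constant stack.
-- foldr/build fusion cannot see this shape, but a more flexible
-- framework of fusion laws can fuse left folds as well.
lengthAcc :: [a] -> Int
lengthAcc = foldl' (\n _ -> n + 1) 0
```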

Hello Malcolm, Thursday, August 31, 2006, 2:29:46 PM, you wrote:
In general, for any function we can implement either (1) a good consumer based on foldr, or (2) a tail-recursive version.
Do you agree?
I'd like to point out that, if you are prepared to abandon foldr/build fusion in favour of hylo fusion, you can code good consumers using either foldr or foldl.
Looks promising. But even if GHC HQ accepts your plan, something will still remain to fix: there are some functions, such as 'filter', that can be implemented via foldr but not via foldl. Such functions have two possible implementations: (1) using foldr, which makes them good producers; (2) using reverse+foldl, which makes them tail-recursive.

What makes the situation more interesting is that even without fusion, the foldr-based solution will be faster for most lists, as long as the whole processed data fits in the L2 processor cache (at least, that is what the results of my tests of replicateM implementations say).

Browsing GHC.List, I've found the following:

#ifdef USE_REPORT_PRELUDE
and = foldr (&&) True
or  = foldr (||) False
#else
and []     = True
and (x:xs) = x && and xs
or  []     = False
or  (x:xs) = x || or xs
{-# RULES
"and/build" forall (g::forall b.(Bool->b->b)->b->b) .
            and (build g) = g (&&) True
"or/build"  forall (g::forall b.(Bool->b->b)->b->b) .
            or (build g) = g (||) False
 #-}
#endif

Why can't we use foldl in the definition and at the same time have rules for use with build? I understand that it's impossible for and/or, but for 'length' this should work, I think?

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
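The two possible implementations of filter that Bulat describes can be sketched like this (illustrative names, not code from the thread):

```haskell
-- (1) foldr-based filter: a good producer/consumer under
--     foldr/build fusion, but not tail-recursive.
filterFoldr :: (a -> Bool) -> [a] -> [a]
filterFoldr p = foldr (\x xs -> if p x then x : xs else xs) []

-- (2) accumulator + reverse: tail-recursive, constant stack,
--     but opaque to foldr/build fusion.
filterTail :: (a -> Bool) -> [a] -> [a]
filterTail p = reverse . go []
  where
    go acc []     = acc
    go acc (x:xs) = go (if p x then x : acc else acc) xs
```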

On Aug 31, 2006, at 6:29 AM, Malcolm Wallace wrote:
Bulat Ziganshin wrote:
It makes sense to me that the above behaviour is seen: length is now a good consumer, but it generates 1+(1+(1+(1+... as it is consuming, and this causes a stack overflow. I don't think we can fix this while staying with fold/build fusion, so it looks to me like the patch should be reverted and the whole problem looked at post-6.6.
In general, for any function we can implement either (1) a good consumer based on foldr, or (2) a tail-recursive version.
Do you agree?
I'd like to point out that, if you are prepared to abandon foldr/build fusion in favour of hylo fusion, you can code good consumers using either foldr or foldl. (Length is a foldl, because it uses an accumulator.) Foldl is tail-recursive, but this does not prevent it being a good consumer, provided your framework of fusion laws is sufficiently flexible.
Actually, it's sufficient to do good arity-raising transformations, and use the definition:

foldl f z xs = foldr (\x k a -> k (f a x)) id xs z

This yields code which looks a bit like this:

let loop []     = id
    loop (x:xs) = let k = loop xs in (\a -> k (f a x))
in loop xs z

In the pH compiler we recognized this idiom (and its generalizations to multiple accumulating parameters) and arity-raised loop as follows:

let loop []     a = a
    loop (x:xs) a = loop xs (f a x)
in loop xs z

So we *can* make foldl into a good consumer---it's not even particularly difficult---but it relies rather heavily on getting some of the knock-on program transformations to be reliable. That means making sure we recognize the above pattern before it gets mangled by other optimizations. A possibility to avoid this problem is just to look for over-saturated applications of "loop" and then attempt arity expansion and see if any under-saturated recursive calls remain. There are alternatives that involve real interprocedural arity analysis, but that's hard enough that I've never tried to do it except on paper.

We can actually play some similar games with functions which use an accumulating parameter but exit early, or which can be modeled by little state machines, but the transformations required to get "the code you want" in these cases involve actual arity and closure analysis rather than simple pattern-matching (which suffices for the foldl-using-foldr trick). A look at recent papers by Olin Shivers should give an idea of the kind of tricks required.

Hylo fusion is great if you want to deforest non-cleverly written library code or non-lists. Jacob Schwartz (now at Bluespec) did a hylo fusion pass for pH---but the knock-on transformations proved a bit problematic. I speculate that if you do a full-on analysis a la jhc you'd get really great code---but the GRIN argument was that you could just skip all this deforestation nonsense at that point as it happened implicitly during the analysis.
-Jan-Willem Maessen
Regards, Malcolm

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
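Jan-Willem's foldl-via-foldr definition can be checked directly (a sketch; foldlViaFoldr is an invented name):

```haskell
-- foldl expressed as a foldr whose result is a function of the
-- accumulator: each foldr step returns a closure awaiting the
-- accumulated value.
foldlViaFoldr :: (b -> a -> b) -> b -> [a] -> b
foldlViaFoldr f z xs = foldr (\x k a -> k (f a x)) id xs z
```

The arity-raising transformation he describes turns this chain of one-argument closures back into a two-argument tail-recursive loop, recovering the efficient foldl while keeping the foldr shape that fusion needs.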

| Actually, it's sufficient to do good arity-raising transformations,
| and use the definition:
|     foldl f z xs = foldr (\x k a -> k (f a x)) id xs z
|
| This yields code which looks a bit like this:
|
| let loop []     = id
|     loop (x:xs) = let k = loop xs in (\a -> k (f a x))
| in loop xs z
|
| In the pH compiler we recognized this idiom (and its generalizations
| to multiple accumulating parameters) and arity-raised loop as follows:
|
| let loop []     a = a
|     loop (x:xs) a = loop xs (f a x)
| in loop xs z

Absolutely right. This particular thing has been on my to-do list for at least 10 years (see Andy Gill's thesis). Dana Xu and I worked on it a bit last year, but her focus has shifted to verifying Haskell programs. We identified two analyses to help with arity raising: the simpler one works by looking at function *definitions*, and will deal with this example; the other looks at function *calls*.

The good news is that Kirsten Chevalier is coming to Microsoft for an internship Oct-Dec this year, and we plan to work on exactly this. Watch this space.

Simon

On Aug 31, 2006, at 3:03 PM, Simon Peyton-Jones wrote: (replying to me)
| Actually, it's sufficient to do good arity-raising transformations,
| and use the definition:
|     foldl f z xs = foldr (\x k a -> k (f a x)) id xs z
|
| This yields code which looks a bit like this:
| ...
Absolutely right. This particular thing has been on my to-do list for at least 10 years (see Andy Gill's thesis).
In the interests of fairness I should point out that this was where I first saw this formulation (though it might have cropped up in some of Richard Bird's work).

The pH pass could do some other fancy footwork, like changing the handedness of a commutative fold and re-associating a nested fold of an associative operator. That's all rather harder in the RULES framework (but I bet it's doable). All that footwork also relied on getting the arity analysis right in the end. I'm pretty convinced that arity munging was a critical enabling optimization.

-Jan

On Thu, 2006-08-31 at 11:29 +0100, Malcolm Wallace wrote:
Bulat Ziganshin wrote:
It makes sense to me that the above behaviour is seen: length is now a good consumer, but it generates 1+(1+(1+(1+... as it is consuming, and this causes a stack overflow. I don't think we can fix this while staying with fold/build fusion, so it looks to me like the patch should be reverted and the whole problem looked at post-6.6.
In general, for any function we can implement either (1) a good consumer based on foldr, or (2) a tail-recursive version.
Do you agree?
I'd like to point out that, if you are prepared to abandon foldr/build fusion in favour of hylo fusion, you can code good consumers using either foldr or foldl. (Length is a foldl, because it uses an accumulator.) Foldl is tail-recursive, but this does not prevent it being a good consumer, provided your framework of fusion laws is sufficiently flexible.
We have a reformulation of hylo fusion which we use for the Data.ByteString library. That would cover lists too (and would make it easy to fuse conversions String<->ByteString). I forget, does hylo fusion cover (++) efficiently? For our system it works but is slower than we'd like. Duncan

Duncan Coutts
We have a reformulation of hylo fusion which we use for the Data.ByteString library. That would cover lists too (and would make it easy to fuse conversions String<->ByteString).
I forget, does hylo fusion cover (++) efficiently? For our system it works but is slower than we'd like.
Well, there is a fairly straightforward RULE for (++) which essentially discovers the listy-ness of both arguments simultaneously, and deforests them together.

xs ++ ys = foldr (:) ys xs

RULES: foldr f z (foldr (:) (foldr (:) [] s0) s1) = foldr f (foldr f z s0) s1

However this is not ideal, since it only deals well with a single (++), not a chain of them. But there is a separate set of RULES for deforesting concat/concatMap, especially as used in list comprehensions. It is relatively easy to convert a chain of (++)s to a single application of concat.

I am still working on the details of the formulation of concatMap as a hylo-like structure, so no performance results yet. I hope to get it all finished and written up within a few weeks.

Regards, Malcolm
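Malcolm's definition and rule can be checked on concrete inputs (a sketch; the function names are invented for illustration):

```haskell
-- (++) expressed as a foldr, making the left argument a consumer.
appendViaFoldr :: [a] -> [a] -> [a]
appendViaFoldr xs ys = foldr (:) ys xs

-- The two sides of the RULE: folding over (s1 ++ s0) is the same as
-- folding over s1 with the fold of s0 as the initial value.
foldAppendLHS, foldAppendRHS :: (a -> b -> b) -> b -> [a] -> [a] -> b
foldAppendLHS f z s1 s0 = foldr f z (appendViaFoldr s1 s0)
foldAppendRHS f z s1 s0 = foldr f (foldr f z s0) s1
```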
participants (5)
- Bulat Ziganshin
- Duncan Coutts
- Jan-Willem Maessen
- Malcolm Wallace
- Simon Peyton-Jones