Re: Proposal #3999: Improved folds for Data.Map and Data.IntMap

On 23/04/2010, at 04:36, Gwern Branwen wrote:
On Thu, Apr 22, 2010 at 12:15 PM, Roman Leshchinskiy
wrote: Hmm, I'd love to see some real-world uses of foldl. I have no idea what to optimise it for in vector. Unfortunately, the link above doesn't give any examples.
Roman
Well, since you ask...
$ find . -name "*.hs" -exec grep -H ' foldl ' {} \; > ~/fold.txt $ wc ~/fold.txt 2206 24333 225619 fold.txt
That doesn't really help me, though, because I can't tell why using foldl as opposed to foldl' is essential (or even correct) in these cases. In fact, at a cursory glance a lot of them look quite leaky to me. Roman

On April 22, 2010 21:11:55 Roman Leshchinskiy wrote:
That doesn't really help me, though, because I can't tell why using foldl as opposed to foldl' is essential (or even correct) in these cases. In fact, at a cursory glance a lot of them look quite leaky to me.
Here's a toy example where replacing foldl with foldl' would be wrong sumWhile g y x | g x = y + x | otherwise = 0 foldl (sumWhile (< 3)) 6 [5,undefined,4,3,2,1,0] Perhaps there is other code where forcing values via foldl' before they are known to be required can be bad (due the cost or causing an exception). Cheers! -Tyson

On 24/04/2010, at 00:44, Tyson Whitehead wrote:
On April 22, 2010 21:11:55 Roman Leshchinskiy wrote:
That doesn't really help me, though, because I can't tell why using foldl as opposed to foldl' is essential (or even correct) in these cases. In fact, at a cursory glance a lot of them look quite leaky to me.
Here's a toy example where replacing foldl with foldl' would be wrong
sumWhile g y x | g x = y + x | otherwise = 0
foldl (sumWhile (< 3)) 6 [5,undefined,4,3,2,1,0]
Just replacing foldl by foldl' isn't always correct, of course. I'd just like to see real-world code where using foldl is both correct (with respect to performance) and essential. Incidentally, your example aborts with a stack overflow for a list of 5M elements but this function works fine: sum . takeWhile (<3) . reverse The other example, given by Heinrich, was reverse. The report specifies it in terms of foldl but GHC implements it differently. Roman

In the meantime, to refocus attention on the original proposal.... ;) For the moment, if only because it's currently the more standard approach, I'll concede and use the foldr/build approach. {-# INLINE [0] pairCons #-} pairCons :: ((a, b) -> c -> c) -> a -> b -> c -> c pairCons = curry {-# RULES "Data.Map.toAscList->build" [~1] toAscList = \ m -> GHC.build (\ c n -> foldrWithKey (pairCons c) n m); #-} Since the normal definition of toAscList is just foldrWithKey (curry (:)) [], there's no need to rewrite it back to toAscList. A few possible additional modifications: - Pull a similar trick for toDescList. It's not as if it'd be all that difficult... - Reimplement the (==) and compare functions for Data.Map as follows: m1 == m2 = size m1 == size m2 && and (zipWith (==) (toAscList m1) (toAscList m2)) m1 `compare` m2 = foldr mappend (compare (size m1) (size m2)) (zipWith compare (toAscList m1) (toAscList m2)) which gets some deforesting. Louis Wasserman wasserman.louis@gmail.com http://profiles.google.com/wasserman.louis

On 24/04/2010, at 04:16, Louis Wasserman wrote:
In the meantime, to refocus attention on the original proposal.... ;)
Yes, sorry sidetracking the thread.
{-# INLINE [0] pairCons #-} pairCons :: ((a, b) -> c -> c) -> a -> b -> c -> c pairCons = curry
{-# RULES "Data.Map.toAscList->build" [~1] toAscList = \ m -> GHC.build (\ c n -> foldrWithKey (pairCons c) n m); #-}
Since the normal definition of toAscList is just foldrWithKey (curry (:)) [], there's no need to rewrite it back to toAscList.
I'm not sure why you don't just use curry directly here. Also, why not just implement toAscList like in the rule and INLINE it? Roman

I'm not sure why you don't just use curry directly here. Also, why not just implement toAscList like in the rule and INLINE it?
I concede that using curry directly is the better approach, but when I implement toAscList as in the rule and inline it, it doesn't actually appear to get inlined. (Example: mapKeysWith, which is defined as fromListWith c . List.map fFirst . toList, doesn't actually fold/build with the map.) Revised, complete patch: http://hackage.haskell.org/trac/ghc/attachment/ticket/3999/fold-building.pat...

On 27/04/2010, at 03:20, Louis Wasserman wrote:
I'm not sure why you don't just use curry directly here. Also, why not just implement toAscList like in the rule and INLINE it? I concede that using curry directly is the better approach, but when I implement toAscList as in the rule and inline it, it doesn't actually appear to get inlined. (Example: mapKeysWith, which is defined as fromListWith c . List.map fFirst . toList, doesn't actually fold/build with the map.)
Is this with 6.12 or the HEAD? In any case, could you please submit a bug report so that Simon can have a look at it. Roman

Apologies, this is with 6.12.1. The HEAD appears to work correctly, though.
Will submit a bug report to double-check.
Louis Wasserman
wasserman.louis@gmail.com
http://profiles.google.com/wasserman.louis
On Mon, Apr 26, 2010 at 10:26 PM, Roman Leshchinskiy
On 27/04/2010, at 03:20, Louis Wasserman wrote:
I'm not sure why you don't just use curry directly here. Also, why not just implement toAscList like in the rule and INLINE it? I concede that using curry directly is the better approach, but when I implement toAscList as in the rule and inline it, it doesn't actually appear to get inlined. (Example: mapKeysWith, which is defined as fromListWith c . List.map fFirst . toList, doesn't actually fold/build with the map.)
Is this with 6.12 or the HEAD? In any case, could you please submit a bug report so that Simon can have a look at it.
Roman
participants (4)
-
Louis Wasserman
-
Roman Leshchinskiy
-
Roman Leshchinskiy
-
Tyson Whitehead