Proposal #3999: Improved folds for Data.Map and Data.IntMap

http://hackage.haskell.org/trac/ghc/ticket/3999 Discussion deadline: May 7 Previously, Data.Map's Foldable instance consisted of instance Foldable (Map k) where foldMap ... Even though there were implementations for foldrWithKey and foldlWithKey, these weren't being used in the Foldable instance -- instead, the slower version derived from foldMap was in use. This is silly, since foldr and foldl can be easily derived from foldrWithKey and foldlWithKey. Additionally, it takes relatively little effort to ensure that keys, elems, toList, etc. deforest under folds. Therefore, we add the following rewrite rules: {-# RULES "foldr/Data.Map.elems" forall f z m . foldr f z (elems m) = foldr f z m; "foldl/Data.Map.elems" forall f z m . foldl f z (elems m) = foldl f z m; "foldr/Data.Map.keys" forall f z m . foldr f z (keys m) = foldrWithKey (\ k _ -> f k) z m; "foldl/Data.Map.keys" forall f z m . foldl f z (keys m) = foldlWithKey (\ z k _ -> f z k) z m; "foldr/Data.Map.toAscList" forall f z m . foldr f z (toAscList m) = foldrWithKey (curry f) z m; "foldl/Data.Map.toAscList" forall f z m . foldl f z (toAscList m) = foldlWithKey (curry . f) z m; #-} Finally, we implement foldrWithKey and foldlWithKey for Data.IntMap, and make the same adjustments to Data.IntMap as to Data.Map, adding specialized foldr and foldl definitions in the Foldable instance and adding the corresponding rewrite rules. The only API change is the addition of foldrWithKey and foldlWithKey to Data.IntMap, but folds over Maps and IntMaps are exposed to optimization. Louis Wasserman wasserman.louis@gmail.com http://profiles.google.com/wasserman.louis

On 22/04/2010, at 08:47, Louis Wasserman wrote:
Additionally, it takes relatively little effort to ensure that keys, elems, toList, etc. deforest under folds. Therefore, we add the following rewrite rules:
{-# RULES
"foldr/Data.Map.elems" forall f z m . foldr f z (elems m) = foldr f z m; "foldl/Data.Map.elems" forall f z m . foldl f z (elems m) = foldl f z m; "foldr/Data.Map.keys" forall f z m . foldr f z (keys m) = foldrWithKey (\ k _ -> f k) z m; "foldl/Data.Map.keys" forall f z m . foldl f z (keys m) = foldlWithKey (\ z k _ -> f z k) z m; "foldr/Data.Map.toAscList" forall f z m . foldr f z (toAscList m) = foldrWithKey (curry f) z m; "foldl/Data.Map.toAscList" forall f z m . foldl f z (toAscList m) = foldlWithKey (curry . f) z m; #-}
Why not implement elems etc. in terms of build? This would make them play nicely with foldr/build fusion at no cost. Roman

{-# RULES
"foldr/Data.Map.elems" forall f z m . foldr f z (elems m) = foldr f z m;
"foldl/Data.Map.elems" forall f z m . foldl f z (elems m) = foldl f z m;
"foldr/Data.Map.keys" forall f z m . foldr f z (keys m) = foldrWithKey (\ k _ -> f k) z m;
"foldl/Data.Map.keys" forall f z m . foldl f z (keys m) = foldlWithKey (\ z k _ -> f z k) z m;
"foldr/Data.Map.toAscList" forall f z m . foldr f z (toAscList m) = foldrWithKey (curry f) z m;
"foldl/Data.Map.toAscList" forall f z m . foldl f z (toAscList m) = foldlWithKey (curry . f) z m; #-}
A few thoughts on this issue: - I've occasionally had problems getting inlining to actually happen, which is irritating. - Maintaining portability requires almost as much code for the preprocessor as the rules I wrote above, and is finicky at best. - Perhaps my biggest concern with that approach is that it doesn't play nicely with foldl, which came up in the application that motivated this proposal! The rewrite rule approach I outline above is perfectly compatible with foldr/build rewriting, but also permits foldls to function nicely. Louis Wasserman wasserman.louis@gmail.com http://profiles.google.com/wasserman.louis Why not implement elems etc. in terms of build? This would make them play
nicely with foldr/build fusion at no cost.
Roman

On 22/04/2010, at 13:18, Louis Wasserman wrote:
{-# RULES
"foldr/Data.Map.elems" forall f z m . foldr f z (elems m) = foldr f z m; "foldl/Data.Map.elems" forall f z m . foldl f z (elems m) = foldl f z m; "foldr/Data.Map.keys" forall f z m . foldr f z (keys m) = foldrWithKey (\ k _ -> f k) z m; "foldl/Data.Map.keys" forall f z m . foldl f z (keys m) = foldlWithKey (\ z k _ -> f z k) z m; "foldr/Data.Map.toAscList" forall f z m . foldr f z (toAscList m) = foldrWithKey (curry f) z m; "foldl/Data.Map.toAscList" forall f z m . foldl f z (toAscList m) = foldlWithKey (curry . f) z m; #-} A few thoughts on this issue: • I've occasionally had problems getting inlining to actually happen, which is irritating. • Maintaining portability requires almost as much code for the preprocessor as the rules I wrote above, and is finicky at best. • Perhaps my biggest concern with that approach is that it doesn't play nicely with foldl, which came up in the application that motivated this proposal! The rewrite rule approach I outline above is perfectly compatible with foldr/build rewriting, but also permits foldls to function nicely.
Ah, I missed the foldl rules. Alas, I don't think they'll work reliably. You'd have to delay inlining foldl to give them a chance to fire but that requires a change to base. BTW, you also have to make sure that elems, keys etc. don't get inlined prematurely with appropriate INLINE/NOINLINE pragmas. The rules don't work well with GHC's list fusion system, either. To see why, look for rules involving build in GHC.List (esp. foldr2) and for augment in GHC.Base. FWIW, I'm a bit surprised that you need foldl on lists anyway. I thought everybody uses foldl'. Roman

On Thu, Apr 22, 2010 at 4:08 AM, Roman Leshchinskiy
On 22/04/2010, at 13:18, Louis Wasserman wrote:
{-# RULES
"foldr/Data.Map.elems" forall f z m . foldr f z (elems m) = foldr f z m; "foldl/Data.Map.elems" forall f z m . foldl f z (elems m) = foldl f z m; "foldr/Data.Map.keys" forall f z m . foldr f z (keys m) = foldrWithKey (\ k _ -> f k) z m; "foldl/Data.Map.keys" forall f z m . foldl f z (keys m) = foldlWithKey (\ z k _ -> f z k) z m; "foldr/Data.Map.toAscList" forall f z m . foldr f z (toAscList m) = foldrWithKey (curry f) z m; "foldl/Data.Map.toAscList" forall f z m . foldl f z (toAscList m) = foldlWithKey (curry . f) z m; #-} A few thoughts on this issue: • I've occasionally had problems getting inlining to actually happen, which is irritating. • Maintaining portability requires almost as much code for the preprocessor as the rules I wrote above, and is finicky at best. • Perhaps my biggest concern with that approach is that it doesn't play nicely with foldl, which came up in the application that motivated this proposal! The rewrite rule approach I outline above is perfectly compatible with foldr/build rewriting, but also permits foldls to function nicely.
Ah, I missed the foldl rules. Alas, I don't think they'll work reliably. You'd have to delay inlining foldl to give them a chance to fire but that requires a change to base. BTW, you also have to make sure that elems, keys etc. don't get inlined prematurely with appropriate INLINE/NOINLINE pragmas.
The rules don't work well with GHC's list fusion system, either. To see why, look for rules involving build in GHC.List (esp. foldr2) and for augment in GHC.Base.
FWIW, I'm a bit surprised that you need foldl on lists anyway. I thought everybody uses foldl'.
Not nearly: http://www.reddit.com/r/haskell/comments/bmd5u/thunks_and_haskell/c0nl7u2 There are a lot of foldl uses out there. -- gwern

On 23/04/2010, at 00:32, Gwern Branwen wrote:
On Thu, Apr 22, 2010 at 4:08 AM, Roman Leshchinskiy
wrote: FWIW, I'm a bit surprised that you need foldl on lists anyway. I thought everybody uses foldl'.
Not nearly: http://www.reddit.com/r/haskell/comments/bmd5u/thunks_and_haskell/c0nl7u2
There are a lot of foldl uses out there.
Hmm, I'd love to see some real-world uses of foldl. I have no idea what to optimise it for in vector. Unfortunately, the link above doesn't give any examples. Roman

Roman Leshchinskiy wrote:
Hmm, I'd love to see some real-world uses of foldl. I have no idea what to optimise it for in vector. Unfortunately, the link above doesn't give any examples.
Here a use of foldl from the Haskell98 Prelude: reverse :: [a] -> [a] reverse = foldl (flip (:)) [] Basically, foldl is useful if the accumulating parameter uses equal or more space if evaluated to normal form than the input list. Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

On 23/04/2010, at 19:45, Heinrich Apfelmus wrote:
Roman Leshchinskiy wrote:
Hmm, I'd love to see some real-world uses of foldl. I have no idea what to optimise it for in vector. Unfortunately, the link above doesn't give any examples.
Here a use of foldl from the Haskell98 Prelude:
reverse :: [a] -> [a] reverse = foldl (flip (:)) []
Basically, foldl is useful if the accumulating parameter uses equal or more space if evaluated to normal form than the input list.
What would change if this used foldl' instead? Roman

Roman Leshchinskiy wrote:
Heinrich Apfelmus wrote:
Roman Leshchinskiy wrote:
Hmm, I'd love to see some real-world uses of foldl. I have no idea what to optimise it for in vector. Unfortunately, the link above doesn't give any examples.
Here a use of foldl from the Haskell98 Prelude:
reverse :: [a] -> [a] reverse = foldl (flip (:)) []
Basically, foldl is useful if the accumulating parameter uses equal or more space if evaluated to normal form than the input list.
What would change if this used foldl' instead?
Nothing much, the flip (:) a x will be evaluated to x:a a bit earlier. By inlining the foldl , GHC even skips that tiny evaluation. Note however that it doesn't `seq` the accumulating parameter as foldl' would do. In any case, since the accumulating parameter has the same size in both unevaluated form and in normal form, the choice between foldl or foldl' doesn't make much of a difference. I don't know of any real world examples where this is not the case, but it's not too difficult to come up with artificial ones, like let f a n = let x = replicate n n in x `seq` (x : a) in foldl f [] [1..20] Here, forcing the spine of the list forces the large elements as well. But this is probably best written along the lines of reverse $ map (\n -> replicate n n) [1..20] anyway. Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com
participants (4)
-
Gwern Branwen
-
Heinrich Apfelmus
-
Louis Wasserman
-
Roman Leshchinskiy