
On Wed, Jun 15, 2011 at 11:15 AM, Milan Straka
a) we could left and right folds. For the Set and IntSet it is simple -- we can add foldr and foldl, and deprecated fold with the message 'use foldr instead'.
I think we should have both left and right folds for ordered containers.
For the Map and IntMap it is not so simple -- should we add both foldr, foldl, foldlWithKey and foldrWithKey?
I think so.
b) we could use the Foldable instances more. We can implement foldr and foldl efficiently and let the users to use those instead of providing eg Set.foldr, Set.foldl.
This does not cover the *WithKey variant, as the Foldable instances over Map and IntMap does not utilize the key.
I think we should provide Foldable instances, but singe e.g. foldl' isn't a method of the Foldable class we cannot do so as efficiently as we'd like. I looked into this before and the core for foldl' in Foldable is quite poor.
c) strict folds. If we add also strict folds, we would have a total of _four_ folds in Set and IntSet and __eight__ folds in Map and IntMap.
The Foldable module exports foldl' and foldr', which use foldr, foldl respectively, but needs some intermediate memory (it creates a thunk for every node of the structure). Apart from that, this does not cover the *WithKey variants as in b).
I think we definitely need strict folds. These are by far the most useful folds in my experience (foldr is mostly used to implement toList).
Personally I would add foldr, foldl, foldlWithKey, foldrWithKey and specialised foldr, foldl implementations in the Foldable classes. I am not sure about the strict folds -- I used an equivalent of the foldr' from the Foldable module in GHC and it was reasonable fast.
Any opinions?
I think we need both lazy/strict, left/right, with/without key folds. The duplication is quite annoying but they all provide necessary functionality. The duplication touches on a bigger issue in the design of our container libraries. We currently fail at separating traversal from "consumption" efficiently. Ideally we'd have two functions: preorder :: Map k v -> [(k,v)] postorder :: Map k v -> [(k,v)] and use the list folds to work on these. However, build/foldr fusion doesn't fuse this List.foldl' f (preorder m) which makes the separation less viable, for performance reasons. This also applies to things like union, intersection, etc. Ideally we could define e.g. intersection using a general intersection function on streams (lists) module Data.List where -- Precondition: the lists must be sorted intersection :: Ord a => [a] -> [a] -> [a] module Data.Set where intersection :: Ord a => Set a -> Set a -> Set a intersection s1 s2 = fromList . List.intersection (preorder s1) (preorder s2) or something similar. In other words, I'd like to achieve the separation of algorithms and data structure that the STL achieves. (Note: Scala also does something similar). Right now there's rampant code duplication in the container library implementation that I'd like to address, but at the moment I don't know how to do so efficiently. Cheers, Johan