Proposal: Foldable typeclass: make foldl' and foldr' class methods

All, This issue was brought up again recently by Milan's questions about what to do with folds for the containers package. Currently the Foldable type class has: class Foldable t where fold :: Monoid m => t m -> m foldMap :: Monoid m => (a -> m) -> t a -> m foldr :: (a -> b -> b) -> b -> t a -> b foldl :: (a -> b -> a) -> a -> t b -> a foldr1 :: (a -> a -> a) -> t a -> a foldl1 :: (a -> a -> a) -> t a -> a with default implementations for each in terms of the others. Then it defines: foldr' :: Foldable t => (a -> b -> b) -> b -> t a -> b foldr' f z0 xs = foldl f' id xs z0 where f' k x z = k $! f x z foldl' :: Foldable t => (a -> b -> a) -> a -> t b -> a foldl' f z0 xs = foldr f' id xs z0 where f' x k z = k $! f z x That is, they are fixed definitions so specialised implementations cannot be provided. Note also that these are the classic higher-order "foldl in terms of foldr" definitions. Current releases of GHC cannot optimises these higher-order definitions into efficient versions using accumulating parameters. Since one of the main purposes of foldl' is performance (other purpose being to avoid space leaks) then that's rather unfortunate. The proposal is simple: move these two functions into the Foldable type class itself. They would keep their existing default definitions but since they are then class methods they can have efficient implementations provided by the class instances. This should not break much code. In particular it should not break existing type class instance declarations since there is a default definition for instances that don't defined the new methods. The only potential breakage is that foldl' and foldr' are exported via Foldable(..) rather than directly. This could affect modules that use explicit imports. (I consider this fact to be a slightly unfortunate quirk of the Haskell module system). Patch attached. Deadline: 2 weeks: Monday 4th July. Unresolved: what is a good concise specification of foldr' to use in the documentation? For foldl' we can say: foldl' f z = List.foldl' f z . toList Related issues not covered by this simple proposal: providing foldl1' and foldr1', updating instances to define foldl' and foldr' if possible (e.g. array could provide an efficient impl of foldr'). Duncan

An enthusiastic +1 from me.
G
On Mon, Jun 20, 2011 at 1:02 PM, Duncan Coutts
All,
This issue was brought up again recently by Milan's questions about what to do with folds for the containers package.
Currently the Foldable type class has:
class Foldable t where fold :: Monoid m => t m -> m foldMap :: Monoid m => (a -> m) -> t a -> m
foldr :: (a -> b -> b) -> b -> t a -> b foldl :: (a -> b -> a) -> a -> t b -> a
foldr1 :: (a -> a -> a) -> t a -> a foldl1 :: (a -> a -> a) -> t a -> a
with default implementations for each in terms of the others. Then it defines:
foldr' :: Foldable t => (a -> b -> b) -> b -> t a -> b foldr' f z0 xs = foldl f' id xs z0 where f' k x z = k $! f x z
foldl' :: Foldable t => (a -> b -> a) -> a -> t b -> a foldl' f z0 xs = foldr f' id xs z0 where f' x k z = k $! f z x
That is, they are fixed definitions so specialised implementations cannot be provided.
Note also that these are the classic higher-order "foldl in terms of foldr" definitions. Current releases of GHC cannot optimises these higher-order definitions into efficient versions using accumulating parameters. Since one of the main purposes of foldl' is performance (other purpose being to avoid space leaks) then that's rather unfortunate.
The proposal is simple: move these two functions into the Foldable type class itself.
They would keep their existing default definitions but since they are then class methods they can have efficient implementations provided by the class instances.
This should not break much code. In particular it should not break existing type class instance declarations since there is a default definition for instances that don't defined the new methods.
The only potential breakage is that foldl' and foldr' are exported via Foldable(..) rather than directly. This could affect modules that use explicit imports. (I consider this fact to be a slightly unfortunate quirk of the Haskell module system).
Patch attached.
Deadline: 2 weeks: Monday 4th July.
Unresolved: what is a good concise specification of foldr' to use in the documentation? For foldl' we can say: foldl' f z = List.foldl' f z . toList
Related issues not covered by this simple proposal: providing foldl1' and foldr1', updating instances to define foldl' and foldr' if possible (e.g. array could provide an efficient impl of foldr').
Duncan
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
--
Gregory Collins

+1 and I would vote again if I didn't think folks would catch me stuffing
the ballot box. ;)
On Mon, Jun 20, 2011 at 1:02 PM, Duncan Coutts wrote: All, This issue was brought up again recently by Milan's questions about what
to do with folds for the containers package. Currently the Foldable type class has: class Foldable t where
fold :: Monoid m => t m -> m
foldMap :: Monoid m => (a -> m) -> t a -> m foldr :: (a -> b -> b) -> b -> t a -> b
foldl :: (a -> b -> a) -> a -> t b -> a foldr1 :: (a -> a -> a) -> t a -> a
foldl1 :: (a -> a -> a) -> t a -> a with default implementations for each in terms of the others. Then it
defines: foldr' :: Foldable t => (a -> b -> b) -> b -> t a -> b
foldr' f z0 xs = foldl f' id xs z0
where f' k x z = k $! f x z foldl' :: Foldable t => (a -> b -> a) -> a -> t b -> a
foldl' f z0 xs = foldr f' id xs z0
where f' x k z = k $! f z x That is, they are fixed definitions so specialised implementations
cannot be provided. Note also that these are the classic higher-order "foldl in terms of
foldr" definitions. Current releases of GHC cannot optimises these
higher-order definitions into efficient versions using accumulating
parameters. Since one of the main purposes of foldl' is performance
(other purpose being to avoid space leaks) then that's rather
unfortunate. The proposal is simple: move these two functions into the Foldable type
class itself. They would keep their existing default definitions but since they are
then class methods they can have efficient implementations provided by
the class instances. This should not break much code. In particular it should not break
existing type class instance declarations since there is a default
definition for instances that don't defined the new methods. The only potential breakage is that foldl' and foldr' are exported via
Foldable(..) rather than directly. This could affect modules that use
explicit imports. (I consider this fact to be a slightly unfortunate
quirk of the Haskell module system). Patch attached. Deadline: 2 weeks: Monday 4th July. Unresolved: what is a good concise specification of foldr' to use in the
documentation? For foldl' we can say:
foldl' f z = List.foldl' f z . toList Related issues not covered by this simple proposal: providing foldl1'
and foldr1', updating instances to define foldl' and foldr' if possible
(e.g. array could provide an efficient impl of foldr'). Duncan _______________________________________________
Libraries mailing list
Libraries@haskell.org
http://www.haskell.org/mailman/listinfo/libraries

It's a pity to have to add methods to get around GHC's limitations, but it seems harmless. Looks like you'll want strict versions of all 5 methods, though.
The only potential breakage is that foldl' and foldr' are exported via Foldable(..) rather than directly. This could affect modules that use explicit imports.
I don't think so. You can import methods separately.

The minor breakage consideration is for folks who imported Data.Foldable
(Foldable(..)) now they'll get an extra couple of methods, but that is a
pretty fragile way to do imports anyways, and I think the benefits far
outweigh the costs.
-Edward
On Mon, Jun 20, 2011 at 2:08 PM, Paterson, Ross
It's a pity to have to add methods to get around GHC's limitations, but it seems harmless. Looks like you'll want strict versions of all 5 methods, though.
The only potential breakage is that foldl' and foldr' are exported via Foldable(..) rather than directly. This could affect modules that use explicit imports.
I don't think so. You can import methods separately. _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On Mon, 2011-06-20 at 19:08 +0100, Paterson, Ross wrote:
It's a pity to have to add methods to get around GHC's limitations, but it seems harmless.
It's true, though I don't think it's a trivial limitation. The required analysis and transform is described by Andy Gill in his thesis (where it's needed so that you can fuse foldl-defined-in-terms-of-foldr using foldr/build fusion) and it's not been implemented in the 15 years since. Though GHC HEAD does currently have a "simple" implementation of the arity raising transformation but I don't think anyone has yet seriously investigated if it'd cover these kinds of cases. I was about to say that direct implementations of foldl' and foldr' can be improved compared to foldl and foldr because types like arrays and Data.Sequence can implement some of them as reverse traversals, but actually that'd work already since foldl' is a foldr and foldr' is a foldl, so an implementation of foldl as a reverse traversal would give a reasonable foldr' (apart from the higher order vs accumulator business).
Looks like you'll want strict versions of all 5 methods, though.
Mm, the '1' variants are obvious. The fold and foldMap are more interesting. I'd have to think harder about what their default definition would be, or what specialised implementations might look like (e.g. tree folds starting from the leaves and working towards the root). Are there any obvious use cases for strict monoid folds? Duncan

On Mon, Jun 20, 2011 at 10:11 PM, Duncan Coutts
Mm, the '1' variants are obvious. The fold and foldMap are more interesting. I'd have to think harder about what their default definition would be, or what specialised implementations might look like (e.g. tree folds starting from the leaves and working towards the root). Are there any obvious use cases for strict monoid folds?
How about parallel folds? Here's something I'd like to support for HashMap: parFold :: Monoid m => (v -> m) -> HashMap k v -> m sumValues = parFold Sum Another really nice property of monoid folds is that they get call-site specialized automatically by GHC. Johan

On Mon, 2011-06-20 at 22:44 +0200, Johan Tibell wrote:
On Mon, Jun 20, 2011 at 10:11 PM, Duncan Coutts
wrote: Mm, the '1' variants are obvious. The fold and foldMap are more interesting. I'd have to think harder about what their default definition would be, or what specialised implementations might look like (e.g. tree folds starting from the leaves and working towards the root). Are there any obvious use cases for strict monoid folds?
How about parallel folds? Here's something I'd like to support for HashMap:
parFold :: Monoid m => (v -> m) -> HashMap k v -> m sumValues = parFold Sum
The strategies stuff uses Traversable I think. To some degree I think you could do parallel evaluation with the existing Foldable operations because you can pick the monoid, so you can pick one that evaluates in parallel. Needs a bit of thought and experimentation. Duncan

On Mon, Jun 20, 2011 at 12:02 PM, Duncan Coutts
This should not break much code. In particular it should not break existing type class instance declarations since there is a default definition for instances that don't defined the new methods.
The only potential breakage is that foldl' and foldr' are exported via Foldable(..) rather than directly. This could affect modules that use explicit imports. (I consider this fact to be a slightly unfortunate quirk of the Haskell module system).
What would the broken code look like? GHC 6.12 handles:
import Module (classMethod)
just fine, even when 'classMethod' is exported only via:
module Module (MyClass(..))
I haven't tested this too extensively - only with 'runghc'. Antoine

On Mon, Jun 20, 2011 at 2:21 PM, Antoine Latter
On Mon, Jun 20, 2011 at 12:02 PM, Duncan Coutts
The only potential breakage is that foldl' and foldr' are exported via Foldable(..) rather than directly. This could affect modules that use explicit imports. (I consider this fact to be a slightly unfortunate quirk of the Haskell module system).
What would the broken code look like?
import Data.Foldable (Foldable(..)) import Data.List now you'd have two definitions of foldl' and foldr' in scope
participants (6)
-
Antoine Latter
-
Duncan Coutts
-
Edward Kmett
-
Gregory Collins
-
Johan Tibell
-
Paterson, Ross