What class for splittable data / balanced-fold?

Hi all, We all know and love Data.Foldable and are familiar with left folds and right folds. But what you want in a parallel program is a balanced fold over a tree. Fortunately, many of our datatypes (Sets, Maps) actually ARE balanced trees. Hmm, but how do we expose that? It seems like it would be nice to have a* standard class t*hat allows you to split a datatype into roughly even halves, until you get down to the leaves. This goes along with Guy Steele's argument that we should use "append lists" as primitive rather than "cons-lists", and it's why we added append-lists within the monad-par libraryhttp://hackage.haskell.org/package/monad-par-extras-0.3.3/docs/Control-Monad... . Does this class exist already? A random google search brought up this module by the name Data.Splittablehttp://hackage.haskell.org/package/unfoldable-0.2.0/docs/Data-Splittable.htm..., but it's not quite the right thing. Thanks, -Ryan

On Sat, Sep 28, 2013 at 1:09 PM, Ryan Newton
Hi all,
We all know and love Data.Foldable and are familiar with left folds and right folds. But what you want in a parallel program is a balanced fold over a tree. Fortunately, many of our datatypes (Sets, Maps) actually ARE balanced trees. Hmm, but how do we expose that?
Hi Ryan, At least for Data.Map, the Foldable instance seems to have a reasonably balanced fold called fold (or foldMap):
fold t = go t where go (Bin _ _ v l r) = go l `mappend` (v `mappend` go r)
This doesn't seem to be guaranteed though. For example ghc's derived instance writes the foldr only, so fold would be right-associated for a:
data T a = B (T a) (T a) | L a deriving (Foldable)
Regards, Adam

I've got a Partitionable class that I've been using for this purpose:
https://github.com/mikeizbicki/ConstraintKinds/blob/master/src/Control/Const...
The function called "parallel" in the HLearn library will automatically
parallelize any homomorphism from a Partionable to a Monoid. I
specifically use that to parallelize machine learning algorithms.
I have two thoughts for better abstractions:
1) This Partitionable class is essentially a comonoid. By reversing the
arrows of mappend, we get:
comappend :: a -> (a,a)
By itself, this works well if the number of processors you have is a power
of two, but it needs some more fanciness to get things balanced properly
for other numbers of processors. I bet there's another algebraic structure
that would capture these other cases, but I'm not sure what it is.
2) I'm working with parallelizing tree structures right now (kd-trees,
cover trees, oct-trees, etc.). The real problem is not splitting the
number of data points equally (this is easy), but splitting the amount of
work equally. Some points take longer to process than others, and this
cannot be determined in advance. Therefore, an equal split of the data
points can result in one processor getting 25% of the work load, and the
second processor getting 75%. Some sort of lazy Partitionable class that
was aware of processor loads and didn't split data points until they were
needed would be ideal for this scenario.
On Sat, Sep 28, 2013 at 6:46 PM, adam vogt
On Sat, Sep 28, 2013 at 1:09 PM, Ryan Newton
wrote: Hi all,
We all know and love Data.Foldable and are familiar with left folds and right folds. But what you want in a parallel program is a balanced fold over a tree. Fortunately, many of our datatypes (Sets, Maps) actually ARE balanced trees. Hmm, but how do we expose that?
Hi Ryan,
At least for Data.Map, the Foldable instance seems to have a reasonably balanced fold called fold (or foldMap):
fold t = go t where go (Bin _ _ v l r) = go l `mappend` (v `mappend` go r)
This doesn't seem to be guaranteed though. For example ghc's derived instance writes the foldr only, so fold would be right-associated for a:
data T a = B (T a) (T a) | L a deriving (Foldable)
Regards, Adam _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Sat, Sep 28, 2013 at 1:09 PM, Ryan Newton
Hi all,
We all know and love Data.Foldable and are familiar with left folds and right folds. But what you want in a parallel program is a balanced fold over a tree. Fortunately, many of our datatypes (Sets, Maps) actually ARE balanced trees. Hmm, but how do we expose that?
It seems like it would be nice to have a* standard class t*hat allows you to split a datatype into roughly even halves, until you get down to the leaves. This goes along with Guy Steele's argument that we should use "append lists" as primitive rather than "cons-lists", and it's why we added append-lists within the monad-par libraryhttp://hackage.haskell.org/package/monad-par-extras-0.3.3/docs/Control-Monad... .
Interestingly, in my Fortress days we looked at both using a split-like interface and at a more foldMap / reduce - like interface, and it seemed like the latter worked better – it requires a lot less boilerplate for controlling recursion, and better matches the fanout of whatever structure you're actually using underneath. So I'd just go with a hand-written Foldable instance here. But I'd love to hear if you've come up with an application that requires split itself, and that *isn't* zip. I recall we decided zip was better done with element-and-index iteration over one of the structures and indexing into the other since most tree structures don't actually zip properly anyway. -Jan-Willem Maessen

Thanks, that's interesting to know (re: Fortress). Interestingly, in my Fortress days we looked at both using a split-like
interface and at a more foldMap / reduce - like interface, and it seemed like the latter worked better – it requires a lot less boilerplate for controlling recursion, and better matches the fanout of whatever structure you're actually using underneath.
Ok, we'll have to try that. I may be underestimating the power of a newtype and a monoid instance to expose the structure.. I was wrong about this before [1]. Here's the foldMap instance for Data.Map: foldMap _ Tip = mempty foldMap f (Bin _ _ v l r) = Foldable.foldMap f l `mappend` f v `mappend` Foldable.foldMap f r Simon Marlow in his recent Haxl talk also had a domain where they wanted a symmetric (non-monadic) parallel spawn operation... But it remains pretty hard for me to reason about the operational behavior of these things... especially since foldMap instances may vary. Thanks, -Ryan [1] For example, here is a non-allocating traverseWithKey_ that I failed to come up with: -- Version of traverseWithKey_ from Shachaf Ben-Kiki -- (See thread on Haskell-cafe.) -- Avoids O(N) allocation when traversing for side-effect. newtype Traverse_ f = Traverse_ { runTraverse_ :: f () } instance Applicative f => Monoid (Traverse_ f) where mempty = Traverse_ (pure ()) Traverse_ a `mappend` Traverse_ b = Traverse_ (a *> b) -- Since the Applicative used is Const (newtype Const m a = Const m), the -- structure is never built up. --(b) You can derive traverseWithKey_ from foldMapWithKey, e.g. as follows: traverseWithKey_ :: Applicative f => (k -> a -> f ()) -> M.Map k a -> f () traverseWithKey_ f = runTraverse_ . foldMapWithKey (\k x -> Traverse_ (void (f k x))) foldMapWithKey :: Monoid r => (k -> a -> r) -> M.Map k a -> r foldMapWithKey f = getConst . M.traverseWithKey (\k x -> Const (f k x))

On Sun, Sep 29, 2013 at 9:13 PM, Ryan Newton
Thanks, that's interesting to know (re: Fortress).
Interestingly, in my Fortress days we looked at both using a split-like
interface and at a more foldMap / reduce - like interface, and it seemed like the latter worked better – it requires a lot less boilerplate for controlling recursion, and better matches the fanout of whatever structure you're actually using underneath.
Ok, we'll have to try that. I may be underestimating the power of a newtype and a monoid instance to expose the structure.. I was wrong about this before [1]. Here's the foldMap instance for Data.Map:
foldMap _ Tip = mempty foldMap f (Bin _ _ v l r) = Foldable.foldMap f l `mappend` f v `mappend` Foldable.foldMap f r
Simon Marlow in his recent Haxl talk also had a domain where they wanted a symmetric (non-monadic) parallel spawn operation...
But it remains pretty hard for me to reason about the operational behavior of these things... especially since foldMap instances may vary.
I'll note that there's really a documentation responsibility here that hasn't been honored as much as it should (possibly because lots of folks are driving Foldable, which other commenters have noted doesn't seem to do what you want for tree-like data structures – I certainly didn't realize that). It'd be worth thinking about doing the derivation of foldMap directly from the structure of the underlying type. It'd also be worth documenting when we get tree-structured traversal out of a Foldable instance, and fixing the ones that don't provide it. And I agree that getting down to non-allocating traversals is the ultimate goal here. If we leak space or lose parallelism we might as well not bother. -Jan Thanks,
-Ryan
[1] For example, here is a non-allocating traverseWithKey_ that I failed to come up with:
-- Version of traverseWithKey_ from Shachaf Ben-Kiki -- (See thread on Haskell-cafe.) -- Avoids O(N) allocation when traversing for side-effect.
newtype Traverse_ f = Traverse_ { runTraverse_ :: f () } instance Applicative f => Monoid (Traverse_ f) where mempty = Traverse_ (pure ()) Traverse_ a `mappend` Traverse_ b = Traverse_ (a *> b) -- Since the Applicative used is Const (newtype Const m a = Const m), the -- structure is never built up. --(b) You can derive traverseWithKey_ from foldMapWithKey, e.g. as follows: traverseWithKey_ :: Applicative f => (k -> a -> f ()) -> M.Map k a -> f () traverseWithKey_ f = runTraverse_ . foldMapWithKey (\k x -> Traverse_ (void (f k x))) foldMapWithKey :: Monoid r => (k -> a -> r) -> M.Map k a -> r foldMapWithKey f = getConst . M.traverseWithKey (\k x -> Const (f k x))

Oops, this email got stuck in the pipe (flaky internet):
foldMap _ Tip = mempty foldMap f (Bin _ _ v l r) = Foldable.foldMap f l `mappend` f v `mappend` Foldable.foldMap f r
Btw, from my perspective, one problem with relying on foldMap is that it treats the whole structure uniformly, whereas the split approach would let one, for example, bottom out to a sequential implementation at a certain granularity. Perhaps that is the "boilerplate for controlling recursion" that you referred to... but isn't it sometimes necessary? -Ryan
participants (4)
-
adam vogt
-
Jan-Willem Maessen
-
Mike Izbicki
-
Ryan Newton