New subject: Proposal: Partitionable goes somewhere + containers instances

29 Sep 2013

      <subject change>

On Sun, Sep 29, 2013 at 3:31 AM, Mike Izbicki  wrote:
...
I've got a Partitionable class that I've been using for this purpose:
https://github.com/mikeizbicki/ConstraintKinds/blob/master/src/Control/Const...
Mike -- Neat, that's a cool library!

Edward --  ideally, where in the standard libraries should the
Partitionable comonoid go?

Btw, I'm not sure what the ideal return type for comappend is, given that
it needs to be able to "bottom out".  Mike, our partition function's list
return type seems more reasonable.  Or maybe something simple would be this:

*class Partitionable t where*
*  partition :: t -> Maybe (t,t)*

That is, at some point its not worth splitting and returns Nothing, and
you'd better be able to deal with the 't' directly.

So what I really want is for the *containers package to please get some
kind of Partitionable instances! * Johan & others, I would be happy to
provide a patch if the class can be agreed on. This is important because
currently the balanced tree structure of Data.Set/Map is an *amazing and
beneficial property* that is *not* exposed at all through the API.
   For example, it would be great to have a parallel traverse_ for Maps and
Sets in the Par monad.  The particular impetus is that our new and enhanced
Par monad makes extensive use of Maps and Sets, both the pure, balanced
ones, and lockfree/inplace ones based on concurrent skip lists:

    http://www.cs.indiana.edu/~rrnewton/haddock/lvish/

Alternatively, it would be ok if there were a "Data.Map.Internal" module
that exposed the Bin/Tip, but I assume people would rather have a clean
Partitionable instance...

Best,
  -Ryan

On Sun, Sep 29, 2013 at 3:31 AM, Mike Izbicki  wrote:
...
I've got a Partitionable class that I've been using for this purpose:
https://github.com/mikeizbicki/ConstraintKinds/blob/master/src/Control/Const...
The function called "parallel" in the HLearn library will automatically
parallelize any homomorphism from a Partionable to a Monoid.  I
specifically use that to parallelize machine learning algorithms.
I have two thoughts for better abstractions:
1)  This Partitionable class is essentially a comonoid.  By reversing the
arrows of mappend, we get:
comappend :: a -> (a,a)
By itself, this works well if the number of processors you have is a power
of two, but it needs some more fanciness to get things balanced properly
for other numbers of processors.  I bet there's another algebraic structure
that would capture these other cases, but I'm not sure what it is.
2) I'm working with parallelizing tree structures right now (kd-trees,
cover trees, oct-trees, etc.).  The real problem is not splitting the
number of data points equally (this is easy), but splitting the amount of
work equally.  Some points take longer to process than others, and this
cannot be determined in advance.  Therefore, an equal split of the data
points can result in one processor getting 25% of the work load, and the
second processor getting 75%.  Some sort of lazy Partitionable class that
was aware of processor loads and didn't split data points until they were
needed would be ideal for this scenario.
On Sat, Sep 28, 2013 at 6:46 PM, adam vogt  wrote:
...
On Sat, Sep 28, 2013 at 1:09 PM, Ryan Newton  wrote:
...
Hi all,
We all know and love Data.Foldable and are familiar with left folds and
right folds.  But what you want in a parallel program is a balanced fold
over a tree.  Fortunately, many of our datatypes (Sets, Maps) actually
ARE
balanced trees.  Hmm, but how do we expose that?
Hi Ryan,
At least for Data.Map, the Foldable instance seems to have a
reasonably balanced fold called fold (or foldMap):
...
fold t = go t
   where   go (Bin _ _ v l r) = go l `mappend` (v `mappend` go r)
This doesn't seem to be guaranteed though. For example ghc's derived
instance writes the foldr only, so fold would be right-associated for
a:
...
data T a = B (T a) (T a) | L a deriving (Foldable)
Regards,
Adam
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Proposal: Partitionable goes somewhere + containers instances

Ryan Newton

Nicolas Trangez

Milan Straka

Edward Kmett

Ryan Newton

Mike Izbicki

Edward Kmett

Ryan Newton

Mario Blažević

Ryan Newton

tags

participants (6)