Why isn't there a cheaper "split-in-two" operation for Data.Set?

Would there be anything wrong with a Data.Set simply chopping off half its (balanced) tree and returning two approximately balanced partitions -- i.e. Data.Set.split without the pivot argument? Even though it can't quite be constant-time (it still needs rebalancing), it could still be cheaper than Data.Set.split, right? In the context of Set.hs, I'm thinking along the lines of something like the following:

    cleave :: Set a -> (Set a, Set a)
    cleave Tip = (Tip, Tip)
    cleave (Bin _ x l r)
      | size l > size r = (l, insertMin x r)
      | otherwise       = (insertMax x l, r)

For one thing, it seems like the proposed function would be useful for consuming a set in parallel, similar to Guy Steele's arguments in recent talks (see "Organizing Functional Code for Parallel Execution; or, foldl and foldr Considered Slightly Harmful", http://research.sun.com/projects/plrg/Publications/ICFPAugust2009Steele.pdf). That is, one splits a tree and uses "par" to process the left and right halves in parallel.

Alternatively, it would be simple to provide a parallel version of mapMonotonic that does this internally, i.e.:

    mapMonotonic _ Tip = Tip
    mapMonotonic f (Bin sz x l r) = Bin sz (f x) (mapMonotonic f l) (mapMonotonic f r)

becomes something like:

    parMapMonotonic _ Tip = Tip
    parMapMonotonic f (Bin sz x l r) = par left $ seq right $ Bin sz (f x) left right
      where left  = mapMonotonic f l
            right = mapMonotonic f r

Cheers,
-Ryan
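As a rough illustration of how cleave could feed a parallel consumer, here is a minimal sketch. The name parFoldSet and its depth cutoff are invented for this sketch; it assumes the proposed cleave is in scope, that the combining function is associative, and that z is its identity.

    import Control.Parallel (par, pseq)
    import qualified Data.Set as S

    -- Sketch only: split the set with the proposed cleave, spark the left
    -- half, evaluate the right half, and combine. The depth parameter caps
    -- how many sparks are created.
    parFoldSet :: Int -> (b -> b -> b) -> b -> (a -> b) -> S.Set a -> b
    parFoldSet depth combine z f s
      | depth <= 0 || S.size s <= 1 = S.foldr (combine . f) z s
      | otherwise = leftRes `par` (rightRes `pseq` combine leftRes rightRes)
      where
        (l, r)   = cleave s   -- the proposed split-in-two (not in Data.Set)
        leftRes  = parFoldSet (depth - 1) combine z f l
        rightRes = parFoldSet (depth - 1) combine z f r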


Ryan Newton wrote:
Would there be anything wrong with a Data.Set simply chopping off half its (balanced) tree and returning two approximately balanced partitions ...

    cleave :: Set a -> (Set a, Set a)
    cleave Tip = (Tip, Tip)
    cleave (Bin _ x l r)
      | size l > size r = (l, insertMin x r)
      | otherwise       = (insertMax x l, r)
This function would expose some of the internal structure of Set - i.e. there could be equal sets s1 == s2 with cleave s1 /= cleave s2. Maybe a better idea than exposing such a function would be to split Data.Set into Data.Set.Internal and Data.Set, where Data.Set.Internal would export the actual Tip and Bin constructors. Then people who want to break the abstraction, for example to experiment with parallel folds, could do that easily.

regards,
Bertram
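To make the point about exposing internal structure concrete, here is an illustrative sketch; whether the shapes actually differ depends on the balancing details, but nothing in the Set abstraction forbids it.

    import qualified Data.Set as S

    -- Two sets with the same elements, built in different orders. They are
    -- equal under (==), but the underlying balanced trees may be shaped
    -- differently, so a structure-exposing cleave could return different
    -- pairs for s1 and s2.
    s1, s2 :: S.Set Int
    s1 = S.fromList [1 .. 6]
    s2 = foldr S.insert S.empty [6, 5 .. 1]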

That sounds good to me. In any case, the parallel map/fold operations by themselves shouldn't compromise the abstraction.

Perhaps an eventual solution would be to start including parallel maps/folds right inside the standard libraries. I haven't begun testing this yet, but it would seem that all the balanced tree implementations are good candidates for a little `par` treatment. Has this been tried somewhere already? Alas, having par/nonpar versions of library functions compounds the already present monadic/non-monadic schism...

Anyway, right this second I'm primarily interested in speeding up difference and intersection -- that would be really useful for a simple utility I've been using that compares files as Maps of word-tuples and runs rather slowly (http://hackage.haskell.org/package/wordsetdiff).

Cheers,
-Ryan
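As a rough idea of how the proposed split-in-two could help with intersection (and, analogously, difference), here is a hypothetical sketch. parIntersection and its cutoff are invented names, and cleave is the not-yet-existing function from the start of the thread; the recombination with union is valid because cleave's halves partition the first set by ordering.

    import Control.Parallel (par, pseq)
    import qualified Data.Set as S

    -- Hypothetical sketch: split s1 with the proposed cleave, intersect each
    -- half with s2 in parallel, and union the ordered, disjoint results.
    parIntersection :: Ord a => S.Set a -> S.Set a -> S.Set a
    parIntersection s1 s2
      | S.size s1 < cutoff = S.intersection s1 s2
      | otherwise          = left `par` (right `pseq` S.union left right)
      where
        (l, r) = cleave s1            -- proposed split-in-two (not in Data.Set)
        left   = parIntersection l s2
        right  = parIntersection r s2
        cutoff = 1000                 -- arbitrary, purely illustrative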

On 10/19/10 5:47 AM, Ryan Newton wrote:
That sounds good to me. In any case the parallel map/fold operations by themselves shouldn't compromise the abstraction.
Perhaps an eventual solution would be to start including parallel maps/folds right inside the standard libraries. I haven't begun testing this yet, but it would seem that all the balanced tree implementations are good candidates for a little `par` treatment. Has this been tried somewhere already? Alas, having par/nonpar versions of library functions compounds the already present monadic/non-monadic schism...
Another problem is that the desirable level of parallelization isn't fixed. For example, let's consider a function f being mapped over a collection C. With non-parallel map this has cost O(0*k) + O(C)*O(f), where k is the cost of spawning/reaping threads, O(C) is the size of the collection, and O(f) is the cost of running f once.

Now consider mapPar2, which uses two threads. The cost of this is O(1*k) + 2 `par` O(C)/2*O(f), where `par` is a kind of multiplication based on the level of parallelism actually achieved. With perfect parallelism x`par`y = y; with no parallelism x`par`y = x*y.

We can generalize this to O((n-1)*k) + n `par` O(C)/n*O(f) for n threads. But the problem is that, depending on how big O(f) is relative to O(k), there's going to be a different n which gives the optimal tradeoff. If O(f) is small enough, then the overhead of sparking threads is wasted; if O(f) is middling, then we'd want a few threads but not "too many"; if O(f) is huge, then we want n = O(C) since the overhead disappears in the asymptotics. Thus, what we'd need to offer in the API is a set of evaluators in order to alter the level of parallelization.

--
Live well,
~wren
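One way to picture such a knob is a map whose granularity is chosen by the caller. This is a hypothetical sketch using Control.Parallel.Strategies, not an existing Data.Set API; the name mapParChunked is invented.

    import Control.Parallel.Strategies (parListChunk, rseq, using)

    -- Hypothetical: the caller controls n indirectly via the chunk size.
    -- A chunk size equal to the list length degenerates to a sequential map;
    -- a chunk size of 1 sparks every element.
    mapParChunked :: Int -> (a -> b) -> [a] -> [b]
    mapParChunked chunk f xs = map f xs `using` parListChunk chunk rseq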

On 20 October 2010 04:05, wren ng thornton wrote:
Another problem is that the desirable level of parallelization isn't fixed. For example, let's consider a function f being mapped over a collection C.
With non-parallel map this has cost O(0*k) + O(C)*O(f) where k is the cost of spawning/reaping threads, O(C) is the size of the collection, and O(f) is the cost of running f once.
Now, consider mapPar2 which uses two threads. The cost of this is O(1*k) + 2 `par` O(C)/2*O(f), where `par` is a kind of multiplication based on the level of parallelism actually achieved. With perfect parallelism x`par`y = y; with no parallelism x`par`y = x*y.
You probably mean x + y.
We can generalize this to O((n-1)*k) + n `par` O(C)/n*O(f) for n threads. But the problem is, depending on how big O(f) is relative to O(k), there's going to be a different n which gives the optimal tradeoff. If O(f) is small enough, then the overhead of sparking threads is wasted; if O(f) is middling, then we'd want a few threads but not "too many"; if O(f) is huge, then we want n = O(C) since the overhead disappears in the asymptotics. Thus, what we'd need to offer in the API is a set of evaluators in order to alter the level of parallelization.
Actually, it doesn't work that way anymore. Threads are not (or only rarely) created to run sparks. Instead, idle threads look into the spark pool and grab and execute some sparks. Nowadays, the main overhead you get is from stealing sparks from other CPUs, which incurs a lot of cache misses. If the other CPU is busy it may still be worth it, but it's quite hard to predict. Sparks are also run in chunks, which makes it less expensive to have small sparks.

--
Push the envelope. Watch it bend.
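A tiny example of the mechanism described above (standard `par` usage, nothing specific to Data.Set): each `par` merely records a spark in the spark pool; idle capabilities (e.g. when run with +RTS -N4) steal and evaluate them, so creating many small sparks is comparatively cheap even when not all of them end up running in parallel.

    import Control.Parallel (par, pseq)

    -- `rest `par` ...` only adds a spark for `rest` to the spark pool; it
    -- does not create a thread. An idle capability may pick it up while this
    -- one continues with `x + rest`.
    psum :: [Int] -> Int
    psum []     = 0
    psum (x:xs) = rest `par` (x `pseq` (x + rest))
      where rest = psum xs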
participants (5)
- Bertram Felgenhauer
- Ryan Newton
- Ryan Newton
- Thomas Schilling
- wren ng thornton