
On Mon, Aug 3, 2015 at 10:18 AM, Ian Duncan wrote:
On August 3, 2015 at 8:28:02 AM, Jake McArthur (jake.mcarthur@gmail.com) wrote:
Vector also does something like what you are describing. I think the phrase to google for is "array recycling".
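For reference, here is a tiny example of the kind of pipeline vector's recycling framework is meant to optimize; whether the rewrite rules actually fire depends on the vector version and on compiling with optimizations, so treat it as illustrative only:

import qualified Data.Vector as V

-- build a vector and immediately update it; with recycling, the freshly
-- allocated array can be mutated in place rather than copied first
example :: Int -> V.Vector Int
example n = V.enumFromTo 1 n V.// [(0, 42)]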
On Fri, Jul 31, 2015 at 11:56 PM, William Yager wrote:
Has anyone done any research into fusing operations that involve thawing some data, mutating it, and then freezing it again?
Data.Vector does something similar; it turns vectors into streams, operates on the streams, and then turns them back into vectors. It can fuse these operations by removing the intermediate streams.

However, I've done a bit of preliminary work on a fusion system that works on operations of the form

runST $ do
  x' <- thaw x
  foo x'
  freeze x'
and yields promising results in some cases. This could be useful for data structures that aren't stream-able, but suffer from lots of unnecessary freezing and unfreezing.
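To make that shape concrete, here is a minimal, self-contained sketch of the kind of code such a system would target, using Data.Vector; the name bumpFirst and the particular mutation are illustrative only, not from the original mail:

import Control.Monad.ST (runST)
import qualified Data.Vector as V
import qualified Data.Vector.Mutable as MV

-- the thaw/mutate/freeze shape described above; thaw and freeze each
-- copy the whole vector, which is exactly the cost fusion would remove
bumpFirst :: V.Vector Int -> V.Vector Int
bumpFirst x = runST $ do
  x' <- V.thaw x          -- copy into a mutable vector
  MV.modify x' (+ 1) 0    -- mutate in place (assumes x is non-empty)
  V.freeze x'             -- copy back to an immutable vector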
This seems like the sort of thing that someone would have already done, so I wanted to check if anyone knew of any previous work on this.
Cheers, Will
Sorry, I know I already sent this once, but it looks like the formatting got a little weird, so for anyone who couldn't read it before, I'll try this again:

I'd be interested in knowing about the preliminary work you're doing here. I've got a port of Clojure's persistent vectors (pvector for short) in development right now, and I'm working on figuring out how to batch consecutive pure modifications of a pvector into cheaper operations on the mutable version before converting it back into the immutable version. Here are the operations I'm especially interested in optimizing this way:

{- each of these requires cloning a 32-element array. optimizing consecutive
   snocs is important because fromList is defined as (foldl' snoc empty) -}
update1 :: Vector a -> Int -> a -> Vector a
snoc    :: Vector a -> a -> Vector a
unsnoc  :: Vector a -> Maybe (a, Vector a)

{- requires cloning <= N/32 arrays, where N is the length of the update batch -}
update :: Vector a -> Vector (Int, a) -> Vector a

{- would be nice to preallocate a fully-sized pvector for consecutive concats,
   for example (\vs -> foldr concat empty vs) -}
concat :: Vector a -> Vector a -> Vector a

I only just started working on freeze/thaw fusion myself, so I haven't gotten far at all. One trick that I've got in place in the transient structure to make this cheaper, though, is how I track modifications of the persistent vector. The immutable version is represented like so:

data Node a
  = Leaf !(Array a)  {- Array here is from the primitive package -}
  | Branch !(Array (Node a))
  | EmptyNode

data Vector a = Vector
  { vCount     :: !Int
  , vShift     :: !Int
  , vRoot      :: !(Node a)
  , vTail      :: !(Array a)
  , vTailCount :: !Int
  }

And then here's the transient version:

data TransientNode s a
  = UneditedLeaf !(Array a)
  | UneditedBranch !(Array (Node a))
  | EditedLeaf !(MutableArray s a)
  | EditedBranch !(MutableArray s (TransientNode s a))
  | EmptyTransientNode

data TransientVector s a = TransientVector
  { tvCount      :: !Int
  , tvShift      :: !Int
  , tvRoot       :: !(TransientNode s a)
  , tvTail       :: !(MutableArray s a)
  , tvTailEdited :: !Bool
  , tvTailCount  :: !Int
  }

When the persistent version is converted to the transient version, the data structure tracks which children have been modified over the course of operations on the transient structure. So, when the mutable version is converted back into the persistent version, any of the underlying arrays that haven't been touched are still shared with any of the other live pvectors referencing them.
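As a rough illustration of that last point, here is how the conversion back to the persistent form could look for the node type. This is a minimal sketch assuming the Node and TransientNode declarations above are in scope; persistNode is a hypothetical name, and the real code would also have to rebuild the Vector record and handle the tail:

import Control.Monad.ST (ST)
import Data.Primitive.Array
  (Array, MutableArray, newArray, readArray, writeArray,
   sizeofMutableArray, unsafeFreezeArray)

persistNode :: TransientNode s a -> ST s (Node a)
persistNode tn = case tn of
  EmptyTransientNode -> pure EmptyNode
  -- untouched subtrees keep their original immutable arrays,
  -- so they remain shared with other live pvectors
  UneditedLeaf arr   -> pure (Leaf arr)
  UneditedBranch arr -> pure (Branch arr)
  -- edited leaves: freeze the mutable array in place
  EditedLeaf marr    -> Leaf <$> unsafeFreezeArray marr
  -- edited branches: persist each child, then freeze the result
  EditedBranch marr  -> do
    let n = sizeofMutableArray marr
    out <- newArray n EmptyNode
    let go i
          | i >= n    = pure ()
          | otherwise = do
              node <- persistNode =<< readArray marr i
              writeArray out i node
              go (i + 1)
    go 0
    Branch <$> unsafeFreezeArray out

Since only Edited nodes are rebuilt, freezing after a handful of batched modifications should only touch the paths from the root to the edited leaves rather than the whole trie.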