
On Mon, Aug 3, 2015 at 10:18 AM, Ian Duncan wrote:
On August 3, 2015 at 8:28:02 AM, Jake McArthur (jake.mcarthur@gmail.com) wrote:
Vector also does something like what you are describing. I think the phrase to google for is "array recycling".
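For reference, here is a tiny example of the kind of pipeline vector's recycling framework is meant to optimize; whether the rewrite rules actually fire depends on the vector version and on compiling with optimizations, so treat it as illustrative only:

import qualified Data.Vector as V

-- build a vector and immediately update it; with recycling, the freshly
-- allocated array can be mutated in place rather than copied first
example :: Int -> V.Vector Int
example n = V.enumFromTo 1 n V.// [(0, 42)]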
On Fri, Jul 31, 2015 at 11:56 PM, William Yager wrote:
Has anyone done any research into fusing operations that involve thawing some data, mutating it, and then freezing it again?
Data.Vector does something similar; it turns vectors into streams, operates on the streams, and then turns them back into vectors. It can fuse these operations by removing the intermediate streams.

However, I've done a bit of preliminary work on a fusion system that works on operations of the form

runST $ do
  x' <- thaw x
  foo x'
  freeze x'
and yields promising results in some cases. This could be useful for data structures that aren't stream-able, but suffer from lots of unnecessary freezing and unfreezing.
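To make that shape concrete, here is a minimal, self-contained sketch of the kind of code such a system would target, using Data.Vector; the name bumpFirst and the particular mutation are illustrative only, not from the original mail:

import Control.Monad.ST (runST)
import qualified Data.Vector as V
import qualified Data.Vector.Mutable as MV

-- the thaw/mutate/freeze shape described above; thaw and freeze each
-- copy the whole vector, which is exactly the cost fusion would remove
bumpFirst :: V.Vector Int -> V.Vector Int
bumpFirst x = runST $ do
  x' <- V.thaw x          -- copy into a mutable vector
  MV.modify x' (+ 1) 0    -- mutate in place (assumes x is non-empty)
  V.freeze x'             -- copy back to an immutable vector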
This seems like the sort of thing that someone would have already done, so I wanted to check if anyone knew of any previous work on this.
Cheers, Will
Sorry, I know I already sent this once, but it looks like the formatting got a little weird, so for anyone who couldn't read it before, I'll try this again:

I'd be interested in knowing about the preliminary work you're doing here. I've got a port of Clojure's persistent vectors (pvector for short) in development right now, and I'm working on figuring out how to batch consecutive pure modifications of a pvector into cheaper operations on the mutable version before converting it back into the immutable version. Here are the operations I'm especially interested in optimizing this way:

{- each of these requires cloning a 32-element array. optimizing consecutive
   snocs is important because fromList is defined as (foldl' snoc empty) -}
update1 :: Vector a -> Int -> a -> Vector a
snoc    :: Vector a -> a -> Vector a
unsnoc  :: Vector a -> Maybe (a, Vector a)

{- requires cloning <= N/32 arrays, where N is the length of the update batch -}
update :: Vector a -> Vector (Int, a) -> Vector a

{- would be nice to preallocate a fully-sized pvector for consecutive concats,
   for example (\vs -> foldr concat empty vs) -}
concat :: Vector a -> Vector a -> Vector a

I only just started working on freeze/thaw fusion myself, so I haven't gotten far at all. One trick that I've got in place in the transient structure to make this cheaper, though, is how I track modifications of the persistent vector. The immutable version is represented like so:

data Node a
  = Leaf !(Array a)  {- Array here is from the primitive package -}
  | Branch !(Array (Node a))
  | EmptyNode

data Vector a = Vector
  { vCount     :: !Int
  , vShift     :: !Int
  , vRoot      :: !(Node a)
  , vTail      :: !(Array a)
  , vTailCount :: !Int
  }

And then here's the transient version:

data TransientNode s a
  = UneditedLeaf !(Array a)
  | UneditedBranch !(Array (Node a))
  | EditedLeaf !(MutableArray s a)
  | EditedBranch !(MutableArray s (TransientNode s a))
  | EmptyTransientNode

data TransientVector s a = TransientVector
  { tvCount      :: !Int
  , tvShift      :: !Int
  , tvRoot       :: !(TransientNode s a)
  , tvTail       :: !(MutableArray s a)
  , tvTailEdited :: !Bool
  , tvTailCount  :: !Int
  }

When the persistent version is converted to the transient version, the data structure tracks which children have been modified over the course of operations on the transient structure. So, when the mutable version is converted back into the persistent version, any of the underlying arrays that haven't been touched are still shared with any of the other live pvectors referencing them.
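As a rough illustration of that last point, here is how the conversion back to the persistent form could look for the node type. This is a minimal sketch assuming the Node and TransientNode declarations above are in scope; persistNode is a hypothetical name, and the real code would also have to rebuild the Vector record and handle the tail:

import Control.Monad.ST (ST)
import Data.Primitive.Array
  (Array, MutableArray, newArray, readArray, writeArray,
   sizeofMutableArray, unsafeFreezeArray)

persistNode :: TransientNode s a -> ST s (Node a)
persistNode tn = case tn of
  EmptyTransientNode -> pure EmptyNode
  -- untouched subtrees keep their original immutable arrays,
  -- so they remain shared with other live pvectors
  UneditedLeaf arr   -> pure (Leaf arr)
  UneditedBranch arr -> pure (Branch arr)
  -- edited leaves: freeze the mutable array in place
  EditedLeaf marr    -> Leaf <$> unsafeFreezeArray marr
  -- edited branches: persist each child, then freeze the result
  EditedBranch marr  -> do
    let n = sizeofMutableArray marr
    out <- newArray n EmptyNode
    let go i
          | i >= n    = pure ()
          | otherwise = do
              node <- persistNode =<< readArray marr i
              writeArray out i node
              go (i + 1)
    go 0
    Branch <$> unsafeFreezeArray out

Since only Edited nodes are rebuilt, freezing after a handful of batched modifications should only touch the paths from the root to the edited leaves rather than the whole trie.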