
Here's the approach I was attempting: a while back there was a proposal to
modify Data.Map to support a sort of zipper, including the following API:
data Location k a -- represents a map with a "hole" at a particular key
search :: k -> Map k a -> (Maybe a, Location k a)
key :: Location k a -> k
assign :: a -> Location k a -> Map k a
clear :: Location k a -> Map k a
and from here you might define
insert :: k -> a -> Map k a -> Map k a
insert k a m = case search k m of
(_, loc) -> assign a loc
The surprising thing was that this approach *increased* the speed of the Map
implementation by something like 7%. I don't know what happened to the
original proposal, but I was going to try something similar, with something
like
data HashMap k v = HM !Int !(Tree k v)
insert k v (HM sz tree) = case search k tree of
(Nothing, loc) -> HM (sz + 1) (assign v loc)
(Just _, loc) -> HM sz (assign v loc)
Louis Wasserman
wasserman.louis@gmail.com
http://profiles.google.com/wasserman.louis
On Tue, Feb 22, 2011 at 8:45 PM, Johan Tibell
On Tue, Feb 22, 2011 at 6:28 PM, Louis Wasserman
wrote: Apologies!
Accepted. :)
I was, in point of fact, working on such a patch, but I think I've been convinced that it's a legitimate position for a package to take. (I'm also curious, Johann, about the approach you figured out that didn't cost a word per node!)
I'm working on a patch that provides O(1) size right now. The trick is to define HashMap as:
data HashMap k v = HM {-# UNPACK #-} !Int !(Tree k v)
data Tree k v = Nil | Tip {-# UNPACK #-} !Hash {-# UNPACK #-} !(FL.FullList k v) | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(Tree k v) !(Tree k v)
And have insert, delete, and other operations that change the size return both the updated tree and the size delta (-1, 0, or 1 in the case of insert/delete). The size delta is then applied to the stored size.
At the moment I'm trying to figure out the performance impact of making this change.
Johan