Efficent lens operation for Data.Map et al.

Using the existing interface to Data.Map it is impossible to implement efficent lenses for the structures. Right now, the implementation of mapLens et al. in data-lens must traverse the entire structure of the map twice: once to do a lookup and another to do an alteration. mapLens :: Ord k => k -> Lens (Map k v) (Maybe v) mapLens k = Lens $ \m -> store (\mv -> case mv of Nothing -> Map.delete k m Just v' -> Map.insert k v' m ) (Map.lookup k m) A native lensing operation would improve the situtation much. Two operations come close already: (1) alter comes close but doesn't return the lookup. (2) updateLookupWithKey comes even closer, however there is no way to insert a new key-value pair if the key isn't already in the map. So I propose adding a lens operation to the Data.Map (et al.) interfaces. Or, if you guys like warm-fuzzy names, we could call it alterLookup. lens :: k -> Map k a -> (Maybe a -> Map k a, Maybe a) with the specification that snd (lens k m) = lookup k m fst (lens k m) v = update (const v) k m The relationship between lens and alter is alter f k m = let (u,v) = lens k m in u (f v) In fact, every other Map update operation should be derivable from the lens function. I'm slowly working on a patch; however I thought I'd bring it up here on the mailing list in case some enterprising person who is familiar with these libraries wants to go ahead and implement it in Data.Map, Data.IntMap, Data.Set and Data.IntSet. -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

+1 on the concept, although IIRC the last time something similar came up the constant factors were kind of bad. Edward Excerpts from roconnor's message of Tue Jan 17 14:22:17 -0500 2012:
Using the existing interface to Data.Map it is impossible to implement efficent lenses for the structures. Right now, the implementation of mapLens et al. in data-lens must traverse the entire structure of the map twice: once to do a lookup and another to do an alteration.
mapLens :: Ord k => k -> Lens (Map k v) (Maybe v) mapLens k = Lens $ \m -> store (\mv -> case mv of Nothing -> Map.delete k m Just v' -> Map.insert k v' m ) (Map.lookup k m)
A native lensing operation would improve the situtation much. Two operations come close already:
(1) alter comes close but doesn't return the lookup. (2) updateLookupWithKey comes even closer, however there is no way to insert a new key-value pair if the key isn't already in the map.
So I propose adding a lens operation to the Data.Map (et al.) interfaces. Or, if you guys like warm-fuzzy names, we could call it alterLookup.
lens :: k -> Map k a -> (Maybe a -> Map k a, Maybe a)
with the specification that
snd (lens k m) = lookup k m
fst (lens k m) v = update (const v) k m
The relationship between lens and alter is
alter f k m = let (u,v) = lens k m in u (f v)
In fact, every other Map update operation should be derivable from the lens function.
I'm slowly working on a patch; however I thought I'd bring it up here on the mailing list in case some enterprising person who is familiar with these libraries wants to go ahead and implement it in Data.Map, Data.IntMap, Data.Set and Data.IntSet.

On Tue, 17 Jan 2012, Johan Tibell wrote:
On Tue, Jan 17, 2012 at 11:22 AM,
wrote: In fact, every other Map update operation should be derivable from the lens function.
We tried this an performance was terrible (although we used a zipper.)
I don't really see what zippers have to do with this. Here is my non-typechecked proposal for IntMap.lens adapted from updateLookupWithKey and alter: lens :: Key -> IntMap a -> (Maybe a -> IntMap a, Maybe a) lens k t = case t of Bin p m l r | nomatch k p m -> (\nv -> case nv of Nothing -> t Just x -> join k (Tip k x) p t , Nothing) | zero k m -> let (fl',found) = lens k l in (\nv -> bin p m (fl' nv) r, found) | otherwise -> let (fr',found) = lens k r in (\nv -> bin p m l (fr' nv), found) Tip ky y | k==ky -> (\nv -> case nv of Just y' -> Tip ky y') Nothing -> Nil , Just y) | otherwise -> (\nv -> case nv of Just x -> join k (Tip k x) ky t Nothing -> t , Nothing) Nil -> (\nv -> case nv of Just x -> Tip k x Nothing -> Nil , Nothing) -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

On Tue, Jan 17, 2012 at 3:26 PM,
I don't really see what zippers have to do with this.
It's another way of creating a continuation by saving the stack.
Here is my non-typechecked proposal for IntMap.lens adapted from updateLookupWithKey and alter:
lens :: Key -> IntMap a -> (Maybe a -> IntMap a, Maybe a) lens k t = case t of Bin p m l r | nomatch k p m -> (\nv -> case nv of Nothing -> t Just x -> join k (Tip k x) p t , Nothing) | zero k m -> let (fl',found) = lens k l in (\nv -> bin p m (fl' nv) r, found) | otherwise -> let (fr',found) = lens k r in (\nv -> bin p m l (fr' nv), found) Tip ky y | k==ky -> (\nv -> case nv of Just y' -> Tip ky y') Nothing -> Nil , Just y) | otherwise -> (\nv -> case nv of Just x -> join k (Tip k x) ky t Nothing -> t , Nothing) Nil -> (\nv -> case nv of Just x -> Tip k x Nothing -> Nil , Nothing)
Try it out on the Criterion benchmarks in the source repo. I'd be curious to see the results. -- Johan

On Tue, 17 Jan 2012, Johan Tibell wrote:
Try it out on the Criterion benchmarks in the source repo. I'd be curious to see the results.
Okay, I never intended to try and replace the existing operations with a lens based one, but now you've got me curious to try. Here is my fastLens implementation: fastLens :: Key -> IntMap a -> (IntMap a -> (a -> IntMap a) -> a -> c) -> (IntMap a -> (a -> IntMap a) -> c) -> c fastLens k = k `seq` go where go t@(Bin p m l r) c1 c2 | nomatch k p m = c2 t (\x -> join k (Tip k x) p t) | zero k m = go l (updateContL c1) (updateContL c2) | otherwise = go r (updateContR c1) (updateContR c2) where updateContL c dl il = c (bin p m dl r) (\x -> bin p m (il x) r) updateContR c dr ir = c (bin p m l dr) (\x -> bin p m r (ir x)) go t@(Tip ky y) c1 c2 | k == ky = c1 Nil (Tip ky) y | otherwise = c2 t (\x -> join k (Tip k x) ky t) go Nil _ c2 = c2 Nil (Tip k) lookup :: Key -> IntMap a -> Maybe a lookup k t = fastLens k t (\_ _ -> Just) (\_ _ -> Nothing) insert :: Key -> a -> IntMap a -> IntMap a insert k x t = fastLens k t (\_ it _ -> it x) (\_ it -> it x) delete :: Key -> IntMap a -> IntMap a delete k t = fastLens k t (\dt _ _ -> dt) (\dt _ -> dt) update :: (a -> Maybe a) -> Key -> IntMap a -> IntMap a update f k t = fastLens k t (\dt it x -> maybe dt it (f x)) (\dt _ -> dt) alter :: (Maybe a -> Maybe a) -> Key -> IntMap a -> IntMap a alter f k t = fastLens k t (\dt it x -> maybe dt it (f (Just x))) (\dt it -> maybe dt it (f Nothing)) lens :: Key -> IntMap a -> (Maybe a -> IntMap a, Maybe a) lens k t = fastLens k t (\dt it x -> (maybe dt it,Just x)) (\dt it -> (maybe dt it,Nothing)) (See also: https://github.com/roconnor/containers/tree/FastLens) How do I use criterion to benchmark this? -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

IIRC you just replace the current functions with yours and run make in
the benchmarks/ directory to compile the benchmark binaries (which use
Criterion). Then simply run them.
On Wed, Jan 18, 2012 at 8:06 AM,
On Tue, 17 Jan 2012, Johan Tibell wrote:
Try it out on the Criterion benchmarks in the source repo. I'd be curious to see the results.
Okay, I never intended to try and replace the existing operations with a lens based one, but now you've got me curious to try.
Here is my fastLens implementation:
fastLens :: Key -> IntMap a -> (IntMap a -> (a -> IntMap a) -> a -> c) -> (IntMap a -> (a -> IntMap a) -> c) -> c fastLens k = k `seq` go where go t@(Bin p m l r) c1 c2 | nomatch k p m = c2 t (\x -> join k (Tip k x) p t) | zero k m = go l (updateContL c1) (updateContL c2) | otherwise = go r (updateContR c1) (updateContR c2) where updateContL c dl il = c (bin p m dl r) (\x -> bin p m (il x) r) updateContR c dr ir = c (bin p m l dr) (\x -> bin p m r (ir x)) go t@(Tip ky y) c1 c2 | k == ky = c1 Nil (Tip ky) y | otherwise = c2 t (\x -> join k (Tip k x) ky t) go Nil _ c2 = c2 Nil (Tip k)
lookup :: Key -> IntMap a -> Maybe a lookup k t = fastLens k t (\_ _ -> Just) (\_ _ -> Nothing)
insert :: Key -> a -> IntMap a -> IntMap a insert k x t = fastLens k t (\_ it _ -> it x) (\_ it -> it x)
delete :: Key -> IntMap a -> IntMap a delete k t = fastLens k t (\dt _ _ -> dt) (\dt _ -> dt)
update :: (a -> Maybe a) -> Key -> IntMap a -> IntMap a update f k t = fastLens k t (\dt it x -> maybe dt it (f x)) (\dt _ -> dt)
alter :: (Maybe a -> Maybe a) -> Key -> IntMap a -> IntMap a alter f k t = fastLens k t (\dt it x -> maybe dt it (f (Just x))) (\dt it -> maybe dt it (f Nothing))
lens :: Key -> IntMap a -> (Maybe a -> IntMap a, Maybe a) lens k t = fastLens k t (\dt it x -> (maybe dt it,Just x)) (\dt it -> (maybe dt it,Nothing))
(See also: https://github.com/roconnor/containers/tree/FastLens)
How do I use criterion to benchmark this?
-- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

On Wed, 18 Jan 2012, Johan Tibell wrote:
IIRC you just replace the current functions with yours and run make in the benchmarks/ directory to compile the benchmark binaries (which use Criterion). Then simply run them.
I got an error trying to build the benchmarks: $ ghc --version The Glorious Glasgow Haskell Compilation System, version 6.12.3 $ make ghc -DTESTING -cpp -O2 -fregs-graph -rtsopts --make -fforce-recomp -i.. -o bench-Map Map.hs [1 of 7] Compiling Data.StrictPair ( ../Data/StrictPair.hs, ../Data/StrictPair.o ) [2 of 7] Compiling Data.Set ( ../Data/Set.hs, ../Data/Set.o ) [3 of 7] Compiling Data.Map.Base ( ../Data/Map/Base.hs, ../Data/Map/Base.o ) [4 of 7] Compiling Data.Map.Lazy ( ../Data/Map/Lazy.hs, ../Data/Map/Lazy.o ) [5 of 7] Compiling Data.Map.Strict ( ../Data/Map/Strict.hs, ../Data/Map/Strict.o ) [6 of 7] Compiling Data.Map ( ../Data/Map.hs, ../Data/Map.o ) [7 of 7] Compiling Main ( Map.hs, Map.o ) Linking bench-Map ... ghc -DTESTING -cpp -O2 -fregs-graph -rtsopts --make -fforce-recomp -i.. -o bench-Set Set.hs [1 of 2] Compiling Data.Set ( ../Data/Set.hs, ../Data/Set.o ) [2 of 2] Compiling Main ( Set.hs, Set.o ) Linking bench-Set ... ghc -DTESTING -cpp -O2 -fregs-graph -rtsopts --make -fforce-recomp -i.. -o bench-IntMap IntMap.hs ../Data/IntMap/Base.hs:195:44: parse error on input `#' make: *** [bench-IntMap] Error 1 Any tips on how to proceed? -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

On Wed, Jan 18, 2012 at 1:35 PM,
On Wed, 18 Jan 2012, Johan Tibell wrote:
IIRC you just replace the current functions with yours and run make in the benchmarks/ directory to compile the benchmark binaries (which use Criterion). Then simply run them.
I got an error trying to build the benchmarks:
$ ghc --version The Glorious Glasgow Haskell Compilation System, version 6.12.3
$ make ghc -DTESTING -cpp -O2 -fregs-graph -rtsopts --make -fforce-recomp -i.. -o bench-Map Map.hs [1 of 7] Compiling Data.StrictPair ( ../Data/StrictPair.hs, ../Data/StrictPair.o ) [2 of 7] Compiling Data.Set ( ../Data/Set.hs, ../Data/Set.o ) [3 of 7] Compiling Data.Map.Base ( ../Data/Map/Base.hs, ../Data/Map/Base.o ) [4 of 7] Compiling Data.Map.Lazy ( ../Data/Map/Lazy.hs, ../Data/Map/Lazy.o ) [5 of 7] Compiling Data.Map.Strict ( ../Data/Map/Strict.hs, ../Data/Map/Strict.o ) [6 of 7] Compiling Data.Map ( ../Data/Map.hs, ../Data/Map.o ) [7 of 7] Compiling Main ( Map.hs, Map.o ) Linking bench-Map ... ghc -DTESTING -cpp -O2 -fregs-graph -rtsopts --make -fforce-recomp -i.. -o bench-Set Set.hs [1 of 2] Compiling Data.Set ( ../Data/Set.hs, ../Data/Set.o ) [2 of 2] Compiling Main ( Set.hs, Set.o ) Linking bench-Set ... ghc -DTESTING -cpp -O2 -fregs-graph -rtsopts --make -fforce-recomp -i.. -o bench-IntMap IntMap.hs
../Data/IntMap/Base.hs:195:44: parse error on input `#' make: *** [bench-IntMap] Error 1
Any tips on how to proceed?
You've hit a bug in GHC 6.12.3; it doesn't handle conditional language pragmas well. In the beginning of Data/IntMap/Base.hs we have: {-# LANGUAGE CPP #-} #if __GLASGOW_HASKELL__ {-# LANGUAGE MagicHash, DeriveDataTypeable, StandaloneDeriving #-} #endif and 6.12.3 does the wrong thing here. Should work in 7.0. -- Johan

* roconnor@theorem.ca
On Wed, 18 Jan 2012, Johan Tibell wrote:
IIRC you just replace the current functions with yours and run make in the benchmarks/ directory to compile the benchmark binaries (which use Criterion). Then simply run them.
I got an error trying to build the benchmarks:
Worked for me with GHC 7.0.4. The results are attached. In short, your version is indeed typically slower, up to a factor of 5 (for lookup). -- Roman I. Cheplyaka :: http://ro-che.info/

On Wed, Jan 18, 2012 at 1:51 PM, Roman Cheplyaka
* roconnor@theorem.ca
[2012-01-18 16:35:52-0500] On Wed, 18 Jan 2012, Johan Tibell wrote:
IIRC you just replace the current functions with yours and run make in the benchmarks/ directory to compile the benchmark binaries (which use Criterion). Then simply run them.
I got an error trying to build the benchmarks:
Worked for me with GHC 7.0.4.
The results are attached.
In short, your version is indeed typically slower, up to a factor of 5 (for lookup).
That's even slower than I expected. Doesn't mean that we cannot add the operation though (although we should think about exactly which operation(s) we need.) -- Johan

On Wed, 18 Jan 2012, Johan Tibell wrote:
On Wed, Jan 18, 2012 at 1:51 PM, Roman Cheplyaka
wrote: * roconnor@theorem.ca
[2012-01-18 16:35:52-0500] On Wed, 18 Jan 2012, Johan Tibell wrote:
IIRC you just replace the current functions with yours and run make in the benchmarks/ directory to compile the benchmark binaries (which use Criterion). Then simply run them.
I got an error trying to build the benchmarks:
Worked for me with GHC 7.0.4.
The results are attached.
In short, your version is indeed typically slower, up to a factor of 5 (for lookup).
That's even slower than I expected. Doesn't mean that we cannot add the operation though (although we should think about exactly which operation(s) we need.)
Ya, all I really want is one function: lens :: Key -> IntMap a -> (Maybe a -> IntMap a, Maybe a) for every container type. -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

* roconnor@theorem.ca
On Wed, 18 Jan 2012, Johan Tibell wrote:
On Wed, Jan 18, 2012 at 1:51 PM, Roman Cheplyaka
wrote: * roconnor@theorem.ca
[2012-01-18 16:35:52-0500] On Wed, 18 Jan 2012, Johan Tibell wrote:
IIRC you just replace the current functions with yours and run make in the benchmarks/ directory to compile the benchmark binaries (which use Criterion). Then simply run them.
I got an error trying to build the benchmarks:
Worked for me with GHC 7.0.4.
The results are attached.
In short, your version is indeed typically slower, up to a factor of 5 (for lookup).
That's even slower than I expected. Doesn't mean that we cannot add the operation though (although we should think about exactly which operation(s) we need.)
Ya, all I really want is one function:
lens :: Key -> IntMap a -> (Maybe a -> IntMap a, Maybe a)
for every container type.
Considering the benchmark results, it seems that implementing lens in terms of lookup and update actually is the fastest option we've seen? The difference between one and two traversals could become significant as the tree depth increases, but: * for Data.{Map,Set} it's logarithmic in the container size, so it won't be large for something that fits into memory * for Data.Int{Map,Set} it's bounded by the size of Int -- Roman I. Cheplyaka :: http://ro-che.info/

On Wed, 18 Jan 2012, Roman Cheplyaka wrote:
* roconnor@theorem.ca
[2012-01-18 16:35:52-0500] On Wed, 18 Jan 2012, Johan Tibell wrote:
IIRC you just replace the current functions with yours and run make in the benchmarks/ directory to compile the benchmark binaries (which use Criterion). Then simply run them.
I got an error trying to build the benchmarks:
Worked for me with GHC 7.0.4.
The results are attached.
In short, your version is indeed typically slower, up to a factor of 5 (for lookup).
Apparently calling nomatch on every Bin is really slow. Or so I'm guessing. -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

On Tue, 17 Jan 2012, Johan Tibell wrote:
On Tue, Jan 17, 2012 at 11:22 AM,
wrote: In fact, every other Map update operation should be derivable from the lens function.
We tried this an performance was terrible (although we used a zipper.)
Ah, sorry I misread your message the first time around. Ya, I don't mean to imply that every other operation should be derived from the lens function, just that, in principle, they could be. It would be interesting to compare how fast a lens based implemenation would be. (To be really fast, we probably want to CPS transform the store comonad so we don't have to decompose and recompose the resulting pair all the time.) But regardless, I want my fast lens primitive. :) -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''
participants (4)
-
Edward Z. Yang
-
Johan Tibell
-
roconnor@theorem.ca
-
Roman Cheplyaka