
dear traversable geniuses -- i am looking for "better" implementations of unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap m = (M.map fst m, M.map snd m) unliftMap :: (Ord a) => M.Map a (b -> c) -> M.Map a b -> M.Map a c unliftMap mf ma = M.mapWithKey (\k v -> mf M.! k $ v) ma the first is obviously inefficient as it traverses the map twice. the second just seems like it is some kind of fmap. any ideas? ben

On Fri, Jul 30, 2010 at 08:13:43PM -0700, Ben wrote:
unliftMap :: (Ord a) => M.Map a (b -> c) -> M.Map a b -> M.Map a c unliftMap mf ma = M.mapWithKey (\k v -> mf M.! k $ v) ma
I always thought a useful map primitive would be unionWithJoin :: (a -> b -> c) -- combine values that appear in both maps -> (b -> c) -- value appears in second map but not the first -> (a -> c) -- value appears in first map but not second -> Map k a -> Map k b -> Map k c along with the 'WithKey' and 'Maybe' variants. John -- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/

Ben wrote:
unliftMap :: (Ord a) => M.Map a (b -> c) -> M.Map a b -> M.Map a c unliftMap mf ma = M.mapWithKey (\k v -> mf M.! k $ v) ma
the first is obviously inefficient as it traverses the map twice. the second just seems like it is some kind of fmap.
It's the (<*>) operator of Applicative, which is equivalent to `ap` on Monad (though (<*>) may be more efficient for certain monads). Or rather, it *should* be the (<*>) operator. The use of (!) means it will explode when when the keys of ma are not a subset of the keys of mf. Really, it should be apMap mf ma = M.mapMaybeWithKey (\k f -> f <$> M.lookup k ma) mf Though that's not any more efficient. It will be somewhat slower because of the need to remove the spine for deleted elements, but the difference should be marginal. The similarity between (<*>) and fmap aka (<$>) is certainly notable. However, they are quite distinct: (<$>) :: Functor f => (a -> b) -> f a -> f b (<*>) :: Applicative f => f (a -> b) -> f a -> f b (=<<) :: Monad f => (a -> f b) -> f a -> f b Each one gives more power than the previous because it can accept functions with more structure. I.e. fmap doesn't allow any structure; (<*>) allows top-level structure and so it must be able to perform a sort of multiplication (e.g., intersection or cross-product) on f-structures; and (=<<) allows embedded structure, and so it must be able to perform a kind of extension (e.g., the multiplication of a semiring) on f-structures. -- Live well, ~wren

Ben wrote:
dear traversable geniuses --
i am looking for "better" implementations of
unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap m = (M.map fst m, M.map snd m)
I don't think you can give a more efficient implementation using the public interface of Data.Map. You need to have a sort of mapping function that allows you to thread them together, either via continuations or via a primitive: splitMap :: (a -> (b,c)) -> Map k a -> (Map k b, Map k c) This splitMap and the mapEither primitive could be combined: data Or a b = Fst a | Both a b | Snd b eitherOr (Left a) = Fst a eitherOr (Right b) = Snd a mapOr :: (a -> Or b c) -> Map k a -> (Map k b, Map k c) mapEither f = mapOr (eitherOr . f) splitMap f = mapOr (uncurry Both) And the type of John's primitive could be prettied up: unionWithJoin :: (Or a b -> c) -> Map k a -> Map k b -> Map k c -- Live well, ~wren

On 31 July 2010 06:45, wren ng thornton
Ben wrote:
dear traversable geniuses --
i am looking for "better" implementations of
unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap m = (M.map fst m, M.map snd m)
I don't think you can give a more efficient implementation using the public interface of Data.Map. You need to have a sort of mapping function that allows you to thread them together, either via continuations or via a primitive:
Unless I'm missing something. This one has one traversal... unzipMap :: Ord a => M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap = M.foldrWithKey fn (M.empty,M.empty) where fn k a (m1,m2) = (M.insert k (fst a) m1, M.insert k (snd a) m2)

Stephen Tetley wrote:
wren ng thornton wrote:
Ben wrote:
unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap m = (M.map fst m, M.map snd m)
I don't think you can give a more efficient implementation using the public interface of Data.Map. You need to have a sort of mapping function that allows you to thread them together, either via continuations or via a primitive:
Unless I'm missing something. This one has one traversal...
unzipMap :: Ord a => M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap = M.foldrWithKey fn (M.empty,M.empty) where fn k a (m1,m2) = (M.insert k (fst a) m1, M.insert k (snd a) m2)
Well, that's one traversal of the original map, but you have to traverse the new maps repeatedly with all those M.insert calls. And since Data.Map is a balanced tree, that could lead to a whole lot of work rebalancing things. However, because we are not altering the set of keys, we are guaranteed that the structure of both new maps will be identical to the structure of the old map. Therefore, with the right primitives, we can keep one finger in each of the three maps and traverse them all in parallel without re-traversing any part of the spine. (The Either and Or variants will have some retraversal as the smart constructors prune out the spine leading to deleted keys. But this is, arguably, necessary.) -- Live well, ~wren

On 31/07/10 12:13, wren ng thornton wrote:
Stephen Tetley wrote:
wren ng thornton wrote:
Ben wrote:
unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap m = (M.map fst m, M.map snd m)
I don't think you can give a more efficient implementation using the public interface of Data.Map. You need to have a sort of mapping function that allows you to thread them together, either via continuations or via a primitive:
Unless I'm missing something. This one has one traversal...
unzipMap :: Ord a => M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap = M.foldrWithKey fn (M.empty,M.empty) where fn k a (m1,m2) = (M.insert k (fst a) m1, M.insert k (snd a) m2)
Well, that's one traversal of the original map, but you have to traverse the new maps repeatedly with all those M.insert calls. And since Data.Map is a balanced tree, that could lead to a whole lot of work rebalancing things.
However, because we are not altering the set of keys, we are guaranteed that the structure of both new maps will be identical to the structure of the old map. Therefore, with the right primitives, we can keep one finger in each of the three maps and traverse them all in parallel without re-traversing any part of the spine. (The Either and Or variants will have some retraversal as the smart constructors prune out the spine leading to deleted keys. But this is, arguably, necessary.)
Why not something like this (with the correctness proof as an exercise): \begin{code} import Data.Map (Map) import qualified Data.Map as M unzipMap :: Map a (b, c) -> (Map a b, Map a c) unzipMap m = let (ab, ac) = unzip . map fiddle . M.toAscList $ m in (M.fromDistinctAscList ab, M.fromDistinctAscList ac) where fiddle :: (x, (y, z)) -> ((x, y), (x, z)) fiddle (x, (y, z)) = ((x, y), (x, z)) \end{code} (and I now see after writing this that Henning Thielemann wrote much the same some hours ago, however there are some slight differences so I'm sending this anyway) Claude -- http://claudiusmaximus.goto10.org

Claude Heiland-Allen wrote:
On 31/07/10 12:13, wren ng thornton wrote:
Stephen Tetley wrote:
wren ng thornton wrote:
Ben wrote:
unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap m = (M.map fst m, M.map snd m)
I don't think you can give a more efficient implementation using the public interface of Data.Map. You need to have a sort of mapping function that allows you to thread them together, either via continuations or via a primitive:
Unless I'm missing something. This one has one traversal...
unzipMap :: Ord a => M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap = M.foldrWithKey fn (M.empty,M.empty) where fn k a (m1,m2) = (M.insert k (fst a) m1, M.insert k (snd a) m2)
Well, that's one traversal of the original map, but you have to traverse the new maps repeatedly with all those M.insert calls. And since Data.Map is a balanced tree, that could lead to a whole lot of work rebalancing things.
However, because we are not altering the set of keys, we are guaranteed that the structure of both new maps will be identical to the structure of the old map. Therefore, with the right primitives, we can keep one finger in each of the three maps and traverse them all in parallel without re-traversing any part of the spine. (The Either and Or variants will have some retraversal as the smart constructors prune out the spine leading to deleted keys. But this is, arguably, necessary.)
Why not something like this (with the correctness proof as an exercise):
\begin{code}
import Data.Map (Map) import qualified Data.Map as M
unzipMap :: Map a (b, c) -> (Map a b, Map a c) unzipMap m = let (ab, ac) = unzip . map fiddle . M.toAscList $ m in (M.fromDistinctAscList ab, M.fromDistinctAscList ac) where fiddle :: (x, (y, z)) -> ((x, y), (x, z)) fiddle (x, (y, z)) = ((x, y), (x, z))
\end{code}
That O(n)+O(n) is much better than the O(n)*2*O(log n) foldrWithKey/insert version. But it's still about the same as the original 2*O(n) map fst/map snd version. With the primitive I mentioned we could reduce the constant factor by about half. -- Live well, ~wren

wren ng thornton wrote:
That O(n)+O(n) is much better than the O(n)*2*O(log n) foldrWithKey/insert version. But it's still about the same as the original 2*O(n) map fst/map snd version. With the primitive I mentioned we could reduce the constant factor by about half.
Oops, the foldrWithKey/insert should be O(n) + n*2*O(log n). -- Live well, ~wren

On 31 July 2010 12:13, wren ng thornton
Well, that's one traversal of the original map, but you have to traverse the new maps repeatedly with all those M.insert calls. And since Data.Map is a balanced tree, that could lead to a whole lot of work rebalancing things.
Thanks. Indeed, I was missing that the traversal is cheap compared to the rebuilding. Although I haven't calculated the Big-O scores suspect that original post will actually be the best, the solutions that metamorph into a list and out again look like they're doing needless extra work.

On 31/07/10 13:49, Stephen Tetley wrote:
Although I haven't calculated the Big-O scores suspect that original post will actually be the best, the solutions that metamorph into a list and out again look like they're doing needless extra work.
They're both O(size m) time, and yes the original is best (not least for its simplicity and elegance); I now think that (on my part) it was a case of following optimisation strategies without thinking hard enough whether they apply: ie, traversing only once can be beneficial for space reasons under certain circumstances [1] But as Data.Map is spine-strict, there is no space saving here by traversing only once, as the spine is already there taking up O(size m) space before we even start traversing (whereas with lazy lists the spine might not be taking up any space yet). [1] to give a classic example: mean :: Fractional a => [a] -> a mean xs = sum xs / genericLength xs which often consumes O(length xs) space, reducible to O(1) if only one traversal is performed. Claude -- http://claudiusmaximus.goto10.org

On Fri, 30 Jul 2010, Ben wrote:
dear traversable geniuses --
i am looking for "better" implementations of
unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap m = (M.map fst m, M.map snd m)
Maybe: mapPair (M.fromAscList, M.fromAscList) $ unzip $ map (\(a,(b,c)) -> ((a,b), (a,c))) $ M.toAscList m

On Fri, Jul 30, 2010 at 08:13:43PM -0700, Ben wrote:
dear traversable geniuses --
i am looking for "better" implementations of
unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap m = (M.map fst m, M.map snd m)
unliftMap :: (Ord a) => M.Map a (b -> c) -> M.Map a b -> M.Map a c unliftMap mf ma = M.mapWithKey (\k v -> mf M.! k $ v) ma
the first is obviously inefficient as it traverses the map twice. the second just seems like it is some kind of fmap.
The second one assumes that every key in ma is also in mf. A generalization without that assumption is unliftMap = intersectionWith id As Wren said, it looks like a <*> operator, but in this case there's no corresponding pure function. To make the laws work, pure x would have to be a map that took every key to x.

dear applicative geniuses --
thanks for all the help. some comments :
1) the fromList / unzip versions are nice but as others have pointed
out, construction is more expensive than traversal (or traversal w/
copying, like Data.Map.map.) hopefully the maintainers of Data.Map et
al will consider adding one-traversal unzip functions.
2) i too would love John Meacham's unionWithJoin function. i might
call that "outerJoin" myself. there was discussion about this at some
point in time on the libraries list, i'm not sure what happened to it.
3) wren, thanks for the (applicative) pointer. when i tried to
interrogate hoogle it pointed me there too -- though i had to boil
down my question several times. i think it went something like this
Map a (b -> c) -> Map a b -> Map a c
--> no answer
m a (b -> c) -> m a b -> m a c
--> no answer
m (b -> c) -> m b -> m c
--> <*> and a million other hits
at which point i'd practically answered the question myself. this is
not a knock on hoogle -- i often have this experience, it seems to
lead me in the right direction Socratically.
4) ross, i had to ask ghci to even believe your code type-checks! i
didn't realize currying worked that way -- i've never thought to pass
in functions of different arities. as an experiment, i tried
Prelude Data.Map> :t intersectionWith 1
intersectionWith 1
:: (Num (a -> b -> c), Ord k) => Map k a -> Map k b -> Map k c
Prelude Data.Map> :t intersectionWith (const 1)
intersectionWith (const 1)
:: (Num (b -> c), Ord k) => Map k a -> Map k b -> Map k c
Prelude Data.Map> :t intersectionWith id
intersectionWith id
:: (Ord k) => Map k (b -> c) -> Map k b -> Map k c
Prelude Data.Map> :t intersectionWith (\a b c -> c)
intersectionWith (\a b c -> c)
:: (Ord k) => Map k a -> Map k b -> Map k (t -> t)
all of which make sense to me now, but still honestly blow my mind!
best regards, b
ps actually the first two don't make much sense to me, when i think
about it.....
On Fri, Jul 30, 2010 at 8:13 PM, Ben
dear traversable geniuses --
i am looking for "better" implementations of
unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c) unzipMap m = (M.map fst m, M.map snd m)
unliftMap :: (Ord a) => M.Map a (b -> c) -> M.Map a b -> M.Map a c unliftMap mf ma = M.mapWithKey (\k v -> mf M.! k $ v) ma
the first is obviously inefficient as it traverses the map twice. the second just seems like it is some kind of fmap.
any ideas?
ben

Ben wrote:
4) ross, i had to ask ghci to even believe your code type-checks! i didn't realize currying worked that way -- i've never thought to pass in functions of different arities. as an experiment, i tried
N.B. intersectionWith id == intersectionWith ($), which might cause it to make a bit more sense. ($) is an infix version of 'id' restricted to function types. But then, ($) is a weird combinator; e.g., flip($) is the T combinator for type lifting.
Prelude Data.Map> :t intersectionWith 1 intersectionWith 1 :: (Num (a -> b -> c), Ord k) => Map k a -> Map k b -> Map k c
[...]
ps actually the first two don't make much sense to me, when i think about it.....
In order to allow overloading of literals, discrete numeric literals are parsed as if wrapped in fromInteger(_::Integer) and continuous numeric literals are parsed as if wrapped in fromRational(_::Rational). Thus, Prelude> :t 1 1 :: (Num t) => t Prelude> :t 1.0 1.0 :: (Fractional t) => t So, since intersectionWith is expecting an (a->b->c) we figure out that "1" must be interpreted as belonging to that type, which means we need a Num(a->b->c) instance. -- Live well, ~wren
participants (7)
-
Ben
-
Claude Heiland-Allen
-
Henning Thielemann
-
John Meacham
-
Ross Paterson
-
Stephen Tetley
-
wren ng thornton