
The old FiniteMap module contains the addListToFM* functions that are missing in Data.Map. But how to implement it? First answer: insertList_foldl :: Ord k => [(k,a)] -> Map.Map k a -> Map.Map k a insertList_foldl = flip (foldl ins) where fluc = flip . uncurry ins = fluc Map.insert Second answer: insertList_union :: Ord k => [(k,a)] -> Map.Map k a -> Map.Map k a insertList_union kas = Map.union (Map.fromList kas) Situation and Problem: We already have a map of size n and want to insert a list with m elements. Then we look at the documentation and one gets: * For the first answer we do need O(sum_{i=0}^{m-1}log(n+i)) = O(log((m+n)!/n!)) operations. * For the sencond answer we do need O(m*log(m))+O(n+m) = O(m*log(m)+n+m) operations. These terms are hard to compare but a plot shows that insertList_foldl should be faster than insertList_union. But a testprogram shows that insertList_union is faster (and needs less memory) whatever the sizes of n and m are. So my questions are: Why are the insertList* functions missing? How would the library implementors implement insertList* and why? thanks, -- -- Mirko Rahn -- Tel +49-721 608 7504 -- --- http://liinwww.ira.uka.de/~rahn/ ---

On Wed, Feb 23, 2005 at 12:16:13PM +0100, Mirko Rahn wrote:
The old FiniteMap module contains the addListToFM* functions that are missing in Data.Map. But how to implement it?
Here's a cheatsheet I use for converting to Data.Map: -- Definitions of FiniteMap operations in terms of Map, for porting purposes. module FiniteMap ( FiniteMap, emptyFM, unitFM, listToFM, lookupFM, lookupWithDefaultFM, elemFM, addToFM, addToFM_C, addListToFM, addListToFM_C, delFromFM, delListFromFM, plusFM, plusFM_C, fmToList, keysFM, eltsFM, sizeFM, isEmptyFM, minusFM, foldFM, intersectFM, intersectFM_C, mapFM, filterFM, foldFM_GE, fmToList_GE, keysFM_GE, eltsFM_GE, foldFM_LE, fmToList_LE, keysFM_LE, eltsFM_LE, minFM, maxFM ) where import Data.Map as Map import Data.List (foldl') type FiniteMap = Map emptyFM :: FiniteMap k a emptyFM = empty unitFM :: k -> a -> FiniteMap k a unitFM = singleton listToFM :: (Ord k) => [(k,a)] -> FiniteMap k a listToFM = fromList addToFM :: (Ord k) => FiniteMap k a -> k -> a -> FiniteMap k a addToFM m k v = insert k v m addListToFM :: (Ord k) => FiniteMap k a -> [(k,a)] -> FiniteMap k a addListToFM m kvs = foldl' add m kvs where add m' (k,v) = insert k v m' addToFM_C :: (Ord k) => (a -> a -> a) -> FiniteMap k a -> k -> a -> FiniteMap k a addToFM_C c m k v = insertWith (flip c) k v m addListToFM_C :: (Ord k) => (a -> a -> a) -> FiniteMap k a -> [(k,a)] -> FiniteMap k a addListToFM_C c m kvs = foldl' add m kvs where add m' (k,v) = insertWith (flip c) k v m' delFromFM :: (Ord k) => FiniteMap k a -> k -> FiniteMap k a delFromFM m k = delete k m delListFromFM :: (Ord k) => FiniteMap k a -> [k] -> FiniteMap k a delListFromFM m keys = foldl' (flip delete) m keys plusFM :: (Ord k) => FiniteMap k a -> FiniteMap k a -> FiniteMap k a plusFM = flip union plusFM_C :: (Ord k) => (a -> a -> a) -> FiniteMap k a -> FiniteMap k a -> FiniteMap k a plusFM_C = unionWith minusFM :: (Ord k) => FiniteMap k a -> FiniteMap k b -> FiniteMap k a minusFM = difference intersectFM :: (Ord k) => FiniteMap k a -> FiniteMap k a -> FiniteMap k a intersectFM = flip intersection intersectFM_C :: (Ord k) => (a -> b -> c) -> FiniteMap k a -> FiniteMap k b -> FiniteMap k c intersectFM_C = intersectionWith foldFM :: (k -> a -> b -> b) -> b -> FiniteMap k a -> b foldFM = foldWithKey mapFM :: (k -> elt1 -> elt2) -> FiniteMap k elt1 -> FiniteMap k elt2 mapFM = mapWithKey filterFM :: (Ord k) => (k -> a -> Bool) -> FiniteMap k a -> FiniteMap k a filterFM = filterWithKey sizeFM :: FiniteMap k a -> Int sizeFM = size isEmptyFM :: FiniteMap k a -> Bool isEmptyFM = Map.null elemFM :: (Ord k) => k -> FiniteMap k a -> Bool elemFM = member lookupFM :: (Ord k) => FiniteMap k a -> k -> Maybe a lookupFM m k = Map.lookup k m lookupWithDefaultFM :: (Ord k) => FiniteMap k a -> a -> k -> a lookupWithDefaultFM m v k = findWithDefault v k m fmToList :: FiniteMap k a -> [(k,a)] fmToList = toList keysFM :: FiniteMap k a -> [k] keysFM = keys eltsFM :: FiniteMap k a -> [a] eltsFM = elems -- NB: if == is less discriminating than true equality, then these are -- slightly different from the originals: they use the key supplied, -- rather than the one in the tree that's equal to it. foldFM_GE :: Ord k => (k -> a -> b -> b) -> b -> k -> FiniteMap k a -> b foldFM_GE f z k m | Map.null m = z | otherwise = case splitLookup k m of (_, Nothing, m_gt) -> foldWithKey f z m_gt (_, Just x, m_gt) -> f k x (foldWithKey f z m_gt) fmToList_GE :: Ord k => FiniteMap k a -> k -> [(k,a)] fmToList_GE m k | Map.null m = [] | otherwise = case splitLookup k m of (_, Nothing, m_gt) -> toList m_gt (_, Just x, m_gt) -> (k,x) : toList m_gt keysFM_GE :: Ord k => FiniteMap k a -> k -> [k] keysFM_GE m k | Map.null m = [] | otherwise = case splitLookup k m of (_, Nothing, m_gt) -> keys m_gt (_, Just _, m_gt) -> k : keys m_gt eltsFM_GE :: Ord k => FiniteMap k a -> k -> [a] eltsFM_GE m k | Map.null m = [] | otherwise = case splitLookup k m of (_, Nothing, m_gt) -> elems m_gt (_, Just x, m_gt) -> x : elems m_gt foldFM_LE :: Ord k => (k -> a -> b -> b) -> b -> k -> FiniteMap k a -> b foldFM_LE f z k m | Map.null m = z | otherwise = case splitLookup k m of (m_lt, Nothing, _) -> foldWithKey f z m_lt (m_lt, Just x, _) -> foldWithKey f (f k x z) m_lt fmToList_LE :: Ord k => FiniteMap k a -> k -> [(k,a)] fmToList_LE m k | Map.null m = [] | otherwise = case splitLookup k m of (m_lt, Nothing, _) -> toList m_lt (m_lt, Just x, _) -> toList m_lt ++ [(k,x)] keysFM_LE :: Ord k => FiniteMap k a -> k -> [k] keysFM_LE m k | Map.null m = [] | otherwise = case splitLookup k m of (m_lt, Nothing, _) -> keys m_lt (m_lt, Just x, _) -> keys m_lt ++ [k] eltsFM_LE :: Ord k => FiniteMap k a -> k -> [a] eltsFM_LE m k | Map.null m = [] | otherwise = case splitLookup k m of (m_lt, Nothing, _) -> elems m_lt (m_lt, Just x, _) -> elems m_lt ++ [x] minFM :: Ord k => FiniteMap k a -> Maybe k minFM m | Map.null m = Nothing | otherwise = Just (fst (findMin m)) maxFM :: Ord k => FiniteMap k a -> Maybe k maxFM m | Map.null m = Nothing | otherwise = Just (fst (findMax m))

Ross Paterson wrote:
import Data.List (foldl')
addListToFM :: (Ord k) => FiniteMap k a -> [(k,a)] -> FiniteMap k a addListToFM m kvs = foldl' add m kvs where add m' (k,v) = insert k v m'
Okay, using the strict foldl' the situation changes: Now
insertList_foldl is faster for n>m and slower for n

Would it not be more efficient to put the list into a new map, and then merge the old and new maps (a bit like the sort and merge technique used in old tape databases)... Keean. Mirko Rahn wrote:
The old FiniteMap module contains the addListToFM* functions that are missing in Data.Map. But how to implement it?
First answer:
insertList_foldl :: Ord k => [(k,a)] -> Map.Map k a -> Map.Map k a insertList_foldl = flip (foldl ins) where fluc = flip . uncurry ins = fluc Map.insert
Second answer:
insertList_union :: Ord k => [(k,a)] -> Map.Map k a -> Map.Map k a insertList_union kas = Map.union (Map.fromList kas)
Situation and Problem:
We already have a map of size n and want to insert a list with m elements. Then we look at the documentation and one gets:
* For the first answer we do need O(sum_{i=0}^{m-1}log(n+i)) = O(log((m+n)!/n!)) operations. * For the sencond answer we do need O(m*log(m))+O(n+m) = O(m*log(m)+n+m) operations.
These terms are hard to compare but a plot shows that insertList_foldl should be faster than insertList_union.
But a testprogram shows that insertList_union is faster (and needs less memory) whatever the sizes of n and m are.
So my questions are:
Why are the insertList* functions missing? How would the library implementors implement insertList* and why?
thanks,

Would it not be more efficient to put the list into a new map, and then merge the old and new maps
Sometimes yes. The function
insertList_union :: Ord k => [(k,a)] -> Map.Map k a -> Map.Map k a insertList_union kas = Map.union (Map.fromList kas)
is faster on inserting m>n new elements into a map of size n and slower
on inserting m

Mirko Rahn wrote:
These terms are hard to compare but a plot shows that insertList_foldl should be faster than insertList_union.
But a testprogram shows that insertList_union is faster (and needs less memory) whatever the sizes of n and m are.
So my questions are:
Why are the insertList* functions missing?
Just my laziness :) Maybe someone can add them to the libraries?
How would the library implementors implement insertList* and why?
I think I would use the "union" version as it is the simplest. However, since the "foldl'" version seems faster, maybe one should use that anyway. (Be careful though -- the union on IntMap's is very fast -- it'll be hard to beat that even using foldl'). -- Daan.
thanks,
participants (4)
-
Daan Leijen
-
Keean Schupke
-
Mirko Rahn
-
Ross Paterson