Implementing ParseChart with Data.Map

Hi, I have to write ParseChart implementation with Data.Map/Set. The chart is type like this: type Chart k v = Map k (Set v) now I need operation like: insert :: k -> v -> Chart k v -> Maybe (Chart k v) where the result is (Just _) if the (k,v) is actually added to the chart or Nothing if it was already there and nothing have to be done. The straight forward implementation is: case Map.lookup k chart of Nothing -> Just (Map.insert k (Set.singleton v) chart) Just set | Set.member v set -> Nothing | otherwise -> Just (Map.insert k (Set.insert v set) chart) The problem with this is that both the Map and the Set are traversed twice. The first time from lookup/member and the second time from insert. Does someone have an idea how to do this with the current libraries? There are the Map.updateLookupWithKey and the Map.alter functions: updateLookupWithKey :: Ord k => (k -> a -> Maybe a) -> k -> Map k a -> (Maybe a,Map k a) alter :: Ord k => (Maybe a -> Maybe a) -> k -> Map k a -> Map k a which are the closest that I need. The problem is that the first doesn't allow the client function to be called if there isn't matching key in the map and the second is doesn't allow to return value from the client function. What I really need is an alterLookúp function: alterLookúp :: Ord k => (Maybe a -> (b,Maybe a)) -> k -> Map k a -> (b,Map k a) The chart manipulation is in the tight loop of my application so I need fast code. Any other ideas? Regards, Krasimir

Hi Krasimir,
insert :: k -> v -> Chart k v -> Maybe (Chart k v)
where the result is (Just _) if the (k,v) is actually added to the chart or Nothing if it was already there and nothing have to be done.
insertLookupWithKey :: Ord k => (k -> a -> a -> a) -> k -> a -> Map k a -> (Maybe a, Map k a) This should provide the information you need, without the double traversal of the data structure. Thanks Neil

Not completely! This is a possible implementation:
case insertLookupWithKey (\_ -> Set.union) k (Set.singleton v) chart of
(Nothing, chart) -> Just chart
(Just set, chart) | Set.member v set -> Nothing
| otherwise -> Just chart
but notice that the set is still traversed twice.
On Tue, Jun 3, 2008 at 12:07 AM, Neil Mitchell
Hi Krasimir,
insert :: k -> v -> Chart k v -> Maybe (Chart k v)
where the result is (Just _) if the (k,v) is actually added to the chart or Nothing if it was already there and nothing have to be done.
insertLookupWithKey :: Ord k => (k -> a -> a -> a) -> k -> a -> Map k a -> (Maybe a, Map k a)
This should provide the information you need, without the double traversal of the data structure.
Thanks
Neil

Hi
case insertLookupWithKey (\_ -> Set.union) k (Set.singleton v) chart of (Nothing, chart) -> Just chart (Just set, chart) | Set.member v set -> Nothing | otherwise -> Just chart
but notice that the set is still traversed twice.
Yes, I missed that bit. I don't see any way of reducing that, other than using an unsafePerformIO and IORef to track the combining function - which is probably a really bad idea. Thanks Neil

Krasimir Angelov wrote:
but notice that the set is still traversed twice.
Neil Mitchell wrote:
I don't see any way of reducing that
Yeah, it looks like the Data.Set (and Data.IntSet) library is missing the functions insertMember :: Ord a => a -> Set a -> (Bool, Set a) deleteMember :: Ord a => a -> Set a -> (Bool, Set a) analagous to splitMember. It should be easy to write those functions. If you do that for yourself, consider making a patch of them and submitting the patch as a library proposal. But anyway, a set lookup is very cheap, even for a set that is quite large. You may want to try just doing the extra lookup, it might be good enough for you. At least you eliminated the Map lookup. Regards, Yitz

I actually made my own copy of Data.Map and added an extra:
alterLookúp :: Ord k => (Maybe a -> (b,Maybe a)) -> k -> Map k a -> (b,Map k a)
function. I also changed my data type to:
type ParseChart k v = Map k (Map v ())
so I don't have to copy the Data.Set module also. Unfortunately this
doesn't give much better performance - 5734 msec instead of 5828 msec.
Fortunately I found that there is a way to avoid to use Map at all in
one common case. This gave me time about 5024 msec.
Regards,
Krasimir
On 6/3/08, Yitzchak Gale
Krasimir Angelov wrote:
but notice that the set is still traversed twice.
Neil Mitchell wrote:
I don't see any way of reducing that
Yeah, it looks like the Data.Set (and Data.IntSet) library is missing the functions
insertMember :: Ord a => a -> Set a -> (Bool, Set a) deleteMember :: Ord a => a -> Set a -> (Bool, Set a)
analagous to splitMember. It should be easy to write those functions. If you do that for yourself, consider making a patch of them and submitting the patch as a library proposal.
But anyway, a set lookup is very cheap, even for a set that is quite large. You may want to try just doing the extra lookup, it might be good enough for you. At least you eliminated the Map lookup.
Regards, Yitz

On Mon, 2008-06-02 at 22:35 +0200, Krasimir Angelov wrote:
The problem with this is that both the Map and the Set are traversed twice. The first time from lookup/member and the second time from insert. Does someone have an idea how to do this with the current libraries?
The chart manipulation is in the tight loop of my application so I need fast code. Any other ideas?
I'm not sure if it helps your application but we had a discussion on #haskell the other day about Data.Map and were talking about a general insert/modify/delete operator like: modify :: k -> Map k e -> (e, Maybe e -> Map k e) so it's a lookup that returns the element at k and also a continuation that lets you rebuild a new map with an altered element. I guess that doesn't account for the element not existing. There's probably a generalisation that does. Duncan

Duncan Coutts wrote:
modify :: k -> Map k e -> (e, Maybe e -> Map k e)
so it's a lookup that returns the element at k and also a continuation that lets you rebuild a new map with an altered element. I guess that doesn't account for the element not existing. There's probably a generalisation that does.
isn't it just adding the necessary Maybe?: modify :: k -> Map k e -> (Maybe e, Maybe e -> Map k e) ? -Isaac

Hello Krasimir, Krasimir Angelov wrote:
Hi,
I have to write ParseChart implementation with Data.Map/Set. The chart is type like this:
type Chart k v = Map k (Set v)
now I need operation like:
insert :: k -> v -> Chart k v -> Maybe (Chart k v)
where the result is (Just _) if the (k,v) is actually added to the chart or Nothing if it was already there and nothing have to be done. The straight forward implementation is:
case Map.lookup k chart of Nothing -> Just (Map.insert k (Set.singleton v) chart) Just set | Set.member v set -> Nothing | otherwise -> Just (Map.insert k (Set.insert v set) chart)
You can do this quite easily with the AVL library, something like this (untested code) import Data.Cordering import Data.Tree.AVL type Chart k v = AVL (k, AVL v) insert :: (Ord k, Ord v) => k -> v -> Chart k v -> Maybe (Chart k v) insert k v tk = case genOpenPathWith cmpk tk of EmptyBP pthk -> Just $! insertPath pthk (k, singleton v) tk FullBP pthk tv -> case genOpenPath (compare v) tv of EmptyBP pthv -> let tv' = insertPath pthv v tv in tv' `seq` (Just $! writePath pthk (k, tv') tk) FullBP _ _ -> Nothing where cmpk (k',tv) = case compare k k' of LT -> Lt EQ -> Eq tv GT -> Gt ..or something like that (maybe you don't want all that strictness) The insertPath & writePath functions do involve a second traversal but do not repeat all the comparisons. Also, provided not too much has happened in between, they should be very fast as the nodes on the path are probably still in cache. The important thing is that in the case where Nothing is returned you'll have burned very little heap. Regards -- Adrian Hey
participants (6)
-
Adrian Hey
-
Duncan Coutts
-
Isaac Dupree
-
Krasimir Angelov
-
Neil Mitchell
-
Yitzchak Gale