Data.Map, Data.Set working together

Cale Gibbard

5 Apr 2005 5 Apr '05

8:51 a.m.

It just crossed my mind that due to the apparent similarity in implementation it might be possible to implement some extra functions on Maps which take Sets as parameters and which are more efficient than first going through lists. For instance: restrict :: Set a -> Map a b -> Map a b which would restrict the map's domain to the given set -- basically set intersection, but one of the objects is a map. fromFunction :: Set a -> (a -> b) -> Map a b Build a finite map in the obvious way from a set and a function. (basically, restrict the function on the potentially infinite domain a to the finite one given by the set, and record the result as a finite map) Depending on the way things work internally, perhaps the constructed Map could inherit the tree structure of the Set directly. Thoughts? I haven't really looked at the code, but it seems plausible that these sorts of operations where a Set and Map are involved could be done in a better fashion than going through association lists. - Cale

Show replies by date

Benjamin Franksen

5 Apr 5 Apr

12:48 p.m.

On Tuesday 05 April 2005 10:51, Cale Gibbard wrote:

...

It just crossed my mind that due to the apparent similarity in implementation it might be possible to implement some extra functions on Maps which take Sets as parameters and which are more efficient than first going through lists. For instance:

restrict :: Set a -> Map a b -> Map a b which would restrict the map's domain to the given set -- basically set intersection, but one of the objects is a map.

fromFunction :: Set a -> (a -> b) -> Map a b Build a finite map in the obvious way from a set and a function. (basically, restrict the function on the potentially infinite domain a to the finite one given by the set, and record the result as a finite map) Depending on the way things work internally, perhaps the constructed Map could inherit the tree structure of the Set directly.

Thoughts? I haven't really looked at the code, but it seems plausible that these sorts of operations where a Set and Map are involved could be done in a better fashion than going through association lists.

You might be interested in using http://www.stud.tu-ilmenau.de/~robertw/dessy/fun/ instead of Data.*. It is a framework for abstract collections (with interesting implementations, too) where the things you want come out almost automatically. For instance, in the interface http://www.stud.tu-ilmenau.de/~robertw/dessy/fun/Collections.hs you'll find these definitions: class ( Collection coll (a, b) , Eq a, Eq b -- , Ord a, Ord b ) => AssociativeCollection coll a b where domain :: Collection coll' a => coll (a, b) -> coll' a range :: Collection coll' b => coll (a, b) -> coll' b -- [...] (<|), (<-|) :: OrderedSet set a => set a -> coll (a, b) -> coll (a, b) (|>), (|->) :: OrderedSet set b => coll (a, b) -> set b -> coll (a, b) set <| rel = filter ((set`has`).fst) rel set <-| rel = filter (not.(set`has`).fst) rel rel |> set = filter ((set`has`).snd) rel rel |-> set = filter (not.(set`has`).snd) rel Ben

Ketil Malde

7 Apr 7 Apr

3:19 a.m.

I need to search intervals -- i.e. I'd like to store integer keys, and looking up any integer should return the largest key less than my query. (Ideally with it's associated datum, but that could of course be stored in a separate map). Do any of the set/map implementations support this? -kzm -- If I haven't seen further, it is by standing in the footprints of giants

Adrian Hey

1:08 p.m.

On Thursday 07 Apr 2005 4:19 am, Ketil Malde wrote:

...

I need to search intervals -- i.e. I'd like to store integer keys, and looking up any integer should return the largest key less than my query. (Ideally with it's associated datum, but that could of course be stored in a separate map).

Do any of the set/map implementations support this?

The Zipper for the AVL library supports this. Unfortunately I haven't published the code yet :-(. The relevant function is: -- | Attempts to open a sorted AVL tree at the greatest element which -- is less than or equal to some value, according to the supplied comparison. tryOpenByLE :: (a -> b -> Ordering) -> a -> AVL b -> Maybe (ZAVL b) You can then step left or right through the tree (inspecting modifying/inserting/deleting elements as you require). Regards -- Adrian Hey

Adrian Hey

5 Apr 5 Apr

12:57 p.m.

Hello, On Tuesday 05 Apr 2005 9:51 am, Cale Gibbard wrote:

...

It just crossed my mind that due to the apparent similarity in implementation it might be possible to implement some extra functions on Maps which take Sets as parameters and which are more efficient than first going through lists. For instance:

My AVL library was designed to allow easy implementation of things like this (I.E. Uses general purpose AVL tree type and functions rather than specialised Set/Map types). http://homepages.nildram.co.uk/~ahey/HLibs/Data.Tree.AVL/

...

restrict :: Set a -> Map a b -> Map a b which would restrict the map's domain to the given set -- basically set intersection, but one of the objects is a map.

With the AVL library this would use.. genIntersection :: (a -> b -> COrdering c) -> AVL a -> AVL b -> AVL c -- Assumes both trees are sorted restrict :: Ord a => AVL a -> AVL (a,b) -> AVL (a,b) restrict = genIntersection ccmp where ccmp a p@(a',_) = case compare a a' of LT -> Lt EQ -> Eq p GT -> Gt

...

fromFunction :: Set a -> (a -> b) -> Map a b Build a finite map in the obvious way from a set and a function. (basically, restrict the function on the potentially infinite domain a to the finite one given by the set, and record the result as a finite map) Depending on the way things work internally, perhaps the constructed Map could inherit the tree structure of the Set directly.

With the AVL library this would just be.. -- Assumes tree is sorted fromFunction :: AVL a -> (a -> b) -> AVL (a,b) fromFuncton set f = mapAVL' (\a -> (a, f a)) set -- uses strict map or maybe a stricter version if you wanted.. fromFuncton' set f = mapAVL' (\a -> let b = f a in b `seq` (a,b)) set

...

Thoughts? I haven't really looked at the code, but it seems plausible that these sorts of operations where a Set and Map are involved could be done in a better fashion than going through association lists.

Well (not that I'm biased at all :-), but I think all set/map users should look at the AVL library. It's generally more efficient, flexible and gives better control over practical things (like strictness control). Both the old and the new set/map libraries are based on the weight balanced trees Adams used in his "efficient sets" paper. Trouble is these don't seem to be very efficient at all compared to AVL trees. I suspect they might be a better choice sequences though. BTW, I've been working on an improved version of the AVL library, which will include.. * Flexible zipper for AVL trees * AVL tree based clones of Data.Set and Data.Map * Other stuff too. Hopefully I'll get that finished off soon. Regards -- Adrian Hey

Tomasz Zielonka

11:59 a.m.

On Tue, Apr 05, 2005 at 01:57:18PM +0100, Adrian Hey wrote:

...

BTW, I've been working on an improved version of the AVL library, which will include.. * Flexible zipper for AVL trees * AVL tree based clones of Data.Set and Data.Map * Other stuff too.

Have you thought about adding support for augmented AVL trees? Best regards Tomasz

Adrian Hey

5:25 p.m.

Hello, On Tuesday 05 Apr 2005 12:59 pm, Tomasz Zielonka wrote:

...

Have you thought about adding support for augmented AVL trees?

I'm not sure what you have in mind, but no, other than perhaps using an unboxed Int field to hold size information too. This would make some sequence operations (like taking leftmost n elements) O(log n), vs. O(n) as it currently is with the AVL implementation. Unfortunately at the moment doing this efficiently means creating a separate type and doing a cut and paste job on an awful lot of code. BTW, in the ghc survey I asked the ghc folk for type specialisation to allow unboxing in polymorphic data types. Dunno when they're going to deliver that though :-) I wonder if I could do something like this with template Haskell? (haven't used it at all yet and have already forgotten everything I read about it :-( I suspect that for randomish keys (hash codes say) a specialised AVL implementation would outperform the patricia trees of Data.IntMap too. This is something I'd been thinking of doing to speed up the comparisons in situations where the elements/keys where quite complex data structures (I.E. when comparison is a non-trivial operation). But again, it seems like an awful lot of work to do this by hand (though not as much as I'm asking from the ghc folk I guess :-) In the mean time I might consider a somewhat less efficient non-specialised implementation of these or other things. Could you elaborate on what you'd like to see? (or better still, could you supply the code :-) Regards -- Adrian Hey

Tomasz Zielonka

6 Apr 6 Apr

7:54 a.m.

On Tue, Apr 05, 2005 at 06:25:00PM +0100, Adrian Hey wrote:

...

Hello,

Hello!

...

On Tuesday 05 Apr 2005 12:59 pm, Tomasz Zielonka wrote:

...
Have you thought about adding support for augmented AVL trees?

I'm not sure what you have in mind, but no, other than perhaps using an unboxed Int field to hold size information too. This would make some sequence operations (like taking leftmost n elements) O(log n), vs. O(n) as it currently is with the AVL implementation.

That's a good example of an augmented tree. In augmented binary search trees there is an additional information in each node. This information should be computable from (key, value) in the node and (key, value) plus additional information from node's children. This allows to update the info on adding, deleting, balancing, etc. Probably the most known example of augmented BST is an interval tree, but there are many other applications for this technique.

...

Unfortunately at the moment doing this efficiently means creating a separate type and doing a cut and paste job on an awful lot of code.

That's what I feared.

...

BTW, in the ghc survey I asked the ghc folk for type specialisation to allow unboxing in polymorphic data types. Dunno when they're going to deliver that though :-)

I am not sure I fully understand, but if I do, it would be a nice feature.

...

I wonder if I could do something like this with template Haskell? (haven't used it at all yet and have already forgotten everything I read about it :-(

Probably. The question is how convenient you can make it.

...

In the mean time I might consider a somewhat less efficient non-specialised implementation of these or other things. Could you elaborate on what you'd like to see?

Hmmm... something like: data ATree a k v -- A for Augmented class Augmentation a k v where compute :: Maybe (a, k, v) -> (k, v) -> Maybe (a, k, v) -> a empty :: ATree a k v add :: Augmentation a k v => ATree a k v -> k -> v -> ATree a k v ... lookup :: ATree a k v -> k -> Maybe (a, k, v) For some (most?) applications it would probably be necessary to have operations working directly on tree structure, like walking down the path from tree root to a particular node.

...

(or better still, could you supply the code :-)

Sure, next time when I'll need such trees :-) Best regards Tomasz

Adrian Hey

8 Apr 8 Apr

7:50 a.m.

On Wednesday 06 Apr 2005 8:54 am, Tomasz Zielonka wrote:

...

...
BTW, in the ghc survey I asked the ghc folk for type specialisation to allow unboxing in polymorphic data types. Dunno when they're going to deliver that though :-)

I am not sure I fully understand, but if I do, it would be a nice feature.

Useage would be something like this perhaps.. newtype IAVL a = IAVL (AVL (# Int# , a #)) mapIAVL :: (a -> b) -> IAVL a -> IAVL b mapIAVL f (IAVL iavl) = IAVL (mapAVL (\(# i, a #) -> (# i, f a #)) iavl) So I'd still have to create a distinct type and distinct functions for operating on that type. But those new functions would usually be simple one liners which re-use the fully polymorphic definitions (mapAVL in this example). But I think this or something similar this will be difficult in practice. Regards -- Adrian Hey

7397

Age (days ago)

7400

Last active (days ago)

List overview

Download

8 comments

5 participants

participants (5)

Adrian Hey
Benjamin Franksen
Cale Gibbard
Ketil Malde
Tomasz Zielonka

Data.Map, Data.Set working together

tags

participants (5)