
Milan Straka
writes: Ticket: http://hackage.haskell.org/trac/ghc/ticket/5242 another problem is that more methods could be overloaded for IntSets -- also union and intersect. Adding the variant only for difference seems quite arbitrary, but adding all is bloating the API.
For intersection, sure. I can see plausible use-cases for this too. Patch 0007.
For union, where would it get a value from, if a key was found in the set but not the map? Take a function to generate the value from the key like fromSet? Given union is left-biased in its effect but not in its type, I'm going to need separate versions to handle the two ways a map and a set can be combined. Not sure I want to go down this route, especially as I can't think of a good use-case.
My point was, that there are many combinations you can think of -- IntMap minus IntSet, IntMap minus Map Int, IntMap minus list of Ints, IntMap union Map Int, IntMap intersection IntSet, IntMap intersection Set Int and so on. With fromSet and fromDistinctAscList, you can handle all of these easily, although with some time penalty. For example the union case, union map (fromSet (const 1) set) seems fine.
I liked the second idea with the fromSet :: (Key -> a) -> IntSet.IntSet -> IntMap a but I would not add the rewriting rules. The intermediate IntMap has to be constructed, but I think it is reasonable price for doing cross-structure operations and not extending the API.
I don't understand your reasoning behind not adding the rewrite rules. Simply don't export differenceKeysSet, as patch 0005 does.
(Johan, please read this too.) I am quite unhappy about exporting differenceKeysSet and other hybrid functions working on different data structures. But I do not like depending on rules alone either. If we promise that difference map (fromSet set) works without creating intermediate map, then it - works only for GHC - is nontrivial to use -- which functions do really fuse with fromSet? My biggest problem is "Does difference with IntSet deserves to have a special function implementing it?" I do not believe so. There are many functions that could fuse with fromSet, you could even implement better implementations of e.g. union with Map Int. But how do we measure which combinations are worth it? For me, right now no functions working with different data structures are worth it. The API is complicated as it is and I do not feel adding cross-structure functions is a good thing to do. I am happy with being able to convert between the structures efficiently. Johan, what is your opinion?
Instead we could make fromSet more efficient -- if the IntMap could access the internal representation of IntSet, the can implement fromSet just by adding values to each node.
Right. I factored "data IntSet" out to a separate module so that this was possible. This takes its time complexity from O(n*min(n,W)) down to O(n), I believe.
The strictness annotations are a bit harmful here -- if they were not there, only the parts of the structure needed for the following difference (union, intersect) would be constructed. BTW, seeing the IntSet representation in IntMap would allow us to define keysSet more efficient too.
Done both! Patch 0006.
Thanks for all the pathes! The patch 0006 is unfortunately missing the new module and also the changes to IntSet. Cheers, Milan