
Hello, thanks for patience with me. <snip>
My point was, that there are many combinations you can think of -- IntMap minus IntSet, IntMap minus Map Int, IntMap minus list of Ints, IntMap union Map Int, IntMap intersection IntSet, IntMap intersection Set Int and so on.
Actually, it's just IntMap and IntSet, since they share the same underlying patricia tree implementation. <snip>
My biggest problem is "Does difference with IntSet deserves to have a special function implementing it?" I do not believe so.
The point is that IntSet and IntMap are identical internally.
The same implementation means we can implement those. But does it also imply a reason for doing so? I am not sure we are doing it just because we can or because it is useful. Does some other people think it is worth having intersection m (fromSet f s) and difference m (fromSet g s) perform equally well as intersection/difference between pairs of maps, instead of a penalty for converting the set to the map?
But I do not like depending on rules alone either. If we promise that difference map (fromSet set) works without creating intermediate map, then it - works only for GHC - is nontrivial to use -- which functions do really fuse with fromSet?
I'm not promising anything performance-wise… it still produces the same result without the rules, just that an intermediate IntMap will be produced and consumed. With the rules, it's the same O(m+n) time complexity, but with a much smaller constant.
How is this any more controversial than list fusion?
Hm, you are right. I was still thinking about deterministically calling differenceKeysSet instead of just invisible optimization.
For me, right now no functions working with different data structures are worth it. The API is complicated as it is and I do not feel adding cross-structure functions is a good thing to do. I am happy with being able to convert between the structures efficiently.
For the record, the only API extension I'm proposing is:
fromSet :: (Key -> a) -> IntSet -> IntMap a
and I am happy with this one.
Nothing else changes, as far as users are concerned, except that
intersection m (fromSet f s) and difference m (fromSet g s)
will perform equally well as intersection/difference between pairs of maps.
The last thing I am worried about is whether the new duplicated code (intersectionKeysSet and differenceKeysSet) is worth the speedup. I will wait for some other opinions on this.
Thanks for all the pathes! The patch 0006 is unfortunately missing the new module and also the changes to IntSet.
Those are in patch 0001.
Oh, I am sorry I did not read the first one. Also, the IntMap is considered to be a replacement of Map -- I would like to add the fromSet method to the Map module too. But I am not sure I want to add the specialization of intersectionKeysSet and differenceKeysSet, at least not now -- I want to tune the representation of Map and Set and less code I need to change the better. Cheers, Milan