Edison StandardSet has inefficient function implementation

Edision does not yet have all the asymtotic description of its functions. I got the Edision 1.2 source and looked into the code whether the container implementations meet the expected asymtotic bounds. In the module Data.Coll.StandardSet which packages Data.Set, some functions which can be O(log n) is implemented as O(n). Data.Set has a split and splitMember running in O(log n). With those functions we can implement OrdCollX operations like filterLT, filterLE, filterGT, filterGE, partitionLT_GE, partitionLE_GT, partitionLT_GT all in O(log n). However, only partitionLT_GT was O(log n) implemended using split. All other function implmentation just used its axiomaic description using CollX operations like filter and partition, which is O(n). It needs to be fixed. P.S. I haven't checked the darcs version yet. -- Ahn, Ki Yung

On Aug 3, 2006, at 7:14 AM, Ahn, Ki Yung wrote:
Edision does not yet have all the asymtotic description of its functions.
Indeed. This is a big job which requires a lot of time, attention to detail, and a pretty good working understanding of lazy amortized analysis. Unfortunately I'm currently lacking in categories 1 and 3... Many of the bounds can be obtained from the referenced papers. I'll work on adding those as I am able.
I got the Edision 1.2 source and looked into the code whether the container implementations meet the expected asymtotic bounds.
In the module Data.Coll.StandardSet which packages Data.Set, some functions which can be O(log n) is implemented as O(n).
Data.Set has a split and splitMember running in O(log n). With those functions we can implement OrdCollX operations like filterLT, filterLE, filterGT, filterGE, partitionLT_GE, partitionLE_GT, partitionLT_GT all in O(log n). However, only partitionLT_GT was O(log n) implemended using split. All other function implmentation just used its axiomaic description using CollX operations like filter and partition, which is O(n).
Thanks for pointing this out. filterLT and filterGT can indeed be written in terms of "split"; I just missed that somehow. For the others, however, splitMember won't suffice. The problem here is that splitMember doesn't return the "equal" member from the original set, it just returns a Bool indicating whether the set contained and "equal" element. As of now, Edison is supposed to guarantee that, for observable collections, you will always get back the identical object(s) that you put in. This accounts for the fact that you may supply an Eq instance which is only a weak equivalance, that is, even if x == y returns true, x and y may be distinguishable in some way. I am considering dropping this guarantee in a future version of Edison, because I think its value is highly dubious. (Using a set or bag with a weak element equivalence is really just creating a finite map/relation. If you need a finite map/relation you should just _use_ a finite map/relation!) If I dropped the guarantee, I could indeed implement the other operations as you suggest.
It needs to be fixed.
P.S. I haven't checked the darcs version yet.
-- Ahn, Ki Yung
Thanks for your comments! Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG
participants (2)
-
Ahn, Ki Yung
-
Robert Dockins