Re: Data.* collections maintenance

21 Oct 2005

      Hello folks,

On Friday 21 Oct 2005 8:37 am, Jean-Philippe Bernardy wrote:
...
As some demand seem to exist for the job, I volunteer to take
over/resume maintenance of Data.Map and friends. In the hope that will
alleviate the heavy load on the compiler maintainers :)
Well not that I'm in any way biased :-) but might I suggest that if these
libraries are going to get an API overhaul you could probably save yourself
quite a bit of work (and get a performance boost too) if you switched from
using size balanced trees (aka Adams trees ?) to using Data.Tree.AVL.

AFAICT all the problems people seem to have mentioned recently with
current Data.Set/Map APIs are already solved in Data.Tree.AVL or are
trivially self solvable e.g. you can chose whether you want left or
right biasing (AVL is uncommitted), intersection works on trees of
different element types, strictness in elements is much more fine
tuneable..

I was about to offer half finished AVL based clones of Data.Set/Map
I have written (but not published), but having just checked it seems I
must have deleted them at some point :-( I can remember implementation
was quite easy though.

The only reservation I have about this is the size function and indexing
operations would be O(n) for AVL, not O(1) or O(log n). But the only
reason size is O(1) for size balanced trees is that you pay extra for this
information everywhere else, and I'm not sure that index functions
should be supported at all in an abstracted Map API (they seem a bit
unsafe to me). Does anybody use Data.Map indexing? and why doesn't
Data.Set support this too? (just curious :-)

Also, more generally it seems impossible to have a single optimal Set/Map
implementation for all possible element or key types. IntSet/Map will
perform better for Ints, StringMap will perform better for Strings (the
latter could be generalised to ListMap I guess, or maybe even a
SequenceMap). Where would these fit in the general scheme of things?

Regards
--
Adrian Hey