Re: Data.* collections maintenance

21 Oct 2005

      Hi Adrian,

I've always wanted to try out your AVL implementation, but never found 
the time to adapt your interface to that of Data.Set and Data.Map (or a 
sub-interface). If you have done so, that would only mean to replace 
import entries (and would allow to switch back if it doesn't work as 
expected).

(It would also be handy if your modules could be passed to the same 
properties and performance checker if we had those.)

A different performance profile is of course fine (and desirable if
there is a big gain in certain or even most cases).

So far, the problem always was to agree on an accepted interface (that I 
didn't want to be a class but leave to the discipline of library writers.)

How about something like Data.Set.Tree.AVL (and Data.Set.Seq.Ordered)
or Data.SetByAVLTree (and Data.SetByOrderedSeq)?

Cheers Christian

Adrian Hey wrote:
...
Well not that I'm in any way biased :-) but might I suggest that if these
libraries are going to get an API overhaul you could probably save yourself
quite a bit of work (and get a performance boost too) if you switched from
using size balanced trees (aka Adams trees ?) to using Data.Tree.AVL.
AFAICT all the problems people seem to have mentioned recently with
current Data.Set/Map APIs are already solved in Data.Tree.AVL or are
trivially self solvable e.g. you can chose whether you want left or
right biasing (AVL is uncommitted), intersection works on trees of
different element types, strictness in elements is much more fine
tuneable..
I was about to offer half finished AVL based clones of Data.Set/Map
I have written (but not published), but having just checked it seems I
must have deleted them at some point :-( I can remember implementation
was quite easy though.
The only reservation I have about this is the size function and indexing
operations would be O(n) for AVL, not O(1) or O(log n). But the only
reason size is O(1) for size balanced trees is that you pay extra for this
information everywhere else, and I'm not sure that index functions
should be supported at all in an abstracted Map API (they seem a bit
unsafe to me). Does anybody use Data.Map indexing? and why doesn't
Data.Set support this too? (just curious :-)
Also, more generally it seems impossible to have a single optimal Set/Map
implementation for all possible element or key types. IntSet/Map will
perform better for Ints, StringMap will perform better for Strings (the
latter could be generalised to ListMap I guess, or maybe even a
SequenceMap). Where would these fit in the general scheme of things?
Regards
--
Adrian Hey
_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://www.haskell.org/mailman/listinfo/libraries

Re: Data.* collections maintenance

Christian Maeder