RFC: Collection Framework, reprise.

Hello,
I'm glad you see so many good points in my design.
I must say however that the operator names are (c) Ross Paterson :)
On 2/21/06, Robert Dockins
Well, I have to begin with the caveat that I am not a data structures expert, and that the design of Edison is almost entirely due to Chris Okasaki. That said, I can certainly share my opinion. In no particular order:

-- Your use of two type parameters to model read-only or unobservable collections is very interesting. I wonder, though, how useful it actually is. It facilitates some interesting possibilities with views, but I can't think of any compelling use cases. I sort of dislike the fact that it can turn certain of the class methods "useless", but maybe the idea just hasn't grown on me yet.
The truth is, I hope this will become nicer when (if) we move to associated types. I need to play a bit more with Phrac to have a clearer idea though.
-- I like the idea of views generally, and I'm now trying to think if there is a good way to work something like them into Edison.

-- The 'isSingleton' method seems unnecessary. Are singleton collections interesting enough to supply a special test for them?
This is based on Christian Maeder's idea. We are planning to move to the AVL trees of A. Hey, which do not provide an O(1) size function. A singleton test thus becomes valuable, and since it costs next to nothing to add to the class I decided to do it.
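For illustration, here is a sketch (not the library's actual code) of why a singleton test stays cheap even when the size of the tree is not cached:

```haskell
-- A search tree without cached sizes, as with the AVL trees
-- mentioned above: 'size' would be O(n) on this representation.
data Tree a = Tip | Node (Tree a) a (Tree a)

-- A singleton test needs to inspect only the root, so it is O(1):
isSingleton :: Tree a -> Bool
isSingleton (Node Tip _ Tip) = True
isSingleton _                = False

-- 'null' is equally cheap:
isEmpty :: Tree a -> Bool
isEmpty Tip = True
isEmpty _   = False
```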
-- Your 'lookup' method has a Monad context, which I assume is the 'NotJustMaybe' pattern, but 'front' and 'back' return Maybe. You should probably pick one style and stick to it.
True.
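For concreteness, here is the 'NotJustMaybe' style sketched over Data.Map. The name lookupM is hypothetical, and this uses today's MonadFail class where the original discussion relied on Monad's fail method:

```haskell
import qualified Data.Map as M

-- The "NotJustMaybe" pattern: let the caller pick the monad in
-- which a lookup failure is reported, instead of hard-wiring Maybe.
lookupM :: (MonadFail m, Ord k) => k -> M.Map k a -> m a
lookupM k m = maybe (fail "lookupM: key not found") return (M.lookup k m)
```

Instantiated at Maybe this behaves like a plain lookup returning Maybe; instantiated at IO, a missing key throws instead.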
-- Qualified infix functions look horrible. I know you say this module should be imported unqualified, but with Prelude clashes people qualify it anyway. I'd suggest you stick to regular prefix functions for your class methods and define the infix functions as the appropriate aliases. That way people importing qualified can still use something reasonable looking.

-- In a related thread I've been discussing argument orders. I think I've just about come to the decision that all the Edison API functions will take the collection argument last, including the troublesome 'rcons'. If you take my suggestion to remove infix symbols from your classes, I suggest you also make your right cons and index functions take the collection argument last. However, I'd probably leave the symbols as is, so that, e.g., (|>) = (flip rcons).
If I understand this correctly, you suggest that regular functions should have a prefix meaning, whereas operators should have infix meaning (also in the light of your answer to Ross). I remember some people spoke against this when we tuned Data.Map (& friends) API some years ago. I guess the reason is that some functions cannot be assigned a clear enough operator (isSubsetOf, etc.)
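A sketch of the convention being discussed, over plain lists for concreteness (the names follow the thread; nothing here is the actual Edison API):

```haskell
-- Prefix class-style functions take the collection argument last...
lcons :: a -> [a] -> [a]
lcons = (:)

rcons :: a -> [a] -> [a]
rcons x xs = xs ++ [x]

-- ...while the operators keep their natural infix orientation,
-- e.g. (|>) = flip rcons, so 'xs |> x' still reads left to right.
(<|) :: a -> [a] -> [a]
(<|) = lcons

(|>) :: [a] -> a -> [a]
(|>) = flip rcons
```

So qualified code can write `Seq.rcons x xs`, while unqualified code can still write `xs |> x`.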
-- In fact, I like your right and left insert symbols so much, I may adopt them myself.

-- You claim that 'union' takes an unspecified element in the case of duplicates, but you also claim it is commutative. You need to specify in what sense you claim commutativity; your collection class doesn't require an Eq or Ord instance, and the claim seems very suspect if you mean up to identity. Similar with intersection.
This has been written in a hurry, and I didn't mean anything very formal. We have re-hashed implied meanings of Eq and Ord quite a few times however, and the general idea is that we require Eq to mean equality, and similarly for Ord, otherwise all bets are off. This is better stated in a general comment rather than in the definition of union, though.
-- Does 'adjust' affect all elements with the given key, or just a single unspecified one?
The class has the (implicit) assumption that a key cannot exist twice. The semantics could be generalized, however.
-- 'Indexed' doesn't have any notion of bounds. That seems like a pretty necessary concept. Humm, I see now that MapLike is a subclass of Indexed. Perhaps then you should have 'DenseIndexed' or 'ArrayLike' which has bounds related methods together with the "denseness" property.
It seems like the class you suggest to add would become a replacement for the Array class.
Overall, I'd say the methods you have carved out seem to be the ones that are most-used, with the notable exception of 'head' and 'tail' for sequences.
Rather, I intend to start with those. It seems easier to add than to remove.
If you are attempting to go for a fairly minimal set of operations (that's more or less what I read your design goals as saying), then the only method I'd consider adding is a strict fold. People get irritated if they can't, e.g., sum the elements in a collection without building a gajillion unnecessary closures.
Every day I pray for a genius to come up with the generic solution to eliminate unnecessary laziness :) Thanks again for the helpful comments. JP.
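The strict fold Robert asks for can be sketched like this, over lists (Data.List.foldl' is the standard version today; the name here is chosen to avoid clashing with it):

```haskell
-- Force the accumulator at every step so that, e.g., summing a
-- collection doesn't build a gajillion unnecessary closures.
foldlStrict :: (b -> a -> b) -> b -> [a] -> b
foldlStrict _ z []     = z
foldlStrict f z (x:xs) = let z' = f z x
                         in z' `seq` foldlStrict f z' xs

sumStrict :: Num a => [a] -> a
sumStrict = foldlStrict (+) 0
```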

On Feb 23, 2006, at 5:33 AM, Jean-Philippe Bernardy wrote:
... On 2/21/06, Robert Dockins
wrote: -- I like the idea of views generally, and I'm now trying to think if there is a good way to work something like them into Edison.

-- The 'isSingleton' method seems unnecessary. Are singleton collections interesting enough to supply a special test for them?
This is based on Christian Maeder's idea. We are planning to move to the AVL trees of A. Hey, which do not provide an O(1) size function. A singleton test thus becomes valuable, and since it costs next to nothing to add to the class I decided to do it.
This leads me to wonder how much of the speed advantage of AVL trees is simply due to the absence of cached tree-size information in each node (not required for the balancing mechanism, but also not available for fast indexing operations). -Jan-Willem Maessen
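The tradeoff Jan-Willem points at can be sketched as two node layouts (hypothetical types, not the library's):

```haskell
-- A node caching its subtree size: O(1) 'size' and fast indexing,
-- at the cost of an extra field per node and bookkeeping on rebalance.
data Sized a = STip | SNode !Int (Sized a) a (Sized a)

ssize :: Sized a -> Int
ssize STip            = 0
ssize (SNode n _ _ _) = n

-- A bare node, as in the AVL library under discussion: smaller
-- nodes and cheaper allocation, but 'size' becomes O(n).
data Plain a = PTip | PNode (Plain a) a (Plain a)

psize :: Plain a -> Int
psize PTip          = 0
psize (PNode l _ r) = psize l + 1 + psize r
```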

On Feb 23, 2006, at 5:33 AM, Jean-Philippe Bernardy wrote:
Hello,
I'm glad you see so many good points in my design.
I must say however that the operator names are (c) Ross Paterson :)
On 2/21/06, Robert Dockins
wrote: Well, I have to begin with the caveat that I am not a data structures expert, and that the design of Edison is almost entirely due to Chris Okasaki. That said, I can certainly share my opinion. In no particular order:

-- Your use of two type parameters to model read-only or unobservable collections is very interesting. I wonder, though, how useful it actually is. It facilitates some interesting possibilities with views, but I can't think of any compelling use cases. I sort of dislike the fact that it can turn certain of the class methods "useless", but maybe the idea just hasn't grown on me yet.
The truth is, I hope this will become nicer when (if) we move to associated types. I need to play a bit more with Phrac to have a clearer idea though.
-- I like the idea of views generally, and I'm now trying to think if there is a good way to work something like them into Edison.

-- The 'isSingleton' method seems unnecessary. Are singleton collections interesting enough to supply a special test for them?
This is based on Christian Maeder's idea. We are planning to move to the AVL trees of A. Hey, which do not provide an O(1) size function. A singleton test thus becomes valuable, and since it costs next to nothing to add to the class I decided to do it.
OK. I'm still not sure why you'd be interested in knowing that a collection is a singleton. Especially given that you can't extract elements with only your Collection class methods. It seems to me that the only time you want a singleton test is when you want to get that thing out. If that's the case, you can just extract one element and test the resulting collection for nullness; but maybe there's something I'm missing.
-- Your 'lookup' method has a Monad context, which I assume is the 'NotJustMaybe' pattern, but 'front' and 'back' return Maybe. You should probably pick one style and stick to it.
True.
-- Qualified infix functions look horrible. I know you say this module should be imported unqualified, but with Prelude clashes people qualify it anyway. I'd suggest you stick to regular prefix functions for your class methods and define the infix functions as the appropriate aliases. That way people importing qualified can still use something reasonable looking.

-- In a related thread I've been discussing argument orders. I think I've just about come to the decision that all the Edison API functions will take the collection argument last, including the troublesome 'rcons'. If you take my suggestion to remove infix symbols from your classes, I suggest you also make your right cons and index functions take the collection argument last. However, I'd probably leave the symbols as is, so that, e.g., (|>) = (flip rcons).
If I understand this correctly, you suggest that regular functions should have a prefix meaning, whereas operators should have infix meaning (also in the light of your answer to Ross).
I'm not sure I understand exactly what you mean by "prefix/infix meaning", but I think so.
I remember some people spoke against this when we tuned Data.Map (& friends) API some years ago. I guess the reason is that some functions cannot be assigned a clear enough operator (isSubsetOf, etc.)
Humm. I can't say I'm fully persuaded by this. As others have mentioned Data.{Map,Set} aren't exactly paragons of elegance, and I'm not inclined to be bound to their design decisions. I also think the landscape may be changing with Haskell implementations drifting toward full support for Unicode source files, which would provide lovely operators for all sorts of collection operations. In the meantime, I think we can just provide ASCII infix operators for troublesome or particularly common operations. Does anyone know any other reasons this approach was rejected? I suspect that qualified imports also played a role in this decision.
-- In fact, I like your right and left insert symbols so much, I may adopt them myself.

-- You claim that 'union' takes an unspecified element in the case of duplicates, but you also claim it is commutative. You need to specify in what sense you claim commutativity; your collection class doesn't require an Eq or Ord instance, and the claim seems very suspect if you mean up to identity. Similar with intersection.
This has been written in a hurry, and I didn't mean anything very formal. We have re-hashed implied meanings of Eq and Ord quite a few times however, and the general idea is that we require Eq to mean equality, and similarly for Ord, otherwise all bets are off.
I guess my point is that your classes don't require an Eq or Ord context, so it's hard to know what defines the equivalence class under consideration.
This is better stated in a general comment rather than in the definition of union, though.
-- Does 'adjust' affect all elements with the given key, or just a single unspecified one?
The class has the (implicit) assumption that a key cannot exist twice. The semantics could be generalized, however.
-- 'Indexed' doesn't have any notion of bounds. That seems like a pretty necessary concept. Humm, I see now that MapLike is a subclass of Indexed. Perhaps then you should have 'DenseIndexed' or 'ArrayLike' which has bounds related methods together with the "denseness" property.
It seems like the class you suggest to add would become a replacement for the Array class.
Well, the only reason I mention it is because you say in the comments that indexes are "dense", and that lookup cannot return "not found". That sounds particularly array-like, except for the lack of bounds. As "MapLike" is a subclass of "Indexed", you should probably remove those comments. If you want a class with those properties it should not be a superclass of "MapLike" and probably needs bounds.
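Data.Array already embodies the combination Robert describes, so the "dense plus bounds" behaviour can be sketched against it (the helper name is made up):

```haskell
import Data.Array

-- In a dense, bounded structure every index inside 'bounds' is
-- present, so a lookup can only fail by being out of range.
denseLookup :: Ix i => Array i e -> i -> Maybe e
denseLookup a i
  | inRange (bounds a) i = Just (a ! i)
  | otherwise            = Nothing
```

A hypothetical 'ArrayLike' class would essentially carry these two operations: a bounds method and a total lookup for in-range indices.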
Overall, I'd say the methods you have carved out seem to be the ones that are most-used, with the notable exception of 'head' and 'tail' for sequences.
Rather, I intend to start with those. It seems easier to add than to remove.
Indeed.
If you are attempting to go for a fairly minimal set of operations (that's more or less what I read your design goals as saying), then the only method I'd consider adding is a strict fold. People get irritated if they can't, e.g., sum the elements in a collection without building a gajillion unnecessary closures.
Every day I pray for a genius to come up with the generic solution to eliminate unnecessary laziness :)
I just bet that complete and sound strictness analysis is undecidable (does anybody know for sure?); I don't think we'll ever fully get away from having to deal with manual strictness annotations once we've committed to non-strict semantics. <aside> If there's one thing I learned from reading _Purely Functional Data Structures_, it is that strictness and laziness both have their places. I'm glad to see bang patterns getting some serious consideration for Haskell'. </aside>
Thanks again for the helpful comments. JP.
Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

Jean-Philippe Bernardy wrote:
-- The 'isSingleton' method seems unnecessary. Are singleton collections interesting enough to supply a special test for them?
This is based on Christian Maeder's idea. We are planning to move to the AVL trees of A. Hey, which do not provide an O(1) size function. A singleton test thus becomes valuable, and since it costs next to nothing to add to the class I decided to do it.
Here's a more general suggestion: move the method 'front' up into class 'Collection'. It is always meaningful in some way: it would get the leftmost item from a sequence, the least key from a set, the least priority from a priority queue and so on. Sure, the naming is not the most fortunate. What would be appropriate for a method that extracts a somewhat unspecified element? 'split', 'extract', 'view', 'decompose', 'item' or something else? Alternatively, 'front' could be left where it is, extracting the leftmost element. A similar method, which extracts an unspecified element, should be added to 'Collection'. Once you have 'front' you can implement 'isSingleton' and 'null' outside the class. The same is true for 'fold' and a strict 'foldl' which is currently missing. Udo. -- "Being abstract is something profoundly different from being vague." -- E. W. Dijkstra
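Udo's point, sketched over plain lists for concreteness ('extract' is one of his candidate names; this is not library code):

```haskell
-- Split off one (unspecified) element, if any. For a sequence this
-- would be the leftmost item; for a set, e.g. the least element.
extract :: [a] -> Maybe (a, [a])
extract []     = Nothing
extract (x:xs) = Just (x, xs)

-- With 'extract' in the Collection class, these fall out as plain
-- functions defined outside the class:
isEmpty :: [a] -> Bool
isEmpty = maybe True (const False) . extract

isSingleton :: [a] -> Bool
isSingleton = maybe False (isEmpty . snd) . extract
```

A fold (and a strict fold) can be derived the same way, by repeatedly extracting until Nothing.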

Sometimes you want to add an element, or update an existing element if one exists. You may, for example, want to store a list of values at each node in a Map, so if a key doesn't exist you add a singleton list, if the key does exist you add the new value to the list with that key. The value type is thus of a *different* type than the values that are inserted.

  insertWith' :: (b -> a -> a) -> (b -> a) -> k -> a -> c -> c

Where the second function is used only in the "not existing" case. So the case described above could be done like so:

  myInsert = insertWith' (:) (:[])

That is, if the key exists, the value is simply added to the front of the list; if the key doesn't exist, it's added to an empty list. This is more general than insertWith, so maybe insertWith could be defined outside the class using this... insertWith' is probably not the best name...

Or... you could generalise adjust instead:

  adjust :: (Maybe v -> v) -> k -> c -> c

  myInsert k v = adjust f k
    where f Nothing   = [v]
          f (Just vs) = v : vs

I think I may like this better... It moves the "insertion type is different from stored type"-thing outside the type signatures... /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862

On 2/23/06, Sebastian Sylvan
Sometimes you want to add an element, or update an existing element if one exists.
You may, for example, want to store a list of values at each node in a Map, so if a key doesn't exist you add a singleton list, if the key does exist you add the new value to the list with that key. The value type is thus of a *different* type than the values that are inserted.
insertWith' :: (b -> a -> a) -> (b -> a) -> k -> a -> c -> c
Sorry, that should be:

  insertWith' :: (b -> a -> a) -> (b -> a) -> k -> b -> c -> c

/S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862
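With the corrected signature, both of Sebastian's variants can be sketched over Data.Map using M.alter (which postdates this thread; the function names here are the thread's proposals, not an established API):

```haskell
import qualified Data.Map as M

-- insertWith': the second function handles the "key not present"
-- case, so the inserted type b may differ from the stored type a.
insertWith' :: Ord k => (b -> a -> a) -> (b -> a) -> k -> b -> M.Map k a -> M.Map k a
insertWith' f g k x = M.alter step k
  where step Nothing  = Just (g x)
        step (Just v) = Just (f x v)

-- The generalised 'adjust' expresses the same idea:
adjustG :: Ord k => (Maybe a -> a) -> k -> M.Map k a -> M.Map k a
adjustG f = M.alter (Just . f)

-- The running example: accumulate values into a list per key.
myInsert :: Ord k => k -> b -> M.Map k [b] -> M.Map k [b]
myInsert = insertWith' (:) (:[])
```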

On Thu, Feb 23, 2006 at 11:33:27AM +0100, Jean-Philippe Bernardy wrote:
I must say however that the operator names are (c) Ross Paterson :)
The operator symbols are just ASCII versions of symbols that have been in use for some time, e.g. by Chris Okasaki.
We are planning to move to the AVL trees of A. Hey, which do not provide an O(1) size function.
What will we gain from this move? And what will we lose, apart from O(1) size?

On 2/27/06, Ross Paterson
We are planning to move to the AVL trees of A. Hey, which do not provide an O(1) size function.
What will we gain from this move? And what will we lose, apart from O(1) size?
The gains:

1. Christian Maeder and others report a significant performance improvement. This is the main motivating factor.
2. The AVL tree library is very comprehensive and can be very useful by itself.
3. A separate balanced Tree infrastructure provides many benefits:
   * Shared code between Map and Set.
   * Conversion between those should then become more efficient (requested by John Meacham).
   * Possibility to base more data structures on AVL trees (eg. StringMap).
   * It should be easier to fine-tune algorithms.

We will provide toTree/unsafeFromSortedTree that will allow users to write code that has access to the tree structure. An alternative is to keep the same tree implementation, and separate the Tree data type from the high-level Set/Map abstract type. This is yet more work however, and ignores the fact that the trees in Set and Map are actually different types.

The losses:

* More code to maintain. AVL trees are also trickier.
* Transitional costs: changes in behaviour may affect user code. (Despite many efforts to be as compatible as possible.)

Cheers, JP.

Jean-Philippe Bernardy wrote:
On 2/27/06, Ross Paterson
wrote: We are planning to move to the AVL trees of A. Hey, which do not provide an O(1) size function. What will we gain from this move? And what will we lose, apart from O(1) size?
The gains: 1. Christian Maeder and others report a significant performance improvement. This is the main motivating factor.
Indeed, the gain is significant (at least 5% lookup speedup!). Adrian Hey simply proved to me that AVL trees (with his clever constructor choice without an explicit height) are a better data structure than Adams's "Efficient sets: a balancing act". And it's sort of regrettable that the old FiniteMap and the current Data.Map are based on the latter (until someone finds a speedup for Adams's sets)!
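The "clever constructor choice" can be sketched like this: the balance factor lives in the constructor tag, so no height (or size) field is stored in any node. This mirrors the shape of Adrian Hey's AVL library but is written here as an illustration, not copied from it:

```haskell
-- Balance factor encoded in the constructor: no height field.
data AVL a = E                    -- empty
           | N (AVL a) a (AVL a)  -- right subtree is taller by one
           | Z (AVL a) a (AVL a)  -- subtrees have equal height
           | P (AVL a) a (AVL a)  -- left subtree is taller by one

-- The height is still recoverable by walking one spine:
height :: AVL a -> Int
height E         = 0
height (N l _ _) = 2 + height l  -- right taller: height = hl + 2
height (Z l _ _) = 1 + height l
height (P _ _ r) = 2 + height r  -- left taller: height = hr + 2
```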
2. The AVL tree library is very comprehensive and can be very useful by itself.
I don't share this point. The trees suffer the severe problem that the order argument can easily be mixed up. So I'd prefer not to directly use these trees. I also did not like the hscpp stuff and the many (tricky?) modules when installing it.
3. A separate balanced Tree infrastructure provides many benefits: * Shared code between Map and Set. * Conversion between those should then become more efficient (requested by John Meacham)
Without sharing the current Data.Map is faster and even the currently fastest "AVL (k, a)" would become faster. However, I would not sacrifice sharing for an only marginal speedup (below 1%). (But it would be interesting to know the speedup.)

Further personal judgements:

1. Data.IntMap and Data.IntSet don't need to be touched since they are fast.
2. (Portable) Data.Map and Data.Set modules belong in the base package. These data structures would be useful for ghc itself.
3. Jean-Philippe's collection framework may sit on top in a different package (supporting more generic applications).
4. A few interface cleanups should be no problem.

Cheers Christian

On Wednesday 01 Mar 2006 2:13 pm, Christian Maeder wrote:
Jean-Philippe Bernardy wrote:
On 2/27/06, Ross Paterson
wrote: We are planning to move to the AVL trees of A. Hey, which do not provide an O(1) size function.
What will we gain from this move? And what will we lose, apart from O(1) size?
The gains: 1. Christian Maeder and others report a significant performance improvement. This is the main motivating factor.
Indeed, the gain is significant (at least 5% lookup speedup!). Adrian Hey simply proved to me that AVL trees (with his clever constructor choice without an explicit height) are a better data structure than Adams's "Efficient sets: a balancing act". And it's sort of regrettable that the old FiniteMap and the current Data.Map are based on the latter (until someone finds a speedup for Adams's sets)!
Well 5% is not so wonderful IMO :-) But I guess for Maps it ain't too bad either, considering the indirection handicap suffered by AVL (until we get type specialisation :-) What were your keys, btw?

I think the real problems with the Adams trees are..

1 - They don't do a very good job of balancing in pathological situations (sorted inserts), at least as currently implemented. Though I believe the algorithm does have a tweakable parameter, so maybe it could do better if it was tweaked.

2 - The Hedge algorithm appears to suck badly, if the number of comparisons needed to do union etc. is the benchmark. Even with cheap comparisons (Ints), AVL+divide&conquer is much faster (for Sets at least).

Regards -- Adrian Hey

Adrian Hey wrote:
What were your keys, btw?
data ShATerm = ShAAppl String [Int] [Int]
             | ShAList [Int] [Int]
             | ShAInt Integer [Int]
  deriving (Eq, Ord)

(where the last list is always empty)

and something (with short strings) like:

data Id = Id [String] [Id]

and I used Int keys a lot for the comparison (where Data.IntMap is always faster). Cheers Christian

On Wednesday 01 Mar 2006 4:01 pm, Christian Maeder wrote:
Adrian Hey wrote:
What were your keys, btw?
data ShATerm = ShAAppl String [Int] [Int]
| ShAList [Int] [Int] | ShAInt Integer [Int]
deriving (Eq, Ord)
(where the last list is always empty)
and something (with short strings) like:
data Id = Id [String] [Id]
I think this sort of thing is good illustration of why I don't think the indirection overhead of implementing Maps as AVL trees of association pairs will matter much in practice. Chances are that it will only be used in circumstances where indirection cost is trivial compared to comparison costs. There's also a small saving in heap burn rate too (1 less field per tree node). It also illustrates why I'm now quite sceptical about the use of any balanced tree implementation of sets and maps. It should only really be used as a default first implementation of JPB's API which will work for any instance of Ord, but it will never be very efficient for non-trivial Key types IMO. I think what we should be aiming for in the long term is automatically derived Generalised Tries (as discussed recently). Regards -- Adrian Hey
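For list keys, the trie that generalised-trie derivation would produce looks roughly like this hand-written sketch (compare the Data.StringMap link above; all names here are made up):

```haskell
import qualified Data.Map as M

-- A trie keyed by lists: an optional value at the current node,
-- plus a map from first elements to sub-tries for the tails.
data ListMap k v = ListMap (Maybe v) (M.Map k (ListMap k v))

emptyLM :: ListMap k v
emptyLM = ListMap Nothing M.empty

lookupLM :: Ord k => [k] -> ListMap k v -> Maybe v
lookupLM []     (ListMap v _) = v
lookupLM (k:ks) (ListMap _ m) = M.lookup k m >>= lookupLM ks

insertLM :: Ord k => [k] -> v -> ListMap k v -> ListMap k v
insertLM []     v (ListMap _ m)  = ListMap (Just v) m
insertLM (k:ks) v (ListMap mv m) = ListMap mv (M.insert k sub m)
  where sub = insertLM ks v (M.findWithDefault emptyLM k m)
```

An optimised representation would want to collapse the long chains of singleton maps such tries tend to contain.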

On Thu, Mar 02, 2006 at 09:13:11PM +0000, Adrian Hey wrote:
I think what we should be aiming for in the long term is automatically derived Generalised Tries (as discussed recently).
Would it be possible to mock up an implementation of this using Data.Generics by chance? so we can use a generalized trie for anything in Data and Ord? I find the idea of a derived generalized trie quite intriguing. would associated types be the best way to express such a thing (so you can use 'Trie a' everywhere as a type rather than some complicated class predicate). <noise> once we get derived generalized tries, I want to try derived generalized huffman (arithmetic?) encoding, derived generalized LZ*, and a derived generalized efficient 'diff' algorithm... </noise> John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham writes:
On Thu, Mar 02, 2006 at 09:13:11PM +0000, Adrian Hey wrote:
I think what we should be aiming for in the long term is automatically derived Generalised Tries (as discussed recently).
Would it be possible to mock up an implementation of this using Data.Generics by chance? so we can use a generalized tree for anything in Data and Ord? I find the idea of a derived generalized trie quite intriguing. would associated types be the best way to express such a thing (so you can use 'Trie a' everywhere as a type rather than some complicated class predicate).
A while back, I put together an implementation of generic tries based on
Ralf Hinze's "Generics for the Masses". This particular technique allows
you to write M functions for N types with M+N instances, instead of M*N.
It's now on-line at http://www.eyrie.org/~zednenem/2006/gentrie/. It's
a darcs repository, so "darcs get" will work. There are two versions
(corresponding to associated type synonyms and associated data types),
plus an alternate interface that hides the map's type.
--
David Menendez

Hello John,

Friday, March 3, 2006, 1:10:17 AM, you wrote:

JM> Would it be possible to mock up an implementation of this using
JM> Data.Generics by chance? so we can use a generalized trie for anything

Are you aware that Data.Generics has run-time overheads and can therefore be several times slower than a fixed algorithm, or an algorithm generated at compile time (for example, by TH)?

JM> <noise>
JM> once we get derived generalized tries, I want to try derived
JM> generalized huffman (arithmetic?) encoding, derived generalized LZ*, and
JM> a derived generalized efficient 'diff' algorithm...
JM> </noise>

and this will be 1000 times slower than C implementations

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Fri, Mar 03, 2006 at 11:34:58AM +0300, Bulat Ziganshin wrote:
Friday, March 3, 2006, 1:10:17 AM, you wrote:
JM> Would it be possible to mock up an implementation of this using JM> Data.Generics by chance? so we can use a generalized tree for anything
Are you aware that Data.Generics has run-time overheads and can therefore be several times slower than a fixed algorithm, or an algorithm generated at compile time (for example, by TH)?
JM> <noise> JM> once get get derived generalized tries, I want to try derived JM> generalized huffman (arithmetic?) encoding, derived generalized LZ*, and JM> a derived generalized efficient 'diff' algorithm... JM> </noise>
and this will be 1000 times slower than C implementations
Oh yes. I am completely aware of that. I just find the algorithms fascinating. DrIFT would be the obvious real way to go for haskell (or TH if you are into that sort of thing) or perhaps c-generated haskell... -- John Meacham - ⑆repetae.net⑆john⑈

On Thursday 02 Mar 2006 10:10 pm, John Meacham wrote:
On Thu, Mar 02, 2006 at 09:13:11PM +0000, Adrian Hey wrote:
I think what we should be aiming for in the long term is automatically derived Generalised Tries (as discussed recently).
Would it be possible to mock up an implementation of this using Data.Generics by chance? so we can use a generalized tree for anything in Data and Ord? I find the idea of a derived generalized trie quite intriguing. would associated types be the best way to express such a thing (so you can use 'Trie a' everywhere as a type rather than some complicated class predicate).
I'm afraid I'm not sufficiently well versed in bleedin' edge type theory and type system extensions to comment. But I would say that my preference would be for some compile time solution, rather than Data.Generics (because of the runtime cost which I suspect this may involve). This exercise is all about performance after all, so I don't want to throw any away unnecessarily. I think your earlier idea of using DrIFT seemed good, though I'm not sure what you can do with it or whether it needs modifying in any way. (For generalised tries we're deriving new types and then deriving new class instances of the newly derived types.)

Since we last discussed this I've been thinking about it and have a few points I'd like to mention if you'd like to give this a shot..

Regarding efficiency, I think that in these nested Maps of Maps of Maps.. it's likely that the overwhelming majority of these Maps are singletons and also that you will get long singleton chains. Also for the Map tuples used for algebraic data type constructors, it's likely that most of the Maps in the tuple are empty. So there's probably some optimised representation you could use to exploit these facts. A good test would be to see if the derivation produced a Trie implementation for Lists that was as efficient as a handmade ListMap using Tries, maybe similar to.. http://homepages.nildram.co.uk/~ahey/HLibs/Data.StringMap/

Regarding the role of the Ord class in all this, I think we should still insist that Key types are instances of Ord (and that the Trie implementation is consistent with that instance) for several reasons..

1 - Compatibility with balanced tree implementations.
2 - To give meaningful ordering for functions like elemsAscending etc..
3 - To enable a standard solution to the "Set of Sets" problem that doesn't require deriving yet another generalised Trie type for the original generalised Trie type, if you see what I mean (more about this later).

For derived instances of Ord this probably isn't too hard. It seems somewhat more difficult for hand written instances (e.g. one that used a different ordering from the default for performance reasons).

Regarding the "Set of Sets" problem, if the element type is an instance of Ord then we can uniquely represent a Set as a sorted list (obtained via elemsAscending or similar). So we could just use a general Trie based ListMap(Set) implementation for these. All that's needed to implement this is an efficient Map implementation for the element type (which has already been derived).

This all seems doable in practice, though no doubt there's some gotcha which I've overlooked. I have no plans to do this in the near future but it would be really good if someone else wanted to give it a go (someone such as yourself for example :-). Regards -- Adrian Hey
participants (11)

- Adrian Hey
- Bulat Ziganshin
- Christian Maeder
- David Menendez
- Jan-Willem Maessen
- Jean-Philippe Bernardy
- John Meacham
- Robert Dockins
- Ross Paterson
- Sebastian Sylvan
- Udo Stenzel