On Wednesday 11 Jan 2006 7:09 am, Adrian Hey wrote:
The FGL implementation seems to do the same thing. So I guess lookup performance might not be too good either, at least for non-trivial comparisons.
Well, speaking of "non-trivial comparisons" reminds me of something that's been worrying me about this whole approach to implementing Sets and Maps. I.E. Requiring that element/key types are instances of Ord and assuming that lookup etc is achieved by performing some kind of binary search. This seems wrong in general. E.G. A trie based implementation of ListSet/ListMap does not require that lists are instances of Ord. Furthermore, a trie based implementation for Lists is likely to be very much more efficient than the proposed polymorphic Sets/Maps based on balanced binary trees, because it avoids a lot of repeated comparison on the same list element. What I really want is Sets/Maps that work efficiently where the element/keys can be arbitrarily complex data structures, so comparison is non-trivial and typically quite expensive. (I guess we need to restrict "arbitrarily complex" to exclude cyclic data structures.) So couldn't we generalise the idea of tries to work with any data type? Suppose I wanted a Set or Map of pairs. The obvious implementation (using current polymorphic Sets/Maps) would be.. -- Set of pairs (a,b) newtype Set2 a b = Set2 (Map a (Set b)) -- Map of pairs (a,b) to values v newtype Map2 a b v = Map2 (Map a (Map b v)) .. and for triples .. -- Set of triples (a,b,c) newtype Set3 a b c = Set3 (Map a (Set2 b c)) -- or perhaps .. newtype Set3 a b c = Set3 (Map2 a b (Set c)) -- Map of triples (a,b,c) to values v newtype Map3 a b c v = Map3 (Map a (Map2 b c v)) -- or perhaps .. newtype Map3 a b c v = Map3 (Map2 a b (Map c v)) This can obviously be extended for any product type or data constructor of arbitrary arity. For sum of product types with N distinct constructors, what's needed is an N-tuple of product Sets/Maps (one for each constructor), with nullary constructors just using a simple Bool/Maybe. So, assuming this scheme can be used to implement Sets/Maps for any product or algebraic data type, the only real implementation needed is something like IntSet/IntMap (which could be used for CharSet/CharMap etc too). We'd probably want a hand generated implementation for Lists using "proper" tries, but other than these cases couldn't all this stuff just be derived by the compiler? (would be nice if it could). The only problem I see here is the structural equality problem we've discussed at some length recently. The above scheme assumes structural equality. If this isn't what's wanted then I guess we'd have to rely on hand written instances (as is the case with Eq/Ord). This also highlights another reason why IMO specifying "biasing" is bad, because doing this implicitly assumes that Sets/Maps somehow contain concrete structural representations of elements/keys. This is not necessarily so. It's not true with any implementation like that described above (or even simple tries). Hmm.., just thought of something else. Although this scheme does not require types to be instances of Ord, it would be nice if derived or hand written instances of Set/Map were consistent with any Ord instances, so we could have meaningful ordering with functions like elemsAscending, foldAscending etc.. elemsAscending :: (Set s e, Ord e) => s e -> [e] foldAscending :: (Set s e, Ord e) => (e -> a -> a) -> a -> s e -> a Any comments? Or maybe suggestions about how to use exotic ghc extensions (about which I understand little) to implement this? (I know we shouldn't be assuming ghc, in an ideal world :-) [P.S Now of course having written all this I decided this must surely have been done already by someone :-) Googling for "generalised tries" quickly took me to this paper by Ralf Hinze.. http://citeseer.ist.psu.edu/233124.html which in turn references Chris Okasakis book So why don't we implement something like this? (somehow:-) ] Regards -- Adrian Hey