Re: RFC: Data.StringMap

18 Aug 2005

      On Thursday 18 Aug 2005 8:27 am, Axel Simon wrote:
...
On Thu, 2005-08-18 at 08:25 +0100, Adrian Hey wrote:
...
Also, I'd be grateful someone could tell me what the
data structure I've used is actually called :-)
AFAIK, what you've implemented is called a trie, pronounced "try" from
retrieval.
Thanks, I guessed it is some form of trie but I'm a bit confused about
the precise definitions of various very similar data structures. I was
looking at..
 http://www.cs.mcgill.ca/~cs251/OldCourses/1997/topic7/

I'm not entirly clear about the difference between "tries", "compact tries"
(depends what you call compact AFAICS) and "PATRICIA tries", and is a PATRICIA
trie the same thing as a "PATRICIA tree"?

The TString type is basically a linked list which may or may not be terminated
with an N way fork (in the form of an AVL tree, N >=2), but then strings are
linked lists anyway in Haskell. So is this compact or not I wonder?

It's compact in the sense that it doesn't require an AVL tree at every
character (most of which would be singletons in practice). I guess you
could achieve the same thing with this definition..

data TString v = TString {-# UNPACK #-} !Char String v !(AVL (TString v))

.. which might look more compact, but isn't really AFAICS (well not while
Strings are linked lists of boxed Chars).

Replacing the String with a packed string might be better, but I'm not
sure how efficient those are in reality (and if I was to go that route
it might be a lot simpler and more efficient to use a completely different
approach, such as hashing and a pure AVL tree say).

Actually, now I think of it I will try that and compare performance of the
two for simple inserts and lookups. (If it's fast enough it will certainly
be much simpler to implement).

Regards
--
Adrian Hey