
On Mon, 2007-06-18 at 19:35 +1000, Thomas Conway wrote:
On 6/18/07, apfelmus
wrote: Do you need the hash function for a hash table or for fingerprints/signatures? In the former case, Tries are a much better choice. For launching your own trie, see also
I'm actually using them for bucket addressing for external indexing with a linear hash table. (Yes, the hashing does count, because buckets are cached in memory.)
Actually, if one wants a concurrent dictionary, using something in the vein of
type HashTable k v = TVar (Array Int (TVar [(k,v)]))
has very good performance. It always seems something of a shame that if you want all the benefits of lazy functional programming, you too often have to settle for O(n log n) data structures.
<non-haskell-slight-rant> Incidentally, while I've got your attention, I note that if you use a good quality hash function like the one I ripped off, you don't need to use [mostly] prime numbers for sizing your hash tables, and you can use powers of two instead, which simplifies a bunch of things. This is kind of obvious when you think about it, but every hash function I came across as an undergraduate or even as a post-grad, with the exception of md5 et al, was not good. The dogma was that *good* hash functions are too expensive for everyday use. So a word of advice to all you worthy tertiary educators - this is not the 1970s any more - good, cheap hash functions do exist, so update your course notes. :-) </non-haskell-slight-rant>
Indeed, "Performance in Practice of String Hashing Functions" http://citeseer.ist.psu.edu/530453.html