Re: [Haskell-cafe] Simple hash table creation

17 Nov 2009

      So, what you're telling me is, my dummy hash function IS being used, and because of collisions the other values are placed in different locations?

Michael

--- On Tue, 11/17/09, Brad Larsen  wrote:

From: Brad Larsen 
Subject: Re: [Haskell-cafe] Simple hash table creation
To: "michael rice" 
Cc: "Gregory Crosswhite" , haskell-cafe@haskell.org
Date: Tuesday, November 17, 2009, 4:17 PM

On Tue, Nov 17, 2009 at 4:00 PM, michael rice  wrote:
...
Hi Gregory,
I was wondering about that, because of the following:
[1 of 1] Compiling Main             ( hash1.hs, interpreted )
Ok, modules loaded: Main.
*Main> ht <- new (==) dummy :: IO MyHashTable
*Main> dummy "mike"
7
*Main> dummy "michael"
7
*Main> insert ht "mike" 1
*Main> toList ht
[("mike",1)]
*Main> insert ht "michael" 2
*Main> toList ht
[("michael",2),("mike",1)]
*Main> insert ht "miguel" 3
*Main> toList ht
[("miguel",3),("michael",2),("mike",1)]
*Main> :t dummy "miguel"
dummy "miguel" :: Int32
*Main>
It seems my dummy function is being ignored. I figured I would only be able to store a single value with a hash function that always returns 7. Why ask for a hash function and not use it?
[...]

Most hash tables deal with collisions, i.e. the case when two keys
stored in the table hash to the same value.  In the case of your
'dummy' hash function, which always returns 7, every key hashes to the
same value, hence collisions galore.

One way to deal with collisions is to hash a key to a bucket (i.e.
list) of items, then walk down the list looking for the given key.  In
such an implementation (and I believe for hash tables in general), the
quality of the hash function greatly affects the performance of the
hash table operations.

I am not sure what implementation Data.HashTable uses.  However, I
believe Data.HashTable is not all that good:  for multi-million
element tables from Int to Int, Data.IntMap runs many times faster
than Data.HashTable.  I have no wish to start a flame war here (this
topic has in the past), but the state of affairs regarding hash tables
in Haskell is not good.

Sincerely,
Brad