
On 30 December 2005 01:23, Jan-Willem Maessen wrote:
Probably. The minimum table chunk size was rather large. I have been experimenting (tests are running even as I type) with alternate implementations of Data.HashTable. So far the winning implementation is one based on multiplicative hashing and simple table doubling. This seems to be consistently competitive with / faster than Data.Map. At this point my inclination is to make the whole suite available:
Data.HashTable.Class Data.HashTable.DataMap Data.HashTable.Doubling Data.HashTable.Flat Data.HashTable.Freeze Data.HashTable.Modulus Data.HashTable.Multiplicative Data.HashTable.Original Data.HashTable.Test Data.HashTable
I've separated out the hashing technique (Modulus, Multiplicative) from the hash table implementation. Note that a few of the above are bogus, and this is only a fraction of the implementations tried thus far.
I need to get distribution clearance for a bunch of this code from my employer, and figure out how to package it. The latter may be tricky, as Data.Hashtable is currently rather intimately tied to a bunch of rather tricky bits of GHC.
I wonder if you could put together a drop-in replacement for Data.HashTable that we can incorporate? There's not much point in us still providing the inefficient version as standard after you've done all this work to figure out how to do it better. Cheers, Simon