
On Dec 29, 2005, at 7:44 PM, Branimir Maksimovic wrote:
To comment some observation on this program. Most of the pressure now is on Data.HashTable. I've susspected such large memory usage on substring from array conversions, so mad version with data MyString = MakeMyStrinf { buf :: Ptr Char, size :: Int } and there was no substrings in map or anywhere else, but memory consumption remains. So after eliminating inserts and updates into HashTable memory was ok. Finally I've realized that updates into hash table actually increase memory usage to large extent instead to keep memory same on average. So I guess this is bug in HashTable?
Probably. The minimum table chunk size was rather large. I have been experimenting (tests are running even as I type) with alternate implementations of Data.HashTable. So far the winning implementation is one based on multiplicative hashing and simple table doubling. This seems to be consistently competitive with / faster than Data.Map. At this point my inclination is to make the whole suite available: Data.HashTable.Class Data.HashTable.DataMap Data.HashTable.Doubling Data.HashTable.Flat Data.HashTable.Freeze Data.HashTable.Modulus Data.HashTable.Multiplicative Data.HashTable.Original Data.HashTable.Test Data.HashTable I've separated out the hashing technique (Modulus, Multiplicative) from the hash table implementation. Note that a few of the above are bogus, and this is only a fraction of the implementations tried thus far. I need to get distribution clearance for a bunch of this code from my employer, and figure out how to package it. The latter may be tricky, as Data.Hashtable is currently rather intimately tied to a bunch of rather tricky bits of GHC. -Jan-Willem Maessen
Second in this test, hash function needs to be very strong, as even with following I got longest chain of 16 elements. myHashString = fromIntegral . ff''' . ff'' . ff' . foldr f 0 where f c m = f'' $ f' (ord c + m) f' m = m + (m `shiftL` 10) f'' m = m `xor` (m `shiftR` 6) ff' m = m + (m `shiftL` 3) ff'' m = m `xor` (m `shiftR` 11) ff''' m = m + (m `shiftL` 15) Default hashString has longestChain of 18 elements. Perhaps someone can explain if such a memory leaks from HashTable updates are normal or are bugs? All in all functional version consumes less memory and is twice as fast.
Greetings, Bane.
_________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe