
Hello,

This post is about an issue tangential to the shootout, not the shootout itself: the lack of a fast mutable hashtable. If you have a good replacement for Data.Hashtable, I would be interested to know about it (and so would the wiki!).

The shootout entry "k-nucleotide" is a good test of the hashtable provided by each of the languages: http://shootout.alioth.debian.org/gp4/benchmark.php?test=knucleotide&lang=all

Several of us are building a proposed entry on the wiki at http://haskell.org/hawiki/KnucleotideEntry where I have just posted a plea, which I will paste below:

On mutable data structures in Haskell

This benchmark is completely bottlenecked by Data.Hashtable (in GHC 6.4.1), and this discussion is based on the assumption that I am using Hashtable optimally.

I downloaded the GDC 0.17 compiler and the DMD entry to benchmark on my machine. The DMD entry uses the "associative arrays" built into the language ("int[char[]] frequencies") and places second (runs in 3.0s). The winning entry is interesting, since the C code does not have a hash table; it uses #include "../../Include/simple_hash.h", which pulls in a dead simple, inline, string-to-int hashtable, and runs in 2s.

The entry below runs 16 times slower than the DMD entry on my Powerbook G4. Profiling shows the bottleneck. I downloaded simple_hash.h and shamelessly optimized it to replace Data.Hashtable for exactly this benchmark code. This sped up the proposed entry by a factor of 4.1, so that it is now about a factor of 4 slower than the DMD entry. This shows that Data.Hashtable is doing *at least* four times more work than is needed for this usage pattern. And even with my over-specialized hash table, Haskell is almost 50% slower than OCaml's "module H = Hashtbl.Make(S)" (note that my code uses the same hash function as OCaml). Unfortunately I cannot submit this optimized hash table entry to the shootout.
The only mutable data structure that comes with GHC besides arrays is Data.Hashtable, which is not competitive with OCaml's Hashtbl or DMD's associative arrays (unless there is something great hidden under Data.Graph). Is there any hope for GHC 6.6? Does anyone have pointers to an existing library at all? Perl, Python, and Lua also have excellent built-in hashtable capabilities. Where is a good library for Haskell?

-- Chris Kuklewicz
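For readers following along: the core of the benchmark is a histogram — insert each key, or increment its count if it is already present. A minimal sketch of that usage pattern, using the pure Data.Map here only so the example compiles anywhere; the Data.HashTable version does the same thing with its new/insert/update operations in IO. This is an illustration, not the benchmark entry itself:

```haskell
-- Sketch of the k-nucleotide usage pattern: a pure histogram.
-- Data.Map stands in for a mutable hash table in this example.
import qualified Data.Map as M
import Data.List (foldl')

histogram :: [String] -> M.Map String Int
histogram = foldl' (\m k -> M.insertWith (+) k 1 m) M.empty

main :: IO ()
main = print (M.toList (histogram ["acgt", "acgt", "ttta"]))
-- prints [("acgt",2),("ttta",1)]
```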

Hello Chris,

Monday, January 23, 2006, 12:27:53 PM, you wrote:

CK> The only mutable data structure that comes with GHC besides arrays is
CK> Data.Hashtable, which is not competitive with OCaml's Hashtbl or DMD's
CK> associative arrays (unless there is something great hidden under
CK> Data.Graph). Is there any hope for GHC 6.6? Does anyone have pointers to
CK> an existing library at all? Perl, Python, and Lua also have excellent
CK> built-in hashtable capabilities. Where is a good library for Haskell?

1) Have you used "+RTS -A10m" / "+RTS -H100m"?

2) Simon Marlow optimized something in the IOArray handling, but I don't understand what was changed. See http://cvs.haskell.org/trac/ghc/ticket/650

-- Best regards, Bulat mailto:bulatz@HotPOP.com

Bulat Ziganshin wrote:
Hello Chris,
Monday, January 23, 2006, 12:27:53 PM, you wrote:
CK> The only mutable data structure that comes with GHC besides arrays is
CK> Data.Hashtable, which is not competitive with OCaml's Hashtbl or DMD's
CK> associative arrays (unless there is something great hidden under
CK> Data.Graph). Is there any hope for GHC 6.6? Does anyone have pointers to
CK> an existing library at all? Perl, Python, and Lua also have excellent
CK> built-in hashtable capabilities. Where is a good library for Haskell?

1) Have you used "+RTS -A10m" / "+RTS -H100m"?

2) Simon Marlow optimized something in the IOArray handling, but I don't understand what was changed. See http://cvs.haskell.org/trac/ghc/ticket/650
Much of the GC overhead of Data.HashTable should be gone in GHC 6.6. I also have an improved implementation of Data.HashTable from Jan-Willem Maessen to import, but that will be 6.6 rather than 6.4.2.

Cheers, Simon

Simon Marlow wrote:
Bulat Ziganshin wrote:
Hello Chris,
Monday, January 23, 2006, 12:27:53 PM, you wrote:
CK> The only mutable data structure that comes with GHC besides arrays is
CK> Data.Hashtable, which is not competitive with OCaml's Hashtbl or DMD's
CK> associative arrays (unless there is something great hidden under
CK> Data.Graph). Is there any hope for GHC 6.6? Does anyone have pointers to
CK> an existing library at all? Perl, Python, and Lua also have excellent
CK> built-in hashtable capabilities. Where is a good library for Haskell?

1) Have you used "+RTS -A10m" / "+RTS -H100m"?

2) Simon Marlow optimized something in the IOArray handling, but I don't understand what was changed. See http://cvs.haskell.org/trac/ghc/ticket/650
Much of the GC overhead of Data.HashTable should be gone in GHC 6.6. I also have an improved implementation of Data.HashTable from Jan-Willem Maessen to import, but that will be 6.6 rather than 6.4.2.
Cheers, Simon
That is good to hear. The benchmark's tests take 1,250,000 pre-generated strings as the keys. At the end, the string keys are 18 characters long, drawn randomly from a set of 4 characters, so the hash computations are a nontrivial hit.

Using -A400m I get 39s, down from 55s. That is the best Data.HashTable time I have seen. (Using -A10m and -A100m were a little slower.) Using my over-optimized C-code hashtable I get 12.3 seconds. The associative arrays in OCaml and D are still faster. So you see why I long for GHC 6.6.

Is Jan-Willem Maessen's Hash available anywhere? I could benchmark it.

Hello Chris,

Monday, January 23, 2006, 6:09:15 PM, you wrote:

CK> Using -A400m I get 39s down from 55s. That is the best Data.HashTable time I
CK> have seen. (Using -A10m and -A100m were a little slower).

1) "-A400m" is a bit unusual. "-H400m" for a 500-meg machine, "-H800m" for a 1G machine, and so on, will be fastest. The current GHC docs lack explanations in this area, but basically -H just allocates that much space and then dynamically changes -A after each GC, allocating all available space to the generation-0 memory pool.

2) It's better to say what the MUT and GC times in your program were, and even better just to post its output with "+RTS -sstderr".

Please post improved results here. That's really interesting for me, and for my programs too ;)

-- Best regards, Bulat mailto:bulatz@HotPOP.com

Bulat Ziganshin wrote:
Hello Chris,
Monday, January 23, 2006, 6:09:15 PM, you wrote:
CK> Using -A400m I get 39s down from 55s. That is the best Data.HashTable time I
CK> have seen. (Using -A10m and -A100m were a little slower).

1) "-A400m" is a bit unusual. "-H400m" for a 500-meg machine, "-H800m" for a 1G machine, and so on, will be fastest. The current GHC docs lack explanations in this area, but basically -H just allocates that much space and then dynamically changes -A after each GC, allocating all available space to the generation-0 memory pool.

2) It's better to say what the MUT and GC times in your program were, and even better just to post its output with "+RTS -sstderr".

Please post improved results here. That's really interesting for me, and for my programs too ;)
Here is all the data you wanted. Running "./cekp +RTS -sstderr -RTS < kfile":

  6,292,511,740 bytes allocated in the heap
  1,641,755,092 bytes copied during GC
     38,233,484 bytes maximum residency (215 sample(s))

          24003 collections in generation 0 ( 27.23s)
            215 collections in generation 1 (  7.80s)

             82 Mb total memory in use

  INIT  time    0.00s  (  0.01s elapsed)
  MUT   time   71.32s  ( 78.99s elapsed)
  GC    time   35.03s  ( 39.19s elapsed)
  RP    time    0.00s  (  0.00s elapsed)
  PROF  time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time  106.35s  (118.19s elapsed)

  %GC time       32.9%  (33.2% elapsed)

  Alloc rate    88,229,272 bytes per MUT second

  Productivity  67.1% of total user, 60.3% of total elapsed

Is that 6 billion bytes allocated? Yes it is. So use -H400m: "./cekp +RTS -H400m -sstderr -RTS < kfile":

  6,293,156,400 bytes allocated in the heap
     99,679,428 bytes copied during GC
      8,742,464 bytes maximum residency (2 sample(s))

             18 collections in generation 0 (  2.84s)
              2 collections in generation 1 (  0.38s)

            392 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time   82.42s  (100.62s elapsed)
  GC    time    3.22s  (  3.93s elapsed)
  RP    time    0.00s  (  0.00s elapsed)
  PROF  time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   85.65s  (104.55s elapsed)

  %GC time        3.8%  (3.8% elapsed)

  Alloc rate    76,345,461 bytes per MUT second

  Productivity  96.2% of total user, 78.8% of total elapsed

So this is the small improvement, at the cost of turning off the GC. The Data.HashTable performance is the bottleneck, but it is not the GC's fault.
Using the way over-optimized hashtable in C code that I adapted, "./useit-tree +RTS -s -RTS < ../kfile":

  1,096,743,184 bytes allocated in the heap
    190,832,852 bytes copied during GC
     38,233,484 bytes maximum residency (9 sample(s))

           4183 collections in generation 0 (  1.39s)
              9 collections in generation 1 (  0.88s)

             82 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time   14.55s  ( 16.06s elapsed)
  GC    time    2.27s  (  2.87s elapsed)
  RP    time    0.00s  (  0.00s elapsed)
  PROF  time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   16.82s  ( 18.93s elapsed)

  %GC time       13.5%  (15.2% elapsed)

  Alloc rate    75,377,538 bytes per MUT second

  Productivity  86.5% of total user, 76.9% of total elapsed

Here I see only 1 billion bytes allocated, but I think that hides 100 to 200 million that are being allocated in the C code. Note that the total time has dropped by a factor of five over the hashtable without GC, a factor of 6 if I had let GC run. I can reduce the total time from 16.82s to 16.20s by adding -H here, but that is not called for. (The time without profiling is 12.5s.)

This benchmark stresses hash tables by adding 1,250,000 string keys, with the values being the count of the times the key was added. This is the common "make a histogram" usage, and never needs to delete from the hashtable. (The keys are all of fixed length, and their number is known.)

Looking at it as a black box, Data.HashTable is doing a tremendous amount of MUT work that dominates the run-time, taking many times longer than the Hashtbl that OCaml uses. And D's associative array is scary fast.

I have not yet tested the proposed Data.HashTable for GHC 6.6. Apparently I need to get a darcs checkout to avoid a bug in GHC 6.4.1 first. I don't expect a Haskell data structure to be as fast as the over-optimized C code. But I am very surprised that there is no mutable data structure that can compete with OCaml.

-- Chris
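Since the "make a histogram" pattern above never deletes and the number of keys is known up front, even a naive fixed-size chained table covers it. A hypothetical Haskell sketch using Data.Array.IO to make the pattern concrete; the bucket count and hash function are placeholders chosen for the example, not those of simple_hash.h or the benchmark entry:

```haskell
-- A deliberately simple mutable string->count table for the histogram
-- pattern: fixed bucket count, chaining, no deletion.  A hypothetical
-- sketch, not the optimized simple_hash.h port from the benchmark.
import Data.Array.IO
import Data.Char (ord)
import Data.List (foldl')

type Table = IOArray Int [(String, Int)]

buckets :: Int
buckets = 4096   -- placeholder size; the benchmark knows its key count

hash :: String -> Int
hash = (`mod` buckets) . foldl' (\h c -> h * 131 + ord c) 0

newTable :: IO Table
newTable = newArray (0, buckets - 1) []

-- Insert-or-increment: the only update operation the benchmark needs.
bump :: Table -> String -> IO ()
bump t k = do
  let i = hash k
  chain <- readArray t i
  writeArray t i (go chain)
  where
    go []                = [(k, 1)]
    go ((k', n) : rest)
      | k' == k          = (k', n + 1) : rest
      | otherwise        = (k', n) : go rest

count :: Table -> String -> IO Int
count t k = fmap (maybe 0 id . lookup k) (readArray t (hash k))

main :: IO ()
main = do
  t <- newTable
  mapM_ (bump t) ["acgt", "acgt", "ttta"]
  print =<< count t "acgt"   -- prints 2
```

The chains keep the code short; an open-addressing layout like simple_hash.h uses would avoid the per-entry list allocation, which is part of what the benchmark is measuring.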

Because Data.HashTable is tied rather specially into the internals of Data.Typeable (I think I'm getting that right), it's hard to just drop in a new module with the same name. But for those eager to tinker, here are two modules for simple hash tables, one by table doubling and one by multiplicative hashing. The interface is identical to Data.HashTable. When I get time (not for a couple of weeks, I fear) I'm planning to learn Cabal and cabalize these and a few other related hash table modules, all under Data.HashTable.* (including a module of tests and a unifying type class).

-Jan
... Is Jan-Willem Maessen's Hash available anywhere? I could benchmark it.
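Multiplicative hashing, which Jan mentions, fits in a few lines: multiply the key by a fixed constant and keep the top bits as the index into a table of 2^p slots (table doubling then just means bumping p and rehashing when the table fills). The constant and types below are the textbook choice and an assumption on my part, not what Jan's module actually uses:

```haskell
-- Knuth-style multiplicative hashing: index into a table of 2^p slots.
-- The multiplier is a prime near 2^32 * (sqrt 5 - 1) / 2, the classic
-- textbook choice; an assumption, not Jan's actual code.
import Data.Bits (shiftR)
import Data.Word (Word32)

multiplier :: Word32
multiplier = 2654435761

multHash :: Int -> Word32 -> Int
multHash p k = fromIntegral ((k * multiplier) `shiftR` (32 - p))

main :: IO ()
main = print (map (multHash 4) [1 .. 8])
```

The Word32 multiplication wraps modulo 2^32, which is exactly what the scheme wants; the shift then keeps only the high p bits, where the mixing is best.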
participants (5)

- Bulat Ziganshin
- Chris Kuklewicz
- Chris Kuklewicz
- Jan-Willem Maessen
- Simon Marlow