Proposal: Add function to create new hash tables with a default size

In other some standard libraries (Java, OCaml), hash tables can be created with a user-supplied size. This avoids some resizing for users who know they will be inserting a lot of data. Data.HashTable does not expose a function to do this. This proposal changes that. For a dictionary of 5 million strings, this patch saves me about 33% of the total execution time. For 10 million strings, this patch saves me 50% of the total execution time. http://hackage.haskell.org/trac/ghc/ticket/4193


Hello Jim, Thursday, July 15, 2010, 9:12:04 AM, you wrote:
In other some standard libraries (Java, OCaml), hash tables can be created with a user-supplied size.
+1. it's useful thing btw, if you build full hash before using it, you may use pure hashes. i've once published simple 10-line library that implements them over an array -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 15/07/2010 06:12, Jim Apple wrote:
In other some standard libraries (Java, OCaml), hash tables can be created with a user-supplied size. This avoids some resizing for users who know they will be inserting a lot of data.
Data.HashTable does not expose a function to do this. This proposal changes that.
For a dictionary of 5 million strings, this patch saves me about 33% of the total execution time. For 10 million strings, this patch saves me 50% of the total execution time.
+1 Simon

On 15 July 2010 06:12, Jim Apple
In other some standard libraries (Java, OCaml), hash tables can be created with a user-supplied size. This avoids some resizing for users who know they will be inserting a lot of data.
Data.HashTable does not expose a function to do this. This proposal changes that.
For a dictionary of 5 million strings, this patch saves me about 33% of the total execution time. For 10 million strings, this patch saves me 50% of the total execution time.
+1 Your round-up-to-next-power-of-2 code looks a bit weird and complicated. Why not use the standard bit-twiddling tricks? (It's not like it's polymorphic in Num.) http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2

I've set the state to patch; I don't have authority to commit, and the "Library submissions" wiki page doesn't say whether to give a bug a new owner in such a case. Jim
participants (4)
-
Bulat Ziganshin
-
Jim Apple
-
Simon Marlow
-
Thomas Schilling