
G'day all.
Quoting Peter Simons:

> If you don't mind using FFI, the tool of choice would probably be
> http://www.gnu.org/software/gperf/.
Perfect hash functions are actually not that much better than "imperfect" hash functions for the case where you have keys to search for which are not in the set.

"Imperfect" hashing requires that a key be scanned at least twice: once to compute the hash and once for the final equality test. The equality test may need to be performed more than once if two keys hash to the same value. A good rule of thumb is that hash collisions are unlikely until you reach around sqrt(N) keys, where N is the size of the hash space. So, for example, for 32-bit hash values, you almost certainly won't get a collision unless you insert 65,000 or so keys, which is much more than Hal's 1,300 or so.

In the absence of hash collisions, "perfect" hashing is pretty much the same. The only differences are: a) the hash function is more expensive to compute, and b) the equality test is guaranteed to happen at most once.

Perfect hashing is best when you know that you're not going to search for keys that are not in the set. For example, an algorithm which requires two passes over the words in some set of documents could easily benefit from perfect hashing, since the words that you'll find in the second pass will only be those found in the first pass.

Radix-based searching, on the other hand, requires only one pass through the key and no arithmetic.

Cheers,
Andrew Bromage
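[Editor's illustration] The hash-then-compare pattern described above can be sketched in a few lines of Haskell. This is my own minimal sketch, not gperf output or anything from Hal's code: a djb2-style hash (an assumed choice) scans the key once to pick a bucket, and the final `(==)` scan resolves any collisions within that bucket.

```haskell
import Data.Array (Array, accumArray, (!))
import Data.Char (ord)
import Data.Word (Word32)

-- First scan of the key: a djb2-style hash, wrapping in Word32.
hashStr :: String -> Word32
hashStr = foldl (\h c -> h * 33 + fromIntegral (ord c)) 5381

tableSize :: Int
tableSize = 64  -- illustrative size; a real table would be sized to the key set

slot :: String -> Int
slot k = fromIntegral (hashStr k) `mod` tableSize

-- Scatter the keys into buckets; keys that hash alike share a bucket.
buildTable :: [String] -> Array Int [String]
buildTable keys = accumArray (flip (:)) [] (0, tableSize - 1)
                    [ (slot k, k) | k <- keys ]

-- Second scan of the key: the final equality test, performed more than
-- once only when the bucket holds colliding keys.
member :: Array Int [String] -> String -> Bool
member table k = k `elem` (table ! slot k)
```

A lookup for a key not in the set still pays for the hash scan plus at most a few equality scans, which is why "imperfect" hashing holds up well in that case.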
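[Editor's illustration] For contrast, radix-based searching can be sketched as a simple character trie. Again this is only an illustrative sketch, not a tuned implementation: membership is decided in one left-to-right pass over the key, following one child edge per character, with no hash arithmetic.

```haskell
import qualified Data.Map.Strict as M

-- A trie node: whether the path to this node spells a complete key,
-- plus one child edge per next character.
data Trie = Trie Bool (M.Map Char Trie)

empty :: Trie
empty = Trie False M.empty

insert :: String -> Trie -> Trie
insert []     (Trie _ cs) = Trie True cs
insert (c:cs) (Trie b m)  =
  Trie b (M.insert c (insert cs (M.findWithDefault empty c m)) m)

-- One pass through the key and no arithmetic: each character just
-- selects the next edge, or fails the search immediately.
member :: String -> Trie -> Bool
member []     (Trie b _) = b
member (c:cs) (Trie _ m) = maybe False (member cs) (M.lookup c m)
```

A search for an absent key can bail out as soon as a character has no matching edge, often without scanning the whole key.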