
On Thu, Mar 21, 2013 at 2:26 PM, Isaac Dupree <ml@isaac.cedarswampstudios.org> wrote:
Regarding hashWithSalt determinism:
hashable 1.1: "The general contract of hash is: * This integer need not remain consistent from one execution of an application to another execution of the same application. [...] The contract for hashWithSalt is the same as for hash, with the additional requirement that any instance that defines hashWithSalt must make use of the salt in its implementation." [1]
hashable 1.2: "The general contract of hashWithSalt is: * If a value is hashed using the same salt during distinct runs of an application, the result must remain the same. (This is necessary to make it possible to store hashes on persistent media.) [...]"
Which contract do we want?
When I wrote the 1.1 contract I gave myself lots of leeway to change the implementation in the future. The actual implementation was that the hash was always the same from run to run for a given architecture.

I'm not terribly happy with the "This is necessary to make it possible to store hashes on persistent media." part of the 1.2 contract. I should probably not have let that go in. If you're persisting your hashes you should use a hash function that guarantees exactly which algorithm is used.

I think the contract should be: the hash function is guaranteed to return the same hash code for a given value as long as the code is compiled with the same version of hashable, unless the user explicitly turns on hash randomization (i.e. random seed read from /dev/urandom). I don't think we should make any guarantees that a new version of hashable won't change the hash function used.

As for word sizes, the only practical thing is to use the native word size, as anything else is much too slow (e.g. Int64 is terribly slow on 32-bit platforms).

-- Johan
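
To illustrate the point about persisted hashes, here is a minimal sketch of a hash whose algorithm is pinned down explicitly, so that stored values do not depend on the hashable version or on the native word size. It uses 64-bit FNV-1a purely as an example; this is not what hashable does, and the names stableHash64 and asciiBytes are hypothetical helpers made up for this sketch.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word8, Word64)

-- Constants fixed by the 64-bit FNV-1a definition.
fnvOffsetBasis :: Word64
fnvOffsetBasis = 0xcbf29ce484222325

fnvPrime :: Word64
fnvPrime = 0x100000001b3

-- Hypothetical helper: 64-bit FNV-1a over a sequence of bytes.
-- Word64 arithmetic wraps modulo 2^64, which is what FNV requires,
-- and the result width is the same on 32-bit and 64-bit platforms.
stableHash64 :: [Word8] -> Word64
stableHash64 = foldl' step fnvOffsetBasis
  where
    step h b = (h `xor` fromIntegral b) * fnvPrime

-- Hypothetical helper, for illustration only: assumes the input is ASCII.
-- Code points above 255 are silently truncated, so real code should
-- encode the text to bytes (e.g. UTF-8) explicitly instead.
asciiBytes :: String -> [Word8]
asciiBytes = map (fromIntegral . ord)
```

The trade-off is the one mentioned above: a fixed 64-bit hash is safe to write to disk, but it gives up the speed of a native-word hash on 32-bit platforms, which is why it makes sense to keep it separate from the in-memory hash used for hash tables.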