On Thu, Nov 25, 2010 at 2:23 PM, Johan Tibell <johan.tibell@gmail.com> wrote:
Is this an interesting problem to optimize for?
I think so. The purpose of having the Hashable typeclass is so that the universe of typed values that we can hash is open. It thus behooves us to figure out how to do that safely. For instance, someone using the Hashable class should be able to figure out how to hash this:
data HttpUrl = HttpUrl {
urlScheme :: ByteString,
urlHost :: ByteString,
urlPort :: Int,
urlPath :: ByteString,
urlQuery :: Maybe ByteString
}
Using the Hashable class, I can see how to hash the individual elements of this, but not how to safely glue them together into a hash for the whole value.
Over in the Java world, where they depend to an arguably ridiculous degree on hashing, we often see horror stories of some hash function having bad dispersal properties (or worse, equality and hashing not matching) and causing nasty knock-on performance effects due to too many values hashing to the same few numbers in the range.
I notice that the Hashable class doesn't have a way to supply a seed. If it did, we could possibly chain the result of one hash as the seed of the next, allowing us to construct a complete hash in an obvious way (although introducing a possibly undesirable data dependency, making it difficult or impossible to hash a large structure in parallel).