
Johan Tibell wrote:
* Instances of Hashable for common types, in particular ByteString and Text.
And of course, all of the numeric types, unit, and Bool, Maybe, and Either. More interesting are Maybe, Either, lists, tuples, Data.Map, Data.Set, etc. These are far from trivial; I did some work on it for Python. If you think you can just do something fairly simple and reasonable and hope that it will have the right statistical properties, I promise you that it won't work. Unless someone knows of a better methodology, you need to develop a battery of QuickCheck-like tests to build complexity into data structures in various ways, then use trial and error to come up with an algorithm that gives results within some tolerance for all of the tests while still being very fast.
I already have fast instances for Text and ByteString, based on MurmurHash2 [2].
Ah, I see from that site that they used pretty much the same kinds of techniques. They actually used chi-square, I didn't get that fancy. Theoretically, if the hashes for Maybe and Either are good enough, all the other container hashes could be built out of those. But I doubt that would be very easy. Anyway, the main work is building all the tests, which you would have to do in any case. Regards, Yitz