
Oh, I just realised that this proposal is to include the older version
of hashable. In principle, I'm not against that, but I do wonder what
the upgrade path is. I don't think the performance problems can be
fixed in general -- that's just the price of security. So it becomes
critical what the upgrade path looks like. Do users get a slowdown of
2x by default and then have to manually make it faster again if
something is not security sensitive? Do users have to explicitly opt
in for security (a bad default, IMO)? Do we have any idea how that
switch may affect the API?
On 19 March 2013 16:51, Johan Tibell wrote:
Hi Thomas,
On Tue, Mar 19, 2013 at 8:41 AM, Thomas Schilling wrote:
On 19 March 2013 16:01, Johan Tibell wrote:
http://trac.haskell.org/haskell-platform/wiki/Proposals/unordered-containers
The links to the repos are wrong. It should be "tibbe" instead of "tibbel".
Fixed.
Bryan's recent change to make "hashable" use SipHash is certainly the right default. There were some complaints about performance for use cases where security is not an issue. What are the options for users who wish to use a different hash function? According to the paper, SipHash is about 2x slower than CityHash.
2x is *a lot*. 2x is about the performance difference between Map and HashMap. Since the raison d'etre for HashMap is that it's faster than Map, if we saw a 2x slowdown in HashMap there would be little reason to use it.
For example, 'delete' for HashMap ByteString got almost 2x slower with hashable-1.2. Since 'delete' does more than just hashing, that means that SipHash is quite a bit more than 2x slower than the current (insecure) hash function. Another example: with GHC 7.6.2, HashMap String is almost unusably slow (5x slower than before). This is likely due to a GHC bug, but it's something we need to investigate. At the moment I don't encourage people to upgrade to hashable-1.2.0.5.
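To make the 'delete' reasoning concrete, here is a rough back-of-the-envelope sketch. If hashing accounts for a fraction f of the time spent in an operation and the whole operation slows down by a factor s, then the hash function itself must have slowed down by 1 + (s - 1) / f. The function name and the 50% figure below are assumptions chosen for illustration, not measurements from the benchmarks.

```haskell
-- Implied slowdown of the hash function alone, given:
--   f: fraction of the operation's original time spent hashing (0 < f <= 1)
--   s: observed slowdown of the whole operation
-- Derivation: s = k*f + (1 - f), solved for k.
impliedHashSlowdown :: Double -> Double -> Double
impliedHashSlowdown f s = 1 + (s - 1) / f

main :: IO ()
main =
  -- Assumed numbers: hashing is half of 'delete', and 'delete' got 2x slower.
  print (impliedHashSlowdown 0.5 2)  -- prints 3.0
```

So under that assumed 50% split, a 2x slowdown in 'delete' would correspond to a 3x slowdown in hashing alone.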
The right way to go is probably to make this a user decision. Many applications (e.g. data processing) have no need for the security guarantee, so paying for it makes little sense.
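One way a user could make that decision today, without any API change, is a newtype wrapper with its own Hashable instance. The following is only a minimal sketch: FastKey and fnv1a are hypothetical names, the hash is a simple FNV-1a-style function that offers no protection against collision attacks, and it assumes the hashable, unordered-containers and bytestring packages are installed.

```haskell
import Data.Bits (xor)
import qualified Data.ByteString.Char8 as B
import Data.Hashable (Hashable (..))
import qualified Data.HashMap.Strict as HM

-- Hypothetical opt-out wrapper: keys of this type bypass the library's
-- default hash for ByteString and use the cheap hash below instead.
newtype FastKey = FastKey B.ByteString
  deriving (Eq, Show)

-- FNV-1a-style hash (illustrative only, NOT collision resistant).
-- The salt is used as the initial state.
fnv1a :: Int -> B.ByteString -> Int
fnv1a = B.foldl' step
  where
    step h c = (h `xor` fromEnum c) * 16777619

instance Hashable FastKey where
  hashWithSalt salt (FastKey bs) = fnv1a salt bs

-- Plain ByteString keys keep whatever default the installed hashable uses.
secureMap :: HM.HashMap B.ByteString Int
secureMap = HM.fromList [(B.pack "a", 1), (B.pack "b", 2)]

-- FastKey keys explicitly opt out of the default.
fastMap :: HM.HashMap FastKey Int
fastMap = HM.fromList [(FastKey (B.pack "a"), 1), (FastKey (B.pack "b"), 2)]

main :: IO ()
main = do
  print (HM.lookup (B.pack "b") secureMap)            -- prints Just 2
  print (HM.lookup (FastKey (B.pack "b")) fastMap)    -- prints Just 2
```

With this shape the secure hash stays the default and the faster hash is an explicit, per-key-type opt-out. Whether the library should offer something like this itself is exactly the open question.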
Cheers, Johan