
Although I'd prefer a secure-by-default implementation, this is a good
compromise. Then unordered-containers' docs should reference
SipHashed in a prominent place.
Cheers,
On Wed, Mar 20, 2013 at 2:02 PM, Thomas Schilling wrote:
To make this more precise the next version of hashable (say, 1.3) would include this:
newtype SipHashed a = SipHashed a
class SipHashable a where
  sipHashWithSalt :: Int -> a -> Int
instance SipHashable a => Hashable (SipHashed a) where
  hashWithSalt salt (SipHashed x) = sipHashWithSalt salt x
Then all Hashable instances are taken from hashable-1.1. All Hashable instances from hashable-1.2 are renamed to become instances of SipHashable.
Alternatively, hashable-1.2 could be renamed to hashable-sip-1.0 or so.
Note that I do not propose to make this change now. We include hashable-1.1 now, and the above will be the upgrade path to include secure hashing.
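A minimal sketch of how the proposed wrapper could be used, written against base only. Everything beyond SipHashed, SipHashable and sipHashWithSalt is an assumption made for illustration: the Hashable class is declared locally as a stand-in for the one in Data.Hashable, Name is a hypothetical key type, and both hash bodies are simple placeholders (the second one is NOT SipHash).

```haskell
import Data.Bits (shiftL, shiftR, xor)
import Data.Char (ord)
import Data.List (foldl')

-- Local stand-in for the class from Data.Hashable, so that the sketch
-- compiles with base alone.
class Hashable a where
  hashWithSalt :: Int -> a -> Int

-- The class and wrapper from the proposal.
class SipHashable a where
  sipHashWithSalt :: Int -> a -> Int

newtype SipHashed a = SipHashed a

-- Wrapping a value in SipHashed routes hashing to its SipHashable instance.
instance SipHashable a => Hashable (SipHashed a) where
  hashWithSalt salt (SipHashed x) = sipHashWithSalt salt x

-- Hypothetical key type used only for this example.
newtype Name = Name String

-- Fast default: an FNV-1-style fold (illustrative, not hashable's code).
instance Hashable Name where
  hashWithSalt salt (Name s) = foldl' step salt s
    where
      step h c = (h * 16777619) `xor` ord c

-- Placeholder for the slower, collision-resistant path. A real instance
-- would call an actual SipHash implementation here.
instance SipHashable Name where
  sipHashWithSalt salt (Name s) = foldl' step (salt `xor` 0x5bd1e995) s
    where
      step h c =
        let h' = h `xor` ord c
        in h' + (h' `shiftL` 5) + (h' `shiftR` 2)
```

With these definitions, hashWithSalt 0 (Name "key") takes the fast path, while hashWithSalt 0 (SipHashed (Name "key")) takes the SipHashable path, so the choice is made per key type at the use site rather than globally.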
On 20 March 2013 16:47, Thomas Schilling
wrote: OK. I think a reasonable approach would be the following:
- add hashable-1.1 (i.e., without SipHash) to the platform
- later create a new release of hashable that is fast by default and provides SipHash functionality via a newtype wrapper (or it could be a new package that defines the newtype and all the standard instances)
What's important is that the default behaviour (fast vs. secure) won't change in a later version of the platform.
We should also make it clear that hashable (even with SipHash) does not aim to implement secure hashing, i.e., it is no replacement for proper HMACs, SHA1, etc.
On 19 March 2013 16:51, Johan Tibell
wrote: Hi Thomas,
On Tue, Mar 19, 2013 at 8:41 AM, Thomas Schilling
wrote: On 19 March 2013 16:01, Johan Tibell
wrote: http://trac.haskell.org/haskell-platform/wiki/Proposals/unordered-containers
The links to the repos are wrong. It should be "tibbe" instead of "tibbel".
Fixed.
Bryan's recent change to make "hashable" use SipHash is certainly the right default. There were some complaints about performance for use cases where security is not an issue. What are the options for users that wish to use a different hash function? According to the paper, SipHash is about 2x slower than CityHash.
2x is *a lot*. 2x is about the performance difference between Map and HashMap. Since the raison d'etre for HashMap is that it's faster than Map, if we saw a 2x slowdown in HashMap there would be little reason to use it.
For example, 'delete' for HashMap ByteString got almost 2x slower with hashable-1.2. Since 'delete' does more than just hashing, that means that SipHash is quite a bit slower than the current (insecure) hash function. Another example: with GHC 7.6.2, HashMap String is almost unusably slow (5x slower than before). This is likely due to a GHC bug, but it's something we need to investigate. At the moment I don't encourage people to upgrade to hashable-1.2.0.5.
The right way to go is probably to make this a user decision. Many applications (e.g. data processing) have no need for the security guarantee, so paying for it makes little sense.
Cheers, Johan
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
-- Felipe.