Re: Proposal: Add the unordered-containers package and the hashable package to the Haskell Platform

On 3/20/13 1:02 PM, Thomas Schilling wrote:> To make this more precise the next version of hashable (say, 1.3)
would include this:
newtype SipHashed a = SipHashed a
class SipHashable a where sipHashWithSalt :: Int -> a -> Int
instance SipHashable a => Hashable (SipHashed a) where hashWithSalt salt (SipHashed x) = sipHashWithSalt salt x
Then all Hashable instances are taken from hashable-1.1. All Hashable instances from hashable-1.2 are renamed to become instances of SipHashable.
That seems like a reasonable solution. I only ever use hashmaps explicitly for speed reasons and in applications that are not susceptible to DoS. The arguments about not wanting to introduce security issues due to some transitive dependency also hold for not wanting to introduce performance issues due to some transitive dependency. I respect all the folks working on web frameworks in Haskell, but there are also those of us who work on programs where compute time really is the number one concern (once an acceptable level of correctness and precision has been obtained); and when experiments are measured in days or weeks of cluster time, those "little" 1.3x and 1.5x slowdowns really hurt, let alone the 2.0x people've mentioned with siphash. One concern with the above approach: is "siphash" a sufficiently generic name, or is it just one hashing method that happens to deflect this DoS issue? I haven't read the paper, so... -- Live well, ~wren

On Thu, Mar 21, 2013 at 6:54 AM, wren ng thornton
On 3/20/13 1:02 PM, Thomas Schilling wrote:> To make this more precise the next version of hashable (say, 1.3)
would include this:
newtype SipHashed a = SipHashed a
class SipHashable a where sipHashWithSalt :: Int -> a -> Int
instance SipHashable a => Hashable (SipHashed a) where hashWithSalt salt (SipHashed x) = sipHashWithSalt salt x
Then all Hashable instances are taken from hashable-1.1. All Hashable instances from hashable-1.2 are renamed to become instances of SipHashable.
That seems like a reasonable solution.
I only ever use hashmaps explicitly for speed reasons and in applications that are not susceptible to DoS. The arguments about not wanting to introduce security issues due to some transitive dependency also hold for not wanting to introduce performance issues due to some transitive dependency. I respect all the folks working on web frameworks in Haskell, but there are also those of us who work on programs where compute time really is the number one concern (once an acceptable level of correctness and precision has been obtained); and when experiments are measured in days or weeks of cluster time, those "little" 1.3x and 1.5x slowdowns really hurt, let alone the 2.0x people've mentioned with siphash.
One concern with the above approach: is "siphash" a sufficiently generic name, or is it just one hashing method that happens to deflect this DoS issue? I haven't read the paper, so...
There was also a complaint that the old hashable uses id as the hash function for Ints, while the new one always hashes them, also degrading performance. This might be a harder nut because `hash` itself was removed as a method of Hashable, but perhaps it could be added back(?), and then it could also be a case for the newtype. I can't remember what the motivation was for hashing them. In any case I'd prefer a more generic name for the newtype, who knows if this kind of tradeoff might not be present elsewhere.
-- Live well, ~wren
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
-- Your ship was destroyed in a monadic eruption.

On 21 March 2013 06:54, wren ng thornton
One concern with the above approach: is "siphash" a sufficiently generic name, or is it just one hashing method that happens to deflect this DoS issue? I haven't read the paper, so...
One could of course generalize the above method using something like: -- A type hashed as h newtype Hashed h a = Hashed a instance (HashableAs h a) => Hashable (Hashed h a) where hashWithSalt = hashWithSaltAs class HashableAs h a where hashWithSaltAs :: Int -> Hashed h a -> Int data Sip sip :: a -> Hashed Sip a sip = Hashed instance HashableAs Sip Text where hashWithSaltAs salt (Hashed x) = sipHashWithSalt salt x instance HashableAs Sip ByteString where ... Regards, Bas

One annoying with an MPTC approach is that suddenly defaulting stops
working.
This can result in a requiring of a lot more signatures from users or
little helpers everywhere like the 'sip' below.
-Edward
On Thu, Mar 21, 2013 at 10:36 AM, Bas van Dijk
On 21 March 2013 06:54, wren ng thornton
wrote: One concern with the above approach: is "siphash" a sufficiently generic name, or is it just one hashing method that happens to deflect this DoS issue? I haven't read the paper, so...
One could of course generalize the above method using something like:
-- A type hashed as h newtype Hashed h a = Hashed a
instance (HashableAs h a) => Hashable (Hashed h a) where hashWithSalt = hashWithSaltAs
class HashableAs h a where hashWithSaltAs :: Int -> Hashed h a -> Int
data Sip
sip :: a -> Hashed Sip a sip = Hashed
instance HashableAs Sip Text where hashWithSaltAs salt (Hashed x) = sipHashWithSalt salt x
instance HashableAs Sip ByteString where ...
Regards,
Bas
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
participants (4)
-
Bas van Dijk
-
Edward Kmett
-
Gábor Lehel
-
wren ng thornton