
Hi, On Tue, Aug 28, 2007 at 11:41:22AM -0400, Jan-Willem Maessen wrote:
golden :: Int32 golden = 1013904242 -- = round ((sqrt 5 - 1) * 2^32) :: Int32 -- was -1640531527 = round ((sqrt 5 - 1) * 2^31) :: Int32 -- but that has bad mulHi properties (even adding 2^32 to get its inverse) -- Whereas the above works well and contains no hash duplications for -- [-32767..65536]
hashInt32 :: Int32 -> Int32 hashInt32 x = mulHi x golden + x
This gives
map hashInt [0..16] [0,1,2,3,4,6,7,8,9,11,12,13,14,16,17,18,19]
-- (length $ group $ sort $ map hashInt [-32767..65536]) == 65536 + 32768
This test also passes for the golden :: Int32 golden = -1640531527 hashInt :: Int -> Int32 hashInt x = fromIntegral x * golden implementation, which has a very pretty distribution; graph at the bottom of http://www.brpreiss.com/books/opus4/html/page214.html
hashString :: String -> Int32 hashString = foldl' f golden where f m c = fromIntegral (fromEnum c) * magic + hashInt32 m magic = 0xdeadbeef
Why use magic rather than golden? This makes sense to me: hashString :: String -> Int32 hashString = foldl' f golden where f m c = (fromIntegral (ord c) `xor` m) * golden Is anything obviously wrong with it? Thanks Ian