On November 21, 2016 12:32:03 AM GMT+11:00, Bertram Felgenhauer via Haskell-Cafe <haskell-cafe@haskell.org> wrote:
raichoo via Haskell-Cafe wrote:
First of all the documentation of `hashStableName` says the following:

The Int returned is not necessarily unique; several StableNames may map
to the same Int (in practice however, the chances of this are small, so
the result of hashStableName makes a good hash key).

OK, sounds fine, let's put this to the test. So I wrote a little program.

module Main where

import Control.Monad

import System.Mem.StableName
import System.Environment

main :: IO ()
main = do
args <- getArgs
res <- forM [0..10000] $ \i -> do
x <- makeStableName i
let h' = hashStableName x
unless (null args) $
print h'
return h'

putStrLn "---------------------------"
print $ minimum res
print $ maximum res

There is nothing in this program that keeps the stable names alive.
It appears, from your experiments, that once a stable name is garbage
collected, its ID, which also serves as its hash value, may be reused
for another stable pointer. Consider this variant of your main function:

main = do
res <- forM [0..10000] $ fmap hashStableName . makeStableName
performGC
res <- forM [0..10000] $ fmap hashStableName . makeStableName
performGC
res <- forM [0..10000] $ fmap hashStableName . makeStableName
print (minimum res, maximum res)

This produces `(1,10001)` as output in my tests.

I'm not sure how exactly `print` affects garbage collections.

Overall, I believe the documentation of `hashStableName` is mostly
correct, but it would make sense to stress that the statement is
only valid for stable names that are currently alive at a particular
point in time, not globally for the whole run of a program.

Cheers,

Bertram



Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

I suspect that when print h' isn't called, then `let h' = hashStableName x` stays as a thunk in the list, until you later compute the minimum and maximum. The thunk refers to the stable name, and keeps it from being GCed. So they *are* all live at the same time.

When you print h' in the loop body, that forces the hash calculation and discharges the thunk. So the stable name becomes garbage immediately, and once collected the internal id is available for reuse. Perhaps the 1-260 range observed tells you roughly how many iterations of that loop you can get through in the smallest GC generation, on your system.