
Hi,
Sigh. I think that Isaac is quite right. Even though I think that it
would be quite rare for multiple threads to share the same name supply
in practise, I would really rather have safe code and not have to
think about it. So... I have reverted to the non-dupable versions by
default. I also added an "unsafeNewIntSupply" that uses that dupable
primitives, for those who like living on the edge :) Thanks to
Sebastian for the bench marking program! The performance numbers for
the current version on hackage (0.4) are as follows:
With safe primitives value-supply is about 7 times slower, with the
unsafe ones it is about 2 times slower (a bit less actually). Even
though this seems like a big difference, the actual time it takes to
generate names is very small: I have to generate about 10 million
names to get reliable measurements, so I am not sure if the difference
matters in practise.
If anyone has further ideas, please chime in.
-Iavor
On Tue, Dec 23, 2008 at 8:30 AM, Isaac Dupree
Iavor Diatchki wrote:
- It uses unsafeDupableInterleaveIO to avoid double locking,
in particular,
gen r = unsafeDupableInterleaveIO $ do v <- unsafeDupableInterleaveIO (genSym r) ls <- gen r rs <- gen r return (Node v ls rs)
where is the double locking? We want referential transparency...
e.g. suppose you use newNumSupply to create a thunk for a Gen; when evaluated, it will run unsafeDupableInterleaveIO. You send that thunk off to two different other threads. Then those threads decide to evaluate it (say, enough to get the first genSym'd value) and happen to make a race condition, so it becomes two separate IO computations. Therefore one of them runs atomicModifyIORef, and the other one runs atomicModifyIORef, and so they get two different values out.
Node 0 (...) (...) Node 1 (...) (...)
when it's suppose to be the very same Gen data structure.
so, am I right or wrong about what the perils of unsafeDupableInterleaveIO are?
I could see changing (unsafe[Dupable]InterleaveIO (genSym r)) to (genSym r), to halve the number of unsafeInterleaveIOs needed if we assume that most of the time a node is evaluated in order to get a value... but it's hard to see a good way to make *fewer* InterleaveIOs than there are genSym'd values. (possible, but hard, and really depends on the relative expenses/risks of locking, of computing the next number, and of using up the "address space" of all possible Ints for example). Maybe the outer InterleaveIO could "strictly" make a few levels of Nodes (with lazy genSym'd values this time) before "interleaving" again, to reduce the amount of interleaving from the non-semantics-changing side.
-Isaac