Can reallyUnsafePtrEquality give false positives?

Hello, Can reallyUnsafePtrEquality give false positives? I can see how it can give false negatives (eg, compiler optimisations increasing or decreasing sharing), but I'm not so sure if it can give false positives. I don't see how in a garbage collected language two live values could compare reference equal. Unless the implementation is something like: reallyUnsafePtrEquality a b = getptr a == getptr b ...as that then means that if the GC moves things after `getptr a` is evaluated but before `getptr b` is, then you could get a false positive. But that doesn't seem like a sensible implementation to me, because then reallyUnsafePtrEquality would surely be totally useless. -- Michael Walker (http://www.barrucadu.co.uk)

It cannot give false positives. If it could, that would make it totally worthless. Sent from my iPhone
On Nov 22, 2017, at 12:22 PM, Michael Walker
wrote: Hello,
Can reallyUnsafePtrEquality give false positives? I can see how it can give false negatives (eg, compiler optimisations increasing or decreasing sharing), but I'm not so sure if it can give false positives.
I don't see how in a garbage collected language two live values could compare reference equal. Unless the implementation is something like:
reallyUnsafePtrEquality a b = getptr a == getptr b
...as that then means that if the GC moves things after `getptr a` is evaluated but before `getptr b` is, then you could get a false positive. But that doesn't seem like a sensible implementation to me, because then reallyUnsafePtrEquality would surely be totally useless.
-- Michael Walker (http://www.barrucadu.co.uk) _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

According to Edwark Kmett it can give false positives as well, or at least
could in 2010:
https://mail.haskell.org/pipermail/haskell-cafe/2010-June/079532.html
Erik
On 22 November 2017 at 20:06, Andrew Martin
It cannot give false positives. If it could, that would make it totally worthless.
Sent from my iPhone
On Nov 22, 2017, at 12:22 PM, Michael Walker
wrote: Hello,
Can reallyUnsafePtrEquality give false positives? I can see how it can give false negatives (eg, compiler optimisations increasing or decreasing sharing), but I'm not so sure if it can give false positives.
I don't see how in a garbage collected language two live values could compare reference equal. Unless the implementation is something like:
reallyUnsafePtrEquality a b = getptr a == getptr b
...as that then means that if the GC moves things after `getptr a` is evaluated but before `getptr b` is, then you could get a false positive. But that doesn't seem like a sensible implementation to me, because then reallyUnsafePtrEquality would surely be totally useless.
-- Michael Walker (http://www.barrucadu.co.uk) _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Actually, I'd say that it can, but only if used incorrectly. Which is why
it's reallyUnsafe.
The OP is correct in that a gc at the wrong time can lead to spurious
positives. The key to this is that, if you force everything to be evaluated
to WHNF (so you actually have the pointers) and then gc, you have some
determinacy as to when the next gc will happen: force to WHNF first, gc,
make sure any subsequent operations between that and your
reallyUnsafePtrEquality either don't allocate or fit in the nursery --- and
in this case, you can trust the result. So it requires a fair amount of
care, but there is a window in which it is safe.
On Wed, Nov 22, 2017 at 2:06 PM, Andrew Martin
It cannot give false positives. If it could, that would make it totally worthless.
Sent from my iPhone
On Nov 22, 2017, at 12:22 PM, Michael Walker
wrote: Hello,
Can reallyUnsafePtrEquality give false positives? I can see how it can give false negatives (eg, compiler optimisations increasing or decreasing sharing), but I'm not so sure if it can give false positives.
I don't see how in a garbage collected language two live values could compare reference equal. Unless the implementation is something like:
reallyUnsafePtrEquality a b = getptr a == getptr b
...as that then means that if the GC moves things after `getptr a` is evaluated but before `getptr b` is, then you could get a false positive. But that doesn't seem like a sensible implementation to me, because then reallyUnsafePtrEquality would surely be totally useless.
-- Michael Walker (http://www.barrucadu.co.uk) _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

This is out of curiosity: does the runtime actually provide guarantees
about when the GC must kick in? Or is this implementation defined but
common knowledge?
Thanks,
Siddharth
On Wed 22 Nov, 2017, 20:20 Brandon Allbery,
Actually, I'd say that it can, but only if used incorrectly. Which is why it's reallyUnsafe.
The OP is correct in that a gc at the wrong time can lead to spurious positives. The key to this is that, if you force everything to be evaluated to WHNF (so you actually have the pointers) and then gc, you have some determinacy as to when the next gc will happen: force to WHNF first, gc, make sure any subsequent operations between that and your reallyUnsafePtrEquality either don't allocate or fit in the nursery --- and in this case, you can trust the result. So it requires a fair amount of care, but there is a window in which it is safe.
On Wed, Nov 22, 2017 at 2:06 PM, Andrew Martin
wrote: It cannot give false positives. If it could, that would make it totally worthless.
Sent from my iPhone
On Nov 22, 2017, at 12:22 PM, Michael Walker
wrote: Hello,
Can reallyUnsafePtrEquality give false positives? I can see how it can give false negatives (eg, compiler optimisations increasing or decreasing sharing), but I'm not so sure if it can give false positives.
I don't see how in a garbage collected language two live values could compare reference equal. Unless the implementation is something like:
reallyUnsafePtrEquality a b = getptr a == getptr b
...as that then means that if the GC moves things after `getptr a` is evaluated but before `getptr b` is, then you could get a false positive. But that doesn't seem like a sensible implementation to me, because then reallyUnsafePtrEquality would surely be totally useless.
-- Michael Walker (http://www.barrucadu.co.uk) _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
-- Sending this from my phone, please excuse any typos!

yeah, and if you look at how its generated, its pretty safe with some care
(though still unsafe)
https://github.com/ghc/ghc/blob/314bc31489f1f4cd69e913c3b1e33236b2bdf553/com...
and
https://github.com/ghc/ghc/blob/ec080ea1f160263282500b30444cb2db857f2f93/com...
are the two places in question :)
On Wed, Nov 22, 2017 at 2:15 PM, Brandon Allbery
Actually, I'd say that it can, but only if used incorrectly. Which is why it's reallyUnsafe.
The OP is correct in that a gc at the wrong time can lead to spurious positives. The key to this is that, if you force everything to be evaluated to WHNF (so you actually have the pointers) and then gc, you have some determinacy as to when the next gc will happen: force to WHNF first, gc, make sure any subsequent operations between that and your reallyUnsafePtrEquality either don't allocate or fit in the nursery --- and in this case, you can trust the result. So it requires a fair amount of care, but there is a window in which it is safe.
On Wed, Nov 22, 2017 at 2:06 PM, Andrew Martin
wrote: It cannot give false positives. If it could, that would make it totally worthless.
Sent from my iPhone
On Nov 22, 2017, at 12:22 PM, Michael Walker
wrote: Hello,
Can reallyUnsafePtrEquality give false positives? I can see how it can give false negatives (eg, compiler optimisations increasing or decreasing sharing), but I'm not so sure if it can give false positives.
I don't see how in a garbage collected language two live values could compare reference equal. Unless the implementation is something like:
reallyUnsafePtrEquality a b = getptr a == getptr b
...as that then means that if the GC moves things after `getptr a` is evaluated but before `getptr b` is, then you could get a false positive. But that doesn't seem like a sensible implementation to me, because then reallyUnsafePtrEquality would surely be totally useless.
-- Michael Walker (http://www.barrucadu.co.uk) _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

There's some other complications here as well. Threads by default get their
own nurseries, so you have such a safe window even in threaded programs ---
but if another thread allocates enough (and subject to RTS options, IIRC -n
matters here?) it could fill its nursery, and the RTS notices your thread's
nursery is mostly unused and let the other thread start allocating from
your nursery and shrink your safe window.
(Also, it looks to me like 8.2's runtime changed some things; I thought
there was an option explicitly specifying/controlling the
separate-nurseries behavior, but I didn't see it in checking through the
runtime options just now.)
On Wed, Nov 22, 2017 at 2:15 PM, Brandon Allbery
Actually, I'd say that it can, but only if used incorrectly. Which is why it's reallyUnsafe.
The OP is correct in that a gc at the wrong time can lead to spurious positives. The key to this is that, if you force everything to be evaluated to WHNF (so you actually have the pointers) and then gc, you have some determinacy as to when the next gc will happen: force to WHNF first, gc, make sure any subsequent operations between that and your reallyUnsafePtrEquality either don't allocate or fit in the nursery --- and in this case, you can trust the result. So it requires a fair amount of care, but there is a window in which it is safe.
On Wed, Nov 22, 2017 at 2:06 PM, Andrew Martin
wrote: It cannot give false positives. If it could, that would make it totally worthless.
Sent from my iPhone
On Nov 22, 2017, at 12:22 PM, Michael Walker
wrote: Hello,
Can reallyUnsafePtrEquality give false positives? I can see how it can give false negatives (eg, compiler optimisations increasing or decreasing sharing), but I'm not so sure if it can give false positives.
I don't see how in a garbage collected language two live values could compare reference equal. Unless the implementation is something like:
reallyUnsafePtrEquality a b = getptr a == getptr b
...as that then means that if the GC moves things after `getptr a` is evaluated but before `getptr b` is, then you could get a false positive. But that doesn't seem like a sensible implementation to me, because then reallyUnsafePtrEquality would surely be totally useless.
-- Michael Walker (http://www.barrucadu.co.uk) _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

Actually, the current -n looks like a generalization of what I remember
being there --- and one that re-adds safety, since instead of "scavenging"
off another thread's nursery, the exhausting thread gets another chunk of
nursery for itself if one is available.
On Wed, Nov 22, 2017 at 2:25 PM, Brandon Allbery
There's some other complications here as well. Threads by default get their own nurseries, so you have such a safe window even in threaded programs --- but if another thread allocates enough (and subject to RTS options, IIRC -n matters here?) it could fill its nursery, and the RTS notices your thread's nursery is mostly unused and let the other thread start allocating from your nursery and shrink your safe window.
(Also, it looks to me like 8.2's runtime changed some things; I thought there was an option explicitly specifying/controlling the separate-nurseries behavior, but I didn't see it in checking through the runtime options just now.)
On Wed, Nov 22, 2017 at 2:15 PM, Brandon Allbery
wrote: Actually, I'd say that it can, but only if used incorrectly. Which is why it's reallyUnsafe.
The OP is correct in that a gc at the wrong time can lead to spurious positives. The key to this is that, if you force everything to be evaluated to WHNF (so you actually have the pointers) and then gc, you have some determinacy as to when the next gc will happen: force to WHNF first, gc, make sure any subsequent operations between that and your reallyUnsafePtrEquality either don't allocate or fit in the nursery --- and in this case, you can trust the result. So it requires a fair amount of care, but there is a window in which it is safe.
On Wed, Nov 22, 2017 at 2:06 PM, Andrew Martin
wrote:
It cannot give false positives. If it could, that would make it totally worthless.
Sent from my iPhone
On Nov 22, 2017, at 12:22 PM, Michael Walker
wrote: Hello,
Can reallyUnsafePtrEquality give false positives? I can see how it can give false negatives (eg, compiler optimisations increasing or decreasing sharing), but I'm not so sure if it can give false positives.
I don't see how in a garbage collected language two live values could compare reference equal. Unless the implementation is something like:
reallyUnsafePtrEquality a b = getptr a == getptr b
...as that then means that if the GC moves things after `getptr a` is evaluated but before `getptr b` is, then you could get a false positive. But that doesn't seem like a sensible implementation to me, because then reallyUnsafePtrEquality would surely be totally useless.
-- Michael Walker (http://www.barrucadu.co.uk) _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

Brandon Allbery wrote:
Actually, I'd say that it can, but only if used incorrectly. Which is why it's reallyUnsafe.
The OP is correct in that a gc at the wrong time can lead to spurious positives.
There will be no such GCs. Consider the type: reallyUnsafePtrEquality# :: a -> a -> Int# The operation takes two *Haskell values*, represented by pointers to the heap, and compares those pointers. (See also Carter Schonwald's mail.) The same code would also be allowed to dereference those pointers to find the tags of the corresponding heap objects, check whether they are constructors, and if so, read the corresponding values from the heap. In other words, if the identity of heap objects could be confused at this point, then perfectly ordinary compiled Haskell code could go wrong as well. The compiler and RTS ensure that the pointers that the pointer comparison sees are valid pointers for the Haskell values. (The story would be different if reallyUnsafePtrEquality# were implemented in terms of anyToAddr#.) So there are no false positives; if reallyUnsafePtrEquality# x y returned 1#, then x and y had a common corresponding heap object at the time. There is plenty of room for surprises and unsafety though: - Even if x and y compare as equal once, this may change later on. For example, with multiple threads running in parallel, a thunk may be evaluated several times, resulting in several results on the heap. - Extending the previous scenario, if purity is violated in the evaluation of the initially shared thunk, then x and y may become distinct values later on. - Even without parallelism, it is possible that after evaluating y, x points to an indirection on the heap and y to the evaluated value, resulting in distinct pointers (at least until the next GC). - Bottoms also complicate the picture. For example, if one uses pointer equality as an optimization in an Eq instance, one may find x == y by pointer equality once, just to later find that x == y bottoms out. The opposite scenario is also possible. So it's quite easy to violate purity this way. These scenarios can be mitigated by evaluating to WHNF first. [Note: No spurious positives by GCs] In current GHC, all garbage collections (even minor ones) stop execution of all Haskell threads; this is done by tweaking the heap check, and Haskell threads are not interrupted between heap checks. There is a design and an experimental implementation of "local garbage collections" in ghc, from 2011: https://ghc.haskell.org/trac/ghc/blog/new-gc-preview https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/local-gc... In this case, garbage collections of nurseries would run in parallel to other Haskell threads. However, even in this design, reallyUnsafePtrEquality# would not have false positives. To quote from the paper, "the key invariant is that the processor that owns the local heap has exclusive access to its contents." (In one design there are actually two local heaps; there is a second, "sticky" heap that contains objects that other processors might see; however, these objects are pinned.) Cheers, Bertram

Thanks for the feedback, all. For some more context, I have a concurrency testing library[1], which uses an algorithm called DPOR to reduce the space of possible schedules it needs to explore. There's an algorithm called MCR which has the potential to reduce this space significantly further. One of the differences in MCR is taking into account reference equality of values written to shared mutable variables. I can't just use Eq because (a) putMVar/etc doesn't have an Eq constraint in the type, and (b) that would change strictness. So I looked to reallyUnsafePtrEquality# as a possible solution. In this setting, false negatives are fine. If the algorithm thinks two equal values are distinct, then it doesn't run as fast as it could potentially do, but should still outperform DPOR due to its other improvements. But false positives are not fine, and lead to unsoundness. An alternative is just using "\_ _ -> False", but if reallyUnsafePtrEquality# will sometimes identify two equal things without false positives, then it's effectively a free performance boost in the cases where it happens to work.
- Extending the previous scenario, if purity is violated in the evaluation of the initially shared thunk, then x and y may become distinct values later on.
Fortunately, I don't need to worry about impure actions leading to a shared thunk evaluating to different things in different places, as any nondeterminism beyond scheduler nondeterminism and relaxed memory is already explicitly out of scope. [1] http://hackage.haskell.org/package/dejafu -- Michael Walker (http://www.barrucadu.co.uk)

In that case, you probably want to use StableNames [1], which are
unique and persistent even across GCs, and give you something you can
use for a hash table.
On Thu, Nov 23, 2017 at 6:48 AM, Michael Walker
Thanks for the feedback, all.
For some more context, I have a concurrency testing library[1], which uses an algorithm called DPOR to reduce the space of possible schedules it needs to explore. There's an algorithm called MCR which has the potential to reduce this space significantly further. One of the differences in MCR is taking into account reference equality of values written to shared mutable variables.
I can't just use Eq because (a) putMVar/etc doesn't have an Eq constraint in the type, and (b) that would change strictness. So I looked to reallyUnsafePtrEquality# as a possible solution. In this setting, false negatives are fine. If the algorithm thinks two equal values are distinct, then it doesn't run as fast as it could potentially do, but should still outperform DPOR due to its other improvements. But false positives are not fine, and lead to unsoundness.
An alternative is just using "\_ _ -> False", but if reallyUnsafePtrEquality# will sometimes identify two equal things without false positives, then it's effectively a free performance boost in the cases where it happens to work.
- Extending the previous scenario, if purity is violated in the evaluation of the initially shared thunk, then x and y may become distinct values later on.
Fortunately, I don't need to worry about impure actions leading to a shared thunk evaluating to different things in different places, as any nondeterminism beyond scheduler nondeterminism and relaxed memory is already explicitly out of scope.
[1] http://hackage.haskell.org/package/dejafu
-- Michael Walker (http://www.barrucadu.co.uk) _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Zemlya, I think you dropped this.
[1]
https://hackage.haskell.org/package/base-4.10.0.0/docs/System-Mem-StableName...
On Thu, Nov 23, 2017 at 8:03 PM, Zemyla
In that case, you probably want to use StableNames [1], which are unique and persistent even across GCs, and give you something you can use for a hash table.
On Thu, Nov 23, 2017 at 6:48 AM, Michael Walker
wrote: Thanks for the feedback, all.
For some more context, I have a concurrency testing library[1], which uses an algorithm called DPOR to reduce the space of possible schedules it needs to explore. There's an algorithm called MCR which has the potential to reduce this space significantly further. One of the differences in MCR is taking into account reference equality of values written to shared mutable variables.
I can't just use Eq because (a) putMVar/etc doesn't have an Eq constraint in the type, and (b) that would change strictness. So I looked to reallyUnsafePtrEquality# as a possible solution. In this setting, false negatives are fine. If the algorithm thinks two equal values are distinct, then it doesn't run as fast as it could potentially do, but should still outperform DPOR due to its other improvements. But false positives are not fine, and lead to unsoundness.
An alternative is just using "\_ _ -> False", but if reallyUnsafePtrEquality# will sometimes identify two equal things without false positives, then it's effectively a free performance boost in the cases where it happens to work.
- Extending the previous scenario, if purity is violated in the evaluation of the initially shared thunk, then x and y may become distinct values later on.
Fortunately, I don't need to worry about impure actions leading to a shared thunk evaluating to different things in different places, as any nondeterminism beyond scheduler nondeterminism and relaxed memory is already explicitly out of scope.
[1] http://hackage.haskell.org/package/dejafu
-- Michael Walker (http://www.barrucadu.co.uk) _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
participants (9)
-
Andrew Martin
-
Bertram Felgenhauer
-
Brandon Allbery
-
Carter Schonwald
-
Erik Hesselink
-
Michael Walker
-
Siddharth Bhat
-
Theodore Lief Gannon
-
Zemyla