
#15719: Primitive atomic logical operations -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by dfeuer: Old description:
We offer fetch-and-and, fetch-and-or, etc., which are available as GCC intrinsics. Unfortunately, as far as I can tell those operations are ''not'' actually offered by x86_64. So I ''believe'' GCC must be implementing them using CAS loops on that (rather important) architecture. On the other hand, that instruction set does offer a plain atomic-and, atomic-or, etc., that do not fetch. In some cases, that may be sufficient. For example, I'm currently designing a potential replacement for the stable pointer system that uses a bitmap to represent a free list. Each deletion operation needs to update the bitmap (or) and then check whether a certain threshold number of entries are free (popcnt). One straightforward way to accomplish this with available instructions would be
1. Atomic OR to mark an entry free 2. Load the bitmap 3. Perform the popcnt test
Of course, it would be ''nice'' to have fetch-and-or here, but we don't actually need it--if a bunch of deletions happen concurrently, then at least one of them will see the popcnt over the threshold.
Could we (and should we) offer non-fetching atomic operations implemented natively where they're available? One important question is of course whether we'd actually get better performance this way than using a CAS loop. I would conjecture that this is this case under contention, but I don't actually know.
New description: We offer fetch-and-and, fetch-and-or, etc., which are available as GCC intrinsics. Unfortunately, as far as I can tell those operations are ''not'' actually offered by x86_64. So I ''believe'' GCC must be implementing them using CAS loops on that (rather important) architecture. On the other hand, that instruction set does offer a plain atomic-and, atomic-or, etc., that do not fetch. In some cases, that may be sufficient. For example, I'm currently designing a potential replacement for the stable pointer system that uses a bitmap to represent a free list. Each deletion operation needs to update the bitmap (or) and then check whether a certain threshold number of entries are free (popcnt). One straightforward way to accomplish this with available instructions would be 1. Atomic OR to mark an entry free 2. Load the bitmap 3. Perform the popcnt test Of course, it would be ''nice'' to have fetch-and-or here, but we don't actually need it--if a bunch of deletions happen concurrently, then at least one of them will see the popcnt over the threshold (unless intervening allocations with the same bitmap bring the popcnt below the threshold, which is also perfectly acceptable). Could we (and should we) offer non-fetching atomic operations implemented natively where they're available? One important question is of course whether we'd actually get better performance this way than using a CAS loop. I would conjecture that this is this case under contention, but I don't actually know. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15719#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler