
Ok, my bad, sorry all.
This is NOT a problem that will crop up in 7.8. Rather, it's just a
problem with the duplicated bits of GHC RTS functionality that were stuck
into the atomic-primops library. It was a C preprocessor problem that was
causing the inline asm we were discussing in this thread to not actually be
called.
Still, I'd like to be reminded of the rational for all this conditional
inline asm rather than using the C compiler intrinsics! Anyone?
Best,
-Ryan
On Sat, Feb 1, 2014 at 2:17 PM, Carter Schonwald wrote: I got the test suite running on my (2 core) machine mac book air, with 7.8
i've run it several times, not seeing any failures On Sat, Feb 1, 2014 at 1:03 PM, Carter Schonwald <
carter.schonwald@gmail.com> wrote: hrmmmm I have a crazy idea "Compare RAX with r/m64. If equal, ZF is set and r64 is loaded into r/m64.
Else, clear ZF
and load r/m64 into RAX." is what the docs say for the cmpxchng
instruction so RAX is the old values, (EAX in the 32bit case). And it looks like we
dont' set that explicitly when calling the asm .. CMPXCHG r/m64, r64 hrmmm On Sat, Feb 1, 2014 at 12:52 PM, Ryan Newton Ok, here's another experiment, on this commit: https://github.com/rrnewton/haskell-lockfree/commit/399bb19fa02eaf2f2eab5d02... Here, if I use GCC's __sync_val_compare_and_swap instead of GHC's
version of cas(), the problem also goes away. I think these two
implementations should behave identically, and that they don't perhaps
indicates that there is something off about the inline asm, as Carter was
suggesting: *#if i386_HOST_ARCH || x86_64_HOST_ARCH*
* __asm__ __volatile__ (*
* "lock\ncmpxchg %3,%1"*
* :"=a"(o), "=m" (*(volatile unsigned int *)p) *
* :"0" (o), "r" (n));*
* return o;* The x86 CAS instruction must put the "return value" in the accumulator
register, and indeed this constrains "o" to be allocated to the accumulator
register, while the new value "n" can be in any register. So if there's a problem, I don't know what it is. Except I'm not sure
what the ramifications of "o" being a function parameter AND having an "=a"
constraint on it are... -Ryan On Sat, Feb 1, 2014 at 11:27 AM, Carter Schonwald <
carter.schonwald@gmail.com> wrote: Hey Ryan, i've made the leap to using 7.8 on my machine, so i'll first
have to get some pull requests in on atomic-primops before I can test it
locally :), expect those patches later today! looks like gcc's inline ASM logic is pretty correct, after testing it a
bit locally, pardon my speculative jumping the gun. On Sat, Feb 1, 2014 at 9:10 AM, Ryan Newton Hi Carter & others, Carter, yes, this is CAS on pointers and in my next mail I'll try to
come up with some hypotheses as to why we may have (remaining) problems
there. But first, I have been assured that on x86 there is no failure mode in
which doing a comparison on the value read by CAS should not correctly
diagnose success or failure (same as directly reading the Zero Flag) [1]. And yet, there's this discrepancy, where the modified casMutVar that I
linked to does not have the failure. As for reproducing the failure,
either of the two following tests will currently show problems: - Two threads try to casIORef False->True, both succeed
- 120 threads try to read, increment, CAS until they succeed. The
total is often not 120 because multiple threads think the successfully
incremented, say, 33->34. Here's a specific recipe for the latter test on GHC 7.6.3 Mac or Linux: *git clone git@github.com:rrnewton/haskell-lockfree-queue.git *
*cd haskell-lockfree-queue/AtomicPrimops/* *git checkout 1a1e7e55f6706f9e5754* *cabal sandbox init* *cabal install -f-withTH -fforeign ./ ./testing --enable-tests* *./testing/dist/dist-sandbox-*/build/test-atomic-primops/test-atomic-primops
-t n_threads* You may have to run the last line several times to see the failure. Best,
-Ryan [1] I guess the __sync_bool_compare_and_swap intrinsic which reads ZF
is there just to avoid the extra comparison. [2] P.S. I'd like to try this on GHC head, but the RHEL 6 machine I
usually use to build it is currently not validating (below error, commit
65d05d7334). After I debug this gmp problem I'll confirm that the bug
under discussion applies on the 7.8 branch. ./sync-all checkout ghc-7.8
sh validate
...
/usr/bin/ld: libraries/integer-gmp/gmp/objs/aors.o: relocation
R_X86_64_32 against `__gmpz_sub' can not be used when making a shared
object; recompile with -fPIC
libraries/integer-gmp/gmp/objs/aors.o: could not read symbols: Bad
value On Sat, Feb 1, 2014 at 2:55 AM, Carter Schonwald <
carter.schonwald@gmail.com> wrote: Ryan, is your benchmark using CAS on pointers, or immediate words?
trying to get atomic primops to build on my 7.8 build on my mac On Sat, Feb 1, 2014 at 2:44 AM, Carter Schonwald <
carter.schonwald@gmail.com> wrote: > https://ghc.haskell.org/trac/ghc/ticket/8724#ticket is the ticket
>
> when i'm more awake i'll experiment some more
>
>
> On Sat, Feb 1, 2014 at 2:33 AM, Carter Schonwald <
> carter.schonwald@gmail.com> wrote:
>
>> i have a ticket for tracking this, though i'm thinking my initial
>> attempt at a patch generates the same object code as it did before.
>>
>> @ryan, what CPU variant are you testing this on? is this on a NUMA
>> machine or something?
>>
>>
>> On Sat, Feb 1, 2014 at 1:58 AM, Carter Schonwald <
>> carter.schonwald@gmail.com> wrote:
>>
>>> woops, i mean cmpxchgq
>>>
>>>
>>> On Sat, Feb 1, 2014 at 1:36 AM, Carter Schonwald <
>>> carter.schonwald@gmail.com> wrote:
>>>
>>>> ok, i can confirm that on my 64bit mac, both clang and gcc
>>>> use cmpxchgl rather than cmpxchg
>>>> i'll whip up a strawman patch on head that can be cherrypicked /
>>>> tested out by ryan et al
>>>>
>>>>
>>>> On Sat, Feb 1, 2014 at 1:12 AM, Carter Schonwald <
>>>> carter.schonwald@gmail.com> wrote:
>>>>
>>>>> Hey Ryan,
>>>>> looking at this closely
>>>>> Why isn't CAS using CMPXCHG8B on 64bit architectures? Could
>>>>> that be the culprit?
>>>>>
>>>>> Could the issue be that we've not had a good stress test that
>>>>> would create values that are equal on the 32bit range, but differ on the
>>>>> 64bit range, and you're hitting that?
>>>>>
>>>>> Could you try seeing if doing that change fixes things up?
>>>>> (I may be completely wrong, but just throwing this out as a
>>>>> naive "obvious" guess)
>>>>>
>>>>>
>>>>> On Sat, Feb 1, 2014 at 12:58 AM, Ryan Newton