Re: Windows breakage -- again

I posted a working and tested patch last night. Please feel free to commit it, I haven't the rights to do it.
Niklas

Thanks Niklas, this is now committed.
On Fri, Jul 18, 2014 at 9:21 AM, Niklas Larsson wrote:
I posted a working and tested patch last night. Please feel free to commit it, I haven't the rights to do it.
Niklas
________________________________
From: Simon Peyton Jones
Sent: 2014-07-18 15:55
To: Niklas Larsson; Johan Tibell
Cc: ghc-devs@haskell.org
Subject: RE: Windows breakage -- again
Thank you all for pursuing this. I gather that you know what is going on, so no further info needed from me. Yell if it is otherwise.
Meanwhile, is the fix imminent, or should we revert Johan’s patch?
Simon
From: Niklas Larsson [mailto:metaniklas@gmail.com]
Sent: 16 July 2014 19:58
To: Johan Tibell; Simon Peyton Jones
Cc: ghc-devs@haskell.org
Subject: Re: Windows breakage -- again
I get the same failure when I try to build HEAD. Turns out the error occurs on the 32-bit Windows build, and my successful build was a 64-bit build. My 64-bit build still succeeds.
Also, gcc is 4.5.2 on 32-bit, not 4.6.3 as on 64-bit.
Niklas
2014-07-16 14:48 GMT+02:00 Niklas Larsson:
I have built GHC on Windows since that was added, with no issue.
I can take a look this evening and see how HEAD works for me.
The standard gcc in the tarballs is 4.6.3, which is getting long in the tooth; there is an issue on Trac to upgrade it.
-- Niklas
________________________________
From: Johan Tibell
Sent: 2014-07-16 09:57
To: Simon Peyton Jones
Cc: ghc-devs@haskell.org
Subject: Re: Windows breakage -- again
You can roll back the commit (git revert 4ee4ab01c1d97845aecb7707ad2f9a80933e7a49) and push that to the repo if you wish. I will try to re-add the primop again after I figure out what's wrong.
On Wed, Jul 16, 2014 at 9:37 AM, Johan Tibell wrote:
I added some primops about a month ago (4ee4ab01c1d97845aecb7707ad2f9a80933e7a49) that call __sync_fetch_and_add, a gcc/llvm builtin. I'm a bit surprised to see this error. The GCC manual [1] says:
" Not all operations are supported by all target processors. If a particular operation cannot be implemented on the target processor, a warning will be generated and a call an external function will be generated. The external function will carry the same name as the builtin, with an additional suffix `_n' where n is the size of the data type."
I'm a bit surprised by this error for two reasons:
* A call to that symbol should only be generated if the CPU doesn't support the atomic instructions. What CPU model does Windows report that you have?
* gcc should define such a symbol. For me the following test program compiles:
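    #include <stdint.h>

    /* Calls the 1-byte variant directly; returns the value held before the add. */
    uint8_t test(uint8_t* ptr, uint8_t val) {
        return __sync_fetch_and_add_1(ptr, val);
    }

    int main(void) {
        uint8_t n = 0;  /* initialised so the returned old value is defined */
        return test(&n, 1);
    }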
Does that compile for you? Which version of GCC do we end up using on Windows?
The reported symbol (___sync_fetch_and_add_1) has three leading underscores; that looks weird. Can you compile just libraries/ghc-prim/cbits/atomic.c and see if it's indeed GCC that generates a reference to that symbol?
[1] http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
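For readers following along, here is a minimal sketch of the generic form of the builtin discussed above. With the generic name, the compiler chooses the operand size from the pointer type; per the manual passage quoted above, a target that cannot implement the operation gets a call to an external function with a size suffix (for one byte, `__sync_fetch_and_add_1`) instead. The original i386, for example, has neither XADD nor CMPXCHG, so a compiler targeting it would plausibly take that fallback path. Whether that is what happens with the 32-bit toolchain in this thread is an assumption, not something the thread confirms. The helper names `fetch_add_u8` and `fetch_add_u32` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrappers around the generic GCC builtin.
   Each returns the value stored at *ptr before the addition. */
uint8_t fetch_add_u8(uint8_t *ptr, uint8_t val) {
    assert(ptr != NULL);
    /* 1-byte operand: this is the case that maps to the _1 suffix. */
    return __sync_fetch_and_add(ptr, val);
}

uint32_t fetch_add_u32(uint32_t *ptr, uint32_t val) {
    assert(ptr != NULL);
    /* 4-byte operand: this is the case that maps to the _4 suffix. */
    return __sync_fetch_and_add(ptr, val);
}
```

Compiling a file like this to an object file and listing its undefined symbols is one way to check whether the compiler inlined the atomic instruction or left a reference to the external function.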
On Wed, Jul 16, 2014 at 12:29 AM, Simon Peyton Jones wrote:
Aargh! The Windows build has broken – again. I can’t build GHC on my laptop any more.
[The entire original message is not included.]
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs
--
Regards,
Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

Great. Thanks all for your help!

2014-07-21 21:31 GMT+02:00 Johan Tibell:
Great. Thanks all for your help!
I am afraid we are not done with this yet. Yesterday I also committed the fix for the FreeBSD platform, but today I noticed that the corresponding test case ("AtomicPrimops") is failing with SIGILL, that is, illegal instruction. It has been happening on all the 32-bit platforms, including Linux [1], SmartOS [2], and Solaris [3]. I do not know yet why it goes wrong.
[1] http://haskell.inf.elte.hu/builders/validator1-linux-x86-head/34/10.html
[2] http://haskell.inf.elte.hu/builders/smartos-x86-head/73/21.html
[3] http://haskell.inf.elte.hu/builders/solaris-x86-head/116/21.html

AtomicPrimOps.hs flakes out for:
fetchAndTest
fetchNandTest
fetchOrTest
fetchXorTest
casTest
but not for fetchAddSubTest and readWriteTest.
If I step through it, the segfault comes at line 166; it doesn't reach the
.fetchXXXIntArray function that was called from the thread (at least ghci
doesn't hit a breakpoint set at it).
GDB says the bad instruction is:
    4475: f0 8b 4c 24 40    lock mov 0x40(%esp),%ecx
Niklas

Is this on FreeBSD only or does it happen elsewhere?

That's true, I used mingw.
I have created a ticket https://ghc.haskell.org/trac/ghc/ticket/9346#ticket.
2014-07-22 12:22 GMT+02:00 Páli Gábor János:
2014-07-22 11:49 GMT+02:00 Johan Tibell:
Is this on FreeBSD only or does it happen elsewhere?
I would say it happens everywhere (on 32 bits). I guess Niklas was debugging the mingw32 version.

I suggest we continue the discussion on the ticket:
https://ghc.haskell.org/trac/ghc/ticket/9346
Summary so far is that LOCK is not a valid prefix to MOV, but the x86
code generator doesn't emit any LOCKs before MOVs so I'm not sure how
that instruction got there.
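Some background that may help when reading the ticket: on x86 the LOCK prefix is only accepted on certain read-modify-write instructions with a memory destination (ADD, AND, OR, XOR, XADD, CMPXCHG, XCHG and a few others). Anywhere else, such as on a plain load like the MOV above, the CPU raises an invalid-opcode exception, which the OS reports as an illegal instruction. Also, x86 has XADD to return the old value of an atomic add, but no single instruction that returns the old value for AND, OR or XOR, so those are commonly built from a compare-and-swap loop. The sketch below shows that pattern in C with the GCC builtin. It is only an illustration, not GHC's code generator, and `fetch_and_u32` is a made-up name. Whether the failing tests share a compare-and-swap path in GHC is an assumption to check on the ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: atomic fetch-and-AND built from compare-and-swap.
   Returns the value that was in *ptr before the AND was applied. */
uint32_t fetch_and_u32(uint32_t *ptr, uint32_t mask) {
    assert(ptr != NULL);
    uint32_t old = *ptr;  /* initial guess; the loop corrects it if stale */
    for (;;) {
        /* Stores (old & mask) only if *ptr still equals old;
           always returns what *ptr held before the attempt. */
        uint32_t seen = __sync_val_compare_and_swap(ptr, old, old & mask);
        if (seen == old) {
            return old;   /* swap succeeded */
        }
        old = seen;       /* another thread changed *ptr; retry with its value */
    }
}
```

On x86 a compiler would normally turn the builtin into a single `lock cmpxchg`, where the prefix sits directly on an instruction that permits it.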