Possible floating point bug in GHC?

For days I've been fighting against a weird bug. My Haskell code calls into a C function residing in a DLL (I'm on Windows; the DLL is generated using Visual Studio). This C function computes a floating point expression. However, the floating point result is incorrect.

I think I found the source of the problem: the C code expects that all of the Intel x86's floating point register tag bits are set to 1, but it seems the Haskell code does not preserve that. The x86 has all kinds of floating point weirdness (http://www.informit.com/articles/article.aspx?p=770362) - it is both a stack-based and a register-based system - so it is crucially important that generated code plays nice. For example, when using MMX one must always emit an EMMS instruction (http://msdn.microsoft.com/en-us/library/590b9ks9(VS.80).aspx) to clear these tag bits. If I manually clear these tag bits, my code works fine.

Is this something other people have encountered as well? I'm trying to make a very simple test case to reproduce the behavior... I'm not sure whether this is a Visual C compiler bug, a GHC bug, or something I'm doing wrong... Is it possible to annotate a foreign imported C function to tell the Haskell code generator that the function is using floating point registers somehow?
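Roughly, the Haskell side of the setup looks like this - a minimal sketch, with a made-up DLL export compute_expr standing in for the real function, and assuming the program is linked against the DLL's import library:

  {-# LANGUAGE ForeignFunctionInterface #-}

  -- compute_expr is a placeholder name for the real DLL export, not the actual code.
  foreign import ccall unsafe "compute_expr"
    c_computeExpr :: Double -> Double -> IO Double

  main :: IO ()
  main = do
    r <- c_computeExpr 3.0 4.0
    print r   -- this result comes back wrong unless the x87 tag bits are cleared first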

Interesting. This could be the cause of a weird floating point bug that has been showing up in the GHC testsuite recently, specifically affecting MacOS/Intel (but not MacOS/ppc):
http://darcs.haskell.org/testsuite/tests/ghc-regress/lib/Numeric/num009.hs
That test compares the results of the builtin floating point ops with the same ops imported via the FFI. They should not be different, but on Intel they sometimes are.
Regards, Malcolm
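For reference, the gist of that test looks roughly like this (a sketch only, not the actual num009 source; importing sin from the C library and the c_sin name are just for illustration):

  {-# LANGUAGE ForeignFunctionInterface #-}

  -- Compare a builtin floating point op with the same op imported via the FFI.
  foreign import ccall unsafe "math.h sin" c_sin :: Double -> Double

  main :: IO ()
  main = mapM_ check [0.1, 0.5, 1.0, 2.0, 10.0]
    where
      check x
        | sin x == c_sin x = putStrLn ("OK   " ++ show x)
        | otherwise        = putStrLn ("FAIL " ++ show x ++ ": "
                                       ++ show (sin x) ++ " /= " ++ show (c_sin x))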

Well, this situation indeed cannot occur on PowerPC, since those CPUs just have floating point registers, not some weird sometimes-a-stack, sometimes-registers architecture. But in my case the bug is consistent, not intermittent.
So I'll try to reduce this to a small reproducible test case, maybe including the assembly generated by the VC++ compiler.

What floating point model is your DLL compiled with? There are a variety of options here with regard to optimizations, and I don't know the specific assembly that each option produces, but I know there are options like Strict, Fast, and Precise, and maybe one of those makes different assumptions about the caller. That doesn't say anything about whose "fault" it is, but it might at least be helpful to know whether changing the floating point model makes the bug go away.

I tried both precise and fast, but that did not help. Compiling for SSE2 fixed it, since that does not use the floating point stack, I guess.
I'm preparing a repro test case, but it is tricky, since removing code tends to change the optimizations and then the bug does not occur.
Does anybody know what the calling convention for floating point is for cdecl on x86? The documentation says that the result is returned in st(0), but it says nothing about the floating point tags. I assume that every function expects the FP stack to be empty, possibly containing just argument values. But GHC calls the C function with some FP registers still reserved on the stack...

On Fri, Apr 03, 2009 at 10:10:17PM +0200, Peter Verswyvelen wrote:
I tried both precise and fast, but that did not help. Compiling to SSE2 fixed it, since that does not use a floating point stack I guess.
You didn't say what version of GHC you are using, but it sounds like this might already be fixed in 6.10.2 by:

Tue Nov 11 12:56:19 GMT 2008  Simon Marlow
  * Fix to i386_insert_ffrees (#2724, #1944)
    The i386 native code generator has to arrange that the FPU stack is clear on exit from any function that uses the FPU. Unfortunately it was getting this wrong (and has been ever since this code was written, I think): it was looking for basic blocks that used the FPU and adding the code to clear the FPU stack on any non-local exit from the block. In fact it should be doing this on a whole-function basis, rather than individual basic blocks.

Thanks
Ian

Ouch, what a waste of time on my side :-(
This bugfix is not mentioned in the "notable bug fixes" here: http://haskell.org/ghc/docs/6.10.2/html/users_guide/release-6-10-2.html
Since this is such a severe bug, I would recommend listing it :)
Anyway, I have a very small repro test case now. I will certainly test this with GHC 6.10.2.

Okay, I can confirm the bug is fixed.
It's insane that this bug did not cause more problems: every call into every C function that uses floating point could have been affected (OpenGL, BLAS, etc.).
participants (4):
- Ian Lynagh
- Malcolm Wallace
- Peter Verswyvelen
- Zachary Turner