
@sven and @henning :
i'm actually doing some preliminary work to add save and restore for FPU
state to the GHC RTS, at the green/haskell thread layer. that comes after
first ripping out the x87 code gen, which just needs some more docs written
before it's merged. note that i'm speaking specifically of MXCSR register
save and restore, not the heftier operations you might be thinking of.
FPU mode state save and restore is already done on EVERY OS when switching
threads/processes, and in the Agner Fog latency tables the cost of
manipulating the MXCSR register is pretty small!
https://www.agner.org/optimize/instruction_tables.pdf
LDMXCSR (restore) and STMXCSR (save) have CPU latencies of roughly 5-20
cycles (more often 8-15). so having the current C FFI calls set the
default C FPU environment (which is what we ordinarily have today) is super
doable and ensures no breakage of existing C bindings, plus we can have a
new ccall variant that inherits the host haskell thread's FPU state. we're
talking sub-10-nanosecond overhead on x86 and x86_64 platforms (and either
way, on those platforms GHC will soon only be using SSE2 or higher).
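
a rough sketch of what the two call flavours could look like in C, assuming
an x86/x86_64 target with SSE and the <xmmintrin.h> intrinsics. the function
names and the DEFAULT_MXCSR constant are made up for illustration; this is
not the actual RTS code, just the save / set default / call / restore shape:

```c
#include <xmmintrin.h>  /* _mm_getcsr / _mm_setcsr map to STMXCSR / LDMXCSR */

/* Power-on default MXCSR: all exceptions masked, round-to-nearest. */
#define DEFAULT_MXCSR 0x1F80u

/* Hypothetical foreign routine: reports the rounding mode it runs under. */
static unsigned int foreign_observed_rounding(void) {
    return _MM_GET_ROUNDING_MODE();
}

/* "Classic" ccall flavour: the callee always sees the default environment. */
static unsigned int call_with_default_mxcsr(unsigned int (*f)(void)) {
    unsigned int saved = _mm_getcsr();  /* save the caller's MXCSR */
    _mm_setcsr(DEFAULT_MXCSR);          /* hand the callee the default state */
    unsigned int result = f();
    _mm_setcsr(saved);                  /* restore; flags the callee raised are dropped */
    return result;
}

/* Hypothetical new flavour: the callee inherits the caller's MXCSR as-is. */
static unsigned int call_inheriting_mxcsr(unsigned int (*f)(void)) {
    return f();
}
```

the first wrapper costs one STMXCSR plus two LDMXCSR per call, the second
costs nothing extra.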
point being: aside from the AMD Piledriver microarchitecture and some
stuff from VIA, the CPU instructions for setting up the signalling-NaN
state, the rounding mode and so on are cheap, so this should work perfectly
well, @Daniel Cartwright
On Thu, 7 Feb 2019 at 17:22, Henning Thielemann <lemming@henning-thielemann.de> wrote:
[...] What about calling into foreign code? If I call a BLAS routine and one element of the result vector is NaN, shall this be trapped? Or shall it be trapped once I access the NaN element?
IMHO this is the biggest show stopper for some exotic NaN handling, as correct as it may be mathematically or aesthetically: The floating point environment is a thread-local (i.e. basically global) entity on most platforms, and most programming language runtimes expect a "default" environment, i.e. no traps when NaNs are encountered. So if Haskell wants to do things differently, the FPE has to be set/reset around foreign calls and for around every Haskell callback. I am not sure if this is really worth the trouble and the performance loss. For some special applications it might be OK or even important, but my gut feeling is that trapping NaNs is the wrong default in our current world...