RealFrac methods for Double and Float

Greetings,
I have put together a package to test possible implementations of the RealFrac methods for Double and Float (base-2 IEEE754) and uploaded a .tar.gz bundle to http://hackage.haskell.org/trac/ghc/ticket/2271 .
On the one hand, pure Haskell implementations, on the other hand implementations calling out to rint[f], trunc[f], floor[f] and ceil[f] from math.h.
Both ways go via Integer by default, with a specialised faster implementation for Int (and narrower types, but those RULES haven't yet been written) enabled by a rewrite rule.
Overall, the pure Haskell implementations don't fare badly on my computer. All give a speedup compared to the current implementation, for most conversions, pure Haskell is on par with or faster than the C-call (although that would probably change if the C functions were made primops).
The FFI calls are significantly faster for properFraction :: Double -> (Integer, Double) and for round (except round :: Integral a => Float -> a when compiled via C, then native and FFI are on par).
Sample results for the speedups against the current implementation (note: for truncate :: x -> Int, the Prelude value is fst . properFraction, not the rewritten float2Int or double2Int) are included in the tarball.
I would appreciate feedback from your tests/benchmarks on other platforms, especially 64-bit platforms (mine is x86 linux, 32 bit).
To run the QuickCheck tests, you need QuickCheck-2.*, to run the benchmarks, criterion.
More instructions in the README.
Thanks, Daniel
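As an illustration of the "via Integer by default, Int via a rewrite rule" arrangement described in this message, here is a minimal sketch. It is not the code from the tarball: the names floorDouble and floorDoubleInt are hypothetical, the generic path simply reuses the Prelude's floor as a stand-in, and the Int path assumes the value fits in an Int.

```haskell
import GHC.Float (double2Int, int2Double)

-- Generic path (hypothetical name): goes through Integer. The Prelude's
-- 'floor' stands in here for a hand-written implementation.
floorDouble :: Integral a => Double -> a
floorDouble x = fromInteger (floor x)
{-# NOINLINE [1] floorDouble #-}

-- Specialised path for Int (hypothetical name): truncate with the
-- primitive conversion, then step down by one if truncation rounded
-- up, which happens for negative non-integral values.
-- Assumption: the input is finite and fits in an Int.
floorDoubleInt :: Double -> Int
floorDoubleInt x =
  let n = double2Int x
  in if int2Double n > x then n - 1 else n

-- The rewrite rule replaces the generic function by the fast one
-- wherever it is used at type Double -> Int.
{-# RULES
"floorDouble/Int" floorDouble = floorDoubleInt
  #-}
```

GHC only applies rewrite rules when optimisation is enabled (-O); without it the generic path is used, which gives the same results for values in Int range.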

On Sun, Oct 10, 2010 at 7:02 PM, Daniel Fischer wrote:
> [...]
I got a lot of errors (or warnings?) during compilation. Is it something I should worry about?
I'm using GHC 6.12.1 on 64-bit Linux.
Antoine
+++++
$ sh build.sh
[1 of 1] Compiling Main ( getSummary.hs, getSummary.o )
Linking getSummary ...
building with the NCG
[1 of 1] Compiling RFDouble ( RFDouble.hs, RFDouble.o )
RFDouble.hs:88:31: Not in scope: `negateInt64#'
RFDouble.hs:89:33: Not in scope: `negateInt64#'
RFDouble.hs:92:37: Not in scope: `minusInt64#'
RFDouble.hs:111:31: Not in scope: `negateInt64#'
RFDouble.hs:112:33: Not in scope: `negateInt64#'
RFDouble.hs:146:40: Not in scope: `minusInt64#'
[1 of 1] Compiling RFFloat ( RFFloat.hs, RFFloat.o )
[1 of 2] Compiling RFDouble ( RFDouble.hs, RFDouble.o )
RFDouble.hs:88:31: Not in scope: `negateInt64#'
RFDouble.hs:89:33: Not in scope: `negateInt64#'
RFDouble.hs:92:37: Not in scope: `minusInt64#'
RFDouble.hs:111:31: Not in scope: `negateInt64#'
RFDouble.hs:112:33: Not in scope: `negateInt64#'
RFDouble.hs:146:40: Not in scope: `minusInt64#'
[2 of 2] Compiling Main ( benchFloat.hs, benchFloat.o )
Linking benchFloat ...
building via C
[1 of 1] Compiling RFDouble ( RFDouble.hs, RFDouble.o )
RFDouble.hs:88:31: Not in scope: `negateInt64#'
RFDouble.hs:89:33: Not in scope: `negateInt64#'
RFDouble.hs:92:37: Not in scope: `minusInt64#'
RFDouble.hs:111:31: Not in scope: `negateInt64#'
RFDouble.hs:112:33: Not in scope: `negateInt64#'
RFDouble.hs:146:40: Not in scope: `minusInt64#'
[1 of 1] Compiling RFFloat ( RFFloat.hs, RFFloat.o )
[1 of 2] Compiling RFDouble ( RFDouble.hs, RFDouble.o )
RFDouble.hs:88:31: Not in scope: `negateInt64#'
RFDouble.hs:89:33: Not in scope: `negateInt64#'
RFDouble.hs:92:37: Not in scope: `minusInt64#'
RFDouble.hs:111:31: Not in scope: `negateInt64#'
RFDouble.hs:112:33: Not in scope: `negateInt64#'
RFDouble.hs:146:40: Not in scope: `minusInt64#'
Linking cbenchFloat ...
[1 of 2] Compiling RFDouble ( RFDouble.hs, RFDouble.o )
RFDouble.hs:88:31: Not in scope: `negateInt64#'
RFDouble.hs:89:33: Not in scope: `negateInt64#'
RFDouble.hs:92:37: Not in scope: `minusInt64#'
RFDouble.hs:111:31: Not in scope: `negateInt64#'
RFDouble.hs:112:33: Not in scope: `negateInt64#'
RFDouble.hs:146:40: Not in scope: `minusInt64#'
+++++

On Monday 11 October 2010 02:31:13, Antoine Latter wrote:
> I got a lot of errors (or warnings?) during compilation.
Yuck.
> Is it something I should worry about?
No.
> I'm using GHC 6.12.1 on 64-bit Linux.
Just means I should've looked closely at the #if's. On 64 bits, data Int64 = I64# Int# and apparently the explicit 64-bit shifts and adds aren't defined (I thought they were aliased to Int# shifts etc). Give me a couple of minutes to fix it.
> Antoine
> +++++
> $ sh build.sh
> [1 of 1] Compiling Main ( getSummary.hs, getSummary.o )
> Linking getSummary ...
> building with the NCG
> [1 of 1] Compiling RFDouble ( RFDouble.hs, RFDouble.o )
> RFDouble.hs:88:31: Not in scope: `negateInt64#'
> RFDouble.hs:89:33: Not in scope: `negateInt64#'

On Sun, Oct 10, 2010 at 7:02 PM, Daniel Fischer wrote:
> [...]
I have results from my Intel-based MacBook, 64-bits, GHC 7 rc. The quickchecks failed to run:
QuickChecking Double properFraction/Int
qcDouble: qcDouble.hs:(10,10)-(15,5): Missing field in record construction Test.QuickCheck.Test.chatty
QuickChecking Float properFraction/Int
qcFloat: qcFloat.hs:(10,10)-(15,5): Missing field in record construction Test.QuickCheck.Test.chatty
This is with QuickCheck 2.3.0.2.
Take care, Antoine
$ sh bench.sh
Results from ncgDouble:
Relations for    Prelude    C via Integer  Hs via Integer  C Int      Hs Int
properFraction   1.000000   2.429916       1.904705        17.251877  28.596726
truncate         1.000000   3.743818       3.535077        15.390511  25.461061
floor            1.000000   3.853145       4.790661        14.547410  15.272021
ceiling          1.000000   4.234731       2.934738        14.128199  15.183423
round            1.000000   4.380557       1.836171        27.479207  9.352543
Results from viaCDouble:
Relations for    Prelude    C via Integer  Hs via Integer  C Int      Hs Int
properFraction   1.000000   2.424372       1.899951        17.387781  28.502385
truncate         1.000000   3.735578       3.533399        15.513855  25.389048
floor            1.000000   3.844556       4.793321        14.418142  15.235541
ceiling          1.000000   4.238978       2.939887        14.255595  15.012826
round            1.000000   4.375220       1.832638        27.413318  9.386865
Results from ncgFloat:
Relations for    Prelude    C via Integer  Hs via Integer  C Int      Hs Int
properFraction   1.000000   0.333999       0.499852        4.417461   4.558648
truncate         1.000000   0.496294       0.525865        4.311807   4.556461
floor            1.000000   0.519939       0.558956        4.611193   4.154393
ceiling          1.000000   0.532954       0.572631        4.608487   4.234378
round            1.000000   0.618715       0.575423        5.438467   3.705514
Results from viaCFloat:
Relations for    Prelude    C via Integer  Hs via Integer  C Int      Hs Int
properFraction   1.000000   0.334372       0.500771        4.479688   4.716925
truncate         1.000000   0.499481       0.529176        4.343629   4.589870
floor            1.000000   0.520805       0.559927        4.631558   4.216547
ceiling          1.000000   0.532622       0.574791        4.635416   4.266135
round            1.000000   0.619787       0.577242        5.477356   3.731998

On Monday 11 October 2010 04:44:24, Antoine Latter wrote:
> I have results from my Intel-based MacBook, 64-bits, GHC 7 rc. The quickchecks failed to run:
> QuickChecking Double properFraction/Int
> qcDouble: qcDouble.hs:(10,10)-(15,5): Missing field in record construction Test.QuickCheck.Test.chatty
> QuickChecking Float properFraction/Int
> qcFloat: qcFloat.hs:(10,10)-(15,5): Missing field in record construction Test.QuickCheck.Test.chatty
> This is with QuickCheck 2.3.0.2.
Oops, I didn't even notice that QC-2.3.* was out, let alone that they added a field to Args. Sorry again.
> Take care, Antoine
Thanks for benchmarking.
Your numbers come as an unpleasant surprise. The factors are generally lower than anywhere else I've so far got feedback from, but what's really disconcerting is that the Float -> Integer conversions are actually slower than the current, apparently. I have no idea yet how that can be.
Cheers, Daniel
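To make the reading of the "Relations" figures concrete, here is a small sketch that picks out the implementations slower than the Prelude baseline. It assumes, as the discussion suggests, that each figure is a speedup factor relative to the Prelude (so a value below 1 means slower). The figures are copied from the ncgFloat properFraction results reported in this thread; the function and value names are hypothetical.

```haskell
-- Ratios copied from the reported "ncgFloat" properFraction results.
-- Assumption: ratio = speedup relative to the Prelude implementation.
ncgFloatProperFraction :: [(String, Double)]
ncgFloatProperFraction =
  [ ("C via Integer",  0.333999)
  , ("Hs via Integer", 0.499852)
  , ("C Int",          4.417461)
  , ("Hs Int",         4.558648)
  ]

-- Names of the implementations whose ratio is below 1, i.e. those
-- that are slower than the Prelude baseline under the assumption above.
slowerThanPrelude :: [(String, Double)] -> [String]
slowerThanPrelude rows = [ name | (name, ratio) <- rows, ratio < 1 ]
```

Applied to ncgFloatProperFraction this yields the two "via Integer" variants, which matches the observation that the Float -> Integer conversions came out slower than the current implementation.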

On Monday 11 October 2010 04:44:24, Antoine Latter wrote:
> I have results from my Intel-based MacBook, 64-bits, GHC 7 rc.
> Results from ncgFloat:
> Relations for properFraction:
> Prelude 1.000000
> C via Integer 0.333999
> Hs via Integer 0.499852
> C Int 4.417461
> Hs Int 4.558648
Puzzle solved. I have run the benchmarks with the latest HEAD and got broadly similar results.
GHC's new code generator does some incredible stuff with the code for Float's RealFrac instance, properFraction has become about _seven times_ faster, floor and ceiling about _thirteen times_ and round about _10.5 times_ (values for my box, the exact numbers will differ, but the tendency will be the same). Just Wow!
My code for properFraction :: Float -> (Integer, Float) has become 1.7 times slower when calling out to C, everything else has either changed hardly at all or become slightly faster (5-10%).
The overall result is that for Float, in the general case, the current implementation is faster and overall the speedups are less impressive than they were for 6.12.
So I'll have to re-learn what GHC does with which kind of code.
Cheers, Daniel
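For readers who want to see what a pure Haskell properFraction :: Float -> (Integer, Float) going via Integer can look like, here is a minimal sketch built on decodeFloat. It is not the implementation from the tarball or from GHC's base library; the name properFractionViaDecode is hypothetical, and the input is assumed to be finite (no NaN or infinity).

```haskell
-- Hypothetical sketch: split a finite Float into its integral part
-- (truncated towards zero) and the remaining fraction.
properFractionViaDecode :: Float -> (Integer, Float)
properFractionViaDecode x =
  case decodeFloat x of
    -- decodeFloat gives (m, e) with x == m * 2^e
    (m, e)
      -- Non-negative exponent: the value is already an integer.
      | e >= 0    -> (m * 2 ^ e, 0)
      -- Negative exponent: drop the fractional bits of the mantissa,
      -- truncating towards zero, and recover the fraction by subtraction.
      | otherwise ->
          let q = m `quot` (2 ^ negate e)
          in (q, x - fromInteger q)
```

In the negative-exponent branch the integral part has no more significant bits than the Float mantissa, so converting it back with fromInteger and subtracting does not lose precision.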

On October 11, 2010 15:12:12 Daniel Fischer wrote:
> GHC's new code generator does some incredible stuff with the code for Float's RealFrac instance, properFraction has become about _seven times_ faster, floor and ceiling about _thirteen times_ and round about _10.5 times_ (values for my box, the exact numbers will differ, but the tendency will be the same).
Out of curiosity, by new code generator, do you mean just the stock GHC 7 code path, or the new code generation path that is now available but disabled by default unless you use some flag (which I forget right now)?
Cheers! -Tyson

On Tue, Oct 12, 2010 at 10:02 AM, Tyson Whitehead wrote:
> On October 11, 2010 15:12:12 Daniel Fischer wrote:
>> GHC's new code generator does some incredible stuff with the code for Float's RealFrac instance, properFraction has become about _seven times_ faster, floor and ceiling about _thirteen times_ and round about _10.5 times_ (values for my box, the exact numbers will differ, but the tendency will be the same).
> Out of curiosity, by new code generator, do you mean just the stock GHC 7 code path, or the new code generation path that is now available but disabled by default unless you use some flag (which I forget right now)?
> Cheers! -Tyson
When I did the GHC 7 timings I simply ran the shell script provided in the linked tarball - no funny business.
Antoine

On Tuesday 12 October 2010 17:15:14, Antoine Latter wrote:
> On Tue, Oct 12, 2010 at 10:02 AM, Tyson Whitehead wrote:
>> On October 11, 2010 15:12:12 Daniel Fischer wrote:
>>> GHC's new code generator does some incredible stuff with the code for Float's RealFrac instance, properFraction has become about _seven times_ faster, floor and ceiling about _thirteen times_ and round about _10.5 times_ (values for my box, the exact numbers will differ, but the tendency will be the same).
>> Out of curiosity, by new code generator, do you mean just the stock GHC 7 code path, or the new code generation path that is now available but disabled by default unless you use some flag (which I forget right now)?
>> Cheers! -Tyson
> When I did the GHC 7 timings I simply ran the shell script provided in the linked tarball - no funny business.
> Antoine
Yep, just a vanilla perf build of HEAD.
participants (3)
- Antoine Latter
- Daniel Fischer
- Tyson Whitehead