Re: [Haskell-beginners] The cost of generality, or how expensive is realToFrac?

15 Sep 2010

Hey, thanks, Daniel.

I hadn't come across rewrite rules yet.  They definitely look like something worth learning, though I'm not sure I'm prepared to start making custom versions of OpenGL.Raw...

It looks like I managed to put that battle off for another day, however.  I did look at how realToFrac is implemented and (as you mention) it does the fromRational . toRational transform pair suggested in a number of sources, including Real World Haskell.  Looking at what toRational is doing, creating a ratio of integers out of a float, it seems like a crazy amount of effort to go through just to convert floating point numbers.

Looking at the RealFloat class rather than Real and Fractional, it seems like this is a much more efficient way to go:

floatToFloat :: (RealFloat a, RealFloat b) => a -> b
floatToFloat = (uncurry encodeFloat) . decodeFloat

I substituted this in for realToFrac and I'm back to close to my original performance.  Playing with a few test cases in ghci, it looks numerically equivalent to realToFrac.

This raises the question, though-- am I doing something dangerous here?  Why isn't this the standard approach?

If I understand what's happening, decodeFloat and encodeFloat are breaking the floating point numbers up into their constituent parts-- presumably by bit masking the raw binary.  That would explain the performance improvement.  I suppose there is some implementation dependence here, but as long as the encode and decode are implemented as a matched set then I think I'm good.
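
A minimal sketch of how one might probe that question, assuming GHC.  floatToFloat is the definition given above; the sample values and the main driver are illustrative assumptions.  The samples are chosen to be exactly representable in both Float and Double, so the comparison says nothing about how narrowing rounds.  The NaN and infinity lines are there to be inspected rather than trusted, since those are the values where a decode/encode round trip is most likely to differ from a primitive conversion.

```haskell
-- Sketch only: floatToFloat is the definition from the message above;
-- the sample values are illustrative assumptions.
floatToFloat :: (RealFloat a, RealFloat b) => a -> b
floatToFloat = uncurry encodeFloat . decodeFloat

-- Values exactly representable in both Float and Double, so no rounding
-- question arises when narrowing.
samples :: [Double]
samples = [0, 0.5, 1, -2, 3.25, 1024]

main :: IO ()
main = do
  -- decodeFloat yields a (mantissa, exponent) pair; for finite x the
  -- value of x is mantissa times 2 raised to the exponent.
  print (decodeFloat (0.5 :: Double))
  -- Compare against realToFrac on the finite samples.
  print (all (\x -> (floatToFloat x :: Float) == realToFrac x) samples)
  -- IEEE special values: print what comes back instead of assuming it.
  let nan = 0 / 0 :: Double
      inf = 1 / 0 :: Double
  print (decodeFloat nan, decodeFloat inf)
  print (floatToFloat nan :: Double, floatToFloat inf :: Double)
```

If the last line does not print NaN and Infinity, the shortcut is only safe for inputs known to be finite.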

Cheers--
 Greg

On Sep 15, 2010, at 1:56 AM, Daniel Fischer wrote:
...
On Wednesday 15 September 2010 02:51:01, Greg wrote:
...
First, to anyone who recognizes me by name, thanks to the help I've been
getting here I've managed to put together a fairly complex set of code
files that all compile together nicely, run, and do exactly what I
wanted them to do.  Success!
The trouble is that my implementation is dog slow.
Fortunately, this isn't the first time I've been in over my head and I
started by putting up some simpler scaffolding- which runs much more
quickly.  Working backwards, it looks like the real bottleneck is in
the data types I've created, the type variables I've introduced, and the
conversion code I needed to insert to make it all happy.
I'm not sure it helps, but I've attached a trimmed down version of the
relevant code.  What should be happening is my pair  is being converted
to the canonical form for Coord2D which is Cartesian2D and then
converted again to Vertex2.  There shouldn't be any change made to the
values, they're only being handed from one container to another in this
case (Polar coordinates would require computation, but I've stripped
that out for the time being).  However, those handoffs require calls to
realToFrac to make the type system happy, and that has to be what is
eating up all my CPU.
Not all, but probably a big chunk of it.
The problem is that the default implementation of realToFrac is
realToFrac = fromRational . toRational
a) with that implementation, realToFrac :: Double -> Double is not the 
identity (doesn't respect NaNs)
b) it's slow, there are no special operations to convert Double, Float etc. 
from/to Rational.
For a lot of types, GHC provides rewrite rules (you need to compile with 
optimisations to have them fire) which give faster versions (with somewhat 
different behaviour, e.g. realToFrac :: Double -> Double is rewritten to 
id, realToFrac between Float and Double uses primitive widening/narrowing 
ops, for several newtype wrappers around Float/Double there are rules too).
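
Point a) can be checked with a small sketch.  The name viaRational is made up for this example; it spells out the default definition so that no rewrite rule for realToFrac can replace it, whatever the optimisation level.  A Rational cannot represent NaN, so the round trip cannot return one.

```haskell
-- The default definition quoted above, written out under a hypothetical
-- name so that rewrite rules targeting realToFrac do not apply to it.
viaRational :: Double -> Double
viaRational = fromRational . toRational

main :: IO ()
main = do
  let nan = 0 / 0 :: Double
  -- An ordinary finite value survives the round trip.
  print (viaRational 1.5 == 1.5)
  -- The input is NaN; the output of the round trip is not.
  print (isNaN nan, isNaN (viaRational nan))
```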
...
I think there are probably 4 calls to realToFrac.  If I walk through the
 code, the result, given the pair p, should be: Vertex2 (realToFrac
(realToFrac (fst p)))  (realToFrac (realToFrac (snd p)))
I'd like to maintain type independence if possible, but I expect most
uses of this code to feed Doubles in for processing and probably feed
GLclampf (Floats, I believe)
newtype wrapper around CFloat, which is a newtype wrapper around Float
Unfortunately, there are no rewrite rules in the module where it is 
defined, nor, apparently, in any other module that has access to the 
constructor. And the constructor is not accessible from any of the exposed 
modules, so as far as I know, you can't provide your own rewrite rules.
...
to the OpenGL layer.  If there's a way to
do so, I wouldn't mind optimizing for that particular set of types.
 I've tried GLdouble, and it doesn't really improve things with the
current code.
Is there a way to short circuit those realToFrac calls if we know the
input and output are the same type?  Is there a way to merge the nested
calls?
You can try rewrite rules
{-# RULES
  "realToFrac2/realToFrac"         realToFrac . realToFrac = realToFrac
  "realToFrac/id"                  realToFrac = id
  #-}
but I'm afraid the second won't work at all; then you'd have to specify all 
interesting cases yourself (there are rules for the cases Double -> Double 
and Float -> Float in GHC.Float, rules for converting from/to CFloat and 
CDouble in Foreign.C.Types, so those should be fine too)
  "realToFrac/GLclampf->GLclampf"  realToFrac = id :: GLclampf -> GLclampf
and what else you need.
Whether the first one will help (or even work), I don't know either, you 
have to try.
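
As a rough sketch of what a type-specific rule looks like when the constructor is in scope: Clamp and doubleToClamp below are hypothetical stand-ins, not the real GLclampf or anything from OpenGLRaw.  double2Float is the primitive narrowing function exported by GHC.Float.  Whether the rule actually fires depends on compiling with optimisations, so the result should be the same either way and only the speed should differ.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import GHC.Float (double2Float)

-- Hypothetical stand-in for a wrapper such as GLclampf. Unlike the real
-- type, its constructor is visible here, which is what makes the rule
-- below possible to write.
newtype Clamp = Clamp Float
  deriving (Eq, Ord, Show, Num, Fractional, Real)

-- Direct conversion using the primitive narrowing op (hypothetical helper).
doubleToClamp :: Double -> Clamp
doubleToClamp = Clamp . double2Float

-- Ask GHC to replace realToFrac at type Double -> Clamp with the direct
-- conversion. Rules only fire when compiling with optimisations.
{-# RULES
"realToFrac/Double->Clamp" realToFrac = doubleToClamp
  #-}

main :: IO ()
main = print (realToFrac (0.25 :: Double) :: Clamp)
```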
...
Any other thoughts on what I can do here?  The slowdown between the two
implementations is at least 20x, which seems like a steep penalty to
pay.
In case of emergency, put the needed rewrite rules into the source of 
OpenGLRaw yourself.
...
And while I'm at it, is turning on FlexibleInstances the only way to
create an instance for (a,a)?
Yes. Haskell98 doesn't allow such instance declarations, so you need the 
extension.
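
For illustration, a minimal sketch of such an instance.  The class ToCartesian is a hypothetical stand-in for the Coord2D class mentioned above, whose definition is not shown in this thread.  The repeated type variable in the instance head (a, a) is what Haskell98 rejects and FlexibleInstances permits.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical class standing in for the poster's Coord2D.
class ToCartesian c where
  toCartesian :: c -> (Double, Double)

-- The head (a, a) repeats a type variable, so this needs FlexibleInstances.
instance Real a => ToCartesian (a, a) where
  toCartesian (x, y) = (realToFrac x, realToFrac y)

main :: IO ()
main = print (toCartesian (3 :: Int, 4 :: Int))
```

Both components need the same explicit type at the call site; with bare numeric literals the two component types are not forced to be equal, so the instance is not selected.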