
On Thursday 16 September 2010 21:47:43, Greg wrote:
Wow, thanks for all the analysis on this, Daniel!
So, I think the summary of your evaluation is that realToFrac does just fine in almost all cases
I wouldn't say that. realToFrac does fine in those cases where a rewrite rule provides a fast conversion (or one of the types bewteen which you want to convert is Rational, when you get fromRational or toRational - the latter can be improved in several cases).
due to careful optimization and outperforms my floatToFloat trick (which answers the question I flagrantly begged... ;~) except that I can't access those optimizations because the OpenGL types are hidden behind newtypes and buried in a library.
I've confirmed your results to make sure I could.
I went a step further and typed convert for GLclampf expecting to see unoptimized performance, but it ran just as fast as Double->Float.
convert :: Double -> GL.GLclampf convert = realToFrac
It would be interesting to see what core GHC produces for that (you can get the core with the `-ddump-simpl' command line flag [redirect stdout to a file] or with the ghc-core tool [available on hackage]). If it runs as fast as realToFrac :: Double -> Float (with optimisations), GHC must have rewritten realToFrac to double2Float# and it should only do that if there are rewrite rules for GLclampf. In that case, the problem is probably that GHC doesn't see the realToFrac applications because they're too deeply wrapped in your coordToCoord2D calls. If that is the problem, it might help to use {-# INLINE #-} pragmas on coordToCoord2D, fromCartesian2D and toCartesian2D. Can you try with realToFrac and the {-# INLINE #-} pragmas?
And still ran faster than floatToFloat. However there's no denying that floatToFloat runs *much* faster than realToFrac in the larger application. Profiling shows floatToFloat taking about 50% of my CPU
That's too much for my liking, a simple conversion from Double to Float shouldn't take long, even if the Float is wrapped in newtypes (after all, the newtypes don't exist at runtime).
time, but the frame rate is close to (though not quite) 30 fps. Using realToFrac takes seconds just to display the first frame.
As it stands, floatToFloat is responsible for 50% of my CPU time which I think is mainly a tribute the OpenGL implementation, but still makes it the obvious target for optimization. Having gotten all excited about the new tool in my box, I wrote the following rule:
{-# RULES "floatToFloat/id" floatToFloat=id "floatToFloat x2" floatToFloat . floatToFloat = floatToFloat #-}
Neither of which seems to fires in this application,
GHC reports fired rules with -ddump-simpl-stats. Getting rules to fire is a little brittle, GHC does not try too hard to match expressions with rules, and if several rules match, it chooses one arbitrarily, so your rules may have been missed because the actual code looked different (perhaps because other rewrite rules fired first).
but I did get the first one to fire by importing the same file into my benchmark.
The next obvious step is to optimize a level up in the call hierarchy, and rewrite coordToCoord2D since I know my original pair2vertex function was faster. So, I added this rule in the file where Vertex2 is made an instance of Coord2D:
{-# RULES "coordToCoord2D/p2v2" coordToCoord2D = pair2vertex #-}
which refers to two functions, each in different files (I don't think that matters, but mention it just in case)
pair2vertex :: (Num a) => (a,a) -> GL.Vertex2 a pair2vertex (x,y) = GL.Vertex2 x y
coordToCoord2D :: (Coord2D a, Coord2D b) => a -> b coordToCoord2D = fromCartesian2D . toCartesian2D
directly after my coordToCoord2D definition I have this rule as well:
{-# RULES "coordToCoord2D/id" coordToCoord2D = id "coordToCoord2D x2" coordToCoord2D . coordToCoord2D = coordToCoord2D #-}
I get a compile time error that I can't make sense of. It's asking me to put a context on my rule, but I can't find any references on how to do that...
-----------
Could not deduce (Num a) from the context (Coord2D (Vertex2 a), Coord2D (a, a)) arising from a use of `pair2vertex' at GCB/OpenGL/Geometry.hs:32:40-50 Possible fix: add (Num a) to the context of the RULE "coordToCoord2D/p2v2" In the expression: pair2vertex When checking the transformation rule "coordToCoord2D/p2v2"
-----------
The file:line:column is the "pair2vertex" token in the rule I list above.
Yes, there's a Num constraint on pair2vertex, but not on coordToCoord2D, so it's not type correct. You could try removing the Num constraint from pair2vertex or add the constraint to the rule, {-# RULES "coordToCoord2D/p2v2" forall a. Num a => (coordToCoord2D :: (a,a) -> GL.Vertex2 a) = pair2vertex #-} (well, I don't know whether that's the correct way, but you can try).
Cheers-- Greg