
David Roundy wrote:
On Wed, Mar 26, 2008 at 05:07:10PM -0700, Don Stewart wrote:
droundy:
On Thu, Mar 27, 2008 at 01:09:47AM +0300, Bulat Ziganshin wrote:
Collecting rendering stats is not easy without global variables. It occurs to me that it would be neat if there were some sort of write-only global variables that can be incremented by pure code but can only be read from within monadic code; that would be sufficient to ensure that the pure code wasn't affected by the values.
The code is called *pure* exactly because it has no side effects, so the compiler may choose either to call a function twice or to reuse an already computed result. Actually, you can produce side effects with unsafePerformIO, but there are no guarantees about how many times such code will be executed. Try this:
plus a b = unsafePerformIO (modifyIORef counter (+1)) `seq` a+b
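For anyone who wants to try the one-liner above, here is a minimal, self-contained sketch. The definition of `counter`, the NOINLINE pragmas, and the `main` driver are assumptions added for illustration; they are not part of the original message. The increment is also tied to the arguments (rather than using the `seq` form) so that the optimiser is less likely to float it out and run it only once. Even so, the count it reports is not guaranteed by the language.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Global counter (assumed definition). NOINLINE keeps GHC from
-- duplicating the newIORef call, which would create several counters.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

-- Pure-looking addition that bumps the counter as a side effect.
-- The result is returned from inside the IO action, so the action
-- depends on both arguments and cannot be shared as a constant.
{-# NOINLINE plus #-}
plus :: Int -> Int -> Int
plus a b = unsafePerformIO $ do
  modifyIORef' counter (+1)
  return (a + b)

main :: IO ()
main = do
  let total = foldr plus 0 [1 .. 10]
  print total                      -- forces the fold; the sum is 55
  n <- readIORef counter           -- read only after the sum was forced
  putStrLn ("plus was evaluated " ++ show n ++ " times")
```

The counter is only read from `main`, in IO, which is roughly the "write-only from pure code" arrangement asked for earlier in the thread. How many increments actually happen can still vary with optimisation flags.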
This is exactly what he wants to do. The point of putting traces into the code is precisely to figure out how many times it is called. The only trouble is that unsafePerformIO (I believe) can inhibit optimizations, since there are certain transformations that ghc won't do to unsafePerformIO code.
Could we just use -fhpc or profiling here? HPC at least will tell you how many times top-level things are called, and print pretty graphs about it.
It depends on what the point is. I've found traces to be very helpful at times when debugging (for instance, to get values as well as counts). Also, I imagine that manual tracing is likely to be far less invasive (if you do it somewhat discreetly) than profiling or using hpc.
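As a rough illustration of tracing values rather than just counts, here is a small sketch using `Debug.Trace.trace` from base. The function `hitDistance` and its one-dimensional "ray" are hypothetical stand-ins, not code from the ray tracer under discussion.

```haskell
import Debug.Trace (trace)

-- Hypothetical stand-in for an intersection test: distance along a
-- 1-D ray from `origin` to a plane at x = 10, if the ray points at it.
hitDistance :: Double -> Double -> Maybe Double
hitDistance origin dir
  | dir <= 0  = Nothing
  | otherwise =
      let t = (10 - origin) / dir
      -- trace writes its message to stderr when the result is demanded
      in trace ("hitDistance: t = " ++ show t) (Just t)

main :: IO ()
main = print (map (hitDistance 0) [-1, 2, 5])
```

The trace messages appear only for values that are actually evaluated, and their order follows evaluation order rather than source order, so they are best treated as a debugging aid and not as exact statistics.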
The unsafePerformIO looks like what I want. Profiling isn't really that helpful in this situation, since sometimes what you want is the number of times something gets called per ray, so that you can add a bit to the color value of the corresponding pixel. Something like this http://syn.cs.pdx.edu/~jsnow/glome/dragon-bih.png tells you a lot more about where your code is spending its time (the bright green places) than some numbers from a profiler. I could return the relevant stats as part of the standard results from ray-intersection tests, but I think that would clutter the code unnecessarily.

Thanks everyone for the advice, it'll keep me busy for a while.

I've converted over to doubles; it seems to be about 10% faster or so with -fvia-C than regular floats with -fasm. (I'm using ghc 6.8.2 by the way, which seems to generate faster code than the 6.6.1 version I was using earlier, so maybe the difference between -fasm and -fvia-C isn't as significant as it used to be.)

I'm looking into using ByteString, but it doesn't seem compatible with "lex" and "reads". I should probably do more heap profiling before I get too carried away, though.

-jim
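For comparison, the pure alternative mentioned above (returning the stats along with the intersection result) might look roughly like the sketch below. All names here (`Hit`, `testOne`, `nearest`, `castRay`) and the one-dimensional geometry are hypothetical and only meant to show the shape of the approach, including the extra plumbing it adds.

```haskell
-- Hypothetical result type: nearest hit distance plus the number of
-- intersection tests that were performed to find it.
data Hit = Hit { hitDist :: Maybe Double, hitTests :: !Int }
  deriving (Show, Eq)

-- Hypothetical primitive test on a 1-D ray starting at `ray`:
-- an object at `centre` is hit only if it lies ahead of the ray.
testOne :: Double -> Double -> Hit
testOne ray centre
  | centre >= ray = Hit (Just (centre - ray)) 1
  | otherwise     = Hit Nothing 1

-- Keep the nearer hit and add up the test counts.
nearest :: Hit -> Hit -> Hit
nearest (Hit a m) (Hit b n) = Hit (closer a b) (m + n)
  where
    closer (Just x) (Just y) = Just (min x y)
    closer x        Nothing  = x
    closer Nothing  y        = y

-- Test the ray against every object, threading the count through.
castRay :: Double -> [Double] -> Hit
castRay ray = foldr (nearest . testOne ray) (Hit Nothing 0)

main :: IO ()
main = print (castRay 0 [3, -1, 7])
```

The count is deterministic here, since it is part of the value, but every combinator has to carry it, which is the clutter being weighed against the unsafePerformIO approach.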