
On 2010-03-01 19:37 +0000 (Mon), Thomas Schilling wrote:
A possible workaround would be to sprinkle lots of 'rnf's around your code....
As I learned rather to my chagrin on a large project, you generally don't want to do that. I spent a couple of days writing instances of NFData and loading up my application with rnfs, and then watched performance fall into a sinkhole. I believe the problem is that rnf traverses the entirety of a large data structure even if it's already strict and doesn't need traversal. My guess is that doing this frequently on data structures (such as Maps) of more than tiny size was blowing out my cache.

I switched strategies to forcing a deep(ish) evaluation of only newly constructed data instead. For example, after inserting a newly constructed object into a Map, I would look it up and force evaluation only of the result of that lookup (sketched below). That solved my space-leak problem and made things chug along quite nicely.

Understanding the general techniques for this sort of thing, and seeing where you're likely to need to apply them, isn't all that difficult once you understand the problem. (It's probably much easier if you don't have to work it all out for yourself, as I did. Someone needs to write the "how to manage laziness in Haskell" guide.) The difficult part is that you've really got to stay on top of it, because if you don't, the space leaks come back and you have to go find them again. It feels a little like dealing with buffers and their lengths in C.
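Concretely, the insert-then-lookup-and-force trick looks something like the following. This is a minimal sketch, not the project's actual code: insertForced is my own illustrative name, and I'm assuming the containers and deepseq packages.

    import qualified Data.Map as Map
    import Data.Map (Map)
    import Control.DeepSeq (NFData, deepseq)

    -- Insert a newly built value into a (lazy) Map, then look it up
    -- and deep-force only that one value, so the rest of the Map is
    -- never traversed the way a whole-structure rnf would traverse it.
    insertForced :: (Ord k, NFData v) => k -> v -> Map k v -> Map k v
    insertForced k v m =
      let m' = Map.insert k v m
      in case Map.lookup k m' of
           Just v' -> v' `deepseq` m'  -- force only the fresh entry
           Nothing -> m'               -- unreachable: k was just inserted

The point is that the cost of the deepseq is proportional to the size of the one new value, not to the size of the whole Map.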
On 2010-03-01 16:06 -0500 (Mon), Job Vranish wrote:
All of our toplevel inputs will be strict, and if we keep our frame-to-frame state strict, our variances in runtimes, given the same inputs, should be quite low modulo the GC.
This is exactly the approach I need to take for the trading system. I basically have various (concurrent) loops that process input, update state, and possibly generate output. The system runs for about six hours, processing five million or so input messages, with other loops running anywhere from hundreds of thousands to millions of times.

The trick is to make sure that I never, ever start a new loop with an unevaluated thunk referring to data needed only by the previous loop, because otherwise I just grow and grow and grow.... Some tool to help with this would be wonderful. There's something for y'all to think about.
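A bare-bones skeleton of that discipline, again assuming the deepseq package (readMsg, step, and emit are hypothetical stand-ins, not the trading system's actual code):

    import Control.DeepSeq (NFData, deepseq)

    -- One such loop: the new state is fully forced before the next
    -- iteration starts, so no thunk can drag along data that only the
    -- previous iteration needed.
    processLoop :: NFData s
                => IO msg                    -- read the next input message
                -> (s -> msg -> (s, [out]))  -- pure state transition
                -> (out -> IO ())            -- emit any output
                -> s -> IO ()
    processLoop readMsg step emit = go
      where
        go st = do
          msg <- readMsg
          let (st', outs) = step st msg
          mapM_ emit outs
          st' `deepseq` go st'  -- force new state; old state is now GC-able

The deepseq before the recursive call is the whole point: without it, st' can be a growing chain of thunks that keeps every previous message alive.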
On 2010-03-01 22:01 +0000 (Mon), Thomas Schilling wrote:
As Job and John have pointed out, though, laziness per se doesn't seem to be an issue, which is good to hear. Space leaks might be, but there is no clear evidence that they are particularly harder to avoid than in strict languages.
As I mentioned above, overall I do find them harder to avoid. Any individual space leak you're looking at is easy to fix, but the constant vigilance is difficult.
cjs
--
Curt Sampson