
On Mon, 2007-04-02 at 13:54 +0100, Malcolm Wallace wrote:
> An observation about your state setter functions, ... You can shorten your code considerably by using the standard named-field update syntax for exactly this task:
>
>     setDecision :: String -> State -> State
>     setDecision decision state = state { sDecision = decision }
If I do that, the runtime insists on the state being "more" evaluated before it changes that specific field. This kills streaming, forcing each production (including the top one) to be fully parsed before I can access its generated tokens. So the GC won't be hanging on to State objects, but memory still explodes - with unconsumed Token objects. And there's no output from the program until it dies :-(
> Not only is it shorter, but it will often be much more efficient, since the entire structured value is copied once, then a single field updated, rather than being re-built piece-by-piece in 15 steps.
I know! Is there an efficient way to lazily modify just one record field?
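
For illustration, here is a minimal sketch of why the two styles of setter differ in strictness. Record update syntax is defined as a case on the old record, so its result cannot reach weak head normal form until the old state does. Rebuilding the record explicitly (or matching with an irrefutable pattern) applies the constructor at once and leaves the untouched fields as thunks. The two-field State below is a hypothetical stand-in; only the name sDecision comes from this thread, and sTokens is an assumed field.

```haskell
-- Hypothetical two-field stand-in for the parser's State record.
-- Only sDecision is taken from the thread; sTokens is assumed.
data State = State
  { sDecision :: String
  , sTokens   :: [String]
  }

-- Record update syntax: equivalent to a case on the old record, so the
-- old state is forced to WHNF before the new record exists.
setDecisionStrict :: String -> State -> State
setDecisionStrict decision state = state { sDecision = decision }

-- Explicit reconstruction: the constructor is applied immediately and the
-- untouched field becomes a thunk that selects from the old state on demand.
setDecisionLazy :: String -> State -> State
setDecisionLazy decision state = State
  { sDecision = decision
  , sTokens   = sTokens state
  }

-- The same idea written with an irrefutable (lazy) pattern.
setDecisionLazy' :: String -> State -> State
setDecisionLazy' decision ~(State _ tokens) = State decision tokens
```

With these definitions, sDecision (setDecisionLazy "x" undefined) yields "x", whereas the same expression with setDecisionStrict hits the undefined. The price of the lazy form is that each untouched field is a thunk referring to the old state until something forces it, so it trades the streaming problem for a possible retention problem.
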
> You probably want to be strict in the state component, but not in the output values of your monad. So as well as replacing
>
>     let ... in (finalState, rightResult)
>
> with
>
>     let ... in finalState `seq` (finalState, rightResult)
>
> in the (>>=) method in your Monad instance (and in the separate defn of
For some strange reason, adding this didn't solve the problem - the GC still refuses to collect the state objects. BTW, forcing the evaluation of the intermediate states (originalState, leftState, rightState etc.) doesn't help either. I have tried to ensure that '>>=' and '/' allow the GC to discard old states "as soon as possible", but I'm obviously missing something. Is there a way to get more detailed retainer information than what's available with '-hr'?
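
As a rough sketch of the suggestion being discussed, the bind of a state-threading monad that forces the final state before returning the pair could look like the following. This is not the package's actual monad: the Parser type, its instances and the tick helper are assumptions made for illustration, and only the variable names follow the thread.

```haskell
-- Hypothetical state-threading monad; not the package's real type.
newtype Parser s a = Parser { runParser :: s -> (s, a) }

instance Functor (Parser s) where
  fmap f (Parser run) = Parser $ \s ->
    let (s', a) = run s in (s', f a)

instance Applicative (Parser s) where
  pure a    = Parser $ \s -> (s, a)
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad (Parser s) where
  Parser left >>= next = Parser $ \originalState ->
    let (leftState, leftResult)   = left originalState
        (finalState, rightResult) = runParser (next leftResult) leftState
    -- Force the state (to WHNF only) before handing back the pair;
    -- the result component stays lazy.
    in  finalState `seq` (finalState, rightResult)

-- Assumed helper for the usage example: bump an Int state by one.
tick :: Parser Int ()
tick = Parser $ \n -> (n + 1, ())
```

Note that seq evaluates only to weak head normal form. For a record with lazy fields that means the outermost constructor, so the fields themselves may still be thunks that refer back to earlier states. That is one possible reason the change has no visible effect on retention, though the profile would have to confirm it.
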
> you might also need to make all the named fields of your State datatype strict.
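
A small sketch of what strict fields change, using two made-up record types (none of these names come from the package). A strictness annotation (!) makes the constructor evaluate that field to weak head normal form whenever the record itself is evaluated, so a field whose value depends on input that has not been read yet blocks construction of the whole state.

```haskell
-- Hypothetical records for illustration; all names are assumptions.
data LazyState = LazyState
  { lDecision :: String
  , lCount    :: Int
  }

data StrictState = StrictState
  { tDecision :: !String   -- forced when the record is evaluated
  , tCount    :: !Int      -- forced when the record is evaluated
  }

-- Reading one field of a lazy record never touches the others.
lazyDecision :: String
lazyDecision = lDecision (LazyState "x" undefined)

-- With strict fields, every field must be available first.
strictDecision :: Int -> String
strictDecision count = tDecision (StrictState "x" count)
```

Here lazyDecision evaluates to "x" even though the count is undefined, while strictDecision undefined fails because the strict constructor demands the count before the record can be inspected.
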
If I make any of them strict, streaming goes away :-(

Writing a streaming parser in Haskell is turning out to be much harder than I originally expected. Every fix I tried so far either broke streaming (memory blows up due to tokens) or had no effect (memory blows up due to states). I am assuming that there's a magic point in the middle where tokens are consumed and states are GC-ed... but it has eluded me so far.

Thanks,

Oren Ben-Kiki

P.S. I uploaded the package to Hackage. I added a debug-leak production to make it easier to profile this with even fewer productions involved:

    yes '#' | yaml2yeast -p debug-leak