
    reply  = parse ...              -- Lazily evaluated
    tokens = rTokens reply          -- Has some values "immediately"
    list   = D.toList tokens        -- Has some values "immediately"
    mapM_ print list                -- Starts printing "immediately"!

..

    reply  = parse ...              -- Lazily evaluated
    result = rResult reply          -- Lazy; has value when parsing is done
    extra  = case result ...        -- Lazy; has value when parsing is done
    parsed = rTokens reply          -- Has some values "immediately"
    tokens = D.append parsed extra  -- Has some values "immediately"
    list   = D.toList tokens        -- Has some values "immediately"
    mapM_ print list                -- Starts printing "immediately"!

This *still* starts printing tokens immediately. However, while in the first case the GC is smart enough to keep the program in constant memory, in the second case it for some reason starts missing more and more PAP objects, so memory usage goes through the roof.

is that a nail for this hammer, perhaps?-)

    http://hackage.haskell.org/trac/ghc/ticket/917

if you don't use rResult reply, reply can be used and freed as it is used. if you do use rResult reply, you are going to use it late, and that is going to hang on to reply, which is being expanded by the main thread of activities (rTokens).

i'm just guessing here, but if that is indeed the problem, you would need to exert more control over what is evaluated when and shared where:

- evaluate rResult synchronously with rTokens, instead of rResult long after rTokens has unfolded the reply
- evaluate rResult independent of rTokens, on a separate copy of reply

since you want to use parts of the output before you can be sure whether the whole input is correct, you might also want local errors instead of global ones (i've seen a correct chunk of input, here is the corresponding chunk of output; instead of here is a list of output chunks i've produced so far, i'll tell you later whether they are worth anything or whether they were based on invalid input).

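
The "local errors instead of global ones" suggestion can be sketched roughly as follows. This is only an illustration under assumptions: the `Token` and `ParseError` types and the `tokenize` and `printTokens` functions are hypothetical stand-ins, not the parser from this thread. The point is that each stream element carries its own verdict, so the consumer never holds a separate late result that refers back to the whole reply. Whether this removes the leak in the original program is not something the sketch can show.

```haskell
import Data.Char (isDigit, isSpace)

-- Hypothetical token and error types, for illustration only.
data Token = Number Integer | Plus
  deriving (Eq, Show)

newtype ParseError = ParseError String
  deriving (Eq, Show)

-- Each element is a local verdict: either a good token, or the error
-- that ends the stream. There is no separate global result value.
tokenize :: String -> [Either ParseError Token]
tokenize [] = []
tokenize s@(c:cs)
  | isSpace c = tokenize cs
  | c == '+'  = Right Plus : tokenize cs
  | isDigit c = let (ds, rest) = span isDigit s
                in Right (Number (read ds)) : tokenize rest
  | otherwise = [Left (ParseError ("unexpected character " ++ show c))]

-- Consume the stream in a single pass: print tokens as they arrive and
-- report the error, if any, at the point where it occurs.
printTokens :: [Either ParseError Token] -> IO ()
printTokens = mapM_ (either report print)
  where
    report (ParseError msg) = putStrLn ("parse error: " ++ msg)
```

In GHCi, `printTokens (tokenize "1 + 23 ?4")` prints the three good tokens and then the error line; since `tokenize` is lazy, output starts before the input has been fully scanned.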
You'd think... but the fact of the matter is that while the first version works fine, the second doesn't, UNLESS I add the magic SCC section:
extra = {-# SCC "magic" #-} case result ...
And compile with '-prof' (no '-O' flags). Then it somehow, finally, "gets the idea" and the program runs perfectly well with constant memory consumption. Which, as you aptly put it, is very "fishy" indeed...

adding profiling might (another wild guess here..) lose sharing, just as, in the ticket, i used \()->[..] to avoid sharing of the list. (although that guess wouldn't necessarily suggest this particular SCC to be useful, so perhaps it is the wrong track..)

hth,
claus
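
For readers unfamiliar with the `\()->[..]` trick, here is a minimal sketch of the idea. The names `shared`, `unshared`, `meanShared` and `meanUnshared` are made up for illustration and do not come from the ticket. A top-level list is a single shared value, so traversing it twice keeps all of it reachable between traversals; wrapping it in a function from unit asks for a fresh list on each call. One caveat, stated as an assumption about the build: with optimisation on, GHC's full-laziness transformation may float the list back out of the lambda and restore the sharing, so this is best tried without '-O' or with '-fno-full-laziness'.

```haskell
-- Shared: one list value. After the first traversal the evaluated cells
-- stay alive because the second traversal still needs them.
shared :: [Integer]
shared = [1 .. 100000]

-- Unshared: each call builds a fresh list, so each traversal can be
-- collected as it goes (provided the compiler does not re-share it).
unshared :: () -> [Integer]
unshared = \() -> [1 .. 100000]

meanShared :: Double
meanShared = fromIntegral (sum shared) / fromIntegral (length shared)

meanUnshared :: Double
meanUnshared =
  fromIntegral (sum (unshared ())) / fromIntegral (length (unshared ()))
```

Both definitions compute the same number; the difference is only in how much of the list has to be resident at once, which is the kind of effect a heap profile would show.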