Re: [Haskell-cafe] following up on space leak

6 Jul 2009

On 7/5/09, Alexander Dunlap wrote:
...
On Sun, Jul 5, 2009 at 7:46 PM, Uwe Hollerbach wrote:
...
On 7/5/09, Paul L wrote:
...
Previously you had lastOrNil taking m [a] as input, presumably
generated by mapM. So mapM is actually building an entire list before
it returns the argument for you to call lastOrNil. This is where you
had unexpected memory behavior.
Now you are "fusing" lastOrNil and mapM together: instead of
building a list, you traverse it and perform the monadic actions along
the way. This can happen in constant memory if the original pure list
is generated lazily.
I think the real problem you had was a misunderstanding of mapM, and
there was nothing wrong with your previous lastOrNil function. mapM
will only return a list after all monadic actions are performed, and
in doing so, it inevitably has to build the entire list along the way.
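
The difference described above can be sketched as follows. The thread
does not show the original code, so the names lastViaMapM and lastFused
and the explicit nil argument are assumptions made for illustration only.

```haskell
-- Unfused: mapM runs every action and materialises the whole result
-- list before 'lastOr' gets to look at it.
lastViaMapM :: Monad m => b -> (a -> m b) -> [a] -> m b
lastViaMapM nil f xs = fmap (lastOr nil) (mapM f xs)
  where
    lastOr d [] = d
    lastOr _ ys = last ys

-- Fused: run the actions left to right and keep only the most recent
-- result, so no result list is ever built.
lastFused :: Monad m => b -> (a -> m b) -> [a] -> m b
lastFused nil f = go nil
  where
    go acc []       = return acc
    go _   (x : xs) = f x >>= \y -> go y xs
```

Both versions return the same value; they differ only in how much of
the result structure is alive while the actions run.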
--
Regards,
Paul Liu
Yale Haskell Group
http://www.haskell.org/yale
Hi, Paul, thanks for the comments. You're quite right that I am fusing
the two functions together, but I don't think I was misunderstanding
mapM... I knew I was generating the entire list, and aside from the
slight inefficiency of generating it only to tear it down an instant
later, that would have been no problem. But I was expecting all of the
memory associated with the list to be reclaimed after I had processed
it, and that was what was not happening as far as I could tell. (This
isn't one monolithic list, by the way; it's the small bodies of a
couple of small Scheme functions that get evaluated over and over. So
the setup and teardown happen a lot.) I don't have very good
intuition yet about what should get garbage-collected and what should
get kept in such situations, and in fact I'm kind of in the same boat
again: the test case now runs much better, but it still leaks memory,
and I am again stumped as to why. Could I see something useful by
examining GHC Core? I haven't looked at that yet, and have no idea what
to look for...
Uwe
mapM_ might be useful to you. I know there are cases where mapM leaks
memory but mapM_ doesn't, basically because mapM_ throws away all of
the intermediate results immediately. You might want to check whether
the list is empty and, if it isn't, mapM_ your function over the init
of the list and then just return the result of applying the function
to the last element.
Alex
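
A minimal sketch of that suggestion, assuming a caller-supplied nil
value for the empty case. The name lastResult is hypothetical and does
not come from the thread.

```haskell
-- Discard every result except the last one. 'nil' is returned when
-- there is nothing to run.
lastResult :: Monad m => b -> (a -> m b) -> [a] -> m b
lastResult nil _ [] = return nil
lastResult _   f xs = mapM_ f (init xs) >> f (last xs)
```

Note that 'last xs' keeps the input list alive until mapM_ has
finished, so this avoids building the result list but not retaining
the input. For short lists such as function bodies that should not
matter much.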
Oh, sorry, I was not clear in my original note in this thread: the
lastOrNil issue seems to be solved. That part of the code is, as far
as I can tell, not leaking memory at all anymore. I think I can claim
that because now the constant memory allocation is showing up visibly
in the profiling output; before, it was lost in the noise. So, if
there is a leak there, it's tiny compared with the constant stuff at
least for this benchmark. There are still two or perhaps three leaks,
and these show up as large but not huge compared to the constant
stuff. I've got a plot of this up on the haskeem website:
http://www.korgwal.com/haskeem/run_new.png.

The bits where I am stumped now are two-fold: one is (I think)
analogous to the lastOrNil issue, except that instead of feeding the
result to lastOrNil, I am doing a more general fold. So there I do
need all the results. I tried the same fusion as with lastOrNil/mapML,
and as far as I can tell I'm not building any lists; but this time it
didn't change the behavior at all, other than causing the names of
some profiling cost centers to change. This is the #2 entry on the
plot above.
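
For reference, a fused monadic fold might look like the sketch below.
The thread does not identify the cause of this leak; a lazily built
accumulator is only one common suspect, and the name foldResults is an
assumption made for illustration.

```haskell
import Control.Monad (foldM)

-- Run 'act' on each element and combine its result into the
-- accumulator right away, without an intermediate list of results.
foldResults :: Monad m => (b -> c -> b) -> b -> (a -> m c) -> [a] -> m b
foldResults step z act = foldM go z
  where
    go acc x = do
      y <- act x
      -- Force the new accumulator to weak head normal form before
      -- continuing, so no chain of unevaluated 'step' calls builds up.
      let acc' = step acc y
      acc' `seq` return acc'
```

seq only evaluates the outermost constructor. If the accumulator is a
structure with lazy fields, thunks can still pile up inside it.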

The other issue seems to have something to do with IORefs: I'm
dynamically building environments for my Scheme functions, and somehow
that memory is not being reclaimed once a function is done with it.
These are the #1 and, I think, #3 entries on the plot. I don't know
enough details there yet to be able to say any more.
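
One well-known way for IORef-based code to hold on to memory is a lazy
update, which stores a thunk that refers to the old contents. Whether
that is what happens here is not established in the thread. The sketch
below uses a plain Int counter rather than an environment, and all
names in it are hypothetical.

```haskell
import Data.IORef (IORef, modifyIORef, modifyIORef', newIORef, readIORef)

-- Lazy update: each call stores an unevaluated thunk that refers to
-- the previous contents of the reference.
bumpLazy :: IORef Int -> IO ()
bumpLazy ref = modifyIORef ref (+ 1)

-- Strict update: the new value is evaluated to weak head normal form
-- before it is stored.
bumpStrict :: IORef Int -> IO ()
bumpStrict ref = modifyIORef' ref (+ 1)

-- Apply an update n times and read the final value.
runBumps :: (IORef Int -> IO ()) -> Int -> IO Int
runBumps bump n = do
  ref <- newIORef 0
  mapM_ (const (bump ref)) [1 .. n]
  readIORef ref
```

Both variants compute the same result; the lazy one keeps a chain of
n pending additions alive until the value is finally read. Where
modifyIORef' is not available, reading the reference and writing back
with writeIORef ref $! f x has the same effect.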

Uwe