
Here's my feeble understanding of GC:

1. GC runs after some specified amount of allocation.
2. GC runs in time proportional to the live heap, which it needs to traverse.

This means that for a long-running mapM, like any other computation generating a long list, (1) GC will run a number of times proportional to the length of the list, and (2) each run will have a cost proportional to the length of the list. I.e. a linear algorithm is now quadratic.

A lazy mapM (or mapM_), consuming the list as fast as it is generated, will of course keep the list short and the heap small, and thus the cost of each GC is constant (for some value of "constant").

I suppose generational GC will help in practice, since the young generation only has to deal with the most recently allocated end of the list, but it will still be linear in the generated length, and you still need the occasional major GC too. Also, I guess mapM is more vulnerable to this, since the operations (IO, say) involved in building the list likely do short-lived allocations, triggering more GCs than simpler, pure computations would.

Do let me know if this is a terribly naive view. A small sketch of the contrast I mean follows below.

-k
--
If I haven't seen further, it is by standing in the footprints of giants
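
Here is a minimal sketch of the contrast, assuming GHC; `step` is a made-up per-element IO action standing in for real work, and the function names are just illustrative, not anyone's actual code:

{-# LANGUAGE BangPatterns #-}
import Control.Monad (foldM)

-- Hypothetical per-element IO action; stands in for whatever real
-- work each element actually involves.
step :: Int -> IO Int
step i = return (i * 2)

-- mapM in IO performs all n actions and builds the full result list
-- before `sum` sees it, so the whole list is live data at every GC.
sumViaMapM :: Int -> IO Int
sumViaMapM n = do
  xs <- mapM step [1 .. n]
  return (sum xs)

-- Consume each result as it is produced: only a strict accumulator
-- stays live, so each GC sees a small, roughly constant live heap.
sumViaFoldM :: Int -> IO Int
sumViaFoldM n = foldM (\ !acc i -> fmap (acc +) (step i)) 0 [1 .. n]

main :: IO ()
main = do
  print =<< sumViaFoldM 1000000
  print =<< sumViaMapM 1000000

If the reasoning above is right, running each variant with +RTS -s should show maximum residency and GC time growing with n for the mapM version, while the foldM version stays roughly flat.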