
On Friday 23 January 2009 4:39:02 pm George Pollard wrote:
with your proposed mapM_ will leave a thunk equivalent to
() `mappend` () `mappend` () `mappend`...
in memory until the mapM_ has completely finished, where each () is actually an unevalutated thunk that still has a reference to one of the elements in the lotsOfLargeStrings list.
Perhaps this is why the Monoid instance for () in GHC's source has the comment "should this be strict?" :)
Aside from all the stuff about bottoms, a strict mappend will not solve the thunk problem above. The difference will be that with a strict mappend, once the delayed computation is demanded, with a strict mappend, evaluation will have to go all the way to the innermost application, possibly causing a stack overflow, whereas the lazy mappend (presumably \_ _ -> ()) will return () immediately, just throwing away most of the thunk. Whether or not such a thunk is actually built up in memory has to do with how mapM_ is written, not how mappend is written.* -- Dan [*] That's not 100% true. If we define: mapM_ f = loop mempty where loop acc (x:xs) = f x >>= \y -> loop (acc `mappend` y) xs loop acc [ ] = return acc Then specialization to () could lead to strictness analysis reducing acc at each step. However, this definition is bad for lists, for example, since it results in: (((l1 ++ l2) ++ l3) ++ l4) ++ l5 style appending. And it may be a long shot to hope that GHC's optimizer produces the behavior we want in the () case. (And it still depends on mapM_ being written in one out of several available choices for the given type and expected results.)