
Justin Bailey:
Would "retainer profiling" help me see what was building up this large thunk/closure?
I'm not really familiar enough with GHC's profiling to answer that, but I'll take a guess. My guess is that profiling will only sometimes be useful in diagnosing stack overflows, because I suspect that memory stats reported by the profiler will usually be dominated by heap usage. So profiling *might* point you towards some big thunks on the heap which might cause a stack overflow on evaluation. If so, then you're in luck. But the problem is that you don't actually *need* a huge unevaluated thunk to cause a stack overflow. Sure, the foldl example had one, but consider what happens if we use foldr instead: print (foldr (+) 0 [1..]) => print (1+(foldr (+) 0 [2..])) => print (1+(2+(foldr (+) 0 [3..]))) => print (1+(2+(3+(foldr (+) 0 [4..])))) => ... => print (1+(2+(3+(...+(foldr (+) 0 [...])))) => stack overflow It's a bit more tricky to explain what's going on here, which may be one reason why foldr is not the usual stack overflow example. While the nested additions in the foldl example represented a long chain of unevaluated thunks on the heap, here they represent partially executed computations on the stack. There is no big thunk! But there are still many nested contexts on the stack, so we still get an overflow. Another way of contrasting the foldl and foldr examples is to realise that foldl always consumes its entire input list, while foldr only consumes as much as its asked to. In the former, foldl drives the process of thunk building. In the latter, it is the evaluation of the innermost (+) function that drives foldr to generate the next iteration. I suspect that explanation is not very clear, so I give a small experiment which will at least show that I'm not lying. :-) Run a basic GHC profile (without optimisations) on each of the following, and observe the total memory usage. With foldl, memory usage is very high, because the entire list is consumed to produce a huge thunk on the heap. With foldr, memory usage is only about 16M, just enough to blow the stack. -- trial 1: stack overflow, lots of memory consumed main = print (foldl (+) 0 [1..10000000] :: Int) -- trial 2: stack overflow, minimal memory consumption main = print (foldr (+) 0 [1..10000000] :: Int) In fact, we could give foldr an infinite list, and get exactly the same result. Curiously, if we give foldl an infinite list, we don't get a stack overflow, because we never get to the point of evaluating the thunk. Instead, we get heap exhaustion, because we just keep building thunks. -- trial 4: heap exhaustion, nasty main = print (foldl (+) 0 [1..] :: Int) -- trial 5: stack overflow, minimal memory consumption main = print (foldr (+) 0 [1..] :: Int) It's also instructive to run these tests with optimisations (no profiling), to see how they are affected by strictness analysis. Note that strictness analysis doesn't work for the default Integer type, so the Int type annotations are necessary.