
Bernd Brassel wrote:
Thanks for your answer Simon.
Simon Marlow wrote:
Bernd Brassel wrote:
Consider the following program:
module Stack where
import System.IO.Unsafe
main = print (sim (replicate 1299959 ()))
sim [] = True sim (_:xs) = goodStack (sim xs)
goodStack x = fromJust (Just x) --no stack overflow badStack x = unsafePerformIO (return x) --stack overflow
fromJust (Just x) = x goodStack == id, and GHC even with no optimisation will transform it into id, and inline it into sim. So with goodStack, sim ends up being tail-recursive. With badStack, sim is no longer tail recursive (unsafePerformIO is not inlined), so it runs out of stack. Simple!
Is it really that simple? I guess that in a lazy language we have to look a bit closer to see what is tail recursive and what is not. If I understand you correctly, you say that if goodStack was not inlined you would have a bad stack as well, right?
My apologies - I oversimplified things. You're quite right, it looks like the unsafePerformIO version ought to be tail-recursive too. It turns out to be rather subtle. If you replace unsafePerformIO by unsafeDupablePerformIO (from GHC.IOBase), then you do indeed get tail-recursion. But this is only due to a clever trick in the RTS: what happens is that (sim xs) is a thunk, so when evaluating it, even in a tail-recursive position, an update frame is pushed on the stack. The next recursive call to (sim xs) pushes another update frame on the stack, and so on. Since these update frames are all adjacent to one another, a trick known as "stack squeezing" can squash them down into a single frame, and this is what happens for unsafeDupablePerformIO. The ordinary unsafePerformIO is performing stack squeezing once per call, because the stack squeezing is done by the same code that does the "duplicate computation" check that unsafePerformIO needs to do. Stack squeezing doesn't look at the same bit of stack twice, so subsequent squeezings don't manage to remove any update frames. I've fixed this in my tree. It also needed some tweaks to the heuristic which decides whether to squeeze or not based on a cost/benefit tradeoff. So the upshot is: you can use unsafeDupablePerformIO right now, or you can wait until I've tested and committed this patch to get tail-recursion with unsafePerformIO. I've no idea how it works in Hugs, you'll have to ask them :) Cheers, Simon