Re: What does unsafePerformIO do to the stack

7 Feb 2008

      Bernd Brassel wrote:
...
Thanks for your answer Simon.
Simon Marlow wrote:
...
Bernd Brassel wrote:
...
Consider the following program:
module Stack where
import System.IO.Unsafe
main = print (sim (replicate 1299959 ()))
sim []     = True
sim (_:xs) = goodStack (sim xs)
goodStack x = fromJust (Just x)          --no stack overflow
badStack  x = unsafePerformIO (return x) --stack overflow
fromJust (Just x) = x
goodStack == id, and GHC even with no optimisation will transform it
into id, and inline it into sim.  So with goodStack, sim ends up being
tail-recursive.  With badStack, sim is no longer tail recursive
(unsafePerformIO is not inlined), so it runs out of stack.  Simple!
Is it really that simple? I guess that in a lazy language we have to
look a bit closer to see what is tail recursive and what is not. If I
understand you correctly, you say that if goodStack was not inlined you
would have a bad stack as well, right?
My apologies - I oversimplified things.  You're quite right, it looks like 
the unsafePerformIO version ought to be tail-recursive too.

It turns out to be rather subtle.  If you replace unsafePerformIO by 
unsafeDupablePerformIO (from GHC.IOBase), then you do indeed get 
tail-recursion.  But this is only due to a clever trick in the RTS: what 
happens is that (sim xs) is a thunk, so when evaluating it, even in a 
tail-recursive position, an update frame is pushed on the stack.  The next 
recursive call to (sim xs) pushes another update frame on the stack, and so 
on.  Since these update frames are all adjacent to one another, a trick 
known as "stack squeezing" can squash them down into a single frame, and 
this is what happens for unsafeDupablePerformIO.

The ordinary unsafePerformIO is performing stack squeezing once per call, 
because the stack squeezing is done by the same code that does the 
"duplicate computation" check that unsafePerformIO needs to do.  Stack 
squeezing doesn't look at the same bit of stack twice, so subsequent 
squeezings don't manage to remove any update frames.  I've fixed this in my 
tree.  It also needed some tweaks to the heuristic which decides whether to 
squeeze or not based on a cost/benefit tradeoff.

So the upshot is: you can use unsafeDupablePerformIO right now, or you can 
wait until I've tested and committed this patch to get tail-recursion with 
unsafePerformIO.

I've no idea how it works in Hugs, you'll have to ask them :)

Cheers,
	Simon