
Ian Lynagh wrote:
On Tue, Jan 10, 2006 at 04:44:33PM +0000, Ian Lynagh wrote:
readChunks :: FirstMonad String readChunks = do xs <- get if null xs then return [] else do let (ys, zs) = foo xs put zs rest <- readChunks return (ys ++ rest)
It looks like changing this let to a case fixes this example, but at the time I'd experimented with that there must have been other issues clouding the effect, such as the following.
Foo1 (attached) uses large amounts of memory whereas Foo2 (also attached) runs in a little constant space. The difference is only changing this:
else do chunk <- case foo xs of (ys, zs) -> do put zs return ys chunks <- readChunks return (chunk ++ chunks)
to this:
else case foo xs of (ys, zs) -> do put zs chunks <- readChunks return (ys ++ chunks)
I had great difficulty understanding this, but I think I do now. It's a bit easier to understand if you inline the monads away. Foo1 translates to this: bar [] = ([],[]) bar (x:xs) = let (zs,ys) = bar xs in (x:zs,ys) readChunks [] = ([],[]) readChunks xs = let (ys,zs) = bar xs (chunks,rest) = readChunks zs in (ys ++ chunks, rest) and Foo2: readChunks [] = ([],[]) readChunks xs = case bar xs of (zs,ys) -> let (chunks,rest) = readChunks ys in (zs ++ chunks, rest) This is pretty much what GHC ends up with when you give -O (actually it turns some of the tuples into unboxed tuples, but that's not important). We can see in Foo1 that chunks is a thunk holding on to zs, which is a thunk that holds on to xs, so you never get to release xs until the whole result list (ys) is traversed. GHC's lazy tuple optimisation doesn't kick in, because neither chunks nor rest are evaluated. However, it's not so clear why Foo2 is better. chunks holds on to ys, the second of the pair returned by bar. In fact, ys will point to a chain of thunks that looks like this: ys = snd (_, snd (_, snd (_, snd (_, snd ...)))) every time GC runs, it can completely eliminate this list via the well-known lazy tuple optimisation. Unfortunately it doesn't *completely* eliminate the list, because of a shortcoming in our implementation, actually reported earlier by Ian Lynagh with a very similar program :-) Fortunately in this example we do seem to be reducing enough of the list to eliminate the space leak, though. My suggestion: don't use the lazy state monad if you can help it. Cheers, Simon