GHC, odd concurrency space leak

Jesper Louis Andersen

14 Apr 2010 14 Apr '10

8:22 p.m.

A problem with GHC? =================== This post describes some odd behaviour I have seen in GHC 6.12.1 when writing Combinatorrent. The post is literate Haskell so you can run it. The executive summary: A space leak occurs when a new process is spawned from inside another process - and I can't figure out why. I am asking for help on haskell-cafe. We begin by upgrading GHC from Haskell98 to something mature and modern:

...

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

And then we do the import-tango:

...

module Main where

import Control.Monad.Reader import Control.Monad.State

In our system, a *Process* is an identifier for a thread of execution. Since these heavily makes use of the following imports,

...

import Control.Concurrent import Control.Concurrent.STM

it is beneficial to make processes live in the IO-monad. However, it turns out that having everything living in the IO monad is a clunky idea. A Process also has a current configuration: The channels and variables on which it can communicate for instance. The configuration is not expected to be changed over the course of the process running. For data which do change when the process runs, we want some state tracking. Luckily, the Xmonad X-monad comes to our rescue. In all its gory and glory details, we generalize it and rewrite it for our processes, which is how they are used in combinatorrent. Our model-kit for building new threads of execution is the following beast:

...

newtype Process a b c = Process (ReaderT a (StateT b IO) c) deriving (Functor, Monad, MonadIO, MonadState b, MonadReader a)

Note that the automatic derivations of *MonadState b* and *MonadReader a* makes GHC spit our some mkUsageInfo warnings in its generation of the .hi-files. They don't seem to be dangerous. Glueing instructions for our model kit is given by spawning off new threads:

...

run :: a -> b -> Process a b c -> IO (c, b) run c st (Process p) = runStateT (runReaderT p c) st

spawn :: a -> b -> Process a b () -> IO ThreadId spawn c st p = forkIO $ run c st p >> return ()

Our first dummy --------------- Machinery for having fun is now in place. Here is a crash-test-dummy we would like to play with:

...

main1 = do spawn () () (forever $ return ()) threadDelay (3 * 1000000)

-- main = main1

Note that the given units () and () are usually much more complicated, but for the sake of this minimal example units will do. And running that does what we expect it to do: ./Post +RTS -tstderr <> At this point, we got our own little processor-heater installed in our room. But spring is coming to the Northern hemisphere, so we have little use for this until November or December. To actually make our system do something, we build this gem:

...

p1 :: Process () () () p1 = forever $ return ()

startp1 :: IO ThreadId startp1 = spawn () () p1

startp2 :: IO ThreadId startp2 = spawn () () (forever $ do liftIO startp1 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (10 * 1000000))

main2 = do putStrLn "Main thread starting" startp2 threadDelay (1 * 1000000)

main = main2

Running *this* beast gives: ./Post +RTS -tstderr Main thread starting Delaying <> Notice the improvement! Now we both get to burn the CPU and we have a space leak! Removing the "forever" in startp2 will make this code not leak space. startp2 only spawns a single process, which then runs and each time it takes a cycle in the forever-loop, it builds up live heap. This boggles my mind. Why does this code leak like mad? What could one do to remedy the leak? The leak is PAP and GHC.Base.sat_... (according to -hd heap profiling). In my combinatorrent code it is a ->StateT (does anyone know what PAP and ->StateT means and where to find more definitions like these?). In confusion, -- J.

Show replies by date

Gregory Collins

14 Apr 14 Apr

9:13 p.m.

Jesper Louis Andersen writes:

...

This post describes some odd behaviour I have seen in GHC 6.12.1 when writing Combinatorrent. The post is literate Haskell so you can run it. The executive summary: A space leak occurs when a new process is spawned from inside another process - and I can't figure out why. I am asking for help on haskell-cafe.

...[snip]...

...
import Control.Monad.State

Does the problem go away if you use "Control.Monad.State.Strict"? G -- Gregory Collins

Jason Dagit

9:44 p.m.

On Wed, Apr 14, 2010 at 2:13 PM, Gregory Collins wrote:

...

Jesper Louis Andersen writes:

...
This post describes some odd behaviour I have seen in GHC 6.12.1 when writing Combinatorrent. The post is literate Haskell so you can run it. The executive summary: A space leak occurs when a new process is spawned from inside another process - and I can't figure out why. I am asking for help on haskell-cafe.

...[snip]...

...
import Control.Monad.State

Does the problem go away if you use "Control.Monad.State.Strict"?

Nope :) That was the first thing I tried here. I tried playing with optimization level too. Next I tried making two versions that were as similar as possible and then comparing the core with ghc-core. I can't see a difference between a version that uses 1MB and a version that uses 160MB (on my system 160MB is the worst I can get it to blow up). The two versions I compared: Low memory: \begin{code}

...

startp4 :: IO ThreadId startp4 = spawn () () (return ())

...

startp3 :: IO ThreadId startp3 = spawn () () (forever $ do liftIO startp4 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (3 * 1000000))

...

main1 = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000)

main = main1 \end{code}

Too much memory: \begin{code}

...

startp4 :: IO ThreadId startp4 = spawn () () (forever $ return ())

...

startp3 :: IO ThreadId startp3 = spawn () () (forever $ do liftIO startp4 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (3 * 1000000))

...

main1 = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000)

main = main1 \end{code}

The difference is whether or not the threads must keep returning () or if they returns it once. I'm not sure what to make of it. My conclusion is that keeping the thread alive via forever is the problem, but when I test this hypothesis with a threadDelay the space leak goes away: \begin{code}

...

startp4 :: IO ThreadId startp4 = spawn () () (liftIO $ threadDelay (100 * 1000000))

...

startp3 :: IO ThreadId startp3 = spawn () () (forever $ do liftIO startp4 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (3 * 1000000))

...

main1 = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000)

main = main1 \end{code}

It will be interesting to hear what fixes this! Jason

Jason Dagit

9:49 p.m.

On Wed, Apr 14, 2010 at 2:44 PM, Jason Dagit wrote:

...

On Wed, Apr 14, 2010 at 2:13 PM, Gregory Collins wrote:

...
Jesper Louis Andersen writes:

...
This post describes some odd behaviour I have seen in GHC 6.12.1 when writing Combinatorrent. The post is literate Haskell so you can run it. The executive summary: A space leak occurs when a new process is spawned from inside another process - and I can't figure out why. I am asking for help on haskell-cafe.

...[snip]...

...
import Control.Monad.State

Does the problem go away if you use "Control.Monad.State.Strict"?

Nope :) That was the first thing I tried here.

I tried playing with optimization level too.

Next I tried making two versions that were as similar as possible and then comparing the core with ghc-core. I can't see a difference between a version that uses 1MB and a version that uses 160MB (on my system 160MB is the worst I can get it to blow up).

The two versions I compared: Low memory: \begin{code}

...
startp4 :: IO ThreadId startp4 = spawn () () (return ())

...
startp3 :: IO ThreadId startp3 = spawn () () (forever $ do liftIO startp4 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (3 * 1000000))

...
main1 = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000)

main = main1 \end{code}

Too much memory: \begin{code}

...
startp4 :: IO ThreadId startp4 = spawn () () (forever $ return ())

...
startp3 :: IO ThreadId startp3 = spawn () () (forever $ do liftIO startp4 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (3 * 1000000))

...
main1 = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000)

main = main1 \end{code}

The difference is whether or not the threads must keep returning () or if they returns it once.

I'm not sure what to make of it. My conclusion is that keeping the thread alive via forever is the problem, but when I test this hypothesis with a threadDelay the space leak goes away:

\begin{code}

...
startp4 :: IO ThreadId startp4 = spawn () () (liftIO $ threadDelay (100 * 1000000))

...
startp3 :: IO ThreadId startp3 = spawn () () (forever $ do liftIO startp4 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (3 * 1000000))

...
main1 = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000)

main = main1 \end{code}

It will be interesting to hear what fixes this!

...

forever' m = do _ <- m forever' m

When I define that version of forever, the space leak goes away.

Daniel Fischer

10:13 p.m.

Am Mittwoch 14 April 2010 23:49:43 schrieb Jason Dagit:

...

...
It will be interesting to hear what fixes this!

forever' m = do _ <- m forever' m

When I define that version of forever, the space leak goes away.

Not with optimisations.

Jason Dagit

10:52 p.m.

On Wed, Apr 14, 2010 at 3:13 PM, Daniel Fischer wrote:

...

Am Mittwoch 14 April 2010 23:49:43 schrieb Jason Dagit:

...
...
It will be interesting to hear what fixes this!

forever' m = do _ <- m forever' m

When I define that version of forever, the space leak goes away.

Not with optimisations.

Thanks for pointing that out. I forgot to say so in my email. Here are two reduced versions of the original program: Good version, ghc --make Terminate.hs: \begin{code} {-# OPTIONS -O0 #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} module Main where import Control.Monad (forever) import Control.Concurrent import Control.Concurrent.STM spawn :: IO a -> IO ThreadId spawn io = forkIO (io >> return ()) forever' m = do _ <- m forever' m startp4 :: IO ThreadId startp4 = spawn (forever' (return ())) startp3 :: IO ThreadId startp3 = spawn (forever $ do startp4 putStrLn "Delaying" threadDelay (3 * 1000000)) main = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000) \end{code} The bad version, ghc --make NonTermination.hs: \begin{code} {-# OPTIONS -O2 #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} -- Note: Change the optimization to -O1 to get a terminating version -- that uses much more memory than it should. module Main where import Control.Monad (forever) import Control.Concurrent import Control.Concurrent.STM spawn :: IO a -> IO ThreadId spawn io = forkIO (io >> return ()) startp4 :: IO ThreadId startp4 = spawn (forever (return ())) startp3 :: IO ThreadId startp3 = spawn (forever $ do startp4 putStrLn "Delaying" threadDelay (3 * 1000000)) main = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000) \end{code} Can some core expert please look at these and explain the difference? Thanks! Jason

Daniel Fischer

11:33 p.m.

Am Donnerstag 15 April 2010 00:52:22 schrieb Jason Dagit:

...

The bad version, ghc --make NonTermination.hs: \begin{code} {-# OPTIONS -O2 #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Note: Change the optimization to -O1 to get a terminating version

Doesn't seem to terminate with -O1 here (killed after ~30 seconds). However, it does terminate with -O2 and -fno-state-hack, but leaks (and there's no difference between forever and forever' with -O2 -fno-state- hack).

...

Can some core expert please look at these and explain the difference?

I'm interested in an explanation too.

...

Thanks! Jason

& Daniel

Jesper Louis Andersen

16 Apr 16 Apr

9:51 p.m.

On Thu, Apr 15, 2010 at 1:33 AM, Daniel Fischer wrote:

...

...
Can some core expert please look at these and explain the difference?

I'm interested in an explanation too.

+1 The behaviour is consistent. GHC 6.8.3, 6.10.4, 6.12.1 and 6.13-20100416 all agree on the space leak. Here is the minimal program I have with the leak: \begin{code} {-# LANGUAGE GeneralizedNewtypeDeriving #-} module Main where import Control.Monad.State import Control.Concurrent newtype Process b c = Process (StateT b IO c) deriving (Monad, MonadIO, MonadState b) run :: b -> Process b c -> IO (c, b) run st (Process p) = runStateT p st spawn :: b -> Process b () -> IO ThreadId spawn st p = forkIO $ run st p >> return () p1 :: Process () () p1 = forever $ return () startp1 :: IO ThreadId startp1 = spawn () p1 startp2 :: IO ThreadId startp2 = spawn () (forever $ do liftIO startp1 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (10 * 1000000)) main = do putStrLn "Main thread starting" startp2 threadDelay (1 * 1000000) \end{code} .. so it looks like it is the state monad. I used ghc-core to print out this program in Core-format, killed all the type casts from System-F_c and inspected the code. I can't see what would make any problem there, but that was my first use of Core, so I might have overlooked something. The only thing I can see is that we "split" the State# RealWorld whenever we fork, but I think that is expected behaviour. The only other culprit I could guess at is the exception catch# primops in there. Should I file this as a bug? It has some bug-like qualities to it. In any case, what is going on is quite complicated so a resolution would be nice. If for nothing else to understand what is going on. -- J.

Jason Dagit

10 p.m.

On Fri, Apr 16, 2010 at 2:51 PM, Jesper Louis Andersen < jesper.louis.andersen@gmail.com> wrote:

...

On Thu, Apr 15, 2010 at 1:33 AM, Daniel Fischer wrote:

...
...
Can some core expert please look at these and explain the difference?

I'm interested in an explanation too.

+1

The behaviour is consistent. GHC 6.8.3, 6.10.4, 6.12.1 and 6.13-20100416 all agree on the space leak. Here is the minimal program I have with the leak:

Myself and others posted "simpler" programs that had similar bad behavior, including the space leak (depending on optimizations flags). I realize it's tedious to retest all those versions, but do you think you could check with one of the other versions that doesn't need mtl?

...

\begin{code} {-# LANGUAGE GeneralizedNewtypeDeriving #-}

module Main where import Control.Monad.State import Control.Concurrent

newtype Process b c = Process (StateT b IO c) deriving (Monad, MonadIO, MonadState b)

run :: b -> Process b c -> IO (c, b) run st (Process p) = runStateT p st

spawn :: b -> Process b () -> IO ThreadId spawn st p = forkIO $ run st p >> return ()

p1 :: Process () () p1 = forever $ return ()

startp1 :: IO ThreadId startp1 = spawn () p1

startp2 :: IO ThreadId startp2 = spawn () (forever $ do liftIO startp1 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (10 * 1000000))

main = do putStrLn "Main thread starting" startp2 threadDelay (1 * 1000000) \end{code}

.. so it looks like it is the state monad.

I don't think so because we were able to produce the space leak without using StateT.

...

I used ghc-core to print out this program in Core-format, killed all the type casts from System-F_c and inspected the code. I can't see what would make any problem there, but that was my first use of Core, so I might have overlooked something. The only thing I can see is that we "split" the State# RealWorld whenever we fork, but I think that is expected behaviour. The only other culprit I could guess at is the exception catch# primops in there.

Should I file this as a bug? It has some bug-like qualities to it. In any case, what is going on is quite complicated so a resolution would be nice. If for nothing else to understand what is going on.

Well, I think Bulat correctly characterized the non-termination aspect. I didn't think the cooperative aspect of threading applied with the threaded RTS, so I'm not 100% sure I believe his characterization, but otherwise it seems like a reasonable explanation. The space leakiness is a different issue and likely worth a bug report in its own right. Do you think you could try checking for the speak leaking using the compacting garbage collector? I think that one is enabled with +RTS -c -RTS. Thanks for checking on all those different versions of GHC. Jason

Jesper Louis Andersen

17 Apr 17 Apr

10:43 a.m.

On Sat, Apr 17, 2010 at 12:00 AM, Jason Dagit wrote:

...

Myself and others posted "simpler" programs that had similar bad behavior, including the space leak (depending on optimizations flags). I realize it's tedious to retest all those versions, but do you think you could check with one of the other versions that doesn't need mtl?

You got me curious enough that I decided to attack it systematically. Here is a test-run script: \begin{code} #!/bin/bash GHC68=/usr/local/stow/ghc-6.8.3/bin/ghc GHC610=/usr/local/stow/ghc-6.10.4/bin/ghc GHC612=ghc GHC6HEAD=/usr/local/stow/ghc-6.13-20100416/bin/ghc run_round () { EXE=$(basename ${1} .hs) echo ---------------------------------------------- echo GHC68 ${GHC68} --make $2 $1 ./${EXE} +RTS -tstderr $3 echo ---------------------------------------------- echo GHC610 ${GHC610} --make $2 $1 ./${EXE} +RTS -tstderr $3 echo ---------------------------------------------- echo GHC612 ${GHC612} --make $2 $1 ./${EXE} +RTS -tstderr $3 echo ---------------------------------------------- echo GHC HEAD ${GHC6HEAD} --make -rtsopts $2 $1 ./${EXE} +RTS -tstderr $3 } run_round $1 $2 $3 \end{code} With this script down, we can run your "Good" version: jlouis@illithid:~$ sh runner.sh JD-Good.hs ---------------------------------------------- GHC68 [1 of 1] Compiling Main ( JD-Good.hs, JD-Good.o ) Linking JD-Good ... ./JD-Good +RTS -tstderr Main thread starting Delaying <> ---------------------------------------------- GHC610 [1 of 1] Compiling Main ( JD-Good.hs, JD-Good.o ) Linking JD-Good ... ./JD-Good +RTS -tstderr Main thread starting Delaying <> ---------------------------------------------- GHC612 [1 of 1] Compiling Main ( JD-Good.hs, JD-Good.o ) Linking JD-Good ... ./JD-Good +RTS -tstderr Main thread starting Delaying <> ---------------------------------------------- GHC HEAD [1 of 1] Compiling Main ( JD-Good.hs, JD-Good.o ) Linking JD-Good ... ./JD-Good +RTS -tstderr Main thread starting Delaying <> Your "Bad" one doesn't terminate consistently on all versions of GHC. Adding -threaded does not help. Neil Browns version is also consistent over all versions of GHC and doesn't terminate. It does not matter if I add -threaded.

...

Well, I think Bulat correctly characterized the non-termination aspect. I didn't think the cooperative aspect of threading applied with the threaded RTS, so I'm not 100% sure I believe his characterization, but otherwise it seems like a reasonable explanation.

It is certainly a valid explanation, and the most plausible at the moment I think.

...

The space leakiness is a different issue and likely worth a bug report in its own right. Do you think you could try checking for the speak leaking using the compacting garbage collector? I think that one is enabled with +RTS -c -RTS.

Oh, that gives some interesting progress: Here is the run without -c: jlouis@illithid:~$ ghc --version The Glorious Glasgow Haskell Compilation System, version 6.12.1 jlouis@illithid:~$ ghc --make -threaded Post.hs jlouis@illithid:~$ ./Post +RTS -s ./Post +RTS -s Main thread starting Delaying 840,429,800 bytes allocated in the heap 336,183,280 bytes copied during GC 86,294,808 bytes maximum residency (8 sample(s)) 2,648,600 bytes maximum slop 171 MB total memory in use (3 MB lost due to fragmentation) Generation 0: 1596 collections, 0 parallel, 0.35s, 0.33s elapsed Generation 1: 8 collections, 0 parallel, 0.27s, 0.35s elapsed Parallel GC work balance: nan (0 / 0, ideal 1) MUT time (elapsed) GC time (elapsed) Task 0 (worker) : 0.00s ( 0.32s) 0.00s ( 0.00s) Task 1 (worker) : 0.37s ( 0.32s) 0.62s ( 0.68s) Task 2 (worker) : 0.00s ( 0.32s) 0.00s ( 0.00s) Task 3 (worker) : 0.00s ( 0.32s) 0.00s ( 0.00s) SPARKS: 0 (0 converted, 0 pruned) INIT time 0.00s ( 0.00s elapsed) MUT time 0.28s ( 0.32s elapsed) GC time 0.62s ( 0.68s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 0.90s ( 1.00s elapsed) %GC time 68.9% (67.6% elapsed) Alloc rate 3,001,331,338 bytes per MUT second Productivity 31.1% of total user, 27.9% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].steps[0].sync_large_objects: 0 gen[0].steps[1].sync_large_objects: 0 gen[1].steps[0].sync_large_objects: 0 And here with the compacting GC: jlouis@illithid:~$ ./Post +RTS -s -c ./Post +RTS -s -c Main thread starting Delaying 12,642,360 bytes allocated in the heap 2,522,160 bytes copied during GC 2,522,584 bytes maximum residency (3 sample(s)) 59,232 bytes maximum slop 4 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 22 collections, 0 parallel, 0.02s, 0.01s elapsed Generation 1: 3 collections, 0 parallel, 5.08s, 5.09s elapsed Parallel GC work balance: nan (0 / 0, ideal 1) MUT time (elapsed) GC time (elapsed) Task 0 (worker) : 0.00s ( 0.01s) 0.00s ( 0.00s) Task 1 (worker) : 0.00s ( 0.01s) 5.10s ( 5.10s) Task 2 (worker) : 0.00s ( 0.01s) 0.00s ( 0.00s) Task 3 (worker) : 0.00s ( 0.01s) 0.00s ( 0.00s) SPARKS: 0 (0 converted, 0 pruned) INIT time 0.00s ( 0.00s elapsed) MUT time 0.01s ( 0.01s elapsed) GC time 5.10s ( 5.10s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 5.10s ( 5.11s elapsed) %GC time 99.9% (99.8% elapsed) Alloc rate 3,159,800,049 bytes per MUT second Productivity 0.0% of total user, 0.0% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].steps[0].sync_large_objects: 0 gen[0].steps[1].sync_large_objects: 0 gen[1].steps[0].sync_large_objects: 0 So it looks like it eliminates the space leak, but note how the mutator doesn't get to do any work since we are using up all the time in the GC. We only get to run 22 Gen0 collections and 3 Gen1 collections. In other words, I don't think it does anything to help with the leak. According to heap profiling, two things take memory: PAPs (which are partial applications to the RTS, that is functions which are not yet fully applied), and a closure. It would make sense that it is a PAP when one looks at the core. State monads are s -> (s, a) and StateT with IO as the underlying monad gets translated into s -> ioS -> (ioS, (s, a)), so I am not too confused about the PAP appearing. Thinking more about this might reveal why the PAP appears however. Also, if you need me to run any kind of test against other RTS options or programs, I'll be happy to do it. Just bump me :) -- J.

Bulat Ziganshin

11:35 a.m.

New subject: Re[2]: GHC, odd concurrency space leak

Hello Jason, Saturday, April 17, 2010, 2:00:04 AM, you wrote:

...

Well, I think Bulat correctly characterized the non-termination aspect. I didn't think the cooperative aspect of threading applied with the threaded RTS, so I'm not 100% sure I believe his characterization, but otherwise it seems like a reasonable explanation.

it's a well known side of ghc green threads implementation. read notes in sources of Control.Concurrent module: The concurrency extension for Haskell is described in the paper /Concurrent Haskell/ http://www.haskell.org/ghc/docs/papers/concurrent-haskell.ps.gz. Concurrency is "lightweight", which means that both thread creation and context switching overheads are extremely low. Scheduling of Haskell threads is done internally in the Haskell runtime system, and doesn't make use of any operating system-supplied thread packages. GHC implements pre-emptive multitasking: the execution of threads are interleaved in a random fashion. More specifically, a thread may be pre-empted whenever it allocates some memory, which unfortunately means that tight loops which do no allocation tend to lock out other threads (this only seems to happen with pathological benchmark-style code, however). -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Neil Brown

15 Apr 15 Apr

8:37 a.m.

Jason Dagit wrote:

...

On Wed, Apr 14, 2010 at 3:13 PM, Daniel Fischer mailto:daniel.is.fischer@web.de> wrote:

Am Mittwoch 14 April 2010 23:49:43 schrieb Jason Dagit: > > It will be interesting to hear what fixes this! > > > > > > forever' m = do _ <- m > > forever' m > > When I define that version of forever, the space leak goes away.

Not with optimisations.

Thanks for pointing that out. I forgot to say so in my email.

Here are two reduced versions of the original program:

<snip> I find non-termination with a much simpler program than yours (GHC 6.12.1): \begin{code}{-# OPTIONS -O1 #-} import Control.Concurrent import Control.Monad (forever) main = do putStrLn "Main thread starting" forkIO $ do putStrLn "Started thread" forever $ return () putStrLn "Delaying" threadDelay (1 * 1000000) putStrLn "Delayed" \end{code} If I compile that with "ghc --make -threaded" and run it, with -O1 or -O2, it burns CPU and never terminates. With -O0 it terminates. So looks like some optimisation is causing the problem. I might guess it's something to do with the RTS and threadDelay that's causing the problem. "Delayed" is never printed on my system, so it seems like (even when run with +RTS -N2) the original thread is not ever being rescheduled; perhaps the timeout queue isn't checked properly when a thread is burning up the CPU like that, and optimisations are on? Thanks, Neil.

Bulat Ziganshin

8:55 a.m.

New subject: Re[2]: GHC, odd concurrency space leak

Hello Neil, Thursday, April 15, 2010, 12:37:35 PM, you wrote:

...

I find non-termination with a much simpler program than yours (GHC 6.12.1): forkIO $ do putStrLn "Started thread" forever $ return ()

ghc multithreading is actually cooperative: it switches only on memory allocation. since almost any haskell code allocates, there is no problem - it works like preemptive one. but sometimes this assumption fails - with optimization enabled, your code doesn't allocate so there are no chances for thread switching. replacing return () with threadDelay call solves the problem -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Simon Peyton-Jones

17 Apr 17 Apr

12:41 p.m.

I have not been following the details of this, I'm afraid, but I notice this:

...

forever' m = do _ <- m forever' m

When I define that version of forever, the space leak goes away. What was the old version of forever that led to the leak? If you can boil down the leak to a simple test case, do submit a Trac ticket. Simon From: haskell-cafe-bounces@haskell.org [mailto:haskell-cafe-bounces@haskell.org] On Behalf Of Jason Dagit Sent: 14 April 2010 22:50 To: Gregory Collins Cc: Haskell Cafe Subject: Re: [Haskell-cafe] GHC, odd concurrency space leak On Wed, Apr 14, 2010 at 2:44 PM, Jason Dagit mailto:dagit@codersbase.com> wrote: On Wed, Apr 14, 2010 at 2:13 PM, Gregory Collins mailto:greg@gregorycollins.net> wrote: Jesper Louis Andersen mailto:jesper.louis.andersen@gmail.com> writes:

...

This post describes some odd behaviour I have seen in GHC 6.12.1 when writing Combinatorrent. The post is literate Haskell so you can run it. The executive summary: A space leak occurs when a new process is spawned from inside another process - and I can't figure out why. I am asking for help on haskell-cafe.

...[snip]...

...
import Control.Monad.State

Does the problem go away if you use "Control.Monad.State.Strict"? Nope :) That was the first thing I tried here. I tried playing with optimization level too. Next I tried making two versions that were as similar as possible and then comparing the core with ghc-core. I can't see a difference between a version that uses 1MB and a version that uses 160MB (on my system 160MB is the worst I can get it to blow up). The two versions I compared: Low memory: \begin{code}

...

startp4 :: IO ThreadId startp4 = spawn () () (return ())

...

startp3 :: IO ThreadId startp3 = spawn () () (forever $ do liftIO startp4 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (3 * 1000000))

...

main1 = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000)

main = main1 \end{code}

Too much memory: \begin{code}

...

startp4 :: IO ThreadId startp4 = spawn () () (forever $ return ())

...

startp3 :: IO ThreadId startp3 = spawn () () (forever $ do liftIO startp4 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (3 * 1000000))

...

main1 = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000)

main = main1 \end{code}

...

startp4 :: IO ThreadId startp4 = spawn () () (liftIO $ threadDelay (100 * 1000000))

...

startp3 :: IO ThreadId startp3 = spawn () () (forever $ do liftIO startp4 liftIO $ putStrLn "Delaying" liftIO $ threadDelay (3 * 1000000))

...

main1 = do putStrLn "Main thread starting" startp3 threadDelay (1 * 1000000)

main = main1 \end{code}

It will be interesting to hear what fixes this!

...

forever' m = do _ <- m forever' m

When I define that version of forever, the space leak goes away.

Daniel Fischer

2:02 p.m.

...

I have not been following the details of this, I'm afraid, but I notice

Am Samstag 17 April 2010 14:41:28 schrieb Simon Peyton-Jones: this:

...

...
forever' m = do _ <- m forever' m

When I define that version of forever, the space leak goes away.

What was the old version of forever that led to the leak?

Control.Monad.forever forever :: Monad m => m a -> m b forever m = m >> forever m However, that isn't the problem. In my tests, both variants of forever exhibit the same behaviour, what makes it leak or not is the optimisation level.

...

If you can boil down the leak to a simple test case, do submit a Trac ticket.

Simon

The code below behaves well if compiled without optimisations (~36K maximum residency). When compiled with optimisations (-O1 or -O2, no discernible difference), it gets stuck in the infinite loop "always (return ())" [no surprise], but it runs in small space (+RTS -M58K for me). With -O2 -fno-state-hack (or -O1 -fno-state-hack), it leaks memory: 469,292,260 bytes allocated in the heap 837,326,332 bytes copied during GC 233,727,956 bytes maximum residency (9 sample(s)) 3,740,036 bytes maximum slop 456 MB total memory in use (4 MB lost due to fragmentation) --------------------------------------------- module Main (main) where import Control.Concurrent {- always :: Monad m => m a -> m b always a = a >> always a -} always :: Monad m => m a -> m b always a = do _ <- a always a spawner :: IO () spawner = always $ do forkIO $ always (return ()) putStrLn "Delaying" threadDelay 1000000 main :: IO () main = do putStrLn "Spawning" forkIO spawner putStrLn "Delaying main" threadDelay 4000000 -------------------------------------------

Bertram Felgenhauer

8:11 p.m.

Daniel Fischer wrote:

...

...
I have not been following the details of this, I'm afraid, but I notice

Am Samstag 17 April 2010 14:41:28 schrieb Simon Peyton-Jones: this:

...
...
forever' m = do _ <- m forever' m

When I define that version of forever, the space leak goes away.

What was the old version of forever that led to the leak?

Control.Monad.forever

forever :: Monad m => m a -> m b forever m = m >> forever m

However, that isn't the problem. In my tests, both variants of forever exhibit the same behaviour, what makes it leak or not is the optimisation level.

This definition, plus sharing, is the source of the space leak. Consider this modification of your code: import Control.Concurrent always :: Monad m => m a -> m b always a = -- let act = a >> act in act do _ <- a always a noop :: IO () noop = return () body :: IO () body = always noop spawner :: IO () spawner = do forkIO $ body putStrLn "Delaying" threadDelay 1000000 body `seq` return () main :: IO () main = do putStrLn "Spawning" forkIO spawner putStrLn "Delaying main" threadDelay 4000000 Note that the 'always' in 'spawner' is gone, but it still exhibits the space leak. The leak goes away if the final line of 'spawner' is removed, hinting at the real problem: 'always' actually creates a long chain of actions instead of tying the knot. Indeed the following definition of 'always' (or 'forever') fares better in that regard, but is more susceptible to producing unproductive loops: always a = let act = a >> act in act (I used noop = yield for avoiding that problem in my tests) regards, Bertram

Bulat Ziganshin

8:36 p.m.

New subject: Re[2]: GHC, odd concurrency space leak

Hello Bertram, Sunday, April 18, 2010, 12:11:05 AM, you wrote:

...

always a = -- let act = a >> act in act do _ <- a always a

...

hinting at the real problem: 'always' actually creates a long chain of actions instead of tying the knot.

can you explain it deeper? it's what i see: always definition is equivalent to always a = do a always a what's the same as always a = a >> always a that looks exactly like your commented out definition, except that it doesn't create value act. but i don't see list of actions here -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bertram Felgenhauer

11:36 p.m.

Bulat Ziganshin wrote:

...

Hello Bertram,

Sunday, April 18, 2010, 12:11:05 AM, you wrote:

...
always a = -- let act = a >> act in act do _ <- a always a

...
hinting at the real problem: 'always' actually creates a long chain of actions instead of tying the knot.

can you explain it deeper? it's what i see: always definition is equivalent to

always a = do a always a

what's the same as

always a = a >> always a

This expands as always a = a >> always a = a >> a >> always a = a >> a >> a >> always a ... where each >> application is represented by a newly allocated object (or several, I have not looked at it in detail) on the heap. With always a = let act = a >> act in act there's only one >> application being allocated. The principle is the same as with repeat x = x : repeat x versus repeat x = let xs = x : xs in xs HTH, Bertram

Bulat Ziganshin

18 Apr 18 Apr

7:19 a.m.

New subject: Re[2]: GHC, odd concurrency space leak

Hello Bertram, Sunday, April 18, 2010, 3:36:31 AM, you wrote:

...

This expands as

...

always a = a >> always a = a >> a >> always a = a >> a >> a >> always a ... where each >> application is represented by a newly allocated object (or several, I have not looked at it in detail) on the heap.

why you think so? i always thought that >> in ghc just sequentially executes statements, the RealWorld magic exists only at compile-time -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bertram Felgenhauer

9:37 a.m.

Bulat Ziganshin wrote:

...

...
This expands as

...
always a = a >> always a = a >> a >> always a = a >> a >> a >> always a ... where each >> application is represented by a newly allocated object (or several, I have not looked at it in detail) on the heap.

why you think so?

At the time I wrote this, because it explains the space leak and because the space leak disappears if I address this precise issue. But I've since verified the theory by inspecting Core and Cmm code.

...

i always thought that >> in ghc just sequentially executes statements, the RealWorld magic exists only at compile-time

Yes, that's what happens once (>>) gets actually executed in IO. But this fact and the RealWorld token have nothing to do with the whole issue, which is about accumulating a chain of IO actions that have not yet been executed. I'll continue to write a >> b, which in IO, modulo newtypes, stands for \(s :: RealWorld#) -> case a s of (s', _) -> b s' The fact that the state token disappears at runtime does not change that this is a closure, represented by a (function) heap node. So we have some IO action let x = always a Now we run x, but also hold onto the corresponding thunk to reuse it later, say let x = always a in x >> x In order to execute that, x is forced, and evaluated to let x = let x' = always a in a >> x' in x >> x or, equivalently, let x' = always a x = a >> x' in x >> x Then the first step of the IO action is performed, resulting in let x' = always a x = a >> x' in x' >> x And now the same reduction happens again for x', let x2 = always a x' = a >> x2 x = a >> x' in x2 >> x and then again for x2, let x3 = always a x2 = a >> x3 x' = a >> x2 x = a >> x' in x2 >> x and so on, ad infinitum. This leaks memory because x, x', x2 etc. can't be garbage collected - there's still a reference to x. Note that this also explains why the space leak disappears if we remove the 'forever' in the spawner thread in the original example. This would not happen if the 'always a' was reused, i.e. if the code tied a knot as let act = a >> act in act does, but as you can see in the Core (and even Cmm if you look closely enough) that does not happen in those cases where the code leaks memory. HTH, Bertram

Daniel Fischer

17 Apr 17 Apr

11:05 p.m.

Am Samstag 17 April 2010 22:11:05 schrieb Bertram Felgenhauer:

...

Daniel Fischer wrote:

...
Am Samstag 17 April 2010 14:41:28 schrieb Simon Peyton-Jones:

...
I have not been following the details of this, I'm afraid, but I notice

this:

...
...
forever' m = do _ <- m forever' m

When I define that version of forever, the space leak goes away.

What was the old version of forever that led to the leak?

Control.Monad.forever

forever :: Monad m => m a -> m b forever m = m >> forever m

However, that isn't the problem. In my tests, both variants of forever exhibit the same behaviour, what makes it leak or not is the optimisation level.

This definition, plus sharing, is the source of the space leak. Consider this modification of your code:

import Control.Concurrent

always :: Monad m => m a -> m b always a = -- let act = a >> act in act do _ <- a always a

noop :: IO () noop = return ()

body :: IO () body = always noop

spawner :: IO () spawner = do forkIO $ body putStrLn "Delaying" threadDelay 1000000 body `seq` return ()

main :: IO () main = do putStrLn "Spawning" forkIO spawner putStrLn "Delaying main" threadDelay 4000000

Note that the 'always' in 'spawner' is gone, but it still exhibits the space leak. The leak goes away if the final line of 'spawner' is removed, hinting at the real problem: 'always' actually creates a long chain of actions instead of tying the knot.

Except that with optimisations turned on, GHC ties the knot for you (at least if always isn't exported). Without -fno-state-hack, the knot is tied so tightly that always (return ()) is never descheduled (and there's no leak). With -fno-state-hack, I get Rec { Main.main_always :: GHC.Types.IO () -> GHC.Types.IO () GblId [Arity 1 NoCafRefs Str: DmdType L] Main.main_always = \ (a_aeO :: GHC.Types.IO ()) -> let { k_sYz :: GHC.Types.IO () LclId [Str: DmdType] k_sYz = Main.main_always a_aeO } in (\ (eta_ann :: GHC.Prim.State# GHC.Prim.RealWorld) -> case (a_aeO `cast` (GHC.Types.NTCo:IO () :: GHC.Types.IO () ~ (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)))) eta_ann of _ { (# new_s_anz, _ #) -> (k_sYz `cast` (GHC.Types.NTCo:IO () :: GHC.Types.IO () ~ (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)))) new_s_anz }) `cast` (sym (GHC.Types.NTCo:IO ()) :: (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)) ~ GHC.Types.IO ()) end Rec } which, despite tying the knot, leaks (so the program at least terminates).

...

Indeed the following definition of 'always' (or 'forever') fares better in that regard, but is more susceptible to producing unproductive loops:

Indeed, that doesn't terminate with -O2 -fno-state-hack

...

always a = let act = a >> act in act

(I used noop = yield for avoiding that problem in my tests)

regards,

Bertram

Bertram Felgenhauer

18 Apr 18 Apr

12:05 a.m.

Daniel Fischer wrote:

...

Except that with optimisations turned on, GHC ties the knot for you (at least if always isn't exported). Without -fno-state-hack, the knot is tied so tightly that always (return ()) is never descheduled (and there's no leak).

Yes, I was concentrating on -O2, without -fno-state-hack.

...

With -fno-state-hack, I get

Rec { Main.main_always :: GHC.Types.IO () -> GHC.Types.IO () GblId [Arity 1 NoCafRefs Str: DmdType L] Main.main_always = \ (a_aeO :: GHC.Types.IO ()) -> let { k_sYz :: GHC.Types.IO () LclId [Str: DmdType] k_sYz = Main.main_always a_aeO } in (\ (eta_ann :: GHC.Prim.State# GHC.Prim.RealWorld) -> case (a_aeO `cast` (GHC.Types.NTCo:IO () :: GHC.Types.IO () ~ (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)))) eta_ann of _ { (# new_s_anz, _ #) -> (k_sYz `cast` (GHC.Types.NTCo:IO () :: GHC.Types.IO () ~ (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)))) new_s_anz }) `cast` (sym (GHC.Types.NTCo:IO ()) :: (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)) ~ GHC.Types.IO ()) end Rec }

Which is always = \a_aeO -> let k_sYz = always a_aeO in a_aeO >> k_sYz specialised to IO, and with (>>) inlined. Where is the knot? regards, Bertram

Daniel Fischer

12:16 a.m.

Am Sonntag 18 April 2010 02:05:30 schrieb Bertram Felgenhauer:

...

Which is

always = \a_aeO -> let k_sYz = always a_aeO in a_aeO >> k_sYz

specialised to IO, and with (>>) inlined.

Where is the knot?

Nowhere. Got confused by all the a_aAe and `cast` (GHC.Types...). Sorry.

Daniel Fischer

14 Apr 14 Apr

10:01 p.m.

Am Mittwoch 14 April 2010 23:13:13 schrieb Gregory Collins:

...

Jesper Louis Andersen writes:

...
This post describes some odd behaviour I have seen in GHC 6.12.1 when writing Combinatorrent. The post is literate Haskell so you can run it. The executive summary: A space leak occurs when a new process is spawned from inside another process - and I can't figure out why. I am asking for help on haskell-cafe.

...[snip]...

...
import Control.Monad.State

Does the problem go away if you use "Control.Monad.State.Strict"?

No. The problem goes away, however, if I replace p1 with p1 = forever $ liftIO (return () >> threadDelay 0) It is reduced, but still present, for p1 = forever $ liftIO (return () >> yield) It is also reduced by decreasing the delay in p2. I don't know what's going on, though.

Adam Vogt

17 Apr 17 Apr

1:49 p.m.

* On Wednesday, April 14 2010, Jesper Louis Andersen wrote:

...

...
newtype Process a b c = Process (ReaderT a (StateT b IO) c) deriving (Functor, Monad, MonadIO, MonadState b, MonadReader a)

Note that the automatic derivations of *MonadState b* and *MonadReader a* makes GHC spit our some mkUsageInfo warnings in its generation of the .hi-files. They don't seem to be dangerous. Glueing instructions for our model kit is given by

The relevant bug for that is: http://hackage.haskell.org/trac/ghc/ticket/3955 -- Adam

5556

Age (days ago)

5560

Last active (days ago)

List overview

Download

24 comments

9 participants

participants (9)

Adam Vogt
Bertram Felgenhauer
Bulat Ziganshin
Daniel Fischer
Gregory Collins
Jason Dagit
Jesper Louis Andersen
Neil Brown
Simon Peyton-Jones