[GHC] #9809: Overwhelming the TimerManager

#9809: Overwhelming the TimerManager -------------------------------------+------------------------------------- Reporter: fryguybob | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 7.9 Keywords: | Operating System: Linux Architecture: Unknown/Multiple | Type of failure: Runtime Difficulty: Unknown | performance bug Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- I was talking on IRC with davean about an issue that potentially could have been related to STM (I don't think it is at all after investigating) and I reduced the issue to the following program: {{{#!hs -- Main.hs module Main where import Control.Monad import Control.Concurrent import Control.Concurrent.STM import System.Environment main :: IO () main = do as <- getArgs case as of ["-f"] -> replicateM_ 100000 . void . forkIO . void $ registerDelay 10 _ -> replicateM_ 100000 . void $ registerDelay 10 threadDelay 1000 }}} This ends up registering a lot of events with the TimerManager. With the "-f" flag it does so from many threads and when run that way it appears to eventually overwhelm the TimerManager and causing over 350 MB total memory in use. {{{ $ ghc --version The Glorious Glasgow Haskell Compilation System, version 7.9.20141115 $ ghc -O2 -threaded -debug Main.hs -o Main-head ... $ ./Main-head -f +RTS -s 3,566,966,936 bytes allocated in the heap 4,200,021,784 bytes copied during GC 118,273,720 bytes maximum residency (96 sample(s)) 12,649,480 bytes maximum slop 354 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 6350 colls, 0 par 2.430s 2.434s 0.0004s 0.0073s Gen 1 96 colls, 0 par 7.438s 7.441s 0.0775s 0.2526s TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.002s ( 0.002s elapsed) MUT time 0.461s ( 0.460s elapsed) GC time 9.869s ( 9.875s elapsed) EXIT time 0.003s ( 0.003s elapsed) Total time 10.336s ( 10.340s elapsed) Alloc rate 7,741,472,461 bytes per MUT second Productivity 4.5% of total user, 4.5% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 }}} Running without forking many threads and the total memory in use stays low (3 MB). {{{ 154,305,648 bytes allocated in the heap 16,922,272 bytes copied during GC 1,378,608 bytes maximum residency (3 sample(s)) 28,520 bytes maximum slop 3 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 298 colls, 0 par 0.056s 0.056s 0.0002s 0.0015s Gen 1 3 colls, 0 par 0.005s 0.005s 0.0017s 0.0047s TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.001s ( 0.001s elapsed) MUT time 0.148s ( 0.148s elapsed) GC time 0.061s ( 0.061s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 0.211s ( 0.210s elapsed) Alloc rate 1,042,557,595 bytes per MUT second Productivity 70.8% of total user, 71.2% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 }}} Using 7.6.3, things don't get out of hand, also with 3 MB total memory use. {{{ $ ./Main-7.6.3 -f +RTS -s 213,519,392 bytes allocated in the heap 116,111,712 bytes copied during GC 505,080 bytes maximum residency (11 sample(s)) 113,032 bytes maximum slop 3 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 403 colls, 0 par 0.33s 0.33s 0.0008s 0.0028s Gen 1 11 colls, 0 par 0.01s 0.01s 0.0011s 0.0019s TASKS: 3 (1 bound, 2 peak workers (2 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.00s ( 0.00s elapsed) MUT time 1.99s ( 1.50s elapsed) GC time 0.34s ( 0.34s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 2.33s ( 1.84s elapsed) Alloc rate 107,426,859 bytes per MUT second Productivity 85.5% of total user, 108.1% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 }}} I looked for causes and eliminated any `STM` interactions causing problems (`STM` shows up in the `Unique` values and in creating a `TVar` for the registered delay) but did discover that the `emTimeouts` queue gets very large at some point when executing with "-f". If I print the size of `expired` on this line: https://github.com/ghc/ghc/blob/021b7978d14799bae779907faf7490cfd21b3f46/lib... I end up seeing somewhere from 4 to 20 events for a while then eventually it jumps up to 80000 or so. Perhaps davean can chime in about the particular use case that he has that I reduced to this simple program for a more real world use. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9809 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9809: Overwhelming the TimerManager -------------------------------------+------------------------------------- Reporter: fryguybob | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: Linux | Difficulty: Unknown Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by AndreasVoellmy): * cc: AndreasVoellmy (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9809#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9809: Overwhelming the TimerManager -------------------------------------+------------------------------------- Reporter: fryguybob | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: Linux | Difficulty: Unknown Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by davean): * cc: davean (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9809#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9809: Overwhelming the TimerManager -------------------------------------+------------------------------------- Reporter: fryguybob | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: Linux | Difficulty: Unknown Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by davean): My specific use case involved TCP connections which were waiting on data from a TChan. If the data did not arrive soon enough, it needed to stop waiting on the TChan and do some other work. After which it would either come back and listen again or not. Previously I was managing in the range of 40k per instance for months at a time without issue (excursions to 100k had occurred). Currently (with 7.8.3 and admittedly slightly different server code) I'm not safely able to handle more then around 6k per instance. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9809#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9809: Overwhelming the TimerManager -------------------------------------+------------------------------------- Reporter: fryguybob | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 7.9 Resolution: | Keywords: Operating System: Linux | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by davean): Confirming this still occurs on 7.10.1. {{{ $ ./Main -f +RTS -s 3,260,792,712 bytes allocated in the heap 3,797,542,040 bytes copied during GC 103,056,944 bytes maximum residency (88 sample(s)) 11,062,480 bytes maximum slop 309 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 5589 colls, 0 par 0.164s 0.160s 0.0000s 0.0004s Gen 1 88 colls, 0 par 1.844s 1.909s 0.0217s 0.0656s TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.000s ( 0.000s elapsed) MUT time 0.220s ( 0.266s elapsed) GC time 2.008s ( 2.069s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 2.228s ( 2.335s elapsed) Alloc rate 14,821,785,054 bytes per MUT second Productivity 9.9% of total user, 9.4% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 }}} {{{ ./Main +RTS -s 182,472,160 bytes allocated in the heap 35,704,856 bytes copied during GC 3,270,336 bytes maximum residency (7 sample(s)) 89,368 bytes maximum slop 8 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 349 colls, 0 par 0.020s 0.016s 0.0000s 0.0002s Gen 1 7 colls, 0 par 0.000s 0.008s 0.0011s 0.0020s TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.000s ( 0.000s elapsed) MUT time 0.112s ( 0.152s elapsed) GC time 0.020s ( 0.024s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 0.132s ( 0.177s elapsed) Alloc rate 1,629,215,714 bytes per MUT second Productivity 84.8% of total user, 63.4% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9809#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9809: Overwhelming the TimerManager -------------------------------------+------------------------------------- Reporter: fryguybob | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 7.10.1 Resolution: | Keywords: Operating System: Linux | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by davean): * version: 7.9 => 7.10.1 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9809#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC