
#8224: Excessive system time -- new IO manager problem? -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: new Priority: high | Milestone: 7.8.1 Component: Runtime System | Version: 7.7 Resolution: | Keywords: IO Manager, Operating System: Linux | System Time Type of failure: Runtime | Architecture: x86_64 (amd64) performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Comment (by AndreasVoellmy): Replying to [comment:8 simonmar]:
200ms spent doing unspecified "initialization" should definitely be investigated, that's a lot. I can't see anything useful in that call stack, it looks like it might be related to signals or epoll() but it's hard to tell anything else.
What does the `+RTS -s` look like?
With my "one poller" hack: {{{ 48,624 bytes allocated in the heap 40 bytes copied during GC 38,592 bytes maximum residency (1 sample(s)) 268,608 bytes maximum slop 32 MB total memory in use (15 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 0 colls, 0 par 0.00s 0.00s 0.0000s 0.0000s Gen 1 1 colls, 0 par 0.00s 0.00s 0.0003s 0.0003s Parallel GC work balance: -nan (0 / 0, ideal 32) MUT time (elapsed) GC time (elapsed) Task 0 (worker) : 0.00s ( 0.00s) 0.00s ( 0.00s) Task 1 (worker) : 0.00s ( 0.00s) 0.00s ( 0.00s) Task 2 (bound) : 0.00s ( 0.00s) 0.00s ( 0.00s) Task 3 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 4 (worker) : 0.02s ( 0.01s) 0.00s ( 0.00s) Task 5 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 6 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 7 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 8 (worker) : 0.02s ( 0.01s) 0.00s ( 0.00s) Task 9 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 10 (worker) : 0.02s ( 0.01s) 0.00s ( 0.00s) Task 11 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 12 (worker) : 0.02s ( 0.01s) 0.00s ( 0.00s) Task 13 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 14 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 15 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 16 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 17 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 18 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 19 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 20 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 21 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 22 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 23 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 24 (worker) : 0.00s ( 0.01s) 0.00s ( 0.00s) Task 25 (worker) : 0.00s ( 0.01s) 0.00s ( 0.00s) Task 26 (worker) : 0.00s ( 0.01s) 0.00s ( 0.00s) Task 27 (worker) : 0.00s ( 0.01s) 0.00s ( 0.00s) Task 28 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 29 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 30 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 31 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 32 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) Task 33 (worker) : 0.01s ( 0.01s) 0.00s ( 0.00s) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.00s ( 0.01s elapsed) MUT time 0.00s ( 0.00s elapsed) GC time 0.00s ( 0.00s elapsed) EXIT time 0.01s ( 0.00s elapsed) Total time 0.02s ( 0.01s elapsed) Alloc rate 2,431,200 bytes per MUT second Productivity 80.0% of total user, 132.3% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 }}} With HEAD: {{{ 543,096 bytes allocated in the heap 64 bytes copied during GC 412,256 bytes maximum residency (1 sample(s)) 279,968 bytes maximum slop 18 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 0 colls, 0 par 0.00s 0.00s 0.0000s 0.0000s Gen 1 1 colls, 0 par 0.00s 0.00s 0.0009s 0.0009s Parallel GC work balance: -nan% (serial 0%, perfect 100%) TASKS: 65 (1 bound, 64 peak workers (64 total), using -N32) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.04s ( 0.03s elapsed) MUT time 0.00s ( -0.00s elapsed) GC time 0.00s ( 0.00s elapsed) EXIT time 0.01s ( 0.00s elapsed) Total time 0.05s ( 0.04s elapsed) Alloc rate 0 bytes per MUT second Productivity 17.4% of total user, 24.7% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8224#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler