Re: [GHC] #7606: Stride scheduling for Haskell threads with priorities

23 Jan 2013

      #7606: Stride scheduling for Haskell threads with priorities
---------------------------------+------------------------------------------
    Reporter:  ezyang            |       Owner:  ezyang          
        Type:  feature request   |      Status:  new             
    Priority:  normal            |   Milestone:  7.8.1           
   Component:  Runtime System    |     Version:  7.7             
    Keywords:                    |          Os:  Unknown/Multiple
Architecture:  Unknown/Multiple  |     Failure:  None/Unknown    
  Difficulty:  Unknown           |    Testcase:                  
   Blockedby:                    |    Blocking:                  
     Related:                    |  
---------------------------------+------------------------------------------

Comment(by ezyang):

 OK, I have a much better sense for where the performance problems are
 coming from.

 ----

 Adding extra words to the TSO has no impact on most smp benchmarks, but
 really kills threads006 (up to 100% slowdown); this is not too surprising
 since this benchmark involves creating 200,000 threads. A representative
 stat with a small TSO is

 {{{
 224198416 bytes, 425 GCs (416 + 9), 0/0 avg/max bytes residency (0
 samples), 699092656 bytes GC work, 279M in use, 0.00 INIT (0.00 elapsed),
 0.02 MUT (0.02 elapsed), 0.33 GC (0.33 elapsed), 0.12 GC(0) (0.12
 elapsed), 0.21 GC(1) (0.21 elapsed)
 }}}

 A representative stat with a big TSO is

 {{{
 224198416 bytes, 425 GCs (415 + 10), 0/0 avg/max bytes residency (0
 samples), 853296496 bytes GC work, 426M in use, 0.00 INIT (0.00 elapsed),
 0.00 MUT (0.00 elapsed), 0.47 GC (0.47 elapsed), 0.12 GC(0) (0.12
 elapsed), 0.34 GC(1) (0.35 elapsed)
 }}}

 What I don't understand is why the memory in use blows up by about 150MB
 (279M to 426M). By my count, the extra words should only add something
 like 10MB of overhead. My best guess is that I am actually nudging the TSO
 size over some invisible threshold (maybe the big object threshold).
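
 As a rough check on the arithmetic above, the following Python sketch
 relates a per-thread word count to total overhead. The 200,000 threads
 and the 279M/426M figures come from the stats quoted here. The 8-byte
 word (a 64-bit build), the reading of "M" as MiB, and the helper names
 are assumptions for illustration. The number of extra words is left as a
 parameter, since it is not stated in this comment.

```python
# Back-of-envelope sketch; helper names are hypothetical.
THREADS = 200_000        # threads created by threads006 (from the comment)
WORD_BYTES = 8           # assumption: 64-bit build
MIB = 1024 * 1024        # assumption: "M" in the stats means MiB


def tso_overhead_mib(extra_words, threads=THREADS, word_bytes=WORD_BYTES):
    """Total memory added by `extra_words` extra words in every TSO."""
    return extra_words * word_bytes * threads / MIB


def words_to_explain(growth_mib, threads=THREADS, word_bytes=WORD_BYTES):
    """Extra words per thread needed to account for `growth_mib` alone."""
    return growth_mib * MIB / (threads * word_bytes)


# "in use" with the big TSO minus "in use" with the small TSO
observed_growth_mib = 426 - 279

if __name__ == "__main__":
    for words in (2, 4, 6, 8):
        print(f"{words} extra words -> {tso_overhead_mib(words):.1f} MiB")
    print(f"observed growth: {observed_growth_mib} MiB, about "
          f"{words_to_explain(observed_growth_mib):.0f} words per thread")
```

 Under these assumptions, an overhead near 10MB corresponds to a handful
 of extra words per thread, while the observed growth would need on the
 order of a hundred, which fits the suspicion that something other than
 the raw word count is responsible.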

 ----

 The next two problems have to do with subtle scheduler changes. The
 first was a plain old bug, and I have a clean fix for it: 'sieve' slows
 down a lot if threads which hit HeapOverflow don't get put back at the
 front of the run queue. Fixing that dramatically improves runtime for all
 of the benchmarks except 'threads003'. We can make that remaining
 performance problem go away if we force threads to get appended to the
 end of the run queue (of course, stride scheduling won't work in that
 case!). I'm still investigating this.
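
 To illustrate why always appending to the end of the run queue defeats
 stride scheduling, here is a minimal Python sketch of the textbook
 algorithm. It is not the patch on this ticket: the class, the function
 names, and the STRIDE1 constant are hypothetical. Each thread gets a
 stride inversely proportional to its tickets, and the scheduler always
 runs the thread with the smallest pass value, so queue order is decided
 by pass rather than by arrival.

```python
import heapq
from itertools import count

STRIDE1 = 1 << 20  # large constant; stride = STRIDE1 // tickets


class StrideScheduler:
    """Textbook stride scheduler (illustrative, not GHC's scheduler)."""

    def __init__(self):
        self._heap = []      # entries are (pass, seq, name)
        self._seq = count()  # tie-breaker: FIFO among equal pass values
        self._stride = {}

    def add(self, name, tickets):
        stride = STRIDE1 // tickets
        self._stride[name] = stride
        heapq.heappush(self._heap, (stride, next(self._seq), name))

    def run_one(self):
        # Pick the thread with the lowest pass, "run" it for one quantum,
        # then advance its pass by its stride and re-queue it.
        pass_, _, name = heapq.heappop(self._heap)
        new_pass = pass_ + self._stride[name]
        heapq.heappush(self._heap, (new_pass, next(self._seq), name))
        return name


def schedule(tickets_by_name, quanta):
    """Return the order in which threads run over `quanta` time slices."""
    sched = StrideScheduler()
    for name, tickets in tickets_by_name.items():
        sched.add(name, tickets)
    return [sched.run_one() for _ in range(quanta)]


if __name__ == "__main__":
    print(schedule({"a": 3, "b": 1}, 8))
```

 With three tickets for "a" and one for "b", "a" receives three quanta
 for every one that "b" gets. A plain FIFO append ignores the pass values
 and would give both threads equal turns regardless of tickets.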

-- 
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7606#comment:15
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler