Re: [GHC] #7606: Stride scheduling for Haskell threads with priorities

22 Jan 2013

      #7606: Stride scheduling for Haskell threads with priorities
---------------------------------+------------------------------------------
    Reporter:  ezyang            |       Owner:  ezyang          
        Type:  feature request   |      Status:  new             
    Priority:  normal            |   Milestone:  7.8.1           
   Component:  Runtime System    |     Version:  7.7             
    Keywords:                    |          Os:  Unknown/Multiple
Architecture:  Unknown/Multiple  |     Failure:  None/Unknown    
  Difficulty:  Unknown           |    Testcase:                  
   Blockedby:                    |    Blocking:                  
     Related:                    |  
---------------------------------+------------------------------------------

Comment(by ezyang):

 You’re right, it’s not (or, at least, the situation is more complicated.)
 With the following gratuitous change:

 {{{
 diff --git a/rts/Capability.c b/rts/Capability.c
 index 811df58..b4a1ec0 100644
 --- a/rts/Capability.c
 +++ b/rts/Capability.c
 @@ -1007,6 +1007,10 @@ markCapability (evac_fn evac, void *user,
 Capability *cap,
      // or fewer Capabilities as GC threads, but just in case there
      // are more, we mark every Capability whose number is the GC
      // thread's index plus a multiple of the number of GC threads.
 +    StgTSO *tso;
 +    for (tso = cap->run_queue_hd; tso != END_TSO_QUEUE; tso = tso->_link)
 {
 +        evac(user, (StgClosure **)(void *)&tso);
 +    }
      evac(user, (StgClosure **)(void *)&cap->run_queue_hd);
      evac(user, (StgClosure **)(void *)&cap->run_queue_tl);
  #if defined(THREADED_RTS)
 }}}

 We only see a very minor bump in runtime:

 {{{
 --------------------------------------------------------------------------------
         Program           Size    Allocs   Runtime   Elapsed  TotalMem
 --------------------------------------------------------------------------------
           sieve          +0.0%     +0.0%     +0.7%     +0.7%     +0.0%
 }}}

 I've been focusing on sieve, since it had the most slowdown. What I do
 notice when I add some diagnostics is that the run queue starts with a 100
 threads in it, and that the program runs a lot more slowly when the run
 queue is large when it is small.  When I count actual copies, TSOs are
 copied a lot more frequently in the new code. If you have some ideas on
 what to look at, that would be great!

 I think what I am going to try to do is implement another version which
 uses sorted doubly linked lists instead, and see if the same problems crop
 up. (I also realized that we're leaking memory for the run queues, since I
 never free them. Oops!)

-- 
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7606#comment:7
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler