
#9221: (super!) linear slowdown of parallel builds on 40 core machine
-------------------------------------+-------------------------------------
        Reporter:  carter            |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.4.1
       Component:  Compiler          |              Version:  7.8.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |   Unknown/Multiple
 Type of failure:  Compile-time      |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:  #910, #8224       |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by nh2):

Hey, a question about `sched_yield()`:

I just read the [http://man7.org/linux/man-pages/man2/sched_yield.2.html man page] for `sched_yield()` again. It says:
> `sched_yield()` is intended for use with real-time scheduling policies
> (i.e., `SCHED_FIFO` or `SCHED_RR`). Use of `sched_yield()` with
> nondeterministic scheduling policies such as `SCHED_OTHER` is unspecified
> and very likely means your application design is broken.
Does GHC set the `FIFO` or `RR` policy? If not, then according to that our
"application design is broken".

I also found some interesting info on
http://www.informit.com/articles/article.aspx?p=101760&seqNum=5 (emphasis
mine):

> Linux provides the `sched_yield()` system call as a mechanism for a
> process to explicitly yield the processor to other waiting processes. It
> works by removing the process from the active array (where it currently
> is, because it is running) and inserting it into the expired array. This
> has the effect of not only preempting the process and putting it at the
> end of its priority list, but putting it on the expired list —
> **guaranteeing it will not run for a while**. Because real-time tasks
> never expire, they are a special case. Therefore, they are merely moved
> to the end of their priority list (and not inserted into the expired
> array). **In earlier versions of Linux, the semantics of the
> `sched_yield()` call were quite different; at best, the task was only
> moved to the end of their priority list**. The yielding was often not for
> a very long time. Nowadays, applications and even kernel code should be
> certain they truly want to give up the processor before calling
> `sched_yield()`.

A similar article on LWN: https://lwn.net/Articles/31462/

> This call used to simply move the process to the end of the run queue;
> now it moves the process to the "expired" queue, effectively cancelling
> the rest of the process's time slice. So a process calling
> `sched_yield()` now must wait until all other runnable processes in the
> system have used up their time slices before it will get the processor
> again.

The article goes on to explain that this resulted in bad performance,
especially for

> threaded applications [that] implement busy-wait loops with
> `sched_yield()`
Might this be relevant here?

Also, can someone explain to me why GHC is using `sched_yield()` at all? If
the purpose is to wait until other GC threads are done, wouldn't `futex()`
be enough? Or is that what's explained in
https://ghcmutterings.wordpress.com/2010/01/25/yielding-more-improvements-in-parallel-performance/ ?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9221#comment:84
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler