
#13497: GHC does not use select()/poll() correctly on non-Linux platforms
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Runtime System    |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #8684, #12912     |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by nh2):

Replying to [comment:11 nh2]:
> The `select()` occurrence in `awaitEvent()` waits for `sleeping_queue->block_info.target - now` ([https://github.com/ghc/ghc/blob/380b25ea4754c2aea683538ffdb179f8946219a0/rts... code]) so that also needs to be vetted on whether it has the `&tv` updating problem on non-Linux.

OK, I've looked into that in detail now and added some printfs, and I think this `select()` needs an update as well if we want it to wake up precisely at `sleeping_queue->block_info.target`.

The reasoning is the following:

Inside the `while ((numFound = select(maxfd+1, &rfd, &wfd, NULL, ptv)) < 0) { ... }` loop there's this code:

{{{
/* check for threads that need waking up
 */
wakeUpSleepingThreads(getLowResTimeOfDay());

/* If new runnable threads have arrived, stop waiting for
 * I/O and run them.
 */
if (!emptyRunQueue(&MainCapability)) {
    return; /* still hold the lock */
}
}}}

After an `EINTR` has interrupted `select()`, `wakeUpSleepingThreads(getLowResTimeOfDay())` checks whether there is a Haskell thread that wants to run (that is, whether we are past its `sleeping_queue->block_info.target` time) -- and that includes the thread for which we're currently `select`ing -- and if so, we `return` out of the C code.

So we're looking at the current time in each loop iteration. Consequently, we don't have the same problem as in `fdReady`, as this scheme does not rely on `select()` updating the passed-in `struct timeval *ptv`.

However, there is still a problem: the `EINTR`s interrupting the `select()` come at fixed intervals (the timer signal). That can result in us waiting slightly too long.

For example, assume `*ptv` is set to 15 ms (`sleeping_queue->block_info.target` is 15 ms from `now`), and assume the timer signal fires every 10 ms. Then we would enter `select(15ms)`, get interrupted with `EINTR` after 10 ms, and then call `select(15ms)` again in the same `while` loop. With the next timer `EINTR` 10 ms later, we would `return` out of the `while` loop, so we're not at risk of running forever due to `EINTR`. But we have now waited 20 ms in total instead of the desired 15 ms.

That's why I currently believe that the timeout argument to this `select()` should be recalculated based on the current time in every iteration, similar to how my fix for `fdReady()` does it.

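To make the 15 ms / 10 ms arithmetic concrete, here is a minimal sketch in C. It is not the RTS code: `ms_t`, `remaining_timeout()` and `simulate_wait()` are hypothetical names invented for illustration, time is modelled as plain milliseconds starting at 0, and no real `select()` call or timer signal is involved. The simulation only assumes what the comment describes: each wait ends at the next timer tick or when the timeout expires, whichever comes first, and the loop exits once the current time has reached the target.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical time type: milliseconds on some monotonic clock. */
typedef int64_t ms_t;

/* Time left until `target`, clamped at zero, in the form select() expects.
 * Recomputing this before every select() call is the proposed fix. */
struct timeval remaining_timeout(ms_t target, ms_t now)
{
    ms_t left = target > now ? target - now : 0;
    struct timeval tv;
    tv.tv_sec  = (time_t)(left / 1000);
    tv.tv_usec = (suseconds_t)((left % 1000) * 1000);
    return tv;
}

/* Simulate the retry loop without real I/O or signals.
 * Each simulated wait returns at the next timer tick (the EINTR case) or
 * when its timeout expires, whichever is first. The loop ends once
 * now >= target, mirroring the wakeUpSleepingThreads() check.
 * recompute == 0: reuse the timeout computed before the loop.
 * recompute != 0: recalculate the timeout from the current time.
 * Returns the total simulated time waited, in milliseconds. */
ms_t simulate_wait(ms_t target, ms_t tick, int recompute)
{
    assert(tick > 0 && target >= 0);
    ms_t now = 0;
    ms_t timeout = target;              /* computed once, as in the loop discussed */
    while (now < target) {
        if (recompute) {
            struct timeval tv = remaining_timeout(target, now);
            timeout = (ms_t)tv.tv_sec * 1000 + (ms_t)(tv.tv_usec / 1000);
        }
        ms_t next_tick = (now / tick + 1) * tick;   /* next timer signal */
        ms_t expiry    = now + timeout;             /* timeout would fire here */
        now = expiry < next_tick ? expiry : next_tick;
    }
    return now;
}
```

With a 15 ms target and a 10 ms tick, `simulate_wait(15, 10, 0)` returns 20 and `simulate_wait(15, 10, 1)` returns 15, which matches the overshoot described above and shows how recalculating the timeout removes it in this simplified model.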
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13497#comment:15
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler