
#11075: Confusing parallel spark behaviour with safe FFI calls
-------------------------------------+-------------------------------------
        Reporter:  duncan            |             Owner:
            Type:  bug               |            Status:  new
        Priority:  normal            |         Milestone:
       Component:  Runtime System    |           Version:  7.10.2
        Keywords:                    |  Operating System:  Unknown/Multiple
    Architecture:  Unknown/Multiple  |   Type of failure:  None/Unknown
       Test Case:                    |        Blocked By:
        Blocking:                    |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------

This is a tricky problem, and I don't see any easy solution, but here it
is for the sake of documenting and tracking...

When we create a bunch of 'par' sparks, e.g. via a strategy, the expected
behaviour is that we get one thread per core that will churn through those
sparks, one by one. The one-by-one part is important: each core/cap/HEC is
only actively evaluating one of these sparks at a time. It's not like we
have a thread per spark, all evaluating in parallel; the parallelism is
limited to the core count (well, the cap/HEC count).

However, if each of these sparks makes a safe FFI call then the one-by-one
behaviour is broken. Instead we get an OS thread per spark, all running
concurrently. If one looks at the mechanisms in detail one can see why
this happens, but the overall effect can be disastrous for performance. If
we have a small number of cores, e.g. 8, and a large number of sparks,
e.g. 1024, then spawning 1024 OS threads, each of which has plenty of work
to do, is not a recipe for good performance. The OS will share the time
fairly, which means each thread will get terrible CPU cache utilisation.
Think of something like big numerical calculations where cache utilisation
is critical: running 100s of OS threads per core doing numerical
calculations will be much worse than running one thread per core that
churns through the calculations sequentially.

So why does this happen? Well, when a cap is idle, the cap's scheduler
checks if it has any sparks in its own spark pool, or in other caps' spark
pools. If there are sparks available it grabs one and starts a Haskell
thread to run it. Now the Haskell thread makes a safe FFI call. That means
the Haskell thread is put to sleep while the foreign call runs in a new OS
thread. But this now means that the scheduler thinks this cap is idle
again (there are no runnable Haskell threads), so it grabs another spark.
And so another OS thread gets made to run that FFI call. And again and
again, until we've converted all the sparks and forked off loads of OS
threads to run these FFI calls.

The problem is that the scheduler cannot tell the difference between a
safe FFI call that is blocking and taking no CPU time (like waiting on
network IO) and a safe FFI call that is not blocking and is fully
utilising the CPU (like big numerical calculations). In the latter case we
would really like to say that the cap is not really idle, and that we
don't need to create more work to do. In the former case it's important to
create more work to do. But there's no easy way to tell the difference
between these two cases, so this is a hard problem to fix.

If a programmer knows that this is what is going on then they can work
around the problem by being very careful with the number of sparks they
make (kind of losing some of the benefits of sparks), or they can make all
the expensive FFI calls unsafe (which can be tricky when using reusable
FFI binding libs).
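To make the scenario concrete, here is a minimal sketch of both the
problematic pattern and the manual workaround. The C function
`expensive_compute` is hypothetical, standing in for any CPU-bound safe
foreign call, and the bounded version assumes the async package; build
with -threaded and run with e.g. +RTS -N8.

{{{
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent (getNumCapabilities)
import Control.Concurrent.Async (mapConcurrently)  -- from the async package
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (bracket_, evaluate)
import Control.Parallel.Strategies (parList, rseq, withStrategy)
import Foreign.C.Types (CInt (..))

-- Hypothetical CPU-bound C routine standing in for "big numerical
-- calculations"; any safe foreign import behaves the same way.
foreign import ccall safe "expensive_compute"
  expensiveCompute :: CInt -> CInt

-- The problematic pattern: 1024 sparks on, say, 8 caps. Each spark
-- enters a safe FFI call, the cap then looks idle, the scheduler
-- converts the next spark, and we end up with an OS thread per spark.
sparked :: CInt
sparked = sum (withStrategy (parList rseq) (map expensiveCompute [1 .. 1024]))

-- The manual workaround: bound the number of in-flight foreign calls
-- to the capability count with a semaphore. The queued Haskell threads
-- are cheap; only the foreign calls (and hence OS threads) are bounded.
bounded :: IO CInt
bounded = do
  n   <- getNumCapabilities
  sem <- newQSem n
  rs  <- mapConcurrently
           (\i -> bracket_ (waitQSem sem) (signalQSem sem)
                           (evaluate (expensiveCompute i)))
           [1 .. 1024]
  return (sum rs)

main :: IO ()
main = do
  print sparked        -- forks roughly one OS thread per spark
  print =<< bounded    -- at most n foreign calls in flight
}}}

Note that the bounded version still forks a Haskell thread per work item,
but those are cheap: only the foreign calls, and hence the OS threads, are
limited to the capability count.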
But in practice most GHC users will never figure out that this is what is
happening (plausibly it might be possible from looking at the ThreadScope
view very carefully). So that's the conundrum: a problem that users will
find hard to identify, and no easy automatic solution. Ho hum.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11075
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler