[GHC] #15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime

#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the
runtime
-------------------------------------+-------------------------------------
Reporter: nh2 | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.6.1
Component: Runtime | Version: 8.2.2
System |
Keywords: | Operating System: Linux
Architecture: | Type of failure: Runtime
Unknown/Multiple | performance bug
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
This is the outcome of https://mail.haskell.org/pipermail/ghc-
devs/2018-May/015749.html
Reading through the code of
[http://hackage.haskell.org/package/base-4.11.1.0/docs/src/GHC.IO.FD.html#rea...
readRawBufferPtr] the first line jumped to my eye:
{{{
| isNonBlocking fd = unsafe_read -- unsafe is ok, it can't block
}}}
This looks suspicious.
On Linux, if `fd` is a a descriptor to a regular file (on disk, a
networked filesystem, or a block device), then `O_NONBLOCK` will have no
effect, yet `unsafe_read` is used which will block the running OS thread.
You can read more about `O_NONBLOCK` not working on regular files on Linux
here:
* https://www.nginx.com/blog/thread-pools-boost-performance-9x/
* https://stackoverflow.com/questions/8057892/epoll-on-regular-files
* https://jvns.ca/blog/2017/06/03/async-io-on-linux--select--poll--and-
epoll/
*
https://groups.google.com/forum/#!topic/comp.os.linux.development.system/K
-fC-G6P4EA
And indeed, the following program does NOT keep printing things in the
printing thread, and instead blocks for 30 seconds:
{{{
module Main where
import Control.Concurrent
import Control.Monad
import qualified Data.ByteString as BS
import System.Environment
main :: IO ()
main = do
args <- getArgs
case args of
[file] -> do
forkIO $ forever $ do
putStrLn "still running"
threadDelay 100000 -- 0.1 s
bs <- BS.readFile file
putStrLn $ "Read " ++ show (BS.length bs) ++ " bytes"
_ -> error "Pass 1 argument (a file)"
}}}
when compiled with
{{{
~/.stack/programs/x86_64-linux/ghc-8.2.2/bin/ghc --make -O -threaded
blocking-regular-file-read-test.hs
}}}
on my Ubuntu 16.04 and on a 2GB file like
{{{
./blocking-regular-file-read-test /mnt/images/ubuntu-18.04-desktop-
amd64.iso
}}}
And `strace -f -e open,read` on it shows:
{{{
open("/mnt/images/ubuntu-18.04-desktop-amd64.iso",
O_RDONLY|O_NOCTTY|O_NONBLOCK) = 11
read(11,

#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Runtime System | Version: 8.2.2 Resolution: | Keywords: Operating System: Linux | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nh2): This is mainly for Linux but may also apply to other OSs. I've read that FreeBSD has a better story for reading files asynchronously, but I don't know whether GHC has any special code to use those features. Input from FreeBSD users would be appreciated. Otherwise, likely the fix to this for Linux will also improve things over the status quo on FreeBSD. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15153#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Runtime System | Version: 8.2.2 Resolution: | Keywords: Operating System: Linux | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nh2): Also funny but perhaps not too surprising: If in my code, you replace `forkIO` by e.g. `forkOn 2`, then nondeterministically, sometimes the program hangs and sometimes it works with `+RTS -N2`. The higher you set `-N`, the more likely it is to work. If you put both the putStrLn loop and the readFile into `forkOn 0` and `forkOn 1` each, and run with `+RTS -N3`, then it always works as expected. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15153#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Runtime System | Version: 8.2.2 Resolution: | Keywords: Operating System: Linux | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): Yeah, this is well known. O_NONBLOCK doesn't do anything for local disk files, but it also doesn't do any harm. It's not usually a big problem except when the filesystem actually has significant latency, like with NFS or an arbitrary fuse filesystem. I think I was worried about the overhead of using safe calls here, the problem is that if you have lots of threads all reading from Handles (e.g. sockets) then if those reads are safe calls we'll get a large number of real OS threads created. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15153#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Runtime System | Version: 8.2.2 Resolution: | Keywords: Operating System: Linux | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nh2): Replying to [comment:3 simonmar]:
It's not usually a big problem except when the filesystem actually has significant latency, like with NFS or an arbitrary fuse filesystem.
I think I was worried about the overhead of using safe calls here, the
What about my example though? I'm just reading a 3GB file using `BS.readFile` and it blocks other threads for 30 seconds. (My file system doens't have unexpected latency in terms of access time, it's a normal spinning disk, and the problem is the time it takes to read the data. Though I find it reasonablye to argue that the ~8ms it may take to seek this disk is ''also'' a lot of wasted CPU time, that's not the key problem in my example.) problem is that if you have lots of threads all reading from Handles (e.g. sockets) then if those reads are safe calls we'll get a large number of real OS threads created. The concern about overhead makes sense to me, but I find it easy to argue that there's more overheard in having all other resources of a machine (CPU, network, other disks) idle while a single disk arm is moving. And the time for a `pthread_create` is rather small compared to a read of any physical storage device. Another argument is that if examples like the one I gave don't work, it is very hard to build reasonable heartbeating of long-running IO tasks in Haskell.
O_NONBLOCK doesn't do anything for local disk files, but it also doesn't do any harm
What I meant is that from a quick look at the code, it looked like if O_NONBLOCK isn't set, a `safe` call would be made instead. In the spirit of https://ghc.haskell.org/trac/ghc/ticket/13296#comment:3 and the followup comment, would it be a good default to have reads to be `safe`, and offer `unsafe` equivalents when you know that the read will be very fast? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15153#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Runtime System | Version: 8.2.2 Resolution: | Keywords: Operating System: Linux | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by redneb): * cc: redneb (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15153#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC