Unexpected behaviour with send and send-buffer setting

I'm new to Haskell and have reached an impasse in understanding the behaviour of sockets. I see that under the hood Network.Socket sockets are set to non-blocking. Presumably, when a non-blocking socket's buffer is full it should immediately return 0 bytes. I've found that setting the send buffer size causes send to truncate the ByteString to the buffer size, but that successive sends continue to succeed when the buffer should be full.

In the server code I set the send buffer to 1, then attempted to overflow it:

    handleConnection conn = do
      setSocketOption conn SendBuffer 1
      s1 <- send conn "abc"
      putStrLn $ "Bytes sent: " ++ show s1
      s2 <- send conn "def"
      putStrLn $ "Bytes sent: " ++ show s2
      s3 <- send conn "ghi"
      putStrLn $ "Bytes sent: " ++ show s3
      close conn

And in the client I delay the recv by 1 second:

    setSocketOption sk RecvBuffer 1
    threadDelay (1 * 10^6)
    b1 <- recv sk 1
    B8.putStrLn b1
    b2 <- recv sk 1
    B8.putStrLn b2
    b3 <- recv sk 1
    B8.putStrLn b3

The server immediately outputs:

    Bytes sent: 1
    Bytes sent: 1
    Bytes sent: 1

The client waits for a second and then outputs:

    a
    d
    g

What's going on? I expected the second and third send operations to return 0 bytes sent, because the send buffer can only hold 1 byte.

The crux of my line of enquiry is this: how can my application know when to pause in generating its chunked output if send doesn't block, and the current non-blocking send behaviour apparently succeeds when the send buffer should be full?

More generally, isn't polling sockets using system calls to be avoided in favour of blocking and lightweight Haskell threads?

Hope someone can help clear up the confusion.

On Tue, Sep 3, 2013 at 6:56 PM, Simon Yarde wrote:
I've found that setting the send buffer size causes send to truncate the ByteString to the buffer size, but that successive sends continue to succeed when the buffer should be full.
I see no actual flow control here. That the receiver is blocked does not mean the receiver's *kernel* has not received the packets and buffered them.

Also note that send is not synchronous; it cannot know that the receiver has hit its buffer limit, and the kernel may well have already sent the previous packet, so the send buffer is in fact empty at that point, with the pending packet either in flight or in the receiving kernel's network buffers (interrupt-time or normal; they are usually distinct; or, with a sufficiently fancy network card, the card's own buffers).

In short, you have not thought through all the possible ramifications, nor considered that the kernel handles packets and buffering independently of your program, nor considered the effects of the non-instantaneous network between sender and receiver. It may or may not behave differently when sender and receiver are on the same machine. Do not assume that the kernel will short-circuit here and leave out all the intermediate buffering! The only part you're guaranteed to avoid is the interface with the network hardware.

The crux of my line of enquiry is this; how can my application know when to pause in generating its chunked output if send doesn't block and the current non-blocking send behaviour apparently succeeds when the send buffer should be full?

I would suggest reading a book on TCP/IP networking.

--
brandon s allbery kf8nh / sine nomine associates
allbery.b@gmail.com / ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad
http://sinenomine.net

On Wed, Sep 4, 2013 at 12:56 AM, Simon Yarde wrote:
What's going on? I expected the second and third send operation to return 0 bytes sent, because the send buffer can only hold 1 byte.
If the underlying write operation returns EWOULDBLOCK then the "send"
function calls into the GHC IO manager with "threadWaitWrite", which
registers interest in the file descriptor using epoll() and blocks the
calling Haskell thread until the socket is writable.
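The pattern described here, building a blocking send out of a non-blocking write plus "wait until writable, then retry", can be simulated without sockets at all. Below is a toy sketch in which an IORef-backed bounded buffer stands in for the kernel's send buffer; nothing here is the actual Network.Socket code, and the "wait" step is faked by draining one byte where the real implementation would call threadWaitWrite:

```haskell
import Data.IORef

-- A fake "socket": a bounded buffer holding at most 'cap' bytes.
data FakeSock = FakeSock { cap :: Int, buf :: IORef Int }

-- Non-blocking write: accept at most the free space; return Nothing
-- (the moral equivalent of EWOULDBLOCK) when the buffer is full.
tryWrite :: FakeSock -> Int -> IO (Maybe Int)
tryWrite s n = do
  used <- readIORef (buf s)
  let free = cap s - used
  if free <= 0
    then pure Nothing
    else do
      let sent = min n free
      writeIORef (buf s) (used + sent)
      pure (Just sent)

-- Blocking send in the style Gregory describes: retry the non-blocking
-- write whenever it would block.  Draining one byte here stands in for
-- "block until the kernel reports the socket writable" (threadWaitWrite).
blockingSend :: FakeSock -> Int -> IO Int
blockingSend s n = do
  r <- tryWrite s n
  case r of
    Just sent -> pure sent
    Nothing   -> do
      modifyIORef' (buf s) (subtract 1)  -- fake "peer drained a byte"
      blockingSend s n

main :: IO ()
main = do
  ref <- newIORef 0
  let s = FakeSock 4 ref
  a <- blockingSend s 10  -- buffer empty: write is truncated to capacity
  b <- blockingSend s 3   -- buffer full: "waits", then sends what fits
  print (a, b)            -- prints (4,1)
```

Note how, as in the real API, the caller only ever sees a positive byte count: the would-block case is absorbed by the retry loop rather than surfacing as 0 or -1.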
G
--
Gregory Collins

On 4 Sep 2013, at 00:49, Gregory Collins
wrote: If the underlying write operation returns EWOULDBLOCK then the "send" function calls into the GHC IO manager with "threadWaitWrite", which registers interest in the file descriptor using epoll() and blocks the calling Haskell thread until the socket is writable.
If I'm following along correctly, I was mistaken in thinking that send would never block because sockets are set to non-blocking; in fact the implementation of send re-introduces blocking behaviour through threadWaitWrite and efficient use of epoll (instead of continually polling, with the attendant system-call overhead).
On Tue, Sep 3, 2013 at 6:56 PM, Simon Yarde
wrote: The crux of my line of enquiry is this; how can my application know when to pause in generating its chunked output if send doesn't block and the current non-blocking send behaviour apparently succeeds when the send buffer should be full?
Now that I've learned that send *can* block the current thread to avoid overwhelming the receiver (though via different mechanisms than C's send()), I understand my app need simply wait for each send to complete before generating the next chunk. Does that sound anywhere close?
On 4 Sep 2013, at 00:58, Joey Adams
wrote: 'send' will eventually block after enough 'send's without matching 'recv's. As Brandon explains, there is more buffering going on than the send buffer. In particular, the receiving host will accept segments until its buffer fills up. TCP implements flow control (i.e. keeps the sender from flooding the receiver) by having the receiver tell the sender how many more bytes it is currently willing to accept. This is done with the "window size" value in the TCP segment header [1]. [1]: http://en.wikipedia.org/wiki/Transmission_Control_Protocol#TCP_segment_struc...
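The sender-side bookkeeping behind this flow control is simple arithmetic, and writing it down makes the blocking behaviour concrete. A hypothetical helper (not from any library): the usable window is the receiver's advertised window minus the bytes already sent but not yet acknowledged, and when it reaches zero the sender must pause, which is exactly the point at which a Haskell send blocks its calling thread:

```haskell
-- Usable send window: how many more bytes the sender may put on the
-- wire right now.  'advertised' is the window from the peer's last TCP
-- segment header; 'inFlight' is bytes sent but not yet ACKed.
usableWindow :: Int -> Int -> Int
usableWindow advertised inFlight = max 0 (advertised - inFlight)

-- usableWindow 65535 1000  == 64535  (plenty of room)
-- usableWindow 65535 65535 == 0      (sender must stop; send blocks)
```

The `max 0` guard covers the case where the peer shrinks its advertised window below what is already in flight.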
If the receiver's buffer is full (reporting window size 0?), does send block until the receiver can accept more bytes, or return 0 bytes sent? Put another way: is there a scenario in which send could return 0 bytes sent?
On 4 Sep 2013, at 00:13, Brandon Allbery
wrote: I would suggest reading a book on TCP/IP networking.
I've studied Beej's Guide to Network Programming, the Wikipedia entry, the TCP spec, and the man pages. I'll gladly take any recommendations that could help me understand system resource usage and configuration. Many thanks Brandon, Gregory, Joey.

On Tue, Sep 3, 2013 at 6:56 PM, Simon Yarde wrote:
I'm new to Haskell and have reached an impasse in understanding the behaviour of sockets.
The crux of my line of enquiry is this; how can my application know when to pause in generating its chunked output if send doesn't block and the current non-blocking send behaviour apparently succeeds when the send buffer should be full?
'send' will eventually block after enough 'send's without matching 'recv's. As Brandon explains, there is more buffering going on than the send buffer. In particular, the receiving host will accept segments until its buffer fills up. TCP implements flow control (i.e. keeps the sender from flooding the receiver) by having the receiver tell the sender how many more bytes it is currently willing to accept. This is done with the "window size" value in the TCP segment header [1]. [1]: http://en.wikipedia.org/wiki/Transmission_Control_Protocol#TCP_segment_struc...

On Tue, Sep 3, 2013 at 7:58 PM, Joey Adams wrote:
On Tue, Sep 3, 2013 at 6:56 PM, Simon Yarde
wrote: I'm new to Haskell and have reached an impasse in understanding the behaviour of sockets.
The crux of my line of enquiry is this; how can my application know when to pause in generating its chunked output if send doesn't block and the current non-blocking send behaviour apparently succeeds when the send buffer should be full?
'send' will eventually block after enough 'send's without matching 'recv's. As Brandon explains, there is more buffering going on than the send buffer. In particular, the receiving host will accept segments until its buffer fills up. TCP implements flow control (i.e. keeps the sender from flooding the receiver) by having the receiver tell the sender how many more bytes it is currently willing to accept.
Also note that, if you're using TCP, Nagle's algorithm will be turned on unless you specifically turn it off; this is explicitly designed to avoid sending very short packets, by buffering them into larger packets in the kernel network stack until some timeout or a minimum threshold size is reached. (Protocols such as ssh and telnet turn it off for interactivity.)

And if you're using UDP, there's no flow control at all; while packets won't be aggregated à la Nagle, the network stacks on the sending and receiving ends can do pretty much whatever they want with the individual packets.

And in either case the socket buffer size is only the "last mile": there is no way to control what happens elsewhere, and in particular the interrupt-time received-packet handling usually won't even know what socket is the target, much less what buffer size that socket has set.
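Nagle's decision rule itself is compact enough to write down as a pure predicate. This is a sketch of the classic rule only (illustrative, not code from any kernel): transmit immediately only if a full segment's worth of data is buffered, or nothing already sent is still awaiting an ACK; otherwise keep coalescing.

```haskell
-- Nagle's algorithm, sketched: send now only when a full MSS-sized
-- segment is ready, or when there is no unacknowledged data in flight.
nagleShouldSend :: Int   -- bytes buffered and ready to send
                -> Int   -- maximum segment size (MSS)
                -> Int   -- bytes sent but not yet ACKed
                -> Bool
nagleShouldSend buffered mss unacked =
  buffered >= mss || unacked == 0

-- nagleShouldSend 10 1460 5000 == False  (tiny write, data in flight: buffer it)
-- nagleShouldSend 10 1460 0    == True   (nothing outstanding: send at once)
```

This is why a stream of tiny writes coalesces into larger packets, and why interactive protocols set TCP_NODELAY (NoDelay in Network.Socket) to skip the rule.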

On Tue, Sep 3, 2013 at 3:56 PM, Simon Yarde wrote:
I'm new to Haskell and have reached an impasse in understanding the behaviour of sockets.
Your question is actually not related to Haskell at all, but is a general "I don't understand socket programming" question. You're being misled by the non-blocking sockets observation - this makes no difference to the behaviour of your program. I recommend picking up copies of "Unix Network Programming" and "TCP/IP Illustrated", and reading them.

Well, I'm always happy to take a public pasting in return for the chance to learn something :) It's fair to say I didn't fully understand the implications of TCP's flow-control mechanisms for buffering capacity, and I've been grateful to be guided here.
On 4 Sep 2013, at 16:52, Bryan O'Sullivan
wrote: Your question is actually not related to Haskell at all,
Well, I'll protest that my inquiry was specifically targeted at Haskell, and that it's reasonable to ask: "what is the behaviour of send when there is no buffering capacity?" and "as a newcomer to Haskell, broadly what are the mechanisms used to implement send on top of the C API?". Thanks to everyone who provided pointers. Network.Socket has this to say:
Essentially the entire C socket API is exposed through this module; in general the operations follow the behaviour of the C functions of the same name (consult your favourite Unix networking book).
I'm pretty sure you can't understand the behaviour of Network.Socket from consulting a Unix networking book, otherwise you'd be looking for -1 return values and error codes, and pondering how you might go about efficiently managing read/write to a whole bunch of non-blocking sockets. I don't know how one would go about providing a little background in the Network.Socket docs, but it seems it would be very helpful; there's no clue provided as to the magic that it's taking care of.

Anyway, that's enough hot air from me. I've put my findings below in case anyone else is ever interested.

S

For anyone interested, here's what I learned from asking questions here and digging around:

- The underlying C sockets are in non-blocking mode;
- send uses threadWaitWrite to block the current thread until the socket is writeable;
- threadWaitWrite relies on either epoll or kqueue to efficiently identify writeable file descriptors, therefore select is not required and not implemented by Network.Socket, and there is no need to provide API access to setting O_NONBLOCK;
- threadWaitWrite is run inside a throwSocketErrorWaitWrite (Network.Socket.Internal), which uses a throwErrnoIfRetry (Foreign.C.Error); the overall effect is that if the threadWaitWrite/send operation ever errors with EINTR then the whole threadWaitWrite/send is retried; otherwise subsequent sends are managed and called by threadWaitWrite when the send is likely to succeed (which is more efficient than polling using a retry loop alone); all other errors are converted to IOError.

The end result on the behaviour of Network.Socket.send is:

- send can result in all sorts of socket-related IOErrors being thrown, except for EINTR interrupts, because the send is efficiently managed and retried;
- send never has a return value of -1, because either an exception is raised or the send is retried;
- send blocks the current thread if the socket is not currently writeable.
- It strikes me that a return value of 0 for Network.Socket.send is unlikely, because otherwise the socket file descriptor would not have reported that it was writeable and the send would not have been called. But I'm unsure whether a 0 value could never be returned.

Simon Yarde
simonyarde@me.com
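The retry behaviour summarised in these findings can be mimicked in miniature: run a send-like action, and on a would-block result wait for writability and try again, so the caller never sees -1. The names below are made up for illustration (the real machinery is throwSocketErrorWaitWrite in Network.Socket.Internal), and an IORef-counted fake stands in for the C send call:

```haskell
import Data.IORef

-- Illustrative retry loop in the shape of throwSocketErrorWaitWrite:
-- run a send-like action; on a "would block" result (-1 here), wait
-- for writability and try again; otherwise return the byte count.
retrySend :: IO ()      -- ^ stand-in for threadWaitWrite
          -> IO Int     -- ^ stand-in for the non-blocking C send
          -> IO Int
retrySend waitWritable action = go
  where
    go = do
      n <- action
      if n == -1
        then waitWritable >> go  -- absorb EWOULDBLOCK; caller never sees it
        else pure n

main :: IO ()
main = do
  attempts <- newIORef (0 :: Int)
  let fakeSend = do
        k <- readIORef attempts
        modifyIORef' attempts (+ 1)
        pure (if k < 2 then -1 else 3)  -- "would block" twice, then 3 bytes
  n <- retrySend (pure ()) fakeSend
  print n  -- prints 3
```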
participants (5)

- Brandon Allbery
- Bryan O'Sullivan
- Gregory Collins
- Joey Adams
- Simon Yarde