Re: thread/socket behavior

21 Oct 2008


      I'll be interested to know if the fix helps your application.  The bug 
reported in #2703 results in the program just allocating memory endlessly 
until it dies, so it doesn't sound exactly like the symptoms you were 
originally describing.

Cheers,
	Simon

Jeff Polakow wrote:
...
Hello,
Just writing to let people know the resolution of this problem...
After much frustration and toil, we realized there was a bug in GHC's 
handle abstraction over sockets.
We resolved our immediate problem by having our code deal directly 
with the sockets, and we filed a bug report, #2703, which has just been 
(partially) fixed by Simon Marlow.
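For readers wondering what "deal directly with the sockets" can look like, here is a minimal sketch, not the code from this thread. It assumes the `network` and `bytestring` packages with their current APIs (`Network.Socket.ByteString.recv` and `sendAll`), and `recvExactly` is a hypothetical helper name. The point is only that bytes are read from the `Socket` itself, with no `Handle` layered on top.

```haskell
module Main (main) where

import qualified Data.ByteString.Char8 as BC
import qualified Network.Socket as NS
import qualified Network.Socket.ByteString as NSB

-- | Read exactly n bytes straight from the socket. A single recv may
-- return fewer bytes than requested, so we loop. The result is short
-- only if the peer closes the connection early.
-- (recvExactly is a hypothetical helper, not part of the network package.)
recvExactly :: NS.Socket -> Int -> IO BC.ByteString
recvExactly sock = go []
  where
    go acc remaining
      | remaining <= 0 = pure (BC.concat (reverse acc))
      | otherwise = do
          chunk <- NSB.recv sock remaining
          if BC.null chunk
            then pure (BC.concat (reverse acc)) -- peer closed
            else go (chunk : acc) (remaining - BC.length chunk)

main :: IO ()
main = do
  -- A connected pair of Unix-domain sockets keeps the demo local.
  (a, b) <- NS.socketPair NS.AF_UNIX NS.Stream NS.defaultProtocol
  NSB.sendAll a (BC.pack "hello, socket")
  msg <- recvExactly b 13
  BC.putStrLn msg
  NS.close a
  NS.close b
```

The demo uses `socketPair` with `AF_UNIX`, so it is assumed to run on a Unix-like system such as the Linux machine described in this thread.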
thanks,
  Jeff
Simon Marlow  wrote on 10/10/2008 09:23:31 AM:
...
Jeff Polakow wrote:
...
Don Stewart  wrote on 10/09/2008 02:56:02 PM:
...
...
jeff.polakow:
We have a server that accepts messages over a socket, spawning threads to
process them. Processing these messages may cause other, outgoing
connections, to be spawned. Under sufficient load, the main server loop
(i.e. the call to accept, followed by a forkIO), becomes nonresponsive.
A smaller distilled testcase reveals that when sufficient socket activity
is occurring, an incoming connection may not be responded to until other
connections have been cleared out of the way, despite the fact that these
other connections are being handled by separate threads. One issue that
we've been trying to figure out is where this behavior arises from: the
GHC rts, the Network library, the underlying C libraries.
Have other GHC users doing applications with large amounts of socket usage
observed similar behavior and managed to trace back where it originates
from? Are there any particular architectural solutions that people have
found to work well for these situations?
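The server shape described above (a call to accept followed by a forkIO) can be sketched roughly as follows. This is an illustration, not the poster's server: it assumes the `network` and `bytestring` packages with their current APIs, the names `acceptLoop`, `handleClient`, and `openListener` are hypothetical, and the per-connection work is reduced to a simple echo.

```haskell
module Main (main) where

import Control.Concurrent (forkIO, killThread)
import Control.Exception (finally)
import Control.Monad (forever, unless)
import qualified Data.ByteString.Char8 as BC
import qualified Network.Socket as NS
import qualified Network.Socket.ByteString as NSB

-- | Worker: echo bytes back until the peer closes, then close our end.
handleClient :: NS.Socket -> IO ()
handleClient conn = loop `finally` NS.close conn
  where
    loop = do
      bytes <- NSB.recv conn 4096
      unless (BC.null bytes) $ NSB.sendAll conn bytes >> loop

-- | Main server loop: accept, fork a worker, go straight back to accept.
acceptLoop :: NS.Socket -> IO ()
acceptLoop listener = forever $ do
  (conn, _peer) <- NS.accept listener
  forkIO (handleClient conn)

-- | Listen on an OS-assigned port on the loopback interface.
openListener :: IO NS.Socket
openListener = do
  sock <- NS.socket NS.AF_INET NS.Stream NS.defaultProtocol
  NS.bind sock (NS.SockAddrInet 0 (NS.tupleToHostAddress (127, 0, 0, 1)))
  NS.listen sock 128
  pure sock

main :: IO ()
main = do
  listener <- openListener
  addr <- NS.getSocketName listener
  server <- forkIO (acceptLoop listener)
  -- One client round trip to show the loop answering a connection.
  client <- NS.socket NS.AF_INET NS.Stream NS.defaultProtocol
  NS.connect client addr
  NSB.sendAll client (BC.pack "ping")
  reply <- NSB.recv client 4096
  BC.putStrLn reply
  NS.close client
  killThread server
  NS.close listener
```

Each accepted connection gets its own lightweight thread, so in principle a slow client should not hold up the next accept; the question in this thread is why that expectation failed under load.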
Hey Jeff,
Can you say which GHC you used, and whether you used the threaded
runtime or non-threaded runtime?
Oops, forgot about that...
We used both ghc-6.8.3 and ghc-6.10.rc1 and we used the threaded
runtime. We are running on a 64 bit linux machine using openSUSE 10.
The scheduler doesn't have a concept of priorities, so the accepting thread
will get the same share of the CPU as the other threads.  Another issue is
that the accepting thread has to be woken up by the IO manager thread when
a new connection is available, so we might have to wait for the IO manager
thread to run too.  But I wouldn't expect to see overly long delays.  Maybe
you could try network-alt which does its own IO multiplexing.
If you have multiple cores, you might want to try fixing the thread
affinity - e.g. put all the worker threads on one core, and the accepting
thread on the other core.  You can do this using GHC.Conc.forkOnIO, with
the +RTS -qm -qw options.
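A rough sketch of the pinning idea, written against current GHC rather than the 6.10-era API: in recent versions of base the function is `Control.Concurrent.forkOn` (`forkOnIO` was its earlier name). The capability layout and the name `spawnPinned` are hypothetical choices for illustration, and the threads only report where they ran instead of doing socket work.

```haskell
module Main (main) where

import Control.Concurrent
  (forkOn, getNumCapabilities, myThreadId, threadCapability)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forM_)

-- Hypothetical layout: capability 0 for the accepting thread,
-- capability 1 for every worker.
acceptorCap, workerCap :: Int
acceptorCap = 0
workerCap = 1

-- | Fork a thread pinned to a capability. When it runs, it reports its
-- label, the capability it is on, and whether it is pinned there.
spawnPinned :: Int -> String -> IO (MVar (String, Int, Bool))
spawnPinned cap label = do
  done <- newEmptyMVar
  _ <- forkOn cap $ do
    (actual, isPinned) <- threadCapability =<< myThreadId
    putMVar done (label, actual, isPinned)
  pure done

main :: IO ()
main = do
  caps <- getNumCapabilities
  putStrLn ("capabilities: " ++ show caps)
  acceptor <- spawnPinned acceptorCap "acceptor"
  workers <- forM [1 :: Int .. 3] $ \i ->
    spawnPinned workerCap ("worker " ++ show i)
  forM_ (acceptor : workers) $ \v -> do
    (label, cap, isPinned) <- takeMVar v
    putStrLn (label ++ ": capability " ++ show cap
                    ++ ", pinned " ++ show isPinned)
```

To see two distinct capabilities, build with `-threaded -rtsopts` and run with `+RTS -N2`. `forkOn` interprets the capability number modulo the number of capabilities, so with a single capability every thread lands on capability 0. The `-qm` and `-qw` flags quoted above are from the GHC 6.10 era and may not all exist in current releases.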
Other than that, I'm not sure what to try right now.  We're hoping to get
some better profiling for parallel/concurrent programs in the future, but
it's not ready yet.
Cheers,
   Simon