On Tue, Apr 1, 2014 at 11:21 AM, Pierre-Étienne Meunier <pierreetienne.meunier@gmail.com> wrote:
Hello cafe,

I’m trying to run a server to synchronize a bunch of machines, and one of the threads keeps crashing every ~50 hours.

You seem to be throwing away the exception messages, so it will be difficult to diagnose why.
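A minimal sketch of what keeping the messages could look like, using only base. The name forkLogged and the label argument are made up for illustration and are not part of your code; the idea is just to print the exception before the thread exits rather than returning () from the handler.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Exception (SomeException, try)
import System.IO (hPutStrLn, stderr)

-- | Hypothetical helper: fork a worker and report whatever exception
-- kills it on stderr, instead of silently discarding it.
forkLogged :: String -> IO () -> IO ThreadId
forkLogged label action = forkIO $ do
  r <- try action
  case r of
    -- SomeException also matches asynchronous exceptions such as
    -- ThreadKilled; we log them too and then let the thread end.
    Left e   -> hPutStrLn stderr (label ++ " died: " ++ show (e :: SomeException))
    Right () -> return ()
```

Matching on SomeException rather than IOException means a failure of a type you did not expect still leaves a trace in the log.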

server::Config -> MVar State -> IO ()
server config state=withSocketsDo $ do
 installHandler sigPIPE Ignore Nothing
 threads<-Sem.new $ maxThreads config
 forever $ do
       E.catch (bracket (listenOn $ port config)) sClose $
              \sock->forever $ do
                    (s,a,_)<-accept sock

There is a race condition here: you can get an asynchronous exception at any time after accepting the socket, which would leak it open. I'm guessing you might be running out of file descriptors. This whole section should be run under "mask", and you should only unmask async exceptions once you know you've installed a handler to clean up afterwards.
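Roughly what I have in mind, as a sketch rather than a drop-in fix. It uses only base and is abstracted over the resource: acquire stands in for your "accept sock", release for closing the accepted connection, and handler for "reply state s a". All three parameter names, and acceptLoop itself, are placeholders of mine.

```haskell
import Control.Concurrent (forkIOWithUnmask)
import Control.Exception (finally, mask_, onException)
import Control.Monad (forever, void)

-- | Accept loop in which an accepted connection is always released.
-- 'acquire', 'release' and 'handler' are hypothetical parameters.
acceptLoop :: IO conn -> (conn -> IO ()) -> (conn -> IO ()) -> IO ()
acceptLoop acquire release handler = forever $ mask_ $ do
  -- Async exceptions are masked from here on. A blocking accept is
  -- still interruptible, but only while no connection has been obtained.
  conn <- acquire
  void (forkIOWithUnmask (\unmask ->
          -- The child starts masked and unmasks only around the real
          -- work, so the 'finally' is in place before anything can arrive.
          unmask (handler conn) `finally` release conn))
    -- If the fork itself fails, release the connection in the parent.
    `onException` release conn
```

The mask is dropped between iterations of forever, so the accepting thread can still be killed; it just cannot be killed while it holds a connection that nobody else is responsible for yet.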
 
                    wait threads
                    forkIO $ (reply state s a)`finally`(signal threads)`E.catch`(\e->let _=e::IOException in return ())

Try "forkIOWithUnmask"; the same advice applies in the child thread.
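For the child thread, something along these lines, again only a sketch. I am using QSem from base as a stand-in for your Sem module, and spawnWorker, work and cleanup are names I made up; work would be your "reply state s a" and cleanup would close the accepted socket.

```haskell
import Control.Concurrent (ThreadId, forkIOWithUnmask)
import Control.Concurrent.QSem (QSem, signalQSem)
import Control.Exception (finally)

-- | Hypothetical replacement for the forkIO line. Call it with async
-- exceptions masked, after the wait on the semaphore has succeeded.
spawnWorker :: QSem -> IO () -> IO () -> IO ThreadId
spawnWorker slots work cleanup =
  forkIOWithUnmask $ \unmask ->
    -- Only 'work' runs unmasked. The slot is given back even if
    -- 'cleanup' itself throws.
    unmask work `finally` (cleanup `finally` signalQSem slots)
```

Because the child inherits the masked state, there is no window between the thread starting and the finally being installed in which the semaphore slot could be lost.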

G
--
Gregory Collins <greg@gregorycollins.net>