
On Tue, Nov 27, 2012 at 11:02 AM, Jeff Shaw wrote:
Hello Timothy and others,

One of my clients hosts their HTTP clients in an Amazon cloud, so even when they turn on persistent HTTP connections, they use many connections; usually they end up sending only one HTTP request per TCP connection. My specific problem is that they want a response within 120 ms or so, and at times they are unable to complete a TCP connection in that amount of time. I'm looking at on the order of 100 TCP connections per second, and on the order of 1000 HTTP requests per second (other clients do benefit from persistent HTTP connections).
Once each minute, a thread of my program updates a global state, stored in an IORef and updated with atomicModifyIORef', based on query results obtained via HDBC-odbc. The query results are strict, and atomicModifyIORef' should receive the updated state already evaluated. I reduced the time that query takes from tens of seconds to just a couple, and for some reason that reduced the proportion of TCP timeouts drastically: the approximate before and after TCP timeout proportions are 15% and 5%. I'm not sure why this reduction in timeouts resulted from the query time improving, but the discovery has me on the task of moving all database code out of the main program and into a cron job. My best guess is that HDBC-odbc somehow disrupts other communications while it waits for the DB server to respond.
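To be concrete, the update loop is roughly the sketch below; the AppState type and the runQuery action are illustrative stand-ins for the real query code, not the actual program. The point is that the new state is fully evaluated (via deepseq) before it is swapped into the IORef:

    import Control.Concurrent (threadDelay)
    import Control.DeepSeq (NFData(..), deepseq)
    import Control.Monad (forever)
    import Data.IORef (IORef, atomicModifyIORef')

    -- Stand-in for the real global state built from the query results.
    data AppState = AppState { rows :: [(String, Int)] }

    instance NFData AppState where
        rnf (AppState rs) = rnf rs

    -- Once a minute: run the query, fully evaluate the result, then swap it
    -- into the shared IORef so readers never have to force a thunk themselves.
    refreshLoop :: IORef AppState -> IO AppState -> IO ()
    refreshLoop ref runQuery = forever $ do
        newState <- runQuery
        newState `deepseq` atomicModifyIORef' ref (\_ -> (newState, ()))
        threadDelay (60 * 1000 * 1000)   -- wait one minute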
Have you read section 8.4.2 of the GHC user guide? http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/ffi-ghc.html

Based on that, I would check the FFI imports in your database library. In the best case (-threaded, 'safe' imports, and a thread-safe ODBC), I think you'll find that N of these calls can run concurrently, but here your number of requests is likely to be much greater than N (where N is the number of threads the RTS created with +RTS -N). I'm not sure how to solve your problem, but perhaps this information can help you pinpoint it.

Good luck,
Jason
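P.S. In case it helps to see what to grep for in the bindings, here is a rough sketch of the two kinds of foreign import. The function name and type are made up for illustration (loosely modelled on ODBC's SQLExecDirect), not copied from HDBC-odbc:

    {-# LANGUAGE ForeignFunctionInterface #-}
    import Foreign.C.String (CString)
    import Foreign.C.Types (CInt)
    import Foreign.Ptr (Ptr)

    -- A 'safe' call: with -threaded, the RTS releases the capability while
    -- the C call blocks on the database server, so other Haskell threads
    -- keep running in the meantime.
    foreign import ccall safe "sql.h SQLExecDirect"
      c_execDirectSafe :: Ptr () -> CString -> CInt -> IO CInt

    -- An 'unsafe' call: cheaper to make, but it holds the capability for the
    -- duration of the call, so a slow ODBC round trip stalls any Haskell
    -- threads scheduled on that capability (and without -threaded it stalls
    -- the whole runtime).
    foreign import ccall unsafe "sql.h SQLExecDirect"
      c_execDirectUnsafe :: Ptr () -> CString -> CInt -> IO CInt

If the library's imports turn out to be 'unsafe', or the binary isn't built with -threaded, that would fit the symptom of other traffic stalling while the query runs.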