Can a GC delay TCP connection formation?

Hello, I've run into an issue that makes me think that when the GHC GC runs while a Snap or Warp HTTP server is serving connections, the GC prevents or delays TCP connections from forming. My application requires that TCP connections form within a few tens of milliseconds. I'm wondering if anyone else has run into this issue, and if there are some GC flags that could help. I've tried a few, such as -H and -c, and haven't found anything to help. I'm using GHC 7.4.1.
Thanks, Jeff

Jeff Shaw
I've run into an issue that makes me think that when the GHC GC runs while a Snap or Warp HTTP server is serving connections, the GC prevents or delays TCP connections from forming. My application requires that TCP connections form within a few tens of milliseconds. I'm wondering if anyone else has run into this issue, and if there are some GC flags that could help. I've tried a few, such as -H and -c, and haven't found anything to help. I'm using GHC 7.4.1.
When you compile with -threaded and run on multiple threads, then the runtime uses parallel GC. Did you try that?
Greets, Ertugrul
--
Not to be or to be and (not to be or to be and (not to be or to be and (not to be or to be and ... that is the list monad.

GHC has a "stop the world" garbage collector, meaning that while a major
GC is happening, the entire process must be halted. In my experience
GC pause times are typically low, but depending on the heap residency
profile of your application (and the quantity of garbage being
produced by it), this may not be the case. If you have a hard
real-time requirement then a garbage-collected language may not be
appropriate for you.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
--
Gregory Collins
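One way to check whether GC pauses are really the culprit is to ask the runtime for its own numbers. The following is a minimal sketch, assuming a modern GHC (base 4.10 or later, where GHC.Stats exposes getRTSStats; GHC 7.4.1 as used in this thread has an older interface) and a program started with +RTS -T. The helper names meanPauseMs and reportGC are made up for illustration.

```haskell
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Mean GC pause in milliseconds, computed from the total GC wall-clock
-- time (in nanoseconds) and the number of collections.
-- Returns Nothing when no collection has run yet.
meanPauseMs :: Integer -> Integer -> Maybe Double
meanPauseMs _ 0 = Nothing
meanPauseMs totalNs n = Just (fromIntegral totalNs / fromIntegral n / 1.0e6)

-- | Print a short GC summary, or a hint if RTS stats are not enabled.
reportGC :: IO ()
reportGC = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats disabled; run with +RTS -T"
    else do
      s <- getRTSStats
      putStrLn ("collections:       " ++ show (gcs s))
      putStrLn ("major collections: " ++ show (major_gcs s))
      putStrLn ("mean pause (ms):   "
                ++ show (meanPauseMs (fromIntegral (gc_elapsed_ns s))
                                     (fromIntegral (gcs s))))
```

A mean hides the worst case, so for a latency budget of a few tens of milliseconds the per-pause detail from an eventlog viewed in ThreadScope is likely to be more telling than this summary.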

Kazu and Andreas, could this be IO manager related?

Could you give us more info on what your constraints are? Is it necessary
that you have a certain number of connections per second, or is it necessary
that the connection results very quickly after some other message is
received?
---------- Original message ----------
From: Johan Tibell

Hello Timothy and others,
One of my clients hosts their HTTP clients in an Amazon cloud, so even when they turn on persistent HTTP connections, they use many connections. Usually they only end up sending one HTTP request per TCP connection. My specific problem is that they want a response in 120 ms or so, and at times they are unable to complete a TCP connection in that amount of time. I'm looking at roughly 100 TCP connections per second and roughly 1000 HTTP requests per second (other clients do benefit from persistent HTTP connections).
Once each minute, a thread of my program updates a global state, stored in an IORef, and updated with atomicModifyIORef', based on query results via HDBC-odbc. The query results are strict, and atomicModifyIORef' should receive the updated state already evaluated. I reduced the amount of time that query took from tens of seconds to just a couple, and for some reason that reduced the proportion of TCP timeouts drastically. The approximate before and after TCP timeout proportions are 15% and 5%. I'm not sure why this reduction in timeouts resulted from the query time improving, but this discovery has me on the task of moving all database code out of the main program and into a cron job. My best guess is that HDBC-odbc somehow disrupts other communications while it waits for the DB server to respond.
To respond to Ertugrul, I'm compiling with -threaded, and running with +RTS -N.
I hope this helps describe my problem. I can probably come up with some hard information if requested, e.g. ThreadScope output.
Jeff
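The once-a-minute refresh described above might look roughly like the following. This is a minimal sketch, not Jeff's actual code: the State type, the fetch action standing in for the HDBC-odbc query, and the names refreshOnce and refreshLoop are all hypothetical.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (forever)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- | Hypothetical shared state; the real application's type is not shown
-- in the thread.
newtype State = State [Int] deriving (Eq, Show)

-- | Run the (possibly slow) fetch outside the IORef update, force the
-- result, then swap it in atomically. Readers never wait on the fetch.
refreshOnce :: IO State -> IORef State -> IO ()
refreshOnce fetch ref = do
  new@(State xs) <- fetch
  -- Force the whole list before publishing it, so that request handlers
  -- never pay for evaluating it.
  sum xs `seq` atomicModifyIORef' ref (\_ -> (new, ()))

-- | Refresh forever, sleeping between runs (the delay is in microseconds).
refreshLoop :: Int -> IO State -> IORef State -> IO ()
refreshLoop delayUs fetch ref = forever $ do
  refreshOnce fetch ref
  threadDelay delayUs
```

In an application this would typically be started with something like forkIO (refreshLoop 60000000 queryDb ref), where queryDb is whatever action runs the query. Note that this only keeps readers from blocking on the IORef; if the fetch makes a blocking foreign call, whether other Haskell threads keep running depends on how that call is imported.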

On Tue, Nov 27, 2012 at 11:02 AM, Jeff Shaw
Hello Timothy and others, One of my clients hosts their HTTP clients in an Amazon cloud, so even when they turn on persistent HTTP connections, they use many connections. Usually they only end up sending one HTTP request per TCP connection. My specific problem is that they want a response in 120 ms or so, and at times they are unable to complete a TCP connection in that amount of time. I'm looking at on the order of 100 TCP connections per second, and on the order of 1000 HTTP requests per second (other clients do benefit from persistent HTTP connections).
Once each minute, a thread of my program updates a global state, stored in an IORef, and updated with atomicModifyIORef', based on query results via HDBC-odbc. The query results are strict, and atomicModifyIORef' should receive the updated state already evaluated. I reduced the amount of time that query took from tens of seconds to just a couple, and for some reason that reduced the proportion of TCP timeouts drastically. The approximate before and after TCP timeout proportions are 15% and 5%. I'm not sure why this reduction in timeouts resulted from the query time improving, but this discovery has me on the task of removing all database code from the main program and into a cron job. My best guess is that HDBC-odbc somehow disrupts other communications while it waits for the DB server to respond.
Have you read section 8.4.2 of the GHC user guide?
http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/ffi-ghc.html
Based on that I would check the FFI imports in your database library. In the best case (-threaded, 'safe', and thread-safe odbc), I think you'll find that N of these can run concurrently, but here your number of requests is likely to be much greater than N (where N is the number of threads the RTS created with +RTS -N). I'm not sure how to solve your problem, but perhaps this information can help you pinpoint it.
Good luck, Jason

On Tue, Nov 27, 2012 at 11:17 AM, Jason Dagit
Have you read section 8.4.2 of the ghc user guide? http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/ffi-ghc.html
Ahem, I meant *8.2.4*.

On 11/27/12 2:17 PM, Jason Dagit wrote:
Based on that I would check the FFI imports in your database library. In the best case (-threaded, 'safe', and thread-safe odbc), I think you'll find that N of these can run concurrently, but here your number of requests is likely to be much greater than N (where N is the number of threads the RTS created with +RTS -N).
HDBC-odbc has long used the wrong type of FFI imports, resulting in long-running database queries potentially blocking all other IO. I just checked, and apparently a patch was made to the repo in September that finally fixes this [1], but a new release has yet to be uploaded to Hackage. In any case, if you try to install it from the repo, this may at least solve some of your problems.
[1] https://github.com/hdbc/hdbc-odbc/commit/7299d3441ce2e1d5a485fe79b37540c0a44...
--Gershom
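For readers unfamiliar with the distinction, the "type of FFI import" is the safe or unsafe annotation on a foreign import declaration. The sketch below uses the POSIX sleep function from unistd.h as a stand-in for a long-running ODBC call; that choice is an assumption for illustration, as are the names sleepSafe and sleepUnsafe. The actual HDBC-odbc imports are not reproduced here.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CUInt (..))

-- | 'safe' import: the RTS releases the capability for the duration of
-- the call, so with -threaded other Haskell threads keep running.
foreign import ccall safe "unistd.h sleep"
  sleepSafe :: CUInt -> IO CUInt

-- | 'unsafe' import: cheaper to call, but the calling capability is held
-- until the C function returns, which can stall other Haskell threads
-- and delay garbage collection.
foreign import ccall unsafe "unistd.h sleep"
  sleepUnsafe :: CUInt -> IO CUInt
```

Under +RTS -N4, for example, four concurrent sleepUnsafe 10 calls would occupy every capability for ten seconds, whereas the same calls through sleepSafe leave the scheduler free to run other Haskell threads, such as the ones accepting connections.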

On 11/27/2012 2:45 PM, Gershom Bazerman wrote:
HDBC-odbc has long used the wrong type of FFI imports, resulting in long-running database queries potentially blocking all other IO. I just checked, and apparently a patch was made to the repo in September that finally fixes this [1], but apparently a new release has yet to be uploaded to hackage. In any case, if you try to install it from the repo, this may at least solve some of your problems.
[1] https://github.com/hdbc/hdbc-odbc/commit/7299d3441ce2e1d5a485fe79b37540c0a44...
--Gershom
Gershom, thanks for pointing this out. I've checked out the latest hdbc-odbc code, and I'll see if there's an improvement.
Jeff

On Tue, Nov 27, 2012 at 9:45 PM, Jeff Shaw
Gershom, Thanks for pointing this out. I've checked out the latest hdbc-odbc code, and I'll see if there's an improvement.
Hi, I'm the maintainer of HDBC. I haven't yet released this code since it hasn't yet been fully tested. However, if you're happy with it, I'll push the version with proper FFI bindings up to Hackage.
Nick

On 11/27/2012 4:59 PM, Nicolas Wu wrote:
Hi, I'm the maintainer of HDBC. I haven't yet released this code since it hasn't yet been fully tested. However, if you're happy with it, I'll push the version with proper FFI bindings up to Hackage.
Nick
Nick, I pulled the latest version of HDBC-odbc, and it appears to be working MUCH better than before. I now have 0% timeouts from httperf with 50 connections/second and timeout set to 0.1 seconds. It's looking like the safe imports vastly improved IO blocking. I haven't seen any new problems since the new version went live.
Jeff

On Wed, Nov 28, 2012 at 8:36 PM, Jeff Shaw
I pulled the latest version of HDBC-odbc, and it appears to be working MUCH better than before. I now have 0% timeouts from httperf with 50 connections/second and timeout set to 0.1 seconds. It's looking like the safe imports vastly improved IO blocking. I haven't seen any new problems since the new version went live.
That's great to hear! I'll aim to push this version to Hackage over the weekend.
Nick

Jeff
Are you certain that all of the delay can be laid at the door of the GHC runtime?
How much of the end-to-end delay budget is being allocated to you? I recently moved a static website from a 10-year-old server in telehouse into AWS in Ireland and watched the access time (an HTTP GET timing the top index page) increase by 150 ms.
Neil

Jeff, this is somewhat off topic, but interesting. Are "telehouse" and AWS
physically close? Was this latency increase not expected due to geography?
Alexander

No - the difference is 6.5ms each way

On Tue, Nov 27, 2012 at 11:02 AM, Jeff Shaw
Once each minute, a thread of my program updates a global state, stored in an IORef, and updated with atomicModifyIORef', based on query results via HDBC-odbc.
Incidentally, what kind of database are you talking to? Issues of FFI correctness aside, HDBC is in general terribly slow compared to some of the more DB-specific bindings.

On 11/30/2012 1:29 PM, Bryan O'Sullivan wrote:
On Tue, Nov 27, 2012 at 11:02 AM, Jeff Shaw wrote:
Once each minute, a thread of my program updates a global state, stored in an IORef, and updated with atomicModifyIORef', based on query results via HDBC-odbc.
Incidentally, what kind of database are you talking to? Issues of FFI correctness aside, HDBC is in general terribly slow compared to some of the more DB-specific bindings.
I'm connecting to an MS SQL Server 2008 R2 DBMS via FreeTDS.
participants (11)
- Alexander Kjeldaas
- Bryan O'Sullivan
- Ertugrul Söylemez
- Gershom Bazerman
- Gregory Collins
- Jason Dagit
- Jeff Shaw
- Johan Tibell
- Neil Davies
- Nicolas Wu
- timothyhobbs@seznam.cz