Re: HXT 7.4/7.5 leaking TCP connections

Hi, Bjorn and Uwe --
this is a known problem with the HTTP package (version 3001.0.4). Paul Brown has described this somewhere in his blog (http://mult.ifario.us/t/haskell), but my Firefox only shows an incomplete page of this blog; the solution is missing. Paul promised in his blog to send a patch to Björn Bringert.
Here's the link to the article about SimpleHTTP; for some reason, it wasn't showing up. (Guess I have a bug to fix there...)
http://mult.ifario.us/p/a-short-adventure-with-simplehttp
P.S. to Björn: It would be nice, if you could include this change into the HTTP module and upload a new version to hackage.
Thank you Daniel and Uwe for reporting this (again). Strangely, I thought that I had already applied Paul's patch, but it seems I haven't. Paul: I seem to recall us talking about this, but can't remember the conclusion, nor find any e-mail about it. Do you remember anything about this?
I can't remember either; I'm sure that the darcs patch is lurking on one of my boxes. I'll track it down and send it your way so we can squash this one.
Best.
-- Paul
-- paulrbrown@gmail.com
http://mult.ifario.us/

Hello all. On Wed, 09 Apr 2008 05:36:40 Paul Brown wrote:
Hi, Bjorn and Uwe --
this is a known problem with the HTTP package (version 3001.0.4). Paul Brown has described this somewhere in his blog (http://mult.ifario.us/t/haskell), but my Firefox only shows an incomplete page of this blog; the solution is missing. Paul promised in his blog to send a patch to Björn Bringert.
I'm not so sure it's due to the bug that Paul has patched. http was my first suspect, and I managed to find a Google cache of Paul's page, which wouldn't load for me.

I've tried patching http with both patches on that page (one by Paul, one suggested in a comment). I then incremented http's version and rebuilt, changed the dependencies in HXT 7.5 to require that version of http and rebuilt, then required those versions in my test program. It still leaks TCP connections.

I think that means that everything is using the patched http. Can I query what installed packages are linked against somehow?

Paul's patch looks to ensure connections are closed when there is an exception; I'm leaking (I think) every TCP connection.

Here's the test code again for the -cafe. Is there something obvious wrong with it?

====================
module Main where

import Control.Exception (bracket)
import Control.Monad (unless)
import Text.XML.HXT.Arrow
import System.IO (IOMode(..), hClose, hGetLine, hIsEOF, openFile)

--/tmp/list.txt is a list of files on a local webserver.

main = bracket (openFile "/tmp/list.txt" ReadMode) hClose processAll

processAll hdl = do
    b <- hIsEOF hdl
    if b then return "done" else processNext hdl

processNext hdl = hGetLine hdl >>= process >> processAll hdl

process s = runX (getXmlPageBad (buildURL s))
--process s = runX (getXmlPageOK (buildURL s))

buildURL = ("http://localhost.localdomain/" ++)

getXmlPageOK = readDocument [ (a_use_curl, "1"), (a_parse_html, "1"), (a_encoding, isoLatin1), (a_issue_warnings, "0") ]

getXmlPageBad = readDocument [ (a_parse_html, "1"), (a_encoding, isoLatin1), (a_issue_warnings, "0") ]
===================

A related question is: what does an open TCP connection that gets garbage collected turn into? I've been seeing lines like the following in lsof, and they still count toward the open file limit. I suspect they're garbage-collected connections.

test 28300 daniel 1023u sock 0,4 13900773 can't identify protocol
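
To make the distinction in this message concrete (closing a connection only when an exception occurs versus closing it on every path), here is a rough sketch. It is not Paul's patch and not the http package's code: the "connection" is only a counter in an IORef, and the names withConnOnError, withConnAlways and leakedAfter are made up for illustration. Only bracket and onException from Control.Exception are real library functions.

```haskell
module Main where

import Control.Exception (bracket, onException)
import Control.Monad (replicateM_)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in for a socket: the IORef counts "open connections".

-- Closes the connection only if the action throws an exception.
withConnOnError :: IORef Int -> IO a -> IO a
withConnOnError open act = do
  modifyIORef open (+ 1)                           -- "open" the connection
  act `onException` modifyIORef open (subtract 1)  -- "close" on failure only

-- Closes the connection on every path, normal return included.
withConnAlways :: IORef Int -> IO a -> IO a
withConnAlways open act =
  bracket (modifyIORef open (+ 1))                 -- acquire
          (\_ -> modifyIORef open (subtract 1))    -- always release
          (\_ -> act)

-- Run n successful requests and report how many connections remain open.
leakedAfter :: (IORef Int -> IO () -> IO ()) -> Int -> IO Int
leakedAfter withConn n = do
  open <- newIORef 0
  replicateM_ n (withConn open (return ()))
  readIORef open

main :: IO ()
main = do
  onError <- leakedAfter withConnOnError 10
  always  <- leakedAfter withConnAlways 10
  putStrLn ("close on exception only: " ++ show onError ++ " left open")
  putStrLn ("close on every path:     " ++ show always ++ " left open")
```

In this toy model, ten successful requests leave ten connections open with the first wrapper and none with the second, which is the pattern one would expect if every connection leaks rather than only those that hit an exception.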
Here's the link to the article about SimpleHTTP; for some reason, it wasn't showing up. (Guess I have a bug to fix there...)
http://mult.ifario.us/p/a-short-adventure-with-simplehttp
P.S. to Björn: It would be nice, if you could include this change into the HTTP module and upload a new version to hackage.
Thank you Daniel and Uwe for reporting this (again). Strangely, I thought that I had already applied Paul's patch, but it seems I haven't. Paul: I seem to recall us talking about this, but can't remember the conclusion, nor find any e-mail about it. Do you remember anything about this?
I can't remember either; I'm sure that the darcs patch is lurking on one of my boxes. I'll track it down and send it your way so we can squash this one.
As I said, I've patched the latest darcs code with both of the suggested patches on Paul's page. I can send either or both through to you if you would like, Björn.
Cheers
Daniel

I've looked into this further and I believe the leaked connections are due to Network.Browser. This is a separate issue from the one identified by Paul Brown.

The BrowserState in Network.Browser has a connection pool of up to five connections. When a sixth is opened, the oldest connection is closed. This looks to be the only time that a connection is closed. BrowserState's internals are not exported, so there is no way for a user to close them.

The net effect for HXT is that every time readDocument is called using native http, a single TCP connection is leaked. I've attached a patch against the darcs version of http that cures my test program's leak.
Cheers
Daniel
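
The pool behaviour described above can be modelled with a small pure sketch. This is not the Network.Browser code: Conn, poolLimit, openConn and runRequests are hypothetical names, and connections are just labels. It also assumes that each readDocument call starts from a fresh, empty pool, which is what the one-leak-per-call observation suggests but which is not confirmed here.

```haskell
module Main where

-- Hypothetical pure model of a bounded connection pool; these names are
-- illustrative and are not the Network.Browser API.
type Conn = String  -- a connection, identified by a label

poolLimit :: Int
poolLimit = 5

-- Add a connection to the pool (newest first). If the pool is already
-- full, the oldest connection is evicted and returned so it can be closed.
openConn :: Conn -> [Conn] -> ([Conn], Maybe Conn)
openConn c pool
  | length pool < poolLimit = (c : pool, Nothing)
  | otherwise               = (c : init pool, Just (last pool))

-- Open each connection in turn. The result is the final pool and the
-- list of connections that were closed by eviction, oldest first.
runRequests :: [Conn] -> ([Conn], [Conn])
runRequests = foldl step ([], [])
  where
    step (pool, closed) c =
      case openConn c pool of
        (pool', Nothing)  -> (pool', closed)
        (pool', Just old) -> (pool', closed ++ [old])

main :: IO ()
main = do
  -- One long-lived pool: only the sixth and later opens close anything.
  let (open7, closed7) = runRequests ["c1", "c2", "c3", "c4", "c5", "c6", "c7"]
  putStrLn ("seven requests, closed:     " ++ show closed7)
  putStrLn ("seven requests, still open: " ++ show open7)
  -- A fresh pool per call (assumed): nothing is ever evicted or closed.
  let (open1, closed1) = runRequests ["c1"]
  putStrLn ("one request, closed:        " ++ show closed1)
  putStrLn ("one request, still open:    " ++ show open1)
```

With seven requests in one pool, c1 and c2 are closed by eviction and the five newest stay open. With a single request in a fresh pool nothing is closed, which matches one leaked connection per call in this model.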
participants (2)
- Daniel McAllansmith
- Paul Brown