Reverse DNS lookups on accept in network

I noticed that the accept function in the network library, unlike the underlying C function, does a reverse DNS lookup every time it accepts a connection.
This seems to be the cause of an acute problem: Hackage is nearly unusable for people whose ISP has broken reverse DNS, since every request to the server delays for 30 seconds or more while waiting for the broken reverse DNS server to time out. I know, the ISP should fix it, or the user should switch to a different ISP, but that isn't always practical.
In particular, Roman, our expert from Odessa, is experiencing this problem. And he is hosting a Haskell Hackathon, OdHack, in just a few weeks' time. I am concerned that all participants in the Hackathon might also be susceptible, which would be a Very Bad Thing.
I'll note that nowadays it seems to be widely accepted "best practice" to avoid per-connection RDNS lookup, e.g., by configuring web servers to log IP addresses instead of domain names.
So there are two questions here: one is whether we need a change to the network and/or cgi packages (and possibly others), and the other is how to solve the Hackage problem promptly.
My first thought on the first question is to add a new function acceptRaw or accept' to network that skips the lookup, and then change cgi to use it. But I would also support changing accept itself to skip the lookup always.
Thanks,
Yitz

Answering my own question:
I retract the proposal for changes here.
Although the function Network.accept does an implied reverse DNS lookup, it does so lazily. So the actual lookup should not happen unless the library client actually tries to use the host name.
As for the Hackage problem, this problem is inherent to CGI, which is what Hackage currently uses. The CGI protocol supplies the resolved client host name to the web application in an environment variable. So the web server (Apache in this case) will always have to do a reverse DNS lookup by definition. (Environment variables are strict. Too bad.)
So until we upgrade to a complete rewrite of Hackage (any day now, right?), I guess the only solution is to access Hackage via a proxy on a host whose reverse DNS is working.
Thanks,
Yitz
On Tue, Apr 9, 2013 at 3:54 PM, Yitzchak Gale wrote:
I noticed that the accept function in the network library, unlike the underlying C function, does a reverse DNS lookup every time it accepts a connection.
This seems to be the cause of an acute problem: Hackage is nearly unusable for people whose ISP has broken reverse DNS, since every request to the server delays for 30 seconds or more while waiting for the broken reverse DNS server to time out. I know, the ISP should fix it, or the user should switch to a different ISP, but that isn't always practical.
In particular, Roman, our expert from Odessa, is experiencing this problem. And he is hosting a Haskell Hackathon, OdHack, in just a few weeks' time. I am concerned that all participants in the Hackathon might also be susceptible, which would be a Very Bad Thing.
I'll note that nowadays it seems to be widely accepted "best practice" to avoid per-connection RDNS lookup, e.g., by configuring web servers to log IP addresses instead of domain names.
So there are two questions here: one is whether we need a change to the network and/or cgi packages (and possibly others), and the other is how to solve the Hackage problem promptly.
My first thought on the first question is to add a new function acceptRaw or accept' to network that skips the lookup, and then change cgi to use it. But I would also support changing accept itself to skip the lookup always.
Thanks, Yitz

On 04/09/2013 10:24 AM, Yitzchak Gale wrote:
As for the Hackage problem, this problem is inherent to CGI, which is what Hackage currently uses. The CGI protocol supplies the resolved client host name to the web application in an environment variable. So the web server (Apache in this case) will always have to do a reverse DNS lookup by definition. (Environment variables are strict. Too bad.)
This is not required by the CGI protocol. Apache only provides REMOTE_HOST if the HostnameLookups directive is set to On (the default is Off). So this should be easily fixable.
Anders
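To illustrate the point about REMOTE_HOST being optional, here is a minimal sketch of how a CGI program might cope when the server has not done a reverse lookup. It assumes the usual CGI environment variable names REMOTE_HOST and REMOTE_ADDR; the helper name clientLabel and the "unknown" fallback are hypothetical and not taken from the cgi package.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- Hypothetical helper: prefer the resolved host name when the server
-- supplied a non-empty REMOTE_HOST, otherwise fall back to the numeric
-- REMOTE_ADDR, and to a placeholder if neither is present.
clientLabel :: Maybe String -> Maybe String -> String
clientLabel remoteHost remoteAddr =
  case remoteHost of
    Just name | not (null name) -> name
    _ -> fromMaybe "unknown" remoteAddr

main :: IO ()
main = do
  -- Both variables are read from the environment the web server sets up.
  host <- lookupEnv "REMOTE_HOST"
  addr <- lookupEnv "REMOTE_ADDR"
  putStrLn ("client: " ++ clientLabel host addr)
```

A program written this way does not depend on the server performing any reverse DNS lookup, so it should behave the same whether HostnameLookups is On or Off.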

I wrote:
As for the Hackage problem, this problem is inherent to CGI, which is what Hackage currently uses. The CGI protocol supplies the resolved client host name to the web application in an environment variable.
Anders Kaseorg wrote:
This is not required by the CGI protocol. Apache only provides REMOTE_HOST if the HostnameLookups directive is set to On (the default is Off). So this should be easily fixable.
Interesting, thanks. But it really does seem that Hackage is doing an RDNS lookup for every connection to a CGI app, both Haskell and non-Haskell. I don't have direct access to the server, but the sysadmin says that Apache is configured not to do lookups. And HostnameLookups did come up in the conversation. I'll ask again specifically about HostnameLookups.
Can you think of any other reason that every CGI connection would trigger an RDNS lookup of the remote host?
Thanks,
Yitz

Yitzchak Gale
Although the function Network.accept does an implied reverse DNS lookup, it does so lazily. So the actual lookup should not happen unless the library client actually tries to use the host name.
I've looked at the source code but I don't recognize how the laziness is achieved w.r.t. the RDNS lookup. Here's the relevant source fragment from [1]:

    accept sock@(MkSocket _ AF_INET _ _ _) = do
     ~(sock', (SockAddrInet port haddr)) <- Socket.accept sock
     peer <- catchIO
              (do
                 (HostEntry peer _ _ _) <- getHostByAddr AF_INET haddr
                 return peer
              )
              (\_e -> inet_ntoa haddr)
     handle <- socketToHandle sock' ReadWriteMode
     return (handle, peer, port)

The blocking operation would be 'getHostByAddr', but I don't see any measure to turn that into a lazy I/O operation. What am I overlooking?

[1]: http://hackage.haskell.org/packages/archive/network/2.4.1.2/doc/html/src/Net...
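The distinction being asked about can be sketched without any networking at all. In the example below, fakeLookup is a hypothetical stand-in for a slow reverse DNS lookup that merely counts how often it runs; eagerAccept and lazyAccept are likewise made-up names and are not part of the network library. The sketch shows that an action bound in IO runs even if its result is never used, whereas wrapping it in unsafeInterleaveIO defers it until the result is forced. A lazy pattern such as `~(...)` on the left of `<-` only delays the pattern match, not the action itself.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical stand-in for a slow reverse DNS lookup: it only records
-- how many times it has actually been run.
fakeLookup :: IORef Int -> IO String
fakeLookup counter = do
  modifyIORef' counter (+ 1)
  return "host.example"

-- Eager: the lookup runs as part of the accept-like action, whether or
-- not the caller ever inspects the returned name.
eagerAccept :: IORef Int -> IO String
eagerAccept counter = fakeLookup counter

-- Deferred: unsafeInterleaveIO postpones the lookup until the returned
-- name is first forced.
lazyAccept :: IORef Int -> IO String
lazyAccept counter = unsafeInterleaveIO (fakeLookup counter)

-- Count the lookups performed when the caller ignores the name.
lookupsWhenIgnored :: (IORef Int -> IO String) -> IO Int
lookupsWhenIgnored acceptLike = do
  counter <- newIORef 0
  _name <- acceptLike counter
  readIORef counter

main :: IO ()
main = do
  eager <- lookupsWhenIgnored eagerAccept
  lazy <- lookupsWhenIgnored lazyAccept
  putStrLn ("eager lookups: " ++ show eager)
  putStrLn ("lazy lookups: " ++ show lazy)
```

Under these assumptions the eager variant performs one lookup and the deferred variant performs none. The quoted fragment contains no wrapper of the second kind around getHostByAddr, which is consistent with the doubt raised above.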

On Apr 9, 2013, at 8:54 AM, Yitzchak Gale wrote:
This seems to be the cause of an acute problem: Hackage is nearly unusable for people whose ISP has broken reverse DNS, since every request to the server delays for 30 seconds or more while waiting for the broken reverse DNS server to time out.
So THAT's what was making Hackage totally unusable for me... So many hours (days?) blown on just waiting for it.
-- Darius Jahandarie

Hi Darius,
Hackage is nearly unusable for people whose ISP has broken reverse DNS [...].
So THAT's what was making Hackage totally unusable for me... So many hours (days?) blown on just waiting for it.
you should complain to your ISP about this issue.
Take care,
Peter

Is it an option for you to change to OpenDNS or Google's DNS servers?
On Tue, Apr 9, 2013 at 10:30 AM, Darius Jahandarie wrote:
On Apr 9, 2013, at 8:54 AM, Yitzchak Gale wrote:
This seems to be the cause of an acute problem: Hackage is nearly unusable for people whose ISP has broken reverse DNS, since every request to the server delays for 30 seconds or more while waiting for the broken reverse DNS server to time out.
So THAT's what was making Hackage totally unusable for me... So many hours (days?) blown on just waiting for it.
-- Darius Jahandarie

On Wed, Apr 10, 2013 at 2:57 PM, Jeffrey Shaw wrote:
Is it an option for you to change to OpenDNS or Google's DNS servers?
That unfortunately would be of no help for reverse DNS lookups. I've since switched ISPs though, so it hasn't been a problem for the past few months, thankfully. My old ISP was AT&T, which is rather large, so I imagine this issue could be affecting a lot of people.
-- Darius Jahandarie

On Wed, Apr 10, 2013 at 11:50 AM, Gregory Collins wrote:
Network.Socket.accept doesn't do the reverse lookup.
That is what I use. The Network module itself doesn't get any love because I think converting Sockets to Handles is a bad idea (because I think the current Handle design is wrong). Hence all improvements to the API tend to go into Network.Socket.

participants (8)
- Anders Kaseorg
- Darius Jahandarie
- Gregory Collins
- Herbert Valerio Riedel
- Jeffrey Shaw
- Johan Tibell
- Peter Simons
- Yitzchak Gale