"too many open files" using snap

Hi haskellers,
I wrote a web server using the Snap framework. Basically, it's an API server that uses hdbc-mysql to connect to a MySQL server. It also serves some static web pages and images using the serveDirectory function in snap-core.
The program slowly eats more and more memory, but I couldn't spot the leak. It uses less than 10MB of memory when starting. Sometimes it grows to 60-70MB and stops working correctly. Then I find that the log file is full of "too many open files" errors.
As none of my code opens any files, I suspect serveDirectory is leaking. But I don't have any expertise in the core of Snap, so could someone give me some advice on this problem?
Thanks.
Eric

On Sun, Jan 8, 2012 at 7:25 AM, Eric Wong wrote:
> The program slowly eats more and more memory, but I couldn't spot the leak. [...] Then I find that the log file is full of "too many open files" errors. As none of my code opens any files, I suspect serveDirectory is leaking.
I'm fairly certain there's no leak in Snap in this code, although I will
gladly accept evidence to the contrary -- I've been load-testing
"serveDirectory" at 20,000 qps for a couple of minutes now, with no
increase in the memory resident size.
A "too many open files" error is usually due to running out of file
descriptors for network sockets. On my Linux machine the per-process limit
for file descriptors defaults to 1024, and on my Mac the default is 256.
You can see your current limit by running "ulimit -a" from the terminal.
Each incoming socket occupies a file descriptor, as does every open
connection to your database and every open file. When these run out, calls
to accept() or to open files start failing and you get this error message.
Memory profiling should help you find the leak.
G
--
Gregory Collins
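As a rough illustration of the descriptor accounting described above, the following shell sketch compares how many descriptors a process has open with its soft limit. It assumes Linux with /proc mounted. `fd_usage` is a hypothetical helper name introduced here, and the fallback to `ulimit -n` reports the limit of the current shell, which may differ from the server's.

```shell
#!/bin/sh
# fd_usage PID: hypothetical helper, prints open-descriptor count and soft limit.
fd_usage() {
    pid="$1"
    # Each entry in /proc/PID/fd is one open descriptor (Linux only).
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    if [ -r "/proc/$pid/limits" ]; then
        # The fourth field of the "Max open files" row is the soft limit.
        limit=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")
    else
        # Fallback: the limit of this shell, not necessarily of the target.
        limit=$(ulimit -n)
    fi
    echo "pid $pid: $open open descriptors, soft limit $limit"
}

# Example: inspect the current shell; substitute the server's PID in practice.
fd_usage "$$"
```

Running this periodically against the server's PID would show whether the count creeps toward the limit even when traffic is low.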

Not knowing exactly how you set up your interop with MySQL: is there any chance your application tries to keep too many MySQL connections open? A simple "show processlist;" inside a mysql console should rule out this possibility.
Also, are you reading from or writing to any files at any point independently of the Snap infrastructure? I have had issues in the past with intermittent background file reads not being strict enough and eventually starving the system of file descriptors.
Cheers,
Ozgun
On Sunday, January 8, 2012 at 4:27 PM, Gregory Collins wrote:
> A "too many open files" error is usually due to running out of file descriptors for network sockets. [...]
> Each incoming socket occupies a file descriptor, as does every open connection to your database and every open file.

On Sun, Jan 8, 2012 at 12:27 PM, Gregory Collins wrote:
> A "too many open files" error is usually due to running out of file descriptors for network sockets. On my Linux machine the per-process limit for file descriptors defaults to 1024, and on my Mac the default is 256. You can see your current limit by running "ulimit -a" from the terminal.
If you're on Linux, then you may
$ ls -lah /proc/PID/fd
to see where these open files are pointing to.
HTH,
-- Felipe.

On Sun, Jan 8, 2012 at 09:50, Felipe Almeida Lessa wrote:
> If you're on Linux, then you may
> $ ls -lah /proc/PID/fd
> to see where these open files are pointing to.
And on other systems lsof can determine this information.
$ lsof -p PID
-- brandon s allbery allbery.b@gmail.com wandering unix systems administrator (available) (412) 475-9364 vm/sms
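To make the listing easier to read when there are hundreds of descriptors, the targets can be grouped by kind. This is only a sketch under the assumption of a Linux /proc layout, where each descriptor is a symlink to something like `socket:[12345]`, `pipe:[678]`, or a file path. `classify_fds` and `list_fd_targets` are hypothetical helper names.

```shell
#!/bin/sh
# classify_fds: reads one descriptor target per line on stdin and prints
# a count per kind, sorted by kind name.
classify_fds() {
    awk '
        /^socket:/     { n["socket"]++; next }
        /^pipe:/       { n["pipe"]++; next }
        /^anon_inode:/ { n["anon_inode"]++; next }
                       { n["file"]++ }
        END { for (k in n) print k, n[k] }
    ' | sort
}

# list_fd_targets PID: prints the symlink target of every descriptor
# of the given process (Linux only; unreadable entries are skipped).
list_fd_targets() {
    for fd in /proc/"$1"/fd/*; do
        readlink "$fd" 2>/dev/null
    done
}

# Example: summarize the current shell; use the server's PID in practice.
list_fd_targets "$$" | classify_fds
```

A summary dominated by the socket line points at network connections (HTTP or database) rather than at files served from disk.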

I ran both of these commands and found that sockets are leaking. Most of them are TCP connections in the CLOSE_WAIT state. There are also some sockets marked "can't identify protocol".
So, what's the problem? Is it related to the timeout in the server config? I'm using the default value.
On 2012-1-8, at 10:56 PM, Brandon Allbery wrote:
> And on other systems lsof can determine this information.
> $ lsof -p PID

2012/1/8 Eric Wong wrote:
> I ran both of these commands and found that sockets are leaking. Most of them are TCP connections in the CLOSE_WAIT state. There are also some sockets marked "can't identify protocol".
> So, what's the problem? Is it related to the timeout in the server config? I'm using the default value.
Since they're sockets, it could be either the database connections or the HTTP connections, right? Is there some way you can insert a fake database layer to see if that makes your problem go away, to try and isolate the problem?
Antoine

2012/1/9 Eric Wong wrote:
> I ran both of these commands and found that sockets are leaking. Most of them are TCP connections in the CLOSE_WAIT state. There are also some sockets marked "can't identify protocol".
> So, what's the problem? Is it related to the timeout in the server config? I'm using the default value.
CLOSE_WAIT indicates that the other side has closed its side of the
connection and the OS is waiting for you to close() the socket. At this
point I'd start looking at your database layer.
G
--
Gregory Collins
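To see how many of a process's TCP sockets sit in each state, the lsof output can be tallied. This is a sketch that assumes lsof's usual layout, where the state appears in parentheses at the end of each TCP line. `count_tcp_states` is a hypothetical helper name.

```shell
#!/bin/sh
# count_tcp_states: reads lsof output on stdin and prints "COUNT STATE"
# for each TCP state seen, most frequent first.
count_tcp_states() {
    awk '
        /TCP/ && match($0, /\([A-Z_0-9]+\)$/) {
            # Strip the surrounding parentheses from e.g. "(CLOSE_WAIT)".
            state = substr($0, RSTART + 1, RLENGTH - 2)
            n[state]++
        }
        END { for (s in n) print n[s], s }
    ' | sort -rn
}

# Usage (PID is the server's process id):
#   lsof -nP -a -p PID -i TCP | count_tcp_states
```

A CLOSE_WAIT count that keeps growing would suggest that the process is not calling close() on connections the peer has already shut down. Comparing the remote ports on those lines with the database port is one way to tell HTTP clients from database connections.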

Yes, I know now that it's not related to the static files. But do I have to close the socket myself in the Snap monad? Shouldn't it be closed by snap-server?
On 2012-1-9, at 2:29 PM, Gregory Collins wrote:
> CLOSE_WAIT indicates that the other side has closed its side of the connection and the OS is waiting for you to close() the socket. At this point I'd start looking at your database layer.

2012/1/9 Eric Wong wrote:
> Yes, I know now that it's not related to the static files. But do I have to close the socket myself in the Snap monad? Shouldn't it be closed by snap-server?
Without knowing what your code looks like one way or the other, it's hard
to tell. Snap does close the sockets it opens, of course.
G
--
Gregory Collins

I forgot to mention: I am using HTTPS to serve all the APIs. And in the error.log file I get lots of ConnectionAbruptlyTerminated exceptions, which I think are defined in OpenSSL.
On 2012-1-9, at 2:48 PM, Gregory Collins wrote:
> Without knowing what your code looks like one way or the other, it's hard to tell. Snap does close the sockets it opens, of course.

This is very helpful. I found that currently almost all of the open files are sockets. But the server doesn't have that much traffic, so it seems it's leaking socket file descriptors.
On 2012-1-8, at 10:50 PM, Felipe Almeida Lessa wrote:
> If you're on Linux, then you may
> $ ls -lah /proc/PID/fd
> to see where these open files are pointing to.
participants (6)
- Antoine Latter
- Brandon Allbery
- Eric Wong
- Felipe Almeida Lessa
- Gregory Collins
- Ozgun Ataman