questions about ResponseEnumerator

Hello,

I have two questions about ResponseEnumerator.

1) If an Application opens a file Handle and uses enumHandle to return a ResponseEnumerator, how does the Application close the Handle?

2) It seems to me that Warp pauses the timer while running the Application. If we want to set a timer on the Application for security reasons, what is the best way to do so?

Thanks.

P.S. Background: a mighttpd process sometimes dies. I'm not sure, but I suspect that the CGI part is doing something wrong.

--Kazu

On Wed, Oct 5, 2011 at 8:47 AM, Kazu Yamamoto
Hello,
I have two questions about ResponseEnumerator.
1) If an Application opens a file Handle and uses enumHandle to return a ResponseEnumerator, how does the Application close the Handle?
The ResponseEnumerator itself is in charge of the entire process, including running the Iteratee. So in pseudo-code, you'd do something like:

    ResponseEnumerator $ \mkiter -> do
        handle <- getHandle
        run_ $ enumHandle handle $ mkiter status headers
        hClose handle
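Fleshed out a little more, the pseudo-code might look like this against the wai 0.4-era API (a sketch only: the Network.Wai constructor name, the Builder-valued iteratee, and the enumerator/blaze-builder plumbing are assumptions, and the 4096 buffer size is arbitrary). Using bracket guarantees the Handle is closed even if the Iteratee throws:

```haskell
import Control.Exception (bracket)
import System.IO (IOMode (ReadMode), hClose, openFile)

import Blaze.ByteString.Builder (fromByteString)
import Data.Enumerator (joinI, run_, ($$))
import qualified Data.Enumerator.Binary as EB
import qualified Data.Enumerator.List as EL
import Network.Wai (Response (ResponseEnumerator))

-- Sketch: stream a file through the response iteratee, guaranteeing
-- hClose runs whether the iteratee finishes normally or throws
-- (e.g. the client disconnects mid-response).
response fp status headers = ResponseEnumerator $ \mkiter ->
    bracket (openFile fp ReadMode) hClose $ \h ->
        run_ $ EB.enumHandle 4096 h
            $$ joinI (EL.map fromByteString $$ mkiter status headers)
```

The key point is that hClose sits in the release action of bracket rather than after run_, so an exception inside the Iteratee cannot leak the Handle.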
2) It seems to me that Warp pauses the timer while running the Application. If we want to set a timer on the Application for security reasons, what is the best way to do so?
I think we could make that functionality optional, based on an extra setting parameter. Would this just be boolean, or is more sophisticated control required?

Michael
Thanks.
P.S.
Background: a mighttpd process sometimes dies. I'm not sure, but I suspect that the CGI part is doing something wrong.
--Kazu
_______________________________________________
web-devel mailing list
web-devel@haskell.org
http://www.haskell.org/mailman/listinfo/web-devel

Hello Michael,
The ResponseEnumerator itself is in charge of the entire process, including running the Iteratee. So in pseudo-code, you'd do something like:
    ResponseEnumerator $ \mkiter -> do
        handle <- getHandle
        run_ $ enumHandle handle $ mkiter status headers
        hClose handle
Understood. Thanks.
I think we could make that functionality optional, based on an extra setting parameter. Would this just be boolean, or is more sophisticated control required?
What I want to do is to prevent a bad guy from abusing CGI. So I guess a boolean is enough.

--Kazu

On Wed, Oct 5, 2011 at 9:08 AM, Kazu Yamamoto
Hello Michael,
The ResponseEnumerator itself is in charge of the entire process, including running the Iteratee. So in pseudo-code, you'd do something like:
    ResponseEnumerator $ \mkiter -> do
        handle <- getHandle
        run_ $ enumHandle handle $ mkiter status headers
        hClose handle
Understood. Thanks.
I think we could make that functionality optional, based on an extra setting parameter. Would this just be boolean, or is more sophisticated control required?
What I want to do is to prevent a bad guy from abusing CGI. So I guess a boolean is enough.
Alright, here's a first stab[1]. What do you think?

Michael

[1] https://github.com/yesodweb/wai/commit/d2b6c66abef939bb1396d576e7541b711a6db...

Hello Michael,
I think we could make that functionality optional, based on an extra setting parameter. Would this just be boolean, or is more sophisticated control required?
What I want to do is to prevent a bad guy from abusing CGI. So I guess a boolean is enough.
Alright, here's a first stab[1]. What do you think?
Michael
[1] https://github.com/yesodweb/wai/commit/d2b6c66abef939bb1396d576e7541b711a6db...
Mighttpd executes a sub-process and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the sub-process and close the pipes with this scheme. I would like to register a house-keeping action with Warp's timer.

--Kazu

On Thu, Oct 6, 2011 at 3:52 AM, Kazu Yamamoto
Hello Michael,
I think we could make that functionality optional, based on an extra setting parameter. Would this just be boolean, or is more sophisticated control required?
What I want to do is to prevent a bad guy from abusing CGI. So I guess a boolean is enough.
Alright, here's a first stab[1]. What do you think?
Michael
[1] https://github.com/yesodweb/wai/commit/d2b6c66abef939bb1396d576e7541b711a6db...
Mighttpd executes a sub-process and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the sub-process and close the pipes with this scheme.
I would like to register a house-keeping action with Warp's timer.
So it sounds like instead of the solution we just put in, we should just expose the ability to use Warp's timeout code directly. This shouldn't be a problem:

* Expose the Timeout module (maybe in its own package, could be useful to others)
* Add an extra settingsTimeoutManager :: IO Manager. That way you can create the manager in Mighttpd and then reuse it in Warp.

Would this address the issue?

Michael

Mighttpd executes a sub-process and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the sub-process and close the pipes with this scheme.
I would like to register a house-keeping action with Warp's timer.
So it sounds like instead of the solution we just put in, we should just expose the ability to use Warp's timeout code directly. This shouldn't be a problem:
* Expose the Timeout module (maybe in its own package, could be useful to others)
* Add an extra settingsTimeoutManager :: IO Manager. That way you can create the manager in Mighttpd and then reuse it in Warp.
Would this address the issue?
I think so.

--Kazu

On Thu, Oct 6, 2011 at 9:34 AM, Kazu Yamamoto
Mighttpd executes a sub-process and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the sub-process and close the pipes with this scheme.
I would like to register a house-keeping action with Warp's timer.
So it sounds like instead of the solution we just put in, we should just expose the ability to use Warp's timeout code directly. This shouldn't be a problem:
* Expose the Timeout module (maybe in its own package, could be useful to others)
* Add an extra settingsTimeoutManager :: IO Manager. That way you can create the manager in Mighttpd and then reuse it in Warp.
Would this address the issue?
I think so.
--Kazu
OK, that one's even easier to implement. Please check out the most recent commit. I also realized that the Warp module already exports all the functions (I think) you need to use the timeout module; let me know if something's missing.

Michael

OK, that one's even easier to implement. Please check out the most recent commit. I also realized that the Warp module already exports all the functions (I think) you need to use the timeout module; let me know if something's missing.
Ah. Thanks. I should have looked at the recent version carefully. I think that "register", "cancel", "tickle", and "initialize" should also be exposed. (I'm not sure that "withManager" is good enough to solve my problem at this moment. Exposing "initialize" would be appreciated.)

--Kazu

On Thu, Oct 6, 2011 at 11:21 AM, Kazu Yamamoto
OK, that one's even easier to implement. Please check out the most recent commit. I also realized that the Warp module already exports all the functions (I think) you need to use the timeout module; let me know if something's missing.
Ah. Thanks. I should have looked at the recent version carefully.
I think that "register", "cancel", "tickle", and "initialize" should also be exposed. (I'm not sure that "withManager" is good enough to solve my problem at this moment. Exposing "initialize" would be appreciated.)
tickle and resume are actually the same. I'd rather not expose both, since I assume we'll be dropping one of them at some point. My latest commit exposed cancel, and I'll add register and initialize. But what's the use case where you need initialize?

Michael

Hello Michael,

Sorry for this late response.

I noticed that if an exception handler is set in a Haskell thread, it works even if the thread is killed. So the settingsPauseForApp approach is enough for me. Michael has already reverted it, but I want it back.

Before that, I would like to make sure of two things:

1) Would you take a look at "cgiApp'" defined in the following? https://github.com/kazu-yamamoto/wai-app-file-cgi/blob/master/Network/Wai/Ap...

Since a sub-process is created, there are two iteratees: the original iteratee consumes the HTTP request body and passes it to CGI. Another iteratee, which consumes the output from CGI, is returned as the ResponseEnumerator.

Are the error handlings in "cgiApp'" reasonable from Michael's point of view?

2) I noticed that the commit of settingsPauseForApp[1] does not work.

The timer is paused anyway in serveConnection. Suppose that a nasty client sets Content-Length to, for example, 10 bytes, sends only 5 bytes, and stops. Since the timer is paused and there is no chance for an iteratee to resume the timer, the connection is not closed by timeout.

I'm sure that this happens in the case of ResponseEnumerator. I suspect this happens in the case of ResponseFile and ResponseBuilder, too. In other words, a bad guy can make massive numbers of connections to Warp which will not be closed by timeout.

I guess pausing in serveConnection is not a good idea.

[1] https://github.com/yesodweb/wai/commit/d2b6c66abef939bb1396d576e7541b711a6db...

--Kazu
On Thu, Oct 6, 2011 at 9:34 AM, Kazu Yamamoto wrote:
Mighttpd executes a sub-process and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the sub-process and close the pipes with this scheme.
I would like to register a house-keeping action with Warp's timer.
So it sounds like instead of the solution we just put in, we should just expose the ability to use Warp's timeout code directly. This shouldn't be a problem:
* Expose the Timeout module (maybe in its own package, could be useful to others)
* Add an extra settingsTimeoutManager :: IO Manager. That way you can create the manager in Mighttpd and then reuse it in Warp.
Would this address the issue?
I think so.
--Kazu
OK, that one's even easier to implement. Please check out the most recent commit. I also realized that the Warp module already exports all the functions (I think) you need to use the timeout module; let me know if something's missing.
Michael

Hello,

I confirmed that the following patch fixes this vulnerability. Just FYI.

https://github.com/kazu-yamamoto/wai/commit/96311d8040b6499922934d6eca68f76e...

--Kazu
Hello Michael,
Sorry for this late response.
I noticed that if an exception handler is set in a Haskell thread, it works even if the thread is killed. So the settingsPauseForApp approach is enough for me. Michael has already reverted it, but I want it back.
Before that, I would like to make sure two things:
1) Would you take a look at "cgiApp'" defined in the following? https://github.com/kazu-yamamoto/wai-app-file-cgi/blob/master/Network/Wai/Ap...
Since a sub-process is created, there are two iteratees: the original iteratee consumes the HTTP request body and passes it to CGI. Another iteratee, which consumes the output from CGI, is returned as the ResponseEnumerator.
Are the error handlings in "cgiApp'" reasonable from Michael's point of view?
2) I noticed that the commit of settingsPauseForApp[1] does not work.
The timer is paused anyway in serveConnection. Suppose that a nasty client sets Content-Length to, for example, 10 bytes, sends only 5 bytes, and stops. Since the timer is paused and there is no chance for an iteratee to resume the timer, the connection is not closed by timeout.
I'm sure that this happens in the case of ResponseEnumerator. I suspect this happens in the case of ResponseFile and ResponseBuilder, too. In other words, a bad guy can make massive numbers of connections to Warp which will not be closed by timeout.
I guess pausing in serveConnection is not a good idea.
[1] https://github.com/yesodweb/wai/commit/d2b6c66abef939bb1396d576e7541b711a6db...
--Kazu
On Thu, Oct 6, 2011 at 9:34 AM, Kazu Yamamoto wrote:
Mighttpd executes a sub-process and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the sub-process and close the pipes with this scheme.
I would like to register a house-keeping action with Warp's timer.
So it sounds like instead of the solution we just put in, we should just expose the ability to use Warp's timeout code directly. This shouldn't be a problem:
* Expose the Timeout module (maybe in its own package, could be useful to others)
* Add an extra settingsTimeoutManager :: IO Manager. That way you can create the manager in Mighttpd and then reuse it in Warp.
Would this address the issue?
I think so.
--Kazu
OK, that one's even easier to implement. Please check out the most recent commit. I also realized that the Warp module already exports all the functions (I think) you need to use the timeout module; let me know if something's missing.
Michael

I think we need to be clear about what "vulnerability" means here.

From Warp's perspective, there is no need to defend against the application: if an application wants to spend five hours responding to a request, that's the application's prerogative. I understand that in the CGI case, you want to prevent something like that from happening, but that's beyond Warp's purview. The question is whether we can expose enough primitives to make it possible for you to implement this at the Mighttpd level.
So from that perspective, I'm not sure if 96311d would really be
considered a vulnerability in Warp. Isn't your patch making it
impossible for an application to run in unbounded time? It might make
more sense to add a specific timeout each time the CGI app is called
(possibly via timeout[1]) to ensure it responds appropriately. But if
I'm not mistaken, this isn't even necessary in the CGI case, as the
Response value will be returned by your code and will not be affected
by the response time of the CGI app itself.
Before we make more commits, let's make sure we're on the same page
about what needs to be done, and then make the fewest possible changes
to Warp in order to achieve our goals.
Michael
[1] http://hackage.haskell.org/packages/archive/base/4.4.0.0/doc/html/System-Tim...
On Wed, Oct 12, 2011 at 9:08 AM, Kazu Yamamoto
Hello,
I confirmed that the following patch fixes this vulnerability. Just FYI.
https://github.com/kazu-yamamoto/wai/commit/96311d8040b6499922934d6eca68f76e...
--Kazu
Hello Michael,
Sorry for this late response.
I noticed that if an exception handler is set in a Haskell thread, it works even if the thread is killed. So the settingsPauseForApp approach is enough for me. Michael has already reverted it, but I want it back.
Before that, I would like to make sure two things:
1) Would you take a look at "cgiApp'" defined in the following? https://github.com/kazu-yamamoto/wai-app-file-cgi/blob/master/Network/Wai/Ap...
Since a sub-process is created, there are two iteratees: the original iteratee consumes the HTTP request body and passes it to CGI. Another iteratee, which consumes the output from CGI, is returned as the ResponseEnumerator.
Are the error handlings in "cgiApp'" reasonable from Michael's point of view?
2) I noticed that the commit of settingsPauseForApp[1] does not work.
The timer is paused anyway in serveConnection. Suppose that a nasty client sets Content-Length to, for example, 10 bytes, sends only 5 bytes, and stops. Since the timer is paused and there is no chance for an iteratee to resume the timer, the connection is not closed by timeout.
I'm sure that this happens in the case of ResponseEnumerator. I suspect this happens in the case of ResponseFile and ResponseBuilder, too. In other words, a bad guy can make massive numbers of connections to Warp which will not be closed by timeout.
I guess pausing in serveConnection is not a good idea.
[1] https://github.com/yesodweb/wai/commit/d2b6c66abef939bb1396d576e7541b711a6db...
--Kazu
On Thu, Oct 6, 2011 at 9:34 AM, Kazu Yamamoto wrote:
Mighttpd executes a sub-process and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the sub-process and close the pipes with this scheme.
I would like to register a house-keeping action with Warp's timer.
So it sounds like instead of the solution we just put in, we should just expose the ability to use Warp's timeout code directly. This shouldn't be a problem:
* Expose the Timeout module (maybe in its own package, could be useful to others)
* Add an extra settingsTimeoutManager :: IO Manager. That way you can create the manager in Mighttpd and then reuse it in Warp.
Would this address the issue?
I think so.
--Kazu
OK, that one's even easier to implement. Please check out the most recent commit. I also realized that the Warp module already exports all the functions (I think) you need to use the timeout module; let me know if something's missing.
Michael

Hello,
I think we need to be clear about what "vulnerability" means here.
If you don't like the word "vulnerability", sorry for that. But the following DoS is possible: a bad guy can open massive numbers of HTTP connections to Warp, send partial bodies, and keep the connections open. The connections will not time out. If the open-file limit is reached, Warp cannot accept new connections from a good guy. If my understanding is correct, this happens not only for CGI but also for other services.
From Warp's perspective, there is no need to defend against the application: if an application wants to spend five hours responding to a request, that's the application's prerogative. I understand that in the CGI case, you want to prevent something like that from happening, but that's beyond Warp's purview. The question is can we expose enough primitives to make it possible for you to implement this at the Mighttpd level.
Yes, applications are our friends. What I pointed out is that a bad guy can spend five hours, and during that time, applications cannot spend even one second doing their service.
So from that perspective, I'm not sure if 96311d would really be considered a vulnerability in Warp. Isn't your patch making it impossible for an application to run in unbounded time? It might make more sense to add a specific timeout each time the CGI app is called (possibly via timeout[1]) to ensure it responds appropriately. But if I'm not mistaken, this isn't even necessary in the CGI case, as the Response value will be returned by your code and will not be affected by the response time of the CGI app itself.
The patch is just at a "proof of concept" level. We probably should prepare an option to let applications control timeouts by themselves.

Warp should be secure by default. And an application should be able to open the door to the self-responsibility world if it wishes.

--Kazu

On Mon, Oct 17, 2011 at 5:16 AM, Kazu Yamamoto
Hello,
I think we need to be clear about what "vulnerability" means here.
If you don't like the word vulnerability, sorry for that.
I wasn't playing with semantics; I have no problem calling it a vulnerability. I just want to make sure we're talking about the same thing.
But the following DoS is possible: a bad guy can open massive numbers of HTTP connections to Warp, send partial bodies, and keep the connections open. The connections will not time out. If the open-file limit is reached, Warp cannot accept new connections from a good guy.
I had not understood that this was the DOS attack you were trying to prevent, thank you for the clarification. I think you are correct that this is a problem, but perhaps we should solve it in the enumSocket function. If we tickle the timeout before calling Sock.recv and then pause it again afterwards, we will *only* be timing out on the part of the code that is receiving data from the client, as opposed to timing out on the application code itself.
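As a toy model of that idea (this is not Warp's actual Timeout module; the reaper scheme and all names here are illustrative), the tickle-before-recv, pause-afterwards pattern looks like this:

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Data.IORef

data State = Active | Inactive | Paused

newtype Handle = Handle (IORef State)

-- Register a handle with a reaper loop: each cycle, Active becomes
-- Inactive; a handle found still Inactive at the next cycle has timed
-- out and the cleanup action fires. Paused handles are never reaped.
register :: Int -> IO () -> IO Handle
register usec onTimeout = do
    ref <- newIORef Active
    _ <- forkIO (loop ref)
    return (Handle ref)
  where
    loop ref = do
        threadDelay usec
        st <- readIORef ref
        case st of
            Active   -> writeIORef ref Inactive >> loop ref
            Inactive -> onTimeout
            Paused   -> loop ref

tickle, pause :: Handle -> IO ()
tickle (Handle ref) = writeIORef ref Active
pause  (Handle ref) = writeIORef ref Paused

-- The pattern described above: only the time spent waiting on the
-- client's recv is charged against the timeout; application time is not.
recvWithTimeout :: Handle -> IO a -> IO a
recvWithTimeout h recv = do
    tickle h
    x <- recv
    pause h
    return x
```

A slow client stalls inside recv while the handle is Active, so the reaper eventually fires; a slow application runs while the handle is Paused and is never killed.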
If my understanding is correct, this happens not only for CGI but also for other services.
From Warp's perspective, there is no need to defend against the application: if an application wants to spend five hours responding to a request, that's the application's prerogative. I understand that in the CGI case, you want to prevent something like that from happening, but that's beyond Warp's purview. The question is can we expose enough primitives to make it possible for you to implement this at the Mighttpd level.
Yes, applications are our friends. What I pointed out is that a bad guy can spend five hours, and during that time, applications cannot spend even one second doing their service.
So from that perspective, I'm not sure if 96311d would really be considered a vulnerability in Warp. Isn't your patch making it impossible for an application to run in unbounded time? It might make more sense to add a specific timeout each time the CGI app is called (possibly via timeout[1]) to ensure it responds appropriately. But if I'm not mistaken, this isn't even necessary in the CGI case, as the Response value will be returned by your code and will not be affected by the response time of the CGI app itself.
The patch is just at a "proof of concept" level. We probably should prepare an option to let applications control timeouts by themselves.
Warp should be secure by default. And an application should be able to open the door to the self-responsibility world if it wishes.
--Kazu

Hello Michael,
I had not understood that this was the DOS attack you were trying to prevent, thank you for the clarification. I think you are correct that this is a problem, but perhaps we should solve it in the enumSocket function. If we tickle the timeout before calling Sock.recv and then pause it again afterwards, we will *only* be timing out on the part of the code that is receiving data from the client, as opposed to timing out on the application code itself.
I'm fine with any fix which can solve this problem. Would you write the code so that I can test it?

--Kazu

On Mon, Oct 17, 2011 at 10:06 AM, Kazu Yamamoto
Hello Michael,
I had not understood that this was the DOS attack you were trying to prevent, thank you for the clarification. I think you are correct that this is a problem, but perhaps we should solve it in the enumSocket function. If we tickle the timeout before calling Sock.recv and then pause it again afterwards, we will *only* be timing out on the part of the code that is receiving data from the client, as opposed to timing out on the application code itself.
I'm fine with any fix which can solve this problem. Would you write the code so that I can test it?
I've started a new branch (slowloris); let's try to come up with a complete set of changes to address the issues and then merge it back. Here's the change I was describing:

https://github.com/yesodweb/wai/commit/58119eb0b762fde98567ba181ada61b14dfed...

Michael

Hello Michael,
I've started a new branch (slowloris); let's try to come up with a complete set of changes to address the issues and then merge it back. Here's the change I was describing:
https://github.com/yesodweb/wai/commit/58119eb0b762fde98567ba181ada61b14dfed...
I confirmed that my problem is gone. I hope that this will be merged and the next Warp will be released.

I also confirmed Greg's slowloris attack is possible. The following code demonstrates it. I think we should introduce rate limiting in the future.

--Kazu

module Main where

import Control.Concurrent
import Data.ByteString.Char8
import Network.Socket hiding (send, recv)
import Network.Socket.ByteString
import System.IO

header :: String
header = "GET / HTTP/1.1\r\nHost: localhost\r\n"

main :: IO ()
main = do
    let hint = defaultHints { addrFlags = [AI_NUMERICHOST, AI_NUMERICSERV]
                            , addrSocketType = Stream
                            }
    a:_ <- getAddrInfo (Just hint) (Just "127.0.0.1") (Just "8080")
    s <- socket (addrFamily a) (addrSocketType a) (addrProtocol a)
    connect s (addrAddress a)
    slowloris s header

slowloris :: Socket -> String -> IO ()
slowloris _ [] = return ()
slowloris s (c:cs) = do
    send s (pack [c])
    putChar c
    hFlush stdout
    threadDelay (30 * 1000000)
    slowloris s cs

On Thu, Oct 20, 2011 at 6:28 AM, Kazu Yamamoto
Hello Michael,
I've started a new branch (slowloris); let's try to come up with a complete set of changes to address the issues and then merge it back. Here's the change I was describing:
https://github.com/yesodweb/wai/commit/58119eb0b762fde98567ba181ada61b14dfed...
I confirmed that my problem is gone. I hope that this will be merged and the next Warp will be released.
I also confirmed Greg's slowloris attack is possible. The following code demonstrates it. I think we should introduce rate limiting in the future.
--Kazu
module Main where

import Control.Concurrent
import Data.ByteString.Char8
import Network.Socket hiding (send, recv)
import Network.Socket.ByteString
import System.IO

header :: String
header = "GET / HTTP/1.1\r\nHost: localhost\r\n"

main :: IO ()
main = do
    let hint = defaultHints { addrFlags = [AI_NUMERICHOST, AI_NUMERICSERV]
                            , addrSocketType = Stream
                            }
    a:_ <- getAddrInfo (Just hint) (Just "127.0.0.1") (Just "8080")
    s <- socket (addrFamily a) (addrSocketType a) (addrProtocol a)
    connect s (addrAddress a)
    slowloris s header

slowloris :: Socket -> String -> IO ()
slowloris _ [] = return ()
slowloris s (c:cs) = do
    send s (pack [c])
    putChar c
    hFlush stdout
    threadDelay (30 * 1000000)
    slowloris s cs
I think Greg's/Snap's approach of a separate timeout for the status and headers is right on the money. It should never take more than one timeout cycle to receive a full set of headers, regardless of how slow the user's connection, and given a reasonable timeout setting from the user (anything over 2 seconds should be fine, I'd guess, and our default is 30 seconds).

The bigger question is what we do about the request body. A simple approach might just be that if we receive a packet from the client which is less than a certain size (user defined, maybe 2048 bytes is a good default), it does not tickle the timeout at all. Obviously this means a malicious program could be devised to send precisely 2048 bytes per timeout cycle... but I don't think there's any way to do better than this. We *have* to err on the side of allowing attacks, otherwise we'll end up disconnecting valid requests.

In other words, here's how I'd see the timeout code working:

1. A timeout is created at the beginning of a connection, and not tickled at all until all the request headers are read in.
2. Every time X (default: 2048) bytes of the request body are read, the timeout is tickled.

Actually, to elaborate on (2) a bit: we want to make sure we're not applying the timeout to the application code at all, so what we'd really do is:

a. Try to read a piece of the request body.
b. If the piece is greater than X bytes, or the entire request body is now read, pause the timeout.
c. Pass that chunk to the application.
d. If the request body has not been entirely read, resume the timeout and return to (a).

The response body timeout code should be relatively safe already, I believe: it only tickles the timeout once all the data is sent.

Michael
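The read loop in steps (a)-(d) can be simulated as a pure function over the sizes of the chunks read from the client (a sketch; bodyTimerTrace and TimerAction are illustrative names, x is the threshold from step (b)):

```haskell
data TimerAction = Pause | Resume deriving (Eq, Show)

-- Pure simulation of steps (a)-(d): fold over the sizes of the chunks
-- read from the client and emit the timer transitions the reader would
-- perform. x is the threshold (2048 in the proposal above); total is
-- the declared Content-Length.
bodyTimerTrace :: Int -> Int -> [Int] -> [TimerAction]
bodyTimerTrace x total = go 0
  where
    go _ [] = []
    go sofar (c:cs)
        | big || done = Pause : rest   -- (b): pause before the app sees it
        | otherwise   = rest           -- small chunk: timer keeps running
      where
        sofar' = sofar + c
        big    = c > x
        done   = sofar' >= total
        rest
          | done      = []                      -- body complete: stay paused
          | otherwise = Resume : go sofar' cs   -- (d): resume, back to (a)
```

A trailing Resume with no further chunks models a stalled client: the timer is left running, so the connection eventually times out.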

On Sat, Oct 22, 2011 at 10:20 PM, Michael Snoyman
I think Greg's/Snap's approach of a separate timeout for the status and headers is right on the money. It should never take more than one timeout cycle to receive a full set of headers, regardless of how slow the user's connection, and given a reasonable timeout setting from the user (anything over 2 seconds should be fine I'd guess, and our default is 30 seconds).
That's fairly uncontroversial.
The bigger question is what we do about the request body. A simple approach might just be that if we receive a packet from the client which is less than a certain size (user defined, maybe 2048 bytes is a good default) it does not tickle the timeout at all. Obviously this means a malicious program could be devised to send precisely 2048 bytes per timeout cycle... but I don't think there's any way to do better than this.
This doesn't really work either. I've already posted code in this thread for what I think is the only reasonable option, which is rate limiting. The way we've implemented rate limiting is:

1) any individual data packet must arrive within N seconds (the usual timeout)
2) when you receive a packet, you compute the data rate in bytes per second -- if it's lower than X bytes/sec (where X is a policy decision left up to the user), the connection is killed
3) the logic from 2) only kicks in after Y seconds, to cover cases where the client needs to do some expensive initial setup. Y is also a policy decision.
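Rules 2) and 3) can be condensed into a small pure policy check (a sketch with illustrative names, not Snap's actual implementation; rule 1) is assumed to be handled by the ordinary per-packet timeout):

```haskell
-- X and Y from the rules above, as a policy record.
data RatePolicy = RatePolicy
    { minBytesPerSec :: Double  -- X: kill below this average rate
    , gracePeriodSec :: Double  -- Y: no enforcement before this
    }

-- Run whenever a packet arrives: True means kill the connection.
-- The grace-period check short-circuits, so we never divide by a
-- zero elapsed time.
shouldKill :: RatePolicy -> Double -> Int -> Bool
shouldKill (RatePolicy x y) elapsedSec bytesSoFar =
    elapsedSec > y && rate < x
  where
    rate = fromIntegral bytesSoFar / elapsedSec
```

A slowloris client's average rate decays toward zero, so it is killed shortly after the grace period, while a genuinely slow-but-steady uploader above X bytes/sec is untouched.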
We *have* to err on the side of allowing attacks, otherwise we'll end up with disconnecting valid requests.
I don't agree with this. Some kinds of "valid" requests are indistinguishable from attacks. You need to decide what's more important: letting some guy on a 30-kilobit packet radio connection upload a big file, or letting someone DoS your server.
In other words, here's how I'd see the timeout code working:
1. A timeout is created at the beginning of a connection, and not tickled at all until all the request headers are read in.
2. Every time X (default: 2048) bytes of the request body are read, the timeout is tickled.
Note that this is basically a crude form of rate-limiting (at X/T
bytes per second). Why not do it "properly"?
G
--
Gregory Collins

I don't know if Snap is doing this yet, but it is possible to just deny
partial GET/HEAD requests.
Apache is considered vulnerable to slowloris because it has a limited thread
pool. Nginx is not considered vulnerable to slowloris because it uses an
evented architecture and by default drops connections after 60 seconds that
have not been completed. Technically we say our Haskell web servers are
using threads, but they are managed by a very fast evented system. So we can
hold many unused connections open like Nginx and should not be vulnerable if
we have a timeout that cannot be tickled. This could make for an interesting
benchmark - how many slowloris connections can we take on? The code from
Kazu makes just one connection - it does not demonstrate a successful
slowloris attack, just one successful slowloris connection.
If we limit the number of connections per ip address, that means a slowloris
attack will require the coordination of thousands of nodes and make it
highly impractical. Although there may be a potential issue with proxies
(AOL at least used to do this, but I think just for GET) wanting to make
lots of connections.
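A minimal sketch of such a per-IP cap (illustrative names; the address is a String here for simplicity, where a real server would key on the peer's SockAddr):

```haskell
import Control.Concurrent.MVar
import qualified Data.Map.Strict as Map

-- Live connection count per client address; modifyMVar keeps
-- acquire/release atomic across connection threads.
type ConnTable = MVar (Map.Map String Int)

newConnTable :: IO ConnTable
newConnTable = newMVar Map.empty

-- True: a slot was taken and the connection may proceed.
-- False: this address already holds `cap` connections.
tryAcquire :: Int -> ConnTable -> String -> IO Bool
tryAcquire cap mv ip = modifyMVar mv $ \m ->
    let n = Map.findWithDefault 0 ip m
    in if n >= cap
          then return (m, False)
          else return (Map.insert ip (n + 1) m, True)

-- Give the slot back when the connection closes; drop the key
-- entirely when the count reaches zero.
release :: ConnTable -> String -> IO ()
release mv ip = modifyMVar_ mv $
    return . Map.update (\n -> if n <= 1 then Nothing else Just (n - 1)) ip
```

tryAcquire and release must bracket the connection's lifetime (including the error paths), or counts leak and legitimate clients get locked out.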
---------- Forwarded message ----------
From: Gregory Collins
I think Greg's/Snap's approach of a separate timeout for the status and headers is right on the money. It should never take more than one timeout cycle to receive a full set of headers, regardless of how slow the user's connection, and given a reasonable timeout setting from the user (anything over 2 seconds should be fine I'd guess, and our default is 30 seconds).
That's fairly uncontroversial.
The bigger question is what we do about the request body. A simple approach might just be that if we receive a packet from the client which is less than a certain size (user defined, maybe 2048 bytes is a good default) it does not tickle the timeout at all. Obviously this means a malicious program could be devised to send precisely 2048 bytes per timeout cycle... but I don't think there's any way to do better than this.
This doesn't really work either. I've already posted code in this thread for what I think is the only reasonable option, which is rate limiting. The way we've implemented rate limiting is: 1) any individual data packet must arrive within N seconds (the usual timeout) 2) when you receive a packet, you compute the data rate in bytes per second -- if it's lower than X bytes/sec (where X is a policy decision left up to the user), the connection is killed 3) the logic from 2) only kicks in after Y seconds, to cover cases where the client needs to do some expensive initial setup. Y is also a policy decision.
We *have* to err on the side of allowing attacks, otherwise we'll end up with disconnecting valid requests.
I don't agree with this. Some kinds of "valid" requests are indistinguishable from attacks. You need to decide what's more important: letting some guy on a 30-kilobit packet radio connection upload a big file, or letting someone DoS your server.
In other words, here's how I'd see the timeout code working:
1. A timeout is created at the beginning of a connection, and not tickled at all until all the request headers are read in.
2. Every time X (default: 2048) bytes of the request body are read, the timeout is tickled.
Note that this is basically a crude form of rate-limiting (at X/T
bytes per second). Why not do it "properly"?
G
--
Gregory Collins

On Sun, Oct 23, 2011 at 6:55 PM, Greg Weber
Apache is considered vulnerable to slowloris because it has a limited thread pool. Nginx is not considered vulnerable to slowloris because it uses an evented architecture and by default drops connections after 60 seconds that have not been completed. Technically we say our Haskell web servers are using threads, but they are managed by a very fast evented system. So we can hold many unused connections open like Nginx and should not be vulnerable if we have a timeout that cannot be tickled. This could make for an interesting benchmark - how many slowloris connections can we take on? The code from Kazu makes just one connection - it does not demonstrate a successful slowloris attack, just one successful slowloris connection.
Slowloris causes problems with any scarce resource -- threads in a pool, as you mentioned, but a bigger problem for us is running out of file descriptors. If the client is allowed to hold connections open for long enough, an attacker should be able to run the server out of file descriptors using only a handful of machines.
If we limit the number of connections per ip address, that means a slowloris attack will require the coordination of thousands of nodes and make it highly impractical. Although there may be a potential issue with proxies (AOL at least used to do this, but I think just for GET) wanting to make lots of connections.
Yep -- this is one possible solution, although you're right about the
proxy/NAT gateway issue potentially being a problem for some
applications. Ultimately I think to handle this "well enough", we just
need to be able to handle timeouts properly to deter low-grade script
kiddies with a couple of machines at their disposal. Attackers with
real botnets are going to be hard to stop no matter what.
G
--
Gregory Collins

On Sun, Oct 23, 2011 at 11:37 AM, Gregory Collins
On Sun, Oct 23, 2011 at 6:55 PM, Greg Weber wrote:
Apache is considered vulnerable to slowloris because it has a limited thread pool. Nginx is not considered vulnerable to slowloris because it uses an evented architecture and by default drops connections after 60 seconds that have not been completed. Technically we say our Haskell web servers are using threads, but they are managed by a very fast evented system. So we can hold many unused connections open like Nginx and should not be vulnerable if we have a timeout that cannot be tickled. This could make for an interesting benchmark - how many slowloris connections can we take on? The code from Kazu makes just one connection - it does not demonstrate a successful slowloris attack, just one successful slowloris connection.
Slowloris causes problems with any scarce resource -- threads in a pool, as you mentioned, but a bigger problem for us is running out of file descriptors. If the client is allowed to hold connections open for long enough, an attacker should be able to run the server out of file descriptors using only a handful of machines.
Good point. The equation for the number of nodes needed to pull off a slowloris attack becomes: process file descriptor limit / connections per IP address. With a 60 second request timeout (like nginx), each slowloris request must be finished in 60 seconds. This also means the log will be writing information about the attack as it is in progress.
Stopping Slowloris:
* 60 second hard timeout
* reject partial GET/HEAD requests
* require all the headers for any request be sent at once
* increase ulimit for the web server
* limit requests per IP address
Limiting per IP can be done with iptables, but I don't think iptables would know about the HTTP request method. Limiting the number of POST requests per IP address to something small seems like an easy way to stop a simple Slowloris attack without having to worry much about the proxy/gateway issue. We will need to add a deploy instruction about increasing the ulimit. If we wanted to, as a final precaution, we could try to detect when we are at the ulimit and start dropping connections.
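The per-IP limiting idea above can be sketched in application code as well as in iptables. Below is a minimal, illustrative counter; the names (`tryAccept`, `release`) are made up for this sketch and are not part of Warp or Mighttpd:

```haskell
import Control.Concurrent.MVar
import qualified Data.Map as Map

-- Shared table of open-connection counts, keyed by client IP.
type ConnTable = MVar (Map.Map String Int)

-- Try to register a new connection; returns False (reject) when
-- the address is already at the per-IP limit.
tryAccept :: Int -> ConnTable -> String -> IO Bool
tryAccept limit var ip = modifyMVar var $ \m ->
  let n = Map.findWithDefault 0 ip m
  in if n >= limit
       then return (m, False)
       else return (Map.insert ip (n + 1) m, True)

-- Call when the connection closes, so the slot is freed.
release :: ConnTable -> String -> IO ()
release var ip = modifyMVar_ var $
  return . Map.update (\n -> if n <= 1 then Nothing else Just (n - 1)) ip
```

The MVar keeps the count update atomic across handler threads; a real server would key on the peer SockAddr rather than a String.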
If we limit the number of connections per ip address, that means a slowloris attack will require the coordination of thousands of nodes and make it highly impractical. Although there may be a potential issue with proxies (AOL at least used to do this, but I think just for GET) wanting to make lots of connections.
Yep -- this is one possible solution, although you're right about the proxy/NAT gateway issue potentially being a problem for some applications. Ultimately I think to handle this "well enough", we just need to be able to handle timeouts properly to deter low-grade script kiddies with a couple of machines at their disposal. Attackers with real botnets are going to be hard to stop no matter what.
G
--
Gregory Collins

Slowloris causes problems with any scarce resource -- threads in a pool, as you mentioned, but a bigger problem for us is running out of file descriptors. If the client is allowed to hold connections open for long enough, an attacker should be able to run the server out of file descriptors using only a handful of machines.
Yes. We need to keep this in mind: "A chain is no stronger than its weakest link." --Kazu

Hello,
Apache is considered vulnerable to slowloris because it has a limited thread pool. Nginx is not considered vulnerable to slowloris because it uses an evented architecture and by default drops connections after 60 seconds that have not been completed. Technically we say our Haskell web servers are using threads, but they are managed by a very fast evented system. So we can hold many unused connections open like Nginx and should not be vulnerable if we have a timeout that cannot be tickled. This could make for an interesting benchmark - how many slowloris connections can we take on? The code from Kazu makes just one connection - it does not demonstrate a successful slowloris attack, just one successful slowloris connection.
If you want, I can write code that performs a real slowloris attack to consume the file descriptors of a server. It's quite easy.
If we limit the number of connections per ip address, that means a slowloris attack will require the coordination of thousands of nodes and make it highly impractical.
If we pay money, we can use a *botnet* to do this. This actually happens in the real world. But I don't think a bad guy targets your web server. --Kazu

On Sun, Oct 23, 2011 at 7:50 PM, Kazu Yamamoto
Hello,
Apache is considered vulnerable to slowloris because it has a limited thread pool. Nginx is not considered vulnerable to slowloris because it uses an evented architecture and by default drops connections after 60 seconds that have not been completed. Technically we say our Haskell web servers are using threads, but they are managed by a very fast evented system. So we can hold many unused connections open like Nginx and should not be vulnerable if we have a timeout that cannot be tickled. This could make for an interesting benchmark - how many slowloris connections can we take on? The code from Kazu makes just one connection - it does not demonstrate a successful slowloris attack, just one successful slowloris connection.
If you want, I can write code that performs a real slowloris attack to consume the file descriptors of a server. It's quite easy.
That depends on the file descriptor limits. If the server can take on more connections than the attacking script can produce, you will need a second machine, which I think means it is no longer quite easy. Slowloris was designed to show how easy it is to take down servers with an Apache-style architecture by just gradually opening 50 slow connections. We are actually in the realm of DoS attacks now. Intentionally slowing down the connection of the attack is just a technique to make the DoS more effective, but I don't even know why it matters anymore to our situation - the attacker can just send more requests to the server - I suppose the slow connection is more effective at blocking out legitimate users competing for connections.
If we limit the number of connections per ip address, that means a slowloris attack will require the coordination of thousands of nodes and make it highly impractical.
If we pay money, we can use a *botnet* to do this. This actually happens in the real world. But I don't think a bad guy targets your web server.
--Kazu

On Sun, Oct 23, 2011 at 5:09 PM, Gregory Collins
On Sat, Oct 22, 2011 at 10:20 PM, Michael Snoyman
wrote: I think Greg's/Snap's approach of a separate timeout for the status and headers is right on the money. It should never take more than one timeout cycle to receive a full set of headers, regardless of how slow the user's connection, and given a reasonable timeout setting from the user (anything over 2 seconds should be fine I'd guess, and our default is 30 seconds).
That's fairly uncontroversial.
The bigger question is what we do about the request body. A simple approach might just be that if we receive a packet from the client which is less than a certain size (user defined, maybe 2048 bytes is a good default) it does not tickle the timeout at all. Obviously this means a malicious program could be devised to send precisely 2048 bytes per timeout cycle... but I don't think there's any way to do better than this.
This doesn't really work either. I've already posted code in this thread for what I think is the only reasonable option, which is rate limiting. The way we've implemented rate limiting is:
1) any individual data packet must arrive within N seconds (the usual timeout)
2) when you receive a packet, you compute the data rate in bytes per second -- if it's lower than X bytes/sec (where X is a policy decision left up to the user), the connection is killed
3) the logic from 2) only kicks in after Y seconds, to cover cases where the client needs to do some expensive initial setup. Y is also a policy decision.
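These three rules can be condensed into a small pure policy check. This is a sketch with invented names (`RatePolicy`, `shouldKill`), not Snap's actual Snap.Iteratee code; rule 1 (the per-packet timeout N) is handled separately by the ordinary timeout machinery:

```haskell
-- Illustrative policy: X = minBytesPerSec, Y = graceSeconds.
data RatePolicy = RatePolicy
  { minBytesPerSec :: Double  -- X: kill connections below this rate
  , graceSeconds   :: Double  -- Y: no enforcement before this
  }

-- Decide, from elapsed seconds and total bytes received so far,
-- whether the connection should be killed.
shouldKill :: RatePolicy -> Double -> Int -> Bool
shouldKill p elapsed received
  | elapsed <= graceSeconds p = False     -- rule 3: grace period
  | otherwise =                           -- rule 2: rate check
      fromIntegral received / elapsed < minBytesPerSec p
```

A server would call `shouldKill` on each packet arrival with the connection's running totals.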
As you mention below, the approach I outline is fairly close to rate limiting. I would argue it's *more* correct. In the rate limiting case, a client could send a burst of a huge amount of data at the beginning of the connection, and then send single bytes every 20 seconds for the next few minutes, and rate limiting would allow it to happen. The approach below (call it minimum packet size?) wouldn't. Additionally, I think minimum packet size can be implemented much more efficiently, as there is significantly less data to track.
We *have* to err on the side of allowing attacks, otherwise we'll end up with disconnecting valid requests.
I don't agree with this. Some kinds of "valid" requests are indistinguishable from attacks. You need to decide what's more important: letting some guy on a 30-kilobit packet radio connection upload a big file, or letting someone DoS your server.
Fair enough, let me rephrase: there's a gray area between attack and valid request, and by default I'd like to draw the line so as to incorporate most of the gray zone.
In other words, here's how I'd see the timeout code working:
1. A timeout is created at the beginning of a connection, and not tickled at all until all the request headers are read in.
2. Every time X (default: 2048) bytes of the request body are read, the timeout is tickled.
Note that this is basically a crude form of rate-limiting (at X/T bytes per second). Why not do it "properly"?
G
--
Gregory Collins

On Mon, Oct 17, 2011 at 6:19 AM, Michael Snoyman
On Mon, Oct 17, 2011 at 5:16 AM, Kazu Yamamoto
wrote: But the following DoS is possible. A bad guy can open massive numbers of HTTP connections to Warp, send partial bodies, and keep the connections open. The connections will not time out. If the open-file limit is reached, Warp cannot accept new connections from a good guy.
I had not understood that this was the DOS attack you were trying to prevent, thank you for the clarification. I think you are correct that this is a problem, but perhaps we should solve it in the enumSocket function. If we tickle the timeout before calling Sock.recv and then pause it again afterwards, we will *only* be timing out on the part of the code that is receiving data from the client, as opposed to timing out on the application code itself.
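Michael's idea — arm the timeout only around the blocking read — can be sketched generically. Here `tickle` and `pause` stand in for Warp's internal timeout operations, and `recv` for the Sock.recv call; the real signatures may differ:

```haskell
-- Run one receive step so the timeout clock only covers the
-- network read, not the application code that follows it.
recvTimed :: IO () -> IO () -> IO a -> IO a
recvTimed tickle pause recv = do
  tickle        -- refresh the timeout before blocking on the socket
  bs <- recv
  pause         -- stop the clock while the Application runs
  return bs
```

As the follow-up messages note, this alone does not stop slowloris — the attacker can keep the clock tickled with one byte per cycle — which is why the rate-limiting discussion matters.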
This is called the "slow loris" attack. We spent a lot of effort in
Snap trying to guard against it. It isn't enough to simply tickle the
timeout when you get bytes from Sock.recv: the attacker can simply
send you one byte at a time, slowly, forever. There are different ways
to deal with this properly; in Snap we do a couple of things. For HTTP
file uploads, we use rate limiting to kill uploads that are running
slower than some user-defined threshold. (Our iteratee rate limiting
code is here: https://github.com/snapframework/snap-core/blob/master/src/Snap/Iteratee.hs#...)
For HTTP requests, we use an absolute timeout which is never tickled
-- i.e., you have X seconds to get the HTTP status line and headers to
us -- so that the attacker can not simply trickle headers to us
forever. The output side, of course, the application has control over,
so we don't do anything special there.
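The absolute header deadline described above can be approximated in plain Haskell with System.Timeout. This is a sketch of the idea, not Snap's implementation:

```haskell
import System.Timeout (timeout)

-- Give the client a hard, never-extended deadline (in microseconds)
-- to deliver the request line and headers; Nothing means the
-- deadline passed and the connection should be dropped.
readHeadersWithDeadline :: Int -> IO a -> IO (Maybe a)
readHeadersWithDeadline micros readHeaders = timeout micros readHeaders
```

Unlike a tickled body timeout, this deadline is fixed: trickling one header byte per second never extends it, which is exactly what defeats the slow-headers variant of slowloris.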
Back to the issue of "no timeout" -- again, in general I think this is
a really bad idea, even for web services that are run in a trusted
environment. If no data is flowing across a connection, it's
impossible to distinguish "the other end died" from "it's just sitting
waiting patiently". Robust services that make long-lived connections
use ping packets to test connection liveness, period. Infinite network
timeouts, to me, are a synonym for "I didn't completely think through
the failure modes of this application".
G
--
Gregory Collins

On Thu, Oct 6, 2011 at 9:16 AM, Michael Snoyman
On Thu, Oct 6, 2011 at 3:52 AM, Kazu Yamamoto
wrote: Hello Michael,
I think we could make that functionality optional, based on an extra setting parameter. Would this just be boolean, or is more sophisticated control required?
What I want to do is to prevent a bad guy abusing CGI. So, I guess that boolean is enough.
Alright, here's a first stab[1]. What do you think?
Michael
[1] https://github.com/yesodweb/wai/commit/d2b6c66abef939bb1396d576e7541b711a6db...
Mighttpd executes a subprocess and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the subprocess and close the pipes with this scheme.
I would like to register a house-keeping action with Warp's timer.
So it sounds like instead of the solution we just put in, we should just expose the ability to use Warp's timeout code directly. This shouldn't be a problem:
* Expose the Timeout module (maybe in its own package, could be useful to others)
* Add an extra settingsTimeoutManager :: IO Manager. That way you can create the manager in Mighttpd and then reuse it in Warp.
Would this address the issue?
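To make the proposal concrete, here is a toy manager in the spirit of Warp's Timeout module: each registered handle carries a house-keeping action (for Kazu's case: kill the CGI subprocess and close its pipes) that fires if the handle is not tickled within the window. This is an illustration of the scheme only, not Warp's actual code or API:

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Data.IORef

-- A registered connection: an activity flag plus a cleanup
-- (house-keeping) action to run on timeout.
data TimeoutHandle = TimeoutHandle
  { activity :: IORef Bool
  , cleanup  :: IO ()
  }

-- Watch a handle with the given window (microseconds); if it is
-- not tickled for a full window, run its house-keeping action.
register :: Int -> IO () -> IO TimeoutHandle
register micros action = do
  ref <- newIORef True
  let h = TimeoutHandle ref action
      watch = do
        threadDelay micros
        active <- readIORef ref
        if active
          then writeIORef ref False >> watch  -- alive: reset and re-arm
          else cleanup h                      -- timed out: clean up
  _ <- forkIO watch
  return h

-- Mark the connection as alive for another window.
tickle :: TimeoutHandle -> IO ()
tickle h = writeIORef (activity h) True
```

Sharing one such manager between Warp and Mighttpd's CGI code is what the settingsTimeoutManager idea would enable.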
It would, but it'd be a Law of Demeter violation. In Snap we give you
an IO action "_snapSetTimeout :: Int -> IO ()", that gets handed to
you as hidden state in the Snap monad. {*} I don't think it's
necessary to expose the TimeoutManager to applications at all, and it
might get you into trouble later if you decide you need to change the
interface.
BTW: is timeout handling specified in WAI, or not? IMO you almost
can't write a stable web application without thinking about the
timeout issue, and I think that interface probably belongs in WAI if
it isn't there already.
G
{*} Note: you could make the parameter "Maybe Int" to allow the user
to disable the timer, but I chose not to -- disabling the timer is a
bad idea. You can still set an effectively infinite timeout using
"maxBound", but at least then you feel guilty for doing something
stupid.
--
Gregory Collins

On Thu, Oct 6, 2011 at 10:39 AM, Gregory Collins
On Thu, Oct 6, 2011 at 9:16 AM, Michael Snoyman
wrote: On Thu, Oct 6, 2011 at 3:52 AM, Kazu Yamamoto
wrote: Hello Michael,
I think we could make that functionality optional, based on an extra setting parameter. Would this just be boolean, or is more sophisticated control required?
What I want to do is to prevent a bad guy abusing CGI. So, I guess that boolean is enough.
Alright, here's a first stab[1]. What do you think?
Michael
[1] https://github.com/yesodweb/wai/commit/d2b6c66abef939bb1396d576e7541b711a6db...
Mighttpd executes a subprocess and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the subprocess and close the pipes with this scheme.
I would like to register a house-keeping action with Warp's timer.
So it sounds like instead of the solution we just put in, we should just expose the ability to use Warp's timeout code directly. This shouldn't be a problem:
* Expose the Timeout module (maybe in its own package, could be useful to others)
* Add an extra settingsTimeoutManager :: IO Manager. That way you can create the manager in Mighttpd and then reuse it in Warp.
Would this address the issue?
It would, but it'd be a Law of Demeter violation. In Snap we give you an IO action "_snapSetTimeout :: Int -> IO ()", that gets handed to you as hidden state in the Snap monad. {*} I don't think it's necessary to expose the TimeoutManager to applications at all, and it might get you into trouble later if you decide you need to change the interface.
Except this was a case where exposing the timeout manager *was* necessary[1]. We already provide users the ability to modify the timeout duration via settingsTimeout, this is for an extra feature: letting the user supply their own manager. It's true that it will cause us trouble if we decide to change things later, but this was already part of the exposed interface, since we allow some fairly low-level tinkering with Warp. (This is necessary for things like wai-handler-devel.)
BTW: is timeout handling specified in WAI, or not? IMO you almost can't write a stable web application without thinking about the timeout issue, and I think that interface probably belongs in WAI if it isn't there already.
Timeout handling would be outside the scope of WAI. How would it work there? What does it mean to have timeouts set for FastCGI, or CGI, or testing? I think having a timeout handler in Warp is the correct solution; I don't know what we'd gain by moving it to WAI, much less what that would even mean.
Michael
[1] Well, Kazu could just implement his own timeout manager and have that running for CGI processes, but that would be a waste.

On Thu, Oct 6, 2011 at 3:52 AM, Kazu Yamamoto
Mighttpd executes a subprocess and creates a pair of pipes for CGI. If a timeout happens, it seems to me that there is no way to kill the subprocess and close the pipes with this scheme.
I would like to register a house-keeping action with Warp's timer.
Can you catch the exception that the timeout handler presumably throws
to you and do your cleanup there?
G
--
Gregory Collins
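Catching the exception works; bracketing against it is more robust, since the release action runs on normal completion and on the asynchronous exception a timeout manager would deliver. A sketch, assuming the standard process package and a cat-style executable — not Mighttpd's actual code:

```haskell
import Control.Exception (bracket)
import System.IO (Handle, hClose)
import System.Process

-- Run a CGI-style subprocess so its pipes are closed and the
-- process is terminated even if this thread is killed by a timeout.
withCgi :: FilePath -> (Handle -> Handle -> IO a) -> IO a
withCgi path body =
  bracket
    (createProcess (proc path []) { std_in = CreatePipe, std_out = CreatePipe })
    (\(Just hin, Just hout, _, ph) -> do
        hClose hin                 -- no-op if already closed
        hClose hout
        terminateProcess ph        -- house-keeping on timeout or exit
        _ <- waitForProcess ph     -- reap, avoiding a zombie
        return ())
    (\(Just hin, Just hout, _, _) -> body hin hout)
```

The release action is exactly the house-keeping Kazu wants to run from the timer.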
participants (4)
- Greg Weber
- Gregory Collins
- Kazu Yamamoto
- Michael Snoyman