
On Mon, Jan 24, 2011 at 10:11 PM, Neil Mitchell
Hi,
I tried switch Hoogle to Wai and Warp today. The results were good, and once a few little things are cleared up I'm fairly certain that the the next Hoogle release will be based on them. As I wrote the code, I made some notes:
1) Why not add statusOK as an alias for status200? As someone who doesn't spend all day dealing with HTTP response codes, statusOK is far better documenting - similarly for all the other status values. I added statusOK as a local definition in Hoogle.
The main reason I stuck with numerical values was the redirects: there is a bit of ambiguity about the right name for 302, 303 and 307. (Yes, I know that the specs give a name for these, but in practice they are treated slightly different.) Frankly, I had always intended WAI to be something low-level that a framework would wrap around. It's a new phenomenon that there is interest in writing directly against WAI. I think adding some of the status codes as aliases makes sense in that context.
2) Changing from HTTP to Wai, it was a shame that I had to define all the header keys myself - could they be included - it would have better performance, and give static guarantees. I added locally:
hdrContentType = mkCIByteString "Content-Type" hdrCacheControl = mkCIByteString "Cache-Control"
Firstly, CIByteString is an instance of IsString, so you can just write "Content-Type". While this doesn't address the performance concerns, it does address the convenience. As far as *why* I didn't include them, it just becomes an issue of how much to include/exclude. I included the statuses because it's an annoyance to write Status 200 "OK", but I thought "Content-Type" was acceptable. But I really have no strong feelings on the matter.
3) The documentation for ResponseFile doesn't say what happens if the file is not found. I am sure it must throw an exception, but for a very brief second I wondered if it might change my status code to 404 on my behalf. Could the documentation be clarified.
It depends on what the sendfile package does actually. I haven't tried this personally, so in fact I don't know what it does. I'll make a note to test it and update the documentation accordingly.
4) I defined a Wai handler that threw an error, and Warp did nothing. Should it? Or should I just be careful and include my own exception handling? I'm not really sure what the right thing to do is - silently eating the error isn't great, but bring down my server would be even less appreciated.
There's already a request in place to allow you to provide a callback function for exceptions. However, I personally prefer having the application catch all exceptions (which is what Yesod does in practice).
5) Hoogle is currently written as lazy IO + String. Converting this format to a response was easy with liftIO, LBS.pack and responseLBS. Converting back was rather tricky. I eventually came up with:
responseFlatten :: Response -> IO (Status, ResponseHeaders, LBS.ByteString) responseFlatten r = responseEnumerator r $ \status headers -> let f res E.EOF = return (status, headers, Blaze.toLazyByteString res) f res (E.Chunks xs) = E.continue (f $ mconcat $ res:xs) in E.continue (f mempty)
It's not too long, but it has a relatively high level of complexity. Is there something simpler I could have done?
You could probably write something simpler using the consume iteratee from the enumerator package. It's late at night here, so the following may or may not fire missiles: responseEnumerator r $ \s hs -> do builders <- consume return (s, hs, Blaze.toLazyByteString $ mconcat builders)
6) I want responseFlatten for two purposes, the first of which is to output a CGI representation of the Response, which is easy with the above function. However, outputting a response in CGI format is trivial, and is probably the easiest way to debug/view the contents of a Response. Would it be worth adding showResponse :: Response -> IO LBS.ByteString to the Wai library, which produced the result representing how it would be sent in CGI format, and was also useful for debugging?
I think such a function would make more sense for the wai-extra package. I already have a debug middleware over there, that kind of function could be a good addition.
7) The second reason for using responseFlatten is that I want to write a function Response -> IO Response which modifies a response to rewrite "href='file://" to "href='/file/". I currently do this with responseFlatten, converting to a String, and then stepping through it Char by Char with isPrefixOf. Given the performance focus of this email list, I can imagine several of you are now crying. What would have been the Iterator/Builder approved way to do it?
Well, obviously converting to Char makes us cry, so sticking to ByteString operations would be a little easier. But frankly, to my knowledge, there isn't an efficient way to perform transformations on Builders. You'll probably be best off mconcat'ing your Builders into a single Builder, converting that to a lazy ByteString and performing the conversion there. Assuming that you are creating the Builder using fromLazyByteString in the first place, the performance hit here won't really be as bad as it seems: blaze-builder will actually use the original buffer when a source bytestring is large enough, so there should not be a significant overhead versus performing the transformation on the original lazy ByteString.
8) The last point is the only point that is a blocker for Hoogle, and regards the Warp server, Windows and ghci. In ghci on Windows, running the network function accept doesn't terminate if you hit Ctrl+C. If you run Warp in the main thread ghci will never respond to you again. If you run Warp on a second thread then hitting Ctrl+C twice returns to the prompt, but the thread serving the web page is still running (and still serving web pages), and if you run the server again the old server continues to serve the pages instead. I develop Hoogle on Windows through ghci in server mode, so without a workaround it's fairly fatal.
When writing Hoogle, I found that even though accept doesn't terminate on exceptions, if you do sClose on the socket it raises an exception in accept and does terminate. I experimented in a local copy of Warp and found the following worked very well:
import Control.Concurrent import Control.Monad
run :: Port -> Application -> IO () run port app = withSocketsDo $ do var <- newMVar Nothing let clean = modifyMVar_ var $ \s -> maybe (return ()) sClose s >> return Nothing forkIO $ bracket (listenOn $ PortNumber $ fromIntegral port) (const clean) (\s -> do modifyMVar_ var (\_ -> return $ Just s); serveConnections port app s) forever (threadDelay maxBound) `finally` clean
If you hit Ctrl+C in ghci on Windows you get the message "Network.Socket.accept: failed (No error)" - but it terminates perfectly. That error could even be dropped silently if you reach the `finally` clause. Can this code be included in Warp?
I've also been using Warp for development purposes at work on my Windows machine, and have run into the Ctrl-C issue. I'll review this patch after some sleep. My initial reaction is to include it conditionally for Windows. My guess is that very few people (if any) are planning on running production websites using Windows/Warp, so adding something which enhanced usability at the expense of a few cycles would be a general win. However, for the speed-crazed masses serving from Linux (eg, you and me apparently), I'd like to avoid any extra code. Thanks for taking the time to write up such a thorough email, it is much appreciated. Michael