
At Fri, 6 May 2011 10:15:50 +0200, Gregory Collins wrote:
Hi David,
Re: this comment from catchI:
It is not possible to catch asynchronous exceptions, such as lazily evaluated divide-by-zero errors, the throw function, or exceptions raised by other threads using throwTo if those exceptions might arrive anywhere outside of a liftIO call.
It might be worth investigating providing a version which can catch asynchronous exceptions if the underlying monad supports it (via MonadCatchIO or something similar). One of the most interesting advantages I can see for IterIO over the other iteratee implementations is that you actually have some control over resource usage -- not being able to catch asynchronous exceptions nullifies much of that advantage. A clear use case for this is timeouts on server threads, where you typically throw a TimeoutException exception to the handling thread using "throwTo" if the timeout is exceeded.
Excellent point. There's actually a chance that iterIO already catches those kinds of exceptions, but I wasn't sure enough about how the Haskell runtime works to make that claim. I've noticed in practice that asynchronous exceptions tend to come exactly when I execute the IO >>= operation. If that's true, then since each IO >>= is wrapped in a try block, the exceptions will all be caught (well, not divide by zero, but things like throwTo, which I think are more important). One way I was thinking of implementing this was wrapping the whole execution in block, and then calling unblock (unless iterIO's own hypothetical block function is called) for every invocation of liftIO. Unfortunately, the block and unblock functions now seem to be deprecated, and the replacement mask/unmask ones would not be as amenable to this technique. However, if there's some simpler way to guarantee that >>= is the point where exceptions are thrown (and might be the case for GHC in practice), then I basically only need to update the docs. If someone with more GHC understanding could explain how asynchronous exceptions work, I'd love to hear it...
Another question re: resource cleanup: in the docs I see:
Now suppose inumHttpBody fails (most likely because it receives an EOF before reading the number of bytes specified in the Content-Length header). Because inumHttpBody is fused to handler, the failure will cause handler to receive an EOF, which will cause foldForm to fail, which will cause handleI to receive an EOF and return, which will ensure hClose runs and the file handle h is not leaked.
Once the EOFs have been processed, the exception will propagate upwards making inumHttpServer fail, which in turn will send an EOF to iter. Then the exception will cause enum to fail, after which sock will be closed. In summary, despite the complex structure of the web server, because all the components are fused together with pipe operators, corner cases like this just work with no need to worry about leaked file descriptors.
Could you go into a little bit of detail about the mechanism behind this?
Yes, absolutely. This relies on the fact that an Inum must always return its target Iter, even when the Inum fails. This invariant is ensured by the two Inum construction functions, mkInumC and mkInumM, which catch exceptions thrown by the "codec" iteratee and add in the state of the target iteratee. Now when you execute code like "inum .| iter", the immediate result of running inum is "IterR tIn m (IterR tOut m a)"--i.e., the result of an iteratee returning the result an iteratee (because Inums are iteratees, too). If the Inum failed, then the outer IterR will use the Fail constructor: Fail !IterFail !(Maybe a) !(Maybe (Chunk t)) Where the "Maybe a" will be a "Maybe (IterR tOut m b)", and, because of the Inum invariant, will be Just an actual result. .| then must translate the inner iteratee result to the appropriate return type for the Inum (since the Inum's type (IterR tIn m ...) is different from the Iter's (Iter tOut m ...)). This happens through the internal function joinR, which says: joinR (Fail e (Just i) c) = flip onDoneR (runR i) $ \r -> case r of Done a _ -> Fail e (Just a) c Fail e' a _ -> Fail e' a c _ -> error "joinR" Where the 'runR' function basically keeps feeding EOF to an Iter (and executing it's monadic actions and rejecting its control requests) until it returns a result, at which point the result's residual input can be discarded and replaced with the residual input of the Inum. David