Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

6 May 2011

At Fri, 6 May 2011 10:15:50 +0200,
Gregory Collins wrote:
...
Hi David,
Re: this comment from catchI:
...
It is not possible to catch asynchronous exceptions, such as
lazily evaluated divide-by-zero errors, the throw function, or
exceptions raised by other threads using throwTo if those
exceptions might arrive anywhere outside of a liftIO call.
It might be worth investigating providing a version which can catch
asynchronous exceptions if the underlying monad supports it (via
MonadCatchIO or something similar). One of the most interesting
advantages I can see for IterIO over the other iteratee
implementations is that you actually have some control over resource
usage -- not being able to catch asynchronous exceptions nullifies
much of that advantage. A clear use case for this is timeouts on
server threads, where you typically throw a TimeoutException
to the handling thread using "throwTo" if the timeout is exceeded.
Excellent point.  There's actually a chance that iterIO already
catches those kinds of exceptions, but I wasn't sure enough about how
the Haskell runtime works to make that claim.  I've noticed in
practice that asynchronous exceptions tend to come exactly when I
execute the IO >>= operation.  If that's true, then since each IO >>=
is wrapped in a try block, the exceptions will all be caught (well,
not divide by zero, but things like throwTo, which I think are more
important).
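
As an illustration of the distinction drawn above, here is a minimal
sketch using only base (Control.Exception and Control.Concurrent), not
iterIO itself. The Timeout type and the function names are hypothetical.
It shows an exception delivered with throwTo being caught by try, and a
lazily evaluated divide-by-zero slipping past try unless the value is
forced inside it.

```haskell
import Control.Concurrent (forkIOWithUnmask, threadDelay, throwTo)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception
  (ArithException (..), Exception, evaluate, mask_, try)

-- Hypothetical exception standing in for a server timeout.
data Timeout = Timeout deriving (Show, Eq)
instance Exception Timeout

-- The worker starts with asynchronous exceptions masked and unmasks
-- them only around the long-running action inside `try`, so the
-- exception sent by throwTo can only arrive where `try` will see it.
caughtTimeout :: IO (Either Timeout ())
caughtTimeout = do
  result <- newEmptyMVar
  tid <- mask_ $ forkIOWithUnmask $ \unmask -> do
    r <- try (unmask (threadDelay 10000000))  -- 10 seconds, interrupted early
    putMVar result r
  throwTo tid Timeout
  takeMVar result

-- `return` does not force its argument, so `try` sees no exception
-- and hands back an unevaluated thunk wrapped in Right.
lazyEscapes :: IO (Either ArithException Int)
lazyEscapes = try (return (1 `div` 0))

-- `evaluate` forces the value inside `try`, so the error is caught.
forcedCaught :: IO (Either ArithException Int)
forcedCaught = try (evaluate (1 `div` 0))
```

In this sketch caughtTimeout yields Left Timeout and forcedCaught
yields Left DivideByZero, while lazyEscapes yields a Right whose
payload throws only when something later forces it.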

One way I was thinking of implementing this was wrapping the whole
execution in block, and then calling unblock (unless iterIO's own
hypothetical block function is called) for every invocation of liftIO.
Unfortunately, the block and unblock functions now seem to be
deprecated, and the replacement mask/unmask ones would not be as
amenable to this technique.
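
One plausible reading of why mask is less amenable: the function that
mask passes to its argument restores the masking state that was in
effect when mask was called, rather than unconditionally unmasking the
way unblock did. The sketch below uses only Control.Exception from
base; nestedStates is a hypothetical name. It records the masking
state seen at three points under two nested masks.

```haskell
import Control.Exception (MaskingState (..), getMaskingState, mask)

-- Returns the masking state (1) inside both masks, (2) under the inner
-- restore, and (3) under the outer restore.
nestedStates :: IO (MaskingState, MaskingState, MaskingState)
nestedStates =
  mask $ \restoreOuter ->
    mask $ \restoreInner -> do
      here  <- getMaskingState
      inner <- restoreInner getMaskingState  -- back to the state at the inner mask
      outer <- restoreOuter getMaskingState  -- back to the state at the outer mask
      return (here, inner, outer)
```

When called from an unmasked thread, this gives
(MaskedInterruptible, MaskedInterruptible, Unmasked): only the
outermost restore actually re-enables asynchronous exceptions, so an
"unmask in every liftIO" design would have to carry that outermost
function around explicitly.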

However, if there's some simpler way to guarantee that >>= is the
point where exceptions are thrown (which might be the case for GHC in
practice), then I basically only need to update the docs.  If someone
with more GHC understanding could explain how asynchronous exceptions
work, I'd love to hear it...
...
Another question re: resource cleanup: in the docs I see:
...
Now suppose inumHttpBody fails (most likely because it receives an
EOF before reading the number of bytes specified in the
Content-Length header). Because inumHttpBody is fused to handler,
the failure will cause handler to receive an EOF, which will cause
foldForm to fail, which will cause handleI to receive an EOF and
return, which will ensure hClose runs and the file handle h is not
leaked.
...
Once the EOFs have been processed, the exception will propagate
upwards making inumHttpServer fail, which in turn will send an EOF
to iter. Then the exception will cause enum to fail, after which
sock will be closed. In summary, despite the complex structure of
the web server, because all the components are fused together with
pipe operators, corner cases like this just work with no need to
worry about leaked file descriptors.
Could you go into a little bit of detail about the mechanism behind this?
Yes, absolutely.  This relies on the fact that an Inum must always
return its target Iter, even when the Inum fails.  This invariant is
ensured by the two Inum construction functions, mkInumC and mkInumM,
which catch exceptions thrown by the "codec" iteratee and add in the
state of the target iteratee.

Now when you execute code like "inum .| iter", the immediate result of
running inum is "IterR tIn m (IterR tOut m a)"--i.e., the result of an
iteratee returning the result of an iteratee (because Inums are
iteratees, too).  If the Inum failed, then the outer IterR will use
the Fail constructor:

        Fail !IterFail !(Maybe a) !(Maybe (Chunk t))

Where the "Maybe a" will be a "Maybe (IterR tOut m b)", and, because
of the Inum invariant, will be Just an actual result.  .| then must
translate the inner iteratee result to the appropriate return type for
the Inum (since the Inum's type (IterR tIn m ...) is different from
the Iter's (Iter tOut m ...)).  This happens through the internal
function joinR, which says:

joinR (Fail e (Just i) c) = flip onDoneR (runR i) $ \r ->
                            case r of
                              Done a _    -> Fail e (Just a) c
                              Fail e' a _ -> Fail e' a c
                              _ -> error "joinR"

Where the 'runR' function basically keeps feeding EOF to an Iter (and
executing its monadic actions and rejecting its control requests)
until it returns a result, at which point the result's residual input
can be discarded and replaced with the residual input of the Inum.
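
The behaviour described above can be tried out with a toy model. The
types below (Chunk, Iter, IterR) are simplified stand-ins and not
iterIO's definitions: there is no monad, no control requests, and
input is a plain String. lengthI and failedInum are hypothetical
names, and joinR models only the failure case quoted above.

```haskell
-- Toy stand-ins for illustration; iterIO's real types are richer.
data Chunk = Chunk String Bool            -- input data and an EOF flag

newtype Iter a = Iter (Chunk -> IterR a)  -- consumes a chunk, yields a result

data IterR a
  = NeedInput (Iter a)                    -- still wants more input
  | Done a Chunk                          -- finished, with residual input
  | Fail String (Maybe a) (Maybe Chunk)   -- failed, maybe with a result

-- Keep feeding EOF until the iteratee produces Done or Fail.
-- Assumes the iteratee finishes once it sees EOF; otherwise this loops.
runR :: IterR a -> IterR a
runR (NeedInput (Iter k)) = runR (k (Chunk "" True))
runR r                    = r

-- Only the failure case from the email is modelled here.
joinR :: IterR (IterR a) -> IterR a
joinR (Fail e (Just i) c) =
  case runR i of
    Done a _    -> Fail e (Just a) c   -- keep the Inum's error and residual
    Fail e' a _ -> Fail e' a c         -- the target's own failure wins
    NeedInput _ -> error "joinR"       -- runR never returns NeedInput
joinR _ = error "joinR: case not covered by this sketch"

-- Hypothetical target iteratee: counts characters until EOF.
lengthI :: Int -> Iter Int
lengthI n = Iter $ \(Chunk s isEof) ->
  let n' = n + length s
  in if isEof then Done n' (Chunk "" True) else NeedInput (lengthI n')

-- An Inum that failed after its target had already consumed 3 characters.
failedInum :: IterR (IterR Int)
failedInum =
  Fail "short body" (Just (NeedInput (lengthI 3))) (Just (Chunk "rest" False))

joined :: IterR Int
joined = joinR failedInum
```

Here joined is Fail "short body" (Just 3) (Just (Chunk "rest" False)):
the target was driven to completion with EOF, its result was kept, and
the residual input is the Inum's rather than the target's.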

David

dm-list-haskell-cafe＠scs.stanford.edu