
On Jun 14, 2008, at 6:49 PM, Isaac Dupree wrote:
Sebastiaan Visser wrote:
Hi, I've got a question about lazy IO in Haskell. The most well-known function for lazy IO is `hGetContents', which lazily reads all the contents from a handle and returns them as a regular [Char].

The thing with hGetContents is that it puts the Handle in a semi-closed state, so no one can use the handle anymore. This behaviour is understandable from the point of view of safety: it is not yet determined when the result of hGetContents will actually be computed, so using the handle in the meantime is undesirable.

The point is, I think I really have a situation in which I want to use the handle again `after' a call to hGetContents. I think I can best explain this using a code example.

    readHttpMessage :: IO (Headers, Data.ByteString.Lazy.ByteString)
    readHttpMessage = do
      myStream <- <accept http connection from client>
      request  <- hGetContents myStream
      header   <- parseHttpHeader request
      bs       <- Data.ByteString.Lazy.hGetContents myStream
      return (header, bs)
That's impure because parseHttpHeader doesn't return anything telling you how much of the stream it has looked at. Maybe it looked ahead more than it needed to, thus consuming part of the body.

I was going to suggest, if you can't change parseHttpHeader to use ByteStrings,

    bs     <- Data.ByteString.Lazy.hGetContents myStream
    header <- parseHttpHeader (Data.ByteString.Lazy.unpack bs)

but you still have to get parseHttpHeader (or perhaps if it has similar friends) to tell you how much of the string it consumed! I don't know what parsing functions you have available to work with, so I can't tell you whether it's possible.
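
One way to sidestep the "how much did it consume" question is to read the handle only once, as a lazy ByteString, and split header from body at the byte level before any parser runs. The following is a minimal sketch under assumptions that are not in the original messages: the names splitMessage and readMessage are hypothetical, and the header block is assumed to end at the first CR LF CR LF.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC
import System.IO (Handle)

-- | Split a raw message at the first blank line (CR LF CR LF).
-- Returns the header bytes (terminator removed) and the untouched
-- body bytes, or Nothing if no terminator is found.
-- (Hypothetical helper; the terminator is an assumption.)
splitMessage :: BL.ByteString -> Maybe (BL.ByteString, BL.ByteString)
splitMessage msg = go 0 msg
  where
    terminator = BLC.pack "\r\n\r\n"
    -- n counts the bytes skipped so far; rest is the unscanned suffix.
    go n rest
      | BL.null rest = Nothing
      | terminator `BL.isPrefixOf` rest =
          Just (BL.take n msg, BL.drop (BL.length terminator) rest)
      | otherwise = go (n + 1) (BL.tail rest)

-- | Read the handle exactly once; the header is handed back as a
-- String for a Char-based parser, the body stays a lazy ByteString.
-- (Hypothetical helper; Char8 unpacking maps each byte to one Char.)
readMessage :: Handle -> IO (Maybe (String, BL.ByteString))
readMessage h = do
  raw <- BL.hGetContents h
  return $ fmap (\(hdr, body) -> (BLC.unpack hdr, body)) (splitMessage raw)
```

Because the scan only forces input up to the terminator, the body is still read lazily, and the existing parser never sees bytes that belong to it.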
It is a regular Parsec parser and I am pretty sure it does not consume anything other than the header itself. Maybe I could rewrite my parser to work on Word8s instead of Chars; I don't think HTTP even allows Unicode characters within HTTP headers.

Thanks. I think I'll try this. But I'm still curious about how to lazily parse messages with arbitrary-size Unicode headers and plain (possibly binary) bodies.
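
For a Parsec parser, one way to make the consumed amount explicit is to return the unconsumed input next to the result using getInput. This is only a sketch: it assumes the parsec 3 module names (Text.Parsec), headerBlock is a hypothetical stand-in for the real header parser, and withRest and parseWithRest are made-up names. The leftover here is a String, so for a binary body it only demonstrates the reporting mechanism.

```haskell
import Text.Parsec (ParseError, anyChar, getInput, manyTill, parse, string, try)
import Text.Parsec.String (Parser)

-- | Hypothetical stand-in for the real header parser: takes everything
-- up to (and including) the first CR LF CR LF, returning the text
-- before the terminator.
headerBlock :: Parser String
headerBlock = manyTill anyChar (try (string "\r\n\r\n"))

-- | Run a parser and also return whatever input it left unconsumed.
withRest :: Parser a -> Parser (a, String)
withRest p = do
  x    <- p
  rest <- getInput
  return (x, rest)

-- | Parse the header block and report the remaining input.
parseWithRest :: String -> Either ParseError (String, String)
parseWithRest = parse (withRest headerBlock) "<http message>"
```

With this shape the caller no longer has to trust that the parser stopped at the right place; the remaining input is part of the result.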
-Isaac
-- Sebastiaan.