
I have been looking around at possibly making a Browser module for
Network.HTTP.Conduit on top of Control.Monad.State. I came across this
roadbump:
In order to implement redirection following, client code must call 'http'
with a redirection count of 0. If there is a redirect, 'http' will throw an
exception. In order to catch this exception and continue on with the
redirection chain, the 'catch' function must be called. The problem is that
the 'catch' function has a type of (catch :: Exception e => IO a -> (e ->
IO a) -> IO a) which means that it can only be used in the IO monad. A call
to 'http' inside the first argument of 'catch' must be wrapped in a
'runResourceT'
This has a couple implications:
- The catch function cannot yield a Source, because any connections
which 'http' opened will have to be immediately closed as soon as control
reaches the 'catch' function (because of the runResourceT). The only way
around this is to put the bind inside the catch block with some default
sink, but this loses all of the constant-memory benefits of using a conduit
in the first place.
- The function that will be calling 'catch' operates as a State monad.
Cookie state must be updated both before and after making the request (both
inside the 'catch' block and outside). I could pass around an updated
cookie jar out of the IO monad as an extra entry in a tuple, but this
defeats the purpose of using a State monad in the first place. This kind of
programming is very ugly.
I see two solutions to these problems:
- Make 'http' not throw a StatusCodeException if the _original_
redirection count = 0
- Make my internal module call the un-exported 'httpRaw' function. This
has the problem that it doen't solve the problem for anyone else trying to
use http-conduit; it only solves the problem for me. Perhaps this could be
alleviated by exporting httpRaw.
What do you think?
--Myles
On Tue, Feb 7, 2012 at 12:35 AM, Myles C. Maxfield wrote: Alright. I'll issue another pull request to you when it's done (expect in
a couple weeks). Thanks for your input so far, Aristid and Michael. @Chris Wong: Do you want to talk about the Suffix List stuff some time? --Myles On Mon, Feb 6, 2012 at 10:14 PM, Michael Snoyman +1 I would say: if it adds no package dependencies, put it right in. Aristid Am 06.02.2012 22:09 schrieb "Myles C. Maxfield" <
myles.maxfield@gmail.com>: After all these commits have been flying around, I have yet another
question: the 'HTTP' package defines Network.Browser which is a State monad which
keeps state about a "browser" (i.e. a cookie jar, a proxy, redirection
parameters, etc.) It would be pretty straightforward to implement this kind of functionality on top of http-conduit. I was originally going to do it and release it as its own package, but
it
may be beneficial to add such a module to the existing http-conduit Should I add it in to the existing package, or release it as its own
package? --Myles On Mon, Feb 6, 2012 at 12:15 AM, Michael Snoyman Just an FYI for everyone: Myles sent an (incredibly thorough) pull
request to handle cookies: https://github.com/snoyberg/http-conduit/pull/13 Thanks! On Sun, Feb 5, 2012 at 8:20 AM, Myles C. Maxfield
1. The spec defines a grammar for the attributes. They're in uppercase. 2. Yes - 1.3 is the first version that lists DiffTime as an
instance of
RealFrac (so I can use the 'floor' function to pull out the number
of
seconds to render it)
3. I'll see what I can do. --Myles On Sat, Feb 4, 2012 at 9:06 PM, Michael Snoyman <
michael@snoyman.com>
wrote:
>
> Looks good, a few questions/requests:
>
> 1. Is there a reason to upper-case all the attributes?
> 2. Is the time >= 1.3 a requirements? Because that can cause a lot
of
> trouble for people.
> 3. Can you send the patch as a Github pull request? It's easier to
> track that way.
>
> Michael
>
> On Sat, Feb 4, 2012 at 1:21 AM, Myles C. Maxfield
> > > works,
> > though.
> >
> > Thanks,
> > Myles
> >
> >
> > On Thu, Feb 2, 2012 at 11:26 PM, Myles C. Maxfield
> > > >> second
> >> file, which is a very large source file that contains each
element
> >> in
> >> the
> >> list. The list should export `elem`-type queries. I'm not quite
> >> sure
> >> how to
> >> handle wildcards that appear in the list - that part is up to
you.
> >> Thanks
> >> for helping out with this :]
> >>
> >> --Myles
> >>
> >>
> >> On Thu, Feb 2, 2012 at 10:53 PM, Michael Snoyman
> >> > >>> > of
> >>> > its
> >>> > domain verification. The cookie code only needs to be able to
> >>> > tell
> >>> > if a
> >>> > specific string is in the list (W.Ascii -> Bool)
> >>> >
> >>> > I propose making an entirely unrelated package,
> >>> > public-suffix-list,
> >>> > with a
> >>> > module Network.PublicSuffixList, which will expose this
> >>> > function, as
> >>> > well as
> >>> > functions about parsing the list itself. Thoughts?
> >>> >
> >>> > Web.Cookie doesn't have a "secure-only" attribute. Adding
one in
> >>> > is
> >>> > straightforward enough.
> >>> > The spec describes cookies as a property of HTTP, not of the
> >>> > World
> >>> > Wide
> >>> > Web.
> >>> > Perhaps "Web.Cookie" should be renamed? Just a thought; it
> >>> > doesn't
> >>> > really
> >>> > matter to me.
> >>> >
> >>> > As for Network.HTTP.Conduit.Cookie, the spec describes in
> >>> > section
> >>> > 5.3
> >>> > "Storage Model" what fields a Cookie has. Here is my proposal
> >>> > for
> >>> > the
> >>> > functions it will expose:
> >>> >
> >>> > receiveSetCookie :: SetCookie -> Req.Request m -> UTCTime ->
> >>> > Bool ->
> >>> > CookieJar -> CookieJar
> >>> >
> >>> > Runs the algorithm described in section 5.3 "Storage Model"
> >>> > The UTCTime is the current-time, the Bool is whether or not > >>> > caller
> >>> > is an
> >>> > HTTP-based API (as opposed to JavaScript or anything else)
> >>> >
> >>> > updateCookieJar :: Res.Response a -> Req.Request m ->
UTCTime ->
> >>> > CookieJar
> >>> > -> (CookieJar, Res.Response a)
> >>> >
> >>> > Applies "receiveSetCookie" to a Response. The output
CookieJar
> >>> > is
> >>> > stripped
> >>> > of any Set-Cookie headers.
> >>> > Specifies "True" for the Bool in receiveSetCookie
> >>> >
> >>> > computeCookieString :: Req.Request m -> CookieJar -> UTCTime
->
> >>> > Bool
> >>> > ->
> >>> > (W.Ascii, CookieJar)
> >>> >
> >>> > Runs the algorithm described in section 5.4 "The Cookie
Header"
> >>> > The UTCTime and Bool are the same as in receiveSetCookie
> >>> >
> >>> > insertCookiesIntoRequest :: Req.Request m -> CookieJar ->
> >>> > UTCTime ->
> >>> > (Req.Request m, CookieJar)
> >>> >
> >>> > Applies "computeCookieString" to a Request. The output cookie
> >>> > jar
> >>> > has
> >>> > updated last-accessed-times.
> >>> > Specifies "True" for the Bool in computeCookieString
> >>> >
> >>> > evictExpiredCookies :: CookieJar -> UTCTime -> CookieJar
> >>> >
> >>> > Runs the algorithm described in the last part of section 5.3
> >>> > "Storage
> >>> > Model"
> >>> >
> >>> > This will make the relevant part of 'http' look like:
> >>> >
> >>> > go count req'' cookie_jar'' = do
> >>> > now <- liftIO $ getCurrentTime
> >>> > let (req', cookie_jar') = insertCookiesIntoRequest
req''
> >>> > (evictExpiredCookies cookie_jar'' now) now
> >>> > res' <- httpRaw req' manager
> >>> > let (cookie_jar, res) = updateCookieJar res' req' now
> >>> > cookie_jar'
> >>> > case getRedirectedRequest req' (responseHeaders res)
> >>> > (W.statusCode
> >>> > (statusCode res)) of
> >>> > Just req -> go (count - 1) req cookie_jar
> >>> > Nothing -> return res
> >>> >
> >>> > I plan to not allow for a user-supplied cookieFilter
function.
> >>> > If
> >>> > they
> >>> > want
> >>> > that functionality, they can re-implement the
> >>> > redirection-following
> >>> > logic.
> >>> >
> >>> > Any thoughts on any of this?
> >>> >
> >>> > Thanks,
> >>> > Myles
> >>> >
> >>> > On Wed, Feb 1, 2012 at 5:19 PM, Myles C. Maxfield
> >>> > > >>> >>> >> as a
> >>> >>> >> reference.
> >>> >>> >>
> >>> >>> >> I will implement the functions according to RFC 6265
> >>> >>> >> I will also need to write the following functions.
Should
> >>> >>> >> they
> >>> >>> >> also be
> >>> >>> >> exported?
> >>> >>> >>
> >>> >>> >> canonicalizeDomain :: W.Ascii -> W.Ascii
> >>> >>> >>
> >>> >>> >> turns "..a.b.c..d.com..." to "a.b.c.d.com"
> >>> >>> >> Technically necessary for domain matching (Chrome does
it)
> >>> >>> >> Perhaps unnecessary for a first pass? Perhaps we can > >>> >>> >> users
> >>> >>> >> for
> >>> >>> >> now?
> >>> >>> >>
> >>> >>> >> domainMatches :: W.Ascii -> W.Ascii -> Maybe W.Ascii
> >>> >>> >>
> >>> >>> >> Does the first domain match against the second domain?
> >>> >>> >> If so, return the prefix of the first that isn't in the
> >>> >>> >> second
> >>> >>> >>
> >>> >>> >> pathMatches :: W.Ascii -> W.Ascii -> Bool
> >>> >>> >>
> >>> >>> >> Do the paths match?
> >>> >>> >>
> >>> >>> >> In order to implement domain matching, I have to have
> >>> >>> >> knowledge
> >>> >>> >> of
> >>> >>> >> the Public Suffix List so I know that
> >>> >>> >> sub1.sub2.pvt.k12.wy.us
> >>> >>> >> can
> >>> >>> >> set
> >>> >>> >> a
> >>> >>> >> cookie for sub2.pvt.k12.wy.us but not for k12.wy.us
> >>> >>> >> (because
> >>> >>> >> pvt.k12.wy.us
> >>> >>> >> is a "suffix"). There are a variety of ways to implement
> >>> >>> >> this.
> >>> >>> >>
> >>> >>> >> As far as I can tell, Chrome does it by using a script
> >>> >>> >> (which a
> >>> >>> >> human
> >>> >>> >> periodically runs) which parses the list at creates a
.cc
> >>> >>> >> file
> >>> >>> >> that is
> >>> >>> >> included in the build.
> >>> >>> >>
> >>> >>> >> I might be wrong about the execution of the script; it
> >>> >>> >> might be
> >>> >>> >> a
> >>> >>> >> build
> >>> >>> >> step. If it is a build step, however, it is suspicious > >>> >>> >> a
> >>> >>> >> build
> >>> >>> >> target
> >>> >>> >> would try to download a file...
> >>> >>> >>
> >>> >>> >> Any more elegant ideas?
> >>> >>> >>
> >>> >>> >> Feedback on any/all of the above would be very helpful
> >>> >>> >> before I
> >>> >>> >> go
> >>> >>> >> off
> >>> >>> >> into the weeds on this project.
> >>> >>> >>
> >>> >>> >> Thanks,
> >>> >>> >> Myles C. Maxfield
> >>> >>> >>
> >>> >>> >> On Sat, Jan 28, 2012 at 8:17 PM, Michael Snoyman
> >>> >>> >> > >>> >>> >>>
> >>> >>> >>> On Sat, Jan 28, 2012 at 8:36 PM, Myles C. Maxfield
> >>> >>> >>> > >>> >>> >>> >> to
> >>> >>> >>> >> myself
> >>> >>> >>> >> that
> >>> >>> >>> >> you're right. That's the kind of code that could
easily
> >>> >>> >>> >> be
> >>> >>> >>> >> broken
> >>> >>> >>> >> in
> >>> >>> >>> >> future updates by an unwitting maintainer (e.g.,
me).
> >>> >>> >>> >> To
> >>> >>> >>> >> protect
> >>> >>> >>> >> the
> >>> >>> >>> >> world from me, I'd prefer if the code didn't have On Mon, Feb 6, 2012 at 11:16 PM, Aristid Breitkreuz
> >>> >>> >>> >> fromJust.
> >>> >>> >>> >> This
> >>> >>> >>> >> might be a good place to leverage the Monad
instance of
> >>> >>> >>> >> Maybe.
> >>> >>> >>> >>
> >>> >>> >>> >> Michael
> >>> >>> >>> >
> >>> >>> >>> >
> >>> >>> >>
> >>> >>> >>
> >>> >>> >
> >>> >
> >>> >
> >>
> >>
> > _______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe