
On Sun, Aug 19, 2012 at 12:45:47AM -0400, Michael Orlitzky wrote:
On 08/18/2012 08:52 PM, Michael Orlitzky wrote:
I'm one bug away from a working program and need some help. I wrote a little utility that logs into LWN.net, retrieves an article, and creates an epub out of it.
I've created two pages where anyone can test this. The first just takes any username and password via POST and sets a session variable. The second prints "Success." if the session variable is set, and "Failure." if it isn't. The bash script,
[…]
The attached Haskell program, using Network.Curl, doesn't:

  $ runghc haskell-test.hs
  Logged in...
  Failure.

Any help is appreciated =)

So, take this with a grain of salt: I've been bitten by curl (the Haskell bindings, I mean) before, and I don't hold the quality of the library in great regard.

The libcurl documentation says: "When you set a file name with CURLOPT_COOKIEJAR, that file name will be created and all received cookies will be stored in it when curl_easy_cleanup(3) is called" (i.e. at the end of a curl handle session).

But even though the curl bindings seem to run easy_cleanup on handles (initialize → mkCurl → mkCurlWithCleanup), they don't do this correctly:

  DEBUG: ALLOC: CURL
  DEBUG: ALLOC: /tmp/network-curl-test-haskell20417.txt
  DEBUG: ALLOC: username=foo&password=bar
  DEBUG: ALLOC: http://michael.orlitzky.com/tmp/network-curl-test1.php
  DEBUG: ALLOC: WRITER
  DEBUG: ALLOC: WRITER

Note there's no "DEBUG: FREE: CURL", as the code seems to imply there should be. Hence the handle is never cleaned up (do the curl bindings leak handles?), so the cookie file is never written.

Side note: when running the same program multiple times, sometimes you see "DEBUG: FREE: CURL" and sometimes no FREE actions at all. I believe there's something very wrong in the curl bindings with regard to cleanups.

If I modify curl to export a "force cleanup" function, I can make the program work (but not always; my patch is a hack).

Alternatively, since the curl library doesn't need a cookie jar to use cookies within the same handle, modifying your code to reuse the same curl handle (returning it from log_in and passing it to get_page) gives me a success code. But the cookie file is still not filled, since the curl handle is never properly terminated.

Since the curl bindings also have problems in multi-threaded programs when SSL is enabled (they don't actually set up the curl library correctly with regard to multi-threaded memory allocation), I would suggest you try the http-conduit library, since that's a pure Haskell library that should work as well, if not better.
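
For reference, the handle-reuse workaround described above could look roughly like the sketch below. It is not the original haskell-test.hs: the names logIn, getPage and formBody are made up for illustration, and it assumes the bindings pass an empty CurlCookieFile straight through to CURLOPT_COOKIEFILE, which in libcurl turns on the in-memory cookie engine without touching any file.

```haskell
import Data.List (intercalate)
import Network.Curl

-- Hypothetical helper: join key/value pairs into a form body such as
-- "username=foo&password=bar". No URL-encoding is done; plain ASCII
-- values are assumed.
formBody :: [(String, String)] -> String
formBody fields = intercalate "&" [k ++ "=" ++ v | (k, v) <- fields]

-- POST the credentials and return the handle, which now holds the
-- session cookie in memory.
logIn :: URLString -> [(String, String)] -> IO Curl
logIn loginUrl creds = do
  curl <- initialize
  -- Assumption: an empty file name enables the cookie engine on this
  -- handle without reading or writing a cookie file.
  _ <- setopt curl (CurlCookieFile "")
  _ <- do_curl_ curl loginUrl
         [CurlPost True, CurlPostFields [formBody creds]] :: IO CurlResponse
  return curl

-- Fetch a page with the SAME handle, so the cookie from logIn is sent.
-- Options are sticky on a handle, so switch back from POST to GET.
getPage :: Curl -> URLString -> IO String
getPage curl url = do
  resp <- do_curl_ curl url [CurlHttpGet True] :: IO CurlResponse
  return (respBody resp)
```

To try it, wrap the calls in withCurlDo, pass the login URL to logIn, and hand the returned handle to getPage together with the URL of the second test page. No cookie file is involved at any point, so this sidesteps the cleanup problem rather than fixing it.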

Happy to be proved wrong, if I'm just biased against curl :)

regards,
iustin