
Hello,

I have a problem with the Network.HTTP module (http://www.haskell.org/http/), version 3001.0.0. I have already mailed Bjorn Bringert about it, but I haven't gotten an answer yet, so maybe someone here can help me. GHC 6.6.1, Ubuntu 7.10, x86_64. I have turned on the debug flag.

Using the get example (http://darcs.haskell.org/http/test/get.hs) I can download pages like this:

$ ./get http://www.haskell.org/http/
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Haskell HTTP package</title>
<link href="style.css" rel="stylesheet" type="text/css" />
</head>
<body>
.... SNIP rest of the content SNIP ....

The debug log also contains the content of this file. However, some links misbehave, like:

$ ./get http://www.podshow.com/feeds/gbtv.xml
... no output ...

Yet I can see the content of this XML in the debug file, and wget downloads almost 250 kB of data. Also:

$ ./get http://digg.com/rss/indexvideos_animation.xml
... hangs ...

Here the debug file has size 0, but wget downloads the file. I could suspect this is an XML problem, but:

$ ./get http://planet.haskell.org/rss20.xml
<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
<title>Planet Haskell</title>
<link>http://planet.haskell.org/</link>
<language>en</language>
<description>Planet Haskell - http://planet.haskell.org/</description>
.... SNIP rest of the content SNIP ....

so it works. Do you have any idea what is going on here? What goes wrong? What other (high-level) modules could I use to download files over HTTP?

Cheers,
Radek.

--
Codeside: http://codeside.org/
Przedszkole Miejskie nr 86 w Lodzi: http://www.pm86.pl/

On Nov 17, 2007, at 17:07 , Radosław Grzanka wrote:
[snip: full problem report, quoted above]
Hi Radek,

thanks for the report. This turned out to be a bug in how Network.HTTP handled chunked transfer encoding. The web server sent the chunk size as "00004000" (according to RFC 2616 this can be any non-empty sequence of hex digits). However, Network.HTTP treated any chunk size starting with '0' as a chunk size of 0, which indicates the end of the chunked encoding.

This is now fixed, and a new release with the fix is available from http://hackage.haskell.org/cgi-bin/hackage-scripts/package/HTTP-3001.0.1

/Björn
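To illustrate the bug Björn describes, here is a minimal stdlib-only sketch of chunked-body decoding (my own illustration, not the library's actual code). The crucial point: the stream ends only when the chunk size *parses* to zero, so a size field like "00004000" must yield 16384 (0x4000) rather than terminate the body.

```haskell
import Data.Char (isHexDigit)
import Numeric (readHex)

-- Parse a chunk-size line.  RFC 2616 allows any non-empty run of hex
-- digits, so leading zeros are legal: "00004000" is 16384, not 0.
chunkSize :: String -> Int
chunkSize line =
  case readHex (takeWhile isHexDigit line) of
    [(n, _)] -> n
    _        -> error ("bad chunk size: " ++ line)

-- Decode a body of the form "size\r\ndata\r\n...\r\n0\r\n\r\n".
-- Only a size that parses to 0 is the terminator.
dechunk :: String -> String
dechunk input =
  let (line, rest) = breakCRLF input
      n            = chunkSize line
  in if n == 0
       then ""                                 -- genuine last chunk
       else let (body, rest') = splitAt n rest
            in body ++ dechunk (drop 2 rest')  -- skip trailing CRLF

breakCRLF :: String -> (String, String)
breakCRLF s = let (a, b) = break (== '\r') s in (a, drop 2 b)

main :: IO ()
main = putStrLn (dechunk "4\r\nWiki\r\n00000005\r\npedia\r\n0\r\n\r\n")  -- prints "Wikipedia"
```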

This is now fixed and a new release with the fix is available from http://hackage.haskell.org/cgi-bin/hackage-scripts/package/HTTP-3001.0.1
/Björn
Thank you very much! That was fast. I have switched to the curl bindings for the moment, but I will gladly switch back. :)

Thanks again,
Radek.

Hi Bjorn, I have tested the new version:
$ ./get http://www.podshow.com/feeds/gbtv.xml
... no output ...
This case is indeed fixed. Thanks!
Also: $ ./get http://digg.com/rss/indexvideos_animation.xml
However, this one still seems to hang and eventually ends with:

get: recv: resource vanished (Connection reset by peer)

Cheers,
Radek.

On Nov 17, 2007 4:52 PM, Radosław Grzanka wrote:
Also: $ ./get http://digg.com/rss/indexvideos_animation.xml
However, this one still seems to hang and eventually ends with:
get: recv: resource vanished (Connection reset by peer)
It's not a Haskell problem. It looks like Digg expects a User-Agent request header. Modify get.hs like this:

request uri = Request { rqURI     = uri
                      , rqMethod  = GET
                      , rqHeaders = [Header HdrUserAgent "haskell-get-example"]
                      , rqBody    = "" }

and see what happens.

G

Hi Graham,
2007/11/17, Graham Fawcett wrote:
On Nov 17, 2007 4:52 PM, Radosław Grzanka wrote:
Also: $ ./get http://digg.com/rss/indexvideos_animation.xml
However, this one still seems to hang and eventually ends with:
get: recv: resource vanished (Connection reset by peer)
It's not a Haskell problem. It looks like Digg expects a User-Agent request header. Modify get.hs like this:
request uri = Request { rqURI     = uri
                      , rqMethod  = GET
                      , rqHeaders = [Header HdrUserAgent "haskell-get-example"]
                      , rqBody    = "" }
Yes, that works. It's not only Digg but other services as well. Thank you for your help.

Cheers,
Radek.

--
Codeside: http://codeside.org/
Przedszkole Miejskie nr 86 w Lodzi: http://www.pm86.pl/
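For anyone puzzled by why the header matters: some servers simply drop or reset requests that carry no User-Agent line. A stdlib-only illustration of what Graham's change puts on the wire (the function name here is mine, not from get.hs; the only difference from the original request is the extra User-Agent line):

```haskell
import Data.List (isInfixOf)

-- Hypothetical helper building the raw request text that the modified
-- get.hs would send; servers like Digg look for the User-Agent line.
rawRequest :: String -> String -> String -> String
rawRequest host path agent = concat
  [ "GET ", path, " HTTP/1.1\r\n"
  , "Host: ", host, "\r\n"
  , "User-Agent: ", agent, "\r\n"
  , "\r\n" ]

main :: IO ()
main = print ("User-Agent: haskell-get-example\r\n"
              `isInfixOf`
              rawRequest "digg.com" "/rss/indexvideos_animation.xml" "haskell-get-example")
```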

Hello again Bjorn,
This is now fixed and a new release with the fix is available from http://hackage.haskell.org/cgi-bin/hackage-scripts/package/HTTP-3001.0.1
You have left the debug flag on in the library code.

Thanks,
Radek.

--
Codeside: http://codeside.org/
Przedszkole Miejskie nr 86 w Lodzi: http://www.pm86.pl/

On Nov 18, 2007, at 22:08 , Radosław Grzanka wrote:
Hello again Bjorn,
This is now fixed and a new release with the fix is available from http://hackage.haskell.org/cgi-bin/hackage-scripts/package/HTTP-3001.0.1
You have left debug flag on in the library code.
Thanks, Radek.
Dammit. I forgot that cabal sdist of course uses the code in the current directory, not what's recorded in darcs. Silly me.

3001.0.2 fixes this: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/HTTP-3001.0.2

Thanks!

/Björn
participants (3):
- Bjorn Bringert
- Graham Fawcett
- Radosław Grzanka