
On Sat, Aug 10, 2013 at 05:16:58PM -0700, Dan Krol wrote:
Hi,
I'm working on an RSS file getter. I was wondering if I could get some help getting files to download and save without holding the entire file in memory in between. I chose http-conduit's simpleHttp only because it was recommended and it was the quickest thing I could get to work correctly. I was eager to get started on this project, so I'd be happy to switch.
Here's where I define the download and save functions:
https://github.com/orblivion/feedGetter/blob/master/rss.hs#L107
And here's where I use them, getting multiple at a time with async:
https://github.com/orblivion/feedGetter/blob/master/rss.hs#L208
What happens when I run this is that it outputs that it's "Getting" the file, waits a while (presumably to download the whole thing), then says it's "Saving". I checked the file system, and the file is not there during the pause. I'm not entirely sure why. Is it my choice of libraries, or the way I'm using them? Perhaps something to do with async? I just tried content <- simpleHttp "http://google.com" in ghci, and it does pause for a second, so I'm guessing this is strict from the get-go. But I've done almost no I/O before.
Is there a straightforward, canonical option? It seems like there perhaps should be. But if it comes down to using pipes or conduit, what the heck, I'll try it out. I'd like to learn pipes eventually.
Michael is very good at documenting his packages. This is what I found in the docs for http-conduit (http://is.gd/WkDb7G):

Note: Even though this function returns a lazy bytestring, it does not utilize lazy I/O, and therefore the entire response body will live in memory. If you want constant memory usage, you'll need to use the conduit package and http directly.

/M

-- 
Magnus Therning OpenPGP: 0xAB4DFBA4
email: magnus@therning.org jabber: magnus@therning.org
twitter: magthe http://therning.org/magnus

I invented the term Object-Oriented, and I can tell you I did not have C++ in mind.
-- Alan Kay
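
For illustration, the constant-memory approach that the note points to might look roughly like the sketch below. It assumes a recent http-conduit (2.x) together with conduit, conduit-extra and resourcet, whose API differs from the version current when this thread was written. The helper name downloadTo, the URL and the output file name are placeholders, not taken from the feedGetter code.

```haskell
module Main where

import Control.Monad.Trans.Resource (runResourceT)
import Data.Conduit (runConduit, (.|))
import Data.Conduit.Binary (sinkFile)
import Network.HTTP.Conduit
  ( Manager, http, newManager, parseRequest
  , responseBody, tlsManagerSettings )

-- | Hypothetical helper: stream the body of @url@ into @path@.
-- The body is consumed chunk by chunk, so the whole file is never
-- held in memory at once.
downloadTo :: Manager -> String -> FilePath -> IO ()
downloadTo manager url path = do
  request <- parseRequest url
  -- runResourceT makes sure the connection and the file handle are
  -- released even if an exception is thrown part-way through.
  runResourceT $ do
    response <- http request manager
    -- responseBody is a conduit source of ByteString chunks;
    -- sinkFile writes each chunk to disk as it arrives.
    runConduit $ responseBody response .| sinkFile path

main :: IO ()
main = do
  -- One Manager can be shared across many (also concurrent) downloads.
  manager <- newManager tlsManagerSettings
  -- Placeholder URL and file name, chosen only for the example.
  downloadTo manager "http://example.com/" "example.html"
```

With this shape the output file should start growing on disk as soon as the first chunks arrive, rather than appearing only after the whole download has finished. Since the Manager is passed in, the same helper could presumably be called from several async workers at once.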