
Quoth Pete Kazmier, nevermore,
the same error regarding max open files. Incidentally, the lazy bytestring version of my program was by far the fastest and used the least amount of memory, but it still crapped out regarding max open files.
I've tried the approach you appear to be using, and it can be tricky to predict how the laziness will interact with the list of actions. For example, I tried to download a temporary file, read a bit of data out of it and then download another one. To save myself some thinking I used the same file name for each download: /tmp/feed.xml. What happened was that it downloaded them all in rapid succession, overwriting each one with the next and not actually reading the data until the end. So I ended up parsing N identical copies of the final file, instead of one of each.

You need to refactor how you map the functions so that fewer whole lists are passed around. I'd guess that (1) is being executed in its entirety before its result is passed to (2), but it's not until (2) that the file data is actually used.

    main = getArgs >>= mapM fileContentsOfDirectory >>=              -- (1)
           mapM_ print . threadEmails . map parseEmail . concat      -- (2)

This means there are a lot of files sitting open doing nothing. I've had a lot of success by recreating this as:

    main = getArgs >>= mapM_ readAndPrint
      where readAndPrint dir = fileContentsOfDirectory dir >>= print -- etc.

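A slightly fuller sketch of the same idea, in case it helps. This is only an illustration under some assumptions: readFileStrict, summarise and filesIn are made-up names, summarise is a stand-in for the real per-file work (parseEmail and friends), and it needs the directory and filepath packages that ship with GHC. The point it tries to show is that each file is opened, fully read and closed before the next one is touched.

```haskell
import Control.Exception (evaluate)
import Control.Monad (filterM)
import System.Directory (doesFileExist, listDirectory)
import System.Environment (getArgs)
import System.FilePath ((</>))
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Read a file strictly: force the whole string while the handle is
-- still open, so that withFile can close it before we move on.
-- (Hypothetical helper, not part of the original program.)
readFileStrict :: FilePath -> IO String
readFileStrict path = withFile path ReadMode $ \h -> do
  contents <- hGetContents h
  _ <- evaluate (length contents)   -- walks the whole list, reading everything
  return contents

-- Hypothetical stand-in for the per-file work (parseEmail etc.).
summarise :: FilePath -> String -> String
summarise path contents =
  path ++ ": " ++ show (length (lines contents)) ++ " lines"

-- Deal with one file completely before touching the next one.
readAndPrint :: FilePath -> IO ()
readAndPrint path = readFileStrict path >>= putStrLn . summarise path

-- List the regular files directly inside a directory.
-- (Hypothetical helper; the original fileContentsOfDirectory is not shown.)
filesIn :: FilePath -> IO [FilePath]
filesIn dir = do
  names <- listDirectory dir
  filterM doesFileExist (map (dir </>) names)

main :: IO ()
main = getArgs >>= mapM_ (\dir -> filesIn dir >>= mapM_ readAndPrint)
```

With this shape only one handle should be open at any moment, however many files the directories hold. If a later step such as threadEmails really does need every message at once, one option is to have the per-file step return the small parsed value rather than the raw contents, so that nothing lazy still hangs on to an open file.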
It may seem semantically identical, but it sometimes makes a difference when things actually happen.

-- 
Dougal Stanton