
"Matthew Brecknell"
So here's a test. I don't have any big maildirs handy, so this is based on the simple exercise of printing the first line of each of a large number of files. First, the preamble.
import Control.Exception (bracket)
import System.Environment
import System.IO

main = do
  t:n:fs <- getArgs
  ([test0,test1,test2,test3] !! read t) (take (read n) $ cycle fs)
[snip]
Thank you for summarizing the approaches presented by others. As a Haskell newbie, I find there are quite a few esoteric concepts to conquer. Your concrete examples helped me understand the ramifications of the various approaches.

After reading the various threads you cited, I decided to avoid lazy IO altogether. By using 'readFile' without forcing the strict evaluation of my parser, I inadvertently relinquished control of the resource management: closing of the file handles was left to the GC. And although I could have used 'seq' to address the issue, why bother fixing a problem that could have been avoided altogether by using strict IO?

With that said, I added the following function to my program and then replaced the invocation of 'readFile' with it:

readEmailHeaders :: FilePath -> IO String
readEmailHeaders file =
  bracket (openFile file ReadMode) hClose (headers [])
  where
    headers acc h = do
      line <- hGetLine h
      case line of
        -- Stop reading file once we hit the empty separator
        -- line, no need to read the rest of the file (body).
        "" -> return . concat . reverse $ acc
        _  -> headers ("\n":line:acc) h

I'm not sure if this is the best implementation, but the speed is comparable to the lazy IO version without the annoying defect of running out of file handles. I also tried an implementation using 'hGetChar', but that was much slower.

I attempted to read Oleg's fold-stream implementation [1], as it sounds quite appealing to me, but I was completely overwhelmed, especially by all of the various type signatures used. It would be great if one of the regular Haskell bloggers (Tom Moertel, are you reading this?) might write a blog entry or two interpreting his implementation for those of us starting out in Haskell, perhaps beginning with a non-polymorphic version so as to emphasize the approach.

Thanks,
Pete

[1] http://okmij.org/ftp/Haskell/fold-stream.lhs
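
One caveat with the strict version above: 'hGetLine' raises an end-of-file exception if a file ends before any empty separator line is seen (for example, a message with headers but no body). A minimal sketch of an EOF-tolerant variant follows. The name 'readHeadersSafe' and the decision to treat end of file like the separator are assumptions made for illustration, not part of the original program.

```haskell
import Control.Exception (bracket)
import System.IO

-- Hypothetical variant of readEmailHeaders: reads lines strictly until
-- either the first empty line or end of file, whichever comes first.
-- 'bracket' closes the handle even if an exception is thrown.
readHeadersSafe :: FilePath -> IO String
readHeadersSafe file =
  bracket (openFile file ReadMode) hClose (go [])
  where
    go acc h = do
      eof <- hIsEOF h
      if eof
        then finish acc               -- no separator found: return what we have
        else do
          line <- hGetLine h
          if null line
            then finish acc           -- empty separator line: headers are done
            else go (line : acc) h    -- accumulate in reverse order
    -- Lines were collected newest-first; restore order, one "\n" per line.
    finish = return . unlines . reverse
```

Because 'hGetLine' has already read each line by the time it is added to the accumulator, the returned string does not depend on the handle, so closing it inside 'bracket' is safe. It can be swapped in wherever 'readEmailHeaders' was called.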