
Tom,

The bad news is that:

1. Haskell makes no guarantee about when the files are closed,
2. file handles are a limited resource, and
3. lazy I/O doesn't handle errors in a recoverable fashion.

Unfortunately this means that lazy I/O is fundamentally unsound. The only safe way to do it is to read the file strictly in blocks using Data.ByteString.hGet.
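Something along these lines would do it. This is only a rough sketch: the 64K block size is arbitrary, it holds the whole file in memory as a list of strict chunks, and I'm assuming the same BSL/PureSHA imports and the expandPath helper from your code below.

import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BSL
import qualified Data.Digest.Pure.SHA as PureSHA
import System.IO (IOMode(ReadMode), withFile)

-- Read a file strictly, 64K at a time; withFile closes the handle
-- before we return, so at most one handle is ever open here.
readBlocks :: FilePath -> IO [BS.ByteString]
readBlocks fn = withFile fn ReadMode go
  where
    go h = do
      block <- BS.hGet h 65536
      if BS.null block            -- hGet returns an empty block at EOF
        then return []
        else do
          rest <- go h
          return (block : rest)

sha1file :: FilePath -> IO String
sha1file fn = do
  p <- expandPath fn              -- your path-expansion helper, assumed in scope
  blocks <- readBlocks p
  return $ PureSHA.showDigest $ PureSHA.sha1 (BSL.fromChunks blocks)

The point is just that the handle's lifetime no longer depends on lazy evaluation: the file is fully read and closed before the hash is ever demanded.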
Steve

Tom Tobin wrote:

Instead of a question, I thought I'd share a moment of lazy-evaluation enlightenment I had last night.
I have some code that recursively descends a directory, gets the SHA1 hashes for all the files, and builds a map of which file paths share the same SHA1 hash. The code that actually generates the hash looked like this:
sha1file :: FilePath -> IO String
sha1file fn = do
  bs <- expandPath fn >>= BSL.readFile
  return $ PureSHA.showDigest $ PureSHA.sha1 bs
Everything worked fine on paths without many files in them, but choked on paths with many files:
"Exception: getCurrentDirectory: resource exhausted (Too many open files)"
This was driving me crazy; ByteString.Lazy.readFile is supposed to close the file once it's done. I kept going over my code, wondering what was at fault, until it finally clicked: *the hashes weren't being generated until I actually tried to view them*, and thus all the files were being held open until that point! I made a single change to my "sha1file" function:
sha1file :: FilePath -> IO String
sha1file fn = do
  bs <- expandPath fn >>= BSL.readFile
  return $ PureSHA.showDigest $! PureSHA.sha1 bs
(the "$!") ... and everything worked perfectly. The code now finished processing each file before opening the next one, and I was happy. :-)
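For anyone else puzzling over this: "$!" is just strict application, forcing its argument to weak head normal form before the function is applied. My fix could equally be spelled with "seq" -- a sketch, with the same imports as above:

sha1file :: FilePath -> IO String
sha1file fn = do
  bs <- expandPath fn >>= BSL.readFile
  let digest = PureSHA.sha1 bs
  -- Forcing the digest runs the whole hash computation, which consumes
  -- the entire file; lazy readFile closes the handle once it hits EOF.
  digest `seq` return (PureSHA.showDigest digest)

Note that forcing to weak head normal form is enough here only because computing the digest has to consume the whole file; in general "$!" does not deep-evaluate its argument.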