
Pete Kazmier:
When using readFile to process a large number of files, I am exceeding the resource limits for the maximum number of open file descriptors on my system. How can I enhance my program to deal with this situation without making significant changes?
AFAIU, file handles opened by readFile are closed in the following circumstances:

1) When lazy evaluation of the returned contents reaches the end of the file.
2) When the garbage collector runs the finaliser for the file structure. Obviously, for this to happen, the file structure must be unreachable.

Unfortunately, the unreachability of the file structure doesn't guarantee anything about the timeliness of the garbage collection. While the garbage collector does respond to memory utilisation pressure, it doesn't respond to file handle utilisation pressure. Consequently, any program which uses readFile to read small portions of many files is likely to exhibit the problem you are experiencing.

I'm not aware of an easy fix. You could use openFile, hGetContents and hClose, but then you have to be careful to avoid another problem, as described in [1]. In [2], Oleg describes the deeper problems with getContents and friends (including readFile), and advocates explicitly sequenced I/O. I have a feeling there have been even more discussions around this topic recently, but they elude me at the moment.

Of course, we'll be most curious to hear which solution you choose.

[1] http://www.haskell.org/pipermail/haskell-cafe/2007-March/023189.html
[2] http://www.haskell.org/pipermail/haskell-cafe/2007-March/023073.html
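For illustration, here is a minimal sketch of the openFile/hGetContents/hClose approach mentioned above. The helper name readFileNow is my own, and forcing the contents with length before the handle is closed is just one way to sidestep the lazy-read-after-close pitfall discussed in [1]:

    import Control.Exception (bracket, evaluate)
    import System.IO (IOMode (ReadMode), hClose, hGetContents, openFile)

    -- Read a whole file and close its handle before returning,
    -- so each call holds at most one file descriptor.
    readFileNow :: FilePath -> IO String
    readFileNow path =
      bracket (openFile path ReadMode) hClose $ \h -> do
        contents <- hGetContents h
        -- Force the entire contents; otherwise the lazy read would
        -- happen after hClose and fail (or silently return nothing).
        _ <- evaluate (length contents)
        return contents

With something like this, mapM readFileNow paths only keeps one descriptor open at a time, at the cost of reading each file eagerly into memory.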