
oleg:
Given the stance against top-level mutable variables, I have not expected to see this Lazy IO code. After all, what could be more against the spirit of Haskell than a `pure' function with observable side effects. With Lazy IO, one indeed has to choose between correctness and performance. The appearance of such code is especially strange after the evidence of deadlocks with Lazy IO, presented on this list less than a month ago. Let alone unpredictable resource usage and reliance on finalizers to close files (forgetting that GHC does not guarantee that finalizers will be run at all).
Is there an alternative?
Hi Oleg! I'm glad you joined the thread at this point.

Some background: our best solutions for this problem using lazy IO are based on chunk-wise lazy data structures, typically lazy bytestrings. Often we'll write programs like:

    import qualified Data.ByteString.Lazy.Char8 as B
    import System.Environment

    main = do
        [f] <- getArgs
        s   <- B.readFile f
        print (B.count '\n' s)

which are nicely efficient:

    $ ghc -O2 A.hs --make

    $ du -hs data
    100M    data

    $ time ./A data
    11078540
    ./A data  0.17s user 0.04s system 100% cpu 0.210 total

And we know from elsewhere that the performance is highly competitive:

    http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=all

Now, enumerators are very promising, and there's a lot of interest at the moment. (E.g. just this week, Johan Tibell gave an inspiring talk at Galois about this approach to IO,

    http://www.galois.com/blog/2008/09/12/left-fold-enumerators-a-safe-expressiv...

and we spent the day sketching out an enumerator bytestring design.)

But there are some open questions. Perhaps you have some answers?

  * Can we write a Data.ByteString.Enumerator that has matching or better performance than its "dual", the existing chunk-wise lazy stream type?

  * Is there a translation from

        data ByteString = Empty
                        | Chunk {-# UNPACK #-} !S.ByteString ByteString

    and functions on this type,

        foldlChunks :: (a -> S.ByteString -> a) -> a -> ByteString -> a
        foldlChunks f z = go z
          where go !a Empty        = a
                go !a (Chunk c cs) = go (f a c) cs

    to an enumerator implementation?

  * Can we compose enumerators as we can stream functions?

  * Can we do fusion on enumerators? Does that make composition easier? (Indeed, is there an encoding of enumerators analogous to stream fusion control?)

Any thoughts?

-- Don
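
P.S. To make the second question concrete, here is a minimal sketch of what a chunk-wise left-fold enumerator might look like. It is not an existing Data.ByteString.Enumerator API: the names Step, enumChunks, enumHandle, enumFile and countNewlines are all hypothetical, and the 32k chunk size is an arbitrary assumption. The step function has the same shape as the argument to foldlChunks, except that it may return Left to stop early, and the driver (not a finalizer) closes the file.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString.Char8 as S
import System.IO (Handle, IOMode (ReadMode), withBinaryFile)

-- Hypothetical step type: given the accumulator and one strict chunk,
-- return Left to stop early or Right to ask for the next chunk.
type Step a = a -> S.ByteString -> Either a a

-- Pure analogue of foldlChunks, over a plain list of strict chunks,
-- with early termination added.
enumChunks :: Step a -> a -> [S.ByteString] -> a
enumChunks step = go
  where
    go !a []       = a
    go !a (c : cs) = case step a c of
      Left done -> done
      Right a'  -> go a' cs

-- The same fold driven by reads from a Handle. An empty read means EOF.
enumHandle :: Int -> Handle -> Step a -> a -> IO a
enumHandle chunkSize h step = go
  where
    go !a = do
      c <- S.hGet h chunkSize
      if S.null c
        then return a
        else case step a c of
          Left done -> return done
          Right a'  -> go a'

-- The file is opened and closed by the enumerator itself, so its
-- lifetime is bounded by the fold. 32768 is an assumed chunk size.
enumFile :: FilePath -> Step a -> a -> IO a
enumFile path step z =
  withBinaryFile path ReadMode $ \h -> enumHandle 32768 h step z

-- The newline count from the program above, written as a step function.
countNewlines :: Step Int
countNewlines !n c = Right (n + S.count '\n' c)
```

With these definitions, the lazy bytestring program would become something like `enumFile f countNewlines 0 >>= print`. Whether this shape can match the lazy version's performance, compose, or fuse is exactly what the questions above are asking.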