Buggy behavior of "withFile"

I've encountered unexplainable behavior of the "withFile" function. Consider this "example.hs" program:
    import System.IO
    main = do
      cnts <- withFile "example.hs" ReadMode $ (\h -> do
        res <- hGetContents h
        --putStr res
        return res)
      putStr cnts
When the commented-out line is left like that, the program doesn't write anything to STDOUT. If the "--" (comment characters) are removed, the program writes out the contents of "example.hs" twice.
Is this expected behavior? I asked on #haskell (freenode) and one fellow there found it equally confusing...
-------------------------------------------
GHC version: The Glorious Glasgow Haskell Compilation System, version 7.6.3
OS: Linux
Thanks,
Zoran Plesivčak

On Tue, Dec 09 2014, Zoran Plesivčak wrote:
Is this expected behavior? I asked on #haskell (freenode) and one fellow there found it equally confusing...
This is an example of where lazy IO can get tricky. withFile closes the handle before you are able to evaluate the contents.
When you use putStr in the lambda function, `res` is evaluated there (and is thus already evaluated when it's returned). When the putStr is commented out, `res` is not evaluated before the file is closed.
See the documentation: http://hackage.haskell.org/package/base-4.7.0.1/docs/System-IO.html#v%3awith...
Specifically note: "The handle will be closed on exit from withFile"
Google around for "haskell lazy io withFile" and you might find more detailed explanations.
-- Christopher Reichert
irc: creichert
gpg: C81D 18C8 862A 3618 1376 FFA5 6BFC A992 9955 929B
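A minimal sketch of the point above, assuming it is acceptable to do all the work on the contents inside the withFile callback: the lazy string is consumed while the handle is still open, so nothing is lost when withFile closes it. The file name "demo.txt" and the helper `printFile` are hypothetical, introduced only for illustration.

```haskell
import System.IO

-- Hypothetical helper: print a file's contents while the handle is still open.
printFile :: FilePath -> IO ()
printFile path =
  withFile path ReadMode $ \h -> do
    res <- hGetContents h   -- lazy: nothing has been read yet
    putStr res              -- consumes the contents before withFile closes h

main :: IO ()
main = do
  writeFile "demo.txt" "hello\n"  -- create a small input file for the demo
  printFile "demo.txt"
```

The callback returns () rather than the lazy string, so no unevaluated contents escape the withFile block.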

On Wed, Dec 10, 2014 at 3:34 AM, Christopher Reichert wrote:
Google around for "haskell lazy io withFile" and you might find more detailed explanations.
Yes indeed. The explanations are pretty good too.
This is a gotcha that deserves to be more widely known. A recent patch to
ghc gives a better error message explaining what's going on.
How does one go about using withFile correctly then?
If you see the code "withFile filename ReadMode" chances are you should
replace it with "readFile filename".
Hence, instead of
    main = do
      cnts <- withFile "example.hs" ReadMode $ (\h -> do
        res <- hGetContents h
        --putStr res
        return res)
      putStr cnts
write
    main = do
      cnts <- readFile "example.hs"
      putStr cnts
or more idiomatically
    main = readFile "example.hs" >>= putStr
-- Kim-Ee

On Tue, Dec 9, 2014 at 2:21 PM, Zoran Plesivčak wrote:
When the commented-out line is left like that, the program doesn't write anything to STDOUT. If the "--" (comment characters) are removed, the program writes out the contents of "example.hs" twice.
I'm not an expert, but I believe your problem is lazy IO. hGetContents doesn't actually read the file until something consumes its result. In the version with the first putStr commented out, that doesn't happen until the putStr outside the withFile, by which time the file handle has been closed, so nothing ever gets read. With the putStr inside the withFile, the contents are consumed before the file is closed, so they get read and then returned, and the second putStr outputs them a second time.
If I'm right, you need to force the evaluation of res before the return, with something like "res `seq` return res" instead of just "return res". Or use the BangPatterns extension and then write "!res <- hGetContents h".

On Wed, Dec 10, 2014 at 3:36 AM, Mike Meyer wrote:
If I'm right, you need to force the evaluation of res before the return, with something like "res `seq` return res" instead of just "return res". Or use the BangPatterns extension and then write "!res <- hGetContents h"
This will read one byte (or none at all if the file is empty). That one byte will then be displayed by the putStr outside the withFile expression. It's a small but noticeable improvement over nothing being displayed at all.
Why one byte? Because seq evaluates only to weak head normal form (WHNF), which in the case of a list amounts to determining whether it's [] or (x:xs).
You could write
    res <- hGetContents h
    evaluate $ length res -- Control.Exception.evaluate
    return res
but if you're going to do that, you're better off using strict I/O. And in fact, strict _bytestring_ I/O, because the spatial footprint of lists is 10x that of bytestrings.
But really, the default built-in lazy I/O on standard lists is fine for general usage. Avoid premature optimization.
-- Kim-Ee
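To illustrate the WHNF point without touching any files, here is a small sketch that uses a list with a defined head and an undefined tail as a stand-in for a lazily read string. The list `xs` is a made-up example, not something hGetContents returns; it only shows that seq stops at the first constructor while length has to walk the whole list.

```haskell
import Control.Exception (SomeException, evaluate, try)

main :: IO ()
main = do
  -- A list whose first cell is defined but whose tail is not.
  let xs :: String
      xs = 'a' : undefined
  -- seq stops at the outermost constructor (:), so the tail is never touched.
  xs `seq` putStrLn "seq: only the first cons cell was evaluated"
  -- length must walk the entire spine, so it runs into the undefined tail.
  r <- try (evaluate (length xs)) :: IO (Either SomeException Int)
  putStrLn $ case r of
    Left _  -> "length: forcing the whole list reached the undefined tail"
    Right n -> "length: " ++ show n
```

This is why forcing with length (or something equally thorough) inside withFile reads the whole file, while seq alone does not.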

I ran into this issue recently in some code that needed to read and then later write to the same file. readFile didn't work as expected because it kept the file handle open. I could have restructured the code so the entire logic was inside a `withFile` block, but I preferred not to. I wrote something similar to Zoran's original code and was stumped until I googled around.
I want to share this Stack Overflow answer about using deepseq to force evaluation: http://stackoverflow.com/a/9423349 (the key takeaway being `return $!! res` for Zoran's code).
On Fri, Dec 12, 2014 at 3:36 AM, Kim-Ee Yeoh wrote:
But really, the default built-in lazy I/O on standard lists is fine for general usage. Avoid premature optimization.
I agree with Kim here, so I'm not contradicting that sentiment. It's just nice to know the next step to take if the original lazy I/O code isn't working out.
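A rough sketch of the `return $!! res` approach applied to the read-then-write case, assuming the deepseq package is available (it normally ships with GHC, but may need to be listed as a dependency in a Cabal or Stack project). The helper `readFileStrictly` and the file name "demo.txt" are hypothetical names used only for this example.

```haskell
import Control.DeepSeq (($!!))
import System.IO

-- Hypothetical helper: read a whole file and close it before returning.
readFileStrictly :: FilePath -> IO String
readFileStrictly path =
  withFile path ReadMode $ \h -> do
    res <- hGetContents h
    return $!! res   -- fully evaluate res while h is still open

main :: IO ()
main = do
  writeFile "demo.txt" "first line\n"
  cnts <- readFileStrictly "demo.txt"
  -- The handle is already closed here, so writing to the same file is safe.
  writeFile "demo.txt" (cnts ++ "second line\n")
  readFileStrictly "demo.txt" >>= putStr
```

Because the string is fully evaluated before withFile closes the handle, the later writeFile does not collide with a half-read file.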
participants (5): Christopher Reichert, Kim-Ee Yeoh, Mike Meyer, Zach Moazeni, Zoran Plesivčak