letting go of file handles and Data.Binary

hi all -- using binary 0.4.1 on ghc 6.8.2, vista 64 sp1. consider the following program: import System.Directory import Data.Binary main = do let dat = [1..10]::[Int] fname = "foo.dat" encodeFile fname dat dat2 <- decodeFile fname print (dat == dat2) removeFile fname this throws a permission denied exception, presumably because the file is still open when the removeFile is called. i've grovelled the source of Data.Bytestring.Lazy and all but i can't seem to understand the "right" way to make this work. note that i've forced the evaluation of dat2 and presumably therefore the filehandle is at least half-closed. any suggestions? take care, Ben

Ben wrote:
this throws a permission denied exception, presumably because the file is still open when the removeFile is called.
Yes. The file handle opened by encodeFile won't be closed until its finalizer is run. There is no guarantee that the finalizer will be run immediately. In fact, you can usually rely on the finalizer *not* running immediately, because the GC has to kick in, notice that the handle is dead, and hand it to the finalizer thread. It's actually possible that your finalizer could not run at all.
i've grovelled the source of Data.Bytestring.Lazy and all but i can't seem to understand the "right" way to make this work.
Don't use encodeFile at all. Instead, write your own: import Data.Binary (Binary, encode) import Control.Exception (bracket) import qualified Data.ByteString.Lazy as L import System.IO (IOMode(..), hClose, openFile) encodeFile :: Binary a => FilePath -> a -> IO () encodeFile path = bracket (openFile path WriteMode) hClose . flip L.hPut . encode

Doh! For all that I wrote about encodeFile, substitute decodeFile. You'll need to write something to force the value that you're decoding. Something like this ought to do the trick. import Data.Binary (Binary, decode) import Control.Exception (bracket) import qualified Data.ByteString.Lazy as L import System.IO (IOMode(..), hClose, openFile) strictDecodeFile :: Binary a => FilePath -> (a -> b) -> IO () strictDecodeFile path force = bracket (openFile path ReadMode) hClose $ \h -> do c <- L.hGetContents h force (decode c) `seq` return ()

On Sun, 2008-04-20 at 14:24 -0700, Bryan O'Sullivan wrote:
Doh! For all that I wrote about encodeFile, substitute decodeFile.
Indeed the version of encodeFile you wrote should be exactly identical to the original because the lazy bytestring writeFile already uses bracket like that: writeFile :: FilePath -> ByteString -> IO () writeFile f txt = bracket (openBinaryFile f WriteMode) hClose (\hdl -> hPut hdl txt)
strictDecodeFile :: Binary a => FilePath -> (a -> b) -> IO () strictDecodeFile path force = bracket (openFile path ReadMode) hClose $ \h -> do c <- L.hGetContents h force (decode c) `seq` return ()
Yes, the problem with Ben's program was that decodeFile is lazily reading the file and lazily decoding. If the decoding consumes all the input then it should be possible to avoid rewriting decodeFile and use: dat2 <- decodeFile fname evaluate dat2 removeFile fname It's not immediately clear to me if we can make the decodeFile behave like your version. I'd have to go think about whether running the Get monad can lazily return values or if it always consumes all the input it'll ever need. Duncan

FWIW, installed bytestring-0.9.1.0, ran ghc-pkg hide
bytestring-0.9.0.1, recompiled and reinstalled binary-0.4.1. then i
played around with all that you suggested, and came to the conclusion
that i don't understand seq!
import Control.Exception (bracket)
import System.Directory
import System.IO
import Data.Binary
import Data.ByteString.Lazy as B
strictDecodeFile :: Binary a => FilePath -> (a -> b) -> IO ()
strictDecodeFile path force =
bracket (openFile path ReadMode) hClose $ \h -> do
c <- B.hGetContents h
force (decode c) `seq` return ()
strictDecodeFile' :: Binary a => FilePath -> (a -> IO b) -> IO ()
strictDecodeFile' path force =
bracket (openFile path ReadMode) hClose $ \h -> do
c <- B.hGetContents h
force (decode c)
return ()
main = do
let dat = [1..10]::[Int]
fname = "foo.dat"
encodeFile fname dat
h <- openFile fname ReadMode
c <- B.hGetContents h
let dat2 = decode c
print (dat == dat2)
hClose h
removeFile fname
encodeFile fname dat
strictDecodeFile fname (\x -> do print "strict 1"
print (x == dat))
removeFile fname
encodeFile fname dat
strictDecodeFile' fname (\x -> do print "strict 2"
print (x == dat))
removeFile fname
encodeFile fname dat
dat4 <- decodeFile fname
print (dat == dat4)
removeFile fname
running main outputs
True
"strict 2"
True
True
*** Exception: foo.dat: removeFile: permission denied (Permission denied)
e.g. the handle version works, Bryan's original strictDecodeFile
appears to not run "force", the modified strictDecodeFile' does run
"force" (i didn't use seq, just an additional line in the monad), and
the encodeFile / decodeFile / removeFile appears to still not work
with the latest bytestring. what's the difference between the seq and
non-seq versions?
for now i can use strictDecodeFile' but at least something should be
said in the docs about decodeFile et al holding handles. (i
understand this is not the fault of binary per se as much as haskell's
non-strict semantics, but a reminder for noobs like me would be
helpful.) and finally something like strictDecodeFile' might be
useful in the library?
thanks for the help, ben
On Sun, Apr 20, 2008 at 2:34 PM, Duncan Coutts
On Sun, 2008-04-20 at 14:24 -0700, Bryan O'Sullivan wrote:
Doh! For all that I wrote about encodeFile, substitute decodeFile.
Indeed the version of encodeFile you wrote should be exactly identical to the original because the lazy bytestring writeFile already uses bracket like that:
writeFile :: FilePath -> ByteString -> IO () writeFile f txt = bracket (openBinaryFile f WriteMode) hClose (\hdl -> hPut hdl txt)
strictDecodeFile :: Binary a => FilePath -> (a -> b) -> IO () strictDecodeFile path force = bracket (openFile path ReadMode) hClose $ \h -> do c <- L.hGetContents h force (decode c) `seq` return ()
Yes, the problem with Ben's program was that decodeFile is lazily reading the file and lazily decoding. If the decoding consumes all the input then it should be possible to avoid rewriting decodeFile and use:
dat2 <- decodeFile fname evaluate dat2 removeFile fname
It's not immediately clear to me if we can make the decodeFile behave like your version. I'd have to go think about whether running the Get monad can lazily return values or if it always consumes all the input it'll ever need.
Duncan

Am Montag, 21. April 2008 01:35 schrieb Ben:
FWIW, installed bytestring-0.9.1.0, ran ghc-pkg hide bytestring-0.9.0.1, recompiled and reinstalled binary-0.4.1. then i played around with all that you suggested, and came to the conclusion that i don't understand seq!
import Control.Exception (bracket) import System.Directory import System.IO import Data.Binary import Data.ByteString.Lazy as B
strictDecodeFile :: Binary a => FilePath -> (a -> b) -> IO () strictDecodeFile path force = bracket (openFile path ReadMode) hClose $ \h -> do c <- B.hGetContents h force (decode c) `seq` return ()
This means that force (decode c) is reduced to head normal form, not fully evaluated. So if the result type of force were e.g. Either String Int, it would be evaluated just far enough to determine if force (decode c) is bottom, Left something or Right somethingelse; Left undefined and Right undefined won't cause an exception. In the case below, force (decode c) is an IO action, it will be evaluated as far as necessary to see it's (IO whatever) and not bottom, for that it need not be run, therefore you don't see "strict 1" printed out.
strictDecodeFile' :: Binary a => FilePath -> (a -> IO b) -> IO () strictDecodeFile' path force = bracket (openFile path ReadMode) hClose $ \h -> do c <- B.hGetContents h force (decode c) return ()
Here you say you want the IO action force (decode c) to be run, so it is run and "strict 2" is printed out. Hope that helps.

one more piece of email pollution:
import Control.Exception (bracket)
import System.IO
import Data.Binary
import qualified Data.ByteString.Lazy as B
strictDecodeFile :: Binary a => FilePath -> IO a
strictDecodeFile path =
bracket (openBinaryFile path ReadMode) hClose $ \h -> do
c <- B.hGetContents h
return $! decode c
seems to work like Data.Binary.decodeFile but explicitly closes the handle.
take care, ben
On Sun, Apr 20, 2008 at 4:35 PM, Ben
FWIW, installed bytestring-0.9.1.0, ran ghc-pkg hide bytestring-0.9.0.1, recompiled and reinstalled binary-0.4.1. then i played around with all that you suggested, and came to the conclusion that i don't understand seq!
import Control.Exception (bracket) import System.Directory import System.IO import Data.Binary import Data.ByteString.Lazy as B
strictDecodeFile :: Binary a => FilePath -> (a -> b) -> IO () strictDecodeFile path force = bracket (openFile path ReadMode) hClose $ \h -> do c <- B.hGetContents h force (decode c) `seq` return ()
strictDecodeFile' :: Binary a => FilePath -> (a -> IO b) -> IO ()
strictDecodeFile' path force = bracket (openFile path ReadMode) hClose $ \h -> do c <- B.hGetContents h force (decode c) return ()
main = do let dat = [1..10]::[Int] fname = "foo.dat" encodeFile fname dat h <- openFile fname ReadMode c <- B.hGetContents h let dat2 = decode c print (dat == dat2) hClose h removeFile fname
encodeFile fname dat strictDecodeFile fname (\x -> do print "strict 1" print (x == dat)) removeFile fname
encodeFile fname dat strictDecodeFile' fname (\x -> do print "strict 2" print (x == dat)) removeFile fname
encodeFile fname dat dat4 <- decodeFile fname print (dat == dat4) removeFile fname
running main outputs
True "strict 2" True True *** Exception: foo.dat: removeFile: permission denied (Permission denied)
e.g. the handle version works, Bryan's original strictDecodeFile appears to not run "force", the modified strictDecodeFile' does run "force" (i didn't use seq, just an additional line in the monad), and the encodeFile / decodeFile / removeFile appears to still not work with the latest bytestring. what's the difference between the seq and non-seq versions?
for now i can use strictDecodeFile' but at least something should be said in the docs about decodeFile et al holding handles. (i understand this is not the fault of binary per se as much as haskell's non-strict semantics, but a reminder for noobs like me would be helpful.) and finally something like strictDecodeFile' might be useful in the library?
thanks for the help, ben
On Sun, Apr 20, 2008 at 2:34 PM, Duncan Coutts
wrote: On Sun, 2008-04-20 at 14:24 -0700, Bryan O'Sullivan wrote:
Doh! For all that I wrote about encodeFile, substitute decodeFile.
Indeed the version of encodeFile you wrote should be exactly identical to the original because the lazy bytestring writeFile already uses bracket like that:
writeFile :: FilePath -> ByteString -> IO () writeFile f txt = bracket (openBinaryFile f WriteMode) hClose (\hdl -> hPut hdl txt)
strictDecodeFile :: Binary a => FilePath -> (a -> b) -> IO () strictDecodeFile path force = bracket (openFile path ReadMode) hClose $ \h -> do c <- L.hGetContents h force (decode c) `seq` return ()
Yes, the problem with Ben's program was that decodeFile is lazily reading the file and lazily decoding. If the decoding consumes all the input then it should be possible to avoid rewriting decodeFile and use:
dat2 <- decodeFile fname evaluate dat2 removeFile fname
It's not immediately clear to me if we can make the decodeFile behave like your version. I'd have to go think about whether running the Get monad can lazily return values or if it always consumes all the input it'll ever need.
Duncan

Ben wrote:
i played around with all that you suggested, and came to the conclusion that i don't understand seq!
That's certainly possible, but you also got the type of your first forcing function wrong :-)
strictDecodeFile :: Binary a => FilePath -> (a -> b) -> IO ()
encodeFile fname dat strictDecodeFile fname (\x -> do print "strict 1" print (x == dat)) removeFile fname
You provided a function that returns an IO action. Note that strictDecodeFile ensures that the result of force is evaluated to WHNF, but it doesn't know what the *type* of the result is. If you return an IO action, there's no way it can be run, because the caller can't even tell that you returned an IO action: it just knows that you returned something of an unknown type "b". If the IO action can't be run, the comparison inside can't be performed, and so nothing useful actually happens. Instead, just provide a pure function to force the decoding: something as simple as (==dat) will do. Evaluating this to WHNF will reduce it to the constructor True or False. In order to produce a constructor, enough of the decoded data should be demanded to suit your needs. Developing a good enough mental model of how laziness works is a very useful way to spend some time.

On Sun, 2008-04-20 at 14:02 -0700, Ben wrote:
hi all --
using binary 0.4.1 on ghc 6.8.2, vista 64 sp1. consider the following program:
import System.Directory import Data.Binary
main = do let dat = [1..10]::[Int] fname = "foo.dat" encodeFile fname dat dat2 <- decodeFile fname print (dat == dat2) removeFile fname
this throws a permission denied exception, presumably because the file is still open when the removeFile is called. i've grovelled the source of Data.Bytestring.Lazy and all but i can't seem to understand the "right" way to make this work. note that i've forced the evaluation of dat2 and presumably therefore the filehandle is at least half-closed.
any suggestions?
There was a bug in bytestring-0.9.0.1 where the file handle was not closed when Data.Bytestring.Lazy.hGetContents reached the end of the file. The fix is certainly in version 0.9.0.4. So the above program should work there. If it still does not then the culprit is probably that your binary deserialiser is not consuming the whole input. Remember if you have access to a Handle you can always hClose it explicitly. Duncan
participants (4)
-
Ben
-
Bryan O'Sullivan
-
Daniel Fischer
-
Duncan Coutts