So this thread got me thinking about what's wrong with file IO in Haskell. My conclusion is that file handles are problematic because they're essentially pointers and as such subject to the same sorts of problems such as use after freeing (as in this example) which is further complicated by defaulting to a lazy implementation. The same sorts of solutions to dealing with the Ptr type can be applied to dealing with file handles (and in fact have been) such as the way alloca works by wrapping the operations on a pointer such that it's allocated, passed into a function, and then freed on exit from the function. In a similar fashion you could have a withFile function that takes a file name, a R/W mode, and a function to perform some work on the file. This is in fact the exact pattern implemented by the ResourceT library (which in turn started as part of the previously mentioned conduit package).

In general I'd recommend avoiding all the standard file functions as they're very un-Haskellish and inherently unsafe (in the type safety sense, not the security sense). I personally think those functions should be phased out in future Haskell releases in favor of better functional abstractions, but my opinion carries very little weight being essentially nobody in the Haskell world, so take with whatever sized grain of salt you feel is appropriate.

On Feb 4, 2013 8:05 AM, "Patrick Mylund Nielsen" <haskell@patrickmylund.com> wrote:
conduit and pipes are two examples:

http://hackage.haskell.org/package/conduit
http://hackage.haskell.org/package/pipes


On Mon, Feb 4, 2013 at 2:00 PM, Kyle Murphy <orclev@gmail.com> wrote:

I can't say 100% for sure, but I'd guess it's because parsec is pure, and the file operations are using lazy bytestrings. Since no IO operations are applied to cont until after you close the handle, nothing can be read (since at that time the handle is closed). If you want to keep the program structured the same I believe there are functions that can convert a lazy bytestring into a strict one, and then you can perform the parsing on that. Alternatively you could rewrite things to close the file handle after you write it's contents to the output file.

The default file operations in Haskell are known to be a source of difficulty in terms of laziness, and there has been some debate as to whether they're poorly designed or not. I might suggest you look into some of the alternatives, particular those based on stream fusion principles, that allow you to kill two birds with one stone by iteratively dealing with input thereby forcing evaluation and also improving memory usage and making it harder to trigger space leaks. I don't have the names available at the moment or I'd provide them, but I'm pretty sure at least one of them is named something like enumeratee, although I believe there's at least one other that might debatably be considered better.

On Feb 4, 2013 5:51 AM, "Kees Bleijenberg" <k.bleijenberg@lijbrandt.nl> wrote:

module Main where

 

import Text.ParserCombinators.Parsec (many,many1,string, Parser, parse)

import System.IO (IOMode(..),hClose,openFile,hGetContents,hPutStrLn)

                                                                              

parseFile hOut fn = do

                        handle <- openFile fn ReadMode

                        cont <- hGetContents handle                                      

                        print cont

                        let res = parse (many (string "blah")) "" cont

                        hClose handle                   

                        case res of

                            (Left err) -> hPutStrLn hOut $ "Error: " ++ (show err)

                            (Right goodRes) -> mapM_ (hPutStrLn hOut) goodRes                        

                 

main = do  

            hOut <- openFile "outp.txt" WriteMode

            mapM (parseFile hOut) ["inp.txt"]

            hClose hOut

 

I’am writing a program that parses a lot of files. Above is the simplest program I can think of that demonstrates my problem.

The program above parses inp.txt.  Inp.txt has only the word blah in it.  The output is saved in outp.txt. This file contains the word blah after running the program. if I comment out the line ‘print cont’ nothing is saved in outp.txt. 

If I comment out ‘print cont’ and replace many with many1 in the following line, it works again?

Can someone explain to me what is going  on?

 

Kees


_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners


_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners



_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners