
Hello, I'm still learning Haskell and I am stuck on a probably simple problem. Assume I have a file where lines are of the form "key=value". I want to search for a value in that file and came up with the following code:

    rechf :: String -> IO (Maybe String)
    rechf r = bracket (openFile "liste" ReadMode) (hClose) (rechf2 r)

    rechf2 :: String -> Handle -> IO (Maybe String)
    rechf2 r h = do
      f <- hGetContents h
      --print f
      return $ rech r $ lines f

    rech :: String -> [ String ] -> Maybe String
    rech r l = lookup r $ map span2 l

    span2 :: String -> (String,String)
    span2 c = (a,b)
      where a = takeWhile (/='=') c
            b = drop 1 $ dropWhile (/='=') c

Now the problem is this:

1) If I try rechf, it returns nothing even for a key that exists in the file.
2) If I uncomment the line where there is "print f", the key is found and the value returned.

I'm guessing print forces f to be evaluated, so the file is actually read, but I was wondering why it doesn't work without it and how to correct that.

David.

Hello,

There are a few solutions, depending on what behaviour you want.

If you plan to read the file all the way to the end, then you do not need to explicitly close the handle; hGetContents will do it for you.

http://www.haskell.org/ghc/docs/latest/html/libraries/base/System-IO.html#v%...

If you plan to read only a portion of the file (i.e., only read enough to find the first occurrence of a key), then you have to make sure you have really forced 'rechf2' to do its work before hClose is called.

'print' works because it forces the value to be printed. You can also force the value by using seq or evaluate. Instead of 'print f' you could do something like:

    evaluate $ length f

This forces all of 'f' to be read in. (This would also trigger hGetContents to close the file, so you still would not need hClose.)

However, if f is very large, then you may not want it to be read into memory all at once. So, instead you might change the last line of rechf2 to:

    return $! rech r $ lines f

This should make the return value strict. Since you would not have to read the whole file in this case, you would need to keep your explicit hClose.

j.

ps. I didn't actually try any of these suggestions, so I may be wrong.

pps. Depending on your needs, you might also just opt to use readFile instead of hGetContents.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
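
The "evaluate $ length f" suggestion above could be sketched roughly as follows. This is an illustration under assumptions, not the poster's final code: the name rechfAll and the FilePath parameter are introduced here (the original hard-codes "liste"), and span2 is rewritten with break for brevity.

```haskell
import Control.Exception (bracket, evaluate)
import System.IO

-- Split a "key=value" line at the first '='.
span2 :: String -> (String, String)
span2 c = (key, drop 1 rest)
  where (key, rest) = break (== '=') c

-- Look up a key among "key=value" lines.
rech :: String -> [String] -> Maybe String
rech r = lookup r . map span2

-- Hypothetical variant of rechf: read the entire file while the handle
-- is still open, then search the in-memory contents.
rechfAll :: FilePath -> String -> IO (Maybe String)
rechfAll path r = bracket (openFile path ReadMode) hClose $ \h -> do
  f <- hGetContents h
  _ <- evaluate (length f)  -- walks the whole string, so all input is read now
  return (rech r (lines f))
```

Because the whole file is in memory before bracket runs hClose, the lazily computed lookup result no longer depends on the handle. The cost is that the complete file is held in memory, which is the trade-off the reply mentions.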

On Tue, Mar 06, 2007 at 07:37:38PM +0100, D.V. wrote:
> Now the problem is this: 1) if I try rechf, it returns nothing even for a key that exists in the file. 2) if I uncomment the line where there is "print f", the key is found and the value returned.

I cannot help you with your question more than pointing you to

http://bugs.darcs.net/issue391

where Simon Marlow explains how to avoid "IO.bracket".

But here's a hint: you can use ParserCombinators instead of rolling your own parser, especially if the complexity of your input formats will increase.

mm

On 3/6/07, mm wrote:
> I cannot help you with your question more than pointing you to
> http://bugs.darcs.net/issue391
> where Simon Marlow explains how to avoid "IO.bracket".

I'm using Control.Exception's bracket.

Also, following Jeremy's advice, I replaced the last line of rechf2 with

    return $! rech r $ lines f

and indeed it works. I don't understand why it doesn't without the "!".

The documentation for hGetContents says the items are read on demand. The function rech needs the strings, so they should be read on demand. This confuses me :(

On Mar 6, 2007, at 16:22 , D.V. wrote:
> I don't understand why it doesn't without the !
> The documentation for hGetContents says the items are read on demand. the function rech needs the strings so they should be read on demand. This confuses me :(

Because rech is lazy, so it only reads them when something demands that it do so (e.g. IO, or $!, which forces strict evaluation).

--
brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

On Wednesday 07 March 2007 10:22, D.V. wrote:
> Also, following Jeremy's advice, I replaced the last line of rechf2 with return $! rech r $ lines f and indeed it works.
> I don't understand why it doesn't without the !
> The documentation for hGetContents says the items are read on demand. the function rech needs the strings so they should be read on demand. This confuses me :(

The problem is that hGetContents only reads the contents of the file on demand and, without the 'return $!', you don't demand the value until somewhere outside of rechf. By this point the hClose has happened and hGetContents has no access to the file => no lines => no result.

Using 'return $!' demands the value of rech r $ lines f immediately, so hGetContents accesses the file before the hClose.

hGetContents is implemented using unsafeInterleaveIO so, unless you're very careful, you could get other weird behaviours.

Daniel

> The problem is that hGetContents only reads the contents of the file on demand and, without the 'return $!' you don't demand the value until somewhere outside of rechf. By this point the hClose has happened and hGetContents has no access to the file => no lines => no result.

I must be really dumb but I don't get why 'at this point the hClose has happened'.

It seemed to me that when I typed at ghci's prompt rechf "xxxx" it tries to evaluate it. That makes it perform the IO action of opening the file, then performing the IO action of evaluating (since I need the result) rechf2 "xxxx" and *lastly* performing the IO action of closing the file.

So to me either I'm just too confused to understand or something doesn't work as it should.

I also tried using readFile (again following Jeremy's suggestion) and it works:

    rechf r = do
      f <- readFile "liste"
      return $ rech r $ lines f

On Wednesday 07 March 2007 10:56, D.V. wrote:
> I must be really dumb but I don't get why 'at this point the hClose has happened'

Not at all. Mixing laziness and unsafe IO can be tricksie little devils. A bit like vinegar and baking soda, or, perhaps more accurately, nitric acid and a bale of cotton.

My general advice would be to stay away from hGetContents (which is unsafe) and just use explicit mechanisms, at least until you're more familiar with Haskell.

> It seemed to me that when I typed at ghci's prompt rechf "xxxx" it tries to evaluate it.

As an aside, ghci is actually executing something like putStrLn . show $ rechf "xxxx" as a convenience for you. It's demanding the value of rechf "xxxx" so that it can print it out.

> that makes it perform the IO action of opening the file, then performing the IO action of evaluating ( since I need the result ) rechf2 "xxxx" and *lastly* performing the IO action of closing the file.

Your program actually says:

1) open file handle
2) create a String that will be read _on demand_ from file handle
3) close file handle
4) print the value which is computed from the String

Step 2 doesn't read anything from the file because it is lazy. In step 4, the putStrLn that ghci added for you needs the String, which needs to read from the file, but it's too late.

When you insert the print that you originally had commented out, you're adding a step 2.5 which needs the value of the String, which causes the file to be read.

As Matthew Brecknell shows, it can be difficult to write a 'Step 2.5' that fixes the problem.

> So to me either I'm just too confused to understand or something doesn't work as it should be.

It's working as expected, the expectation being that the mixing of laziness and unsafe IO is likely to give you a headache unless you really know what you're doing. The documentation of hGetContents could make this clearer, I think.

> I also tried using readFile ( again following Jeremy's suggestion ) and it works :
> rechf r = do f <- readFile "liste" return $ rech r $ lines f

Yeah, my advice is to use the explicit mechanisms, which aren't difficult, at least until you know enough not to blow your fingers off.

Daniel
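
One possible reading of "explicit mechanisms" is to read the file line by line with hGetLine and stop at the first match, so that every read visibly happens before hClose. The sketch below assumes that reading; the name rechfExplicit and the FilePath parameter are hypothetical and do not appear in the thread.

```haskell
import Control.Exception (bracket)
import System.IO

-- Hypothetical explicit-I/O variant of rechf: no lazy contents string,
-- each line is read by an ordinary IO action while the handle is open.
rechfExplicit :: FilePath -> String -> IO (Maybe String)
rechfExplicit path key = bracket (openFile path ReadMode) hClose go
  where
    go h = do
      eof <- hIsEOF h
      if eof
        then return Nothing            -- reached end of file, key not found
        else do
          line <- hGetLine h           -- reads one complete line right now
          let (k, rest) = break (== '=') line
          if k == key
            then return (Just (drop 1 rest))  -- drop the '=' separator
            else go h
```

Since hGetLine returns a line that has already been read, the returned value needs no further access to the handle, and the order of operations matches the order in which the code is written.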

On 3/7/07, Daniel McAllansmith wrote:
> Your program actually says:
> 1) open file handle
> 2) create a String that will be read _on demand_ from file handle
> 3) close file handle
> 4) print the value which is computed from the String

Thanks for a clear explanation, and that's also where I got lost. Claus' explanation was helpful too.

I thought that the order in which things were done was:

1) print the value which is computed from rechf; to do so,
2) open file handle
3) create a String ....
4) close file handle

hence my confusion.

I think I kinda understand now what's going on, but the docs are definitely lacking here :(

By the way, what would be a safe version of hGetContents? And why is hGetContents unsafe?

At Tue, 6 Mar 2007 22:56:52 +0100, D.V. wrote:
> I must be really dumb but I don't get why 'at this point the hClose has happened'
>
> It seemed to me that when I typed at ghci's prompt rechf "xxxx" it tries to evaluate it. that makes it perform the IO action of opening the file, then performing the IO action of evaluating ( since I need the result ) rechf2 "xxxx" and *lastly* performing the IO action of closing the file.

That is close, but not quite right:

1. In ghci, you type rechf "xxxxxx" and it tries to print the result.
2. That makes it perform the IO action of opening the file (and it happens immediately).
3. 'f <- hGetContents h' lazily returns the contents of the file handle as a string. This means it won't actually read anything from the file until it needs to.
4. 'return $ rech r $ lines f' also lazily returns its result. This means that it does not perform the computation right away; instead it waits for someone to actually 'use' the result. In general, to actually 'use' a value you have to do something that interacts with the 'real world', such as printing the value out.
5. 'hClose h' closes the file handle, and does so immediately.
6. ghci tries to print the result that rechf returned. So, this means we are finally doing something that will force the value. So, all those suspended computations try to do their thing, except that we have explicitly closed the file handle already. So, when the suspended computations try to read from the file handle, they don't get anything.

When you uncomment the 'print f' line, you force the lazy value f to be evaluated right then, which is before hClose has been called. But when you do not call 'print f', nothing really needs the value of f until *after* hClose has been called, so you get no output.

Of course, you do not want to print the entire contents of the file, so that is where the other solutions come into play.

The function ($!) has the type:

    ($!) :: (a -> b) -> a -> b

When you run:

    f $! x

it forces 'x' and then runs 'f x'. In your code this means it will force the expression:

    (rech r $ lines f)

before the function returns. Then, when hClose runs, everything is ok, because we already got everything out of the file that we needed.

As someone else pointed out in a different message, sometimes (f $! x) might not be sufficient because it might not force x 'all the way'.

As this blog entry shows:

http://blogs.nubgames.com/code/?p=22

hGetContents can make for very elegant code. But, at the same time, under the hood, hGetContents uses 'unsafeInterleaveIO'. The 'unsafe' part is basically the exact bug you are now seeing. For this reason, some people have suggested that hGetContents should be:

a) removed, or
b) renamed to unsafehGetContents

In general laziness is very nice, but there are two common problem cases that you will run into when you are first starting:

1) Mixing IO and laziness
2) Space leaks (aka, using lots of RAM) and stack overflows caused by code being *too* lazy.

Every Haskell programmer runs into these at some point in time, and they are very confusing at first. Unfortunately, Haskell programmers tend to uncover these two issues at the beginning of their journey, when they are least equipped to make sense of them :(

Hope this helps,
j.
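
The remark that (f $! x) might not force x 'all the way' can be made concrete with a small experiment. This is an illustrative sketch only; the helper name forcedOk is made up here. It relies on ($!) behaving like f $! x = x `seq` f x, which evaluates x just far enough to expose its outermost constructor (weak head normal form).

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical helper: True when forcing x with ($!) succeeds,
-- False when the forcing itself throws an exception.
forcedOk :: a -> IO Bool
forcedOk x = do
  r <- try (evaluate (const () $! x)) :: IO (Either SomeException ())
  return (either (const False) (const True) r)

-- Forcing stops at the Just constructor, so the undefined payload
-- is never touched.
justWithBadPayload :: IO Bool
justWithBadPayload = forcedOk (Just (undefined :: String))

-- Here the value itself is undefined, so forcing it throws.
badMaybe :: IO Bool
badMaybe = forcedOk (undefined :: Maybe String)
```

In the lookup code this means 'return $!' only guarantees that the Just or Nothing has been determined before hClose; the characters of the value inside the Just may still be unread.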

Daniel McAllansmith:
> The problem is that hGetContents only reads the contents of the file on demand and, without the 'return $!' you don't demand the value until somewhere outside of rechf. By this point the hClose has happened and hGetContents has no access to the file => no lines => no result.
>
> Using 'return $!' is demanding the value of rech r $ lines f immediately so hGetContents accesses the file before the hClose.
>
> hGetContents is implemented using unsafeInterleaveIO so, unless you're very careful, you could get other weird behaviours.

For example, if the value for the key being searched is very long (>8k on my system), it is likely to be truncated. This is because "return $!" only reduces its argument to WHNF. In other words, it only demands enough to work out whether the key is present, which means that hGetContents only gets a chance to return the first block of data retrieved after the key is found, before the file is closed.

So, instead of "return $! rech r $ lines f", you really need something like:

    case (rech r $ lines f) of
      Just s  -> return $! Just $! foldr seq s s
      Nothing -> return Nothing
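
A variation on the same idea, written so that the forcing step is a separate IO action, might look like the sketch below. The names forceResult and rechfForced and the FilePath parameter are assumptions made for this illustration; only functions from base are used.

```haskell
import Control.Exception (bracket, evaluate)
import System.IO

-- Fully evaluate a Maybe String: the constructor, the list spine,
-- and every character of the value.
forceResult :: Maybe String -> IO (Maybe String)
forceResult Nothing  = return Nothing
forceResult (Just s) = do
  _ <- evaluate (foldr seq () s)  -- traverses the whole value
  return (Just s)

-- Hypothetical rechf variant: reads only as far as the end of the
-- matching line, and finishes reading before hClose runs.
rechfForced :: FilePath -> String -> IO (Maybe String)
rechfForced path r = bracket (openFile path ReadMode) hClose $ \h -> do
  f <- hGetContents h
  forceResult (lookup r (map splitLine (lines f)))
  where
    splitLine l = let (k, rest) = break (== '=') l in (k, drop 1 rest)
```

Pattern matching in forceResult determines whether the key exists, and the fold then pulls in the rest of the matching line, so a value longer than one read buffer is complete before the handle is closed.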

> Also, following Jeremy's advice, I replaced the last line of rechf2 with return $! rech r $ lines f and indeed it works.
> I don't understand why it doesn't without the !
> The documentation for hGetContents says the items are read on demand. the function rech needs the strings so they should be read on demand. This confuses me :(

that kind of issue comes up frequently, so here is my go at a description. if you find this helpful, it would be nice if you could put up your questions and answers on the wiki; if you don't find it helpful, or if the wiki already explains the issues, please ignore the following.

[I've hereby set up an implicit source of input, which you can read at your own convenience, as far as you need:-]

hGetContents sets up an implicit computation that will read input on demand for its output. you're not supposed to do anything else with either the input or the computation, other than consuming its output, until the demand for the output has been satisfied completely (and hence the input has been read as far as needed).

openFile/<readsomething>/hClose give you explicit control over the sequence of file operations. explicit control means you're responsible for organizing things.

either style of i/o works on its own, but don't try to mix them, certainly not on the same file/handle.

concretely, you're explicitly requesting the file handle to be closed, after the implicit computation has been set up, but *before* any of its results have been demanded. that cuts off the input source for the implicit computation. so when later parts of the program demand its output, there is nothing left to be delivered.

[rechf only returns an access point to the results of the implicit computation]

your options:

- move the explicit hClose backwards, until you're sure that you've all the output you want, and hence are done with the input (sometimes tricky)
- avoid the explicit hClose entirely, use only implicit i/o on that file (okay if you don't need many handles)
- move the demand for the output of the implicit computation forward, to ensure that its input will be consumed *before* the explicit hClose cuts it off (involves seq or $! and some forced traversal of the complete output, eg, via length)
- avoid the implicit hGetContents entirely, use only explicit i/o on that file (sometimes necessary, but usually more complicated)

hth,
claus

[i've hereby explicitly closed the source of input, suggesting there is nothing more to come]

ps. there should be library functions for getContentsNow/readFileNow, giving the convenience of implicit i/o without the advantages and pitfalls of lazy i/o. as long as there aren't, you might find it both instructive and helpful in practice to define them yourself?

[but then i've added more input here, after the closing; will it be read implicitly, or ignored explicitly?-]
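
The readFileNow idea from the postscript could be defined along these lines. This is a minimal sketch, assuming that "Now" means "read everything before returning"; hGetContentsNow and readFileNow are names taken from the suggestion, not existing library functions. Newer versions of base are understood to ship strict variants such as readFile' in System.IO, so it is worth checking the documentation for your version first.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Read everything from the handle before returning.
hGetContentsNow :: Handle -> IO String
hGetContentsNow h = do
  s <- hGetContents h
  _ <- evaluate (length s)  -- forces the whole string to be read
  return s

-- Strict counterpart of readFile: the file is fully read and its
-- handle closed by the time the action returns.
readFileNow :: FilePath -> IO String
readFileNow path = withFile path ReadMode hGetContentsNow
```

With such a helper, the lookup becomes ordinary pure code over a string that is already in memory, for example fmap (rech r . lines) (readFileNow "liste"), at the cost of holding the whole file at once.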
participants (8)
- Brandon S. Allbery KF8NH
- Claus Reinke
- D.V.
- Daniel McAllansmith
- Jeremy Shaw
- Jeremy Shaw
- Matthew Brecknell
- mm