
Hello Libraries,

In base, the functions which read all contents from a handle or file into one String currently all do lazy IO: readFile, getContents, hGetContents.

https://hackage.haskell.org/package/base-4.12.0.0/docs/System-IO.html#v:hGet...

The easiest way to get a strict alternative seems to be to explicitly force the list, for example using ```length contents `seq` pure ()```, but that's far from an obvious solution.

Is there a better way?

If not, I propose to add readFile', getContents', hGetContents', which don't do lazy IO.

Lazy IO regularly creates confusion among beginners, and it's easy to assume that lazy IO is benign if that's the only way to do certain operations, when it's arguably the wrong way to read files to begin with.

Cheers, Li-yao
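
For reference, the forcing idiom from the message above can be wrapped in a small helper. This is only a minimal sketch and not the proposed base implementation; the name readFileStrict is hypothetical, chosen so that it does not clash with the proposed readFile'.

```haskell
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Hypothetical helper (the name is an assumption, not a base function):
-- read a whole file into a String before the handle is closed.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    contents <- hGetContents h
    -- Walking the list to its end makes the lazy read run to EOF
    -- while the handle is still open.
    length contents `seq` pure contents
```

Without the `seq`, withFile would close the handle before anything had been read, and the caller would see a truncated (typically empty) String.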

I believe such a function exists in the strict package. I agree that it would be good to add such functions to base.
_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

On Wed, 11 Sep 2019, Li-yao Xia wrote:
The easiest way to get a strict alternative seems to be to explicitly force the list, for example using ```length contents `seq` pure ()```, but that's far from an obvious solution.
I am not sure whether this works reliably. Evaluating the length of 'contents' only generates the skeleton of the list but not immediately the element values. A cleaner way would be to use 'deepseq'.
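
A sketch of what the 'deepseq' variant could look like, assuming the deepseq package is available (it is not part of base). The helper name readFileDeep is hypothetical. Here `force` evaluates the whole String, both the list structure and the characters, and `evaluate` sequences that evaluation inside IO before the handle is closed.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Hypothetical helper: fully evaluate the contents (list cells and
-- characters) while the handle is still open.
readFileDeep :: FilePath -> IO String
readFileDeep path =
  withFile path ReadMode $ \h -> do
    contents <- hGetContents h
    evaluate (force contents)
```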

Hi Henning,

On 9/11/19 2:52 PM, Henning Thielemann wrote:
I am not sure whether this works reliably. Evaluating the length of 'contents' only generates the skeleton of the list but not immediately the element values. A cleaner way would be to use 'deepseq'.

That's an interesting question, because I'm pretty confident this is a reliable way to force getContents, but I'm less sure I can convince you of it easily.

Thinking of how that could break, I believe that one would have to go out of their way to implement getContents such that forcing the list does not also make its characters available even after the file is closed. At that point the author of that function should stop and wonder whether it is worth the trouble, and I trust that the author, if they even considered the possibility, would reach the reasonable conclusion of "don't do that".

Of course, that argument can go wrong in many ways, especially because it is full of subjective judgements. So to get some closure, let's look at the source code. Skipping over the intermediate steps that one would have to check for themselves anyway, it boils down to this unpack function:

https://hackage.haskell.org/package/base-4.12.0.0/docs/src/GHC.IO.Handle.Tex...

Near the end of the function is the line that adds a character c to the string that will be returned. We can see that the cons comes with the character fully read by peekElemOf:

    unpackRB (c : acc) (i-1)

Cheers, Li-yao
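
The two positions can be put side by side in a small experiment. This is a sketch with hypothetical names (spineOnly, readThenClose): for an arbitrary list, length really does leave the elements untouched, while for the list produced by hGetContents the claim above is that each cons cell already carries its character, so the contents stay usable after the handle is closed.

```haskell
import System.IO (IOMode (ReadMode), hClose, hGetContents, openFile)

-- For an arbitrary list, length never looks at the elements:
-- this evaluates to 2 even though both elements are undefined.
spineOnly :: Int
spineOnly = length [undefined, undefined :: Char]

-- Hypothetical helper: force the length, close the handle explicitly,
-- and only then hand the String back to the caller.
readThenClose :: FilePath -> IO String
readThenClose path = do
  h <- openFile path ReadMode
  contents <- hGetContents h
  length contents `seq` hClose h
  pure contents
```

Comparing the result of readThenClose with the known file contents after the handle is closed is one way to check the claim on a given GHC version.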

Wouldn't it be more sensible to not interleave IO in the first place?

Cheers, Vanessa

+1 to adding those non-lazy versions. Such functions could work
without having to half-close the handle, thus making it easier to
continue reading from a file after EOF (à la `tail -f`). (I've asked
about how to do this exact thing before at
https://stackoverflow.com/q/56221606/7509065 and this would give it a
trivial answer.)
Joseph C. Sible
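
As a rough illustration of reading to EOF without half-closing the handle, here is a sketch built only from hIsEOF and hGetChar. The names hGetRest and readAndReport are hypothetical, and this is not how the proposed hGetContents' would necessarily be implemented. Whether a later call picks up data appended after EOF depends on the platform and the kind of handle; that part is an assumption and is not checked here.

```haskell
import System.IO (Handle, IOMode (ReadMode), hClose, hGetChar, hIsEOF, hIsOpen, openFile)

-- Hypothetical helper: read every character currently available,
-- stopping at EOF but leaving the handle open (not semi-closed).
hGetRest :: Handle -> IO String
hGetRest h = go []
  where
    go acc = do
      eof <- hIsEOF h
      if eof
        then pure (reverse acc)
        else do
          c <- hGetChar h
          go (c : acc)

-- Usage sketch: returns the contents and whether the handle was
-- still open after the read.
readAndReport :: FilePath -> IO (String, Bool)
readAndReport path = do
  h <- openFile path ReadMode
  s <- hGetRest h
  stillOpen <- hIsOpen h
  hClose h
  pure (s, stillOpen)
```

Reading one character at a time is slow; the point is only that the handle remains an ordinary open handle afterwards.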

I like the idea of having strict versions of these functions. I also prefer
recommending the strict versions of these functions from the text package's
Data.Text.IO over encouraging people to use String to load files. I agree
that people should be able to read file contents strictly without having to
use tricks like forcing the length.
-- Eric Mertens
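
For comparison, a minimal sketch of the Data.Text.IO route mentioned in the message above, assuming the text package is available. The helper name readLinesText is hypothetical; Data.Text.IO.readFile reads the whole file strictly, so no forcing trick is needed.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical helper: strictly read a file and split it into lines.
readLinesText :: FilePath -> IO [T.Text]
readLinesText path = T.lines <$> TIO.readFile path
```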

Certainly reading directly into a String is inefficient for a strict read.
I imagine the best way is to read (eagerly) into a lazy ByteString and
decode that lazily into a String. Even for Text, it may well be better to
read into a lazy ByteString and decode into lazy Text, since the latter
tends to take considerably more memory.
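
One possible reading of this suggestion, as a sketch only: read the bytes eagerly, wrap them in a lazy ByteString, and decode to String on demand. It assumes the bytestring and text packages and, unlike readFile from base (which uses the handle's encoding), assumes the file is UTF-8. The name readFileUtf8 is hypothetical.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- Hypothetical helper: the read itself is strict, so the file is no
-- longer needed once this returns; decoding to String happens lazily
-- as the result is consumed.
-- Assumption: the file is UTF-8 encoded. Invalid input raises an
-- exception when the offending part of the String is demanded.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = do
  bytes <- BS.readFile path
  pure (TL.unpack (TLE.decodeUtf8 (BL.fromStrict bytes)))
```

In this sketch only the compact byte representation is held in memory up front; the list of Char is produced as it is consumed.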
participants (6)
- David Feuer
- Eric Mertens
- Henning Thielemann
- Joseph C. Sible
- Li-yao Xia
- Vanessa McHale