Re: [Haskell-cafe] Proposal: Non-recursive let

ivan.chollet wrote:
let's consider the following:
let fd = Unix.open ...
let fd = Unix.open ...
At this point one file descriptor cannot be closed. Static analysis will have trouble catching these bugs, so do humans.
Both sentences express false propositions.
The given code, if Haskell, does not open any file descriptors, so
there is nothing to close. In the following OCaml code
let fd = open_in "/tmp/a" in
let fd = open_in "/tmp/v" in
...
the first open channel becomes unreachable. When GC collects it (which
will happen fairly soon, on a minor collection, because the channel
died young), GC will finalize the channel and close its file
descriptor.
The corresponding Haskell code
do
  h <- openFile ...
  h <- openFile ...
works similarly to OCaml. Closing file handles upon GC is codified in
the Haskell report because Lazy IO crucially depends on such behavior.
If one is interested in statically tracking open file descriptors and
making sure they are closed promptly, one could read large literature
on this topic. Google search for monadic regions should be a good
start. Some of the approaches are implemented and used in Haskell.
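As a rough illustration of prompt closing, here is a minimal sketch. It uses the plain bracket-style `withFile` from `System.IO`, not a monadic-region library, so it is only the simplest relative of the approaches mentioned above. The helper name `firstLine` and the path `/tmp/regions-demo.txt` are hypothetical, chosen for illustration.

```haskell
import System.IO

-- Hypothetical helper: read the first line of a file.
-- withFile closes the handle when the action finishes, whether it
-- returns normally or throws, so nothing is left for the GC to finalize.
firstLine :: FilePath -> IO String
firstLine path = withFile path ReadMode hGetLine

main :: IO ()
main = do
  -- Assumed scratch path, for illustration only.
  writeFile "/tmp/regions-demo.txt" "hello\nworld\n"
  l <- firstLine "/tmp/regions-demo.txt"
  putStrLn l
```

Because the handle never escapes the callback in this helper, there is no name left in scope to shadow or to forget to close. Region libraries go further and enforce that property in the types.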
Now about static analysis. Liveness analysis has no problem whatsoever
determining that a variable fd in our examples has been shadowed and
the corresponding value is dead. We are all familiar with liveness
analysis -- it's the one responsible for `unused variable'
warnings. The analysis is useful for many other things (e.g., if it
determines that a created value dies within the function activation,
the value could be allocated on the stack rather than on the heap). Here is an
example from C:
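#include <stdio.h>

void foo(void) {
  char x[4] = "abc";      /* Intentional copying! */
  {
    char x[4] = "cde";    /* Intentional copying and shadowing */
    x[0] = 'x';
    printf("result %s\n", x);
  }
}

Pretty old GCC (4.2.1) had no trouble detecting the shadowing. With the optimization flag -O4, GCC acted on this knowledge. The generated assembly code reveals no traces of the string "abc", not even in the .rodata section of the code. The compiler determined the string is really unused and did not bother even compiling it in.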
Disallowing variable shadowing prevents this. The two "fd" occur in different contexts and should have different names. Usage of shadowing is generally bad practice. It is error-prone. Hides obnoxious bugs like file descriptors leaks. The correct way is to give different variables that appear in different contexts a different name, although this is arguably less convenient and more verbose.
CS would be better as science if we refrain from passing our personal opinions and prejudices as ``the correct way''. I can't say better than the user Kranar in a recent discussion on a similar `hot topic':

The issue is that much of what we do as developers is simply based on anecdotal evidence, or statements made by so called "evangelicals" who blog about best practices and people believe them because of how articulate they are or the cache and prestige that the person carries. ... It's unfortunate that computer science is still advancing the same way medicine advanced with witch doctors, by simply trusting the wisest and oldest of the witch doctors without any actual empirical data, without any evidence, just based on the reputation and overall charisma or influence of certain bloggers or "big names" in the field.

http://www.reddit.com/r/programming/comments/1iyp6v/is_there_a_really_an_emp...

Although your post is a bit trollish, I answer below to clear any confusion.
On Thu, Jul 25, 2013 at 7:18 AM,
ivan.chollet wrote:
let's consider the following:
let fd = Unix.open ...
let fd = Unix.open ...
At this point one file descriptor cannot be closed. Static analysis will have trouble catching these bugs, so do humans.
Both sentences express false propositions.
Nope, they both express true propositions, as shown below.
The given code, if Haskell, does not open any file descriptors, so there is nothing to close.
I gave the code in caml syntax since your original post was about caml.
In the following OCaml code
let fd = open_in "/tmp/a" in
let fd = open_in "/tmp/v" in
...
the first open channel becomes unreachable. When GC collects it (which will happen fairly soon, on a minor collection, because the channel died young), GC will finalize the channel and close its file descriptor.
This is not the code I posted. I explicitly used "Unix.open", not "open_in". This dishonest rewrite of my trivial code snippet is trollish and ridiculous. In your code, your "fd"s are not file descriptors, they are channels. It's funny that you use a variable name "fd" for a channel, shows confusion at the very least, since they are completely different objects and concepts. I assume that for a file descriptor, you invariably use a variable name "ch"? In my code, the fd are not garbage collected, they are not subject to garbage collection since they don't live in the runtime, they are purely OS objects that need to be closed with an explicit Unix.close system call. As a result, your comment is completely wrong, the first file descriptor in my code snippet gets invariably leaked as I said. This proves the first sentence.
The corresponding Haskell code
do
  h <- openFile ...
  h <- openFile ...
works similarly to OCaml. Closing file handles upon GC is codified in the Haskell report because Lazy IO crucially depends on such behavior.
If one is interested in statically tracking open file descriptors and making sure they are closed promptly, one could read large literature on this topic. Google search for monadic regions should be a good start. Some of the approaches are implemented and used in Haskell.
Now about static analysis. Liveness analysis has no problem whatsoever determining that a variable fd in our examples has been shadowed and the corresponding value is dead. We are all familiar with liveness analysis -- it's the one responsible for `unused variable' warnings. The analysis is useful for many other things (e.g., if it determines that a created value dies within the function activation, the value could be allocated on the stack rather than on the heap). Here is an example from C:
#include <stdio.h>

void foo(void) {
  char x[4] = "abc";      /* Intentional copying! */
  {
    char x[4] = "cde";    /* Intentional copying and shadowing */
    x[0] = 'x';
    printf("result %s\n", x);
  }
}
Pretty old GCC (4.2.1) had no trouble detecting the shadowing. With the optimization flag -O4, GCC acted on this knowledge. The generated assembly code reveals no traces of the string "abc", not even in the .rodata section of the code. The compiler determined the string is really unused and did not bother even compiling it in.
Detecting shadowing is trivial as I said in a previous post and is not the problem here. The compiler can always throw a warning about the shadowing, but not much else. There is no algorithm that can tell you (by static analysis or not by the way) whether a program is leaking file descriptors or not. This is a corollary of Rice's theorem. For example if you put your file descriptors in a list, then shadow some of them, static analysis will be of no help to tell you if your program is leaking file descriptors or not. This proves the second sentence.
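To make the compiler warnings under discussion concrete, here is a minimal Haskell sketch of the rebinding pattern from the thread. The function name `secondFirstLine` and the `/tmp` paths are hypothetical. With `-Wall`, GHC should report the second `h` as shadowing the first (`-Wname-shadowing`) and the first `h` as defined but not used (`-Wunused-matches`). Those warnings concern names only; nothing in them says a handle was left open.

```haskell
{-# OPTIONS_GHC -Wall #-}
import System.IO

-- Hypothetical helper mirroring the do-block from the thread.
-- The first handle is rebound before it is used or closed, so it stays
-- open until the runtime finalizes it (or the program exits).
secondFirstLine :: FilePath -> FilePath -> IO String
secondFirstLine a b = do
  h <- openFile a ReadMode   -- first handle: unused, never closed explicitly
  h <- openFile b ReadMode   -- shadows the first `h`
  l <- hGetLine h
  hClose h                   -- closes only the second handle
  pure l

main :: IO ()
main = do
  -- Assumed scratch paths, for illustration only.
  writeFile "/tmp/shadow-a.txt" "a\n"
  writeFile "/tmp/shadow-b.txt" "b\n"
  l <- secondFirstLine "/tmp/shadow-a.txt" "/tmp/shadow-b.txt"
  putStrLn l
```

The program compiles and runs despite the warnings, which is consistent with both positions in the thread: the shadowing is easy to detect, while deciding whether a resource is actually leaked is a separate question.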
Disallowing variable shadowing prevents this. The two "fd" occur in different contexts and should have different names. Usage of shadowing is generally bad practice. It is error-prone. Hides obnoxious bugs like file descriptors leaks. The correct way is to give different variables that appear in different contexts a different name, although this is arguably less convenient and more verbose.
CS would be better as science if we refrain from passing our personal opinions and prejudices as ``the correct way''.
CS would be better as a science if we refrained from using flawed logic and trollish behaviors, as you did in your quoted post.
I can't say better than the user Kranar in a recent discussion on a similar `hot topic':
The issue is that much of what we do as developers is simply based on anecdotal evidence, or statements made by so called "evangelicals" who blog about best practices and people believe them because of how articulate they are or the cache and prestige that the person carries. ... It's unfortunate that computer science is still advancing the same way medicine advanced with witch doctors, by simply trusting the wisest and oldest of the witch doctors without any actual empirical data, without any evidence, just based on the reputation and overall charisma or influence of certain bloggers or "big names" in the field.
http://www.reddit.com/r/programming/comments/1iyp6v/is_there_a_really_an_emp...
At this point and given the context of this post: don't feed the troll
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

I'm just going to say that I'd rather we didn't resort to calling each
other trolls.
I happen to disagree with Oleg on this particular issue and find that it is
better resolved by just using -Wall or a 2-line combinator, but across the
breadth and depth of issues in the Haskell ecosystem I find myself agreeing
with him more often than not.
-Edward
On Thu, Jul 25, 2013 at 1:41 PM, i c wrote:
Although your post is a bit trollish, I answer below to clear any confusion.
On Thu, Jul 25, 2013 at 7:18 AM,
ivan.chollet wrote:
let's consider the following:
let fd = Unix.open ...
let fd = Unix.open ...
At this point one file descriptor cannot be closed. Static analysis will have trouble catching these bugs, so do humans.
Both sentences express false propositions.
Nope, they both express true propositions, as shown below.
The given code, if Haskell, does not open any file descriptors, so there is nothing to close.
I gave the code in caml syntax since your original post was about caml.
In the following OCaml code
let fd = open_in "/tmp/a" in
let fd = open_in "/tmp/v" in
...
the first open channel becomes unreachable. When GC collects it (which will happen fairly soon, on a minor collection, because the channel died young), GC will finalize the channel and close its file descriptor.
This is not the code I posted. I explicitly used "Unix.open", not "open_in". This dishonest rewrite of my trivial code snippet is trollish and ridiculous. In your code, your "fd"s are not file descriptors, they are channels. It's funny that you use a variable name "fd" for a channel, shows confusion at the very least, since they are completely different objects and concepts. I assume that for a file descriptor, you invariably use a variable name "ch"? In my code, the fd are not garbage collected, they are not subject to garbage collection since they don't live in the runtime, they are purely OS objects that need to be closed with an explicit Unix.close system call.
As a result, your comment is completely wrong, the first file descriptor in my code snippet gets invariably leaked as I said.
This proves the first sentence.
The corresponding Haskell code
do
  h <- openFile ...
  h <- openFile ...
works similarly to OCaml. Closing file handles upon GC is codified in the Haskell report because Lazy IO crucially depends on such behavior.
If one is interested in statically tracking open file descriptors and making sure they are closed promptly, one could read large literature on this topic. Google search for monadic regions should be a good start. Some of the approaches are implemented and used in Haskell.
Now about static analysis. Liveness analysis has no problem whatsoever determining that a variable fd in our examples has been shadowed and the corresponding value is dead. We are all familiar with liveness analysis -- it's the one responsible for `unused variable' warnings. The analysis is useful for many other things (e.g., if it determines that a created value dies within the function activation, the value could be allocated on the stack rather than on the heap). Here is an example from C:
#include <stdio.h>

void foo(void) {
  char x[4] = "abc";      /* Intentional copying! */
  {
    char x[4] = "cde";    /* Intentional copying and shadowing */
    x[0] = 'x';
    printf("result %s\n", x);
  }
}
Pretty old GCC (4.2.1) had no trouble detecting the shadowing. With the optimization flag -O4, GCC acted on this knowledge. The generated assembly code reveals no traces of the string "abc", not even in the .rodata section of the code. The compiler determined the string is really unused and did not bother even compiling it in.
Detecting shadowing is trivial as I said in a previous post and is not the problem here. The compiler can always throw a warning about the shadowing, but not much else.
There is no algorithm that can tell you (by static analysis or not by the way) whether a program is leaking file descriptors or not. This is a corollary of Rice's theorem. For example if you put your file descriptors in a list, then shadow some of them, static analysis will be of no help to tell you if your program is leaking file descriptors or not.
This proves the second sentence.
Disallowing variable shadowing prevents this. The two "fd" occur in different contexts and should have different names. Usage of shadowing is generally bad practice. It is error-prone. Hides obnoxious bugs like file descriptors leaks. The correct way is to give different variables that appear in different contexts a different name, although this is arguably less convenient and more verbose.
CS would be better as science if we refrain from passing our personal opinions and prejudices as ``the correct way''.
CS would be better as a science if we refrained from using flawed logic and trollish behaviors, as you did in your quoted post.
I can't say better than the user Kranar in a recent discussion on a similar `hot topic':
The issue is that much of what we do as developers is simply based on anecdotal evidence, or statements made by so called "evangelicals" who blog about best practices and people believe them because of how articulate they are or the cache and prestige that the person carries. ... It's unfortunate that computer science is still advancing the same way medicine advanced with witch doctors, by simply trusting the wisest and oldest of the witch doctors without any actual empirical data, without any evidence, just based on the reputation and overall charisma or influence of certain bloggers or "big names" in the field.
http://www.reddit.com/r/programming/comments/1iyp6v/is_there_a_really_an_emp...
At this point and given the context of this post: don't feed the troll
participants (3)
- Edward Kmett
- i c
- oleg@okmij.org