RE: [Haskell-cafe] Are handles garbage-collected?

On 24 October 2004 20:51, Sven Panne wrote:
IMHO it would be best to use explicit bracketing where possible, and hope for the RTS/GC to try its best when one runs out of a given resource. Admittedly the current Haskell implementations could be improved a little bit in the last respect.
Indeed, GHC could/should try to free up file descriptors when they run out. It's a bit tricky though. At the moment performGC doesn't actually run any finalizers. It schedules a thread to run the finalizers, and you hope the thread runs soon. So if you're running performGC for the purposes of finalization, then almost certainly (performGC >> yield) is better. I've been wondering whether having a more synchronous kind of finalizer would be a good thing. Nevertheless, I agree with the general sentiment on this thread that file descriptors shouldn't be treated as a resource in the same way as memory. Cheers, Simon
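For readers following along, the two pieces of advice above (bracket explicitly, and follow performGC with yield if you want finalizers to get a chance to run) might look like the sketch below. The names countLines and collectAndYield are made up for illustration; only bracket, evaluate, performGC and yield are real library functions. Note that yield merely gives other threads a chance to run, so this still does not guarantee that any finalizer has completed.

```haskell
import Control.Concurrent (yield)
import Control.Exception (bracket, evaluate)
import System.IO (IOMode (ReadMode), hClose, hGetContents, openFile)
import System.Mem (performGC)

-- Explicit bracketing: the handle is closed when the body returns or throws,
-- without waiting for the garbage collector.
countLines :: FilePath -> IO Int
countLines path =
  bracket (openFile path ReadMode) hClose $ \h -> do
    contents <- hGetContents h
    -- hGetContents is lazy, so force the result before hClose runs.
    evaluate (length (lines contents))

-- Request a collection, then yield so the thread that runs finalizers
-- has an opportunity to be scheduled.
collectAndYield :: IO ()
collectAndYield = performGC >> yield
```

With bracket the descriptor is released deterministically, so collectAndYield is only a fallback for handles that were dropped without being closed.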

"Simon Marlow"
At the moment performGC doesn't actually run any finalizers. It schedules a thread to run the finalizers, and you hope the thread runs soon. So if you're running performGC for the purposes of finalization, then almost certainly (performGC >> yield) is better. I've been wondering whether having a more synchronous kind of finalizer would be a good thing.
In my language Kogut I have two kinds of finalizers:

- Some objects are implemented in C with C finalizers attached, e.g. raw files. These finalizers should not access other objects and they can't call back into Kogut; they are used to free external resources or explicitly managed memory (e.g. bignums, or the payload of arrays which is malloced). These finalizers are run during GC.

- You can also attach GHC-style weak references with finalizers to arbitrary objects. These finalizers are Kogut functions. They may resurrect the object being finalized, and as far as I know they can't be used to crash the runtime. They are executed in a separate thread spawned by the garbage collector. They may use synchronization primitives to wait until global variables are in a consistent state. If a finalizer throws an exception, the exception is ignored and the finalizing thread proceeds with other objects. There is a small problem with their implementation: if a finalizer blocks forever, other finalizers from the same GC round will never be executed.

This implies that if you forget about a buffered file used for writing, then the file descriptor will be closed, but buffers will not be flushed.

Opening a file tries to GC once if it gets an error about too many open files. It works because files use the first kind of finalizers.

Finalizers of the first kind are run at program end, because it was easy to do. It doesn't matter anyway for the things they are currently used for, as all of them would be freed by the OS anyway. Finalizers of the second kind are not run at program end, and I don't know whether there is a sane way to do that.

You may register actions to be run at program end. I think it's possible to use them to implement temporary files which are guaranteed to be deleted at some time: either when they are no longer referenced, or at program end. Implementing this would require a global set of weak references to these files, and a single action will delete them all.
It's not possible to unregister these actions, and I'm not convinced that it's needed in general.

* * *

BTW, I managed to get opening of named fifos working in the presence of green threads. Opening them in non-blocking mode doesn't work; GHC documentation describes a workaround for opening them for reading. It's even worse for writing (the system just returns ENXIO).

So I open all files in blocking mode, and switch the descriptor to non-blocking mode after it has been obtained. The timer signal is ITIMER_REAL (SIGALRM) rather than ITIMER_VIRTUAL (SIGVTALRM), which causes the timer to tick while waiting for a file to open too. This causes open() to fail with EINTR, in which case I process signals or reschedule as appropriate and go back to trying open().

This also allows waiting for child processes concurrently with waiting for I/O or a timeout.

The thread waiting for a process or trying to open a named fifo wastes its time slice though. I don't think it's possible to implement this without this problem using the Unix API, without native threads. While it would be possible to avoid waking it up N times per second if it's the only thread which can do some work, I'm not sure it would be worth the effort. The CPU usage is minimal.

Another downside of ITIMER_REAL is that third-party libraries might not be prepared for EINTR from various system calls. For example my Kogut<->Python bridge has to disable the timer when entering Python, and re-enable it when returning to Kogut - otherwise Python's blocking I/O will just fail. It still fails when a signal arrives, even if it's then ignored (because ignoring a signal is done by having a signal handler which effectively does nothing, rather than raw SIG_IGN).

* * *

These libraries might not be prepared for EAGAIN resulting from non-blocking mode, in case the same descriptor is used by several runtimes or across an exec, in particular stdin/stdout/stderr.
This requires explicit programmer intervention - I have no idea how to do this fully automatically, so I require it to be done almost fully manually to avoid confusion when it's being done by the runtime.

A program tries to restore the original mode when it finishes, instead of blindly setting them to the blocking mode, so a fork & exec of a Kogut program from another doesn't disturb the blocking state. OTOH the programmer must set them to blocking mode manually before exec'ing a program which is not prepared for non-blocking mode.

Instead of resetting them to non-blocking mode on SIGCONT, as GHC does, my default handler for SIGTSTP looks like this:

   restore blocking mode on std handles
   set SIGTSTP signal handler to SIG_DFL
   raise(SIGTSTP)
   // This causes the process to stop until SIGCONT
   set SIGTSTP signal handler back to my runtime's handler
   set non-blocking mode on std handles

This means that it works on ^Z, and whatever parent process notices the stop, it will not be confused by non-blocking mode of descriptors. This is about the only case, apart from program start & exit and opening a new file, when the blocking mode is changed by the runtime.

It does not work well though when somebody sends a SIGSTOP signal instead of hitting ^Z. Then the shell starts using the terminal, possibly sets blocking mode, and the program is continued without restoring the mode. But since it will not work anyway in other cases where the descriptor shared with other processes changes its mode, I'm leaving it this way.

--
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
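The temporary-file idea from earlier in this message (delete the file either when it is no longer referenced or at program end) could be approximated in GHC roughly as below. This is a sketch under assumptions, not Kogut's mechanism: all names (liveTempFiles, TempFile, newTempFile, deleteAllTempFiles) are hypothetical, the registry holds plain paths rather than weak references so that it never keeps a file object alive, and the finalizer is attached to an IORef via mkWeakIORef because finalizers on ordinary Haskell values are not reliable in GHC. It uses System.Directory from the directory package.

```haskell
import Control.Exception (IOException, try)
import Data.IORef (IORef, atomicModifyIORef', mkWeakIORef, newIORef, readIORef)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO (hClose, openTempFile)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical global registry: paths of temporary files not yet deleted.
liveTempFiles :: IORef [FilePath]
liveTempFiles = unsafePerformIO (newIORef [])
{-# NOINLINE liveTempFiles #-}

-- The IORef field exists only so that a finalizer can be attached to it.
data TempFile = TempFile { tempPath :: FilePath, tempKey :: IORef () }

-- Delete a file if it is still registered; errors from removeFile are ignored.
deleteTemp :: FilePath -> IO ()
deleteTemp path = do
  wasLive <- atomicModifyIORef' liveTempFiles
               (\ps -> (filter (/= path) ps, path `elem` ps))
  if wasLive
    then do
      _ <- try (removeFile path) :: IO (Either IOException ())
      pure ()
    else pure ()

-- Create a temporary file, register it, and attach a finalizer to its key.
newTempFile :: String -> IO TempFile
newTempFile template = do
  dir <- getTemporaryDirectory
  (path, h) <- openTempFile dir template
  hClose h
  atomicModifyIORef' liveTempFiles (\ps -> (path : ps, ()))
  key <- newIORef ()
  _ <- mkWeakIORef key (deleteTemp path)
  pure (TempFile path key)

-- The single action to run at program end: delete whatever is still registered.
deleteAllTempFiles :: IO ()
deleteAllTempFiles = readIORef liveTempFiles >>= mapM_ deleteTemp
```

A program would wrap its body as body `finally` deleteAllTempFiles (finally is from Control.Exception), since the finalizers themselves are not guaranteed to run before exit.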

On Mon, Oct 25, 2004 at 02:14:28PM +0100, Simon Marlow wrote:
On 24 October 2004 20:51, Sven Panne wrote:
IMHO it would be best to use explicit bracketing where possible, and hope for the RTS/GC to try its best when one runs out of a given resource. Admittedly the current Haskell implementations could be improved a little bit in the last respect.
Indeed, GHC could/should try to free up file descriptors when they run out. It's a bit tricky though.
Hm, I'm not sure about the "should". Garbage collection is meant for memory, and anything making that less clear makes people more likely to depend on incorrect assumptions. And redefining GC to be a collection of _all_ garbage, instead of just memory doesn't sound so fantastic either.
At the moment performGC doesn't actually run any finalizers. It schedules a thread to run the finalizers, and you hope the thread runs soon. So if you're running performGC for the purposes of finalization, then almost certainly (performGC >> yield) is better. I've been wondering whether having a more synchronous kind of finalizer would be a good thing.
Synchronous finalizers seem to be difficult to implement in e.g. the JVM, so may make a JVM-backend more difficult. (I'm thinking about how CPython vs Jython go about closing file-handles... CPython uses (primarily) reference-counting, so files get closed as soon as they aren't referenced anymore, which lots of people now depend on. Jython uses Java-GC, so some CPython programs may suddenly fail...)
Nevertheless, I agree with the general sentiment on this thread that file descriptors shouldn't be treated as a resource in the same way as memory.
Cheers, Simon
Groeten, Remi P.S. Why do so many people (including me) seem to come to Haskell from Python? It can't be just the indentation, can it? ;) -- Nobody can be exactly like me. Even I have trouble doing it.

On Mon, Oct 25, 2004 at 08:55:46PM +0200, Remi Turk wrote:
P.S. Why do so many people (including me) seem to come to Haskell from Python? It can't be just the indentation, can it? ;)
How many? I don't. Best regards, Tom -- .signature: Too many levels of symbolic links

On Mon, Oct 25, 2004 at 09:28:23PM +0200, Tomasz Zielonka wrote:
On Mon, Oct 25, 2004 at 08:55:46PM +0200, Remi Turk wrote:
P.S. Why do so many people (including me) seem to come to Haskell from Python? It can't be just the indentation, can it? ;)
How many? I don't.
Best regards, Tom
At least one. (Me) And, judging from the amount of references to Python in these mailing-lists, I really doubt I'm the only one. I actually met Haskell mostly by reading about it in the python mailinglist/newsgroup. (in e.g. Alex Martelli's posts) Someone else (who shall remain anonymous. SCNR) wrote:
Speculation I'm afraid to post: People are drawn to Python because they hear it is a clean language, but slowly find that it's really pretty messy internally. Haskell is beautiful on the outside and the inside. :-)
The last sentence I can definitely agree with, but I'm not so sure Python really is that messy. (Messier-than-Haskell sure, but messy? ;o) Groetjes, Remi P.S. Hm, it _is_ haskell-cafe, but maybe it's about time for a [Off-topic] note? ;) -- Nobody can be exactly like me. Even I have trouble doing it.

Remi Turk
At least one. (Me) And, judging from the amount of references to Python in these mailing-lists, I really doubt I'm the only one.
At least two. I also came to Haskell from Python.
I actually met Haskell mostly by reading about it in the python mailinglist/newsgroup. (in e.g. Alex Martelli's posts)
I met Haskell because I started writing all of my Python code with a single return point, was overusing reduce and map, building function pipelines, and finally someone asked me if I'd used Haskell before.
Haskell is beautiful on the outside and the inside. :-)
Yes, Haskell is beautiful inside and outside. That's something that has kept me interested in it for years. 'Conceptually pure' is what I call it. -- Shae Matijs Erisson - Programmer - http://www.ScannedInAvian.org/ "I will, as we say in rock 'n' roll, run until the wheels come off, because I love what I do." -- David Crosby

Shae Matijs Erisson wrote:
Remi Turk writes:
At least one. (Me) And, judging from the amount of references to Python in these mailing-lists, I really doubt I'm the only one.
At least two. I also came to Haskell from Python.
Me three.
I actually met Haskell mostly by reading about it in the python mailinglist/newsgroup. (in e.g. Alex Martelli's posts)
I met Haskell because I started writing all of my Python code with a single return point, was overusing reduce and map, building function pipelines, and finally someone asked me if I'd used Haskell before.
I met Haskell by first meeting Vyper, which was a more functionally-slanted variant of Python, which was written in Ocaml. This led me to Haskell, which I liked better.
Haskell is beautiful on the outside and the inside. :-)
Yes, Haskell is beautiful inside and outside. That's something that has kept me interested in it for years. 'Conceptually pure' is what I call it.
Yep. Bryn

Remi Turk
At least one. (Me) And, judging from the amount of references to Python in these mailing-lists, I really doubt I'm the only one.
On Tue, Oct 26, 2004 at 09:21:10PM +0200, Shae Matijs Erisson wrote:
At least two. I also came to Haskell from Python.
I made a conscious effort at some point in college to learn a variety of programming languages. Haskell arose as a particularly "nice" ML variant in this exploration. I still find it useful for expressing mathematical things for various computations to help with things. -- wli

Tomasz Zielonka
P.S. Why do so many people (including me) seem to come to Haskell from Python? It can't be just the indentation, can it? ;)
How many? I don't.
And I don't either. But indeed I've seen more references to Haskell on Python lists than on other lists:

http://groups.google.com/groups?&q=group%3Acomp.lang.python+haskell - 1590 posts
http://groups.google.com/groups?&q=group%3Acomp.lang.python - 219 000 threads

http://groups.google.com/groups?&q=group%3Acomp.lang.java+haskell - 73 posts
http://groups.google.com/groups?&q=group%3Acomp.lang.java - 85 100 threads

http://groups.google.com/groups?&q=group%3Acomp.lang.perl+haskell - 35 posts
http://groups.google.com/groups?&q=group%3Acomp.lang.perl - 52 700 threads

http://groups.google.com/groups?&q=group%3Acomp.lang.c+haskell - 197 posts
http://groups.google.com/groups?&q=group%3Acomp.lang.c - 435 000 threads

http://groups.google.com/groups?&q=group%3Acomp.lang.scheme+haskell - 1330 posts
http://groups.google.com/groups?&q=group%3Acomp.lang.scheme - 44 800 threads

I don't know how to get the total number of posts from a group archived by Google. Assuming that average threads on these groups have the same size, Haskell is
- 8 times more popular on c.l.python than c.l.java,
- 11 times more popular on c.l.python than c.l.perl,
- 16 times more popular on c.l.python than c.l.c,
but finally
- 4 times *less* popular on c.l.python than c.l.scheme,
i.e. the order of Haskell-awareness is Scheme > Python > Java > Perl > C.

--
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
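The ratios quoted above can be rechecked with a few lines of Haskell. This is only a worked version of the arithmetic in the message, dividing Haskell-mentioning posts by total threads per group and comparing the results; the names counts, density and timesMore are made up for the illustration, and the figures are the ones quoted in the message.

```haskell
-- (newsgroup, posts mentioning Haskell, total threads), as quoted in the message.
counts :: [(String, Double, Double)]
counts =
  [ ("python", 1590, 219000)
  , ("java",     73,  85100)
  , ("perl",     35,  52700)
  , ("c",       197, 435000)
  , ("scheme", 1330,  44800)
  ]

-- Haskell mentions per thread for one group; 0 if the group is not in the table.
density :: String -> Double
density name = case [p / t | (n, p, t) <- counts, n == name] of
  (d : _) -> d
  []      -> 0

-- How many times more Haskell-dense group a is than group b, rounded.
-- Both names are expected to be present in counts.
timesMore :: String -> String -> Integer
timesMore a b = round (density a / density b)
```

In GHCi, map (timesMore "python") ["java", "perl", "c"] gives [8,11,16] and timesMore "scheme" "python" gives 4, matching the figures in the message.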

Marcin 'Qrczak' Kowalczyk
- 8 times more popular on c.l.python than c.l.java, - 11 times more popular on c.l.python than c.l.perl, - 16 times more popular on c.l.python than c.l.c, but finally - 4 times *less* popular on c.l.python than c.l.scheme, i.e. the order of Haskell-awareness is Scheme > Python > Java > Perl > C.
I'm not sure whether this is accounted for in the tally, but I would guess more threads are crossposted between c.l.scheme and c.l.functional than between c.l.f and the other groups, and thus more likely to be read by Haskellites. -kzm -- If I haven't seen further, it is by standing in the footprints of giants

On 2004-10-25, Tomasz Zielonka
On Mon, Oct 25, 2004 at 08:55:46PM +0200, Remi Turk wrote:
P.S. Why do so many people (including me) seem to come to Haskell from Python? It can't be just the indentation, can it? ;)
How many? I don't.
I'm one. I look at Haskell as what Python should be in a lot of ways. Haskell's type system is static and helpful, yet unobtrusive like I like my Python. Python has added a lot of FP-like features lately, and they're great. Things like generators, etc. But why go only half-way? Haskell doesn't need a special generator because a list is lazy. I like that. Haskell's greatest deficiency compared to Python, IMHO, is the breadth of good libraries available for it. I'm working to fix that. One down (ftplib), many more (ConfigParser, etc) to go :-) -- John Goerzen Author, Foundations of Python Network Programming http://www.amazon.com/exec/obidos/tg/detail/-/1590593715
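To illustrate the remark about generators: in Haskell an ordinary, even infinite, list can stand in for a Python generator, because its elements are only computed when something consumes them. A minimal sketch, with made-up names squares and firstEvenSquares:

```haskell
-- An infinite list of squares; nothing is computed until it is demanded.
squares :: [Integer]
squares = [n * n | n <- [1 ..]]

-- Take the first k even squares; laziness stops the computation
-- as soon as k elements have been produced.
firstEvenSquares :: Int -> [Integer]
firstEvenSquares k = take k (filter even squares)
```

For example, firstEvenSquares 3 evaluates to [4,16,36] and terminates even though squares never ends.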

On Mon, Oct 25, 2004 at 09:28:23PM +0200, Tomasz Zielonka wrote:
On Mon, Oct 25, 2004 at 08:55:46PM +0200, Remi Turk wrote:
P.S. Why do so many people (including me) seem to come to Haskell from Python? It can't be just the indentation, can it? ;)
How many? I don't.
My road to Haskell went from C, C++ and Perl through Ocaml, Clean and Erlang. I was mainly motivated by dissatisfaction with languages I used at the moment. I am very happy with Haskell and I think I'll be using it for some time. I still use C++ at work, mostly because of performance requirements and participation in multi-developer projects (multi stands for 2-3 here). To be honest, I am very happy with what we managed to achieve in C++, but I still get angry when I code some more complicated but less-performance-critical parts and I think how easy it would be in Haskell :) Best regards, Tom -- .signature: Too many levels of symbolic links

On Tue, Oct 26, 2004 at 10:22:23PM +0200, Tomasz Zielonka wrote:
My road to Haskell went from C, C++ and Perl through Ocaml, Clean and Erlang. I was mainly motivated by dissatisfaction with languages I used at the moment. I am very happy with Haskell and I think I'll be using it for some time. I still use C++ at work, mostly because of performance requirements and participation in multi-developer projects (multi stands for 2-3 here). To be honest, I am very happy with what we managed to achieve in C++, but I still get angry when I code some more complicated but less-performance-critical parts and I think how easy it would be in Haskell :)
All C and assembly programming for kernels and firmware at work here. -- wli

Remi Turk
Hm, I'm not sure about the "should". Garbage collection is meant for memory, and anything making that less clear makes people more likely to depend on incorrect assumptions. And redefining GC to be a collection of _all_ garbage, instead of just memory doesn't sound so fantastic either.
I would not be surprised if relying on GC to close open files would be generally considered kosher in a few years - in cases when it has little visible effects outside, i.e. excluding network connections, but including reading configuration files. All which is needed is that the OS doesn't run out of system-wide resources when only a given process opens many files; and that the language implementation can actually force a garbage collection and run finalizers of file objects immediately in the rare case when the limit of per-process descriptors is reached. Well, this has some impact on mixing languages, because when a module implemented in one language runs out of file descriptors, it should cause a program-wide GC of all runtimes. Maybe this style would apply only to environments like .NET. -- __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/
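The "force a garbage collection when the descriptor limit is reached" policy described above could be sketched in GHC as follows. The names retryAfterGC and openFileRetrying are hypothetical, and the sketch rests on two assumptions: that the runtime reports "too many open files" as a resource-exhausted IOError (which is what isFullError tests for), and that a performGC followed by yield is enough for the handle finalizers to run before the retry. As noted earlier in the thread, the second point is a hope rather than a guarantee.

```haskell
import Control.Concurrent (yield)
import Control.Exception (catch, throwIO)
import System.IO (Handle, IOMode (..), openFile)
import System.IO.Error (isFullError)
import System.Mem (performGC)

-- Run an action; if it fails with a resource-exhausted IOError, collect
-- garbage, give finalizers a chance to run, and try exactly once more.
-- Any other IOError is rethrown unchanged.
retryAfterGC :: IO a -> IO a
retryAfterGC action =
  action `catch` \e ->
    if isFullError e
      then performGC >> yield >> action
      else throwIO e

-- openFile with the retry-once policy applied.
openFileRetrying :: FilePath -> IOMode -> IO Handle
openFileRetrying path mode = retryAfterGC (openFile path mode)
```

Retrying only once mirrors the "tries to GC once" behaviour Marcin describes for Kogut; a second failure is reported to the caller as usual.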

Marcin 'Qrczak' Kowalczyk writes:
I would not be surprised if relying on GC to close open files would be generally considered kosher in a few years - in cases when it has little visible effects outside, i.e. excluding network connections, but including reading configuration files.
One of the goals of ReiserFS, if I recall correctly, is to create a
filesystem API that doesn't involve file handles. I wonder how far one
could get without explicitly dealing with file handles outside of the IO
library implementation.
--
David Menendez

David Menendez wrote:
One of the goals of ReiserFS, if I recall correctly, is to create a filesystem API that doesn't involve file handles. I wonder how far one could get without explicitly dealing with file handles outside of the IO library implementation.
I tried to do this last year with my File and Directory abstractions [1], but the consensus was that it would be impossible to implement on top of Posix or Win32. It might be possible on top of the native NT API, though. For this to really work properly the whole OS needs to be designed around it. Such OSes exist -- they're called "capability-based" -- but like so many other good ideas this one hasn't generated much interest in industry. I think we're stuck with explicit file handle management for the next couple of decades. -- Ben [1] http://www.mail-archive.com/haskell@haskell.org/msg13071.html

Ben Rudiak-Gould
For this to really work properly the whole OS needs to be designed around it. Such OSes exist -- they're called "capability-based" -- but like so many other good ideas this one hasn't generated much interest in industry. I think we're stuck with explicit file handle management for the next couple of decades.
hOp/House doesn't have a filesystem yet. See http://www.cse.ogi.edu/~hallgren/House/ (source recently released) Maybe we won't always require explicit file handles. -- Shae Matijs Erisson - Programmer - http://www.ScannedInAvian.org/ "I will, as we say in rock 'n' roll, run until the wheels come off, because I love what I do." -- David Crosby

Ben Rudiak-Gould
For this to really work properly the whole OS needs to be designed around it. Such OSes exist -- they're called "capability-based" -- but like so many other good ideas this one hasn't generated much interest in industry. I think we're stuck with explicit file handle management for the next couple of decades.
On Tue, Oct 26, 2004 at 09:22:58PM +0200, Shae Matijs Erisson wrote:
hOp/House doesn't have a filesystem yet. See http://www.cse.ogi.edu/~hallgren/House/ (source recently released) Maybe we won't always require explicit file handles.
It doesn't seem to have demand paging either. Hmm. Maybe I should look at this sometime. -- wli

Marcin 'Qrczak' Kowalczyk wrote:
I would not be surprised if relying on GC to close open files would be generally considered kosher in a few years - in cases when it has little visible effects outside, i.e. excluding network connections, but including reading configuration files.
The best idea I've seen is another one from Microsoft Research. It's an extension to C that allows the programmer to use the type system to specify the lifetime of things. Seems like a pretty good idea: http://research.microsoft.com/vault/ Not sure how you'd do something like that in Haskell though! Cheers, Sam

Sam Mason
The best idea I've seen is another one from Microsoft Research. It's an extension to C that allows the programmer to use the type system to specify the lifetime of things.
I'm worried about putting too many things in types. For example many people believe exception specifications in Java are a mistake. Such problems often don't show up until someone tries to write big programs. It's not sufficient that it appears to work for toy programs. Putting many things in types is generally problematic for all kinds of polymorphism. -- __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/

At 20:55 25/10/04 +0200, Remi Turk wrote:
P.S. Why do so many people (including me) seem to come to Haskell from Python? It can't be just the indentation, can it? ;)
I did. (Or: from Java via Python.) I don't think it's the indentation. At least, not *just* that. FWIW, I speculate: 1. Python users have already chosen expressivity over efficiency 2. For all that it's a "rapid" development language, and being dynamically typed, Python doesn't completely abandon discipline in program construction. #g ------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

On Tue, 26 Oct 2004, Graham Klyne wrote:
At 20:55 25/10/04 +0200, Remi Turk wrote:
P.S. Why do so many people (including me) seem to come to Haskell from Python? It can't be just the indentation, can it? ;)
I did. (Or: from Java via Python.)
I don't think it's the indentation. At least, not *just* that.
Maybe it is the fact that Python adapted some of the techniques that make Haskell convenient, like map, filter, list construction, lambda and so on. Perhaps people like to know the origin of these extensions?

participants (15)
- Ben Rudiak-Gould
- Bryn Keller
- David Menendez
- Graham Klyne
- Henning Thielemann
- John Goerzen
- Ketil Malde
- Marcin 'Qrczak' Kowalczyk
- Remi Turk
- Sam Mason
- Shae Matijs Erisson
- Simon Marlow
- Sun Yi Ming
- Tomasz Zielonka
- William Lee Irwin III