Re: [Haskell-cafe] what is f=f (not) doing ?

Miguel Mitrofanov wrote:
A deadlock happens whenever two (or more) threads are blocked on each other. Deadlocks can be extremely hard to detect, especially if they occur intermittently.
Isn't that so much different from garbage collection? Replace "thread" with "chunk of data", and "waits for" with "has a pointer to" and these two problems look very similar. And we all know there ARE efficient garbage collectors.
Actually, I think the two problems look /extremely/ similar. When doing garbage collection, one BIG problem is determining what's really a pointer and what's just data. There are conservative garbage collectors that assume just about anything can be a pointer, and then there are precise garbage collectors that use knowledge of the layout of "chunks of data" to identify the pointers to follow. All garbage collectors have one thing in common, though. They all depend on being able to identify, for each chunk of data, what else it has pointers to. If all else fails, a conservative GC can simply assume that everything is potentially a pointer and try to follow it. Now, a deadlock detector would need to be able to identify, for each blocked thread, what that thread is waiting for. The OS already tracks this information and uses it to determine when a blocked thread becomes un-blocked and ready to run. Now, if we could make sure that /all/ locks go through the OS, then the OS would be able to build a complete dependency table. When a deadlock is about to occur, the OS would be able to detect it immediately. Back to reality: Not all locks go through the OS. Suppose I'm using fibers. That means I have one OS thread running many routines at the same time, and I'm doing my own cooperative multitasking. GHC does this. The OS has no idea what's going on. Even if I had an OS that could track all locks, I can still get into deadlock situations. Suppose I have a programming error that causes two of my threads to deadlock. Furthermore, suppose that when the deadlock is detected, I make both threads roll back and start again with the same pre- conditions. Now they will go back into the same deadlock. This cycle will repeat indefinitely, and it's beyond the OS's ability to detect. Regarding deadlock detection, I think the real questions to ask are (1) Do deadlocks occur often enough for us to care? (2) When they do occur, are they severe enough for us to care? If the answers are both "no", then deadlock detection is not worth the trouble of implementation. My opinion is that for typical commercial end-user software, the answers are both "no". For "critical" computing systems, like flight computers and life support systems, the answers are obviously "yes"; a deadlock that occurs once per million years is still a problem if it can bring down a multi-million dollar aircraft with hundreds of passengers on board. For these types of systems, the developers tend to spend extra development resources to try to design fool-proof systems. In some cases, they may even go to heroic lengths to /prove/ that their software is correct. It /is/ possible to design a system that cannot deadlock. It turns out that one of the necessary conditions for deadlock is the hold and wait condition. If a process that's already holding resources is /not/ allowed to request additional resources, then deadlocks are impossible. If we really, absolutely don't want any deadlocks, then we can design a system in which a process that needs a resource is required to release all its resources and then re-allocate its updated list of resources. -- Ron
participants (1)
-
Ronald Guida