
On Wed, 2008-09-17 at 21:20 +0000, Aaron Denney wrote:
On 2008-09-17, Jonathan Cast wrote:
In my mind pooling vs new-creation is only relevant to process vs thread in the performance aspects.
Say what? This discussion is entirely about performance --- does CPython actually have the ability to scale concurrent programs to multiple processors? The only reason you would ever want to do that is for performance.
I entered the discussion as which model is a workaround for the other --
Well, I thought the discussion was about implementations, not models. I also assumed remarks would be made in the context of the entire thread. I shall have to remember that in the future.
someone said processes were a workaround for the lack of good threading in e.g. standard CPython.
I replied that most languages' thread support
Using a definition of `thread' which, apparently, excludes Concurrent Haskell.
can be seen as a workaround for the poor performance of communicating processes.
Meaning kernel-switched processes.
(creation in particular is usually cited, but that cost can often be reduced by process pools; the cost of context switching, alas, is harder to reduce.)
Kernel threads /are/ expensive. Which is why all the cool kids use user-space threads.
Often muxed on top of kernel threads, because user-threads can't use multiple CPUs at once.
Well, a single kernel thread can't use multiple CPUs at once. (So you need more than one).
The central aspect in my mind is a default share-everything, or default share-nothing.
I really don't think you understand Concurrent Haskell, then. (Or Concurrent ML, or Stackless Python, or libthread, or any other CSP-based set-up).
Or Erlang, Occam, or heck, even jcsp. Because I'm coming at this from a slightly different perspective
Different enough we're talking past each other. The idea that the thing you make with forkIO doesn't count as a thread never crossed my mind, sorry.
and place a different emphasis on things
and use completely different definitions for key terms and make statements which, substituting in the definitions I was using, are (as I hope you grant) nonsensical
you think I don't understand?
Not any more. I just think your definition of `thread' is unexpected in this context (without rather more elaboration).
No, trust me, I do understand them[1], and think CSP and actor models (the difference in nondeterminism is a minor detail that doesn't much matter here) are extremely nice ways of implementing parallel systems.
I'm glad to hear that...
These are, in fact, process models.
OK. I think that perspective is rather unique, but OK.
They are implemented on top of thread models, but that's a performance hack.
Maybe. It's done for performance, but I don't see why you call it a hack. Does it sacrifice some important advantage I'm missing? (Vs. kernel-scheduled threads).
And while putting this model on top restores much of the programming sanity, in languages with mutable variables and references that can be passed, you still need a fair bit of discipline to keep that sanity. There, the implementation detail of thread, rather than process, allows and even encourages shortcuts that violate the process model. In languages that are immutable, taking advantage of the shared memory space really can gain efficiency without any noticeable downside.
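To make the discipline point concrete, here is a minimal Python sketch (Python only because CPython came up earlier in the thread). It assumes a CSP-ish setup built from the standard `queue` and `threading` modules; the helper names `run_worker` and `send_and_mutate` are made up for illustration. A mutable list sent over a queue is still the same object on both sides, so a mutation made after the send leaks through to the receiver. Sending an immutable snapshot keeps the exchange share-nothing.

```python
import queue
import threading


def run_worker(inbox, outbox):
    """Receive one message, wait for the go-ahead, then report what it sees."""
    msg, go = inbox.get()
    go.wait()  # the sender has finished its post-send mutation by now
    outbox.put(list(msg))


def send_and_mutate(payload, snapshot):
    """Send payload (as a tuple snapshot if requested), then mutate the original."""
    inbox, outbox = queue.Queue(), queue.Queue()
    go = threading.Event()
    worker = threading.Thread(target=run_worker, args=(inbox, outbox))
    worker.start()

    # Without a snapshot the queue carries a reference, not a copy.
    inbox.put((tuple(payload) if snapshot else payload, go))
    payload.append("mutated after send")  # the "shortcut" threads allow
    go.set()

    seen = outbox.get()
    worker.join()
    return seen


if __name__ == "__main__":
    print(send_and_mutate(["a"], snapshot=False))  # receiver sees the mutation
    print(send_and_mutate(["a"], snapshot=True))   # receiver sees only ["a"]
```

With separate OS processes the message would have been copied on the way through, so the first case could not happen. With threads, nothing but convention prevents it.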
Nice clarification.[1] Thanks.

jcc

[1] I am, btw., painfully aware that Haskell has mutable references that can be passed between threads. Just as I am painfully aware of Unix's, um, interesting ideas on maintaining file system consistency in the presence of concurrent access to *that* shared resource...