
On Mar 10, 2009, at 10:38 PM, Mark Spezzano wrote:
Hi,
I’m an experienced software developer, but a bit of a newbie when it comes to parallel processing in any language.
Question 1:
Is there any programmatic difference in dealing with multiple threads as opposed to multiple cores in most languages (specifically Haskell)?
That is, to write multiple threads you normally "spin off" a new thread which runs in parallel with other threads. Is Haskell smart enough to do this by magic by itself, or would I need to tell it explicitly: run thread A whilst running thread B?
Also, what about multicore architectures? Do I have to tell the language to spin off two separate programs to run on each core and then somehow use some kind of communications to exchange data?
GHC's parallel runtime, which is what we're really talking about when we're talking about Haskell concurrency, operates roughly like this: threads at the language level are "green" -- i.e. they are conceptually threads, and the runtime schedules them as it sees fit, but they aren't tied to any given core for execution. You can create (using the forkIO primitive) as many new threads as you like. If you then execute your program with runtime options that specify how many actual operating-system threads you want (i.e. +RTS -N2 for two cores), the runtime will map your set of green threads onto OS threads (and presumably cores) in what it thinks is an efficient manner. Communication between threads is via MVars, which are one-item mailboxes and the basic concurrency primitive, or TVars, which provide Software Transactional Memory.
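A minimal sketch of that forkIO/MVar pattern (the name sumInThread is just an illustration, not a library function):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Spawn a green thread that computes a sum and mails the
-- result back through an MVar (a one-item mailbox).
sumInThread :: [Int] -> IO Int
sumInThread xs = do
  box <- newEmptyMVar                  -- empty mailbox
  _ <- forkIO (putMVar box (sum xs))   -- child thread fills it
  takeMVar box                         -- parent blocks until a value arrives

main :: IO ()
main = sumInThread [1 .. 100] >>= print  -- prints 5050
```

The parent never busy-waits: takeMVar blocks until the child has put a value, which is what makes MVars usable as a synchronization primitive and not just a mailbox.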
I also assume that, in theory, it would be possible to have multiple threads running on each core. Say, 3 threads spawned from program 1 running on core 1 and 5 threads from program 2 running on core 2.
Generally, the runtime system moves threads between cores as it sees fit. However, you can tie threads to particular OS threads or capabilities using other concurrency primitives (e.g. runInBoundThread). See the documentation for Control.Concurrent (http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html) and Control.Concurrent.STM (http://www.haskell.org/ghc/docs/latest/html/libraries/stm/Control-Concurrent-STM.html) for more details.
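As a sketch of pinning, assuming a GHC whose Control.Concurrent exports forkOn and threadCapability (the function name pinnedCapability is just an illustration):

```haskell
import Control.Concurrent (forkOn, myThreadId, threadCapability)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Fork a thread pinned to capability n, then report which
-- capability the child actually ran on.
pinnedCapability :: Int -> IO Int
pinnedCapability n = do
  box <- newEmptyMVar
  _ <- forkOn n $ do
    tid <- myThreadId
    (cap, _pinned) <- threadCapability tid
    putMVar box cap
  takeMVar box

main :: IO ()
main = pinnedCapability 0 >>= print  -- prints 0
```

Note that a "capability" is an RTS scheduling slot, not a physical core; the OS still decides which core each capability's OS thread runs on.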
Likewise I would suppose that it would be possible to have multiprocessors, each potentially multicore, each core running multiple threads.
Yep, see above, although I don't know of any way to distinguish between cores and processors, as you're really just mapping to OS threads (i.e. capabilities) and relying on the OS to distribute these among cores and processors reasonably.
Question 2:
In light of the above statement, is the programmatic change DIFFERENT for dealing with each of
a) Multithreading in Haskell versus
b) Multicores in Haskell versus
c) Multiprocessors in Haskell
Generally, you'll want to write concurrent Haskell code as an abstraction to think about things that are "naturally" concurrent -- i.e. which are best thought of as happening at the same time. At runtime, you specify how many OS capabilities you want to map onto. For algorithms where you want computations to happen in parallel, there's an entirely different set of operations, based around `par` (see Control.Parallel [http://www.haskell.org/ghc/docs/latest/html/libraries/parallel/Control-Parallel.html] and Control.Parallel.Strategies). In neither case should you need, generally, to concern yourself with the details of threads vs. cores vs. processors.

Cheers,
Sterl.
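A small sketch of the `par` side, for the record (parSum is illustrative; it needs the parallel package, and the sparks only actually run in parallel when compiled with -threaded and run with +RTS -N2 or similar):

```haskell
import Control.Parallel (par, pseq)

-- a `par` b sparks a for possible parallel evaluation and returns b;
-- pseq forces right before combining, so the spark for left has a
-- chance to run on another capability in the meantime.
parSum :: [Int] -> Int
parSum xs = left `par` (right `pseq` left + right)
  where
    (as, bs) = splitAt (length xs `div` 2) xs
    left     = sum as
    right    = sum bs

main :: IO ()
main = print (parSum [1 .. 100])  -- prints 5050
-- Compile: ghc -threaded -O2 ParSum.hs
-- Run:     ./ParSum +RTS -N2
```

The result is deterministic either way; `par` only changes evaluation order and placement, never the value, which is exactly why it's a separate mechanism from forkIO-style concurrency.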