
On 2008-09-10, David Roundy wrote:
On Wed, Sep 10, 2008 at 03:30:50PM +0200, Jed Brown wrote:
On Wed 2008-09-10 09:05, David Roundy wrote:
I should point out, however, that in my experience MPI programming involves deadlocks and synchronization handling that are at least as nasty as any I've run into doing shared-memory threading.
Absolutely, avoiding deadlock is the first priority (before error handling). If you use the non-blocking interface, you have to be very conscious of whether a buffer is still in use by an outstanding operation or whether the call has completed. Regardless, the API requires the programmer to maintain a very clear distinction between locally owned and remote memory.
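For concreteness, a minimal sketch of that buffer-ownership rule with the non-blocking point-to-point calls (the ranks, buffer size, and payload here are arbitrary):

    /* Minimal sketch of the non-blocking buffer-ownership rule: the send
     * buffer must not be modified, and the receive buffer must not be read,
     * until MPI_Wait (or MPI_Test) reports completion.  Build with mpicc
     * and run with at least two ranks. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        double buf[4] = {0.0, 1.0, 2.0, 3.0};
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            MPI_Isend(buf, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
            /* buf still belongs to the library here; writing to it now
             * would be erroneous. */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            buf[0] = 99.0;          /* fine: the send has completed */
        } else if (rank == 1) {
            MPI_Irecv(buf, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
            /* reading buf here would see undefined contents */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            printf("rank 1 got %g\n", buf[0]);
        }

        MPI_Finalize();
        return 0;
    }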
Even with the blocking interface, there were subtle bugs that I found pretty tricky to deal with. For example, the reduce functions in lam3 (or was it lam4?) at one point didn't actually manage to produce the same values on all nodes (with differences caused by roundoff error), which led to rare deadlocks when it so happened that two nodes disagreed as to when a loop was completed. Perhaps someone made the mistake of assuming that addition is associative, or maybe it was something triggered by the non-IEEE floating point we were using. But in any case, it was pretty nasty. And it was precisely the kind of bug that won't show up except when you're doing something like MPI, where you are pretty much forced to assume that the same (pure!) computation has the same effect on each node.
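The failure mode, roughly reconstructed (this isn't the original code; the solver step and tolerance are made up), is a convergence loop whose exit test depends on a summed residual. If different ranks are handed sums differing in the last bits, one rank leaves the loop while another blocks in the next collective:

    /* Rough reconstruction of the deadlock pattern described above.
     * MPI_Allreduce is expected to deliver the same value to every rank;
     * if a buggy implementation returns sums that differ in the last bits
     * (floating-point addition is not associative), the ranks can disagree
     * on the loop test, and the next collective hangs. */
    #include <mpi.h>

    /* Hypothetical solver step: shrink the local residual each iteration. */
    static void iterate(double *local_residual)
    {
        *local_residual *= 0.5;
    }

    static void solve(MPI_Comm comm)
    {
        double local = 1.0, global = 1.0;

        while (global > 1e-10) {    /* every rank must agree on this test */
            iterate(&local);
            MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);
        }
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        solve(MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }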
Ah, okay. I think that's a real edge case, and probably not how most people use MPI. I've used both threads and MPI; MPI, while cumbersome, never gave me any hard-to-debug deadlock problems.

--
Aaron Denney
-><-