
Hi
parallel_ (x1:xs) = do
    sem <- newQSem $ 1 - length xs
    forM_ xs $ \x -> writeChan queue (x >> signalQSem sem, False)
    x1
    addWorker
    waitQSem sem
    writeChan queue (signalQSem sem, True)
    waitQSem sem
Neil, executing x1 directly in parallel_ is an incorrect idea.
Forget this. But it's still a bit suboptimal...
I think I've realized why you use this scheme. My solution may leave N-1 worker threads in the system if the last job is too small: after it executes, we finish one thread and have just N-1 working threads until parallel_ finishes.
But the problem I mentioned in my previous letter may still occur, although it looks less important. We can solve both problems by allowing the worker thread to actively select its own death time: it should die only at the moment the *last* job in the bucket has finished - this guarantees us exactly N worker threads at any time. So:
parallel_ (x1:xs) = do
    sem <- newQSem $ - length xs
    jobsLast <- newMVar (length xs)
    addWorker
    forM_ (x1:xs) $ \x -> do
        writeChan queue $ do
            x
            signalQSem sem
            modifyMVar jobsLast $ \jobs -> do
                return (jobs-1, jobs==0)
Yes, this saves us adding a kill job to the queue, but requires an extra MVar. I guess which of these is to be preferred depends on performance measurements.

I've just found that QSem _ISN'T_ valid below 0, so the above code won't actually work with QSem as it stands. I've reported this on ghc-users@. I've also spotted that QSem creates plenty of MVar's itself, so the logic of moving to QSem instead of MVar isn't really valid (although the new approach is nicer, so that's good).

Thanks, Neil
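Since QSem can't be initialised below zero, one possible workaround is a small countdown "latch" built directly from MVars: initialise it with the number of expected signals and have the waiter block until they have all arrived. This is only a sketch under that assumption - the Latch type and function names are hypothetical, not part of Neil's library or any standard package:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_, when)

-- Hypothetical countdown latch standing in for a QSem initialised
-- below zero: waitLatch blocks until signalLatch has run n times.
data Latch = Latch (MVar Int) (MVar ())

newLatch :: Int -> IO Latch
newLatch n = do
    count <- newMVar n
    done  <- newEmptyMVar
    when (n <= 0) $ putMVar done ()  -- already "open" for n <= 0
    return (Latch count done)

signalLatch :: Latch -> IO ()
signalLatch (Latch count done) = do
    n <- modifyMVar count $ \n -> return (n - 1, n - 1)
    when (n == 0) $ putMVar done ()  -- last signal releases the waiters

waitLatch :: Latch -> IO ()
waitLatch (Latch _ done) = readMVar done  -- readMVar, so many waiters work

main :: IO ()
main = do
    latch <- newLatch 3
    forM_ [1 .. 3 :: Int] $ \_ -> forkIO (signalLatch latch)
    waitLatch latch  -- returns only after all three signals
    putStrLn "all jobs finished"
```

Unlike QSem, the counter here may go negative harmlessly, since only the transition through zero fills the done MVar; whether this beats the jobsLast scheme above would again come down to measurement.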