
Hi Sebastian:
How about using unsafeInterleaveIO to get a lazy suspension of the result of each action, and then using par to spark off each of them? If that works, you can reuse the existing task-parallel system of GHC to do the heavy lifting for you, instead of having to write your own.
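For concreteness, a minimal sketch of that suggestion might look like the following (parallelSpark is a name I've made up, and it assumes par from the parallel package; this illustrates the idea, it is not code from the thread):

    import Control.Parallel (par)
    import System.IO.Unsafe (unsafeInterleaveIO)

    -- Turn each action into a lazy suspension of its result, then spark
    -- every suspension so GHC's parallel runtime may evaluate them on
    -- other capabilities.
    parallelSpark :: [IO a] -> IO [a]
    parallelSpark acts = do
        rs <- mapM unsafeInterleaveIO acts  -- lazy suspensions of results
        return $! foldr par rs rs           -- force the chain, sparking each one

Demanding any element of the returned list runs the corresponding suspended action, so whichever sparks fire first end up doing the work in parallel.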
par is likely to spark all the computations and then switch between them, which will mean I've got more than N things running in parallel.
I think the only way to solve this problem is to create one more thread each time. Let's see: on every call to para you need to dedicate one thread to waiting for the jobs' completion, so each nested call to para costs you one worker thread. Eventually you will eat them all!

So you need to fork: one thread should keep serving jobs, and another one should wait for the completion of this bucket of jobs. With a killItself flag you can finish the superfluous thread just in time.
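A minimal sketch of that kill-flag loop (the queue type and names here are my assumptions, not the poster's code):

    import Control.Concurrent.Chan
    import Control.Monad (unless)

    -- Each queue entry pairs a job with a kill flag: True means "run this
    -- job, then let the worker that took it exit".
    workerLoop :: Chan (IO (), Bool) -> IO ()
    workerLoop queue = do
        (job, killItself) <- readChan queue
        job
        unless killItself (workerLoop queue)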
You are right, your previous solution was running at N-1 threads if the order was a little unlucky. I've attached a new version which I think gives you N threads always executing at full potential. It's basically your idea from the last post, with the main logic being:

    parallel_ (x1:xs) = do
        sem <- newQSem $ 1 - length xs
        forM_ xs $ \x -> writeChan queue (x >> signalQSem sem, False)
        x1
        addWorker
        waitQSem sem
        writeChan queue (signalQSem sem, True)
        waitQSem sem

where the second flag being True = kill, as you suggested. I think I've got the semaphore logic right - anyone want to see if I missed something?

With this new version running 1000000 items takes ~1 second, instead of ~10 seconds before, so an order of magnitude improvement, and greater fairness. Very nice, thanks for all the help!

Thanks, Neil
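To make that snippet self-contained for readers following along: it refers to a global queue and an addWorker defined elsewhere (presumably in the attached file). A rough reconstruction under my own assumptions, not Neil's actual code, might be:

    import Control.Concurrent (forkIO)
    import Control.Concurrent.Chan
    import Control.Monad (unless, void)
    import System.IO.Unsafe (unsafePerformIO)

    -- Global job queue; the Bool is the kill flag discussed above.
    {-# NOINLINE queue #-}
    queue :: Chan (IO (), Bool)
    queue = unsafePerformIO newChan

    -- Spawn one more worker running the kill-flag loop.
    addWorker :: IO ()
    addWorker = void $ forkIO loop
      where
        loop = do
            (job, kill) <- readChan queue
            job
            unless kill loop

One porting caveat: newQSem $ 1 - length xs relies on the semaphore accepting a negative initial quantity, which the QSem of the time allowed; the QSem in today's base rejects negative values, so a modern port would need to count pending jobs another way (for example, an MVar holding a counter).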