
Hi
The problem I'm trying to solve is running system commands in parallel.
"system commands" means execution of external commands or just system calls inside Haskell?
Calls to System.Cmd.system, i.e. running external console processes. It's a make system I'm writing, so virtually all the time is spent in calls to ghc etc. To Bulat: I should have been clearer with the spec. The idea is that multiple calls to paralell_ can execute, and a function executing inside parallel_ can itself call parallel_. For this reason I need one top-level thread pool, which requires unsafePerformIO. If I create a thread pool every time, I end up with more threads than I want.
Your parallel_ does not return until all operations are finished.
parallel_ (x:xs) = do ys <- mapM idempotent xs mapM_ addParallel ys sequence_ $ x : reverse ys
By the way, there is no obvious reason to insert "reverse" there.
There is a reason :-) Imagine I do parallel_ [a,b,c] That's roughly doing (if b' is idempotent b): enqueue b' enqueue c' a b' c' If while executing a the thread pool starts on b', then after I've finished a, I end up with both threads waiting for b', and nothing doing c'. If I do a reverse, then the thread pool and me are starting at different ends, so if we lock then I know it's something important to me that the thread pool started first. It's still not idea, but it happens less often.
What I meant was something like:
para [] = return () para [x] = x para xs = do q <- newQSemN 0 let wrap x = finally x (signalQSemN q 1) go [y] n = wrap x >> waitQSemN q (succ n) go (y:ys) n = addParallel (wrap y) >> go ys $! succ n go xs 0
This is nearly identical to your code, and avoid creating the MVar for each operation. I use "finally" to ensure the count is correct, but if a worker threads dies then bas things will happen. You can replace finally with (>>) if speed is important.
Consider a thread pool with 2 threads and the call parallel_ [parallel_ [b,c],a] You get the sequence: enqueue (parallel_ [b,c]) a wait on parallel_ [b,c] While you are executing a, a thread pool starts: enqueue b c wait for b Now you have all the threads waiting, and no one dealing with the thread pool. This results in deadlock. I guess the "nested calls to parallel_" bit is the part of the spec that makes everything much harder! Thanks Neil