You are in a twisty maze of concurrency libraries, all different ...

I'm looking for the "right" concurrency library/semantics for what should be a reasonably simple problem. I have a little simulator: runWorldSim :: MTGen -> SimState -> IO SimState it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: mergeWorld :: [SimState] -> SimState to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~20000 iterations. It's an obvious place for parallelism. I'm looking for a concurrency library with something like: forkSequence :: Int -> [IO a] -> IO [a] which I could call with something like this: forkSequence 4 (take 10 (repeat (runWorldSim g ss))) this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger. It strikes me as something someone would already have done, but I can't find anything in hackage. Probably I've missed something obvious? Any pointers? If not, what would be the best/easiest existing package to write an extension to? Thanks, Patrick.

Patrick Caldon
it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use:
Which RNG are you using that it needs so much IO?
mergeWorld :: [SimState] -> SimState
to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~20000 iterations.
It's an obvious place for parallelism.
I'm looking for a concurrency library with something like:
forkSequence :: Int -> [IO a] -> IO [a]
which I could call with something like this:
forkSequence 4 (take 10 (repeat (runWorldSim g ss)))
this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger.
It strikes me as something someone would already have done, but I can't find anything in hackage. Probably I've missed something obvious? Any pointers?
If not, what would be the best/easiest existing package to write an extension to?
Thanks, Patrick.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Patrick Caldon
writes: it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use:
Which RNG are you using that it needs so much IO? Mersenne Twister, System.Random.Mersenne. The ordinary rng kills
Ivan Lazar Miljenovic wrote: performance. Patrick.

Patrick Caldon wrote:
I'm looking for the "right" concurrency library/semantics for what should be a reasonably simple problem.
I have a little simulator:
runWorldSim :: MTGen -> SimState -> IO SimState
it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use:
mergeWorld :: [SimState] -> SimState
to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~20000 iterations.
It's an obvious place for parallelism.
I'm looking for a concurrency library with something like:
forkSequence :: Int -> [IO a] -> IO [a]
which I could call with something like this:
forkSequence 4 (take 10 (repeat (runWorldSim g ss)))
this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger.
Why particularly do you want to run the 10 jobs on 4 threads? Haskell's run-time is quite good at spreading out the lightweight threads onto all your cores, so the easiest thing to do is run the 10 jobs on 10 (light-weight) threads and let the run-time sort out the rest. So if what you want is a function: runPar :: [IO a] -> IO [a] you can easily construct this. Shameless plug: my CHP library effectively has this function already, runParallel :: [CHP a] -> CHP [a] (CHP being a slight layer on top of IO). But you can do it just as easily with, say, STM. Here is a version where order doesn't matter (apologies for the point-free style): import Control.Concurrent import Control.Concurrent.STM import Control.Monad modifyTVar :: TVar a -> (a -> a) -> STM () modifyTVar tv f = readTVar tv >>= writeTVar tv . f runPar :: [IO a] -> IO [a] runPar ps = do resVar <- newTVarIO [] mapM_ (forkIO . (>>= atomically . modifyTVar resVar . (:))) ps atomically $ do res <- readTVar resVar when (length res < length ps) retry return res If order does matter, you can zip the results with an index, and sort by the index afterwards. If efficiency matters, you can perform other tweaks. But the principle is quite straightforward. Or you can refactor your code to take the IO dependency out of your random number generation, and run the sets of pure code in parallel using the parallel library. If all you are using IO for is random numbers, that's probably the nicest approach. Thanks, Neil. P.S. take 10 . repeat is the same as replicate 10

Neil Brown wrote:
Patrick Caldon wrote:
I'm looking for the "right" concurrency library/semantics for what should be a reasonably simple problem.
I have a little simulator:
runWorldSim :: MTGen -> SimState -> IO SimState
it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use:
mergeWorld :: [SimState] -> SimState
to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~20000 iterations.
It's an obvious place for parallelism.
I'm looking for a concurrency library with something like:
forkSequence :: Int -> [IO a] -> IO [a]
which I could call with something like this:
forkSequence 4 (take 10 (repeat (runWorldSim g ss)))
this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger.
Why particularly do you want to run the 10 jobs on 4 threads? Haskell's run-time is quite good at spreading out the lightweight threads onto all your cores, so the easiest thing to do is run the 10 jobs on 10 (light-weight) threads and let the run-time sort out the rest.
Thanks so much for that! I'll give it a go. Different threads is just because some of the jobs are memory hogs, and I want to minimize the number running simultaneously. I'll see what happens with a runPar-like approach, and use a queue-based approach if it becomes a problem.
So if what you want is a function:
runPar :: [IO a] -> IO [a]
you can easily construct this. Shameless plug: my CHP library effectively has this function already, runParallel :: [CHP a] -> CHP [a] (CHP being a slight layer on top of IO). But you can do it just as easily with, say, STM. Here is a version where order doesn't matter (apologies for the point-free style):
import Control.Concurrent import Control.Concurrent.STM import Control.Monad
modifyTVar :: TVar a -> (a -> a) -> STM () modifyTVar tv f = readTVar tv >>= writeTVar tv . f
runPar :: [IO a] -> IO [a] runPar ps = do resVar <- newTVarIO [] mapM_ (forkIO . (>>= atomically . modifyTVar resVar . (:))) ps atomically $ do res <- readTVar resVar when (length res < length ps) retry return res
If order does matter, you can zip the results with an index, and sort by the index afterwards. If efficiency matters, you can perform other tweaks. But the principle is quite straightforward. Or you can refactor your code to take the IO dependency out of your random number generation, and run the sets of pure code in parallel using the parallel library. If all you are using IO for is random numbers, that's probably the nicest approach.
Good, fast random numbers are unfortunately necessary - I had a nice implementation using System.Random, but had to rewrite it because performance was poor :( .
P.S. take 10 . repeat is the same as replicate 10
Thanks again! Patrick.

On Fri, Dec 4, 2009 at 12:28 PM, Patrick Caldon
Neil Brown wrote:
Patrick Caldon wrote:
I'm looking for the "right" concurrency library/semantics for what should be a reasonably simple problem.
I have a little simulator:
runWorldSim :: MTGen -> SimState -> IO SimState
it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use:
mergeWorld :: [SimState] -> SimState
to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~20000 iterations.
It's an obvious place for parallelism.
I'm looking for a concurrency library with something like:
forkSequence :: Int -> [IO a] -> IO [a]
which I could call with something like this:
forkSequence 4 (take 10 (repeat (runWorldSim g ss)))
this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger.
Why particularly do you want to run the 10 jobs on 4 threads? Haskell's run-time is quite good at spreading out the lightweight threads onto all your cores, so the easiest thing to do is run the 10 jobs on 10 (light-weight) threads and let the run-time sort out the rest.
Thanks so much for that! I'll give it a go.
Different threads is just because some of the jobs are memory hogs, and I want to minimize the number running simultaneously. I'll see what happens with a runPar-like approach, and use a queue-based approach if it becomes a problem.
So if what you want is a function:
runPar :: [IO a] -> IO [a]
you can easily construct this. Shameless plug: my CHP library effectively has this function already, runParallel :: [CHP a] -> CHP [a] (CHP being a slight layer on top of IO). But you can do it just as easily with, say, STM. Here is a version where order doesn't matter (apologies for the point-free style):
import Control.Concurrent import Control.Concurrent.STM import Control.Monad
modifyTVar :: TVar a -> (a -> a) -> STM () modifyTVar tv f = readTVar tv >>= writeTVar tv . f
runPar :: [IO a] -> IO [a] runPar ps = do resVar <- newTVarIO [] mapM_ (forkIO . (>>= atomically . modifyTVar resVar . (:))) ps atomically $ do res <- readTVar resVar when (length res < length ps) retry return res
If order does matter, you can zip the results with an index, and sort by the index afterwards. If efficiency matters, you can perform other tweaks. But the principle is quite straightforward. Or you can refactor your code to take the IO dependency out of your random number generation, and run the sets of pure code in parallel using the parallel library. If all you are using IO for is random numbers, that's probably the nicest approach.
Good, fast random numbers are unfortunately necessary - I had a nice
implementation using System.Random, but had to rewrite it because performance was poor :( .
Have you tried this, pure, library? http://hackage.haskell.org/package/mersenne-random-pure64 http://hackage.haskell.org/package/mersenne-random-pure64 -- Sebastian Sylvan

On Fri, 2009-12-04 at 22:51 +1100, Patrick Caldon wrote:
I'm looking for the "right" concurrency library/semantics for what should be a reasonably simple problem.
I have a little simulator:
runWorldSim :: MTGen -> SimState -> IO SimState
it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO.
Wait! This is not going to work! You cannot use the MTGen from the mersenne-random in a concurrent IO program because the C code uses a single global mutable RNG state. Your "independent" simulations would not be independent and you would not get reproducible results. Indeed you could get incorrect results or segfaults because the C code does not expect to be called from multiple threads simultaneously (there is no locking). Personally I would attack this by eliminating the IO. There's no justification for a random number generator being in IO. And look at the problems it causes! There are other MT implementations that do not use C code which assumes it's ok to use one single global mutable RNG state for an entire process. There are pure-Haskell MT impls that use mutable variables in ST but give an overall pure lazy list of random numbers. If you don't need MT specifically then there are other fast RNGs too. Duncan

On Fri, Dec 4, 2009 at 7:38 AM, Duncan Coutts
Wait! This is not going to work!
You cannot use the MTGen from the mersenne-random in a concurrent IO program because the C code uses a single global mutable RNG state.
So use the PRNG in the statistics package instead. It's got some nice features that make it a better choice than mersenne-random for essentially all uses: - Faster than mersenne-random - State is encapsulated, so you can have independent PRNGs in different threads or different library modules - You can easily seed independent generators from your system's high-quality PRNG It can also generate normally distributed numbers as well as uniformly distributed numbers (which is all that mersenne-random gives you), and it uses a high-quality fast algorithm for the normal distribution, rather than the usual ziggurat which is somewhat broken.

On 04/12/09 11:51, Patrick Caldon wrote:
I'm looking for the "right" concurrency library/semantics for what should be a reasonably simple problem.
I have a little simulator:
runWorldSim :: MTGen -> SimState -> IO SimState
it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use:
mergeWorld :: [SimState] -> SimState
to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~20000 iterations.
It's an obvious place for parallelism.
If you can get rid of the need for IO then you can use Control.Parallel to evaluate pure functions instead. If you only use IO for the random numbers then you can either keep a StdGen in your SimState or else use a "State StdGen" monad. Since your random number use is presumably already in monadic IO you could probably switch to a state monad fairly trivially. Paul.

It appears there are several implementations existing on Hackage of the following function, in various disguises: runPar :: [IO a] -> IO [a] the idea being that the IO computations are run in parallel, rather than sequentially. My own Streaming Component Combinators package contains a similar function, but somewhat generalized: class Monad m => ParallelizableMonad m where parallelize :: m a -> m b -> m (a, b) instance ParallelizableMonad IO -- implemented using forkIO instance ParallelizableMonad Identity -- implemented using par instance ParallelizableMonad Maybe -- implemented using par Would there be any interest in having this class packaged in a separate library? If so, can you sugest a better name or some additional functionality?

On Wed, Dec 9, 2009 at 2:17 PM, Mario Blazevic
It appears there are several implementations existing on Hackage of the following function, in various disguises:
runPar :: [IO a] -> IO [a]
the idea being that the IO computations are run in parallel, rather than sequentially. My own Streaming Component Combinators package contains a similar function, but somewhat generalized:
class Monad m => ParallelizableMonad m where parallelize :: m a -> m b -> m (a, b)
instance ParallelizableMonad IO -- implemented using forkIO instance ParallelizableMonad Identity -- implemented using par instance ParallelizableMonad Maybe -- implemented using par
Would there be any interest in having this class packaged in a separate library? If so, can you sugest a better name or some additional functionality?
A similar function that I'm fond of: forkExec :: IO a -> IO (IO a) forkExec k = do result <- newEmptyMVar _ <- forkIO $ k >>= putMVar result return (takeMVar result) Although I don't think it can be generalized to non-IO monads. Antoine

Antoine Latter wrote:
A similar function that I'm fond of:
forkExec :: IO a -> IO (IO a)
It's cute that forkExec already has a dual operation with just the right name (specialised to IO): join :: IO (IO a) -> IO a

It's a good thing then that forkExec and return are denotationally equal (though not operationally). Otherwise, I'd be worried. Matthew Brecknell wrote:
Antoine Latter wrote:
A similar function that I'm fond of:
forkExec :: IO a -> IO (IO a)
It's cute that forkExec already has a dual operation with just the right name (specialised to IO):
join :: IO (IO a) -> IO a
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

A similar function that I'm fond of:
forkExec :: IO a -> IO (IO a) forkExec k = do result <- newEmptyMVar _ <- forkIO $ k >>= putMVar result return (takeMVar result)
Although I don't think it can be generalized to non-IO monads.
Antoine
I can't test it right now, but wouldn't the following do the job in the Identity monad? forkExec :: Identity a -> Identity (Identity a) forkExec k = let result = runIdentity k in result `par` return (Identity result)

On Wed, Dec 9, 2009 at 3:44 PM, Mario Blazevic
I can't test it right now, but wouldn't the following do the job in the Identity monad?
forkExec :: Identity a -> Identity (Identity a) forkExec k = let result = runIdentity k in result `par` return (Identity result)
Since Identity is a newtype, would that be equivalent to "result `par` result"? The forkExec in the IO monad let's other computations keep going until I need the result from the forked computation. In a pure computation, I can already get the same result with `par` and laziness, right? Antoine
participants (11)
-
Antoine Latter
-
Bryan O'Sullivan
-
Dan Weston
-
Duncan Coutts
-
Ivan Lazar Miljenovic
-
Mario Blazevic
-
Matthew Brecknell
-
Neil Brown
-
Patrick Caldon
-
Paul Johnson
-
Sebastian Sylvan