Performance problems with parallelizing QuickCheck using channels (Control.Concurrent.Chan)

16 Nov 2007

      So I've been trying to get my QuickCheck tests to run in parallel. I did take a look at Don's Parallel QuickCheck library http://www.cse.unsw.edu.au/~dons/pqc.html, but I didn't like how much code it had in it and I figured it'd be a good exercise to try to do myself.

After quite a lot of help from the good folk of #haskell, I eventually came up with this:
  module Pcheck (parTest, parCheck) where

  import Control.Monad (replicateM_, liftM)
  import Control.Concurrent.Chan (newChan, writeChan, getChanContents)
  import Control.Concurrent (forkIO)
  import Test.QuickCheck (quickCheck', Testable())

  -- | Takes a list of functions using parCheck, and returns True iff all return
  -- True. Evaluates them in parallel.
  parTest :: [IO Bool] -> IO Bool
  parTest = andTest . parList
      where andTest :: IO [Bool] -> IO Bool
            andTest = liftM and

  {- | Test in parallel. Forks off a QuickCheck 'n' times; QuickCheck tests using
   the proposition 't'. Returns True if all tests were passed, else
   False. Should be run with parallelizing options like with +RTS -N4 -RTS &etc. -}
  parCheck :: (Testable prop) => prop -> Int -> IO Bool
  parCheck t n = do chan <- newChan
                    replicateM_ n $ forkIO $ (writeChan chan) =<< (quickCheck' t)
                    liftM (and . take n) $ getChanContents chan

  -- | Takes a list of functions (presumably using parCheck) and evaluates all in parallel.
  parList :: [IO a] -> IO [a]
  parList fs = do chan <- newChan
                  mapM_ (\m -> forkIO $ m >>= writeChan chan) fs
                  liftM (take n) $ getChanContents chan
                      where n = length fs

I liked how simple the Channels library http://haskell.org/ghc/docs/latest/html/libraries/base-3.0.0.0/Control-Concu... seemed to be; I could just pass the channel as an argument and have every forkIO'd test simply chuck its Boolean result into it when done - which seem'd much simpler than using MVars and the other techniques for returning stuff from forkIO threads.

And so it compiles, it runs tests correctly, and so on. But the problem is that it does so slowly. I have another module of equations about nuclear bombs called nuke.hs, which has a number of QuickCheck properties defined. Here's what happens when main is defined as 'parTest [the various tests..]':

 ./nuke +RTS -N7 -sstderr -RTS 40.57s user 46.55s system 116% cpu 1:14.61 total
 ./nuke +RTS -N6 -sstderr -RTS 40.72s user 47.66s system 117% cpu 1:15.50 total
 ./nuke +RTS -N5 -sstderr -RTS 42.33s user 49.08s system 116% cpu 1:18.67 total
 ./nuke +RTS -N4 -sstderr -RTS 43.71s user 48.41s system 117% cpu 1:18.48 total
 ./nuke +RTS -N3 -sstderr -RTS 41.51s user 48.25s system 114% cpu 1:18.10 total
 ./nuke +RTS -N2 -sstderr -RTS 42.28s user 47.18s system 115% cpu 1:17.39 total
 ./nuke +RTS -N1 -sstderr -RTS 27.87s user 18.40s system 99% cpu 46.498 total

(From http://hpaste.org/3886#a6; compiled as "=ghc -v --make -threaded -O2 ./nuke.hs".)

For some reason, running the parallel tests with a single thread is faster than running with 4 threads (I have a quad-core Intel processor)? I find this counter-intuitive to say the least. the par* functions are indeed operating in parallel, as evidenced by it using more than 100% CPU time, or, running on multiple cores, and all the tests are passed as True in both -N1 and -N[2-7] versions, so -N1 can't be bailing out early due to "and"'s laziness, and in general everything seems to be written correctly.

I am perplexed by this. Is Chan simply a very inefficient way of parallelizing things? Is it not as parallel as I think? Or am I missing something else entirely?

(Attached is the source of nuke.hs and pcheck.hs, as well as some data from -sstderr.)

--
gwern
.45 GIGN jya. wire ISI SADCC JPL embassy Recon World

Gwern Branwen

tags

participants (1)