
Dear list, I create four haskell threads, each performs disjoint STM transactions. The more system threads I add, the slower the program runs. With four system threads the performance is 14% of the nonparallel run. What is going on? $ ghc -v Glasgow Haskell Compiler, Version 7.10.3, stage 2 booted by GHC version 7.8.4 $ ghc -O2 -threaded test-stm.hs $ time ./test-stm +RTS -N1 real 1.843s $ time ./test-stm +RTS -N2 real 7.469s $ time ./test-stm +RTS -N3 real 9.640s $ time ./test-stm +RTS -N4 real 13.144s $ cat test-stm.hs {-# LANGUAGE ScopedTypeVariables #-} import Control.Monad import Control.Concurrent import Control.Concurrent.STM import Control.Concurrent.STM.Stats import System.IO.Unsafe -- Copied from GHC docs children :: MVar [MVar ()] children = unsafePerformIO (newMVar []) waitForChildren :: IO () waitForChildren = do cs <- takeMVar children case cs of [] -> return () m:ms -> do putMVar children ms takeMVar m waitForChildren forkChild :: IO () -> IO ThreadId forkChild io = do mvar <- newEmptyMVar childs <- takeMVar children putMVar children (mvar:childs) forkFinally io (\_ -> putMVar mvar ()) -- Test case main = do forkChild $ newTVarIO 0 >>= incrManyTimes "thread1" forkChild $ newTVarIO 0 >>= incrManyTimes "thread2" forkChild $ newTVarIO 0 >>= incrManyTimes "thread3" forkChild $ newTVarIO 0 >>= incrManyTimes "thread4" waitForChildren dumpSTMStats -- Confirms no conflicts incrManyTimes :: String -> TVar Int -> IO () incrManyTimes l = incrRec (1000000 :: Int) where incrRec n v | n == 0 = pure () | otherwise = trackNamedSTM l (modifyTVar v (+1)) >> incrRec (n-1) v -- Wojtek