Re: [Haskell-beginners] missing parallelism

As far as I can tell, there's nothing wrong with your code. My hypothesis is that GHC optimizes the call to sumEuler 5000 by evaluating it only once, in one thread. Here's why I think so.

The program I used for debugging this is:

    import Control.Concurrent.Async (async, wait)
    import System.IO.Unsafe (unsafePerformIO)

    sumEuler :: Int -> Int
    sumEuler = sum . map euler . mkList
      where
        mkList n = seq (unsafePerformIO (putStrLn "Calculating")) [1..n-1]
        euler n = length (filter (relprime n) [1..n-1])
          where
            relprime x y = gcd x y == 1

    p :: Int -> IO Int
    p b = return $! sumEuler b

    p5000 :: IO Int
    p5000 = return $ sumEuler 5000

    main :: IO ()
    main = do
      a <- async $ p5000
      b <- async $ p5000
      av <- wait a
      bv <- wait b
      print (av,bv)

There are two main modifications. The first adds a debugging message "Calculating" that is printed when the list [1..(n-1)] is evaluated. The second makes p into a function. Notice that it uses strict application ($!); check how it behaves with plain $. I will use that function in further examples.

Running this program with "time ./Test 2 +RTS -ls -N2" gives me:

    Calculating
    (7598457,7598457)

    real 0m3.752s
    user 0m3.833s
    sys 0m0.211s

Just to be sure, I get almost the same time when doing only one computation with:

    main :: IO ()
    main = do
      a <- async $ p5000
      av <- wait a
      print av

So it seems that the value returned by p5000 is computed only once. GHC might be noticing that p5000 will always return the same value and might try to cache or memoize it.

If this hypothesis is right, then calling sumEuler with two different values should run in two different threads. And indeed it does:

    main :: IO ()
    main = do
      a <- async $ p 5000
      b <- async $ p 4999
      av <- wait a
      bv <- wait b
      print (av,bv)

Gives:

    Calculating
    Calculating
    (7598457,7593459)

    real 0m3.758s
    user 0m7.414s
    sys 0m0.064s

So it runs in two threads, just as expected. The strict application ($!) here is important. Otherwise it seems that the async thread returns a thunk and the evaluation happens in print (av, bv), which is evaluated in a single thread.

Also, the fact that p5000 is a top-level binding is important. When I do:

    main :: IO ()
    main = do
      a <- async $ p 5000
      b <- async $ p 5000
      av <- wait a
      bv <- wait b
      print (av,bv)

I get no optimization (GHC 7.6.3).

Best,
Greg
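P.S. To make the strictness point concrete, here is a minimal sketch that forces each result inside the worker thread, so that only finished values are handed back rather than thunks. It uses only base: forkIO and an MVar stand in for async and wait, and evaluate from Control.Exception plays the role of ($!). The helper name forkEval is my own, hypothetical, and not part of any library; sumEuler is the same function as above without the debugging output, and the smaller inputs are only there to keep the run short.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)

-- Same Euler totient sum as in the message, without the debug output.
sumEuler :: Int -> Int
sumEuler n = sum (map euler [1 .. n - 1])
  where
    euler k = length (filter (\j -> gcd k j == 1) [1 .. k - 1])

-- Hypothetical helper (not a library function): start a thread that
-- forces the value to weak head normal form, and return an action
-- that waits for the forced result.
forkEval :: a -> IO (IO a)
forkEval x = do
  box <- newEmptyMVar
  _ <- forkIO (evaluate x >>= putMVar box)
  return (takeMVar box)

main :: IO ()
main = do
  -- Two different arguments, so the two threads cannot share one thunk.
  waitA <- forkEval (sumEuler 2000)
  waitB <- forkEval (sumEuler 1999)
  av <- waitA
  bv <- waitB
  print (av, bv)
```

For the two threads to actually run on separate cores, this should be compiled with -threaded and run with +RTS -N2. For an Int, weak head normal form is the fully evaluated number, so evaluate is enough here; a lazy structure such as a list would need deeper forcing.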
participants (1)
-
Grzegorz Milka