
Hi Thomas,

Thanks for your answer. I know that the parallelization needs some tweaking to get a performance gain. I agree with your reasoning about the 100-string input, but in the other example I gave, I use the 8-core machine to process only 10 strings (each of which takes about half a second to process); why does this case fail too?

M;

On 23/05/2009, at 0:02, Thomas Friedrich wrote:
Hi Miguel,
I don't think you can expect a program that processes each line of a text file in parallel to be very efficient. Opening a new thread is usually fairly cheap; however, there is some bookkeeping involved that shouldn't be underestimated. You only have a handful of cores, so opening 100 threads or more, depending on the size of your file, will do you no good. You should rather split your file into 4 chunks and then process those *4* chunks in parallel.
That should make it more efficient! Parallel /= faster. At least not automatically.
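For example, something along these lines (just a sketch, reusing the edist and longString from your program below; with newer versions of the parallel package, the rnf strategy is spelled rdeepseq):

import Control.Monad (liftM)
import Control.Parallel.Strategies

-- One spark per chunk of lines instead of one per line: parListChunk
-- evaluates the result list in chunks of the given size, so a 100-line
-- file with chunk size 25 gives roughly 4 parallel units of work.
getLines :: FilePath -> IO [Int]
getLines = liftM (\s -> map (edist longString) (lines s)
                          `using` parListChunk 25 rnf)
         . readFile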
Happy Hacking, Thomas
Miguel Pignatelli wrote:
Hi all,
I'm experimenting a bit with the parallelization capabilities of Haskell. What I am trying to do is to process all the lines of a text file in parallel, calculating the edit distance between each line and a given string. This is my testing code:
import System.IO
import Control.Monad
import Control.Parallel
import Control.Parallel.Strategies
edist :: String -> String -> Int
-- edist calculates the edit distance of 2 strings
-- see for example http://www.csse.monash.edu.au/~lloyd/tildeFP/Haskell/1998/Edit01/
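-- A standard dynamic-programming definition, sketched here only so the
-- example compiles; the implementation from the page above may differ.
edist a b = last (foldl nextRow [0 .. length a] b)
  where
    nextRow row@(start:_) c = scanl step (start + 1) (zip3 a row (tail row))
      where
        step left (x, diag, up) =
          minimum [ left + 1, up + 1             -- insertion / deletion
                  , diag + fromEnum (x /= c) ]   -- match or substitution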
getLines :: FilePath -> IO [Int]
getLines = liftM ((parMap rnf (edist longString)) . lines) . readFile
main :: IO ()
main = do
  list <- getLines "input.txt"
  mapM_ (putStrLn . show) list
I am testing this code on a 2x quad-core Linux (Ubuntu 8.10) machine (8 cores in total). The code has been compiled with
ghc --make -threaded mytest.hs
I've been trying input files of different lengths, but the more cores I try to use, the worse the performance I get. Here are some examples:
# input.txt -> 10 lines (strings) of ~1200 letters each
$ time ./mytest +RTS -N1 > /dev/null
real    0m4.775s
user    0m4.700s
sys     0m0.080s
$ time ./mytest +RTS -N4 > /dev/null
real    0m6.272s
user    0m8.220s
sys     0m0.290s
$ time ./mytest +RTS -N8 > /dev/null
real    0m7.090s
user    0m10.960s
sys     0m0.400s
# input.txt -> 100 lines (strings) of ~1200 letters each
$ time ./mytest +RTS -N1 > /dev/null
real    0m49.854s
user    0m49.730s
sys     0m0.120s
$ time ./mytest +RTS -N4 > /dev/null
real    1m11.303s
user    1m36.210s
sys     0m1.070s
$ time ./mytest +RTS -N8 > /dev/null
real    1m19.488s
user    2m6.250s
sys     0m1.270s
What is going wrong in this code? Is this a problem with the "grain size" of the parallelization? Any help / advice would be very welcome,
M;