
Hi,

I'm trying to use parallelism in Haskell, parMap in particular. My code is
essentially this:

    import Control.Parallel.Strategies

    slow :: String -> (String, [Int])
    fast x = (x, [0..255])

    main :: IO ()
    main = do
      content <- readFile "4.txt" -- a 327-line file
      print $ parMap rpar slow $ lines content

For the full/compiling code, see:
https://github.com/csmarosi/cryptopals/blob/bf1ca794f897858f9008fb8740a1c3ef...

What am I missing here? The code should be trivial to run on multiple cores,
but it seems to use only a single CPU core, i.e. with

    $ ghc -O2 -v0 -threaded --make s01p04.hs

slow gives:

    $ time ./s01p04 +RTS -N2 | wc
    1 1 64028
    real 0m8.903s
    user 0m7.888s
    sys 0m1.792s

whereas fast gives:

    $ time ./s01p04 +RTS -N2 | wc
    1 1 320787
    real 0m0.016s
    user 0m0.012s
    sys 0m0.012s

Bonus question: is there a parallel filter?

I use Debian sid packages, ghc version 7.8.4. Some multi-threading seems to
be going on, since without -N2 the sys time is below 0.1s. BTW, the slow code
uses ~300K of heap according to valgrind.

Thanks,
Csabi
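A likely factor with this pattern is that `rpar` only evaluates each sparked
result to weak head normal form, so if `slow` returns a pair whose expensive
part sits in a lazy field, most of the work still happens sequentially when
`print` forces the list. The sketch below is not the poster's code: the `slow`
body is a placeholder, and `rdeepseq`, `force`, `evalTuple2` and `r0` are
standard pieces of Control.Parallel.Strategies / Control.DeepSeq, used here on
the assumption that fully forcing each result is what's wanted. The `parFilter`
at the end is one hypothetical answer to the bonus question.

    import Control.Parallel.Strategies
    import Control.DeepSeq (force)
    import Control.Exception (evaluate)

    -- Placeholder for the real `slow` (the actual one is in the linked repo).
    slow :: String -> (String, [Int])
    slow x = (x, map (* length x) [0 .. 255])

    main :: IO ()
    main = do
      content <- readFile "4.txt"
      -- Force the lazily read file up front, so the sparks are not blocked
      -- behind interleaved lazy IO.
      ls <- evaluate (force (lines content))
      -- rdeepseq makes each spark evaluate its (String, [Int]) fully,
      -- instead of stopping at the outer pair constructor as rpar would.
      print (parMap rdeepseq slow ls)

    -- Bonus question: one way to build a "parallel filter" from the same
    -- machinery is to compute the predicate in parallel, then filter on it.
    parFilter :: (a -> Bool) -> [a] -> [a]
    parFilter p xs =
      map snd . filter fst $ parMap (evalTuple2 rseq r0) (\x -> (p x, x)) xs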

My first guesses would be that:

a) Your program is slowed down because lines are read from the file one by
one, i.e. there's lazy IO at work. Have you tried your code on a list that's
loaded into memory as a whole, all at once?

b) The cost of dereferencing the next link in the list overwhelms the cost
of computing one answer. Have you tried using a tree instead?

Best regards,
Marcin Mrotek
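To make guess (b) concrete, here is one possible reading of the tree
suggestion, as a sketch rather than anything from the thread: split the list
in half recursively and spark one half, so each spark carries a sizable piece
of work instead of a single list cell. `parMapTree` is an illustrative name
and `f` stands in for the poster's `slow`.

    import Control.Parallel (par, pseq)
    import Control.DeepSeq (NFData, force)

    -- Tree-shaped parallel map: divide the input in half recursively and
    -- evaluate the two halves in parallel.
    parMapTree :: NFData b => (a -> b) -> [a] -> [b]
    parMapTree _ []  = []
    parMapTree f [x] = [f x]
    parMapTree f xs  = left `par` (right `pseq` (left ++ right))
      where
        (l, r) = splitAt (length xs `div` 2) xs
        left   = force (parMapTree f l)   -- sparked for another core
        right  = force (parMapTree f r)   -- evaluated on the current core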

Thanks for the tips, some comments:
a) I tried printing the list before my expensive map (it prints fast), and
even copy-pasted the file's content into the source code, but it did not help.
b) What do you mean by this?
One more question: when I run my program with +RTS -s, the stats include:

    4,961,037,352 bytes allocated in the heap
    19,431,024 bytes copied during GC
    [...]
    3 MB total memory in use (0 MB lost due to fragmentation)

What does the first number (~5 GB) mean? When I wrote my functions, I just
assumed that even the bad code would be fast enough. (This question is more
about learning than about a practical issue.)
Can you recommend any docs about this kind of Haskell internals?
PS: splitting the input file in two and using bash to wait for both halves
reduces the runtime by ~40%. What I want here is to annotate my program with
this knowledge, to help GHC beat bash :)
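One way to express that "split the work and run the halves at once"
observation inside the program, rather than in bash, is to chunk the
evaluation with parListChunk from Control.Parallel.Strategies. This is only a
sketch with a placeholder `slow`; the chunk size of 32 is an arbitrary example
(a chunk of roughly half the line count would mirror the two-way split).

    import Control.Parallel.Strategies

    -- Placeholder for the real worker function from the original post.
    slow :: String -> (String, [Int])
    slow x = (x, [0 .. 255])

    main :: IO ()
    main = do
      content <- readFile "4.txt"
      -- Evaluate the mapped results in parallel, chunked so each spark
      -- gets a batch of lines rather than a single one.
      let results = map slow (lines content) `using` parListChunk 32 rdeepseq
      print results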