Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU! - Haskell-Cafe

Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU!

Burak Ekici

24 Dec 2011 24 Dec '11

3:49 p.m.

Dear List, I am trying to parallelize Karatsuba multiplication with Haskell's second generation strategies. Although, I am running the code on an Intel quad-core CPU, I abnormally have a speedup much greater than 4, around 10, which means a weird parallelization or something occurs. I would be appreciated, if anyone make some comments on the issue explaining the possible reasons why this weird incident occurs? Here is the basic parallel portion of the code: karatsuba :: Int -> [Bool] -> [Bool] -> [Bool] karatsuba _ [] _ = [] karatsuba _ _ [] = [] karatsuba currentDepth xs ys | (l < 32 || currentDepth >= limit) = mul xs ys | otherwise = (x `add` (replicate l False ++ (z `add` (replicate l False ++ y)))) `Main.using` strategy where l = (min (length xs) (length ys)) `div` 2 (xs0, xs1) = splitAt l xs (ys0, ys1) = splitAt l ys x = (normalize (karatsuba (currentDepth+1) xs0 ys0)) y = (normalize (karatsuba (currentDepth+1) xs1 ys1)) z = ((karatsuba (currentDepth+1) (add xs0 xs1) (add ys0 ys1)) `sub` (normalize (karatsuba (currentDepth+1) xs0 ys0)) `sub` (normalize (karatsuba (currentDepth+1) xs1 ys1))) strategy res = do (Main.rpar) (x) (Main.rpar) (y) (Main.rpar) (z) Main.rdeepseq res Many thanks in advance and kind regards. Saluti, Burak.

Attachments:

attachment.html (text/html — 3.3 KB)

Show replies by date

Eugene Kirpichov

24 Dec 24 Dec

3:53 p.m.

New subject: Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU!

Superlinear speedup can occur due to the increased cache size. 24.12.2011, в 19:49, Burak Ekici написал(а):

...

Dear List,

I am trying to parallelize Karatsuba multiplication with Haskell's second generation strategies. Although, I am running the code on an Intel quad-core CPU, I abnormally have a speedup much greater than 4, around 10, which means a weird parallelization or something occurs.

I would be appreciated, if anyone make some comments on the issue explaining the possible reasons why this weird incident occurs?

Here is the basic parallel portion of the code:

karatsuba :: Int -> [Bool] -> [Bool] -> [Bool] karatsuba _ [] _ = [] karatsuba _ _ [] = [] karatsuba currentDepth xs ys | (l < 32 || currentDepth >= limit) = mul xs ys | otherwise = (x `add` (replicate l False ++ (z `add` (replicate l False ++ y)))) `Main.using` strategy where l = (min (length xs) (length ys)) `div` 2 (xs0, xs1) = splitAt l xs (ys0, ys1) = splitAt l ys x = (normalize (karatsuba (currentDepth+1) xs0 ys0)) y = (normalize (karatsuba (currentDepth+1) xs1 ys1)) z = ((karatsuba (currentDepth+1) (add xs0 xs1) (add ys0 ys1)) `sub` (normalize (karatsuba (currentDepth+1) xs0 ys0)) `sub` (normalize (karatsuba (currentDepth+1) xs1 ys1))) strategy res = do (Main.rpar) (x) (Main.rpar) (y) (Main.rpar) (z) Main.rdeepseq res

Many thanks in advance and kind regards.

Saluti, Burak.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Burak Ekici

3:58 p.m.

First of all, thanks a lot for your quick answer! However, the question is what are the approximate limits of this super-linear speedup? I mean, is it acceptable, if parallelization happens even 100 time faster? How can I calculate the limits of this speedup via the cache size of my processor? Cheers, Burak. CC: haskell-cafe@haskell.org From: ekirpichov@gmail.com Subject: Re: [Haskell-cafe] Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU! Date: Sat, 24 Dec 2011 19:53:26 +0400 To: ekcburak@hotmail.com Superlinear speedup can occur due to the increased cache size. 24.12.2011, в 19:49, Burak Ekici написал(а): Dear List, I am trying to parallelize Karatsuba multiplication with Haskell's second generation strategies. Although, I am running the code on an Intel quad-core CPU, I abnormally have a speedup much greater than 4, around 10, which means a weird parallelization or something occurs. I would be appreciated, if anyone make some comments on the issue explaining the possible reasons why this weird incident occurs? Here is the basic parallel portion of the code: karatsuba :: Int -> [Bool] -> [Bool] -> [Bool] karatsuba _ [] _ = [] karatsuba _ _ [] = [] karatsuba currentDepth xs ys | (l < 32 || currentDepth >= limit) = mul xs ys | otherwise = (x `add` (replicate l False ++ (z `add` (replicate l False ++ y)))) `Main.using` strategy where l = (min (length xs) (length ys)) `div` 2 (xs0, xs1) = splitAt l xs (ys0, ys1) = splitAt l ys x = (normalize (karatsuba (currentDepth+1) xs0 ys0)) y = (normalize (karatsuba (currentDepth+1) xs1 ys1)) z = ((karatsuba (currentDepth+1) (add xs0 xs1) (add ys0 ys1)) `sub` (normalize (karatsuba (currentDepth+1) xs0 ys0)) `sub` (normalize (karatsuba (currentDepth+1) xs1 ys1))) strategy res = do (Main.rpar) (x) (Main.rpar) (y) (Main.rpar) (z) Main.rdeepseq res Many thanks in advance and kind regards. Saluti, Burak. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Eugene Kirpichov

4:04 p.m.

New subject: Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU!

Well, assume that cache is x times faster than main memory and that the hot working set size is y, and cache size of one core is z, and that the algorithm is really bound by memory access. Then some simple math should give the answer :) I can't do it myself now as I don't have a pen and paper at the moment. 24.12.2011, в 19:58, Burak Ekici написал(а):

...

First of all, thanks a lot for your quick answer! However, the question is what are the approximate limits of this super-linear speedup? I mean, is it acceptable, if parallelization happens even 100 time faster?

How can I calculate the limits of this speedup via the cache size of my processor?

Cheers, Burak.

CC: haskell-cafe@haskell.org From: ekirpichov@gmail.com Subject: Re: [Haskell-cafe] Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU! Date: Sat, 24 Dec 2011 19:53:26 +0400 To: ekcburak@hotmail.com

Superlinear speedup can occur due to the increased cache size.

24.12.2011, в 19:49, Burak Ekici написал(а):

Dear List,

I am trying to parallelize Karatsuba multiplication with Haskell's second generation strategies. Although, I am running the code on an Intel quad-core CPU, I abnormally have a speedup much greater than 4, around 10, which means a weird parallelization or something occurs.

I would be appreciated, if anyone make some comments on the issue explaining the possible reasons why this weird incident occurs?

Here is the basic parallel portion of the code:

karatsuba :: Int -> [Bool] -> [Bool] -> [Bool] karatsuba _ [] _ = [] karatsuba _ _ [] = [] karatsuba currentDepth xs ys | (l < 32 || currentDepth >= limit) = mul xs ys | otherwise = (x `add` (replicate l False ++ (z `add` (replicate l False ++ y)))) `Main.using` strategy where l = (min (length xs) (length ys)) `div` 2 (xs0, xs1) = splitAt l xs (ys0, ys1) = splitAt l ys x = (normalize (karatsuba (currentDepth+1) xs0 ys0)) y = (normalize (karatsuba (currentDepth+1) xs1 ys1)) z = ((karatsuba (currentDepth+1) (add xs0 xs1) (add ys0 ys1)) `sub` (normalize (karatsuba (currentDepth+1) xs0 ys0)) `sub` (normalize (karatsuba (currentDepth+1) xs1 ys1))) strategy res = do (Main.rpar) (x) (Main.rpar) (y) (Main.rpar) (z) Main.rdeepseq res

Many thanks in advance and kind regards.

Saluti, Burak.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Eugene Kirpichov

4:06 p.m.

New subject: Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU!

If the cache was infinitely faster, then doubling it would give an infinite speedup for an algorithm whose working set was exactly one core's cache size. 24.12.2011, в 19:58, Burak Ekici написал(а):

...

First of all, thanks a lot for your quick answer! However, the question is what are the approximate limits of this super-linear speedup? I mean, is it acceptable, if parallelization happens even 100 time faster?

How can I calculate the limits of this speedup via the cache size of my processor?

Cheers, Burak.

CC: haskell-cafe@haskell.org From: ekirpichov@gmail.com Subject: Re: [Haskell-cafe] Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU! Date: Sat, 24 Dec 2011 19:53:26 +0400 To: ekcburak@hotmail.com

Superlinear speedup can occur due to the increased cache size.

24.12.2011, в 19:49, Burak Ekici написал(а):

Dear List,

I am trying to parallelize Karatsuba multiplication with Haskell's second generation strategies. Although, I am running the code on an Intel quad-core CPU, I abnormally have a speedup much greater than 4, around 10, which means a weird parallelization or something occurs.

I would be appreciated, if anyone make some comments on the issue explaining the possible reasons why this weird incident occurs?

Here is the basic parallel portion of the code:

karatsuba :: Int -> [Bool] -> [Bool] -> [Bool] karatsuba _ [] _ = [] karatsuba _ _ [] = [] karatsuba currentDepth xs ys | (l < 32 || currentDepth >= limit) = mul xs ys | otherwise = (x `add` (replicate l False ++ (z `add` (replicate l False ++ y)))) `Main.using` strategy where l = (min (length xs) (length ys)) `div` 2 (xs0, xs1) = splitAt l xs (ys0, ys1) = splitAt l ys x = (normalize (karatsuba (currentDepth+1) xs0 ys0)) y = (normalize (karatsuba (currentDepth+1) xs1 ys1)) z = ((karatsuba (currentDepth+1) (add xs0 xs1) (add ys0 ys1)) `sub` (normalize (karatsuba (currentDepth+1) xs0 ys0)) `sub` (normalize (karatsuba (currentDepth+1) xs1 ys1))) strategy res = do (Main.rpar) (x) (Main.rpar) (y) (Main.rpar) (z) Main.rdeepseq res

Many thanks in advance and kind regards.

Saluti, Burak.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Eugene Kirpichov

4:09 p.m.

New subject: Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU!

I mean exactly 2x one cores cache size of course. 24.12.2011, в 20:06, Eugene Kirpichov написал(а):

...

If the cache was infinitely faster, then doubling it would give an infinite speedup for an algorithm whose working set was exactly one core's cache size.

24.12.2011, в 19:58, Burak Ekici написал(а):

...
First of all, thanks a lot for your quick answer! However, the question is what are the approximate limits of this super-linear speedup? I mean, is it acceptable, if parallelization happens even 100 time faster?

How can I calculate the limits of this speedup via the cache size of my processor?

Cheers, Burak.

CC: haskell-cafe@haskell.org From: ekirpichov@gmail.com Subject: Re: [Haskell-cafe] Parallel Karatsuba - A Weird speed up value greater than 4 on an Intel Quadcore CPU! Date: Sat, 24 Dec 2011 19:53:26 +0400 To: ekcburak@hotmail.com

Superlinear speedup can occur due to the increased cache size.

24.12.2011, в 19:49, Burak Ekici написал(а):

Dear List,

I am trying to parallelize Karatsuba multiplication with Haskell's second generation strategies. Although, I am running the code on an Intel quad-core CPU, I abnormally have a speedup much greater than 4, around 10, which means a weird parallelization or something occurs.

I would be appreciated, if anyone make some comments on the issue explaining the possible reasons why this weird incident occurs?

Here is the basic parallel portion of the code:

karatsuba :: Int -> [Bool] -> [Bool] -> [Bool] karatsuba _ [] _ = [] karatsuba _ _ [] = [] karatsuba currentDepth xs ys | (l < 32 || currentDepth >= limit) = mul xs ys | otherwise = (x `add` (replicate l False ++ (z `add` (replicate l False ++ y)))) `Main.using` strategy where l = (min (length xs) (length ys)) `div` 2 (xs0, xs1) = splitAt l xs (ys0, ys1) = splitAt l ys x = (normalize (karatsuba (currentDepth+1) xs0 ys0)) y = (normalize (karatsuba (currentDepth+1) xs1 ys1)) z = ((karatsuba (currentDepth+1) (add xs0 xs1) (add ys0 ys1)) `sub` (normalize (karatsuba (currentDepth+1) xs0 ys0)) `sub` (normalize (karatsuba (currentDepth+1) xs1 ys1))) strategy res = do (Main.rpar) (x) (Main.rpar) (y) (Main.rpar) (z) Main.rdeepseq res

Many thanks in advance and kind regards.

Saluti, Burak.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Arseniy Alekseyev

25 Dec 25 Dec

3:31 a.m.

Hi. You may need to make sure that the CPU frequency scaling does not do anything funny. (like only boosting the frequency to the maximum when the parallel program is running) Arseniy.

...

24.12.2011, в 19:49, Burak Ekici написал(а):

Dear List,

I am trying to parallelize Karatsuba multiplication with Haskell's second generation strategies. Although, I am running the code on an Intel quad-core CPU, I abnormally have a speedup much greater than 4, around 10, which means a weird parallelization or something occurs.

I would be appreciated, if anyone make some comments on the issue explaining the possible reasons why this weird incident occurs?

Here is the basic parallel portion of the code:

karatsuba :: Int -> [Bool] -> [Bool] -> [Bool] karatsuba _ [] _ = [] karatsuba _ _ [] = [] karatsuba currentDepth xs ys | (l < 32 || currentDepth >= limit) = mul xs ys | otherwise = (x `add` (replicate l False ++ (z `add` (replicate l False ++ y)))) `Main.using` strategy where l = (min (length xs) (length ys)) `div` 2 (xs0, xs1) = splitAt l xs (ys0, ys1) = splitAt l ys x = (normalize (karatsuba (currentDepth+1) xs0 ys0)) y = (normalize (karatsuba (currentDepth+1) xs1 ys1)) z = ((karatsuba (currentDepth+1) (add xs0 xs1) (add ys0 ys1)) `sub` (normalize (karatsuba (currentDepth+1) xs0 ys0)) `sub` (normalize (karatsuba (currentDepth+1) xs1 ys1))) strategy res = do (Main.rpar) (x) (Main.rpar) (y) (Main.rpar) (z) Main.rdeepseq res

Many thanks in advance and kind regards.

Saluti, Burak.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

4941

Age (days ago)

4942

Last active (days ago)

List overview

Download

6 comments

3 participants

participants (3)

Arseniy Alekseyev
Burak Ekici
Eugene Kirpichov