Unbelievable parallel speedup

I've enjoyed reading Simon Marlow's new tutorial on parallel and concurrent programming, and learned some surprisingly basic tricks. I didn't know about the '-s' runtime option for printing statistics. I decided to compute speedups for a program I wrote, just as Simon did, after running the program on an unloaded machine with four processors. When I did, I found the speedup on two processors was 2.4, on three it was 3.2, and on four it was 4.4! Am I living in a dream world?

I ran the test nine more times, and here is a table of the speedups.

    2 cores   3 cores   4 cores
    2.35975   3.42595   4.39351
    1.57458   2.18623   2.94045
    1.83232   2.77858   3.41629
    1.58011   2.37084   2.94913
    2.36678   3.63694   4.42066
    1.58199   2.29053   2.95165
    1.57656   2.34844   2.94683
    1.58143   2.3242    2.95098
    2.36703   3.36802   4.41918
    1.58341   2.30123   2.93933

That last line looks pretty reasonable to me, and is what I expected. Let's look at a table of the elapsed times.

    1 core    2 cores   3 cores   4 cores
    415.67    176.15    121.33    94.61
    277.52    176.25    126.94    94.38
    321.37    175.39    115.66    94.07
    277.72    175.76    117.14    94.17
    415.63    175.61    114.28    94.02
    277.75    175.57    121.26    94.10
    277.68    176.13    118.24    94.23
    277.51    175.48    119.40    94.04
    415.58    175.57    123.39    94.04
    277.62    175.33    120.64    94.45

Notice that the elapsed times for two and four processors are pretty consistent, and the ones for three processors are a little inconsistent, but the times for the single-processor case are all over the map. Can anyone explain all this variance?

I have enclosed the raw output from the runs and the script that was run ten times to produce the output.

John
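Each speedup figure above is just the single-core elapsed time divided by the N-core elapsed time for the same run. A minimal Haskell sketch of that arithmetic, checked against the first rows of the two tables:

    -- Speedups of the multi-core times relative to the single-core time.
    speedups :: [Double] -> [Double]
    speedups (t1 : ts) = map (t1 /) ts
    speedups _         = []

    -- speedups [415.67, 176.15, 121.33, 94.61]
    --   ~= [2.35975, 3.42595, 4.39351]   -- the first speedup row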

While I would guess that your superlinear speedup is due to the large
variance in your single-core runs, superlinear speedup is indeed
possible.
Say you have a problem set of size 32 MB and an L2 cache of 8 MB per
core. If you run the program on one CPU, the working set won't fit
into the cache, so you'll have lots of cache misses, and this will
show in overall performance. If you run the same problem on 4 cores
and manage to distribute the working set evenly, then it will fit into
the local caches and you will have very few cache misses. Because
caches are an order of magnitude faster than main memory, the parallel
program can be more than 4x faster. To counteract this effect, you
can try to scale the problem with the number of cores (but it then has
to be a truly linear problem).
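To make the arithmetic concrete, here is a toy Haskell model of that effect. The all-or-nothing cache behaviour and the 10x miss penalty are illustrative assumptions, not measurements:

    -- Toy model: each core's share of a 32 MB working set either fits in
    -- its 8 MB L2 cache (1 time unit per access) or spills to main memory
    -- (assumed 10x slower). All numbers are illustrative assumptions.
    workingSetMB, cacheMB :: Double
    workingSetMB = 32
    cacheMB      = 8

    timePerAccess :: Int -> Double
    timePerAccess cores
      | workingSetMB / fromIntegral cores <= cacheMB = 1   -- fits: cache speed
      | otherwise                                    = 10  -- spills: memory speed

    speedup :: Int -> Double
    speedup cores = timePerAccess 1 / (timePerAccess cores / fromIntegral cores)

In this deliberately extreme model, speedup 2 == 2.0 and speedup 3 == 3.0, but speedup 4 == 40.0: superlinear, because only at four cores does each core's share fit in its cache. Real caches degrade gradually, so real superlinear speedups are far milder, but the shape of the effect is the same.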
That said, the variance in your single-CPU case is difficult to
diagnose without knowing more about your program. It could be due to
GC effects, it could be an interaction with the OS scheduler, or it
could be many other things. On many operating systems, if you run a
single-core program for a while, the OS scheduler may decide to move
it to a different core in order to spread out wear among the cores.
It's possible that something like this is happening and,
unfortunately, some Linux systems hide this from the user. Still,
there could be many other explanations.
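If GC effects are the suspect, one cheap check is to compare mutator time against GC time between a fast (~277 s) and a slow (~415 s) single-core run; the '+RTS -s' output already breaks these down (the MUT and GC lines). With more recent GHCs (8.2 and later) a program can also query the same numbers itself. A minimal sketch, assuming the program is run with +RTS -T:

    import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

    -- Print wall-clock time spent in the mutator vs. in the GC.
    -- Requires running the program with +RTS -T.
    reportTimes :: IO ()
    reportTimes = do
      enabled <- getRTSStatsEnabled
      if not enabled
        then putStrLn "re-run with +RTS -T to enable runtime stats"
        else do
          s <- getRTSStats
          let secs t = fromIntegral t / 1e9 :: Double  -- nanoseconds to seconds
          putStrLn $ "mutator: " ++ show (secs (mutator_elapsed_ns s))
                  ++ "s, GC: "   ++ show (secs (gc_elapsed_ns s)) ++ "s"

If the slow runs show roughly the same mutator time but much more GC time, the variance is a GC effect; if both scale up together, look elsewhere.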
On 3 June 2011 13:10, John D. Ramsdell wrote:
> When I did, I found the speedup on two processors was 2.4, on three it was 3.2, and on four it was 4.4! Am I living in a dream world?
-- Push the envelope. Watch it bend.

> I've enjoyed reading Simon Marlow's new tutorial on parallel and concurrent programming ...
I am interested: where is this tutorial?

On 3 June 2011 16:14, Yves Parès wrote:
> I am interested: where is this tutorial?

https://github.com/simonmar/par-tutorial

--
Erlend Hamberg
ehamberg@gmail.com

On 03/06/2011 13:10, John D. Ramsdell wrote:
> Notice that the elapsed times for two and four processors are pretty consistent, and the ones for three processors are a little inconsistent, but the times for the single-processor case are all over the map. Can anyone explain all this variance?
This looks like automatic CPU speed throttling to me. The OS is decreasing the CPU clock speed automatically to save power. Normally it happens in steps (0.75x, 0.5x of the maximum clock). This would also explain why the results are more stable when using more cores: the OS has determined that there is lots of work to do, and has stopped throttling the CPU. If you can turn it off, do so.

Cheers,
Simon
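One way to test this hypothesis on Linux is to read each core's cpufreq governor out of sysfs: 'ondemand' or 'powersave' means the kernel may be scaling the clock down, while 'performance' keeps it at full speed. A minimal sketch, assuming a Linux machine that exposes the standard cpufreq interface (and hard-coding the four cores from this thread):

    import Control.Monad (forM_)

    -- Report the cpufreq scaling governor for each of the four cores.
    -- Assumes Linux with /sys/devices/system/cpu/cpuN/cpufreq present.
    main :: IO ()
    main = forM_ [0 .. 3 :: Int] $ \n -> do
      let path = "/sys/devices/system/cpu/cpu" ++ show n
                 ++ "/cpufreq/scaling_governor"
      gov <- readFile path
      putStrLn ("cpu" ++ show n ++ ": " ++ takeWhile (/= '\n') gov)

If throttling is the cause, switching the governor to 'performance' (with cpufrequtils or cpupower, depending on the distribution) should make the single-core times much more stable.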
participants (5)

- Erlend Hamberg
- John D. Ramsdell
- Simon Marlow
- Thomas Schilling
- Yves Parès