
dons:
(Where I note GHC is currently in second place, though we've not submitted any parallel programs yet).
We might call that the thread-ring effect :-)
Also CC'd Isaac, Mr. Shootout. Isaac, is the quad core shootout open for business? Should we rally the troops?
iirc there was some discussion after the last GHC release about cleaning up the programs to make them less low-level given the improved capabilities of the compiler - I don't think that ever happened, and "low level" seems to be a common complaint against the Haskell programs shown in the benchmarks game.
As Simon Peyton-Jones suggested, we're certainly open for suggestions:
http://groups.google.com/group/fa.haskell/browse_thread/thread/7eb82c689de8688/4f3c47b976394666?lnk=st&q=alioth+shootout#4f3c47b976394666
However, we're operating new measurement scripts on both u64q (published) and gp4 (unpublished), and my focus is still on catching up to where we were with measurements from the old scripts, and installing third-party libraries on u64q.

igouy2:
dons:
(Where I note GHC is currently in second place, though we've not submitted any parallel programs yet).
We might call that the thread-ring effect :-)
Also CC'd Isaac, Mr. Shootout. Isaac, is the quad core shootout open for business? Should we rally the troops?
iirc there was some discussion after the last GHC release about cleaning up the programs to make them less low-level given the improved capabilities of the compiler - I don't think that ever happened, and "low level" seems to be a common complaint against the Haskell programs shown in the benchmarks game.
As Simon Peyton-Jones suggested we're certainly open for suggestions:
However, we're operating new measurement scripts on both u64q (published) and gp4 (unpublished), and my focus is still on catching up to where we were with measurements from the old scripts, and installing third-party libraries on u64q.
So still consolidating the system. Do I understand though, that if we submit, say, a quad-core version of binary-trees, for example, using `par` and -N4, it'll go live on the benchmark page?
-- Don

--- Don Stewart
So still consolidating the system.
Pretty much.
Do I understand though, that if we submit, say, a quad-core version of binary-trees, for example, using `par` and -N4, it'll go live on the benchmark page?
That's an open question - should it? How should the benchmarks game approach multicore?

igouy2:
--- Don Stewart wrote: -snip-
So still consolidating the system.
Pretty much.
Do I understand though, that if we submit, say, a quad-core version of binary-trees, for example, using `par` and -N4, it'll go live on the benchmark page?
That's an open question - should it?
How should the benchmarks game approach multicore?
Well, there's a famous paper,
Algorithm + Strategy = Parallelism
I'd imagine we use the benchmark game's algorithms, but let submitters determine the strategy. Then the results would show
a) how well you utilize the cores, and
b) overall wall clock results.
I'm keen to get going on this, if only because I think we can turn out parallelised versions of many of the existing programs, fairly cheaply.
-- Don
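As a rough illustration of keeping the algorithm fixed while the submitter picks the strategy, here is a minimal sketch using `par` and `pseq`. It is not one of the actual benchmark programs: the names `Tree`, `build`, `count` and `countPar` are hypothetical, and the tree only loosely resembles the one in binary-trees. `par` and `pseq` are imported from `GHC.Conc` in base; the `parallel` package's `Control.Parallel` exports the same two functions.

```haskell
module Main (main) where

import GHC.Conc (par, pseq)

-- A perfect binary tree (hypothetical stand-in for the benchmark's tree).
data Tree = Leaf | Node Tree Tree

-- Build a perfect tree of the given depth.
build :: Int -> Tree
build 0 = Leaf
build d = Node (build (d - 1)) (build (d - 1))

-- The algorithm: count interior nodes, sequentially.
count :: Tree -> Int
count Leaf       = 0
count (Node l r) = 1 + count l + count r

-- The strategy: for the top k levels, spark the left subtree with `par`
-- while this thread evaluates the right one; below that, stay sequential.
countPar :: Int -> Tree -> Int
countPar _ Leaf       = 0
countPar 0 t          = count t
countPar k (Node l r) = nl `par` (nr `pseq` (1 + nl + nr))
  where
    nl = countPar (k - 1) l
    nr = countPar (k - 1) r

main :: IO ()
main = do
  let t = build 20
  print (count t)       -- sequential result
  print (countPar 2 t)  -- same result, up to 4 subtrees evaluated in parallel
```

Both calls compute the same number; only the evaluation order differs. The sparks can only run on other cores if the program is linked with -threaded and started with +RTS -N4 (or similar), otherwise `countPar` simply runs on one core.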

--- Don Stewart
How should the benchmarks game approach multicore?
Well, there's a famous paper,
Algorithm + Strategy = Parallelism
I'd imagine we use the benchmark game's algorithms, but let submitters determine the strategy. Then the results would show
a) how well you utilize the cores, and b) overall wall clock results.
otoh I see the attraction of showing parallelised versions alongside existing programs; otoh that adds yet another layer of confusion about why the measurements differ (and another level of quarreling about whether even vaguely the same thing is being measured); otoh some existing programs already use more cores when they can ...
The Scala threadring program shows 524s cpu but 157s elapsed:
http://shootout.alioth.debian.org/u64q/benchmark.php?test=threadring&lang=all
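To make the cpu-versus-elapsed point concrete: dividing CPU time by elapsed time gives a rough average of how many cores were busy. The sketch below only does that arithmetic on the two figures quoted above; `avgCores` is a hypothetical name, and the ratio is an average that says nothing about how the load was spread over time.

```haskell
module Main (main) where

-- Average number of busy cores implied by CPU seconds and elapsed seconds.
avgCores :: Double -> Double -> Double
avgCores cpuSecs elapsedSecs = cpuSecs / elapsedSecs

main :: IO ()
main = print (avgCores 524 157)  -- roughly 3.3 cores on the quad-core box
```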
I'm keen to get going on this, if only because I think we can turn out parallelised versions of many of the existing programs, fairly cheaply.
I'm always delighted that you're keen to get going on things like this! The benchmarks game always seems to demand somewhat unnatural acts and here's another - is there an effective way to /prevent/ ghc using multiple cores when multiple cores are available? Can we force ghc to only use one core on the quadcore machine? (Moreover can we do the same trick for other languages?)

On Aug 29, 2008, at 9:11 AM, Isaac Gouy wrote:
The benchmarks game always seems to demand somewhat unnatural acts and here's another - is there an effective way to /prevent/ ghc using multiple cores when multiple cores are available? Can we force ghc to only use one core on the quadcore machine? (Moreover can we do the same trick for other languages?)
There is indeed. In fact, GHC-compiled programs will only use multiple processors if you explicitly tell them to. To make use of two processors, for instance, you have to pass the flags '+RTS -N2' to the compiled executable when you run it. This tells it to schedule its internal threads on two OS threads, which the OS will presumably run on two processors if possible.
Aaron
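One way to check what the runtime is actually doing is to ask it. The following is a minimal sketch, assuming a reasonably recent GHC: `getNumCapabilities` and `getNumProcessors` come from `GHC.Conc` in base releases newer than this thread, and `describe` is a hypothetical helper added only for the example.

```haskell
module Main (main) where

import GHC.Conc (getNumCapabilities, getNumProcessors)

-- Hypothetical helper: format the two counts for display.
describe :: Int -> Int -> String
describe caps procs =
  "using " ++ show caps ++ " of " ++ show procs ++ " processors"

main :: IO ()
main = do
  caps  <- getNumCapabilities  -- value set by +RTS -N<n>; 1 unless told otherwise
  procs <- getNumProcessors    -- processors the runtime can see
  putStrLn (describe caps procs)
```

Run without any RTS flags, this should report one capability regardless of how many processors the machine has. Passing '+RTS -N2' requires the executable to be linked with -threaded, and on newer GHC versions also with -rtsopts, since most RTS flags are rejected by default there.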

igouy2:
otoh I see the attraction of showing parallelised versions alongside existing programs; otoh that adds yet another layer of confusion about why the measurements differ (and another level of quarreling about whether even vaguely the same thing is being measured); otoh some existing programs already use more cores when they can ...
The Scala threadring program shows 524s cpu but 157s elapsed:
http://shootout.alioth.debian.org/u64q/benchmark.php?test=threadring&lang=all
Very cool!
I'm keen to get going on this, if only because I think we can turn out parallelised versions of many of the existing programs, fairly cheaply.
I'm always delighted that you're keen to get going on things like this!
The benchmarks game always seems to demand somewhat unnatural acts and here's another - is there an effective way to /prevent/ ghc using multiple cores when multiple cores are available? Can we force ghc to only use one core on the quadcore machine? (Moreover can we do the same trick for other languages?)
Certainly, for the quad core, to get all 4 cores in play:
* compile with -threaded
* run with +RTS -N4
to force single core, we'll:
* compile normally
* run normally
-- Don
participants (3): Aaron Tomb, Don Stewart, Isaac Gouy