Problems with threading?

While working on the Shootout, I noticed the following benchmarks:

http://shootout.alioth.debian.org/u64/program.php?test=chameneosredux&lang=ghc&id=3
http://shootout.alioth.debian.org/u64q/program.php?test=chameneosredux&lang=ghc&id=3

The same program becomes almost 4 times slower when compiled with --threaded and run with +RTS -N5 -- even though the multi-core benchmark really only ever uses one processor.

Other languages seem to have found a way of arranging these threads in a way such that parallelism actually happens, but as it stands, compiling this benchmark without --threaded actually makes Haskell competitive against the genuinely parallel alternatives in other languages...which is unusual by itself.

I wanted to throw this out for people to discuss, because I'd like to see it improved. As it stands, I'm going to submit a version which asks not to be compiled with --threaded (and has a few other improvements).

http://shootout.alioth.debian.org/u64q/program.php?test=chameneosredux&lang=ghc&id=3

Louis Wasserman
wasserman.louis@gmail.com
http://profiles.google.com/wasserman.louis

wasserman.louis:
While working on the Shootout, I noticed the following benchmarks:
http://shootout.alioth.debian.org/u64/program.php?test=chameneosredux&lang=ghc&id=3
http://shootout.alioth.debian.org/u64q/program.php?test=chameneosredux&lang=ghc&id=3
The same program becomes almost 4 times slower when compiled with --threaded and run with +RTS -N5 -- even though the multi-core benchmark really only ever uses one processor.
Using -N5 sounds suspicious. There are only 4 cores on the machine.
Other languages seem to have found a way of arranging these threads in a way such that parallelism actually happens, but as it stands, compiling this benchmark without --threaded actually makes Haskell competitive against the genuinely parallel alternatives in other languages...which is unusual by itself.
I wanted to throw this out for people to discuss, because I'd like to see it improved. As it stands, I'm going to submit a version which asks not to be compiled with --threaded (and has a few other improvements).
What parallelization did you try? Is it a good algorithm? -- Don

--- On Mon, 6/7/10, Don Stewart
From: Don Stewart
Subject: Re: [Haskell-cafe] Problems with threading?
To: "Louis Wasserman"
Cc: "Haskell Café List"
Date: Monday, June 7, 2010, 2:50 PM
wasserman.louis:
While working on the Shootout, I noticed the following benchmarks:
http://shootout.alioth.debian.org/u64/program.php?test=chameneosredux&lang=ghc&id=3
http://shootout.alioth.debian.org/u64q/program.php?test=chameneosredux&lang=ghc&id=3
The same program becomes almost 4 times slower when compiled with --threaded and run with +RTS -N5 -- even though the multi-core benchmark really only ever uses one processor.
Using -N5 sounds suspicious. There are only 4 cores on the machine.
-N5 is likely to have been your suggestion for getting the most out of ghc 6.10.* :-)
-snip-
I wanted to throw this out for people to discuss, because I'd like to see it improved.
As Louis has already mentioned this to me, I'll take the opportunity to sketch out a simple approach -
1) GHC programs compiled without -threaded and run without +RTS -N are already shown for x86 and x64
http://shootout.alioth.debian.org/u32/compare.php?lang=ghc
http://shootout.alioth.debian.org/u64/compare.php?lang=ghc
2) For quad-core, the GHC programs will all be compiled with -threaded and all run with +RTS -N4
3) That seems to match the approach taken with Erlang, where all the programs on quad-core run with smp built into the vm, and all the programs on one core run without smp built into the vm.
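
For anyone comparing the two configurations discussed above, a small program can report whether a binary was linked with -threaded and how many capabilities +RTS -N gave it. This is only a sketch and not part of any Shootout submission; the name describeRts is made up for illustration, and it assumes a base library recent enough to export getNumCapabilities and getNumProcessors (newer than the GHC 6.x releases mentioned in this thread).

```haskell
-- Sketch (hypothetical helper, not Shootout code): report how this
-- binary was linked and started.
import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)
import GHC.Conc (getNumProcessors)

-- Describe the runtime configuration as a list of printable lines.
describeRts :: IO [String]
describeRts = do
  caps  <- getNumCapabilities   -- capabilities requested via +RTS -N
  procs <- getNumProcessors     -- processors reported by the OS
  return
    [ "threaded RTS: " ++ show rtsSupportsBoundThreads  -- True when linked with -threaded
    , "capabilities: " ++ show caps
    , "processors:   " ++ show procs
    , if caps > procs
        then "note: more capabilities than processors"
        else "note: capabilities fit the processor count"
    ]

main :: IO ()
main = describeRts >>= mapM_ putStrLn
```

Built without -threaded, the first line should report False; built with -threaded and run with +RTS -N5 on a four-core machine, the last line should flag the mismatch raised earlier in the thread. On newer GHC releases the binary may also need to be linked with -rtsopts before it accepts +RTS -N.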

igouy2:
As Louis has already mentioned this to me, I'll take the opportunity to sketch out a simple approach -
1) GHC programs compiled without -threaded and run without +RTS -N are already shown for x86 and x64
http://shootout.alioth.debian.org/u32/compare.php?lang=ghc
http://shootout.alioth.debian.org/u64/compare.php?lang=ghc
2) For quad-core, the GHC programs will all be compiled with -threaded and all run with +RTS -N4
3) That seems to match the approach taken with Erlang, where all the programs on quad-core run with smp built into the vm, and all the programs on one core run without smp built into the vm.
Yep, that's fine.

Louis Wasserman
While working on the Shootout, I noticed the following benchmarks:
http://shootout.alioth.debian.org/u64/program.php?test=chameneosredux&lang=ghc&id=3
http://shootout.alioth.debian.org/u64q/program.php?test=chameneosredux&lang=ghc&id=3 [...]
I'd like to see it improved. [...]
One difficulty is that the single meeting place is a bottleneck. The only parallel activity I can see for the chameneos creatures is standing around in queues.
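
The bottleneck can be illustrated with a stripped-down model. The following is only a sketch under assumptions, not the Shootout program: there are no colours, the meeting place is a single MVar, and the names Place, creature and runMeetings are invented for this example. Every creature has to take the same MVar to arrive, so at most one creature makes progress at the meeting place at any time, however many capabilities the runtime has.

```haskell
-- Sketch (hypothetical names, simplified rules): one shared meeting
-- place guarded by a single MVar.
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM)

-- Meetings still allowed, plus the reply slot of a creature that
-- arrived first and is waiting for a partner.
data Place = Place { remaining :: !Int, waiting :: Maybe (MVar ()) }

-- One creature: visit the place until it closes, and return the number
-- of meetings this creature took part in.
creature :: MVar Place -> IO Int
creature place = go 0
  where
    go n = do
      p <- takeMVar place                  -- every creature serialises here
      case (remaining p, waiting p) of
        (0, _) -> do                       -- closed: restore state and stop
          putMVar place p
          return n
        (_, Nothing) -> do                 -- first to arrive: queue up
          slot <- newEmptyMVar
          putMVar place p { waiting = Just slot }
          takeMVar slot                    -- block until a partner shows up
          go $! n + 1
        (r, Just slot) -> do               -- second to arrive: meet the waiter
          putMVar place (Place (r - 1) Nothing)
          putMVar slot ()                  -- wake the waiting creature
          go $! n + 1

-- Run the model; needs at least two creatures, since a lone creature
-- would wait forever for a partner.
runMeetings :: Int -> Int -> IO [Int]
runMeetings creatures meetings = do
  place <- newMVar (Place meetings Nothing)
  results <- forM [1 .. creatures] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (creature place >>= putMVar done)
    return done
  mapM takeMVar results

main :: IO ()
main = do
  counts <- runMeetings 4 1000
  print (sum counts)  -- each meeting is counted by both participants: 2000
```

In this model the only work done outside the lock is blocking on the reply slot, which matches the observation that the creatures' parallel activity amounts to queueing.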
participants (4): Don Stewart, Isaac Gouy, Louis Wasserman, Tom Pledger