Re: [Haskell-cafe] Haskell Speed Myth

26 Aug 2008

thomas.dubuisson:
...
dons:
...
Simon Marlow sez:
The thread-ring benchmark needs careful scheduling to get a speedup
on multiple CPUs. I was only able to get a speedup by explicitly
locking half of the ring onto each CPU. You can do this using
GHC.Conc.forkOnIO in GHC 6.8.x, and you'll also need +RTS -qm -qw.

Also make sure that you're not using the main thread for any part of
the main computation, because the main thread is a bound thread and
runs in its own OS thread, so communication between the main thread
and any other thread is slow.
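
For anyone who wants to try the pinning described above, here is a rough
sketch of the idea, not the code that was actually benchmarked. It
assumes a newer GHC where Control.Concurrent.forkOn has taken the place
of GHC.Conc.forkOnIO; the names threadRing, capFor and worker, and the
ring size and token count in main, are made up for illustration. The
main thread only injects the token and waits for the answer, so it
stays out of the ring itself.

```haskell
import Control.Concurrent (forkOn, getNumCapabilities)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM)

-- | Pass a token around a ring of n threads (n >= 1). The token starts
-- at `hops` and is decremented on each pass; the result is the 1-based
-- index of the thread that receives it when it has reached 0.
-- (Hypothetical helper, not taken from the original benchmark.)
threadRing :: Int -> Int -> IO Int
threadRing n hops = do
  caps  <- getNumCapabilities
  boxes <- replicateM n newEmptyMVar :: IO [MVar Int]
  done  <- newEmptyMVar
  let -- Contiguous blocks of the ring share a capability, so with -N2
      -- the first half is pinned to CPU 0 and the second half to CPU 1.
      capFor i = ((i - 1) * caps) `div` n
      -- Thread i's outbox is thread (i+1)'s inbox; the last wraps around.
      outboxes = drop 1 boxes ++ take 1 boxes
      worker i inbox outbox = loop
        where
          loop = do
            tok <- takeMVar inbox
            if tok == 0
              then putMVar done i
              else putMVar outbox (tok - 1) >> loop
  forM_ (zip3 [1 ..] boxes outboxes) $ \(i, inbox, outbox) ->
    forkOn (capFor i) (worker i inbox outbox)
  -- The main thread only starts the ring and collects the result.
  case boxes of
    (first : _) -> putMVar first hops
    []          -> error "threadRing: need at least one thread"
  takeMVar done

main :: IO ()
main = threadRing 503 1000000 >>= print  -- sizes are assumptions
```

Build with -threaded and run with +RTS -N2 to get two capabilities. The
-qm and -qw flags mentioned above are for the GHC 6.8.x RTS, so check
that your RTS still accepts them before relying on them.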
I had to see the results for myself :-)
old RTS:               0m54.296s
threaded RTS (-N1):    0m56.839s
threaded RTS (-N2):    0m52.623s
Wow!  3x the performance for a simple change.  Frustrating that there
isn't a portable/standard way to express this.  Also frustrating that
the threaded version doesn't improve on the situation (utilization is
back at 50%).
Anyway, that was a fun micro-benchmark to play with.
Did we gain any insights for submitting to the multicore shootout?

    http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=all

(Where I note GHC is currently in second place, though we've not
submitted any parallel programs yet).

Also CC'd Isaac, Mr. Shootout. Isaac, is the quad core shootout
open for business? Should we rally the troops?

-- Don

Don Stewart