Re: [Haskell-cafe] Dynamic thread management?

23 Aug 2007

      On 8/22/07, Brandon Michael Moore  wrote:
...
Automatic threading is inherently limited by data dependencies.
You can't run a function that branches on an argument in parallel
with the computation producing that argument. Even with arbitrarily
many processors and no communication overhead there is a limit to
how much parallelism you can squeeze from any given program.
Yes.  Its worse than that in fact, because many real-world problems
will involve functions that have side-effects, but we know the
side-effects dont matter.  The parallelisation algo of course doesnt
know they dont matter (unless we tell it).

Example: imagine we want to copy files from one machine to five
others.  Copying a file has a clear side-effect, but since we're
copying to 5 independent machines, we can copy to each machine in
parallel.  The algo doesnt know that this is ok.
...
You should read
"Feedback Directed Implicit Parallelism"
http://research.microsoft.com/~tharris/papers/2007-fdip.pdf
and perhaps
"Limits to Implicit Parallelism in Functional Applications"
http://www.detreville.org/papers/Limits.pdf
Ok
...
In short, with zero overhead and an oracle for scheduling you can
get a factor of at most 2x to 32x by implicitly parallelizing
existing Haskell code. In practice, with execution overhead it's a
gain of perhaps 10% to 80% on a 4-core system.
This is a good argument that it's not enough to prototype on a 4 core
system, but we really need some way to simulate a 1024 core system to
carry out meaningful benchmarks.
...
You can do a lot better if you expect people to rewrite code,
but "automatic threading" suggests something completely effortless.
Yes, I tend to overuse the word "automatic" ;-)
...
I think you can get much better results if you work on the programming
style in connection with a better runtime.  You can think of data parallel
Haskell as a new programming style with more implicit parallelims,
and the runtime support to exploit it.
Yes, you're right.
...
If you have cores to waste, you might try rewrites like
f x
=>
case x of
 C1 a1 a2 -> f (C1 a1 a2)
 C2 b -> f (C2 b)
 C3 -> f C3
and then speculatively execute several of the case branches.
If you don't throw away too much type information you should
even be able to do it at runtime.
That's a good observation.  Sortof anti-laziness :-D

Re: [Haskell-cafe] Dynamic thread management?

Hugh Perkins