
On Thu, Aug 23, 2007 at 06:27:43AM +0100, Hugh Perkins wrote:
On 8/22/07, Brandon Michael Moore
wrote: Automatic threading is inherently limited by data dependencies. You can't run a function that branches on an argument in parallel with the computation producing that argument. Even with arbitrarily many processors and no communication overhead there is a limit to how much parallelism you can squeeze from any given program.
Yes. Its worse than that in fact, because many real-world problems will involve functions that have side-effects, but we know the side-effects dont matter. The parallelisation algo of course doesnt know they dont matter (unless we tell it).
Example: imagine we want to copy files from one machine to five others. Copying a file has a clear side-effect, but since we're copying to 5 independent machines, we can copy to each machine in parallel. The algo doesnt know that this is ok.
Actually, the algorithm was right, and the intuitive argument is wrong. Suppose all five computers are actually re-exporting network filesystem hosted on a sixth server, and the file names alias. By overruling the algorithm, you've introduced a corner case bug that will go unnoticed for years.
If you have cores to waste, you might try rewrites like
f x => case x of C1 a1 a2 -> f (C1 a1 a2) C2 b -> f (C2 b) C3 -> f C3
and then speculatively execute several of the case branches. If you don't throw away too much type information you should even be able to do it at runtime.
That's a good observation. Sortof anti-laziness :-D
It's called speculative execution, and microprocessors have been doing it for many years as part of their built-in automatic parallelization circuitry. (If you think emulator-based autoparallelization is going to be easy, consider that it takes a couple million transistors to do a reasonable job - and a chunk of 90's silicon can already do a million things at once, so you're at a huge starting disadvantage. Just my two cents.) Stefan