
it seems that we are now moving right in this direction with GPUs
I was just thinking that GPUs might make a good target for a reduction language like Haskell. They are hugely parallel, and they have the commercial momentum to keep them current. It also occurred to me that the cell processor (in the Playstation 3) might also be a good target considering its explicitly parallel architecture.
They are no good. GPUs have no synchronisation between threads, which is needed for graph reduction. A delayed computation undergoes several steps between being created and actually being computed/forced, and we'd like to save that effort. Also, they are slow when dealing with memory accesses, and some are slow on conditional execution. Take a look at BrookGPU: http://graphics.stanford.edu/projects/brookgpu/ They have a raytracer on the GPU and it is SLOW because of the high cost of tree traversal.
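To make the synchronisation point concrete, here is a minimal sketch (names like Thunk, Node and force are illustrative, not any real runtime's API) of why forcing a shared delayed computation needs a lock: if two workers hit the same unevaluated node, only one may run it and write back the result, while the other must wait and reuse it. GPUs of the sort BrookGPU targets offer no such cross-thread synchronisation primitive.

```haskell
import Control.Concurrent.MVar

-- A shared heap node: either a still-delayed computation or its
-- memoised result after forcing.
data Node a = Unevaluated (IO a) | Evaluated a

newtype Thunk a = Thunk (MVar (Node a))

mkThunk :: IO a -> IO (Thunk a)
mkThunk act = Thunk <$> newMVar (Unevaluated act)

-- Forcing must be synchronised: takeMVar acts as the lock, so the
-- delayed computation runs at most once and its result is shared.
force :: Thunk a -> IO a
force (Thunk ref) = do
  node <- takeMVar ref
  case node of
    Evaluated v   -> putMVar ref node >> return v
    Unevaluated m -> do
      v <- m                      -- run the delayed computation once
      putMVar ref (Evaluated v)   -- overwrite the node with the result
      return v
```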