Re: Parallel Haskell: 2-year project to push real world use

Hi,
on that topic, consider this (rather trivial) array:
a = array (1,10) [ (i, f i) | i <- [1..10] ] where
  f 1 = 1
  f 2 = 1
  f i = a!(i-1) + a!(i-2)
(aah, school ;)
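This works as ordinary Haskell because Data.Array's boxed arrays are
lazy in their elements -- each cell is a thunk that pulls on its two
predecessors when forced. Spelled out as a complete program (untested):

  import Data.Array

  a :: Array Int Integer
  a = array (1,10) [ (i, f i) | i <- [1..10] ] where
    f 1 = 1
    f 2 = 1
    f i = a!(i-1) + a!(i-2)

  main :: IO ()
  main = print (elems a)   -- [1,1,2,3,5,8,13,21,34,55]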
Right now, I am abusing vector in ST by doing this:
a  <- new 10              -- vector indices run from 0 to 9
write a 0 1
write a 1 1
a' <- unsafeFreeze a      -- a' shares storage with a, so later writes show through
forM_ [2..9] $ \i ->
  write a i (a'!(i-1) + a'!(i-2))
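As a complete, compilable sketch (untested; V and M are qualified
imports of Data.Vector and Data.Vector.Mutable):

  import Control.Monad (forM_)
  import Control.Monad.ST (runST)
  import qualified Data.Vector as V
  import qualified Data.Vector.Mutable as M

  -- unsafeFreeze does not copy, so a' is a view of the same buffer
  -- as a and sees every later write -- which is exactly the abuse.
  fibs :: Int -> V.Vector Integer   -- assumes n >= 2
  fibs n = runST $ do
    a  <- M.new n
    M.write a 0 1
    M.write a 1 1
    a' <- V.unsafeFreeze a
    forM_ [2 .. n-1] $ \i ->
      M.write a i (a' V.! (i-1) + a' V.! (i-2))
    return a'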
Let's say I wanted to do something like this in dph (or repa): does that
work? We are actually using this pattern for RNA folding algorithms that
take at least O(n^3) time. For some of the more advanced stuff, it would
be really nice if we could "just" parallelize.
To summarise: I need arrays that allow in-place updates.
Otherwise, most libraries that do heavy stuff (O(n^3) or worse) are
using vector right now. On a single core it performs really well --
even compared to heavily optimized C code.
Thanks and "Viele Gruesse",
Christian
Duncan Coutts wrote:

> On Fri, 2010-04-30 at 10:25 -0400, Tyson Whitehead wrote:
> > On April 30, 2010 06:32:55 Duncan Coutts wrote:
> > > In the last few years GHC has gained impressive support for parallel
> > > programming on commodity multi-core systems. In addition to
> > > traditional threads and shared variables, it supports pure
> > > parallelism, software transactional memory (STM), and data
> > > parallelism. With much of this research and development complete,
> > > and more on the way, the next stage is to get the technology into
> > > more widespread use.
> >
> > Does this mean DPH is ready for abuse?
>
> This project is about pushing the practical use of the parallel
> techniques that are already mature, rather than about pushing research
> projects along further.
>
> So this project is not really about DPH. On the other hand, it's
> possible someone might be able to make more immediate use of the dense,
> regular parallel arrays, which have been a recent spin-off of the DPH
> project. They have the advantage of being considerably easier to
> implement, but are much less expressive than the full sparse, nested
> parallel arrays.
>
> Duncan

Ben Lippmeier wrote:

You can certainly create an array with these values, but in the provided
code each successive array element has a serial dependency on the
previous two, so how were you expecting it to parallelise? Repa arrays
don't support visible destructive update. For many algorithms you
shouldn't need it, and it causes problems for parallelisation.

I'm actively writing more Repa examples now. Can you send me some links
explaining the algorithm that you're using, and some example data +
output?

Thanks, Ben.
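For contrast, the bulk-operation style that Repa does parallelise looks
like this -- a rough sketch (untested, and assuming the repa 3 names
computeP and fromListUnboxed; the API differs between repa versions):

  import Data.Array.Repa as R

  -- One bulk operation over the whole array; no element depends on
  -- another, so the work can be split across cores.
  doubleAll :: Array U DIM1 Double -> IO (Array U DIM1 Double)
  doubleAll arr = computeP (R.map (*2) arr)

  main :: IO ()
  main = do
    let xs = fromListUnboxed (Z :. 10) [1 .. 10] :: Array U DIM1 Double
    print . toList =<< doubleAll xs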

On 04/05/2010, at 09:21, Christian Höner zu Siederdissen wrote:
> Hi,
> on that topic, consider this (rather trivial) array:
>
> a = array (1,10) [ (i, f i) | i <- [1..10] ] where
>   f 1 = 1
>   f 2 = 1
>   f i = a!(i-1) + a!(i-2)
>
> (aah, school ;)
>
> Right now, I am abusing vector in ST by doing this:
>
> a  <- new 10
> write a 0 1
> write a 1 1
> a' <- unsafeFreeze a
> forM_ [2..9] $ \i ->
>   write a i (a'!(i-1) + a'!(i-2))
>
> Let's say I wanted to do something like this in dph (or repa): does
> that work? We are actually using this pattern for RNA folding
> algorithms that take at least O(n^3) time. For some of the more
> advanced stuff, it would be really nice if we could "just" parallelize.
Do you really just need a prefix sum? These are easily parallelisable if
the operator is associative. For instance, you could implement the
Fibonacci sequence as:

  mapP fst $ scanP (\(a,b) _ -> (a+b,a)) (1,0) $ replicateP n (0,0)

and DPH would parallelise it. That's how I would write the above with
vector as well.
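Something like this (an untested sketch; scanl' here is Data.Vector's
sequential scan, playing the role of scanP):

  import qualified Data.Vector as V

  -- Carry the pair (fib i, fib (i-1)) through the scan and take
  -- the first component at the end.
  fibs :: Int -> V.Vector Integer
  fibs n = V.map fst
         $ V.scanl' (\(a,b) _ -> (a+b, a)) (1,0)
         $ V.replicate (n-1) ()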
> To summarise: I need arrays that allow in-place updates.
In-place updates + parallelism = bad! That's oversimplifying, of course. But the transformations underlying DPH, for instance, simply don't work in the presence of side effects.
> Otherwise, most libraries that do heavy stuff (O(n^3) or worse) are
> using vector right now. On a single core it performs really well --
> even compared to heavily optimized C code.
That's great to know! Do you (or anyone else) by any chance have any
benchmarks you could share? At the moment I'm only benchmarking vector
with a couple of rather simplistic algorithms, which is a bit of a
problem.

Roman
participants (4)

- Ben Lippmeier
- Christian Höner zu Siederdissen
- Don Stewart
- Roman Leshchinskiy