Re: Parallel Haskell: 2-year project to push real world use

4 May 2010

      * Ben Lippmeier  [04.05.2010 02:21]:
...
You can certainly create an array with these values, but in the provided code it looks like each successive array element has a serial dependency on the previous two elements. How were you expecting it to parallelise?
actually, in reality it is rather more complex, in a 2d-array, each cell
(i,j) requires a linear number of accesses to previously calculated
cells that all have indices bounded by the current (i,j).

One of the simplest codes is like this:

forall i in [1..n]
forall j in [i..n]
set (i,j) to: minimum of (i,k)+(k,j) (forall k in [i+1..j-1])

So, either I use destructive updates or need a tricky way to extend the
already computed part of the array with the new part (i,j). The above
code shows only why I need destructive updates. RNA folding is one of
those where it is needed.

I will try to distill the code down to an example that shows a
possibility for parallelization. I would want to use this for future
algorithms where it makes much more sense (O(n^4) or more), but that
still first update an array element, and then access it later.

Here http://www.tbi.univie.ac.at/newpapers/Abstracts/98-06-009.ps.gz is
a description of a parallel version of RNAfold.
...
Repa arrays don't support visible destructive update. For many algorithms you should't need it, and it causes problems for parallelisation.
I'm actively writing more Repa examples now.  Can you sent me some links explaining the algorithm that you're using, and some example data + output?
Thanks,
Ben.
On 04/05/2010, at 9:21 AM, Christian Höner zu Siederdissen wrote:
...
a = array (1,10) [ (i,f i) | i <-[1..10]] where
 f 1 = 1
 f 2 = 1
 f i = a!(i-1) + a!(i-2)
(aah, school ;)
Right now, I am abusing vector in ST by doing this:
a <- new
a' <- freeze a
forM_ [3..10] $ \i -> do
 write a (a'!(i-1) + a!(i-2))
Let's say I wanted to do something like this in dph (or repa), does that
work? We are actually using this for RNA folding algorithms that are at
least O(n^3) time. For some of the more advanced stuff, it would be
really nice if we could "just" parallelize.
To summarise: I need arrays that allow in-place updates.
Otherwise, most libraries that do heavy stuff (O(n^3) or worse) are
using vector right now. On a single core, it performs really great --
even compared to C-code that has been optimized a lot.
Thanks and "Viele Gruesse",
Christian