Memoizing in parallel

Hi,

I'm designing an algorithm that uses dynamic programming. I've written it with an array, and it works, but it is still very slow and needs way too much memory. Then I realized that the array is very sparse (at most O(\sqrt{n}) of its size is actually used). Now I want to rewrite it with a Data.Map, but since I do not know the keys a priori, I need a mutable ref somewhere. Which is fine, but... the array solution allowed me to explore the subsolutions in parallel with Control.Parallel, with each thread reusing items already evaluated by the others, whereas an STRef forbids that. Does anyone know how to do this (outside IO, and without using unsafePerformIO, of course)?

Thanks,
Pierre-Etienne Meunier

On Sun, Nov 8, 2009 at 2:51 AM, Pierre-Etienne Meunier wrote:
> Hi,
> I'm designing an algorithm that uses dynamic programming. I've written it
> with an array, and it works, but it is still very slow and needs way too
> much memory. Then I realized that the array was very sparse (at most
> O(\sqrt{n}) of its size is actually used). Now I want to rewrite it with a
> Data.Map, but since I do not know a priori what the keys are, I need a
> mutable ref somewhere.
I don't know how you drew that conclusion. In fact, no mutable ref is necessary. Your keys are (or can be mapped to) integers, since you used an array. A solution is to use a trie of integers. You could, for example, store the values at the nodes of an infinite tree that looks like:

                  0
          1               2
      3       4       5       6
    7   8   9  10  11  12  13  14
                 ...

There are various implementations of this around. For a quick solution, though, you can try the data-memocombinators package:

import qualified Data.MemoCombinators as Memo

f = Memo.integral go
    where
    go = ... f ...

See how that performs. It has asymptotically better space performance for sparse usage, but the devil can be in the constant factors.

Luke
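[To make the infinite-tree idea concrete, here is a minimal sketch of one well-known lazy-trie formulation; the names Tree, index, and nats are illustrative, not from Luke's message, and data-memocombinators builds on the same kind of trie internally.]

data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
    fmap f (Tree l x r) = Tree (fmap f l) (f x) (fmap f r)

-- Descend along the bits of the key: 0 is the root, 1 and 2 its
-- children, 3..6 the next level, exactly as in the picture above.
index :: Tree a -> Int -> a
index (Tree _ x _) 0 = x
index (Tree l _ r) k = case (k - 1) `divMod` 2 of
    (q, 0) -> index l q
    (q, 1) -> index r q

-- The tree containing every natural number at its own position.
nats :: Tree Int
nats = go 0 1
    where
    go n s = Tree (go l s') n (go r s')
        where
        l  = n + s
        r  = l + s
        s' = s * 2

-- Memoized Fibonacci: no mutable ref, just a shared lazy structure.
fib :: Int -> Integer
fib = index fibs
    where
    fibs = fmap f nats
    f 0 = 0
    f 1 = 1
    f k = fib (k - 1) + fib (k - 2)

[Because fibs is an ordinary lazy value, even computations sparked with Control.Parallel can share it: a node forced by one spark is seen as already evaluated by every other.]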

In fact, I could simply write a Fibonacci function like this:

import qualified Data.Map as Map

fibonacci n = fst $ fib n Map.empty
    where
    fib 0 m = (0, m)
    fib 1 m = (1, m)
    fib k m =
        case Map.lookup k m of
            Just x  -> (x, m)
            Nothing ->
                let (a, m')  = fib (k - 2) m
                    (b, m'') = fib (k - 1) m'
                in (a + b, Map.insert k (a + b) m'')
And I could even do it with the maps from Data.IntMap. I've tried your solution and looked at your source code; there is a small performance difference between your solution and this one: since you rely on laziness, and my structure is really sparse (at most O(\sqrt{n}) of it is used in the best case), it performs slightly better than an array, but it is still too slow: the tree does not rebalance itself, unlike the trees in Data.IntMap.
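[As a variation not from the thread: the same explicit-map memoization can be written with Control.Monad.State over Data.IntMap, so the map is threaded implicitly instead of by hand. A minimal sketch:]

import Control.Monad.State
import qualified Data.IntMap as IntMap

fibonacci' :: Int -> Integer
fibonacci' n0 = evalState (fib n0) IntMap.empty
    where
    fib :: Int -> State (IntMap.IntMap Integer) Integer
    fib 0 = return 0
    fib 1 = return 1
    fib k = do
        cached <- gets (IntMap.lookup k)
        case cached of
            Just x  -> return x
            Nothing -> do
                a <- fib (k - 2)
                b <- fib (k - 1)
                let r = a + b
                modify (IntMap.insert k r)
                return r

[Like the hand-threaded version, though, this is inherently sequential: the map is a single value passed along one evaluation order, which is exactly why it cannot serve the parallel exploration asked about next.]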
But my question is mainly about Haskell's parallelism: imagine I'm solving the problem of assigning m pairs of skis to n skiers, while trying to minimize the sum of the differences between the size of each skier and the size of their skis. With an array, I'd write:

import GHC.Arr
import Control.Parallel

n = 1000 -- skiers
m = 1500 -- skis

skis   = listArray (1, m) [150 ..]
skiers = listArray (1, n) [139 ..]

costs = listArray ((1, 1), (n, m)) $ map cost $ range ((1, 1), (n, m))

-- costs ! (i, j): cheapest assignment of the first i skiers to the first j skis
cost (i, j)
    | j < i            = 10 ^ 9  -- effectively infinite: too few skis remain
    | (i, j) == (1, 1) = diff 1 1
    | i == 1           = min (costs ! (1, j - 1)) (diff 1 j)
    | otherwise        = min (costs ! (i, j - 1))                 -- skip ski j
                             (costs ! (i - 1, j - 1) + diff i j)  -- give ski j to skier i
    where diff a b = abs (skiers ! a - skis ! b)
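[A minimal sketch of the kind of parallel exploration this sets up, assuming the definitions above; sparkAll and solution are illustrative names, and the spark-every-cell strategy is an assumption rather than recovered code. Sparks created with par force cells of the lazy costs array on other capabilities, and any spark needing a cell that another spark has already forced simply reuses the shared result.]

-- Continuing from the definitions above. Spark every cell of the
-- lazy table; each spark forces one entry, and entries already
-- forced elsewhere are reused because the thunks live in one
-- shared array.
sparkAll :: ()
sparkAll = foldr par () (elems costs)

-- Create the sparks, then demand the answer in the main thread.
solution :: Integer
solution = sparkAll `pseq` costs ! (n, m)

[Per-cell sparks this small rarely pay off as-is; in practice one would chunk them, say one spark per row, but the sharing behaviour is the point here.]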