Un-memoization

newer
ANNOUNCE: options, an easy-to-use...

older
Open-source projects for beginning...

Victor Miller

22 Mar 2012 22 Mar '12

1:55 a.m.

I was writing a Haskell program which builds a large labeled binary tree and then does some processing of it, which is fold-like. In the actual application that I have in mind the tree will be *huge*. If the whole tree is kept in memory it would probably take up 100's of gigabytes. Because of the pattern of processing the tree, it occurred to me that it might be better (cause much less paging) if some large subtrees could be replaced by thunks which can either recalculate the subtree as needed, or write out the subtree, get rid of the references to it (so it can be garbage collected) and then read back in (perhaps in pieces) as needed. This could be fairly cleanly expressed monadically. So does anyone know if someone has created something like this? Victor

Attachments:

attachment.html (text/html — 788 bytes)

Show replies by date

L Corbijn

22 Mar 22 Mar

8:18 a.m.

On Mar 22, 2012 2:56 AM, "Victor Miller" wrote:

...

I was writing a Haskell program which builds a large labeled binary tree

and then does some processing of it, which is fold-like. In the actual application that I have in mind the tree will be *huge*. If the whole tree is kept in memory it would probably take up 100's of gigabytes. Because of the pattern of processing the tree, it occurred to me that it might be better (cause much less paging) if some large subtrees could be replaced by thunks which can either recalculate the subtree as needed, This sounds like weak references (warning this is tricky stuff), http://www.haskell.org/ghc/docs/7.0.2/html/libraries/base-4.3.1.0/System-Mem... or write out the subtree, get rid of the references to it (so it can be garbage collected) and then read back in (perhaps in pieces) as needed. This could be fairly cleanly expressed monadically. So does anyone know if someone has created something like this?

...

Victor

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Stephen Tetley

6:20 p.m.

Hi Victor There was a paper at one of the early PADL conferences describing out-out-core data structures in Ocaml. I've never seen anyone following up this work, possibly because RAM has got so cheap in the last decade. If you have such large trees you may find the paper interesting. Although the authors used Caml they did use monads. Tyng-Ruey Chuang and Shin-Cheng Mu, "Out-of-core functional programming with type-based primitives," The paper seems to be on Citeseer.

Edward Kmett

23 Mar 23 Mar

5:17 p.m.

It depends on how you are building the tree. If you are building up the tree from repeated substitution at the leaves and never reference its body before you do the final fold, you may be able to exploit the fact that trees form a free monad, and that there is a nice construction for increasing the efficiency of substitution into free monads that has the side-effect of making them 'memoryless'. I blogged about it here: http://comonad.com/reader/2011/free-monads-for-less/ http://comonad.com/reader/2011/free-monads-for-less-2/ and used it here: http://comonad.com/reader/2011/free-monads-for-less-3/ Janis Voightländer has a paper on the simpler version from the first post above as well http://www.iai.uni-bonn.de/~jv/mpc08.pdf However, to page out or recalculate, you are probably looking at a more complicated construction. -Edward On Wed, Mar 21, 2012 at 9:55 PM, Victor Miller wrote:

...

I was writing a Haskell program which builds a large labeled binary tree and then does some processing of it, which is fold-like. In the actual application that I have in mind the tree will be *huge*. If the whole tree is kept in memory it would probably take up 100's of gigabytes. Because of the pattern of processing the tree, it occurred to me that it might be better (cause much less paging) if some large subtrees could be replaced by thunks which can either recalculate the subtree as needed, or write out the subtree, get rid of the references to it (so it can be garbage collected) and then read back in (perhaps in pieces) as needed. This could be fairly cleanly expressed monadically. So does anyone know if someone has created something like this?

Victor

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

4853

Age (days ago)

4854

Last active (days ago)

List overview

Download

3 comments

4 participants

participants (4)

Edward Kmett
L Corbijn
Stephen Tetley
Victor Miller