Re: [Haskell-cafe] HANSEI in Haskell?

The topic of HANSEI in Haskell does come up from time to time. In fact, there was a Haskell edition of the first version of Hansei: http://okmij.org/ftp/kakuritu/probM.hs It was written to see how the code would look in Haskell, and how memoization (inherent in lazy evaluation of GHC) may help. The code contains some of the tests from the HANSEI distribution. The experience showed that lazy evaluation hurts -- a lot. The problem is not in lack of seq -- adding seq would make things even worse. There is a bit of explanation at the end of probM.hs. The gist of the problem is that approximate inference (sampling, to be precise) trades space for time. The search tree is way too large to fit into memory. Therefore, it is better to recompute the values (the branches of the tree) than to keep storing them. That is precisely the opposite to the trade-off of lazy evaluation. The end result is attempting to allocate gigabytes (really!), even for toy problems. We can get around the problem by using thunks explicitly. But then we lose the remaining benefits of Haskell syntax (the ones that are left after we have to switch to the monadic notation). OCaml becomes quite compelling then. It must be stressed that laziness in general is _crucial_ for solving even moderately complex problems. However, the needed laziness is NOT of the kind provided by GHC. We need memoization specific to a possible world, rather than the one that affects all possible worlds. GHC gives only the latter. The sort of laziness needed for non-deterministic programming calls for first-class store. It can be emulated, albeit imperfectly, for example, as was described in the ICFP09 paper. Hansei uses a quite similar idea. Ideally first-class memory is provided as built-in, since it requires interaction with garbage collection. Greg Morrisett has written extensively on this topic and done a few implementations; see for example, http://www.eecs.harvard.edu/~greg/papers/jgmorris-callcs.ps Sadly, the topic has been forgotten.

On 24/02/2011 09:30, oleg@okmij.org wrote:
The sort of laziness needed for non-deterministic programming calls for first-class store. It can be emulated, albeit *imperfectly*, for example, as was described in the ICFP09 paper.
What do you mean by "imperfectly"? Do you think implementing 'probM' with 'share' would lead to the same performance problems you experienced in probM.hs?
Ideally first-class memory is provided as built-in, since it requires interaction with garbage collection. Greg Morrisett has written extensively on this topic and done a few implementations; see for example, http://www.eecs.harvard.edu/~greg/papers/jgmorris-callcs.ps

Arnaud Clère
On 24/02/2011 09:30, oleg@okmij.org wrote:
The sort of laziness needed for non-deterministic programming calls for first-class store. It can be emulated, albeit *imperfectly*, for example, as was described in the ICFP09 paper. What do you mean by "imperfectly"?
I think Oleg meant that, because the garbage collector only understands the native store and not our emulation of first-class stores, cells of a first-class store that are no longer referenced will nevertheless not be garbage-collected until the store itself is garbage-collected. What we need is a way to tell the garbage collector that the store reference and the cell reference are both needed to access the data so the data can be garbage-collected as soon as *either* the store reference *or the cell reference* is. (Analogy from the capabilities literature: the key and the lock are both needed to open the door so the door can be garbage-collected as soon as either the key or the lock is.) Any thoughts on how to achieve that?
Do you think implementing 'probM' with 'share' would lead to the same performance problems you experienced in probM.hs?
No, implementing probM with share would be like what OCaml HANSEI currently does, except in monadic notation rather than direct style. -- Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig 1st graffitiist: QUESTION AUTHORITY! 2nd graffitiist: Why?

On Thu, Feb 24, 2011 at 12:57 PM, Chung-chieh Shan
Arnaud Clère
wrote in article in gmane.comp.lang.haskell.cafe: On 24/02/2011 09:30, oleg@okmij.org wrote:
The sort of laziness needed for non-deterministic programming calls for first-class store. It can be emulated, albeit *imperfectly*, > for example, as was described in the ICFP09 paper. What do you mean by "imperfectly"?
I think Oleg meant that, because the garbage collector only understands the native store and not our emulation of first-class stores, cells of a first-class store that are no longer referenced will nevertheless not be garbage-collected until the store itself is garbage-collected. What we need is a way to tell the garbage collector that the store reference and the cell reference are both needed to access the data so the data can be garbage-collected as soon as *either* the store reference *or the cell reference* is. (Analogy from the capabilities literature: the key and the lock are both needed to open the door so the door can be garbage-collected as soon as either the key or the lock is.) Any thoughts on how to achieve that?
Here's a weak cell which is a candidate for GC as soon as either the cell or the key is lost: http://hpaste.org/44280/weak_ref_needing_both_halves The implementation is GHC specific, as far as I know. I don't know if it's exactly what you're looking for, but the idea could be adapted towards some other purpose. Antoine

On 2011-02-24T16:20:46-0600, Antoine Latter wrote:
On Thu, Feb 24, 2011 at 12:57 PM, Chung-chieh Shan wrote:
What we need is a way to tell the garbage collector that the store reference and the cell reference are both needed to access the data so the data can be garbage-collected as soon as *either* the store reference *or the cell reference* is. (Analogy from the capabilities literature: the key and the lock are both needed to open the door so the door can be garbage-collected as soon as either the key or the lock is.) Any thoughts on how to achieve that?
Here's a weak cell which is a candidate for GC as soon as either the cell or the key is lost: http://hpaste.org/44280/weak_ref_needing_both_halves
That's promising! The lock I have in mind would probably be a map from Int to weak pointers. I'm unfamiliar with System.Mem.Weak so I'll have to study this code further. What is "addFinalizer lock (finalize wk)" for? It seems wk doesn't have any finalizers to run. Thanks! -- Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig 1st graffitiist: QUESTION AUTHORITY! 2nd graffitiist: Why?

On Thu, Feb 24, 2011 at 10:15 PM, Chung-chieh Shan
On 2011-02-24T16:20:46-0600, Antoine Latter wrote:
On Thu, Feb 24, 2011 at 12:57 PM, Chung-chieh Shan wrote:
What we need is a way to tell the garbage collector that the store reference and the cell reference are both needed to access the data so the data can be garbage-collected as soon as *either* the store reference *or the cell reference* is. (Analogy from the capabilities literature: the key and the lock are both needed to open the door so the door can be garbage-collected as soon as either the key or the lock is.) Any thoughts on how to achieve that?
Here's a weak cell which is a candidate for GC as soon as either the cell or the key is lost: http://hpaste.org/44280/weak_ref_needing_both_halves
That's promising! The lock I have in mind would probably be a map from Int to weak pointers. I'm unfamiliar with System.Mem.Weak so I'll have to study this code further. What is "addFinalizer lock (finalize wk)" for? It seems wk doesn't have any finalizers to run.
I was confused by GHC weak references for some time. GHC Weak references have the following parts: * a key * a value * finalizers attached to the key Setting up a weak reference to a value ties the value's liveness to the key's liveness (even if the weak reference itself is dead). Calling 'finalize' on the weak reference breaks the link between the weak reference, and runs the finalizers attached to the key. So here, we attach a finalizer to the "Lock" structure which then breaks the link between the key and the value (this doesn't break the link from the weak ref to the value, it simply stops the key from keeping the value alive). This might not be entirely safe in the face of some optimizations - finalizers on ForeignPtrs and MVars are odd, to work around other oddities that I don't yet grasp. Antoine
Thanks!
-- Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig 1st graffitiist: QUESTION AUTHORITY! 2nd graffitiist: Why?
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (4)
-
Antoine Latter
-
Arnaud Clère
-
Chung-chieh Shan
-
oleg@okmij.org