Re: Suggested algorithm to control upper bound of space "leaks"

http://www.haskell.org/pipermail/cvs-ghc/2009-October/050928.html Shelby Moore wrote:
One possible runtime optimization to provide an upper bound for cache control might be to not cache a thunk when its runtime computation time multiplied by its number of cache hits is less than the round-trip paging time multiplied by recent system paging load.
Is this novel? Is it useful?
Clarify and improve: one possible runtime optimization, to provide a more deterministic upper bound on paging load for cache control, might be to decide caching thunk-by-thunk: do not cache a thunk when the total computation time of the tree under that thunk, multiplied by the number of cache hits on the thunk, is less than the round-trip paging time of the thunks under that tree multiplied by a chosen factor of recent system paging load. Alternatively, toggle the test on and off at a chosen threshold of recent system paging load. Obviously "heap size" could be substituted for "paging load", but my goal is to make the optimization resolution independent, i.e. orthogonal to system configuration such as RAM size, speed, virtual media size, paging algorithm, etc. The "catch-22" of the idea, if any, probably derives from the permutations of computation time arising from the lazy computation of thunks under the tree. Note that I realize this list is probably not the correct place to discuss research, and I do not have time to do this research myself. Any suggestions where I could post or send this idea so that people interested in this type of research could see it? Add a feature request to the GHC bug tracker? Post to the general Haskell mailing list?
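To make the heuristic concrete, here is a minimal Haskell sketch of the decision rule, purely as illustration: the ThunkStats fields, the loadFactor knob, and the sample numbers are all hypothetical measurements a runtime would have to supply, not anything GHC exposes per thunk today.

-- Hypothetical cost model for the heuristic sketched above.
-- All fields are assumed to come from runtime instrumentation;
-- GHC does not currently measure any of this per thunk.
data ThunkStats = ThunkStats
  { computeTime     :: Double  -- total computation time of the tree under the thunk (seconds)
  , cacheHits       :: Double  -- observed/expected number of reuses of the cached value
  , pagingRoundTrip :: Double  -- round-trip paging time for the thunks under the tree (seconds)
  }

-- Decide whether caching the thunk's value is worth the paging pressure.
-- 'pagingLoad' is recent system paging load, 'loadFactor' the chosen weight.
shouldCache :: Double -> Double -> ThunkStats -> Bool
shouldCache loadFactor pagingLoad s =
  computeTime s * cacheHits s >= pagingRoundTrip s * loadFactor * pagingLoad

main :: IO ()
main = print (shouldCache 1.0 0.3 (ThunkStats 0.002 5 0.008))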

This is an opportunity cost minimization problem: http://www.haskell.org/pipermail/haskell-cafe/2009-November/068435.html One of the worst (most unoptimized and conflated) solutions is to force some determinism at the low-level language architecture, specifically targeted at achieving determinism in some domain at the higher level (which actually doesn't achieve it, as the aliasing error gets pushed around anyway). It is analogous to pulling all your teeth out so you won't get cavities, considering the power that lazy evaluation and pure referential transparency add to algorithm expression, composability, OOP, and optimization opportunities (in many domains, e.g. speed, algorithmic resonance, concurrency, etc.): http://www.haskell.org/pipermail/haskell-cafe/2009-November/068432.html Thus my analysis so far is that Haskell has it correct, and I am suggesting the missing optimization is to let us automatically put an upper bound on the space non-determinism (the topic of this thread); beyond that the programmer can optimize with profiling and strategic placement of seq and type constraints[1].
[1] Hudak, Hughes, Peyton Jones, Wadler (2007). "A History of Haskell: being lazy with class" (¶32 §10.3, "Controlling evaluation order" and ¶32 §10.2, "Space profiling")
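For readers who have not seen what "strategic placement of seq" buys in practice, the textbook example (mine, not taken from [1]) is the lazy left fold: the lazy accumulator builds an O(n) chain of thunks, while forcing it with seq keeps the fold in constant space.

-- Classic space-leak example: the lazy accumulator builds O(n) thunks.
lazySum :: [Integer] -> Integer
lazySum = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x) xs        -- (acc + x) stays unevaluated

-- Forcing the accumulator with seq bounds the space to O(1).
strictSum :: [Integer] -> Integer
strictSum = go 0
  where
    go acc []     = acc
    go acc (x:xs) = let acc' = acc + x
                    in acc' `seq` go acc' xs

main :: IO ()
main = print (strictSum [1 .. 1000000])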

> Thus my analysis so far is Haskell has it correct, and I am suggesting the missing optimization is to let us automatically put an upper bound on the space non-determinism (the topic of this thread), then the programmer can optimize beyond that with...
Arghh, the mailing list web archive did not find the head thread from the previous month, so for those readers here it is, so you know what I am referring to: http://www.haskell.org/pipermail/haskell-cafe/2009-October/068382.html Unfortunately, anyone reading the link above will not see a link to the thread continuation in this month. Bug.

Simon Marlow replied: http://www.haskell.org/pipermail/cvs-ghc/2009-November/050946.html

=============

Also I received privately a very much appreciated and stimulating reply about the convolution of the competition for the shared resources in the external state machine.

My quick (not deeply thought out) response is that the algorithm's target is not fixed (not a hard-coded upper bound); rather the algorithm should be a gradient search for the local minimum (which will always be a moving target, but that is okay). The problem is when it gets stuck in some local minimum that is far from the global minimum (e.g. some cycling resonance with the external competition), so to the degree that is common, we occasionally need to test some random position in the solution space (simulated annealing). That is analogous to how autonomous actors optimize problems in the real world, e.g. humans in their interaction with a reality shared with other actors: they bite off some goal and go for it, then scrap the goal if it turns out to be a useless local minimum. In short, we trade speed of convergence for determinism of convergence, i.e. with annealing (very slow cooling) the ice has fewer cracks. (I learned this stuff 20+ years ago from reading about one model of how neural networks can learn and converge.)

So yes, I agree that the nondeterminism can still creep back in as aliasing error in the algorithm's sampling for a global solution. But according to the fundamental theorems of the universe ( http://www.haskell.org/pipermail/haskell-cafe/2009-November/068432.html ), nothing is ever finally optimized (except that "faith exists") to a globally closed solution (even if you have a closed-solution mathematical model which says it is, i.e. space-time did not finalize the later inconsistency of quantum theory); i.e. there is no global solution in life, only a relative one (there is always some more complex external state, i.e. the universe trends to maximum disorder). But the practical motivation is that, to the degree the gradient search is reasonably more deterministic in common usage (see my "KEY POINT" in my response to Simon Marlow for why I think that is likely achievable), the space nondeterminism should in theory scale more continuously more often.

Regarding the private point about overbooking: since the CPU is much faster than the hard disk, it should be true that even if the paging load is 100% of the CPU allocation, the CPU is not overbooked. And if the paging load is so overbooked due to competing actors, you might as well just turn off your machine :). I also agree with the private point that the gradient search algorithm should incorporate a gradient reduction in its velocity towards the target as it approaches the target minimum, i.e. systems do not perform reliably if resource allocation strategies are edge-discontinuous. I disagree that the programmer/user needs to be burdened with this, as is the case now; I am arguing it should be automatic, unless the programmer wants to do space-profiling optimization work. Maybe I am very wrong. It is research, after all.
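To make the hand-waving a little more concrete, here is a toy Haskell sketch of what I mean by a damped gradient search toward a moving target with an occasional annealing jump. Everything in it (the constants, the tiny pseudo-random generator, the fixed target) is hypothetical illustration, not a proposal for an actual RTS mechanism; the step size shrinking with the remaining distance is the "no edge discontinuity" property mentioned above.

-- Sketch of the control loop described above: take damped steps toward a
-- moving target (the local minimum), and occasionally jump to a random
-- point (simulated annealing) to escape a bad local minimum.

-- A minimal linear congruential generator so the sketch has no dependencies.
nextRand :: Int -> (Double, Int)
nextRand s = let s' = (1103515245 * s + 12345) `mod` 2147483648
             in (fromIntegral s' / 2147483648, s')

-- One step: move a fraction of the remaining distance to the target
-- (the fraction shrinks as we approach, so steps taper off smoothly),
-- or with small probability jump to a random point in a toy search range.
step :: Double -> (Double, Int) -> (Double, Int)
step target (pos, seed)
  | r < 0.05  = (jump * 100, seed'')                   -- annealing kick
  | otherwise = (pos + rate * (target - pos), seed'')
  where
    (r, seed')     = nextRand seed
    (jump, seed'') = nextRand seed'
    rate           = min 0.5 (abs (target - pos) / 10) -- damped near the target

main :: IO ()
main = print (take 10 (iterate (step 42) (0, 1)))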

> stimulating reply about the convolution of the competition for the shared resources in the external state machine.
Apparently the concurrency problem is potentially convolved with external state in much more convoluted and intertwined spaghetti: http://lambda-the-ultimate.org/node/2048#comment-51673 http://lambda-the-ultimate.org/node/3637#comment-51649 Multi-core is coming! Wouldn't it be better if the mainstream scripting (i.e. composability) language was also pure FP (and the composable modules were pure FP)? If keeping some of the thunk closures around, so that scripting doesn't require a master's degree in profiling, costs a 4X slowdown (i.e. losing 3/4 of performance, and I doubt it would be anywhere near that bad), but we can remain in pure FP and minimize our STM usage (which also purportedly costs 4X), then once we hit 8+ cores performance has improved, not to mention all the other benefits. Okay, I realize that napkin calculation can't possibly encompass the details, but it just makes a conceptual point.
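To spell out the napkin arithmetic (my assumed numbers only): a hypothetical 4X sequential overhead against ideal linear scaling over 8 cores still nets a 2X win, which real programs of course will not fully achieve.

-- Napkin calculation only: assumes the 4X overhead figure above and
-- ideal (linear) scaling across cores.
effectiveSpeedup :: Double -> Double -> Double
effectiveSpeedup cores overhead = cores / overhead

main :: IO ()
main = print (effectiveSpeedup 8 4)  -- prints 2.0, i.e. still a net win at 8+ cores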