function result caching

Hello, I'm new to Haskell and I'm using it to do some simulations for a game. I have some functions that take just one Int as an argument (the current situation), but they do a lot of computation after that (future evolutions, etc.). I'd like to know if the results are "cached" by the compiler. There are only a few thousand values I call the functions on, but because of the encoding they are spread over a fairly large interval (0-1000000).

If they are not cached, I'd like to know the best way to cache them manually, and where I can read more about this and about the optimizations the compiler does, because I've searched the web and found very little on this topic.

Thank you very much,
Silviu

P.S. Sorry for my English :)

On 10/12/06, Silviu Gheorghe wrote:
I'd like to know if the results are "cached" by the compiler
Hardly ever, as I understand things.
If they are not cached, I'd like to know the best way to cache them manually, and where I can read more about this and about the optimizations the compiler does, because I've searched the web and found very little on this topic.
You need to search for the word "memoize" (or "memoise"). Here's a page about a memo function for GHC:

http://www.haskell.org/ghc/docs/6.4.2/html/hslibs/memo-library.html

Hope this helps!

--Tom Phoenix

It does, thank you very much for the quick answer. Unfortunately, as I understand it, it doesn't work well on Ints :(

For now I created a list

    slowFunctionCacheList = [slowFunction i | i <- [0..5000000]]

and use "slowFunctionCacheList !! i" instead of "slowFunction i".

It helped a lot (I mean I stopped the program after "3 hours still working" and got the result in 2 minutes :))

I am still curious about a better method (and a general one), because this is ugly, and as I see it it only works on Ints.

But then again, thank you for telling me the compiler doesn't do this caching, because I had the false impression it does, and I wouldn't have stopped the program otherwise.

On Fri, 2006-10-13 at 01:27 +0300, Silviu Gheorghe wrote:
It does, thank you very much for the quick answer. Unfortunately, as I understand it, it doesn't work well on Ints :(
For now I created a list
slowFunctionCacheList = [slowFunction i | i <- [0..5000000]] and use "slowFunctionCacheList !! i" instead of "slowFunction i"
It helped a lot (I mean I stopped the program after "3 hours still working" and got the result in 2 minutes :))
I am still curious about a better method (and a general one), because this is ugly, and as I see it it only works on Ints.
I can describe a method that's uglier, faster, and more general; is that better or not?

You're using an infinite list to store your cached results. (Well, your list is actually finite, but an infinite list would work just as well.) Instead of using an infinite list, you can use an infinite binary tree, with a cached result at every node.

Construct a binary tree with the following property: consider the path from the root to a node, where left branches are called "0" and right branches are called "1". This sequence of 0's and 1's is the binary expansion of the key whose cached value is stored at that node (with the least-significant bit at the root of the tree). (Actually constructing this tree, and looking things up in it, is left as an exercise for the reader; but it isn't very hard.)

This generalizes to any kind of key that can be uniquely serialized into bits; looking up the corresponding value then takes O(# of bits in the key) extra time and space, which is far better than the O(2^(# of bits in the key)) that an infinite list would use.

Carl Witty
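Here is a minimal sketch of that exercise in plain Haskell, assuming non-negative Int keys (the names Tree, mkTree, lookupTree, and memoize are invented for illustration; the exact bit order is irrelevant as long as construction and lookup agree):

    -- An infinite binary tree with a cached result at every node.
    data Tree a = Node a (Tree a) (Tree a)

    -- Build the tree of all results of f. The node holding f k is
    -- reached by following k's binary expansion, least significant
    -- bit first; s is the weight (2^depth) of the next bit.
    mkTree :: (Int -> a) -> Tree a
    mkTree f = go 0 1
      where go k s = Node (f k) (go k (2 * s)) (go (k + s) (2 * s))

    -- Walk down the tree, consuming the key's bits LSB-first.
    lookupTree :: Tree a -> Int -> a
    lookupTree (Node v _ _) 0 = v
    lookupTree (Node _ l r) k
      | even k    = lookupTree l (k `div` 2)
      | otherwise = lookupTree r (k `div` 2)

    -- The tree is lazy, so only the nodes actually visited are ever
    -- built; sharing one tree across all calls is what caches results.
    memoize :: (Int -> a) -> (Int -> a)
    memoize f = lookupTree tree
      where tree = mkTree f

With this, memoize slowFunction behaves like slowFunction but remembers every result it has computed, and each lookup costs O(number of bits of the key) instead of the O(i) of indexing into a list.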

On Thu, Oct 12, 2006 at 04:01:14PM -0700, Carl Witty wrote:
Instead of using an infinite list, you can use an infinite binary tree, with a cached result at every node. Construct a binary tree with the following property: Consider the path from the root to a node, where left branches are called "0" and right branches are called "1". This sequence of 0's and 1's is the binary expansion of the key whose cached value is stored at that node (with the least-significant-bit at the root of the tree). (Actually constructing this tree, and looking things up in it, is left as an exercise for the reader; but it isn't very hard.)
This generalizes to any kind of key that can be uniquely serialized into bits; looking up the corresponding value then takes O(# of bits in the key) extra time and space, which is far better than the O(2^(# of bits in the key)) that an infinite list would use.
It is too bad IntSet and IntMap are strict in their subtrees; it would have been nice to provide things like

    infiniteMap :: (Int -> a) -> IntMap a
    infiniteMap = ...

    cachingSet :: (Int -> Bool) -> IntSet
    cachingSet = ...

Out of curiosity, why are IntMap and IntSet strict in their subtrees? Since their subtrees are not of CPR form, I would think the benefit would not be that great and might even hurt in some situations... was there testing of the 'lazy' version?

John

--
John Meacham - ⑆repetae.net⑆john⑈

On Thu, Oct 12, 2006 at 08:40:44PM -0700, John Meacham wrote:
it is too bad IntSet and IntMap are strict in their subtrees, it would have been nice to provide things like
out of curiosity, why are IntMap and IntSet strict in their subtrees.
I guess the reason is balancing. I can't think of any way of balancing a lazy tree that wouldn't break abstraction. Perhaps it would be possible to use some trick to rebalance an existing tree to account for what's currently evaluated, but it could be very tricky to get right, and it would certainly go beyond Haskell 98.

Best regards,
Tomasz

On Oct 13, 2006, at 4:05 AM, Tomasz Zielonka wrote:
On Thu, Oct 12, 2006 at 08:40:44PM -0700, John Meacham wrote:
it is too bad IntSet and IntMap are strict in their subtrees, it would have been nice to provide things like
out of curiosity, why are IntMap and IntSet strict in their subtrees.
I guess the reason is balancing. I can't think of any way of balancing a lazy tree that wouldn't break abstraction.
Uh, Patricia trees aren't balanced in the usual sense. There is exactly one tree structure for a given set of keys, regardless of insertion order etc. (IntSet and IntMap work approximately as Carl Witty described, last I checked, though I won't swear to whether bits are taken low to high or vice versa.)

I had assumed the strictness was to avoid deferring O(n) insertion work to the first lookup operation, though really it makes no difference in an amortized sense.

-Jan-Willem Maessen

On Fri, Oct 13, 2006 at 08:52:09AM -0400, Jan-Willem Maessen wrote:
I guess the reason is balancing. I can't think of any way of balancing a lazy tree that wouldn't break abstraction.
Uh, Patricia trees aren't balanced in the usual sense.
There is exactly one tree structure for a given set of keys, regardless of insertion order etc. (IntSet and IntMap work approximately as Carl Witty described, last I checked, though I won't swear to whether bits are taken low to high or vice versa.)
Ah, IntMaps! Forget what I said.

Best regards,
Tomasz

Carl Witty wrote:
Instead of using an infinite list, you can use an infinite binary tree, with a cached result at every node.
This, also known as a Patricia tree, is indeed the canonical answer. In general, one can always construct an (in)finite map from keys (Ints in the present case) to values as a trie. See also

    Ralf Hinze. Generalizing generalized tries. Journal of Functional
    Programming, 10(4):327-351, July 2000.
    http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

This paper uses generic programming, but that should not be a problem, as concrete cases can always be constructed "by hand" in plain Haskell. To apply this to Ints, one should view them as

    type Int = [Digit]
    data Digit = Zero | One

Also note that there is absolutely no balancing involved (which would break infinite and lazy stuff).

Regards,
apfelmus
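Constructing such a trie by hand for this view might look like the following sketch (Trie, mkTrie, and lookupTrie are invented names; the conversions between Int and [Digit] are omitted):

    data Digit = Zero | One

    -- A trie keyed by [Digit]: a value for the empty key, plus one
    -- subtrie for each possible first digit.
    data Trie v = Trie v (Trie v) (Trie v)

    -- The (infinite) trie of all results of f, built lazily.
    mkTrie :: ([Digit] -> v) -> Trie v
    mkTrie f = Trie (f [])
                    (mkTrie (f . (Zero :)))
                    (mkTrie (f . (One :)))

    -- Look up a key by walking the trie digit by digit.
    lookupTrie :: Trie v -> [Digit] -> v
    lookupTrie (Trie v _ _) []          = v
    lookupTrie (Trie _ z _) (Zero : ds) = lookupTrie z ds
    lookupTrie (Trie _ _ o) (One : ds)  = lookupTrie o ds

Together with (hypothetical) conversions toDigits :: Int -> [Digit] and fromDigits :: [Digit] -> Int, lookupTrie (mkTrie (f . fromDigits)) . toDigits gives a memoized f, and laziness guarantees that only the paths actually traversed are ever constructed.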

G'day all. Carl Witty wrote:
Instead of using an infinite list, you can use an infinite binary tree, with a cached result at every node.
Quoting apfelmus@quantentunnel.de:
This, also known as a Patricia tree, is indeed the canonical answer.
A Patricia tree is but one infinite tree data structure. There's another (which is actually an infinite list of balanced binary trees) at

    http://haskell.org/hawiki/MemoisingCafs

Cheers,
Andrew Bromage
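One way such a structure might be built from scratch (a sketch with invented names: Block, memoBlocks, and friends; not the code from that page):

    -- A finite balanced tree caching f on a known interval [lo..hi].
    data Block a = Tip | Block (Block a) a (Block a)

    buildBlock :: (Int -> a) -> Int -> Int -> Block a
    buildBlock f lo hi
      | lo > hi   = Tip
      | otherwise = Block (buildBlock f lo (m - 1)) (f m) (buildBlock f (m + 1) hi)
      where m = (lo + hi) `div` 2

    lookupBlock :: Block a -> Int -> Int -> Int -> a
    lookupBlock (Block l x r) lo hi i
      | i == m    = x
      | i <  m    = lookupBlock l lo (m - 1) i
      | otherwise = lookupBlock r (m + 1) hi i
      where m = (lo + hi) `div` 2
    lookupBlock Tip _ _ _ = error "lookupBlock: key out of range"

    -- An infinite list of blocks of doubling size covering [0..]:
    -- [0..0], [1..2], [3..6], [7..14], ...
    memoBlocks :: (Int -> a) -> [(Int, Int, Block a)]
    memoBlocks f = go 0 1
      where go lo size = (lo, hi, buildBlock f lo hi) : go (hi + 1) (2 * size)
              where hi = lo + size - 1

    memoize :: (Int -> a) -> (Int -> a)
    memoize f = look blocks
      where
        blocks = memoBlocks f
        look ((lo, hi, b) : rest) i
          | i <= hi   = lookupBlock b lo hi i
          | otherwise = look rest i
        look [] _ = error "memoize: blocks is infinite"

Looking up key i scans O(log i) blocks and then O(log i) nodes inside one block, and each block's contents are only forced on demand.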

"Silviu Gheorghe"
slowFunctionCacheList= [slowFunction (i) | i <-[0..5000000]] and use "slowFunctionCacheList !! i" instead of "slowFunction (i)"
i am still curious about a better method (and a general one)
Not much different in principle, but better in practice - you could use an array rather than a list. O(1) lookups should make things (a lot) faster. -k -- If I haven't seen further, it is by standing in the footprints of giants
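A sketch of that change, using a boxed Data.Array so the elements stay lazy (slowFunction here is a hypothetical stand-in with an arbitrary body):

    import Data.Array

    -- Hypothetical stand-in for the expensive function from the thread.
    slowFunction :: Int -> Integer
    slowFunction i = sum [fromIntegral j | j <- [0 .. i `mod` 1000]]

    -- Building the array allocates the table of thunks up front, but
    -- each slowFunction i is only computed when first demanded.
    slowFunctionCache :: Array Int Integer
    slowFunctionCache = listArray (0, 5000000) [slowFunction i | i <- [0 .. 5000000]]

    -- O(1) lookup, replacing slowFunctionCacheList !! i
    cachedSlowFunction :: Int -> Integer
    cachedSlowFunction i = slowFunctionCache ! i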

On Friday 13 October 2006 16:15, Ketil Malde wrote:
"Silviu Gheorghe"
writes: slowFunctionCacheList= [slowFunction (i) | i <-[0..5000000]] and use "slowFunctionCacheList !! i" instead of "slowFunction (i)"
i am still curious about a better method (and a general one)
Not much different in principle, but better in practice - you could use an array rather than a list. O(1) lookups should make things (a lot) faster.
Well, this is true only if the function's domain is small and fairly dense. He said it was large and sparsely populated.

With 5000000 elements, you're looking at allocating about 20Mb of memory _just to hold pointers to closures_, and then allocating and filling out 5000000 closures, all before you get down to doing any real work! That's just not going to be very fast. You're going to take a HUGE constant runtime and space penalty for that O(1) lookup.

Little-endian Patricia trees are probably the right way to go here (which I think has been mentioned already). You get O(log n) lookup and good behavior for a sparse and/or infinite domain, because only the parts of the tree that are explored get unfolded.
-k
--
Rob Dockins

Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.
-- TMBG

Robert Dockins writes:

slowFunctionCacheList = [slowFunction i | i <- [0..5000000]] and use "slowFunctionCacheList !! i" instead of "slowFunction i"
Not much different in principle, but better in practice - you could use an array rather than a list. O(1) lookups should make things (a lot) faster.
Well, this is true only if the function's domain is small and fairly dense.
I don't think so.
With 5000000 elements, you're looking at allocating about 20Mb of memory
On the other hand, the list allocates the 20Mb of pointers, *and* another 20Mb of cons cells.
_just to hold pointers to closures_, and then allocating and filling out 5000000 closures, all before you get down to doing any real work!
If I interpret you correctly, you want to make the array's contents strict? Not a good idea when the domain is sparse, but on the other hand it would let you unbox the contents, which means you'd only need to store the actual values. For boolean values, I think GHC stores one bit per value, i.e. less than a MB for this range.
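The strict, unboxed variant for a Bool-valued table might look like this sketch (slowPredicate is a hypothetical stand-in; how compactly GHC packs UArray Bool is an implementation detail):

    import Data.Array.Unboxed

    -- Hypothetical stand-in for an expensive Int -> Bool function.
    slowPredicate :: Int -> Bool
    slowPredicate i = odd (i * i + 1)

    -- An unboxed array is strict in its elements: the whole table is
    -- computed up front, but each Bool is stored compactly, with no
    -- pointer and no thunk.
    predicateCache :: UArray Int Bool
    predicateCache = listArray (0, 5000000) [slowPredicate i | i <- [0 .. 5000000]]

    cachedPredicate :: Int -> Bool
    cachedPredicate i = predicateCache ! i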
Little-endian patricia trees [...]
Yes, sure, if you can't afford a 20Mb index. On the other hand, changing the function to use an array is a very small modification, and probably more than good enough in many cases.

-k

--
If I haven't seen further, it is by standing in the footprints of giants

On Saturday 14 October 2006 13:13, Ketil Malde wrote:
Robert Dockins writes:

slowFunctionCacheList = [slowFunction i | i <- [0..5000000]] and use "slowFunctionCacheList !! i" instead of "slowFunction i"
Not much different in principle, but better in practice - you could use an array rather than a list. O(1) lookups should make things (a lot) faster.
Well, this is true only if the function's domain is small and fairly dense.
I don't think so.
With 5000000 elements, you're looking at allocating about 20Mb of memory
On the other hand, the list allocates the 20Mb of pointers, *and* another 20Mb of cons cells.
True, but only if you access deeply into the tail of the list. If one accesses only the first several hundred elements, say, then you'll only allocate the space needed for those. Of course, if you only want to access a small portion at the beginning, then why create such a big list in the first place? Moral: lists will lose this contest in almost all cases.
_just to hold pointers to closures_, and then allocating and filling out 5000000 closures, all before you get down to doing any real work!
If I interpret you correctly, you want to make the array's contents strict? Not a good idea when the domain is sparse, but on the other hand it would let you unbox the contents, which means you'd only need to store the actual values. For boolean values, I think GHC stores one bit per value, i.e. less than a MB for this range.
No, I didn't suggest that the elements be strict; that would involve precomputing the entire table. You _could_ do that if you anticipate a LOT of accesses, sufficient to outweigh the initial cost, but that seems unlikely for a sparse domain, as you mentioned.

However, even non-strict arrays are created all at once (ie, they are strict in their _structure_), and the closures have to be allocated as the array is being created. Creating a closure isn't terribly expensive, but creating 5000000 closures might take a while and cost a lot of memory if the closure has a large number of free variables (which depends on the function definition and the exact details of the lambda lifter). Also, large arrays tend to play havoc with GHC's garbage collector; it has to scan all elements on every major GC, IIRC. That alone may offset any advantages won.

In the end, the only way to be sure which method is best is to test it against your usage profile. My guess is that the array method will have enough overhead that it will lose against a tree. However, I may be wrong, especially if the program will have a very long runtime and if a "warm-up" period is acceptable.
Little-endian patricia trees [...]
Yes, sure, if you can't afford a 20Mb index. On the other hand, changing the function to use an array is a very small modification, and probably more than good enough in many cases.
I completely agree; it is good for many cases and can be a very useful technique. I just don't think it will be good for _this_ case (a large, sparse domain where f(n) doesn't depend on all f(m) for m < n), and that gets more likely as the domain grows. Again, the only way to really know is to implement and benchmark. Thankfully, caching techniques are completely local and can be changed easily.
-k
--
Rob Dockins

Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.
-- TMBG

Thank you for all the answers.

I was aware lists are not the best solution, but I was too keen to see the actual result.

I'll do some tests using different variants, though, because I have the feeling that in my next program I'll face the "strong" form of this problem.

participants (10)

- ajb@spamcop.net
- apfelmus@quantentunnel.de
- Carl Witty
- Jan-Willem Maessen
- John Meacham
- Ketil Malde
- Robert Dockins
- Silviu Gheorghe
- Tom Phoenix
- Tomasz Zielonka