
On Sat, Mar 21, 2009 at 06:02:58AM -0700, Michael Mossey wrote:
So I'm wondering to what extent the haskell compiler recognizes computations it's done before. In a purely functional language this should be pretty easy, right? If it sees the same expression, it knows it will have the same value. That's my understanding, so far.
It really doesn't; this isn't as easy in a purely functional language as you think. For one thing, how to recognize which computations have been done before? The runtime would have to keep around some sort of hash mapping expressions to values, and incur a lot of overhead checking each thing to be evaluated against this map. Furthermore, this may be very expensive memory-wise. Consider the the case where some expression generates a very long (say, million-element) list, but it is immediately processed by some other code which counts how many threes the list contains. This can be done in constant memory, since the garbage collector can come along behind and clean up the processed parts of the list even while it is still being generated. But if we need to keep the list around just in case the expression that generated it is used again, we will need to keep a million-element list in memory! So sometimes, memoizing expression values like this is a pessimization, and it's very difficult to tell the difference. This is a tricky problem; I've run into pretty much the exact same problem (the need to cache computed sizes for recursively constructed elements) in my 'diagrams' library. Probably the easiest solution (which I haven't actually yet implemented in my library, but plan to do something along these lines!) is just to cache the size like you would in an imperative language. For example: data Thingy = Composite [Sized Thingy] | Primitive String data Sized a = S { size :: Int, thing :: a } thingySize :: Thingy -> Int thingySize (Primitive s) = length s thingySize (Composite sts) = sum (map size sts) mkSizedThingy :: Thingy -> Sized Thingy mkSizedThingy t = S (thingySize t) t This has exactly the property you want: due to laziness, the size of a SizedThingy will be computed *only* if it is ever needed; and it will be computed at most once. -Brent