Help understanding sharing

I'm new to Haskell and trying to get a better understanding of sharing (and ultimately memoization). I've read SOE and various tutorials, as well as browsing around the wiki and old mailing lists.

Most of the examples of memoization seem to revolve around Fibonacci, and are based either on the fact that a list defined within the function will get shared between calls, or on doing some unsafe IO (which I haven't dug too deeply into). I've read various discussions that explain why function calls are generally not automatically memoized (e.g. f x gets recalculated rather than looked up based on the prior result). The rationale for that (a big space leak and no guarantee of improved performance) makes sense. (Though I did like one poster's suggestion of a compiler pragma that hints that a particular function should be memoized.)

I've seen other discussions that suggest that lists are always shared while in scope (so the fibs trick works). But is that just a feature of the standard compilers, or is it mandated somewhere in the Haskell spec? (I don't see anything obvious in the Haskell Report, though I haven't read it cover to cover.) The wiki page http://www.haskell.org/haskellwiki/Performance/Strictness says laziness == non-strictness + sharing, but again nowhere gives a set of rules that guarantees what will be shared and what won't. I was hoping I might find it here: http://www.haskell.org/haskellwiki/Sharing but no such luck. Or are there no guarantees and you just have to know how your particular compiler works?

Cheers,
Patrick
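(For reference, the "fibs trick" mentioned above is usually written as a self-referential top-level list; the following is a standard sketch, not taken from the original message. Because the list is bound to a name, each call to fib walks the same lazily built list instead of recomputing earlier values, which is the memoization effect the question is about.)

    fibs :: [Integer]
    fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

    fib :: Int -> Integer
    fib n = fibs !! n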

Patrick Surry wrote:
I've seen other discussions that suggest that lists are always shared while in scope (so the fibs trick works). But is that just a feature of the standard compilers, or is it mandated somewhere in the Haskell spec? (I don't see anything obvious in the Haskell Report, though I haven't read it cover to cover.)
It is just a feature of most compilers. The Haskell Report does not specify sharing.

For most compilers, a sufficient condition for sharing is aliasing, e.g.,

    let y = f x in (y,y,y,y,y)

You can be sure that most compilers share one copy of "f x" for those five mentions of "y".

As another example,

    let x = 0:x in x

You can be sure that most compilers create a tight cyclic graph for that.

In contrast, most compilers may redundantly create new expressions for the following:

    (f x, f x, f x, f x, f x)

    -- or, given the definition:  recurse f = f (recurse f)
    recurse (\x -> 0:x)
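(A small way to observe this, not from the original reply: Debug.Trace.trace prints a message when its argument is forced, so you can count how many times "f x" is actually evaluated. The names shared and unshared below are made up for illustration, and with optimisation a compiler may merge the unshared applications anyway, so treat this as a sketch of typical unoptimised behaviour.)

    import Debug.Trace (trace)

    f :: Int -> Int
    f x = trace "evaluating f x" (x * 2)

    -- aliasing: one thunk for f 1, forced at most once
    shared :: (Int, Int, Int)
    shared = let y = f 1 in (y, y, y)

    -- no aliasing: three syntactically separate applications,
    -- typically three separate thunks (unless the compiler merges them)
    unshared :: (Int, Int, Int)
    unshared = (f 1, f 1, f 1)

Forcing shared in GHCi usually prints the trace message once; forcing unshared usually prints it three times.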

To add to this: sharing is not always what you want; sometimes it's a time/space trade-off, and sometimes it's actually strictly worse than not sharing. For example:
    import Prelude hiding (sum)
    import Data.List (foldl')

    f :: Integer -> [Integer]
    f x = take 10000000 [x..]

    -- replaces Prelude.sum with a strict left fold
    sum :: [Integer] -> Integer
    sum = foldl' (+) 0

    expensiveZero :: Integer
    expensiveZero = let (a, b) = (f 0, f 0) in sum a + sum (map negate b)
If the applications of f are unshared, expensiveZero runs in small constant memory. But if the applications of f are shared, it will likely exhaust memory (if it doesn't, add another few zeroes to the "take" in f).
Here's why. Assume that (+) evaluates its left argument first. Then
"sum a" is going to consume the entire list stored in "a". If the
applications of f are unshared, the garbage collector will reclaim the
beginning of the list "a" while sum is evaluating! But if they are
shared, it can't; b is the same list and is still live until the rhs
of the (+) gets evaluated. So the entire list will end up in memory!
Not only that, the program will likely take longer to run than the
unshared version, because the garbage collector has so much more work
to do maintaining the live data set.
This is why most compilers use aliasing of names for sharing; it gives
the programmer control over whether a computation will be shared or not.
-- ryan
participants (3):
- Albert Y. C. Lai
- Patrick Surry
- Ryan Ingram