Brent & Yeu,
I recently ran into the same question. You can see the thread[1] which includes lots of references to papers that describe the behavior you're seeing along with examples.
Implementations of call-by-name lambda calculus all tend to have the same runtime behavior that you're describing. It's usually called sharing.
You can look at how ghc reduces expressions like this directly(see [2] and [3]). However, this is really low level and doesn't help us understand the way other Haskell implementations work.
John Launchbury[4] came along and said "hey, these denotational definitions of call-by-name lambda calculus don't help us understand the runtime of current implementations". He developed a high level operational semantics for call-by-name that correctly, and simply, accounts for sharing across all implementations. I found the paper accessible and very helpful when analyzing the sometimes mysterious runtime behavior of Haskell.
Ariola et. all[5] improved upon on Launchbury's work and made an even more high level operational semantics for call-by-name which they called, confusingly enough, "call-by-need". This paper is not as accessible as Launchbury, but presents a clever way to describe how sharing works without using a heap construct.
If you read carefully through the thread I mentioned in the beginning and Launchbury's paper, you should be equipped to write down reductions by hand that illustrate the different runtime behaviors of your semantically equivalent expressions.
David
--
David Sankel
Sankel Software
www.sankelsoftware.com
585 617 4748 (Office)