
GHC is allowed to do common sub-expression elimination which would cause sharing. For example: module Opt where {-# NOINLINE bar #-} bar :: Int -> Int bar x = x + 3 foo :: Int -> Int foo s = bar s * bar s Here's the core for 'foo': foo = \ (s_ayA :: Int) -> case bar s_ayA of _ [Occ=Dead] { GHC.Types.I# x_aJa -> GHC.Types.I# (GHC.Prim.*# x_aJa x_aJa) } bar is invoked only once. In Michael's original example, CSE doesn't fire, probably because the tuple constructor is lazy. I am not entirely certain what the interaction here is. Edward Excerpts from Tom Ellis's message of 2016-11-07 21:10:41 +0000:
I'm not going to contradict Edward's answer in practice, because he knows far more about the internals of GHC than I do. However, I will contradict it in theory.
There is a straightforward operational semantics for Haskell programs as compiled by GHC via Core to STG (without any optimisations applied). I described this semantics in the following talk at Haskell eXchange 2016:
https://skillsmatter.com/skillscasts/8726-haskell-programs-how-do-they-run#v...
and I wrote it up as an article here:
http://h2.jaguarpaw.co.uk/posts/haskell-programs-how-do-they-run/
Under that semantics, yes, 'someGenerator s' is bound twice and thus is not shared.
Now, GHC may try to apply an "optimisation" and (accidentally?) introduce sharing. That would be a compiler bug, in my opinion. In fact, it would be a terrible blow to Haskell's future as a practical language, because we really need fine-grained control over sharing. So I really hope that Edward is wrong about this.
Tom
On Mon, Nov 07, 2016 at 12:56:02PM -0800, Edward Z. Yang wrote:
It's not guaranteed. Unfortunately there aren't really good ways to avoid sharing; the general advice is to convert values into functions, and apply them at the use site where sharing is OK.
Unrelatedly, in your sample code, dropping 1000000000 entries is not a good way to build a splittable RNG. Check out http://publications.lib.chalmers.se/records/fulltext/183348/local_183348.pdf and also its related work for some bettera pproaches.
Edward
Excerpts from Michael Roth's message of 2016-11-07 20:56:54 +0100:
Hello! A short question, given:
data Seed = ... data Value = ...
someGenerator :: Seed -> [Value]
createTwo :: Seed -> ([Value], [Value]) createTwo s = (as, bs) where as = someGenerator s bs = drop 1000000000 (someGenerator s)
Is it guaranteed that 'someGenerator s' is created twice and not shared between 'as' and 'bs'? Is this by language design? Are there any GHC options that change the behaviour?