
On Sat, 2009-05-23 at 20:42 -0400, Mario Blažević wrote:
On Sat 23/05/09 2:51 PM , Duncan Coutts duncan.coutts@worc.ox.ac.uk sent:
On Sat, 2009-05-23 at 13:31 -0400, Mario Blažević wrote: ...
So the function is not strict, and I don't understand why GHC should evaluate the arguments before the call.
Right, it's lazy in the first and strict in the second argument. As far as I can see we have no evidence that it is evaluating anything before the call.
When I look at the Core definition of `test', it begins with
\ (n1axl::integer:GHCziIntegerziInternals.Integer)
  (n2axn::integer:GHCziIntegerziInternals.Integer) ->
    %let as1sU :: integer:GHCziIntegerziInternals.Integer =
           base:DataziList.prod1
             (main:Main.factors2
                (base:DataziList.prod1
                   (base:GHCziNum.upzulist main:Main.lvl main:Main.lvl n1axl)
                   base:DataziList.lvl1))
             base:DataziList.lvl1
    %in %case integer:GHCziIntegerziInternals.Integer
              (ghczmprim:GHCziPrim.parzh @ integer:GHCziIntegerziInternals.Integer as1sU)
        %of (dsapq::ghczmprim:GHCziPrim.Intzh)
I recommend using -ddump-simpl, as it produces more readable output.
To my untrained eyes, this looks like it's evaluating
product $ factors $ product [1..n1]
which is the first argument to `parallelize'.
That core code is doing:
  let blah = product $ factors $ product thelist
   in case par# blah of _ -> ...
So it's calling the primitive par# but nothing is forced to WHNF (except the unused dummy return value of par#).
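For anyone following along without the original source, here is a minimal sketch of a function with the strictness described in this thread: lazy in the first argument, strict in the second. The definition below is an assumption made for illustration and is not necessarily the code under discussion; `par` and `pseq` are taken from GHC.Conc in base.

```haskell
import GHC.Conc (par, pseq)

-- Hypothetical definition, for illustration only:
-- spark `a` for possible parallel evaluation (without forcing it here),
-- force `b` to WHNF, then return both.
parallelize :: a -> b -> (a, b)
parallelize a b = a `par` (b `pseq` (a, b))

main :: IO ()
main = print (parallelize (product [1 .. 10 :: Integer])
                          (sum [1 .. 10 :: Integer]))
```

After inlining, a call to a function of this shape would plausibly produce Core like the let/case par# pattern above, with the first argument bound in a let and passed to par# as an unevaluated thunk.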
I assume that %case in Core evaluates the argument to WHNF, just like case in Haskell.
Yes. (In fact case in Core always reduces to WHNF, whereas in Haskell case (...) of bar -> (...) does not force anything.)
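The Haskell-level half of that distinction can be checked with a small sketch. The names lazyCase and strictCase are made up for this example: a variable or wildcard pattern leaves the scrutinee unevaluated, while a constructor pattern forces it to WHNF and no further.

```haskell
-- A variable (or wildcard) pattern never forces the scrutinee,
-- so the `undefined` below is never evaluated.
lazyCase :: Int
lazyCase = case (undefined :: Int) of
  _bar -> 0

-- A constructor pattern forces the scrutinee to WHNF only:
-- the outermost constructor is inspected, its field is not.
strictCase :: Maybe Int -> Int
strictCase m = case m of
  Just _  -> 1
  Nothing -> 2

main :: IO ()
main = do
  print lazyCase                       -- does not crash
  print (strictCase (Just undefined))  -- does not crash either
```

By contrast, strictCase undefined would raise an exception, because matching against Just or Nothing has to evaluate the argument first.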
Then again, I could be completely misinterpreting what Core is, because I can't find any call to `parallelize' before or after that. It appears to be inlined in Core, regardless of whether the pragma
{-# INLINE parallelize #-}
is there or not.
Yes, because even without the pragma, ghc decides it's profitable to inline.
Actually, I can't see any effect of that pragma in the core files whatsoever, but it certainly has an effect on run time.
How about diffing the whole core output (and using -ddump-simpl)? If there's a performance difference then there must be a difference in the core code too.
Or do you mean to say that *your* installation of GHC behaves the same when the function `parallelize' is defined in the same module and when it's imported?
Yes, exactly. With both ghc-6.10.1 and a very recent build of ghc-6.11.

Duncan