
Am Samstag 16 Januar 2010 18:53:33 schrieb Will Ness:
Daniel Fischer
writes: Am Donnerstag 14 Januar 2010 08:25:48 schrieb Will Ness:
Daniel Fischer
writes: Am Mittwoch 13 Januar 2010 10:43:42 schrieb Heinrich Apfelmus:
I wonder whether it's really the liveness of pair in
mergeSP (a,b) pair = let sm = spMerge b (fst pair) in (a ++ fst sm, merge (snd sm) (snd pair))
that is responsible for the space leak, ...
I think that is responsible. At least that's how I understand the core:
mergeSP (a,b) ~(c,d) = (a ++ bc, merge b' d) where (bc, b') = spMerge b c spMerge ...
That is equivalent to
first (a++) . second (`merge`d) $ spMerge b c
and Daniel's fix is equivalent to
first (a++) $ spMerge b c d
That should've been
mergeSP (a,b) p = first(a++) . second(`merge`snd p) $ spMerge b (fst p)
and
mergeSP (a,b) p = first(a++) $ spMerge b (fst p) (snd p)
The code fragments you've posted are essentially
mergeSP (a,b) p = let res = case p of (c,_) -> case spMerge b c of (# x,y #) -> (x,y) in (# (++) a (case res of (bc,_)-> bc) , case res of (_,b') -> case p of (_,d) -> merge b' d #)
and
mergeSP (a,b) p = let res = case p of (c,d) -> case spMerge b c d of (# x,y #) -> (x,y) in (# (++) a (case res of (bc,_)-> bc) , case res of (_,b') -> b' #)
This looks like Haskell to me, with many detailes explicitely written out, probaly serving as immediate input to the compiler - not its output. So it can't say to us much about how this is actually implemented on the lower level. (?)
It's 'core', the intermediate language GHC uses and does its transformations upon. It's more or less a slimmed down version of Haskell. That came from -ddump-simpl, thus it's the output of the simplifier, after numerous transformation/optimisation passes. I think that is then fairly directly translated to assembly (via Cmm), the STG to STG and Cmm passes do not much optimisation anymore (I may be wrong, what know I about the compiler. However, I've found that the -ddump-simpl output corresponds well to the actual behaviour whenever I look.). I find that much more redable than the -fext-core output (http://www.haskell.org/ghc/docs/6.12.1/html/users_guide/ext-core.html "GHC can dump its optimized intermediate code (said to be in “Core” format) to a file as a side-effect of compilation."), which says the same, only more verbose and less readable. Of course, if I could read assembly, that would exactly reveal what happens.
Your theory would certainly hold if the translation was done one-to-one without any further code rearrangements. But it could've been further re-arranged by the compiler at some later stage (is there one?) into an equivalent of, e.g.
mergeSP (a,b) p = let (bc,b') = case p of (c,_) -> case spMerge b c of (x,y) -> (x,y) in (# (++) a bc , case p of (_,d) -> merge b' d #)
and further,
mergeSP (a,b) p = let (c,d) = case p of (x,y) -> (x,y) (bc,b') = case spMerge b c of (x,y) -> (x,y) in (# (++) a bc , merge b' d #)
could it? This would take hold on /d/ and /c/ at the same time, right?
It would. But AFAICT, after the simplifier is done, no such rearrangements occur anymore.
What is that code that you've shown exactly? At which stage is it produced and is it subject to any further manipulation?
I'm no GHC expert either, so I don't know what it does when exactly. But after parsing and desugaring, there come a few iterations of the simplifier, intermingled with specialising, demand-analysis, CSE, let- floating, worker-wrapper-split, ... . At the end of all that, the Tidy Core is generated (part of which I posted). What happens from then on, well ...
I apologise if these are obvious questions, I don't know anything about GHC. I also don't know what (# x,y #) means?
Unboxed tuple (pair in this case). That means, have the components for themselves, don't wrap them in a (,) constructor.
One thing seems certain - we should not hold explicit references to same entities in different parts of our code, to avoid space leaks with more confidence.
Except when it's good to share ;)
To make code look as much tail-recursive as possible, so to speak.
Tail recursion isn't really important (for GHC at least, I think for lazy languages in general), due to different evaluation models (cf. http://www.haskell.org/pipermail/haskell-cafe/2009-March/058607.html and the thread starting with http://www.haskell.org/pipermail/haskell-cafe/2007-May/025570.html).
Does that make sense?
In general, yes.
Anyway that was a very educational (for me) and fruitful discussion, and I greatly appreciate your help, and fixing and improving of the code.
Thanks!
You're welcome.