more sharing in generated code

Hi, I have a question about sharing at the Haskell run-time level. Suppose we have a record update r { x = f (r x)} and suppose that most of the time f returns it's argument unchanged. I have the following questions: 1. Does the generated code for the record update build an identical record when f returns it's argument unchanged instead of sharing the old one? (I guess yes.) 2. Can we prevent building identical records? Recently I've heard about Q-combinators. Central idea: Change (f :: a -> a) to (f' :: a -> Maybe a) returning Nothing when the value didn't change. Then we can replace the record update with smarter code which preserves more sharing. My question is: Can (or could) we enable more sharing without changing the source code? Ideally the compiler would generate record update code which checks poiter-equality of the updated field value to decide whether a copy of the original record is needed or it can be shared (given an optimisation flag enabled). Thanks, Peter Background info: There was a discussion on the Agda mailing list about decreasing the Agda compiler memory-usage with more sharing[^1]. (The Agda compiler is written in Haskell.) I had the above idea and I was advised to ask about it on some Haskell mailing list. [^1]: https://lists.chalmers.se/pipermail/agda/2012/004485.html

On 03.11.12 10:05 AM, Peter Divianszky wrote:
Suppose we have a record update
r { x = f (r x)}
and suppose that most of the time f returns it's argument unchanged.
Recently I've heard about Q-combinators. Central idea: Change (f :: a -> a) to (f' :: a -> Maybe a) returning Nothing when the value didn't change. Then we can replace the record update with smarter code which preserves more sharing.
Just adding a remark here: I actually played with these Q-combinators, they actually worsened performance of Agda. The problem is that they make the code strict. The performance loss due to strictness outweighted the potential performance gain by increased sharing. Q-combinators were developed by John Harrison in the context of ML, which is strict anyway. I guess a compiler support for smart record update would not have the strictness penalty. Cheers, Andreas -- Andreas Abel <>< Du bist der geliebte Mensch. Theoretical Computer Science, University of Munich Oettingenstr. 67, D-80538 Munich, GERMANY andreas.abel@ifi.lmu.de http://www2.tcs.ifi.lmu.de/~abel/

On 03/11/2012 10:47, Andreas Abel wrote:
On 03.11.12 10:05 AM, Peter Divianszky wrote:
Suppose we have a record update
r { x = f (r x)}
and suppose that most of the time f returns it's argument unchanged.
Recently I've heard about Q-combinators. Central idea: Change (f :: a -> a) to (f' :: a -> Maybe a) returning Nothing when the value didn't change. Then we can replace the record update with smarter code which preserves more sharing.
Just adding a remark here: I actually played with these Q-combinators, they actually worsened performance of Agda. The problem is that they make the code strict. The performance loss due to strictness outweighted the potential performance gain by increased sharing. Q-combinators were developed by John Harrison in the context of ML, which is strict anyway.
I guess a compiler support for smart record update would not have the strictness penalty.
Yes, for that we need a copy first and an update later. This can be implemented by replacing every record update r' = r { x = y } with r' = r { x = unsafePerformIO (cond-update r r' (x r) y) } where we keep the current record update mechanism and implement cond-update like cond-update :: r -> r -> x -> IO x cond-update r_old r_new x_old x_new = do b <- x_old === x_new when b (replace r_new r_old) return x_new where (===) is pointer-equality and replace is a low-level function which replaces thunks in the heap. Peter

On 03/11/2012 11:20, Peter Divianszky wrote:
On 03/11/2012 10:47, Andreas Abel wrote:
On 03.11.12 10:05 AM, Peter Divianszky wrote:
Suppose we have a record update
r { x = f (r x)}
and suppose that most of the time f returns it's argument unchanged.
Recently I've heard about Q-combinators. Central idea: Change (f :: a -> a) to (f' :: a -> Maybe a) returning Nothing when the value didn't change. Then we can replace the record update with smarter code which preserves more sharing.
Just adding a remark here: I actually played with these Q-combinators, they actually worsened performance of Agda. The problem is that they make the code strict. The performance loss due to strictness outweighted the potential performance gain by increased sharing. Q-combinators were developed by John Harrison in the context of ML, which is strict anyway.
I guess a compiler support for smart record update would not have the strictness penalty.
Yes, for that we need a copy first and an update later. This can be implemented by replacing every record update
r' = r { x = y }
with
r' = r { x = unsafePerformIO (cond-update r r' (x r) y) }
where we keep the current record update mechanism and implement cond-update like
cond-update :: r -> r -> x -> IO x cond-update r_old r_new x_old x_new = do b <- x_old === x_new when b (replace r_new r_old) return x_new
where (===) is pointer-equality and replace is a low-level function which replaces thunks in the heap.
a small correction on cond-update: cond-update :: r -> r -> x -> IO x cond-update r_old r_new x_old x_new = do eval x_new b <- x_old === x_new when b (replace r_new r_old) return x_new Evaluation of x_new should be OK because we need x_new eventually. With this change, nested record updates with identical values behave like an identical function with sharing I think. Peter

Hi, Am Samstag, den 03.11.2012, 11:26 +0100 schrieb Peter Divianszky:
This can be implemented by replacing every record update
r' = r { x = y }
with
r' = r { x = unsafePerformIO (cond-update r r' (x r) y) }
where we keep the current record update mechanism and implement cond-update like
cond-update :: r -> r -> x -> IO x cond-update r_old r_new x_old x_new = do eval x_new b <- x_old === x_new when b (replace r_new r_old) return x_new
where (===) is pointer-equality and replace is a low-level function which replaces thunks in the heap.
I think one problem with this approach is that now, until "x r'" is evaluated, you keep both r and r' alive. If r was _not_ retained by anything else, then you are now preventing the GC from freeing it and hence increasing the memory footprint. Greetings, Joachim -- Joachim "nomeata" Breitner mail@joachim-breitner.de | nomeata@debian.org | GPG: 0x4743206C xmpp: nomeata@joachim-breitner.de | http://www.joachim-breitner.de/

On 2012-11-03T10:05+0100, Peter Divianszky wrote:
Suppose we have a record update
r { x = f (r x)}
Hi Peter, I think you mean this: r { x = f (x r) }
1. Does the generated code for the record update build an identical record when f returns it's argument unchanged instead of sharing the old one? (I guess yes.)
In GHC: In your example a new record is built, but all its entries (x in this case) are shared. Here's how I found out: λ> let f = id λ> data R = R { x :: Int } λ> let r = R 0 λ> let u = r { x = f (x r) } λ> :view r λ> :view u :view is from ghc-vis[1], the resulting visualization is attached. In this simple example, when you compile it with -O1 the compiler figures out that u will always be the same as r and shares them. This does not work when it's dynamic whether u will be identical to r. Hope to help, Dennis [1] http://hackage.haskell.org/package/ghc-vis

Hi Dennis,
I think you mean this:
r { x = f (x r) }
Yes, I made a typo. Thanks for advising ghc-vis.
In GHC: In your example a new record is built, but all its entries (x in this case) are shared.
The problem is, that in case of nested records, if an inner record is updated, the whole path to that record is copied. Of course copying just the path to the inner data is a lot better than copying the whole tree of data, but my proposal is about to optimize it further: don't copy of the path if the updated value is identical to the original one. The Agda compiler may benefit by this optimization for example. In fact, the last version of the proposal[^1] will copy the path to keep laziness properties, but it does it in such a way that the GC can collect the copied path instead of trying to collect the old path (which will if it is shared). [^1]: http://www.haskell.org/pipermail/haskell-cafe/2012-November/104311.html Does this make sense to you? Peter
participants (4)
-
Andreas Abel
-
Dennis Felsing
-
Joachim Breitner
-
Peter Divianszky