Hughes' parallel annotations for fixing a space leak

Dear haskell-cafe,

I've been thinking about space leaks lately. In particular, I've been
studying the tricky example with pairs

    break []     = ([],[])
    break (x:xs) = if x == '\n'
                   then ([],xs)
                   else (x:ys,zs)
                   where (ys,zs) = break xs

which was discussed in the papers

    Jan Sparud. Fixing some space leaks without a garbage collector.
    http://bit.ly/cOxcVJ

    Philip Wadler. Fixing some space leaks with a garbage collector.
    http://homepages.inf.ed.ac.uk/wadler/topics/garbage-collection.html

As I understand it, GHC implements the technique from Sparud's paper, so
this is a solved problem. (It will only kick in when compiled with
optimization, though, so -O0 will make the above example leak in GHC
6.10.4, if I checked that correctly.)

Now, I'm wondering about alternative solutions. Wadler mentions some
sort of parallel combinators

    break xs = (PAR before '\n' ys, PAR after '\n' zs)
        where (ys,zs) = SYNCHLIST xs

which were introduced by John Hughes in his PhD thesis from 1983. They
are intriguing! Unfortunately, I haven't been able to procure a copy of
Hughes' thesis, either electronic or on paper. :( Can anyone help? Are
there any other resources about this parallel approach?

Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com
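For readers who want to poke at the example themselves, here is a
self-contained copy (renamed to breakLine, my choice, to avoid the
clash with Prelude.break):

```haskell
-- A self-contained copy of the example from the text, renamed to
-- avoid shadowing Prelude.break.
breakLine :: String -> (String, String)
breakLine []     = ([], [])
breakLine (x:xs) = if x == '\n'
                   then ([], xs)
                   else (x : ys, zs)
  where (ys, zs) = breakLine xs

main :: IO ()
main = do
    -- Functionally it splits at the first newline...
    print (breakLine "hello\nworld")   -- ("hello","world")
    -- ...but operationally, holding on to only one component of the
    -- pair can retain a thunk that references the entire input;
    -- that retention is the space leak under discussion.
    let (ys, _) = breakLine (replicate 1000 'a' ++ "\nrest")
    print (length ys)                  -- 1000
```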

On 31 March 2010 20:51, Heinrich Apfelmus wrote:
As I understand it, GHC implements the technique from Sparud's paper, so this is a solved problem.
This is not my understanding. As far as I know, the STG machine has a
special notion of "selector thunks", which represent projections from
product data types. These selector thunks are evaluated by the GHC
garbage collector in the manner proposed by Wadler.

The Sparud solution is IMHO much cleaner, though!

Unfortunately, I also have no idea where to obtain a copy of Hughes'
thesis, "The Design and Implementation of Programming Languages".

Cheers,
Max

Max Bolingbroke wrote:
Heinrich Apfelmus wrote:
As I understand it, GHC implements the technique from Sparud's paper, so this is a solved problem.
This is not my understanding. As far as I know, the STG machine has a special notion of "selector thunks", which represent projections from product data types. These selector thunks are evaluated by the GHC garbage collector in the manner proposed by Wadler.
Ah, that's how it is. Thanks. :) Funny that this special garbage
collector support isn't used when compiling with -O0, though. But it
makes sense that you have to use at least -O1 when you care about
resource usage.
The Sparud solution is IMHO much cleaner, though!
I agree. It still requires special support from the garbage collector,
though: the GC has to short-circuit indirection nodes, otherwise the
pairs would be replaced by a long chain of indirection nodes and the
break example would still leak. In a sense, Sparud's idea is about
expressing selector thunks in terms of indirections and mutable thunk
updates.

Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com

On Wed, Mar 31, 2010 at 3:51 PM, Heinrich Apfelmus wrote:
which were introduced by John Hughes in his Phd thesis from 1983. They are intriguing! Unfortunately, I haven't been able to procure a copy of Hughes' thesis, either electronic or in paper. :( Can anyone help? Are there any other resources about this parallel approach?
Aye, returning lazy pairs is one of those things that seems rather
magical in several respects. Out of curiosity, have you looked at the
unsafePerformIO thought experiment near the end of my Monad Reader
article? It demonstrates that returning lazy pairs can introduce
multiple code paths through a single function, reminiscent of (but
different from) multi-mode logical relations. (Mercury, for example,
optimizes relations differently depending on their mode.)

I too am interested in looking at Hughes' thesis; I tried tracking it
down early last year with little success.

Best,
Leon
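One small way to observe the demand-dependent behaviour Leon describes,
without unsafePerformIO: plant a bottom in the input and see which
demands succeed. (The breakLine copy and the bottom-element trick are my
own illustration, not taken from the Monad Reader article.)

```haskell
import Control.Exception (SomeException, evaluate, try)

breakLine :: String -> (String, String)
breakLine []     = ([], [])
breakLine (x:xs) = if x == '\n' then ([], xs) else (x : ys, zs)
  where (ys, zs) = breakLine xs

-- Which "code path" of breakLine runs is decided by the consumer's
-- demand, not by breakLine itself: the suffix contains a bottom
-- element, yet the prefix and even the suffix's length are usable.
p :: (String, String)
p = breakLine ("ab\n" ++ [undefined, 'x'])

main :: IO ()
main = do
    print (fst p)            -- "ab": the bottom element is never touched
    print (length (snd p))   -- 2: the list spine is defined, its elements are not
    r <- try (evaluate (head (snd p))) :: IO (Either SomeException Char)
    putStrLn (either (const "head (snd p) diverges") (: []) r)
```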

Leon Smith wrote:
Heinrich Apfelmus wrote:
which were introduced by John Hughes in his Phd thesis from 1983. They are intriguing! Unfortunately, I haven't been able to procure a copy of Hughes' thesis, either electronic or in paper. :( Can anyone help? Are there any other resources about this parallel approach?
Aye, returning lazy pairs is one of those things that seems rather magical in several respects. Out of curiosity, have you looked at the unsafePerformIO thought experiment near the end of my Monad Reader article? It demonstrates that returning lazy pairs can introduce multiple code paths through a single function, reminiscent of (but different from) multi-mode logical relations. (Mercury, for example, optimizes relations differently depending on their mode.)
Yes, the multiple "code paths" phenomenon is pretty much the essence of
lazy evaluation: only one part of the whole expression is evaluated,
and which part that is depends on the demand pattern.

Thanks to the kind souls on haskell-cafe, I finally understand the
SYNCH primitive. The idea is that in

    let (y,z) = SYNCH x in ...

the variables y and z are bound to x, but the value of x is only
evaluated when *both* y and z are demanded. If you demand only y, then
evaluation will stall. Clearly, this is useless without some form of
parallelism that makes it possible to demand both at the same time.

The SYNCHLIST function is a variation on that; it also guarantees that
the list elements are consumed in lock-step:

    SYNCHLIST xs = (d1 `seq` x1, d2 `seq` x2)
        where
        (d1,d2) = SYNCH xs
        (t1,t2) = SYNCH (tail xs)
        (x1,x2) = case xs of
                    []     -> ([],[])
                    (x:_)  -> (x:t1, x:t2)

In other words, SYNCH binds a value to two different variables without
actually sharing it.

Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com
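In a sequential setting one can at least check the denotational reading
of these combinators: SYNCH merely duplicates its argument, and the
stalling behaviour is purely operational. A sketch under that reading
(synch, synchList, before, and after are my own names; before/after are
my guesses at Wadler's intended helpers, and no actual synchronization
is modelled):

```haskell
-- Denotationally, SYNCH duplicates its argument; the operational rule
-- "evaluate x only once both halves are demanded" is not modelled here.
synch :: a -> (a, a)
synch x = (x, x)

-- Transcription of SYNCHLIST from the text, same structure.
synchList :: [a] -> ([a], [a])
synchList xs = (d1 `seq` x1, d2 `seq` x2)
  where
    (d1, d2) = synch xs
    (t1, t2) = synch (tail xs)
    (x1, x2) = case xs of
                 []    -> ([], [])
                 (x:_) -> (x : t1, x : t2)

-- Wadler's parallel break, minus the PAR annotations.
before, after :: Eq a => a -> [a] -> [a]
before c = takeWhile (/= c)
after  c = drop 1 . dropWhile (/= c)

breakPar :: String -> (String, String)
breakPar xs = (before '\n' ys, after '\n' zs)
  where (ys, zs) = synchList xs
```

Under this reading breakPar agrees with the original break, e.g.
breakPar "ab\ncd" gives ("ab","cd"); what the sketch cannot show is the
lock-step consumption that makes the real combinators leak-free.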
participants (3):
- Heinrich Apfelmus
- Leon Smith
- Max Bolingbroke