
On Wed, Mar 29, 2006 at 12:50:02PM +0100, Jon Fairbairn wrote:
[...]
but add [a] pragma[s] to the effect that evaluation should be input driven, and that ll, ww, and cc are to be given equal time. Something like {-# STEPPER cs; ROUND_ROBIN ll,ww,cc #-} (please do not take this as a suggestion of real syntax!).
The way I would implement this is to add a new primitive, STEP, which is like seq except that it only evaluates its argument until it encounters another STEP. (It really isn't much different to seq).
[...]
It seems to me that this wouldn't take much effort to implement, but it would provide a simple means of removing space leaks from a whole bunch of programmes without mangling the source code much.
Actually, it may require no effort from compiler implementors. I just managed to get the desired effect in current GHC! :-) I implemented your idea of stepper by writing the function stepper that rewrites the list invoking "yield" every 500 processed elements. This way I can concurrently consume the list without the space leak - when a thread evaluates too many list elements, it gets preempted. I think it suffices if RTS employs a round-robin scheduler. I am not sure it's important. The code isn't as beautiful as the naive wc implementation. That's because I haven't yet thought how to hide newEmptyMVar, forkIO, putMVar i takeMVar. Perhaps someone will come up with a solution to this. import Control.Concurrent import Control.Monad import System.IO.Unsafe (unsafePerformIO) stepper l = s n l where n = 500 s 0 (x:xs) = unsafePerformIO $ do yield return (x : s n xs) s i (x:xs) = x : s (i-1) xs s _ [] = [] main = do cs <- liftM stepper getContents ll <- newEmptyMVar ww <- newEmptyMVar cc <- newEmptyMVar forkIO $ putMVar ll $! length (lines cs) forkIO $ putMVar ww $! length (words cs) forkIO $ putMVar cc $! length cs takeMVar ll >>= print takeMVar ww >>= print takeMVar cc >>= print See how well it works: $ cat words words words words | ./A +RTS -sstderr ./A +RTS -K8M -sstderr 394276 394272 3725868 <- that's the size of cs 643,015,284 bytes allocated in the heap 72,227,708 bytes copied during GC 109,948 bytes maximum residency (46 sample(s)) <- no space leak! 2452 collections in generation 0 ( 0.33s) 46 collections in generation 1 ( 0.00s) 2 Mb total memory in use <- no space leak! INIT time 0.00s ( 0.01s elapsed) MUT time 1.25s ( 1.27s elapsed) GC time 0.33s ( 0.36s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 1.58s ( 1.64s elapsed) %GC time 20.9% (22.0% elapsed) Alloc rate 514,412,227 bytes per MUT second Productivity 79.1% of total user, 76.2% of total elapsed Thanks for your idea, Jon! :-) Best regards Tomasz