
I'm having some trouble with excessive memory use in a program that uses a lot of IORefs. I was able to write a much simpler program which exhibits the same sort of behavior. It appears that "modifyIORef" and "writeIORef" leak memory; perhaps they keep a reference to the old value. I tried both ghc-6.8.3 and ghc-6.10.1. Is this a known limitation, or is this a ghc bug, or am I using IORefs in the wrong way? -jim module Main where import Data.IORef import Control.Monad -- Leaks memory leakcheck1 ior = do go 1000000000 where go 0 = return () go n = do modifyIORef ior (+1) go (n-1) -- Leaks memory leakcheck2 ior = do go 1000000000 where go 0 = return () go n = do x <- readIORef ior writeIORef ior (x+1) go (n-1) -- Runs in constant memory leakcheck3 ior = do go 1000000000 where go 0 = return () go n = do x <- readIORef ior go (n-1) main :: IO () main = do ior <- newIORef 0 leakcheck2 ior compiled with: ghc -O2 --make Leak.hs -o Leak

It looks offhand like you're not being strict enough when you put things back in the IORef, and so it's building up thunks of (+1)... With two slight mods: go 0 = return () go n = do modifyIORef ior (+1) go (n-1) --> go 0 = return () go n = do modifyIORef ior (\ x -> let x' = x+1 in x `seq` x') go (n-1) and go n = do x <- readIORef ior writeIORef ior (x+1) go (n-1) --> go n = do x <- readIORef ior writeIORef ior $! x+1 go (n-1) It runs much better (with loop count = 10,000,000) -- leak1 is the code you posted, leak2 is with these changes: rmm@Hugo:~$ ./leak1 +RTS -s ./leak1 +RTS -s 200,296,364 bytes allocated in the heap 365,950,896 bytes copied during GC 66,276,472 bytes maximum residency (7 sample(s)) 1,906,448 bytes maximum slop 131 MB total memory in use (1 MB lost due to fragmentation) <snip> %GC time 75.9% (79.2% elapsed) Alloc rate 977,656,335 bytes per MUT second Productivity 24.0% of total user, 20.5% of total elapsed rmm@Hugo:~$ ./leak2 +RTS -s ./leak2 +RTS -s 160,006,032 bytes allocated in the heap 11,720 bytes copied during GC 1,452 bytes maximum residency (1 sample(s)) 9,480 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) <snip> %GC time 0.5% (0.8% elapsed) Alloc rate 626,590,037 bytes per MUT second Productivity 99.2% of total user, 97.8% of total elapsed -Ross On Jun 18, 2009, at 10:46 PM, Jim Snow wrote:
I'm having some trouble with excessive memory use in a program that uses a lot of IORefs. I was able to write a much simpler program which exhibits the same sort of behavior. It appears that "modifyIORef" and "writeIORef" leak memory; perhaps they keep a reference to the old value. I tried both ghc-6.8.3 and ghc-6.10.1.
Is this a known limitation, or is this a ghc bug, or am I using IORefs in the wrong way?
-jim
module Main where
import Data.IORef import Control.Monad
-- Leaks memory leakcheck1 ior = do go 1000000000 where go 0 = return () go n = do modifyIORef ior (+1) go (n-1)
-- Leaks memory leakcheck2 ior = do go 1000000000 where go 0 = return () go n = do x <- readIORef ior writeIORef ior (x+1) go (n-1)
-- Runs in constant memory leakcheck3 ior = do go 1000000000 where go 0 = return () go n = do x <- readIORef ior go (n-1)
main :: IO () main = do ior <- newIORef 0 leakcheck2 ior
compiled with: ghc -O2 --make Leak.hs -o Leak _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Thu, Jun 18, 2009 at 9:55 PM, Ross Mellgren
It looks offhand like you're not being strict enough when you put things back in the IORef, and so it's building up thunks of (+1)...
With two slight mods:
go 0 = return () go n = do modifyIORef ior (+1) go (n-1)
-->
go 0 = return () go n = do modifyIORef ior (\ x -> let x' = x+1 in x `seq` x') go (n-1)
Just a slight prettification of that line: modifyIORef ior ((1+) $!) Or applied prefix if you prefer. Prefix ($!) has the nice interpretation as the HOF that makes its argument into a strict function. Luke

Luke Palmer wrote:
On Thu, Jun 18, 2009 at 9:55 PM, Ross Mellgren
mailto:rmm-haskell@z.odi.ac> wrote: It looks offhand like you're not being strict enough when you put things back in the IORef, and so it's building up thunks of (+1)...
With two slight mods:
go 0 = return () go n = do modifyIORef ior (+1) go (n-1)
-->
go 0 = return () go n = do modifyIORef ior (\ x -> let x' = x+1 in x `seq` x') go (n-1)
Just a slight prettification of that line:
modifyIORef ior ((1+) $!)
Or applied prefix if you prefer. Prefix ($!) has the nice interpretation as the HOF that makes its argument into a strict function.
Luke
do modifyIORef ior (\ x -> let x' = x+1 in x `seq` x') and do modifyIORef ior ((1+) $!) both still leak memory for me. However, do x <- readIORef ior writeIORef ior $! x+1 runs in constant space. I was able to fix my original program, and now it uses a predictable amount of memory. Thanks! -jim

D'oh, yeah that is better. You know, I actually had that and had expanded it because I was going to seq both the input and the result of the (+1), but punted on it and didn't switch back to the more compact format. -Ross On Jun 19, 2009, at 12:45 AM, Luke Palmer wrote:
On Thu, Jun 18, 2009 at 9:55 PM, Ross Mellgren
wrote: It looks offhand like you're not being strict enough when you put things back in the IORef, and so it's building up thunks of (+1)... With two slight mods:
go 0 = return () go n = do modifyIORef ior (+1) go (n-1)
-->
go 0 = return () go n = do modifyIORef ior (\ x -> let x' = x+1 in x `seq` x') go (n-1)
Just a slight prettification of that line:
modifyIORef ior ((1+) $!)
Or applied prefix if you prefer. Prefix ($!) has the nice interpretation as the HOF that makes its argument into a strict function.
Luke

I'm having some trouble with excessive memory use in a program that uses a lot of IORefs. I was able to write a much simpler program which exhibits the same sort of behavior. It appears that "modifyIORef" and "writeIORef" leak memory; perhaps they keep a reference to the old value. I tried both ghc-6.8.3 and ghc-6.10.1.
modifyIORef and writeIORef are not sufficiently strict for your needs. See this recent thread: http://www.nabble.com/Stack-overflow-td23746120.html Tim

jsnow:
I'm having some trouble with excessive memory use in a program that uses a lot of IORefs. I was able to write a much simpler program which exhibits the same sort of behavior. It appears that "modifyIORef" and "writeIORef" leak memory; perhaps they keep a reference to the old value. I tried both ghc-6.8.3 and ghc-6.10.1.
Is this a known limitation, or is this a ghc bug, or am I using IORefs in the wrong way?
-jim
module Main where
import Data.IORef import Control.Monad
-- Leaks memory leakcheck1 ior = do go 1000000000 where go 0 = return () go n = do modifyIORef ior (+1) go (n-1)
It is not possible to write a modifyIORef that *doesn't* leak memory! -- Don

dvde:
Don Stewart schrieb:
It is not possible to write a modifyIORef that *doesn't* leak memory!
Why? Or can one read about it somewhere?
Try writing a version of this program, using modifyIORef only, such that it doesn't exhaust the heap: import Data.IORef import Control.Monad import System.IO.Unsafe ref :: IORef Int ref = unsafePerformIO $ newIORef 0 {-# NOINLINE ref #-} main = do modifyIORef ref (\a -> a + 1) main Run it in a constrained environment, so you don't thrash: $ ./A +RTS -M100M Heap exhausted; Current maximum heap size is 99999744 bytes (95 MB); use `+RTS -M<size>' to increase it. The goal is to run in constant space. -- Don

Don Stewart schrieb:
dvde:
Don Stewart schrieb:
It is not possible to write a modifyIORef that *doesn't* leak memory!
Why? Or can one read about it somewhere?
Try writing a version of this program, using modifyIORef only, such that it doesn't exhaust the heap:
import Data.IORef import Control.Monad import System.IO.Unsafe
ref :: IORef Int ref = unsafePerformIO $ newIORef 0 {-# NOINLINE ref #-}
main = do modifyIORef ref (\a -> a + 1) main
Run it in a constrained environment, so you don't thrash:
$ ./A +RTS -M100M Heap exhausted; Current maximum heap size is 99999744 bytes (95 MB); use `+RTS -M<size>' to increase it.
The goal is to run in constant space.
-- Don
Hm, do you say it is not possible to write a modifyIORef function that does not leak memory, or do you say it is not possible to use the (existing) modifyIORef without having memory leaks? I wrote the following which runs in constant space, but it introduces strictness, which is not always desirable. And yes, using only modifyIORef this could not be done this way, because the strict evaluation happens on the IO-Monad-level. But such examples occured already in this thread. import Data.IORef import Control.Monad import System.IO.Unsafe ref :: IORef Int ref = unsafePerformIO $ newIORef 0 {-# NOINLINE ref #-} main = do myModifyIORef ref (\a -> a + 1) main myModifyIORef :: IORef a -> (a->a) -> IO () myModifyIORef ref f = do a <- readIORef ref let a' = f a seq a' $ writeIORef ref a' So would it make sense to create a strict modifyIORef' function? best regards, daniel

dvde:
Don Stewart schrieb:
dvde:
Don Stewart schrieb:
It is not possible to write a modifyIORef that *doesn't* leak memory!
Why? Or can one read about it somewhere?
Try writing a version of this program, using modifyIORef only, such that it doesn't exhaust the heap:
import Data.IORef import Control.Monad import System.IO.Unsafe
ref :: IORef Int ref = unsafePerformIO $ newIORef 0 {-# NOINLINE ref #-}
main = do modifyIORef ref (\a -> a + 1) main
Run it in a constrained environment, so you don't thrash:
$ ./A +RTS -M100M Heap exhausted; Current maximum heap size is 99999744 bytes (95 MB); use `+RTS -M<size>' to increase it.
The goal is to run in constant space.
-- Don
Hm, do you say it is not possible to write a modifyIORef function that does not leak memory, or do you say it is not possible to use the (existing) modifyIORef without having memory leaks?
The latter. atomicModifyIORef is harder though still, since it is a primop with the same properties as modifyIORef :/
So would it make sense to create a strict modifyIORef' function?
Very much so. In fact, I'd argue the vast majority of uses are for the WHNF-strict version. -- Don

The latter. atomicModifyIORef is harder though still, since it is a primop with the same properties as modifyIORef :/
So would it make sense to create a strict modifyIORef' function?
Very much so. In fact, I'd argue the vast majority of uses are for the WHNF-strict version.
I just fixed a leak with atomicModifyIORef that was exactly this problem. If it had at least been documented I wouldn't have had to do that. So I'm going to submit a library proposal to either 1) strictify atomicModifyIORef, 2) add atomicModifyIORef', or at the least 3) add documentation that says "this function leaks". Same story for modifyIORef of course. The only workaround I could find is to immediately read the value back out and 'seq' on it, but it's ugly. So two questions: writeIORef doesn't have this problem. If I am just writing a simple value, is writeIORef atomic? In other words, can I replace 'atomicModifyIORef r (const (x, ())' with 'writeIORef r x'? Any reason to not do solution 1 above?

Evan Laforge
The only workaround I could find is to immediately read the value back out and 'seq' on it, but it's ugly.
Yep! C'est la vie unfortunately. The way atomicModifyIORef works is that the new value isn't actually evaluated at all; GHC just swaps the old value with a thunk which will do the modification when the value is demanded. It's done like that so that the atomic modification can be done with a compare-and-swap CPU instruction; a fully-fledged lock would have to be taken otherwise, because your function could do an unbounded amount of work. While that's happening, other mutator threads could be writing into your memory cell, having read the same old value you did, and then *splat*, the souffle is ruined. Once you're taking a lock, you've got yourself an MVar. This is why IORefs are generally (always?) faster than MVars under contention; the lighter-weight lock mechanism means mutator threads don't block, if the CAS fails atomicModifyIORef just tries again in a busy loop. (I think!) Of course, the mutator threads themselves then tend to bump into each other or do redundant work when it's time to evaluate the thunks (GHC tries to avoid this using "thunk blackholing"). Contention issues here have gotten radically better in recent versions of GHC I think. Forgive me if I've gotten anything wrong here, I think Simon Marlow might be the only person who *really* understands how all this stuff works. :)
So two questions:
writeIORef doesn't have this problem. If I am just writing a simple value, is writeIORef atomic? In other words, can I replace 'atomicModifyIORef r (const (x, ())' with 'writeIORef r x'?
Any reason to not do solution 1 above?
Well if you're not inspecting or using the old value then it's safe to
just blow it away, yes.
Cheers,
G
--
Gregory Collins

Correct, here's a video of Simon explaining the thunk blackholing
issue and its solution in GHC 7:
http://vimeo.com/15573590
On 15 October 2010 21:31, Gregory Collins
Evan Laforge
writes: The only workaround I could find is to immediately read the value back out and 'seq' on it, but it's ugly.
Yep! C'est la vie unfortunately.
The way atomicModifyIORef works is that the new value isn't actually evaluated at all; GHC just swaps the old value with a thunk which will do the modification when the value is demanded.
It's done like that so that the atomic modification can be done with a compare-and-swap CPU instruction; a fully-fledged lock would have to be taken otherwise, because your function could do an unbounded amount of work. While that's happening, other mutator threads could be writing into your memory cell, having read the same old value you did, and then *splat*, the souffle is ruined.
Once you're taking a lock, you've got yourself an MVar. This is why IORefs are generally (always?) faster than MVars under contention; the lighter-weight lock mechanism means mutator threads don't block, if the CAS fails atomicModifyIORef just tries again in a busy loop. (I think!)
Of course, the mutator threads themselves then tend to bump into each other or do redundant work when it's time to evaluate the thunks (GHC tries to avoid this using "thunk blackholing"). Contention issues here have gotten radically better in recent versions of GHC I think.
Forgive me if I've gotten anything wrong here, I think Simon Marlow might be the only person who *really* understands how all this stuff works. :)
So two questions:
writeIORef doesn't have this problem. If I am just writing a simple value, is writeIORef atomic? In other words, can I replace 'atomicModifyIORef r (const (x, ())' with 'writeIORef r x'?
Any reason to not do solution 1 above?
Well if you're not inspecting or using the old value then it's safe to just blow it away, yes.
Cheers,
G -- Gregory Collins
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Push the envelope. Watch it bend.

Don Stewart wrote:
dvde:
Don Stewart schrieb:
It is not possible to write a modifyIORef that *doesn't* leak memory!
Why? Or can one read about it somewhere?
Try writing a version of this program, using modifyIORef only, such that it doesn't exhaust the heap:
import Data.IORef import Control.Monad import System.IO.Unsafe
ref :: IORef Int ref = unsafePerformIO $ newIORef 0 {-# NOINLINE ref #-}
main = do modifyIORef ref (\a -> a + 1) main
Run it in a constrained environment, so you don't thrash:
$ ./A +RTS -M100M Heap exhausted; Current maximum heap size is 99999744 bytes (95 MB); use `+RTS -M<size>' to increase it.
The goal is to run in constant space.
-- Don _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Thanks, that's good to know. do x <- readIORef ior writeIORef ior $! (x+1) Works for me. The laziness of modifyIORef and workarounds would be a good thing to have documented in the modifyIORef docs [1], since it's probably a common source of memory leaks. I'd also be in favor of a strict version of modifyIORef. [1] http://www.haskell.org/ghc/dist/current/docs/libraries/base/Data-IORef.html#... -jim

Jim Snow
Works for me. The laziness of modifyIORef and workarounds would be a good thing to have documented in the modifyIORef docs, since it's probably a common source of memory leaks. I'd also be in favor of a strict version of modifyIORef.
http://hackage.haskell.org/packages/archive/strict-io/0.1/doc/html/Data-IORe...
?
G.
--
Gregory Collins

It is not possible to write a modifyIORef that *doesn't* leak memory!
Why? Or can one read about it somewhere?
Possibly, Don meant that 'modifyIORef' is defined in a way that does not allow to enforce evaluation of the result of the modification function (a typical problem with fmap-style library functions): modifyIORef ref f = readIORef ref >>= writeIORef ref . f No matter whether 'f' is strict or not, the 'writeIORef r' doesn't evaluate its result, just stores the unevaluated application: > r<-newIORef 0 > modifyIORef r (\x->trace "done" $ x+1) > modifyIORef r (\x->trace "done" $ x+1) > readIORef r done done 2 If it had been defined like this instead mRef r ($) f = readIORef r >>= (writeIORef r $) . f it would be possible to transform the strictness of 'writeIORef r' to match that of 'f': > r<-newIORef 0 > mRef r ($) (\x->trace "done" $ x+1) > mRef r ($) (\x->trace "done" $ x+1) > readIORef r done done 2 > r<-newIORef 0 > mRef r ($!) (\x->trace "done" $ x+1) done > mRef r ($!) (\x->trace "done" $ x+1) done > readIORef r 2 Claus

Yes I guessed that. Thanks, Daniel Claus Reinke schrieb:
It is not possible to write a modifyIORef that *doesn't* leak memory!
Why? Or can one read about it somewhere?
Possibly, Don meant that 'modifyIORef' is defined in a way that does not allow to enforce evaluation of the result of the modification function (a typical problem with fmap-style library functions):
modifyIORef ref f = readIORef ref >>= writeIORef ref . f
No matter whether 'f' is strict or not, the 'writeIORef r' doesn't evaluate its result, just stores the unevaluated application:
r<-newIORef 0 modifyIORef r (\x->trace "done" $ x+1) modifyIORef r (\x->trace "done" $ x+1) readIORef r done done 2
If it had been defined like this instead
mRef r ($) f = readIORef r >>= (writeIORef r $) . f
it would be possible to transform the strictness of 'writeIORef r' to match that of 'f':
r<-newIORef 0 mRef r ($) (\x->trace "done" $ x+1) mRef r ($) (\x->trace "done" $ x+1) readIORef r done done 2 r<-newIORef 0 mRef r ($!) (\x->trace "done" $ x+1) done mRef r ($!) (\x->trace "done" $ x+1) done readIORef r 2
Claus
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (10)
-
Claus Reinke
-
Daniel van den Eijkel
-
Don Stewart
-
Evan Laforge
-
Gregory Collins
-
Jim Snow
-
Luke Palmer
-
Ross Mellgren
-
Thomas Schilling
-
Tim Docker