STM, newArray, and a stack overflow?

Hi, I'm seeing some weirdness here. My code does this: omap <- atomically $ (newArray_ (0,n-1) :: STM (TArray Int Int)) This gives a stack overflow when n is one million. The equivalent code in the ST monad doesn't show this behavior, and works as expected. I'm not sure what is going on here, so I thought I'd ask in case anybody can put me on the right track. -k -- If I haven't seen further, it is by standing in the footprints of giants

On 23 March 2011 13:35, Ketil Malde
I'm seeing some weirdness here. My code does this:
omap <- atomically $ (newArray_ (0,n-1) :: STM (TArray Int Int))
This gives a stack overflow when n is one million.
I can't reproduce this with ghci-7.0.2 (base-4.3.1.0, array-0.3.0.2 and stm-2.2.0.1). What version of ghc/ghci are you using? Bas

On Wed, Mar 23, 2011 at 2:22 PM, Bas van Dijk
On 23 March 2011 13:35, Ketil Malde
wrote: I'm seeing some weirdness here. My code does this:
omap <- atomically $ (newArray_ (0,n-1) :: STM (TArray Int Int))
This gives a stack overflow when n is one million.
I can't reproduce this with ghci-7.0.2 (base-4.3.1.0, array-0.3.0.2 and stm-2.2.0.1). What version of ghc/ghci are you using?
I can't reproduce it either, using ghci-6.12.3, base-4.2.0.2, array-0.3.0.1 and stm-2.2.0.1. However it takes a loooong time and a lot of CPU time =). Cheers, -- Felipe.

Bas van Dijk
omap <- atomically $ (newArray_ (0,n-1) :: STM (TArray Int Int))
I can't reproduce this with ghci-7.0.2 (base-4.3.1.0, array-0.3.0.2 and stm-2.2.0.1).
It works (calling the same function) from GHCi, but breaks when compiled. Also when compiling with -O0. I'm probably doing something wrong, but I can't for the life of me work out what it might be. I've surrounded the offending newArray with print statements, so I can't see how it could be something else.
What version of ghc/ghci are you using?
Happens with GHC 6.12.1 and 7.0.2. 6.12.1, array-0.3.0.0, stm-2.1.1.2 7.0.2, array-0.3.0.2, stm-2.2.0.1 -k -- If I haven't seen further, it is by standing in the footprints of giants

On 03/23/2011 10:34 AM, Ketil Malde wrote:
It works (calling the same function) from GHCi, but breaks when compiled. Also when compiling with -O0.
Confirmed for GHC 7.0.2. Works fine in GHCi, but compiling it (in my case, with -O) and running the executable causes a stack overflow unless I run it with +RTS -K16m, and even then it spends 90% of its time in GC. This looks like it is probably a bug, to me. Maybe it should be reported in GHC's Trac? - Jake

On 23 March 2011 17:19, Jake McArthur
On 03/23/2011 10:34 AM, Ketil Malde wrote:
It works (calling the same function) from GHCi, but breaks when compiled. Also when compiling with -O0.
Confirmed for GHC 7.0.2. Works fine in GHCi, but compiling it (in my case, with -O) and running the executable causes a stack overflow unless I run it with +RTS -K16m, and even then it spends 90% of its time in GC. This looks like it is probably a bug, to me. Maybe it should be reported in GHC's Trac?
It looks like a bug indeed. The problem can be reduced to just: atomically $ replicateM 1000000 (newTVar undefined) or even simpler: replicateM 1000000 (newIORef undefined) Bas

On 23 March 2011 18:42, Bas van Dijk
On 23 March 2011 17:19, Jake McArthur
wrote: On 03/23/2011 10:34 AM, Ketil Malde wrote:
It works (calling the same function) from GHCi, but breaks when compiled. Also when compiling with -O0.
Confirmed for GHC 7.0.2. Works fine in GHCi, but compiling it (in my case, with -O) and running the executable causes a stack overflow unless I run it with +RTS -K16m, and even then it spends 90% of its time in GC. This looks like it is probably a bug, to me. Maybe it should be reported in GHC's Trac?
It looks like a bug indeed.
The problem can be reduced to just:
atomically $ replicateM 1000000 (newTVar undefined)
or even simpler:
replicateM 1000000 (newIORef undefined)
Bas
Maybe it's not really a bug: For example the following very similar program also overflows the stack: (note that: replicateM n x = sequence (replicate n x)) main = sequence $ replicate 1000000 $ (randomIO :: IO Int) This happens because sequence is defined using a right fold: sequence ms = foldr k (return []) ms where k m m' = do x <- m xs <- m' return (x:xs) What happens is that sequence repeatedly pushes an x onto the stack then continues with m' until your stack overflows. The stack overflow disappears when you use a left fold: sequencel xs = foldl k (\r -> return $ r []) xs id where k g m = \r -> do x <- m g (r . (x:)) or written with explicit recursion: sequencel xs = go xs id where go [] r = return $ r [] go (m:ms) r = do x <- m go ms (r . (x:)) Note that I used a difference list to keep the list in the right order. Alternatively you can use a normal list (x:r) and reverse it when done. I'm not sure what's more efficient. I'm surprised I haven't encountered this problem with sequence before. Does this suggest we need the left folded sequencel? Regards, Bas

I fixed the bug in the newArray method of a TArray: http://hackage.haskell.org/trac/ghc/ticket/5042 Bas

Bas van Dijk
sequence ms = foldr k (return []) ms where k m m' = do x <- m xs <- m' return (x:xs)
Isn't this really a strictness problem with the STM monad? If I understand correctly, this forces xs before x can be examined. Something to be fed to listArray should be able to be consumed lazily, shouldn't it?
Note that I used a difference list to keep the list in the right order. Alternatively you can use a normal list (x:r) and reverse it when done. I'm not sure what's more efficient.
Hm, they will all need to allocate everything on the heap, no? Shouldn't it be possible to create an array in a loop with only constant memory overhead?
Does this suggest we need the left folded sequencel?
Or unsafeInterleaveSTM? It's interesting to see that newArray is using listArray under the covers, I really want to initialize the array from a list - but neither a listArray function nor the TArray constructor seem to be available. Thanks for helping out! -k -- If I haven't seen further, it is by standing in the footprints of giants

On 23 March 2011 21:07, Ketil Malde
Shouldn't it be possible to create an array in a loop with only constant memory overhead?
I think it should. Maybe we need something like this: unsafeArrayM :: Ix i => (i, i) -> Int -> IO e -> IO (Array i e) unsafeArrayM (l,u) n@(I# n#) (IO f) = IO $ \s1# -> case newArray# n# arrEleBottom s1# of (# s2#, marr# #) -> let go i# s# | i# ==# n# = case unsafeFreezeArray# marr# s# of (# s3#, arr# #) -> (# s3#, Array l u n arr# #) | otherwise = case f s# of (# s3#, x #) -> case writeArray# marr# i# x s3# of s4# -> go (i# +# 1#) s4# in go 0# s2# The given IO computation can then be something like: unsafeIOToSTM $ newTVar e. Note that I haven't compiled and tested this code at all nor thought about it to deeply ;-) Bas

On 11-03-23 05:31 PM, Ketil Malde wrote:
Any idea why it works in GHCI?
Documentedly, stack limit is 8M, and can be changed by +RTS -K42M (for example). Undocumentedly, certain magic numbers given to -K seem to waive the limit (or set it so high I haven't fathomed). GHC 6.10.4: 4 to 59 GHC 6.12.1: 4 to 63 GHC 6.12.3: 1 to 63 GHC 7.0.2: 1 to 67 Now, GHCI. In 6.10.4 and before, GHCI probably uses the same default stack limit as other executables produced by GHC. You get stack overflow in GHCI as expected. Since 6.12.1, GHCI probably is built to default to a magic number, and therefore you can't overflow its stack easily. You can bring back a limit to GHCI by for example "ghci +RTS -K8M -RTS".
participants (5)
-
Albert Y. C. Lai
-
Bas van Dijk
-
Felipe Almeida Lessa
-
Jake McArthur
-
Ketil Malde