
Okay, beginner encountering first major bug. I'm getting a stack overflow on a program that uses a lot of laziness to construct and modify lists---but the overflow happens even if I take just one element from the final list! The overflow seems to have started when I added the functions 'gauss' and 'gaussList' below, but I am at a loss to know why---these functions work perfectly in test cases and can even generate a long list. The following program is distilled (with a lot of work!) down to the essence of what triggers the problem.

import Control.Monad
import Control.Monad.Random
import Data.List

-- Here I put a moderate-sized list of filenames which contain just
-- generic text or code. The longer the list, the more likely it is to
-- overflow the stack. Which is curious, because if IO were lazy here,
-- it shouldn't be reading more than the first character of the first
-- file. Methinks a hint to the problem.
fileList = ...

data TextGroup = Single Char Float
  deriving (Show)

textGroup_addDelay :: TextGroup -> Float -> TextGroup
textGroup_addDelay (Single c del) del2 = Single c (del + del2)

mkTextGroup :: Char -> Rand StdGen TextGroup
mkTextGroup c = liftM (Single c) gauss

-- Here's the kinda weird thing I'm doing with Rand. I want to use
-- fromList as the first computation to randomly choose a second Rand
-- computation. It turns out that if you replace this funkiness with
--   gauss = getRandomR (0, 1)
-- the whole thing works.
gaussList :: Rand StdGen (Rand StdGen Float)
gaussList = fromList
  [ (getRandomR (-1.0, -0.8),  2)
  , (getRandomR (-0.8, -0.6),  2)
  , (getRandomR (-0.6, -0.4),  4)
  , (getRandomR (-0.4, -0.2),  8)
  , (getRandomR (-0.2,  0.0), 12)
  , (getRandomR ( 0.0,  0.2), 12)
  , (getRandomR ( 0.2,  0.4),  8)
  , (getRandomR ( 0.4,  0.6),  4)
  , (getRandomR ( 0.6,  0.8),  2)
  , (getRandomR ( 0.8,  1.0),  2)
  ]

gauss :: Rand StdGen Float
gauss = do m <- gaussList
           m

main = do
  gen0 <- newStdGen
  bufs <- mapM readFile fileList
  let buf = concat bufs
  let tgs = evalRand (do orig <- mapM mkTextGroup buf
                         addBreaks orig) gen0
  writeFile "output.ank" (show $ take 1 tgs)
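A minimal sketch of the state-threading at work here, using mtl's Control.Monad.State in place of Rand (steps is a made-up name, not part of the program above): demanding the state out the other side forces every step of the chain, however little of the result list is used.

```haskell
import Control.Monad.State

-- Each step reads the running total and writes an updated one, so the
-- final state depends on every element of the input list.
steps :: [Int] -> State Int [Int]
steps = mapM (\x -> do n <- get
                       put (n + x)
                       return n)

main :: IO ()
main = print (execState (steps [1, 2, 3]) 0)  -- prints 6
```

Even `take 1` of the result list cannot save you once the final state is demanded, because that state is the composition of every put in the chain.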

Okay, I figured this out. mapM is not lazy, at least not for a monad that has state and under the circumstance you demand the state out the other side. I may rewrite the program. Or I may consider the ISS principle. ("Increase the stack, stupid.") -Mike

On 20/10/09 20:32, Michael Mossey wrote:
Okay, I figured this out. mapM is not lazy, at least not for a monad that has state and under the circumstance you demand the state out the other side.
If I understand what you are saying then this behaviour isn't unexpected. If you have a monad with state, and you ask for the final state, then it's likely that everything happening in that particular monad will have to be evaluated.
I may rewrite the program. Or I may consider the ISS principle. ("Increase the stack, stupid.")
You may also be able to improve the situation by adding a bit of strictness. In some cases the thunks resulting from laziness can take up a lot of space. Forcing evaluation, at well-thought-out locations in your program, may both speed things up and reduce memory/stack usage.

/M

--
Magnus Therning (OpenPGP: 0xAB4DFBA4)
magnus@therning.org          Jabber: magnus@therning.org
http://therning.org/magnus   identi.ca|twitter: magthe

Magnus Therning wrote:
On 20/10/09 20:32, Michael Mossey wrote:
Okay, I figured this out. mapM is not lazy, at least not for a monad that has state and under the circumstance you demand the state out the other side.
If I understand what you are saying then this behaviour isn't unexpected. If you have a monad with state, and you ask for the final state, then it's likely that everything happening in that particular monad will have to be evaluated.
Hi Magnus,

Yes, that's what I was trying to imply. I realized it is mathematically impossible for mapM to be lazy for a monad with state.

mapM doesn't seem to be lazy anywhere. For example, this program

main = do
  buffers <- forM ["file1.txt","file2.txt","file3.txt"] readFile
  print . take 1 . concat $ buffers

will, according to my experiments with the profiler, read all three files. It's possible to imagine lazy behavior here, but no doubt there is a technical reason against it.
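For what it's worth, a deliberately lazy variant of mapM for IO can be built from System.IO.Unsafe.unsafeInterleaveIO. This is a sketch only (lazyMapM is a made-up name), and it carries the usual unsafeInterleaveIO caveats about unpredictable evaluation order:

```haskell
import System.IO.Unsafe (unsafeInterleaveIO)

-- Defers each tail of the traversal until that part of the result list
-- is actually demanded; only the prefix you inspect triggers its actions.
lazyMapM :: (a -> IO b) -> [a] -> IO [b]
lazyMapM _ []     = return []
lazyMapM f (x:xs) = do y  <- f x
                       ys <- unsafeInterleaveIO (lazyMapM f xs)
                       return (y : ys)
```

With lazyMapM in place of forM in the snippet above, `take 1 . concat` would only demand the first file; the strict sequencing of ordinary mapM is the default precisely because this kind of interleaving makes the timing of side effects hard to reason about.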
I may rewrite the program. Or I may consider the ISS principle. ("Increase the stack, stupid.")
You may also be able to improve the situation by adding a bit of strictness. In some cases the thunks resulting from laziness can take up a lot of space. Forcing evaluation, at well thought out locations in your program, may both speed things up and reduce memory/stack usage.
You are probably right... I am having trouble spanning the gap between "my current understanding" and "well thought-out locations." Real World Haskell has a chapter on this, so I may look there.

I was able to get the program to run by removing all places that I had created a chain of Rand computations that forced evaluation of the last one. I replaced these with computations that did not require evaluation of the last one.

However, my program was still forced to read more files than it really needed in the end, as the above snippet demonstrates.

Thanks,
Mike
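For illustration, forcing evaluation at a chosen point might look like this (forceSpine is a made-up helper built on the Prelude's seq; it forces a list's structure but not its elements):

```haskell
-- Walking the whole list with go, then seq-ing on the result, collapses
-- a long chain of cons-cell thunks in one pass.
forceSpine :: [a] -> [a]
forceSpine xs = go xs `seq` xs
  where
    go []     = ()
    go (_:ys) = go ys

main :: IO ()
main = print (head (forceSpine [1, 2, 3]))  -- prints 1
```

Wrapping an intermediate list in forceSpine means that as soon as any part of it is demanded, the whole spine is evaluated at once, which can trade a deep stack of nested thunks for a single cheap traversal.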

On Wed, Oct 21, 2009 at 9:44 AM, Michael Mossey
Magnus Therning wrote:
On 20/10/09 20:32, Michael Mossey wrote:
Okay, I figured this out. mapM is not lazy, at least not for a monad that has state and under the circumstance you demand the state out the other side.
If I understand what you are saying then this behaviour isn't unexpected. If you have a monad with state, and you ask for the final state, then it's likely that everything happening in that particular monad will have to be evaluated.
Hi Magnus,
Yes, that's what I was trying to imply. I realized it is mathematically impossible for mapM to be lazy for a monad with state.
mapM doesn't seem to be lazy anywhere. For example, this program
main = do
  buffers <- forM ["file1.txt","file2.txt","file3.txt"] readFile
  print . take 1 . concat $ buffers
will, according to my experiments with the profiler, read all three files. It's possible to imagine lazy behavior here, but no doubt there is a technical reason against it.
Laziness in combination with IO is a difficult thing. Just search the archives to see the number of threads about it.
I may rewrite the program. Or I may consider the ISS principle. ("Increase the stack, stupid.")
You may also be able to improve the situation by adding a bit of strictness. In some cases the thunks resulting from laziness can take up a lot of space. Forcing evaluation, at well thought out locations in your program, may both speed things up and reduce memory/stack usage.
You are probably right... I am having trouble spanning the gap between "my current understanding" and "well thought-out locations." Real World Haskell has a chapter on this, so I may look there.
I was able to get the program to run by removing all places that I had created a chain of Rand computations that forced evaluation of the last one. I replaced these with computations that did not require evaluation of the last one.
However, my program was still forced to read more files than it really needed in the end, as the above snippet demonstrates.
Just out of curiosity, did you verify that all three files were completely *read*, as opposed to all three opened, but only the first line of the first file being read?

/M

Magnus Therning wrote:
Just out of curiosity, did you verify that all three files were completely *read*, as opposed to all three opened, but only the first line of the first file being read?
/M
Actually, no, I didn't verify that in this example. I was thinking about my other program, in which I used mapM over the characters of the file into a Rand monad. That one definitely read all the characters of the file---certain processing routines were called about 30,000 times (according to the profiler). This makes sense to me now, because I was asking for the state out the other side of the mapM.

Thanks,
Mike
participants (2):
- Magnus Therning
- Michael Mossey