getting memory usage to be constant when mixing wav files

I've written a program that mixes wav files read in by hsndfile, and it does so with reasonably satisfactory performance using RTS options -A and -K. Unfortunately, it does not have constant memory usage.

Profiling the program shows that almost all of the memory allocation comes down to the following two functions: the first which calculates and writes the values of individual indexes to the target array (longer), and the second which runs the first for all the indexes in the shorter array.

    addInBase :: IOCArray Int Double -> IOCArray Int Double -> (Int, Int) -> IO ()
    addInBase longer shorter (!li,!si) = do
        x <- readArray longer li
        y <- readArray shorter si
        let result = x + y
        writeArray longer li result

    addIn :: IOCArray Int Double -> IOCArray Int Double -> Int -> Int -> Int -> IO ()
    addIn longer shorter shorterBounds !li !si
        | si > shorterBounds = return ()
        | si <= shorterBounds = addInBase longer shorter (li,si) >> addIn longer shorter shorterBounds (li+1) (si+1)

Basically, the addIn function is called by mapM_ across a list of all of the occurrences of the input wav files.

I've tried ! and seq in many different places and combinations, but I cannot get the memory usage constant. How can I achieve that? What technique or principle will show me more specifically what is not being evaluated so that I can avoid this trouble in the future?

I have read the wiki pages on Strictness, Laziness, and Performance, but maybe I have not fully understood how to apply what is written there to this code. Any help is greatly appreciated. If you notice any other glaring mistakes or bad ideas in the code above, I'm interested in that as well.

--
Renick Bell
http://the3rd2nd.com

On Sun, Feb 14, 2010 at 07:22:52PM +0900, Renick Bell wrote:
> addInBase :: IOCArray Int Double -> IOCArray Int Double -> (Int, Int) -> IO ()
> addInBase longer shorter (!li,!si) = do
>     x <- readArray longer li
>     y <- readArray shorter si
>     let result = x + y
>     writeArray longer li result
>
> addIn :: IOCArray Int Double -> IOCArray Int Double -> Int -> Int -> Int -> IO ()
> addIn longer shorter shorterBounds !li !si
>     | si > shorterBounds = return ()
>     | si <= shorterBounds = addInBase longer shorter (li,si) >> addIn longer shorter shorterBounds (li+1) (si+1)

These functions don't have any laziness problems. They run in the IO monad, which enforces execution order.

> Basically, the addIn function is called by mapM_ across a list of all of the occurrences of the input wav files.

I'd guess that the mapM_ is where the problem lies. If I understood correctly, you're doing something like this (in pseudo-Haskell):

    do files <- mapM readFile filenames  -- read files lazily
       mapM_ addIn files                 -- do something
       mapM_ writeFile files             -- write files (strictly)

In other words, I guess you're reading the files lazily, right? If that's the case, then mapM_ is your culprit. Each 'addIn' will force its input into memory. You hold a reference to all of the inputs so you can save the results later, so after mapM_ has completed everything is in memory.

The solution is simple for the pseudo-Haskell above:

    do mapM process filenames
    where
      process filename = readFile filename >>= addIn >>= writeFile

In other words, read, process and write each file in one IO action. After the first 'process' finishes, all of its data is ready to be garbage collected. Of course you will still have those 'IOCArray's in memory during the 'process' call.

--
Felipe.
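
For illustration, the "one input at a time" idea above could be sketched as follows. This is only a sketch under stated assumptions, not the original program: it uses IOUArray from the standard array package instead of IOCArray, and loadSamples is a hypothetical stand-in for reading one wav file with hsndfile. The point is that each input array is created, mixed into the target, and dropped inside a single loop iteration, so only the target and one input are live at any time.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Monad (forM_)
import Data.Array.IO
  (IOUArray, getBounds, getElems, newArray, newListArray, readArray, writeArray)

-- Hypothetical stand-in for reading one input file with hsndfile:
-- builds an array of n samples with values 1, 2, ..., n.
loadSamples :: Int -> IO (IOUArray Int Double)
loadSamples n = newListArray (0, n - 1) (map fromIntegral [1 .. n])

-- Add every sample of `shorter` into `longer`, starting at index `start`
-- of `longer`. The loop indices are strict, and all work happens in IO.
addIn :: IOUArray Int Double -> IOUArray Int Double -> Int -> IO ()
addIn longer shorter start = do
  (lo, hi) <- getBounds shorter
  let go !li !si
        | si > hi   = return ()
        | otherwise = do
            x <- readArray longer li
            y <- readArray shorter si
            writeArray longer li (x + y)
            go (li + 1) (si + 1)
  go start lo

main :: IO ()
main = do
  -- Target buffer of 8 samples, initially silent.
  target <- newArray (0, 7) 0 :: IO (IOUArray Int Double)
  -- Hypothetical list of occurrences: (length of input, offset in target).
  forM_ [(3, 0), (4, 2), (2, 6)] $ \(len, offset) -> do
    samples <- loadSamples len      -- load one input
    addIn target samples offset     -- mix it in
    -- `samples` is unreachable after this iteration and can be collected
  getElems target >>= print
```

Whether this fixes the memory growth in the real program depends on how the inputs are actually read and retained there; the sketch only shows the shape of the loop.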
Participants (2): Felipe Lessa, Renick Bell