Increasing memory use in stream computation

(Sorry for the long email.)

Summary: why does the attached program have non-constant memory use?

==== Introduction ====

I've written a program to do a big computation. Unfortunately, the computation takes a very long time (expectedly), and the memory use increases slowly (unexpectedly), until it fills up the entire memory and swap space of the computer (many gigabytes).

The rough structure of the program is:

• create a long (up to 20 million) list of objects;
• compute a number for each of those objects;
• compute the sum of the resulting list.

I switched the intermediate data structure from a list to a Stream (from the stream-fusion package), hoping to fix the memory issue. It decreased both the memory use and the rate of its increase, but after a long time, the program still uses up all available memory.

==== A simple program ====

After many hours of cutting down my program, I now have a small program (attached) that shows the same behaviour. It uses only said stream-fusion package, and vector. (I haven't yet tried to cut out the use of vector. I hope it is irrelevant, because all vectors are of fixed small size.)

I compile the program with ghc-7.6.1 using
ghc --make -threaded -rtsopts -with-rtsopts="-M1G -K128M" -O2 -main-is Test.main Test
The RTS options may not be strictly necessary: I added them at some point to allow the use of multiple cores, and to prevent the program from crashing the machine by using all available memory.

When running the program, the resident memory quickly grows to about 3.5 MB (which I am fine with); this stays constant for a long time, but after about 7 minutes, it starts to grow further. The growth is slow, but I really would hope this program to run in constant memory.

==== The code ====

Note that I added an instance for Monad Stream, using concatMap. This is implicitly used in the definition of the big stream.

The source of Data.Stream contains many alternative implementations of concat and concatMap, and alludes to the difficulty of making it fuse properly. Could it be that the fusion did not succeed in this case?

Thanks for any help!

Regards,

Arie
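The attachment itself is not part of the archived thread. As a rough sketch only — the Monad instance via concatMap is as described above, but the module layout, the bounds, and the body of the inner stream are invented for illustration — the test program could look something like:

    module Test (main) where

    import qualified Data.Stream as S  -- from the stream-fusion package

    -- Orphan Monad instance, as described in the message above.
    -- (On GHC >= 7.10, Functor and Applicative instances are also needed.)
    instance Monad S.Stream where
        return x = S.stream [x]       -- singleton stream
        s >>= f  = S.concatMap f s    -- bind = concatMap, as in the list monad

    main :: IO ()
    main = print $ S.foldl' (+) (0 :: Int) $ do
        i <- S.stream [1 .. 10000000]
        j <- S.stream [-1, 0, 0, 1]
        return (i + j)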

Hi Arie,

On 10/10/13 14:02, Arie Peterson wrote:
(Sorry for the long email.)
Summary: why does the attached program have non-constant memory use?
Looking at the heap profile graph (generated with +RTS -h, no need to compile with profiling) I see the increasing memory use is split about evenly between STACK and BLACKHOLE. I don't know what that means or why it occurs, but replacing `small` solved that problem for me:

    small = V.fromList <$> S.stream (replicateM 7 [-1,0,0,1])

I get the same output 3999744 from your version and my changed version.

Claude
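Here replicateM runs in the list monad: replicateM n xs enumerates all length-n sequences of elements drawn from xs, so the full 4^7-element list of lists is produced by ordinary list machinery and handed to S.stream once. A shorter illustration (not from the thread):

    import Control.Monad (replicateM)

    -- All length-2 sequences over a 4-element list: 4^2 = 16 lists.
    pairs :: [[Int]]
    pairs = replicateM 2 [-1, 0, 0, 1]
    -- [[-1,-1],[-1,0],[-1,0],[-1,1],[0,-1],[0,0],...]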

Hi Claude,
Looking at the heap profile graph (generated with +RTS -h, no need to compile with profiling) I see the increasing memory use is split about evenly between STACK and BLACKHOLE. I don't know what that means or why it occurs, but replacing `small` solved that problem for me:
small = V.fromList <$> S.stream (replicateM 7 [-1,0,0,1])
Interesting! Unfortunately, my real code is more complicated, and I can't simplify its "small" function in this way. (The list [-1,0,0,1] that is being streamed in the do block depends, in the full program, on a parameter that changes on each iteration.)

Although, maybe I can do all the logic of the "small" function in the list monad, and stream the resulting list, as you do in the above.
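In that variant, "small" might take a shape like the following sketch; the parameter and the choices function are invented stand-ins for the real dependence, not taken from the actual program:

    import Control.Monad (replicateM)
    import qualified Data.Stream as S
    import qualified Data.Vector as V

    -- Hypothetical: run the combinatorial logic in the list monad, with a
    -- parameter-dependent alphabet, then stream the finished list once.
    small :: Int -> S.Stream (V.Vector Int)
    small p = S.map V.fromList (S.stream (replicateM 7 (choices p)))
      where
        -- invented placeholder for the real parameter dependence
        choices k = if even k then [-1, 0, 0, 1] else [-1, 0, 1]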

Hi Claude,

On Thursday 10 October 2013 20:05:37, I wrote:
Although, maybe I can do all the logic of the "small" function in the list monad, and stream the resulting list, as you do in the above.
I tried a corresponding variant of my full program, but the memory use is quite a lot higher at the start, and increases by large amounts (compared to the version that streams at all levels). So, I'm still at a loss.

Regards,

Arie

Arie Peterson wrote:
(Sorry for the long email.)
Summary: why does the attached program have non-constant memory use?
Unfortunately, I don't know. I'll intersperse some remarks and propose an alternative to stream fusion at the end, which allows your test program to run in constant space.
==== A simple program ====
When running the program, the resident memory quickly grows to about 3.5 MB (which I am fine with); this stays constant for a long time, but after about 7 minutes, it starts to grow further. The growth is slow, but I really would hope this program to run in constant memory.
A quicker way to spot the increased memory usage is to look at GC statistics. I used
./Test +RTS -Sstderr 2>&1 | grep 'Gen: 1'

    569904    8192    65488  0.00  0.00    0.01    0.01    0    0  (Gen:  1)
    516520    9768    67080  0.00  0.00    4.23    4.23    0    0  (Gen:  1)
    513824   14136    71448  0.00  0.00    8.43    8.44    0    0  (Gen:  1)
    515856   16728    74040  0.00  0.00   12.70   12.75    0    0  (Gen:  1)
    515416   19080    76392  0.00  0.00   17.01   17.11    0    0  (Gen:  1)
    515856   22248    79560  0.00  0.00   21.33   21.48    0    0  (Gen:  1)
    514936   25080    82392  0.00  0.00   25.65   25.84    0    0  (Gen:  1)
    514936   28632    85944  0.00  0.00   29.94   30.16    0    0  (Gen:  1)
    513512   32328    89640  0.00  0.00   34.24   34.48    0    0  (Gen:  1)
    515224   37032   127112  0.00  0.00   38.35   38.62    0    0  (Gen:  1)
Note the increasing values in the third column; that's the live bytes after each major GC.
==== The code ====
Note that I added an instance for Monad Stream, using concatMap. This is implicitly used in the definition of the big stream.
The source of Data.Stream contains many alternative implementations of concat and concatMap, and alludes to the difficulty of making it fuse properly. Could it be that the fusion did not succeed in this case?
I had a glimpse at the core code generated by ghc, but the amount of code is overwhelming. From reading the source code, and as far as my intuition goes, the code *should* run in constant space.

As an experiment, I rewrote the code using difference lists, and the result ran in constant memory. I then tried to abstract this idea into a nice data type. (I lost the low-level difference list code on the way, but the code was quite hard to read anyway.) I ended up with an odd mixture of a difference list and a continuation that should be applied to each element:

    data Stream a where
        Stream :: (forall r. (a -> r) -> [r] -> [r]) -> Stream a

with Stream s representing the list s id []. The motivation for the (a -> r) argument is that it makes fmap trivial:

    fmap f (Stream s) = Stream (\g -> s (g . f))

I'll attach the full code below (it's a separate module, "Stream.hs" that can be imported instead of Data.Stream for your small example.) With that replacement, the code runs in constant space and becomes about 3x faster. Using the 'singleton' function for 'return' results in an additional, but very modest (about 10%) speedup.

I wonder whether that approach scales up to your real code.

Enjoy,

Bertram
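Bertram's attached module is not reproduced in the archive. As a reconstruction of what such a module could look like — a sketch derived from the representation above, not the actual attachment — the handful of operations the test program needs might be:

    {-# LANGUAGE GADTs, RankNTypes #-}
    module Stream where

    import Control.Monad (ap)
    import qualified Data.List as L

    data Stream a where
        Stream :: (forall r. (a -> r) -> [r] -> [r]) -> Stream a

    -- Stream s represents the list  s id [].
    toList :: Stream a -> [a]
    toList (Stream s) = s id []

    stream :: [a] -> Stream a
    stream xs = Stream (\g rs -> foldr (\x t -> g x : t) rs xs)

    singleton :: a -> Stream a
    singleton x = Stream (\g rs -> g x : rs)

    instance Functor Stream where
        fmap f (Stream s) = Stream (\g -> s (g . f))

    instance Applicative Stream where
        pure  = singleton
        (<*>) = ap

    instance Monad Stream where
        -- bind = concatMap: each element expands to a difference list,
        -- and the resulting functions are composed onto the tail.
        Stream s >>= f =
            Stream (\g rs -> foldr ($) rs (s (\a -> case f a of Stream t -> t g) []))

    foldl' :: (b -> a -> b) -> b -> Stream a -> b
    foldl' f z = L.foldl' f z . toList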

Hi Bertram,
Unfortunately, I don't know. I'll intersperse some remarks and propose an alternative to stream fusion at the end, which allows your test program to run in constant space.
A quicker way to spot the increased memory usage is to look at GC statistics. I used
./Test +RTS -Sstderr 2>&1 | grep 'Gen: 1'
[…]
Thanks for the suggestion.
I had a glimpse at the core code generated by ghc, but the amount of code is overwhelming. From reading the source code, and as far as my intuition goes, the code *should* run in constant space.
Yes, my intuition says the same. For me though, this is only based on an informal imperative interpretation of the code, not on any understanding of the Stream internals.
As an experiment, I rewrote the code using difference lists, and the result ran in constant memory. I then tried to abstract this idea into a nice data type. (I lost the low-level difference list code on the way, but the code was quite hard to read anyway.)

[…]

I'll attach the full code below (it's a separate module, "Stream.hs" that can be imported instead of Data.Stream for your small example.) With that replacement, the code runs in constant space and becomes about 3x faster. Using the 'singleton' function for 'return' results in an additional, but very modest (about 10%) speedup.
I wonder whether that approach scales up to your real code.
Awesome, thanks for all this work!

Unfortunately, the modified full program does not use constant memory. I replaced Data.Stream by your Stream, and added one more function "append" to it (which was very natural, given the difference list nature). This version increases its resident size by about 1MB after roughly 10 minutes, and then again after another 10 minutes. This makes it a significant improvement over Data.Stream, memory-wise, but I'm not sure if it will be enough to perform the entire computation without running out of memory.
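For the difference-list representation above, such an "append" is plausibly a one-liner (a guess at its shape, not the code from the real program):

    -- Compose the two difference lists, sharing the element function g.
    append :: Stream a -> Stream a -> Stream a
    append (Stream s) (Stream t) = Stream (\g -> s g . t g)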
participants (3):

- Arie Peterson
- Bertram Felgenhauer
- Claude Heiland-Allen