
On Thu, Aug 28, 2014 at 11:09:55PM +0300, Michael Snoyman wrote:
> I just added a comment onto that issue. I forgot to mention that that
> memory problem only occurs with optimizations turned on (-O or -O2).
> Can you test it out with one of those flags and let me know what
> happens?

Wow, there is quite a large difference when using -O: about 4 MB of
resident memory when the action is run only once vs. about 789 MB when
it is run twice. What's interesting is that the bytes allocated in the
heap grow by a reasonable amount when the action is run twice (1.44 GB
vs. 2.08 GB), but the maximum residency and the total resident memory
explode.

The results when the action is run only once:

       1,440,041,040 bytes allocated in the heap
             465,368 bytes copied during GC
              35,992 bytes maximum residency (2 sample(s))
              21,352 bytes maximum slop
                   1 MB total memory in use (0 MB lost due to fragmentation)

                                        Tot time (elapsed)  Avg pause  Max pause
      Gen  0      2756 colls,     0 par    0.01s    0.01s     0.0000s    0.0006s
      Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

      INIT    time    0.00s  (  0.00s elapsed)
      MUT     time    0.19s  (  0.19s elapsed)
      GC      time    0.01s  (  0.01s elapsed)
      EXIT    time    0.00s  (  0.00s elapsed)
      Total   time    0.20s  (  0.20s elapsed)

      %GC     time       6.0%  (5.3% elapsed)

      Alloc rate    7,563,673,522 bytes per MUT second

      Productivity  93.6% of total user, 93.5% of total elapsed

    Command being timed: "./foo +RTS -s"
    User time (seconds): 0.16
    System time (seconds): 0.03
    Percent of CPU this job got: 98%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.20
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 4024
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 338
    Voluntary context switches: 1
    Involuntary context switches: 21
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

The results when the action is run twice:

       2,080,041,040 bytes allocated in the heap
       1,346,503,136 bytes copied during GC
         389,736,000 bytes maximum residency (11 sample(s))
           8,871,312 bytes maximum slop
                 768 MB total memory in use (0 MB lost due to fragmentation)

                                        Tot time (elapsed)  Avg pause  Max pause
      Gen  0      3968 colls,     0 par    0.28s    0.28s     0.0001s    0.0009s
      Gen  1        11 colls,     0 par    0.57s    0.57s     0.0519s    0.2668s

      INIT    time    0.00s  (  0.00s elapsed)
      MUT     time    0.32s  (  0.32s elapsed)
      GC      time    0.85s  (  0.85s elapsed)
      EXIT    time    0.03s  (  0.03s elapsed)
      Total   time    1.20s  (  1.21s elapsed)

      %GC     time      71.0%  (70.7% elapsed)

      Alloc rate    6,553,280,639 bytes per MUT second

      Productivity  29.0% of total user, 28.8% of total elapsed

    Command being timed: "./foo +RTS -s"
    User time (seconds): 0.91
    System time (seconds): 0.28
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.20
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 789432
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 196760
    Voluntary context switches: 1
    Involuntary context switches: 47
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
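
For anyone who wants to try the same once-vs-twice comparison, here is a minimal sketch of the shape of such a test. It is not the program measured above, which is not included in this message: the name `action`, the loop bound and the `IORef` counter are all assumptions made for illustration. Whether this particular sketch shows the same residency jump is not something I can promise; it may depend on the GHC version and on whether the optimizer decides to share the list between the two runs.

```haskell
module Main (main) where

import Control.Monad (forM_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the "action" discussed in this thread:
-- it walks a large range and adds every element to a counter.
action :: IORef Int -> IO ()
action ref = forM_ [1 .. 10000000] $ \i -> modifyIORef' ref (+ i)

main :: IO ()
main = do
  ref <- newIORef 0
  action ref
  action ref -- comment this line out for the "run once" measurement
  readIORef ref >>= print
```

Building it once with optimizations off and once with -O or -O2, and running each binary with `+RTS -s`, should give residency figures comparable in kind to the ones reported above.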