
On Thu, Dec 16, 2010 at 4:13 AM, Simon Marlow wrote:
> If your program has large memory requirements, you might also benefit from parallel GC in the old generation: +RTS -N2 -qg1.

Testing shows this advice did not help in my case. The program that implements the undecidable algorithm in my package is already multiprocessor aware, but there is an inherently sequential support program that translates the output of the main program into an XHTML document. For reasons I shall spare you, this program is also memory intensive, sometimes requiring more memory than the main program.

When this program is compiled without the -threaded option and run on a large input, I found it used 85 seconds of user time and 99% of the CPU on a Core 2 Duo machine. After compiling with the -threaded option and running with -N2 -qg1, the program used 88 seconds of user time and 103% of the CPU.

I ran the test with what the Ubuntu package system provides for Ubuntu Lucid Lynx: GHC 6.12.1 and parallel 1.1.0.1.

John