2013/3/5 Nathan Howell <nathan.d.howell@gmail.com>
Depends on the application, of course. The (on by default) parallel GC tends to kill performance for me... you might try running both with "+RTS -sstderr" to see if GC time is significantly higher, and try adding "+RTS -qg1" if it is.
 
You are correct: parallel GC is slowing the computation down. After some experiments I can produce two behaviors. With single-threaded GC, the multithreaded version is slowed down by a factor of 5, but the single-threaded one goes back to normal. With an increased heap size, the multithreaded version slows down by a factor of 2 and the single-threaded version runs normally. I guess I must live with this ;)
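
For anyone who wants to repeat this kind of comparison, here is a minimal sketch. It is not my actual program: the file name GcDemo.hs, the work function and the sizes are made up for illustration. The flag -A64m is only one way to give the program more memory to work with; the flag and the size are picked as an example. The other RTS flags are the ones mentioned above, plus plain "-qg", which turns parallel GC off completely ("-qg1" keeps it for the old generation only).

```haskell
-- Hypothetical test program for comparing GC settings (not the original code).
--
-- Build: ghc -O2 -threaded -rtsopts GcDemo.hs
-- Run:   ./GcDemo +RTS -N4 -sstderr         (parallel GC, the default)
--        ./GcDemo +RTS -N4 -qg1 -sstderr    (parallel GC in the old generation only)
--        ./GcDemo +RTS -N4 -qg -sstderr     (parallel GC turned off)
--        ./GcDemo +RTS -N4 -A64m -sstderr   (larger allocation area, example size)
module Main (main) where

import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)
import Data.List (foldl')

-- Sums a reversed list. reverse needs the whole list before it can
-- return the first element, so the list is really built in memory
-- and the garbage collector has something to do.
work :: Int -> Int
work n = foldl' (+) 0 (reverse [1 .. n])

main :: IO ()
main = do
  -- One worker thread per capability (set with +RTS -N).
  caps <- getNumCapabilities
  vars <- forM [1 .. caps] $ \i -> do
    v <- newEmptyMVar
    -- ($!) forces the result inside the worker thread, not in main.
    _ <- forkIO $ putMVar v $! work (1000000 * i)
    return v
  results <- mapM takeMVar vars
  print (sum results)
```

In the "-sstderr" output, compare the GC time against the MUT time and the total elapsed time between runs.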

--
Łukasz Dąbek