
#14414: Profiled program runs 2.5x faster than non-profiled
-------------------------------------+-------------------------------------
        Reporter:  Fuuzetsu          |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by duog):

 Note that the entirety of the anomalous time is in the garbage collector.

 It looks to me like the program is running on only one core. If that's the
 case, then perhaps the garbage collector is spending entire time slices
 busy-waiting on a spin lock.

 As for why it doesn't happen while profiled: perhaps the non-profiled code
 has allocation-free loops, which can interact poorly with garbage
 collection.

 Some things to try:
 * Run the program inside `time` to verify that the numbers -s is reporting
 are correct.
 * Use `time` to prove that a non-Haskell program is able to run on
 multiple cores. Sorry, I don't have a one-liner for you...
 * Try the profiled version, omitting -fprof-auto. This should give the
 same Core as without -prof, and may exhibit the bad GC behaviour.
 * Play with the -q* and -N RTS options to see if anything changes.
 * Use https://wiki.haskell.org/ThreadScope to see exactly what the garbage
 collector is up to.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14414#comment:7
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
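
 To make the "allocation-free loop" hypothesis above concrete, here is a
 minimal sketch. It is not the reporter's program, and whether the same
 thing happens there is not established. The names `spin` and `main`, the
 loop bounds, and the suggested build and run flags are all assumptions
 chosen for illustration. The idea is that a GHC thread normally yields to
 the scheduler only at heap checks, so a loop that never allocates can keep
 the other capabilities waiting when one of them wants to start a GC.

```haskell
{-# LANGUAGE BangPatterns #-}
-- Illustrative sketch only; not taken from the ticket.
-- Suggested (assumed) build: ghc -O2 -threaded -rtsopts Spin.hs
-- Suggested (assumed) run:   time ./Spin +RTS -N2 -s
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- A tight loop with strict Int accumulators. With optimisation GHC
-- typically compiles this to a loop that performs no heap allocation,
-- and therefore contains no heap checks at which the thread could yield.
spin :: Int -> Int
spin n = go 0 0
  where
    go !acc !i
      | i >= n    = acc
      | otherwise = go (acc + i) (i + 1)

main :: IO ()
main = do
  done <- newEmptyMVar
  -- Worker thread: a long stretch of (probably) allocation-free work.
  _ <- forkIO $ do
    let !r = spin 1000000000
    putMVar done r
  -- Main thread: intended to allocate. The list is used twice so that it
  -- is likely to be materialised on the heap rather than fused away.
  let xs = [1 .. 5000000] :: [Int]
  print (length xs)
  print (sum xs)
  -- Wait for the worker so the program does not exit early.
  r <- takeMVar done
  print r
```

 If the GC time reported by -s is much larger with -N2 than with -N1, then
 rebuilding with -fno-omit-yields (which adds yield points to such loops)
 or running with +RTS -qg (which disables the parallel GC) are two ways to
 check whether this interaction is responsible.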