profiling cpu usage of a concurrent program

Dear Café,

I have a network application and would like to know which parts of it are responsible for the majority of the CPU usage. Conventional GHC profiling does not quite work here, for two reasons:

1. +RTS -p shows that the program spends most of its *time* idling, so the cost centers of interest are so insignificant that the precision of the %time column in the .prof format makes it impossible to determine the relative execution time of selected cost centers. The Wiki [1] suggests the -P option.

2. Even ThreadScope (which I haven't tried yet) may only show me what is expected: the work is done in worker threads. I am not so much concerned with the level of concurrency or parallelism as with which parts occupy the CPU.

I ran the program with +RTS -P and sorted the output by ticks. Naturally, functions like threadDelay and liftIO come out at the top, but these all stand for waiting IO actions. As a first approximation, I might ignore them and consider the remainder. Is that a valid approach? Is there a profiler that measures something like CPU cycles per cost center? Should I turn to ghc-events-analyze [2]? Or perf?

Execution time is not critical (as long as the queue is emptied faster than data flows in), but maxing out the computing resources may become critical, because it mandates more expensive hardware. I've been pitching Haskell to my bosses by promising better performance (compared to Python), and appropriate profiling seems essential to keep that promise.

Olaf

[1] https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/profiling.ht...
[2] https://hackage.haskell.org/package/ghc-events-analyze
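For concreteness, here is a minimal sketch of the kind of annotation I have in mind: an explicit SCC around the CPU-bound part of a worker, so that its ticks in the -P output are not lumped together with the waiting IO around it. The names processMsg, workerLoop and the label "process_msg" are placeholders, not taken from the actual application.

    {-# LANGUAGE BangPatterns #-}
    module Main where

    import Control.Concurrent (threadDelay)
    import Control.Monad (replicateM_)

    -- Placeholder for the actual CPU-bound work done per message.
    processMsg :: Int -> Int
    processMsg n = sum [1 .. n]

    -- The explicit SCC separates the pure work from the surrounding
    -- waiting IO, so it gets its own ticks row in the -P output.
    workerLoop :: IO ()
    workerLoop = replicateM_ 10 $ do
      let !r = {-# SCC "process_msg" #-} processMsg 1000000
      print r
      threadDelay 500000  -- idle time, attributed to threadDelay rather than process_msg

    main :: IO ()
    main = workerLoop

Built and run roughly like this, the resulting Main.prof can then be sorted by the ticks column:

    ghc -O2 -prof -fprof-auto -rtsopts Main.hs
    ./Main +RTS -P -RTS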
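And in case ghc-events-analyze turns out to be the way to go: as far as I understand, it works off the eventlog, and the interesting regions are marked with "START <label>" / "STOP <label>" user events emitted via Debug.Trace.traceEventIO. A sketch assuming that convention, with placeholder labels "decode" and "wait" and no exception handling:

    import Control.Concurrent (threadDelay)
    import Control.Monad (replicateM_)
    import Debug.Trace (traceEventIO)

    -- ghc-events-analyze aggregates the time spent between matching
    -- "START <label>" and "STOP <label>" user events, per label.
    labelled :: String -> IO a -> IO a
    labelled l act = do
      traceEventIO ("START " ++ l)
      r <- act
      traceEventIO ("STOP " ++ l)
      return r

    main :: IO ()
    main = replicateM_ 10 $ do
      labelled "decode" (print (sum [1 .. 1000000 :: Int]))
      labelled "wait"   (threadDelay 500000)

Compiled with -eventlog (if the GHC version still needs that flag) and run with +RTS -l, the program writes a .eventlog file that one can then point ghc-events-analyze at:

    ghc -O2 -eventlog -rtsopts Main.hs
    ./Main +RTS -l -RTS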