Re: Perf regression: ghc --make: add nicer names to RTS threads (threaded IO manager, make workers) (f686682)

6 Aug 2014

Hi Sergei,

On Wednesday, 06.08.2014 at 22:15 +0300, Sergei Trofimovich wrote:
...
On Wed, 06 Aug 2014 09:30:45 +0200 Joachim Breitner wrote:
...
the attached commit seems to have regressed the scs nofib benchmark by
~3%:
http://ghcspeed-nomeata.rhcloud.com/timeline/?ben=nofib/time/scs&env=1#/?exe=2&base=2+68&ben=nofib/time/scs&env=1&revs=50&equid=on
That's a test of compiled binary performance, not the compiler, right?
Correct.
...
...
The graph unfortunately is in the wrong order, as the tool gets confused
by timezones and by commits with identical CommitDate, e.g. due to
rebasing. This needs to be fixed; I manually verified that the commit
below is the first to show the above-noise-level increase in runtime.
(Other benchmarks seem to be unaffected.)
Is this regression expected and intended or unexpected? Is it fixable?
Or is this simply inexplicable?
The graph looks mysterious (18 ms bump). The benchmark does not use Haskell
threads at all.
Yes, I was surprised by that as well.
...
I'll try to reproduce degradation locally and will investigate.
Thanks!
...
The only runtime part touched by the patch just renames threads
(the renamer gets called once for each created thread):
...
...

diff --git a/libraries/base/GHC/Event/Thread.hs b/libraries/base/GHC/Event/Thread.hs
index 6e991bf..dcfa32a 100644
--- a/libraries/base/GHC/Event/Thread.hs
+++ b/libraries/base/GHC/Event/Thread.hs
@@ -39,6 +39,7 @@ import GHC.Event.Manager (Event, EventManager, evtRead, evtWrite, loop,
 import qualified GHC.Event.Manager as M
 import qualified GHC.Event.TimerManager as TM
 import GHC.Num ((-), (+))
+import GHC.Show (showSignedInt)
 import System.IO.Unsafe (unsafePerformIO)
 import System.Posix.Types (Fd)
@@ -244,11 +245,14 @@ startIOManagerThreads =
     forM_ [0..high] (startIOManagerThread eventManagerArray)
     writeIORef numEnabledEventManagers (high+1)
+show_int :: Int -> String
+show_int i = showSignedInt 0 i ""
+
 restartPollLoop :: EventManager -> Int -> IO ThreadId
 restartPollLoop mgr i = do
   M.release mgr
   !t <- forkOn i $ loop mgr
-  labelThread t "IOManager"
+  labelThread t ("IOManager on cap " ++ show_int i)
   return t
startIOManagerThread :: IOArray Int (Maybe (ThreadId, EventManager))
@@ -258,7 +262,7 @@ startIOManagerThread eventManagerArray i = do
   let create = do
         !mgr <- new True
         !t <- forkOn i $ loop mgr
-        labelThread t "IOManager"
+        labelThread t ("IOManager on cap " ++ show_int i)
         writeIOArray eventManagerArray i (Just (t,mgr))
   old <- readIOArray eventManagerArray i
   case old of
It does replace a reference to a constant string ("IOManager") by
something involving allocation and computation. I guess that could have
a measurable effect.
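
To make the difference concrete, here is a minimal standalone sketch (not
the code from base; constLabel, capLabel and the small main are made up
for illustration). The old label is a shared constant, the new one is
built afresh for every thread that gets labelled. The sketch only shows
the two shapes and does not measure anything.

```haskell
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_)
import GHC.Conc (labelThread)
import GHC.Show (showSignedInt)

-- Same helper as in the patch: render an Int via showSignedInt.
showInt :: Int -> String
showInt i = showSignedInt 0 i ""

-- Old behaviour: one constant string shared by all labelled threads.
-- (Hypothetical name, used here only for comparison.)
constLabel :: String
constLabel = "IOManager"

-- New behaviour: a fresh string is allocated for each capability number.
-- (Hypothetical name; the patch builds this string inline.)
capLabel :: Int -> String
capLabel i = "IOManager on cap " ++ showInt i

main :: IO ()
main = do
  done <- newEmptyMVar
  -- Fork a few trivial threads and label each with a computed name.
  forM_ [0 .. 3] $ \i -> do
    t <- forkIO (putMVar done ())
    labelThread t (capLabel i)
  -- Wait for all four threads to finish.
  forM_ [0 .. 3 :: Int] $ \_ -> takeMVar done
  putStrLn constLabel
  putStrLn (capLabel 3)
```

In the patch the computed label is only built where an IO manager thread
is started or restarted, so this alone does not explain a slowdown in a
benchmark that uses no Haskell threads.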

What happens to programs relying on very cheap threads? Do we have
benchmarks for this class of programs at all?

Greetings,
Joachim


-- 
Joachim “nomeata” Breitner
  mail@joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata@joachim-breitner.de  • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata@debian.org