
On 26/03/2012 04:25, Sajith T S wrote:
Date: Sun, 25 Mar 2012 22:49:52 -0400
From: Sajith T S
To: The Haskell Cafe
Subject: Google Summer of Code: a NUMA wishlist!

Dear Cafe,
It's last minute-ish to bring this up (in my part of the world it's still March 25), but graduate students are famously a busy and lazy lot. :) I study at Indiana University Bloomington, and I wish to propose^W rush in this proposal and solicit feedback, mentors, etc while I can.
Since the student application deadline is April 6, I figure we can beat this into a real proposal's shape by then. This probably also falls on the naive and ambitious side of things, and I might not even know what I'm talking about, but let's see! That's the idea of a proposal, yes?
Broadly, the idea is to improve support for NUMA systems. Specifically:
-- Real physical processor affinity with forkOn [1]. Can we fire all CPUs if we want to? (Currently, the number passed to forkOn is interpreted modulo the value returned by getNumCapabilities [2].)
You can get real processor affinity with +RTS -qa in combination with forkOn.
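To make the modulo behaviour concrete, here is a minimal sketch (not taken from the RTS sources). The helper name effectiveCapability is made up for illustration; forkOn, getNumCapabilities and threadCapability are the real functions from Control.Concurrent.

```haskell
import Control.Concurrent
import Control.Monad (forM_)

-- Hypothetical helper: the capability forkOn ends up using for a
-- requested number, following the documented modulo rule.
effectiveCapability :: Int -> Int -> Int
effectiveCapability numCaps requested = requested `mod` numCaps

main :: IO ()
main = do
  n <- getNumCapabilities
  results <- newEmptyMVar
  -- Deliberately request a couple of numbers beyond the last capability.
  let requests = [0 .. n + 1]
  forM_ requests $ \i -> forkOn i $ do
    tid <- myThreadId
    (cap, pinned) <- threadCapability tid
    putMVar results (i, cap, pinned)
  forM_ requests $ \_ -> do
    (i, cap, pinned) <- takeMVar results
    putStrLn $ "requested " ++ show i
            ++ ", expected " ++ show (effectiveCapability n i)
            ++ ", running on capability " ++ show cap
            ++ ", pinned to it: " ++ show pinned
```

Assuming the file is called Pin.hs, building with "ghc -threaded -rtsopts Pin.hs" and running "./Pin +RTS -N4 -qa" should show each thread locked to a capability, with -qa asking the RTS to use the OS's affinity facilities to pin the underlying OS threads to cores. Whether that works depends on the platform.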
-- Also kind of associated with the above: when launching processes, we might want to specify a list of CPUs rather than the number of CPUs. Say, a -N [0,1,3] flag rather than a -N 3 flag. This shall enable us to gawk at real pretty htop [3] output.
I like that idea.
-- From a very recent discussion on parallel-haskell [4], we learn that the RTS's NUMA support could be improved. The hypothesis is that allocating nurseries per Capability might be a better plan than using a global pool. We might borrow/steal ideas from hwloc [5] for this.
I like this idea too (since I suggested it :-).
-- Finally, a logging/monitoring infrastructure to verify assumptions and determine whether, and how well, work stays local.
I'm not sure if you're suggesting a *new* logging/monitoring framework
here, but in any case it would make much more sense to extend ghc-events
and ThreadScope rather than building something new. There is ongoing
work to have ThreadScope understand the output of the Linux "perf" tool,
which would give insight into CPU scheduling activity amongst other
things. Talk to Duncan Coutts
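As a rough illustration of what extending the existing eventlog route can look like from user code, here is a small sketch that records which capability a thread is running on. The helpers describe and logCapability, and the "cap-trace" prefix, are made up for this example; traceEventIO (Debug.Trace) and threadCapability (Control.Concurrent) are the real library functions.

```haskell
import Control.Concurrent
import Control.Monad (forM_, replicateM_)
import Debug.Trace (traceEventIO)

-- Hypothetical message format; the "cap-trace" prefix is only a convention
-- chosen here so the events are easy to pick out of the log.
describe :: String -> Int -> String
describe label cap = "cap-trace " ++ label ++ " on capability " ++ show cap

-- Write a user event naming the capability the calling thread is on.
logCapability :: String -> IO ()
logCapability label = do
  tid <- myThreadId
  (cap, _pinned) <- threadCapability tid
  traceEventIO (describe label cap)

main :: IO ()
main = do
  n <- getNumCapabilities
  done <- newEmptyMVar
  forM_ [0 .. n - 1] $ \i -> forkOn i $ do
    logCapability ("worker " ++ show i ++ " start")
    let s = sum [1 .. 1000000 :: Int]   -- stand-in for real work
    s `seq` logCapability ("worker " ++ show i ++ " end")
    putMVar done ()
  replicateM_ n (takeMVar done)
```

Built with "ghc -threaded -eventlog -rtsopts" and run with "+RTS -N -l", this should leave an .eventlog file that ghc-events or ThreadScope can read; traceEventIO does nothing useful unless event logging is enabled. Recent GHC versions may not need the -eventlog link flag.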
(I would like to acknowledge my fellow conspirators and leave them unnamed, lest they be embarrassed by my... naivete.)
Thanks, Sajith.
[1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurren...
[2] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurren...
[3] http://htop.sourceforge.net/
[4] http://groups.google.com/group/parallel-haskell/browse_thread/thread/7ec1ebc...
[5] http://www.open-mpi.org/projects/hwloc/