It seems like we could get some priority-based scheduling (and still be slackers) if we allow marked green threads to be strictly associated with a specific OS thread (forkChildIO?).

I think you want GHC.Conc.forkOnIO, which is GHC-only.
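
A minimal sketch of what pinning looks like, in case it helps. As far as I know, later releases of base expose this function as forkOn (in Control.Concurrent and GHC.Conc) rather than forkOnIO, so the sketch uses that name; swap in forkOnIO on an older GHC. Note that the Int argument names a capability (a Haskell execution context), not an OS thread as such. The helper name pinnedCapability is made up for illustration.

```haskell
import Control.Concurrent (forkOn, myThreadId, threadCapability)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- | Fork a green thread pinned to capability @n@ and report the
-- capability it is running on, plus whether the RTS considers it
-- locked there. (Hypothetical helper, not part of any library.)
pinnedCapability :: Int -> IO (Int, Bool)
pinnedCapability n = do
  done <- newEmptyMVar
  _ <- forkOn n $ do
    tid <- myThreadId
    -- threadCapability returns (capability number, pinned flag).
    threadCapability tid >>= putMVar done
  -- Block until the forked thread has reported back.
  takeMVar done

main :: IO ()
main = do
  (cap, pinned) <- pinnedCapability 0
  putStrLn ("capability " ++ show cap ++ ", pinned: " ++ show pinned)
```

To get more than one capability, build with -threaded and run with +RTS -N. This only stops the thread from migrating between capabilities; it does not by itself give you priorities.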

Suggestions like this are further motivation for the proposal [1] to adopt a re-engineered, Haskell-based RTS [2].

Tom

[1] http://www.reddit.com/r/haskell_proposals/comments/7itaz/simple_robust_maintainable_rts_for_ghc_io_pdf/
[2] http://www.seas.upenn.edu/~lipeng/homepage/papers/lmpjt07hw.pdf