Great!  I'm glad to hear folks are interested.  

It sounds like there is a need for a better low-dependency benchmark suite.  I was just grepping through nofib looking for things that are missing, and I realized there are no uses of atomicModifyIORef, for example.
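For illustration, a microbenchmark of that sort is tiny.  This is just a sketch I'm making up here (the module layout, thread count, and iteration count are arbitrary), not something taken from nofib:

module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM_)
import Data.IORef (atomicModifyIORef, newIORef, readIORef)

-- Four threads each bump a shared counter; compile with -threaded and
-- run with +RTS -N to actually exercise contention on the IORef.
main :: IO ()
main = do
  ref   <- newIORef (0 :: Int)
  dones <- forM [1 .. 4 :: Int] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      replicateM_ 100000 $ atomicModifyIORef ref (\n -> (n + 1, ()))
      putMVar done ()
    return done
  mapM_ takeMVar dones
  readIORef ref >>= print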

What we're working on at Indiana right now is not quite this effort, but rather a separate, complementary effort to gather as much data as possible from a large swath of packages (high dependency count).

Note that fibon already has bitrotted, and does not quite work any
more. So there is some low hanging fruit in resurrecting that one.

Agreed.  Though I see that nofib already contains some of them.  

Even though using stack with GHC HEAD loses many of stack's benefits, I think that stack and cabal freeze should make it easier to keep things running over the long term than it was with fibon (which bitrotted quickly).
 
Another important step in that direction would be to define a common
output for benchmark suites defined in .cabal files, so it is easier to
set up things like http://perf.haskell.org/ghc and
http://perf.haskell.org/binary for these projects.

Yes, exitcode-stdio-1.0 is useful for testing but not so much for benchmarking.  To harvest Stackage benchmarks, we were going to just assume everything is criterion and catch errors as we go.  Should we go further and aim to standardize a new value for "type:" within benchmark suites?
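To make that concrete, here is roughly the shape of target we'd expect to harvest.  It's purely a sketch, with the module and the fib workload invented for illustration; today it would sit under a "benchmark" stanza with type: exitcode-stdio-1.0 in the .cabal file, and the question is whether a dedicated type: value would buy us anything:

module Main (main) where

import Criterion.Main (bench, bgroup, defaultMain, nf)

-- Toy workload standing in for whatever the library actually exports.
fib :: Int -> Integer
fib n = go n 0 1
  where
    go 0 a _ = a
    go k a b = go (k - 1) b (a + b)

main :: IO ()
main = defaultMain
  [ bgroup "fib"
      [ bench "20" (nf fib 20)
      , bench "30" (nf fib 30)
      ]
  ]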

About the harness: haskell.org is currently paying a student (CCed) to
set up a Travis-like infrastructure based on gipeda (the software behind
perf.haskell.org) that would allow library authors to very simply get
continuous benchmark measurements. Let’s see what comes out of that!

What's the infrastructure that currently gathers the data for perf.haskell.org?  Is there a repo you can point to?  (Since gipeda itself is just the presentation layer, and something else must be running things & gathering data.)

Cheers,
 -Ryan