Ryan Newton:
But, anyway, it turns out that my example above is easily transformed from a bad GHC performance story into a good one.  If you'll bear with me, I'll show how below.
First, Manuel makes a good point about the LLVM backend.  My "6X" anecdote was from a while ago and I didn't use LLVM [1].  I redid it just now with 7.4.1+LLVM, results below.  (The table below should read correctly in a fixed-width font, but you can also see the data in the spreadsheet here.)

                     Time (ms)   Compiled file size   Compile+Runtime (ms)
GHC 7.4.1 O0         2444        1241K
GHC 7.4.1 O2         925         1132K                1561
GHC 7.4.1 O2 llvm    931         1133K
GHC 7.0.4 O2 via-C   684         974K

So LLVM didn't help [1].  In fact, the deprecated via-C backend did best!

I would classify that as a bug.

[1] P.P.S. Most concerning to me about Haskell/C++ comparisons are David Peixotto's findings that LLVM optimizations are not very effective on Haskell-generated LLVM compared with typical clang-generated LLVM.

There is some work underway to improve the situation, but I am sure more could be done.

Manuel