Hm, has that been optimized to output all at once? The implementation I recall is more or less mapM_ putChar, deferring the lock to putChar, which never gets invoked because the list is empty.
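For reference, the version I had in mind looks roughly like this. It is a sketch from memory, not the actual library source, and `naivePutStr` is a name made up for illustration:

```haskell
-- Hypothetical sketch of the remembered implementation; not the real library code.
naivePutStr :: String -> IO ()
naivePutStr = mapM_ putChar

main :: IO ()
main = do
  naivePutStr ""      -- mapM_ over an empty list: putChar is never called
  naivePutStr "hi\n"  -- one putChar call per character
```

With this shape, any locking happens inside each putChar call, so an empty string never touches the handle at all.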
Okay, just checked: it reserves the handle up front, and the implementation above (done directly rather than via mapM_) is used only in the NoBuffering case, through an internal function that doesn't reserve the handle itself. That will complicate understanding what's going on, although my earlier suggestion about unbuffering output still applies.
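In case it helps, unbuffering looks something like the following. This is a minimal sketch that assumes the handle in question is stdout; substitute whichever handle you are actually writing to:

```haskell
import System.IO (BufferMode (NoBuffering), hSetBuffering, stdout)

main :: IO ()
main = do
  -- Assumption: stdout is the handle being written to.
  hSetBuffering stdout NoBuffering
  putStr ""        -- empty write; with NoBuffering this goes down the unbuffered path
  putStrLn "done"  -- written to the handle immediately rather than held in a buffer
```

hSetBuffering and NoBuffering are both exported from System.IO, so nothing outside base is needed.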