
On Wed, Oct 22, 2014 at 12:10 AM, Kazu Yamamoto wrote:
Out of curiosity, are these numbers from single runs, or do you average?
I ran it three times and took the middle result this time.
What are the uncertainties on these numbers? Even on the Rackspace machines I was finding very large variance in my benchmarks, largely due to far outliers. I didn't investigate too far, but it seems that a non-trivial fraction of connections were failing.
If the cores are in sleep mode, the results are poor. You need to warm the cores up somehow.
I forget how to disable deep sleep mode from the command line on Linux. (Open a special file and write something to it?) I believe Andi knows how.
You need to force the CPU into C0 using /dev/cpu_dma_latency. Here's a short paper with a program that shows how to do it [1]. The Mio paper mentions this, and the results are pretty dramatic: "We disable power-saving by specifying the maximum transition latency for the CPU, which forces the CPU cores to stay in C0 state. Figure 12 shows the results, with the curves labelled “Default” and “NoSleep” showing the performance in the default configuration and the default configuration with power-saving disabled, respectively. Without limiting the CPU sleep states (curve “Default”), SimpleServerC cannot benefit from using more CPU cores and the throughput is less than 218,000 requests per second. In contrast, after preventing CPU cores entering deep sleep states (curve “NoSleep”), SimpleServerC scales up to 20 cores and can process 1.2 million requests per second, approximately 6 times faster than with the default configuration."[2]
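Roughly, the trick is to open /dev/cpu_dma_latency, write a latency target of 0 microseconds, and keep the file descriptor open for the duration of the run; closing it restores the default power management. Here is a minimal C sketch of that approach (assuming the standard Linux PM QoS interface; this is an illustration, not the program from [1]):

    /* Hold the CPUs in C0 by requesting a 0 us DMA latency via PM QoS.
     * The constraint only applies while the file descriptor stays open. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int32_t target = 0;  /* 0 microseconds: cores must stay in C0 */
        int fd = open("/dev/cpu_dma_latency", O_WRONLY);
        if (fd < 0) {
            perror("open /dev/cpu_dma_latency");
            return 1;
        }
        if (write(fd, &target, sizeof(target)) != sizeof(target)) {
            perror("write");
            close(fd);
            return 1;
        }
        printf("C-states limited; press Enter to release...\n");
        getchar();  /* closing the fd (or exiting) restores the defaults */
        close(fd);
        return 0;
    }

Run it in one terminal (as root), run the benchmark in another, then hit Enter to release the constraint.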
In my experience, a 1G network is NOT good enough.
The Rackspace machines come with bonded 10GigE, so hopefully over the internal DC network they can handle that. :)

[1] http://en.community.dell.com/cfs-file/__key/telligent-evolution-components-a...
[2] Section 5.1, http://haskell.cs.yale.edu/wp-content/uploads/2013/08/hask035-voellmy.pdf

--
Regards,

Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/