
On 15 December 2005 10:21, Joel Reymont wrote:
Here are statistics that I gathered. I'm almost done modifying the program to use 1 timer thread instead of 1 per bot as well as writing to the socket from the writer thread. This should reduce the number of threads from 6k (2k x 3) to 2k plus change.
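[Editorial sketch, not Joel's actual code.] The single-timer-thread idea can be illustrated roughly as follows, assuming a shared list of pending timers that one thread ages on a fixed tick. All names here (TimerQueue, addTimer, tick, startTimerThread) are hypothetical, and the tick-based design is an assumption, not something stated in the thread.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar
import Control.Monad (forever)
import Data.List (partition)

-- A pending timer: remaining ticks, and the action to run on expiry.
-- (Hypothetical representation, chosen for brevity.)
type Timer = (Int, IO ())

-- All pending timers live behind one MVar shared by every bot.
newtype TimerQueue = TimerQueue (MVar [Timer])

newTimerQueue :: IO TimerQueue
newTimerQueue = TimerQueue <$> newMVar []

-- Register an action to fire after n ticks.
addTimer :: TimerQueue -> Int -> IO () -> IO ()
addTimer (TimerQueue mv) n act = modifyMVar_ mv (return . ((n, act) :))

-- Pure step: age every timer by one tick and split off the expired ones.
tick :: [Timer] -> ([Timer], [IO ()])
tick ts = (alive, map snd expired)
  where
    aged             = [(n - 1, a) | (n, a) <- ts]
    (expired, alive) = partition ((<= 0) . fst) aged

-- The one timer thread: sleep for a tick, then run whatever expired.
-- Expired actions run outside the MVar lock, so they may call addTimer.
startTimerThread :: TimerQueue -> Int -> IO ()
startTimerThread (TimerQueue mv) tickMicros = do
  _ <- forkIO $ forever $ do
    threadDelay tickMicros
    fired <- modifyMVar mv (return . tick)
    sequence_ fired
  return ()
```

Each bot would call addTimer instead of forking its own timer thread. The expired actions run on the timer thread itself, so in this sketch they should be short (for example, posting a message to the bot) or they will delay the other timers.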
It appears that +RTS -k3k does make a difference. As per Simon, 2-4k avoids the thread being garbage collected because each thread gets its own block in the storage manager. Simon, did I get that right?
BTW, how does garbage-collecting a thread work in this scenario? My threads are very long-running.
The total is the number of bots launched, lobby is how many bots connected to the lobby. Failed is mostly due to connection reset by peer errors. The Windows C++ server uses IOCP and running a firewall was apparently interfering with that somehow. I hate Windows :-(.
--- Test #1, +RTS -k3k as per Simon. Keep-alive timeout of 9 minutes.
Total: 1961, Lobby: 1961, Failed: 0
Total: 2000, Lobby: 2000, Failed: 1
This test went smoothly and got to 2k connections very quickly. Maybe within 30 minutes or so. I did not gather CPU usage, etc. statistics.
--- Test #2, No thread stack increase, 1 minute keep-alive timeout, more network traffic
With a 1 minute timeout things run veeery slow. 86 physical and 158Mb of VM with 1k bots, CPU 50-60%. Data sent/received is 60-70 packets and 6-7kb/sec. Killed after a while.
The statistics are phys/VM, CPU usage in % and #packets/transfer speed
Total: 1345, Lobby: 1326, Failed: 0, 102/184, 50%, 90/8kb
Total: 1395, Lobby: 1367, Failed: 2
Total: 1421, Lobby: 1394, Failed: 4
Total: 1490, Lobby: 1463, Failed: 4, 108/194, 50%, 110/11kb
Total: 1574, Lobby: 1546, Failed: 4, 113/202, 50%, 116/11kb
Hmm, your machine is spending 50% of its time doing nothing, and the network traffic is very low. I wouldn't expect 2k connections to pose any problem at all, so further investigation is definitely required.
With 2k connections the overhead of select() is going to start to be a problem. You would notice the system time going up. -threaded may help with this, because it calls select() less often.
If that's not the cause, we should find out what your app is doing while it's idle. If there are runnable threads (e.g. the launcher), then the app should not be spending any of its time idle.
Cheers,
Simon