
Folks,

In my current architecture I launch two threads per socket: the socket reader places results in a TMVar and the socket writer takes its input from a TChan. I also have a worker thread that does the bulk of the packet processing, plus a timer thread. The timer thread sleeps for a few minutes and, if it hasn't been killed before then, posts a timeout event and exits.

My goal is to launch 2,000 poker bots that join the server "lobby" and sit there sending small keep-alive packets every few minutes. The ultimate goal is 4,000 bots actually playing, but I'm taking it one step at a time.

This is Mac OS X Tiger with a couple of header files modified to allow an FD_SETSIZE of 10240, which is the maximum allowed by 'ulimit -n'. I'm running ghc 6.4.1, compiled after FD_SETSIZE was increased.

I can get to 2k bots without any trouble if I use a keep-alive timeout of 9 minutes. Memory usage with 2k bots is 161Mb of physical memory and 262Mb of VM; CPU usage is 20-40%. Memory usage stays constant once all the bots have been launched.

With a 1-minute keep-alive timeout the system starts to get stressed almost right away. There's verbose logging going on, and almost every event/packet sent and received is traced. The extra logging of the timeout events probably adds to the stress, and I assume the extra packets do too. New bots are being launched very slowly even with just 200 bots already running.

Based on the above, would you have any suggestions for an improved architecture?

I will try 1) disabling logging altogether and 2) increasing the thread stack size to 3k (+RTS -k3k), as per Simon Marlow's suggestion. According to Simon, if a thread's stack is between 2k and 4k then each thread gets its own memory block (right, Simon?) and those threads are not GCd then. I'm a bit concerned about tripling my memory use with -k3k, though.

I'm not sure if switching to a continuations-based framework would help me. Has anyone tried this?

	Thanks, Joel

--
http://wagerlabs.com/
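
P.S. To make the thread layout above concrete, here is a stripped-down sketch of what one bot looks like. This is simplified, not my actual code: the names (readerLoop, writerLoop, keepAliveTimer, workerLoop, launchBot), the String packets, the shared event channel and the hard-coded 9-minute interval are all just for illustration.

module BotSketch where

import Control.Concurrent (forkIO, threadDelay, ThreadId)
import Control.Concurrent.STM
import System.IO (Handle, hGetLine, hPutStrLn)

-- Placeholder types; the real packet and event types are richer.
type Packet = String
data Event  = Timeout | Received Packet

-- Socket reader: blocks on the handle and hands each packet to the
-- worker through a TMVar, one at a time.
readerLoop :: Handle -> TMVar Packet -> IO ()
readerLoop h inbox = do
    pkt <- hGetLine h
    atomically $ putTMVar inbox pkt
    readerLoop h inbox

-- Socket writer: drains a TChan and pushes everything out on the socket.
writerLoop :: Handle -> TChan Packet -> IO ()
writerLoop h outbox = do
    pkt <- atomically $ readTChan outbox
    hPutStrLn h pkt
    writerLoop h outbox

-- Keep-alive timer: sleeps for the given number of minutes, then posts
-- a Timeout event and exits, unless it was killed before that.
keepAliveTimer :: Int -> TChan Event -> IO ThreadId
keepAliveTimer minutes events = forkIO $ do
    threadDelay (minutes * 60 * 1000000)
    atomically $ writeTChan events Timeout

-- Worker: waits for either an incoming packet or a timer event.  On
-- Timeout it queues a keep-alive for the writer and re-arms the timer.
workerLoop :: TMVar Packet -> TChan Packet -> TChan Event -> IO ()
workerLoop inbox outbox events = do
    ev <- atomically $
            (Received `fmap` takeTMVar inbox) `orElse` readTChan events
    case ev of
      Received _pkt -> return ()                -- packet processing goes here
      Timeout       -> do
          atomically $ writeTChan outbox "keep-alive"
          _ <- keepAliveTimer 9 events
          return ()
    workerLoop inbox outbox events

-- One bot = reader + writer + worker + timer, all sharing one socket.
launchBot :: Handle -> IO ()
launchBot h = do
    inbox  <- newEmptyTMVarIO
    outbox <- newTChanIO
    events <- newTChanIO
    _ <- forkIO $ readerLoop h inbox
    _ <- forkIO $ writerLoop h outbox
    _ <- keepAliveTimer 9 events
    workerLoop inbox outbox events

The real code differs in the details (binary packets, logging, killing and re-arming the timer when real traffic arrives), but this is the shape of it.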