
On 4/14/07, Fawzi Mohamed wrote:
But making my worker threads by hand, when I know the number of workers, will be more efficient. My program is extremely parallel, but putting a par everywhere would be very costly in memory and would probably break the program up in the wrong places. I know where I should spark threads so that I end up with a few high-level tasks and coarse-grained parallelism, and if I know the number of workers I can do this much more efficiently by hand. Furthermore, I can put the tasks in a queue in order of decreasing cost and get rather good load balancing without having to think too much about a static distribution. So a hand-rolled thread pool does make sense; doing it with OS threads and trying to beat the GHC runtime does not.
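For reference, the hand-rolled pool you describe could look roughly like the following untested sketch (the name runPool and the cost field are purely illustrative, and error handling is left out): a fixed number of workers pull actions off a Chan that was filled in decreasing-cost order.

import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, writeChan, readChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM, replicateM_)
import Data.List (sortBy)
import Data.Ord (comparing)

-- One IO action per task, paired with an estimated cost.
runPool :: Int -> [(Double, IO ())] -> IO ()
runPool nWorkers tasks = do
  chan <- newChan
  -- enqueue in decreasing cost order, then one Nothing per worker as a stop signal
  forM_ (sortBy (flip (comparing fst)) tasks) $ \(_, act) ->
    writeChan chan (Just act)
  replicateM_ nWorkers (writeChan chan Nothing)
  -- spawn the workers and wait for all of them to finish
  dones <- replicateM nWorkers $ do
    done <- newEmptyMVar
    _ <- forkIO (worker chan >> putMVar done ())
    return done
  mapM_ takeMVar dones
  where
    worker q = do
      next <- readChan q
      case next of
        Nothing  -> return ()
        Just act -> act >> worker q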
I think you should probably consider the extremely lightweight forkIO threads as your "work items" and the GHC runtime as your thread pool system: it will work out how many OS threads you want from the RTS options and distribute the work for you. If you're worried about memory efficiency, you can tweak the initial stack sizes for threads etc. using runtime options.

It's still true that you don't want to fork off trivial computations in separate threads, BUT that's true for manual work-item queues as well: you want each work item to be a substantial amount of computation, because there is overhead per item. For example, if you have a list you might not want one thread per element (and you wouldn't want one work item per element either) if the per-element tasks are fairly trivial, so you'd first group the list into chunks and then let each chunk be a work item, i.e. spawn a forkIO thread to process it (see the sketch at the end of this message).

I'd be interested in seeing benchmarks on this, but I do think you'll be better off just spawning a lightweight thread per task, rather than first wrapping each task in some data structure as a work item, putting it in a queue, and then popping items off the queue into threads. Doing it that way seems like it would just be duplicating work.

-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
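To make the chunking idea concrete, here is a rough, untested sketch (parMapChunked and chunksOf are just names made up for illustration): split the list into chunks, fork one lightweight thread per chunk, and collect the results through MVars. The evaluate calls force each result to weak head normal form, so the work actually happens in the forked thread rather than later when the main thread inspects the list.

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)

-- Split a list into chunks of at most n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- Apply f to every element, one forkIO thread per chunk; each chunk is one "work item".
parMapChunked :: Int -> (a -> b) -> [a] -> IO [b]
parMapChunked chunkSize f xs = do
  vars    <- mapM spawn (chunksOf chunkSize xs)
  results <- mapM takeMVar vars
  return (concat results)
  where
    spawn chunk = do
      v <- newEmptyMVar
      _ <- forkIO $ do
        -- force each element to WHNF so the computation runs in this thread
        ys <- mapM evaluate (map f chunk)
        putMVar v ys
      return v

You would compile with -threaded and run with something like +RTS -N2 to use two OS threads; the chunk size controls how much work each lightweight thread gets.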