Re: [Haskell-beginners] Parallel Processing in Libraries?

> I'd like to have a library which utilized parallel programming (mostly for map-reduce tasks).
Parallel computation is a complex topic, and many of the available solutions will not apply to a given problem. I find that the simplest, and most often overlooked, way of implementing parallel computation is to break the problem down into chunks that can each be computed by a single core.

Multi-threading is another topic entirely. While it is often related to parallelization, it is also often used only as a way to use more cores (to improve performance) rather than as a way to implement an algorithm that is otherwise serial. You then have to deal with all the difficulties of multi-threading: synchronization, shared memory access, thread-safe code, and errors that can affect the whole process instead of a single worker.

So, once you have broken the problem down into independent pieces, one way to parallelize it is with a task queue built on top of a message queue. The principle is pretty simple: using a message broker such as RabbitMQ, or a messaging library such as ZeroMQ, workers connect to the queue and listen for messages from a controller telling them to process some data with some function. First come, first served. Once a worker is done, it sends the reply back to the message queue or makes the result available by some other means (for example, in a folder with shared access). The worker can then notify other workers to process this data further.

The elegance of this system lies in how flexible the hardware that processes the load can be. For example, with a cloud provider, you can automatically spawn more VMs to handle a higher load and discard them once they have been idle long enough. On a single machine, you can start anywhere from one process to as many processes as you have cores (if your workload is almost 100% CPU-bound) and let the OS take care of scheduling them.

I am not aware of a complete task processing library for Haskell; for an example of a mature project that provides such features, check out Celery for Python. If you cannot find a substitute, I suppose you could always use Celery and call your Haskell code from the Python workers.
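
To make the controller/worker idea concrete, here is a minimal single-machine sketch in Haskell using only the base library. It rests on several assumptions of mine. An in-process `Chan` stands in for the message broker, and lightweight threads created with `forkIO` stand in for the separate worker processes described above. The names `chunksOf`, `worker` and `runTasks` are made up for this illustration and are not part of any task-queue library. A real deployment would replace the channels with connections to RabbitMQ or ZeroMQ.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forM_, replicateM, replicateM_)

-- Split a list into chunks of at most n elements (n must be positive).
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (chunk, rest) = splitAt n xs in chunk : chunksOf n rest

-- A worker repeatedly takes the next task from the queue (first come,
-- first served), processes it, and posts the result on the reply queue.
-- A Nothing message is the (hypothetical) shutdown signal.
worker :: (a -> b) -> Chan (Maybe a) -> Chan b -> IO ()
worker f tasks results = loop
  where
    loop = do
      msg <- readChan tasks
      case msg of
        Nothing   -> return ()
        Just task -> do
          let r = f task
          -- Force the result (to weak head normal form) in the worker,
          -- so the computation is not deferred to the controller.
          r `seq` writeChan results r
          loop

-- The controller starts the workers, enqueues every task, sends one
-- shutdown message per worker, and collects one reply per task.
-- Replies arrive in completion order, not submission order.
runTasks :: Int -> (a -> b) -> [a] -> IO [b]
runTasks nWorkers f tasks = do
  taskQ   <- newChan
  resultQ <- newChan
  replicateM_ nWorkers (forkIO (worker f taskQ resultQ))
  forM_ tasks (writeChan taskQ . Just)
  replicateM_ nWorkers (writeChan taskQ Nothing)
  replicateM (length tasks) (readChan resultQ)

main :: IO ()
main = do
  -- Map step: sum each chunk in a worker. Reduce step: sum the partial sums.
  partials <- runTasks 4 sum (chunksOf 1000 [1 .. 100000 :: Integer])
  print (sum partials)  -- prints 5000050000
```

The workers only run on several cores if the program is compiled with `ghc -threaded` and started with `+RTS -N`. Because replies come back in completion order, this sketch suits reductions such as a sum where the order does not matter; tasks would need to carry an identifier if the order has to be restored.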

participants (1): Sébastien Leblanc