
Parallel Haskell Digest ======================= Edition 1 2011-03-31 http://www.well-typed.com/blog/52 Hello Haskellers! If you're in the mood for a HWN chaser, I'd like to introduce the first Parallel Haskell Digest, a newsletter aiming to show off all the work that's going on using parallelism and concurrency in the Haskell community. The digest is a bit of an experiment. For the community in general, I hope to help you catch up with a monthly recap of news, interesting blog posts and discussions about parallelism in Haskell. For people (like me) who are new to parallelism and concurrency in Haskell, or maybe just have a passing interest, I hope to give you little nibbles, with regular features like the Word of Month, Featured Code and Parallel Puzzlers. Finally, as a bit of disclosure, I'm writing this digest as part of my work to promote the Parallel GHC Project. So don't be surprised if I give it a little special emphasis :-) Word of the Month ----------------- This very first edition of the PH digest is brought to you by the word /spark/. You may have heard of sparks and threads in the same sentence. What's the difference? A Haskell thread is a thread of execution for IO code. Multiple Haskell threads can execute IO code concurrently and they can communicate using shared mutable variables and channels. Sparks are specific to parallel Haskell. Abstractly, a spark is a pure computation which may be evaluated in parallel. Sparks are introduced with the par combinator; the expression (x `par` y) "sparks off" x, telling the runtime that it may evaluate the value of x in parallel to other work. Whether or not a spark is evaluated in parallel with other computations, or other Haskell IO threads, depends on what your hardware supports and on how your program is written. Sparks are put in a work queue and when a CPU core is idle, it can execute a spark by taking one from the work queue and evaluating it. On a multi-core machine, both threads and sparks can be used to achieve parallelism. Threads give you concurrent, non-deterministic parallelism, while sparks give you pure deterministic parallelism. Haskell threads are ideal for applications like network servers where you need to do lots of I/O and using concurrency fits the nature of the problem. Sparks are ideal for speeding up pure calculations where adding non-deterministic concurrency would just make things more complicated. Parallel GHC project news ---------------------------------------------------------------------- The Parallel GHC Project is an MSR-funded project to push the real-world use of parallel Haskell. Part of this project involves effort by Well-Typed to provide tools for use by the general community: Work shall soon begin working on making the "Modified Additive Lagged Fibonacci" and perhaps other random number generators from the SPRNG library available in Haskell. Current plans are to implement the algorithms directly in Haskell, and to expose them as instances of System.Random.RandomGen. These generators are attractive for use in Monte Carlo simulations because they are splittable and have good statistical quality, while providing high performance. Work is underway on extending the GHC EventLog and associated tools (ghc-events, ThreadScope). The aim is to support profiling of multi-process or distributed Haskell systems such as client/server or MPI programs. This involves incorporate some changes made in related projects (Eden, EdenTV). This work may have some benefits even for profiling single-process programs. It should also allow comparative profiling where multiple runs of the same program (e.g. different inputs or slightly different code) are viewed on the same timeline. For more information on the Parallel GHC project, see http://www.haskell.org/haskellwiki/Parallel_GHC_Project Featured Code ---------------------------------------------------------------------- The feature code for this month is hulk, an IRC server in 1250 lines of code. Hulk was coded up by Chris Done in one evening, and it was used the next morning. (Chris has done a bit of cleaning up since). Here's what Chris had to say about his experience writing Hulk: Haskell's lightweight threads make it natural (and guilt-free!) to choose the design of one-thread-per-client. With a single MVar containing a pure value for the whole server-state, and the help of a monad stack of ReaderT(connection information), WriterT (replies) and StateT (user details), it was trivial to make the bulk of the code completely pure. LineBuffering on the (Handle-wrapped) sockets tripped me up; as opposed to NoBuffering, this did not behave as expected: *many* threads would block when only *one* should have. Overall, it was textbook Haskell; keep the main code pure, and only the truly impure in the outer shell. It's up on hackage so you can install it with a quick cabal install hulk You can also fork his code on GitHub https://github.com/chrisdone/hulk Got a cool use of Parallel Haskell or an interesting problem? Nominate it for the PH digest featured code/parallel puzzler! Blog Posts ---------------------------------------------------------------------- 1. Parallelism is not Concurrency - Bob Harper https://existentialtype.wordpress.com/2011/03/17/parallelism-is-not-concurre... 2. Petri Net Concurrency - Edward Z. Yang http://blog.ezyang.com/2011/03/petri-net-concurrency Parallel-Haskell and Haskell Cafe ---------------------------------------------------------------------- 1. How does the Yesod web framework deal with concurrency? Ertugrul Soeylemez is using concurrent Haskell to serve up two related sites and spawn off some background workers would pose any problems with Yesod and WAI. No problem, according to Michael Snoyman http://www.haskell.org/pipermail/haskell-cafe/2011-January/088754.html 2. Concurrency best practices? Wren Ng Thorton is using STM to "run a lot of things in parallel without the headaches of locks." It works great, but then what if you want IO? How would you enforce atomic IO operations, eg. printing log messages in two threads without getting garbage on screen? Some suggestions were the twighlight-stm package, and using STM.TChan with potential performance penalty. http://www.haskell.org/pipermail/haskell-cafe/2011-February/088987.html 3. Possible bug in Control.Concurrent Krzysztof Skrzętnicki encountered an unexpected deadlock using Control.Concurrent. Was it a normal consequence of repeatedly reading from channel with only a single element put into it, or a bug? - http://www.haskell.org/pipermail/haskell-cafe/2011-February/089135.html - http://hackage.haskell.org/trac/ghc/ticket/4154 4. Faster timeout but is it correct? Bas van Dijk proposed increasingly faster potential replacements for System.Timeout, including some that use the new GHC event manager. After much scrutiny, it turns out that one version is not faster and the events based one had a bug. Maybe next time! - http://www.haskell.org/pipermail/haskell-cafe/2011-February/089360.html - http://hackage.haskell.org/trac/ghc/ticket/4963 5. Unable to get parallelism using `par`. SMP parallelism gains inferior than expected. C K Kashyap followed /A tutorial on Parallel and Concurrent programming in Haskell/ but couldn't get a speed-up. Others had better luck with different sets of inputs José Pedro Magalhães also tried using SMP parallelism without getting performance gains he was hoping for. Simon Marlow suggested analysing with Threadscope, checking for too many/few sparks, and checking to see if a lot of time was being spent in GC. - http://www.haskell.org/pipermail/haskell-cafe/2011-February/089419.html - http://research.microsoft.com/en-us/um/people/.../parallel/AFP08-notes.pdf - http://www.haskell.org/pipermail/haskell-cafe/2011-February/089635.html 6. Auto elimination of MVars using a monad or monad transformer. Chris Dew asked if GHC eliminates MVars during compilation, or if there was a way to eliminate them automatically. Ryan Ingram suggested looking into the Adaptive package and Functional Reactive Programming. http://www.haskell.org/pipermail/haskell-cafe/2011-February/089660.html Stack Overflow ---------------------------------------------------------------------- 1. Parallelism on divide & conquer algorithm (7 Feb) http://stackoverflow.com/questions/4920077/parallelism-on-divide-conquer-alg... 2. How to best synchronize game engine and network server in Haskell? (11 Feb) http://stackoverflow.com/questions/4973401/how-to-best-synchronize-game-engi... 3. Haskell concurrency and Handles (27 Feb) http://stackoverflow.com/questions/5136375/haskell-concurrency-and-handles 4. Strict evaluation techniques for concurrent channels in Haskell (6 Mar) http://stackoverflow.com/questions/5211827/strict-evaluation-techniques-for-... Help and feedback ---------------------------------------------------------------------- Got something for the digest? Send me an email at eric@well-typed.com. I'm particularly interested in short parallelism/concurrency puzzles, cool projects for featured code. Other comments and feedback (criticism and corrections especially!) would be welcome too. -- Eric Kow http://www.nltg.brighton.ac.uk/home/Eric.Kow For a faster response, try +44 (0)1273 64 2905 or xmpp:kowey@jabber.fr (Jabber or Google Talk only)