Parallel Haskell stories

Friends I'm giving a talk at a developer conference in London on Friday 18th, about parallel programming in Haskell. http://skillsmatter.com/event/scala/functionalpx-2011/ad-1382 I know that some of you have been using Haskell for parallel or concurrent programming quite a bit, so this email is just to ask could you contribute a little vignette or story about using Haskell in a *parallel/concurrent* application that I could use to illustrate my talk? I can rant all I like about the glories of Haskell, but I'm a biased witness. It's much more convincing if I can illustrate with true tales from the trenches. I'm interested in both parallelism and concurrency, so for example the success of Warp in using Concurrent Haskell is in scope. I can't say a lot about any one example, obviously, but what would be great would be - an idea of how Haskell helped (esp if you have a head to head comparison) - code snippets that illustrate how lovely it all is - brief performance indicators *Insight* is the key word. I don't just want to say "Company X used Haskell to do Y" because that doesn't convey any re-usable insights or ideas. What is the essence? I'll start writing in earnest on Monday 14th; the talk is on Friday. I have plenty to say already, so treat this as an opportunity not an obligation. But still, examples would be good! Do let me know if you have one. Simon

On 9 March 2011 17:18, Simon Peyton-Jones
I can't say a lot about any one example, obviously, but what would be great would be - an idea of how Haskell helped (esp if you have a head to head comparison) - code snippets that illustrate how lovely it all is - brief performance indicators *Insight* is the key word. I don't just want to say "Company X used Haskell to do Y" because that doesn't convey any re-usable insights or ideas. What is the essence?
For what it's worth… (I imagine you're looking for more speed-intensive projects, but I like talking about Haskell, so here goes) my pet project IRCd called Hulk[1]. I whipped up a simple IRCd in an evening, and then the next day we (the dev team where I work[4]) were using it! Each client runs as a lightweight thread. The initial version used MVars all over the place, basically imperative, typical sloppy "getting it done quickly" code. I rewrote it a couple days later so that only the code that accepts connections and responds on the sockets is impure (100~ lines), and all the main code is totally pure (700~ lines), written in a monad stack of reader (connection information), writer (replies) and state (user details) (I abuse them[2]). I wrote about 500 lines of the pure stuff before even running it, and it just worked. There is one MVar in the project, that holds a pure value of all the state, this along with totally pure code, which has the obvious benefits; testability, a kind of "transactional" isolation. Some things (like checking against a SHA1 encrypted password file) require IO, so I have a MonadProvider class[3], with methods for *providing* values that will be impurely-acquired in the real server, but while testing I can provide my own values in a pure Reader or whatnot (HUnit and even QuickCheck spring to mind). One benefit, which didn't occur to me until asked about it, is that I didn't have to think much about choosing to have a separate thread per client because threads are so light-weight, my (Common Lisper) colleague was taken aback, being used to only "expensive" OS threads. The process been running for the past three weeks and we use it for all our dev discussion and for displaying things from feeds with rss2irc like commits, service events, support tickets, etc. Not bad for a few evenings of hacking! Insights: (1) It's nice and actually practical to make everything pure apart from the truly non-pure things, the monad of which can be parametrized. (2) Use of light-weight threads feels very natural, letting you think as if you're working in a single thread (with, e.g. blocking sockets), with no real cost. (3) Rapid prototyping in Haskell is actually very easy and effective. [1]: https://github.com/chrisdone/hulk [2]: class Monad m => MonadProvider m where providePreface :: m (Maybe String) provideMotd :: m (Maybe String) provideKey :: m String providePasswords :: m String [3]: I could stick all these in the StateT, but I think Reader and Writer are like nice, statically-enforced documentation. newtype IRC m a = IRC { runIRC :: ReaderT (UTCTime,Conn) (WriterT [Reply] (StateT Env m)) a } deriving (Monad ,Functor ,MonadWriter [Reply] ,MonadState Env ,MonadReader (UTCTime,Conn)) [4]: FWIW, the development section of CREATE-NET, a research centre in Italy. We are hiring Haskellers for application dev…

A couple of years ago, Lambda Lounge [1] a local group of language
enthusiasts had a shootout [2] where you were supposed to implement a
simple vending machine [3] that took coins and dispensed candy.
With *very* little experience using Haskell I was able to implement a
vending machine server [4] that wrapped its candy and change stock in
a TVar so it took multiple concurrent requests and either dispensed
change/candy or block if there wasn't any stock. I think that STM's
ease of use impressed people.
If you're thinking of running the code, please let me know and I'll
get it working with a recent GHC.
-deech
[1] lambdalounge.org
[2] http://lambdalounge.org/2009/05/05/may-meeting/
[3] http://stllambdalounge.files.wordpress.com/2009/03/vendingmachinespecificati...
[4] http://patch-tag.com/r/deech2k/VendingMachine/snapshot/current/content/prett...
On Wed, Mar 9, 2011 at 11:50 AM, Christopher Done
On 9 March 2011 17:18, Simon Peyton-Jones
wrote: I can't say a lot about any one example, obviously, but what would be great would be - an idea of how Haskell helped (esp if you have a head to head comparison) - code snippets that illustrate how lovely it all is - brief performance indicators *Insight* is the key word. I don't just want to say "Company X used Haskell to do Y" because that doesn't convey any re-usable insights or ideas. What is the essence?
For what it's worth… (I imagine you're looking for more speed-intensive projects, but I like talking about Haskell, so here goes) my pet project IRCd called Hulk[1]. I whipped up a simple IRCd in an evening, and then the next day we (the dev team where I work[4]) were using it! Each client runs as a lightweight thread. The initial version used MVars all over the place, basically imperative, typical sloppy "getting it done quickly" code. I rewrote it a couple days later so that only the code that accepts connections and responds on the sockets is impure (100~ lines), and all the main code is totally pure (700~ lines), written in a monad stack of reader (connection information), writer (replies) and state (user details) (I abuse them[2]). I wrote about 500 lines of the pure stuff before even running it, and it just worked. There is one MVar in the project, that holds a pure value of all the state, this along with totally pure code, which has the obvious benefits; testability, a kind of "transactional" isolation. Some things (like checking against a SHA1 encrypted password file) require IO, so I have a MonadProvider class[3], with methods for providing values that will be impurely-acquired in the real server, but while testing I can provide my own values in a pure Reader or whatnot (HUnit and even QuickCheck spring to mind). One benefit, which didn't occur to me until asked about it, is that I didn't have to think much about choosing to have a separate thread per client because threads are so light-weight, my (Common Lisper) colleague was taken aback, being used to only "expensive" OS threads. The process been running for the past three weeks and we use it for all our dev discussion and for displaying things from feeds with rss2irc like commits, service events, support tickets, etc. Not bad for a few evenings of hacking! Insights: (1) It's nice and actually practical to make everything pure apart from the truly non-pure things, the monad of which can be parametrized. (2) Use of light-weight threads feels very natural, letting you think as if you're working in a single thread (with, e.g. blocking sockets), with no real cost. (3) Rapid prototyping in Haskell is actually very easy and effective. [1]: https://github.com/chrisdone/hulk [2]:
class Monad m => MonadProvider m where
providePreface :: m (Maybe String)
provideMotd :: m (Maybe String)
provideKey :: m String
providePasswords :: m String
[3]: I could stick all these in the StateT, but I think Reader and Writer are like nice, statically-enforced documentation.
newtype IRC m a = IRC {
runIRC :: ReaderT (UTCTime,Conn) (WriterT [Reply] (StateT Env m)) a
}
deriving (Monad
,Functor
,MonadWriter [Reply]
,MonadState Env
,MonadReader (UTCTime,Conn))
[4]: FWIW, the development section of CREATE-NET, a research centre in Italy. We are hiring Haskellers for application dev… _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Wed, Mar 9, 2011 at 11:18 AM, Simon Peyton-Jones
could you contribute a little vignette or story about using Haskell in a *parallel/concurrent* application that I could use to illustrate my talk?
The Cryptographic Protocol Shapes Analyzer (CPSA) attempts to enumerate all essentially different executions possible for a cryptographic protocol. At each step in the analysis, it infers what else must have happened based on an incomplete description of an execution. The steps form a derivation tree. If CPSA terminates on its input, each leaf of the derivation tree is a complete description of some executions, and the leaves collectively enumerate all possible executions. It is crucial for performance reasons that CPSA does not compute a tree, but instead computes a directed acrylic graph (DAG). To do so, it must identify pairs of nodes in the derivation that are isomorphic. For large problems, the isomorphism check dominates all other computation during the analysis. So how does CPSA use parallelism? CPSA expands the derivation DAG in a breadth-first order. For each node at the current level, the program performs the isomorphism check on all nodes higher up in the DAG in one thread. All threads then join, and in the main thread, the intra-level isomorphism checks are performed. It's that simple. At the code level, all that is done is replace one map call with a parallelized version of map that I saw in one of your papers: #if defined HAVE_PAR parMap :: (a -> b) -> [a] -> [b] parMap _ [] = [] parMap f (x:xs) = par y (pseq ys (y:ys)) where y = f x ys = parMap f xs Performance varies depending on the shape of the DAG. A wide bushy DAG makes better use of more processors. I have found one can often get a speedup of three on a quad processor machine. Of course performance also varies depending on the version of GHC runtime in use. A speedup of three may not sound like much, but on some problems, CPSA takes twenty-four hours to run on a single core. Parallelizing CPSA was not as simple as I made it sound. I had the restructure the sequential description of the algorithm to expose the parallelism to the runtime, and the algorithm became less obvious. Of course, this cost is well worth it. John

On Thu, Mar 10, 2011 at 6:04 PM, John D. Ramsdell
At the code level, all that is done is replace one map call with a parallelized version of map that I saw in one of your papers:
#if defined HAVE_PAR parMap :: (a -> b) -> [a] -> [b] parMap _ [] = []
Opps. Please delete the #if line. In the actual code, I have preserved the ability to compile CPSA using GHC 6.8 by using the C preprocessor as follows: #if defined HAVE_PAR parMap :: (a -> b) -> [a] -> [b] parMap _ [] = [] parMap f (x:xs) = par y (pseq ys (y:ys)) where y = f x ys = parMap f xs #else parMap :: (a -> b) -> [a] -> [b] parMap = map #endif John

On Wed, Mar 9, 2011 at 17:18, Simon Peyton-Jones
could you contribute a little vignette or story about using Haskell in a *parallel/concurrent* application that I could use to illustrate my talk?
Combinatorrent is a BitTorrent client written in Haskell, based upon the same ideas as Etorrent, a BitTorrent client written in Haskell. Combinatorrent uses an aggressive forkIO-style on top of STM Transactional channels for communication. The client implements the official protocol and several extensions in far fewer lines than contemporary clients, usually a factor of 4-5 in souce code lines. Four advantages present in Haskell (There is some overlap with Erlang here): * Spawning a new lightweight thread is extremely cheap. It encourages a coding style where you just spawn another process to do work for you rather than a style where you try, emphasis try, to harness a few OS threads. * Immutability and Persistence of the data plays a big role in getting the program simple. You can just send off a large data structure to another thread without having to worry that the other thread destructively destroys invariants. This in effect yields an effective way to use persistence to give quick caching of data in other threads, fast responsibility handover and so on. * Protocol parsing is a special case of an Applicative Functor. This yields protocol parsers that reads like specifications in RFCs, and still they run extremely fast. * We do configuration of BitTorrent extensions by having each extension be an element in a Monoid and then use function composition as the monoid-operation to chain together extensions at hook-points. This yields a very simple and very effective way to dynamically configure each peer we communicate with with the peers supported extensions. It also decouples the extension-code neatly from the main code. All in all, I was quite happy with using Haskell, even though I am currently more focusing on my Erlang client. I have read code in C++, C and Java clients, and it is scary how much work they have to do in order to achieve just part of the work. (Performance could be better, but this is GHC612 without the new IO-manager measurements - and I expect it to have deep impact). -- J.
participants (5)
-
aditya siram
-
Christopher Done
-
Jesper Louis Andersen
-
John D. Ramsdell
-
Simon Peyton-Jones