
Hi fellow haskellers,

I'm interested in the performance of parallel and/or distributed implementations in Haskell. For example, suppose I want to develop an application that distributes a computation between many multicore computers. What advantages can I take from Haskell there? Has anyone here done distributed computing with Haskell who could point out some differences between Haskell and other languages?

Any feedback is very much appreciated. Thanks.

Regards,
--
Ivan Sichmann Freitas
Engenharia de Computação 2009
UNICAMP
http://identi.ca/ivansichmann
Grupo Pró Software Livre UNICAMP - GPSL

ivansichfreitas:
Hi fellow haskellers,
I'm interested in the performance of parallel and/or distributed implementations in Haskell. For example, suppose I want to develop an application that distributes a computation between many multicore computers. What advantages can I take from Haskell there? Has anyone here done distributed computing with Haskell who could point out some differences between Haskell and other languages?
Any feedback is very much appreciated. Thanks.
The primary work in this area has been GUM/GPH and GDH, these days led from St Andrews, http://www-fp.cs.st-andrews.ac.uk/wordpress/

* GUM, a parallel execution layer that allows GHC 'par' strategies and friends to run transparently across nodes, on top of an underlying message-passing layer. http://www.macs.hw.ac.uk/~dsg/gph/ GHC implements the same abstractions for shared-memory multicores.

* Glasgow Distributed Haskell (GDH), an extension of the Parallel Haskell language to support distribution and fault tolerance. http://www.macs.hw.ac.uk/~dsg/gdh/

These systems run on a fork of GHC's runtime. Recent work is underway to merge some of the facilities back into mainline GHC: http://hackage.haskell.org/trac/ghc/wiki/HackPar

There is also the Eden project, which I think is another implementation of a parallel Haskell model, with features to support distribution. The key players to talk to are Kevin Hammond, Jost Berthold, Hans-Wolfgang Loidl, Philip Trinder et al.

Besides "traditional" language research projects, there are also various open source libraries on Hackage that support distributed computation in Haskell to some degree or another. For example:

* DSTM, a framework for distributed STM: http://hackage.haskell.org/package/DSTM
* Holumbus-Distribution, distributed data structures like Chan, MVar or functions: http://hackage.haskell.org/package/Holumbus-Distribution
* Holumbus-MapReduce, a map-reduce skeleton: http://hackage.haskell.org/package/Holumbus-MapReduce
* net-concurrent, a simple Haskell library for doing parallel computation on several computers over the network: http://hackage.haskell.org/package/net-concurrent

-- Don
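For context, a minimal sketch of the 'par' annotation style that GUM/GPH runs transparently across nodes - this is the ordinary shared-memory version, using the standard Control.Parallel module, with naive Fibonacci as a placeholder workload:

    import Control.Parallel (par, pseq)

    -- Naive Fibonacci with the two recursive calls evaluated in parallel:
    -- 'par' sparks nfib (n - 2) for another core (or, under GUM, another
    -- node) while 'pseq' forces nfib (n - 1) in the current thread first.
    nfib :: Int -> Int
    nfib n
      | n < 2     = 1
      | otherwise = x `par` (y `pseq` x + y + 1)
      where
        x = nfib (n - 2)
        y = nfib (n - 1)

    main :: IO ()
    main = print (nfib 32)  -- compile with -threaded, run with +RTS -N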

http://book.realworldhaskell.org/read/concurrent-and-multicore-programming.h...
http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell

Although the last two edits on that page are from 2010 and 2009. So what *is* the current status of DPH?

J.W.

waldmann:
http://book.realworldhaskell.org/read/concurrent-and-multicore-programming.h...
http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell Although the last two edits on that page are from 2010 and 2009. So what *is* the current status of DPH?
Note that DPH is a programming model, but the implementation currently targets shared-memory multicores (and to some extent GPUs), not distributed systems.

As of GHC 6.10, DPH was in "technology preview" mode (alpha); since 6.12 it is more stable and more reliable, though significant work is still happening on the vectoriser in GHC HEAD (so expect even more reliable performance in GHC 6.14).

-- Don
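As a point of reference, the canonical DPH example from that wiki page is a dot product over parallel arrays. Treat this as a sketch only - the extension and flag names (ParallelArrays, -fvectorise; older releases used -XPArr) and the module layout shifted between GHC versions:

    {-# LANGUAGE ParallelArrays #-}
    {-# OPTIONS_GHC -fvectorise #-}
    module DotP (dotp) where

    import Data.Array.Parallel
    import qualified Data.Array.Parallel.Prelude.Double as D

    -- [:Double:] is a parallel array; the vectoriser compiles this
    -- whole-array expression to run across all cores.
    dotp :: [:Double:] -> [:Double:] -> Double
    dotp xs ys = D.sumP [: x * y | x <- xs | y <- ys :]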

Don Stewart
Note that DPH is a programming model, but the implementation currently targets shared memory multicores (and to some extent GPUs), not distributed systems.
Yes. I understand that's only part of what the original poster wanted, but I'd sure want to use GHC-generated code on a (non-distributed) GPU.

I keep telling students and colleagues that functional/declarative code "automatically" parallelizes, with basically "no extra effort" from the programmer (because it's all in the compiler) - but I would feel better with some real code and benchmarks to back that up.

GPU computing via GHC could be a huge marketing opportunity - if it works, it should be all over the front page of haskell.org?

J.W.

waldmann:
Don Stewart
writes: Note that DPH is a programming model, but the implementation currently targets shared memory multicores (and to some extent GPUs), not distributed systems.
Yes. I understand that's only part of what the original poster wanted, but I'd sure want to use ghc-generated code on a (non-distributed) GPU.
I keep telling students and colleagues that functional/declarative code "automatically" parallelizes, with basically "no extra effort" from the programmer (because it's all in the compiler) - but I would feel better with some real code and benchmarks to back that up.
Well, that's not really a good thing to say. Some subsets of Haskell parallelize automatically (like the array combinator libraries); others require simple annotations (like parallel strategies); others are more explicit still, like concurrent collections. There are many programming models, with varying degrees of power/usability.

-- Don
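To make the "simple annotations" point concrete, a minimal sketch using parallel strategies from the parallel package - the `expensive` function is just an illustrative workload, and the names follow the newer Control.Parallel.Strategies API:

    import Control.Parallel.Strategies (parMap, rseq)

    -- An ordinary map over a list; the only parallelism "annotation" is
    -- the choice of parMap plus a per-element evaluation strategy.
    expensive :: Int -> Integer
    expensive n = sum [1 .. fromIntegral n]

    main :: IO ()
    main = print (sum (parMap rseq expensive [200000 .. 200063]))
    -- compile with -threaded, run with +RTS -N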

waldmann:
functional/declarative code "automatically" parallelizes,
Well, that's not really a good thing to say.
Sure, sure, and I expand on the details in my lectures.
But in advertising (the elevator sales pitch), we simplify. Cf. "well-typed programs don't go wrong".
Good! I think we could all agree on a slogan based on the point that Haskell's approaches take parallelism from a difficult task to an easy one.

-- Don "expectations management" Stewart

Before Haskell took off with parallelism, it was assumed that Haskell would be trivial to run concurrently across cores, because the majority of Haskell programs were pure: you could simply run different functions on different cores and string the results together when you're done.

It turned out that such a naive method created massive overhead (to the point where it wasn't worth it), and so different concurrent paradigms were introduced into Haskell to provide parallelism (nested data structures, parallel strategies, collections, STM). In almost every one of these approaches, I believe, there is a compromise between ease of implementation and performance gains.

Haskell is still by far one of the best languages for dealing with concurrency/parallelism. In most other conventional languages used today (which are imperative or multi-paradigm), parallelism breaks modularity/abstraction - one of the main reasons why most desktop applications/games are still single-core, and the few exceptions use parallelism only in very trivial cases. This is mainly down to dealing with state (semaphores/mutexes). Although it is possible to program in other languages using 'pure' code, it's often very ugly (and in that case you may as well use Haskell). For a concrete taste of the STM approach mentioned above, see the sketch after the quoted message below.

On Tue, Sep 7, 2010 at 8:37 AM, Johannes Waldmann <waldmann@imn.htwk-leipzig.de> wrote:
Don Stewart
writes: Note that DPH is a programming model, but the implementation currently targets shared memory multicores (and to some extent GPUs), not distributed systems.
Yes. I understand that's only part of what the original poster wanted, but I'd sure want to use ghc-generated code on a (non-distributed) GPU.
I keep telling students and colleagues that functional/declarative code "automatically" parallelizes, with basically "no extra effort" from the programmer (because it's all in the compiler) - but I would feel better with some real code and benchmarks to back that up.
GPU computing via ghc could be a huge marketing opportunity - if it works, it should be all over the front page of haskell.org?
J.W.
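For a taste of the STM style mentioned above, a minimal sketch - the bank-transfer example is the standard textbook one, not code from this thread:

    import Control.Concurrent.STM

    -- Two balances in transactional variables. 'atomically' runs the
    -- whole transfer as one indivisible transaction, with no manual
    -- locking, and transfers compose into larger transactions.
    transfer :: TVar Int -> TVar Int -> Int -> STM ()
    transfer from to amount = do
      x <- readTVar from
      writeTVar from (x - amount)
      y <- readTVar to
      writeTVar to (y + amount)

    main :: IO ()
    main = do
      a <- newTVarIO 100
      b <- newTVarIO 0
      atomically (transfer a b 40)
      mapM_ (\v -> readTVarIO v >>= print) [a, b]  -- prints 60, then 40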

*Correction: where I said "majority of Haskell programs were pure" I meant "majority of code in Haskell programs was pure".
On Tue, Sep 7, 2010 at 11:07 AM, Mathew de Detrich wrote:
Before Haskell took off with parallelism, it was assumed that Haskell would be trivial to run concurrently across cores, because the majority of Haskell programs were pure: you could simply run different functions on different cores and string the results together when you're done.

Mathew de Detrich
Haskell is still by far one of the best languages to deal with concurrency/parallelism.
Sure, I fully agree. I am using concurrency (with explicit forkIO, communication via Chan) a lot - my Haskell application controls several external constraint solvers.

For parallelism, I'm just missing some benchmark code that I can run on my machine (i7 CPU, GTX 295 GPU, ghc-6.12.3) more or less "out-of-the-box" and that will impress my students and myself. (That is, get a speed-up of 8, or 480, without the program looking 8 times (or 480 times) uglier...)

- Johannes.
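For reference, a minimal sketch of the forkIO/Chan pattern described above - the squaring worker is a stand-in for a call to an external solver:

    import Control.Concurrent (forkIO)
    import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)

    -- One thread per task; each writes its result to a shared Chan, and
    -- the main thread collects one answer per worker it forked.
    main :: IO ()
    main = do
      results <- newChan :: IO (Chan Int)
      let inputs = [1 .. 4]
      mapM_ (\n -> forkIO (writeChan results (n * n))) inputs
      answers <- mapM (const (readChan results)) inputs
      print answers  -- squares, in completion order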

On 07/09/2010, at 6:11 PM, Johannes Waldmann wrote:
Mathew de Detrich
writes: Haskell is still by far one of the best languages to deal with concurrency/parallelism.
Sure, I fully agree.
I am using concurrency (with explicit forkIO, communication via Chan) a lot (my Haskell application controls several external constraint solvers).
For parallelism, I'm just missing some benchmark code that I can run on my machine (i7 CPU, GTX 295 GPU, ghc-6.12.3) more or less "out-of-the-box" and that will impress my students and myself. (That is, get a speed-up of 8, or 480, without the program looking 8 times (or 480 times) more ugly...)
The matrix-matrix multiplication benchmark from the Repa library does this. Check out:

http://www.cse.unsw.edu.au/~benl/papers/repa/repa-icfp2010.pdf
http://hackage.haskell.org/package/repa
http://hackage.haskell.org/package/repa-examples

Though be warned: you must use a recent GHC head build to get good performance. After GHC 7.0 is out (in a few weeks) we'll be able to release a properly stable version.

Note that "speedup" is an important consideration, but not the end of the story. It's harder to find a benchmark that displays all of nice code + speedup + good absolute performance. The first and last of these tend not to be friends.

Ben.
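For a flavour of the Repa style, a minimal sketch. Note that this follows the later repa 3 API (computeP); the version contemporary with this thread spelled things differently (e.g. force), so treat the details as illustrative:

    import Data.Array.Repa as R

    -- Square every element of a large vector in parallel, then sum.
    main :: IO ()
    main = do
      let n  = 1000000 :: Int
          xs = fromListUnboxed (Z :. n) [fromIntegral i | i <- [0 .. n - 1]]
                 :: Array U DIM1 Double
      ys <- computeP (R.map (\x -> x * x) xs) :: IO (Array U DIM1 Double)
      print (sumAllS ys)
    -- compile with -threaded -O2, run with +RTS -N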

2010/9/7 Ben Lippmeier
Though be warned you must use a recent GHC head build to get good performance. After GHC 7.0 is out (in a few weeks) we'll be able to release a properly stable version.
Pardon a probably stupid question, but did I miss something? http://hackage.haskell.org/trac/ghc/roadmap

David.

2010/9/7 David Virebayre
2010/9/7 Ben Lippmeier
Though be warned you must use a recent GHC head build to get good performance. After GHC 7.0 is out (in a few weeks) we'll be able to release a properly stable version.
Pardon a probably stupid question, but did I miss something ?
This is not stupid, but yes you missed something :)
http://www.reddit.com/r/haskell/comments/dad6j/unless_theres_a_major_hiccup_...

Cheers,
Thu

This is not stupid, but yes you missed something :) http://www.reddit.com/r/haskell/comments/dad6j/unless_theres_a_major_hiccup_...
Oh, I saw that thread, but at the time it had very few comments, so I definitely missed something! Thanks!

David.

On 9/7/10 05:08, David Virebayre wrote:
2010/9/7 Ben Lippmeier
Though be warned you must use a recent GHC head build to get good performance. After GHC 7.0 is out (in a few weeks) we'll be able to release a properly stable version.
Pardon a probably stupid question, but did I miss something? http://hackage.haskell.org/trac/ghc/roadmap
I think the change is because there's a new type inference engine coming, which justifies the major version bump.

--
brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH
participants (8)

- Ben Lippmeier
- Brandon S Allbery KF8NH
- David Virebayre
- Don Stewart
- Ivan S. Freitas
- Johannes Waldmann
- Mathew de Detrich
- Vo Minh Thu