
I was looking around for a Haskell solution for distributing a computation across multiple machines and couldn't find anything that looked current and alive. I did find that the Hadoop project (Java based) can interact with binary executables via stdin and stdout, so I set up a Hadoop cluster of 5 machines, wrapped my Haskell code to accept data via stdin and write results to stdout, and successfully executed it on the cluster.

I would prefer a more Haskell-oriented solution and welcome any suggestions. If not, maybe this will be of use to others.

Regards
brad
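The stdin/stdout wrapping described above can be sketched as a minimal Hadoop Streaming-style mapper. The word-count task here is illustrative only, not the original poster's actual computation:

```haskell
-- A minimal Hadoop Streaming-style mapper in Haskell: read all of
-- stdin, write tab-separated key/value pairs to stdout. The word-count
-- task is illustrative; any computation can be wrapped the same way.
module Main where

main :: IO ()
main = interact (unlines . map emit . words)
  where
    -- Hadoop Streaming expects one "key<TAB>value" pair per output line.
    emit w = w ++ "\t1"
```

Hadoop's streaming jar would invoke this binary as the mapper, feeding it input splits on stdin and collecting the emitted pairs from stdout.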

On 10/16/07, Brad Clow wrote:
I would prefer a more Haskell-oriented solution and welcome any suggestions. If not, maybe this will be of use to others.
Well, Hadoop is aiming towards a Google style of cluster processing, and the path towards that is pretty clear:

1) An XDR-like serialisation scheme with support for backwards compatibility (which involves unique-for-all-time ids in the IDL and "required", "optional" etc. tags). Data.Binary would be a great start for this, but it's sadly lazy in parsing and they never applied my patch for optional strictness, so one would probably have to start from scratch.

2) An RPC system which handles the most common use case: arguments and replies are serialised using the above system, TCP transport, simple timeouts, STM for concurrency.

Then you can start doing cool stuff like using the GHC API for code motion and building a simple MapReduce-like framework, etc.

--
Adam Langley
agl@imperialviolet.org
http://www.imperialviolet.org
650-283-9641
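Point 1 can be sketched with Data.Binary as it stands; the Request type, its fields, and the hand-written instance below are hypothetical. Note that `decode` is lazy, which is exactly the caveat raised above: a parse error may not surface until the result is forced.

```haskell
-- Sketch of serialising an RPC message with Data.Binary (point 1 above).
-- The Request type and its fields are hypothetical stand-ins for an
-- IDL-generated type. decode is lazy: errors in malformed input may not
-- appear until the decoded value is demanded.
import Data.Binary (Binary (..), encode, decode)
import qualified Data.ByteString.Lazy as BL

data Request = Request { reqId :: Int, reqPayload :: String }
  deriving Show

instance Binary Request where
  put (Request i p) = put i >> put p
  get = do
    i <- get
    p <- get
    return (Request i p)

main :: IO ()
main = do
  -- Round-trip a request through a lazy ByteString, as an RPC layer
  -- would before handing the bytes to a TCP transport.
  let bytes = encode (Request 42 "hello") :: BL.ByteString
  print (decode bytes :: Request)
```

A real scheme would also carry the per-field ids and required/optional tags mentioned above, so that old readers can skip fields added by newer writers.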

At Brooklyn College we have been running distributed Haskell (release 5, though) under Linux/Mosix for several years. There are some problems, but we think they are well under control.

Murray Gross
Brooklyn College, City University of New York

On Tue, 16 Oct 2007, Brad Clow wrote:
I was looking around for a Haskell solution for distributing a computation across multiple machines and couldn't find anything that looked current and alive. I found out that the Hadoop project (Java based) can interact with binary executables via stdin and stdout. So I have set up a Hadoop cluster of 5 machines, wrapped my Haskell code to accept data via stdin and write results to stdout and successfully executed it on the cluster.
I would prefer a more Haskell-oriented solution and welcome any suggestions. If not, maybe this will be of use to others.
Regards
brad

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (3)
- Adam Langley
- Brad Clow
- Murray Gross