As usual, I could suggest a really crazy alternative: it would be possible to design your code as an EDSL that can emit its own source code, in the same way that web formlets emit their HTML rendering.

In this case, instead of an HTML rendering, the rendering would be the source code of the closure that you want to execute remotely. You could then compile it either on the emitting node or on the receiver.

The advantage is that you could remotely execute any routine coded in the EDSL, hiding the mechanism behind a few EDSL primitives.

The disadvantage is that you cannot use IO routines via liftIO. You need a special lifting mechanism that also produces the source code of the IO routine.
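
A minimal sketch of the idea (all the names here are hypothetical, not an existing library): each EDSL value carries both the computation and the source text that reconstructs it, and IO is brought in through a special lift that takes the source alongside the action.

data Remote a = Remote
  { runLocal :: IO a      -- the action itself
  , render   :: String    -- source text that rebuilds the action remotely
  }

-- Plain liftIO is not enough: an arbitrary IO action has no source form,
-- so the special lift must be handed the code together with the action.
liftWithSource :: IO a -> String -> Remote a
liftWithSource = Remote

-- An example primitive built from the special lift.
readInput :: FilePath -> Remote String
readInput p = liftWithSource (readFile p) ("readFile " ++ show p)

-- The rendering can be written out as a Main module and compiled (or run
-- with runhaskell) on either the sending or the receiving node.
emitMain :: Remote a -> String
emitMain r = unlines
  [ "module Main where"
  , "main :: IO ()"
  , "main = (" ++ render r ++ ") >> return ()"
  ]

Sequencing would need applicative-style combinators, since rendering a monadic bind would require the source code of the continuation, which is exactly what is not available.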

I'm doing some research on this mechanism with the Transient monad.

2015-03-16 1:54 GMT+01:00 felipe zapata <tifonzafel@gmail.com>:
I haven't considered that idea, but it seems the natural solution.

Many thanks

On 15 March 2015 at 20:31, Ozgun Ataman <ozgun.ataman@soostone.com> wrote:
Anecdotal support for this idea: This is exactly how we distribute hadron[1]-based Hadoop MapReduce programs to cluster nodes at work. The compiled executable essentially ships itself to the nodes and recognizes the different environment when executed in that context. 

[1] hadron is a Haskell Hadoop streaming framework that came out of our work. It's on GitHub and close to being released on Hackage once the current dev branch is finalized/merged. In case it's helpful: https://github.com/soostone/hadron
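
A rough illustration of the "same binary, different role" idea (the environment marker below is purely an assumption for the sketch, not something hadron or Hadoop defines): the single executable checks its environment on startup and decides whether it is the driver or a task running on a node.

import System.Environment (lookupEnv)

-- ON_CLUSTER_NODE is a hypothetical marker assumed to be set only on
-- cluster nodes; substitute whatever your environment actually provides.
main :: IO ()
main = do
  role <- lookupEnv "ON_CLUSTER_NODE"
  case role of
    Just _  -> runAsTask    -- executed on a node: behave as the map/reduce task
    Nothing -> runAsDriver  -- executed locally: ship the binary and submit the job

runAsTask :: IO ()
runAsTask = putStrLn "running as a cluster task"

runAsDriver :: IO ()
runAsDriver = putStrLn "running as the driver"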

Oz

On Mar 15, 2015, at 8:06 PM, Andrew Cowie <andrew@operationaldynamics.com> wrote:

Bit of a whinger from left-field, but rather than deploying a Main script and then using GHCi, have you considered compiling the program and shipping that?

Before you veto the idea out of hand: statically compiled binaries are good for being almost self-contained, and (depending on what you changed) they rsync well. If that doesn't appeal, consider instead building the Haskell program dynamically; a Hello World is only a couple of kB, and a serious program only a hundred or so.

Anyway, I know you're just looking to send a code fragment closure, but if you're dealing with the input and output of the program through a stable interface, then the program is the closure.

Just a thought.

AfC

On Mon, Mar 16, 2015 at 9:53 AM felipe zapata <tifonzafel@gmail.com> wrote:
Hi all,
I have posted the following question on stackoverflow, but so far I have not received an answer.
http://stackoverflow.com/questions/29039815/distributing-haskell-on-a-cluster


I have a piece of code that processes files:

processFiles :: [FilePath] -> (FilePath -> IO ()) -> IO ()

This function spawns an async process that executes an IO action. This IO action must be submitted to a cluster through a job scheduling system (e.g. Slurm).

Because I must go through the job scheduling system, it's not possible to use Cloud Haskell to distribute the closure. Instead, the program writes a new Main.hs containing the desired computations, which is copied to the cluster node together with all the modules that Main depends on, and then executed remotely with "runhaskell Main.hs [opts]". The async process should then periodically poll the job scheduling system (using threadDelay) to check whether the job is done.
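
Roughly what I have in mind for the submit-and-poll step is sketched below; the sbatch/squeue flags are my assumption and should be checked against the installed Slurm version.

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (async, wait)
import System.Exit (ExitCode (..))
import System.Process (readProcess, readProcessWithExitCode)

-- Submit a job script, then poll asynchronously until it finishes.
submitAndWait :: FilePath -> IO ()
submitAndWait script = do
  raw <- readProcess "sbatch" ["--parsable", script] ""
  let jobId = takeWhile (/= '\n') raw        -- --parsable prints only the job id
  a <- async (waitForJob jobId)
  wait a

-- Poll the scheduler until the job is no longer listed.
waitForJob :: String -> IO ()
waitForJob jobId = do
  (code, out, _) <- readProcessWithExitCode "squeue"
                      ["--job", jobId, "--noheader"] ""
  -- An empty listing (or an error because the id is no longer known) is
  -- taken to mean the job has left the queue, i.e. it is finished.
  if code /= ExitSuccess || null out
    then return ()
    else do threadDelay (30 * 1000 * 1000)   -- wait 30 seconds, then poll again
            waitForJob jobId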

Is there a way to avoid creating a new Main? Can I serialize the IO action and somehow execute it on the node?

Best,

Felipe

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

--
Alberto.