
[Sorry for the multiple reposts - couldn't quite figure out which email address doesn't get refused by the list.]

Hi Carter, thank you for the good points you raise. I'll try to address each of them as best I can below.
0) I think you could actually implement this proposal as a userland library, at least as you've described it. Have you tried doing so?
Indeed, this could be done without touching the compiler at all. We thought long and hard about a path that would ultimately make an extension either unnecessary or, at any rate, very small. At this point, the only thing that we are proposing to add to the compiler is the syntactic form "static e". Contrary to the presentation in the paper, the 'unstatic' function can be implemented entirely as library code and does not need to be a primop. Moreover, we do not need to piece together any kind of global remote table at compile time or link time, because we're piggybacking on the one already constructed by the system linker. The `static e` form could just as well be a piece of Template Haskell, but making it a proper extension means that the compiler can enforce more invariants and be a bit more helpful to the user. In particular, it can detect situations where symbolic references cannot be generated, e.g. because the imported packages were not compiled as dynamically linked libraries. It can also seamlessly support calling `static f` on an identifier `f` that is not exported by the module.
1) what does this accomplish that can not be accomplished by having various nodes agree on a DSL, and sending ASTs to each other? 1a) in fact, I'd argue (and some others agree, and i'll admit my opinions have been shaped by those more expert than me) that the sending a wee AST you can interpret on the other side is much SAFER than "sending a function symbol thats hard coded hopefully into both programs in a way that it means the same thing".
I very much subscribe to the idea of defining small DSLs for exchanging code between nodes, and this proposal is compatible with that idea. One thing that might not have been so clear in the original email is that we are proposing here to introduce just *one such DSL*. It's just that it's a trivial one whose grammar only contains linker symbol names. As it happens, distributed-static today already supports two such DSLs: a DSL of labels, which are arbitrary string names for functions, and a small language for composing Static values together. There is a patch lying around by Edsko proposing to add a third "DSL": one that allows nodes to trade arbitrary Haskell strings that are then eval'ed on the other end by the 'plugins' package.

As Facundo explains at the end of his email, the notion of a "static" value ought to be a more general one than was first envisioned in the paper: a static value is any closed denotation, denoted in any of a choice of multiple small languages, some of which ship standard with distributed-static. Users can define their own DSLs for shipping code around. This is why we propose to make Static into a class. Each DSL is generated by one datatype, and each such datatype has a Static instance. If you would like to ship an AST around the cluster, you can make the datatype for that AST an instance of Static, with 'unstatic' being defined as an interpreter for your AST. Concretely:

    data HsExpr = ...

    instance Static HsExpr where
      unstatic e = Hs.interpret e
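To make the "one datatype per DSL, one Static instance per datatype" idea a bit more tangible, here is a minimal, self-contained sketch. Everything in it is illustrative: the shape of the class (an associated type I have called Denotation), the Expr and Label datatypes, and the lookup table are assumptions made for this example, not the class signature we are actually proposing nor anything in distributed-static. Show/Read stands in for real serialization.

```haskell
{-# LANGUAGE TypeFamilies #-}
module Main where

-- Hypothetical shape for the class: each DSL datatype says what its
-- terms denote and how to interpret them on the receiving node.
class Static s where
  type Denotation s
  unstatic :: s -> Denotation s

-- DSL 1: a tiny arithmetic AST that can be shipped between nodes.
data Expr
  = Lit Int
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Show, Read, Eq)

instance Static Expr where
  type Denotation Expr = Int
  unstatic (Lit n)   = n
  unstatic (Add a b) = unstatic a + unstatic b
  unstatic (Mul a b) = unstatic a * unstatic b

-- DSL 2: plain string labels resolved against a table that both nodes
-- are assumed to agree on (in the spirit of a DSL of labels).
newtype Label = Label String
  deriving (Show, Read, Eq)

-- Illustrative table; resolution can fail, hence the Maybe below.
knownFunctions :: [(String, Int -> Int)]
knownFunctions = [("double", (* 2)), ("increment", (+ 1))]

instance Static Label where
  type Denotation Label = Maybe (Int -> Int)
  unstatic (Label name) = lookup name knownFunctions

main :: IO ()
main = do
  -- "Send" an Expr as a string and interpret it on the "other side".
  let wire = show (Add (Lit 2) (Mul (Lit 3) (Lit 4)))
  print (unstatic (read wire :: Expr))
  -- Resolve a label, then apply the function it denotes.
  print (fmap ($ 10) (unstatic (Label "double")))
  print (fmap ($ 10) (unstatic (Label "missing")))
```

The point of the sketch is only that the AST-based approach and the symbol-based approach fit behind the same interface: the receiving node calls unstatic without caring which small language the value was written in.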
I've had many educational conversations with
... ?
2) how does it provide more type safety than the current TH based approach? (I've seen Tim and others hit very very gnarly bugs in cloud haskell based upon the "magic static values" approach).
The type safety of the current TH approach is reasonable, I think. One potential problem comes from managing dynamically typed values in the remote table: if you don't use TH, they must be coerced to the right type and paired with the right decoders by hand. With the approach we propose, there is no remote table, so I expect this should help eliminate that source of bugs.
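For readers who have not run into this class of bug, here is a small sketch of the failure mode. It is not the distributed-static implementation: the RemoteTable type, the resolve function and the table contents are hypothetical stand-ins, using only Data.Dynamic from base. It shows that both a wrong type at the lookup site and a label missing from the table are only discovered at runtime.

```haskell
module Main where

import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Typeable (Typeable)

-- Hypothetical stand-in for a remote table: labels mapped to
-- dynamically typed values.
type RemoteTable = [(String, Dynamic)]

square :: Int -> Int
square x = x * x

-- Only "square" is registered; forgetting an entry is easy to do by hand.
remoteTable :: RemoteTable
remoteTable = [("square", toDyn square)]

-- Look a label up and coerce the result to the type the caller expects.
-- Both the lookup and the coercion can fail, and only at runtime.
resolve :: Typeable a => RemoteTable -> String -> Maybe a
resolve table label = lookup label table >>= fromDynamic

main :: IO ()
main = do
  -- Right label, right type: the function is recovered.
  print (fmap ($ 7) (resolve remoteTable "square" :: Maybe (Int -> Int)))
  -- Right label, wrong type: the coercion fails.
  print (fmap ($ "x") (resolve remoteTable "square" :: Maybe (String -> String)))
  -- Label never added to the table: the lookup fails.
  print (fmap ($ 7) (resolve remoteTable "cube" :: Maybe (Int -> Int)))
```

Neither of the two failing cases is rejected by the type checker, which is the kind of error that goes away once there is no hand-maintained table to keep in sync.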
3) this proposal requires changes to linking etc that would really make it useful only on systems and deployments that only have Template Haskell AND Dynamic linking. (and also rules out any context where it'd be nice to deploy a static app or say, use CH in ios! )
I don't know about iOS, and it's very likely that there are contexts in which this extension doesn't work. But as I said above, you are always free to define your own DSLs that cover the particular use case you have in mind. The nice thing about this particular DSL is that it requires little to no TH to generate label names. Generating labels that way can always be a source of bugs, especially when you forget to include them in the global remote table (which is something that TH doesn't and can't help you with). Furthermore, it is my understanding that GHC is heading towards a world of "dynamic linkable by default", and dynamic linking is by now supported by GHC on most platforms (see e.g. https://ghc.haskell.org/trac/ghc/wiki/DynamicGhcPrograms). There are fairly good solutions for deploying self-contained dynamically linked apps these days, e.g. Docker. And in any case, with a few extra flags we can still do away with the dynamic linking requirement on some (all?) platforms.
to repeat: have you considered defining an AST type + interpreter for the computations you want to send around, and doing that? I think its a much simpler, safer, easier, flexible and PORTABLE approach, though one current CH doesn't do (though the folks working on CH seem to be receptive to switching to such a strategy if someone validates it)
We have, and it's an option with different tradeoffs. Both solutions could gainfully live side by side and are in fact complementary. I contend that the solution described by Facundo has the advantage of eliminating much of the syntactic overhead associated with sending references to (higher-order) values across the cluster. We have more ideas specific to distributed-process which we can discuss in a separate thread to reduce the syntactic overhead even further, to practically nothing.

Best,

Mathieu