UHC-like JavaScript backend in GHC

Hi all,

Last year, I worked on improving the Utrecht Haskell Compiler (UHC) JavaScript backend[1] up to the point where it is now possible to write complete JavaScript applications with it. Alessandro Vermeulen then ported one of my web applications to the UHC JS backend, and we have been using the result in production since. The results of these efforts were presented at IFL 2012 and HIW 2012 by Alessandro and Atze Dijkstra respectively.

One of the most powerful new features of this backend, in addition to the customised RTS, is the small subset of JS we can use when importing JavaScript functions with the "js" calling convention. For example, we can import the push() method for arrays using (roughly) the following import declaration:

    foreign import js "%1.push(%2)" push :: JSArray a -> a -> IO (JSArray a)

Here, we write a small expression in the import declaration, in which the object and arguments are numbered. The object argument, %1, corresponds to the `JSArray a` in the type signature of `push`, while the method argument, %2, corresponds to the `a` in the type signature. Chris Done drew inspiration from this calling convention in designing the FFI for his Fay programming language[2].

One of UHC's nice features is that it is very easy to extend, due to its heavy use of attribute grammars and its sophisticated variability management. Working on and extending the UHC JavaScript backend was relatively easy because of this. One of UHC's downsides, however, is that it lags behind in supporting advanced language features, such as GADTs, functional dependencies, type families, etc. As a result, only a limited number of packages from Hackage can be compiled with UHC. In addition, some code requires additional workarounds and explicit type signatures, which would be redundant if, e.g., fundeps were available.

In an ideal situation, we would have both the js calling convention from UHC's JavaScript backend and GHC's more advanced language support. Combined, we could use most libraries on Hackage, we wouldn't have to worry about the lack of language support while designing our JS applications, and we wouldn't have to use explicit type annotations in scenarios where, e.g., fundeps would disambiguate the situation quite easily.

In the near future (i.e. after I have finished my MSc thesis), I would like to start getting more familiar with GHC's internals. In particular, I would like to see if it is possible to implement the js calling convention in GHC and have GHC spit out JavaScript, much like UHC is currently doing. Before I get started: does the GHC architecture currently allow for adding a new calling convention which departs from the conventional C FFIs and introduces a custom RTS? If not, where are the current major bottlenecks? And would it be possible to remove these bottlenecks, without significantly affecting the compilation times, and without affecting the performance of generated native code (when the JS backend is not used)?

Any input on this is appreciated :)

Cheers,

Jurriën

[1] http://uu-computerscience.github.com/uhc-js/
[2] http://fay-lang.org
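To make the numbered-placeholder idea in the message above a bit more concrete, here is a minimal Haskell sketch of how a pattern such as "%1.push(%2)" could be expanded into a JavaScript expression once the arguments are available as JS source text. This is only an illustration under assumptions: the names `expandPattern` and `lookupArg` are made up for this example, and nothing here is taken from UHC's actual pattern parser or code generator.

```haskell
import Data.Char (isDigit)

-- | Expand an import pattern such as "%1.push(%2)" by replacing each %N
-- placeholder with the N-th (1-based) argument, given as JS source text.
-- Returns Nothing if a placeholder refers to an argument that is missing.
-- A '%' that is not followed by a digit is copied through unchanged.
expandPattern :: String -> [String] -> Maybe String
expandPattern pat args = go pat
  where
    go [] = Just []
    go ('%' : rest)
      | (ds@(_ : _), rest') <- span isDigit rest = do
          arg       <- lookupArg (read ds)
          remainder <- go rest'
          Just (arg ++ remainder)
    go (c : rest) = (c :) <$> go rest

    -- Hypothetical helper: 1-based, bounds-checked argument lookup.
    lookupArg :: Int -> Maybe String
    lookupArg n
      | n >= 1 && n <= length args = Just (args !! (n - 1))
      | otherwise                  = Nothing
```

For instance, `expandPattern "%1.push(%2)" ["arr", "x"]` yields `Just "arr.push(x)"`, while a pattern that mentions `%3` with only two arguments yields `Nothing`. A real backend would presumably work on a parsed expression rather than raw strings, so that the pattern can be checked when the import is compiled.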

| currently doing. Before I get started: does the GHC architecture
| currently allow for adding a new calling convention which departs from
| the conventional C FFIs and introduces a custom RTS?

GHC certainly supports new back ends. You'd probably want to replace the entire back end, and go from optimised Core to Javascript. Should be entirely feasible.

I'm sure you are checking out all relevant stuff, but you don't mention:

http://www.haskell.org/haskellwiki/STG_in_Javascript
http://article.gmane.org/gmane.comp.lang.haskell.cafe/88970
https://plus.google.com/102016502921512042165/posts/Z7NtU4eF8Zh

Difficulties may be in supporting all of GHC stuff, esp concurrency. Eg the I/O library depends heavily on concurrency, so you may need to replace it entirely.

Keep us posted!

Simon

| If not, where are
| the current major bottlenecks? And would it be possible to remove these
| bottlenecks, without significantly affecting the compilation times, and
| without affecting the performance of generated native code (when the JS
| backend is not used)?
|
| Any input on this is appreciated :)
|
| Cheers,
|
|
| Jurriën
|
|
| [1] http://uu-computerscience.github.com/uhc-js/
| [2] http://fay-lang.org
| _______________________________________________
| Glasgow-haskell-users mailing list
| Glasgow-haskell-users@haskell.org
| http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On 13 Nov 2012, at 16:17, Simon Peyton-Jones wrote:

> | currently doing. Before I get started: does the GHC architecture
> | currently allow for adding a new calling convention which departs from
> | the conventional C FFIs and introduces a custom RTS?
>
> GHC certainly supports new back ends. You'd probably want to replace the entire back end, and go from optimised Core to Javascript. Should be entirely feasible.

Good to hear!

> I'm sure you are checking out all relevant stuff, but you don't mention:
>
> http://www.haskell.org/haskellwiki/STG_in_Javascript
> http://article.gmane.org/gmane.comp.lang.haskell.cafe/88970
> https://plus.google.com/102016502921512042165/posts/Z7NtU4eF8Zh

Yes, I did check out other work that's been done in this area, albeit only briefly. Unless I've overlooked it (which is very much possible), none of the other solutions (except Fay) support an FFI that bridges the gap between JS's OO and the functional world, like our JS-like language in the foreign imports. In real-life situations, where you want to get rid of writing JS entirely, but still might want to use existing JS libraries such as jQuery, this feature is essential.

> Difficulties may be in supporting all of GHC stuff, esp concurrency. Eg the I/O library depends heavily on concurrency, so you may need to replace it entirely.

This is indeed a tricky part, and it might not be possible to map all of it to JS, so replacing it entirely is not an unlikely option. It might be possible to implement fork in terms of WebWorkers, though.

> Keep us posted!

Will do :)

Jurriën

> Simon

On 13 November 2012 16:33, J. Stutterheim wrote:

> Yes, I did check out other work that's been done in this area, albeit only briefly. Unless I've overlooked it (which is very much possible), none of the other solutions (except Fay) support an FFI that bridges the gap between JS's OO and the functional world, like our JS-like language in the foreign imports. In real-life situations, where you want to get rid of writing JS entirely, but still might want to use existing JS libraries such as jQuery, this feature is essential.

Just a small point, but Fay's FFI differs from UHC/GHC's in that it natively supports String/Double and functions without needing wrappers and conversions from CString or whatnot. E.g. you write

    addClassWith :: (Double -> String -> Fay String) -> JQuery -> Fay JQuery
    addClassWith = ffi "%2.addClass(%1)"

and you're already ready to use it. If I recall in UHC last I tried, I had to do some serializing/unserializing for the string types, and make a wrapper function for the callback. Whether it makes any sense for a UHC/GHC-backend to behave like this, I don't know. But people really like it.

One of the main reasons we didn't do this with UHC was that we had to focus on more elementary parts of the FFI/RTS first: dynamic/wrapper imports, object interaction, etc. I must admit, though, that I have forgotten the exact reasons for not converting between the types automatically once we had finished with the first bit.

On 13 Nov 2012, at 19:18, Christopher Done wrote:

> On 13 November 2012 16:33, J. Stutterheim wrote:
>
>> Yes, I did check out other work that's been done in this area, albeit only briefly. Unless I've overlooked it (which is very much possible), none of the other solutions (except Fay) support an FFI that bridges the gap between JS's OO and the functional world, like our JS-like language in the foreign imports. In real-life situations, where you want to get rid of writing JS entirely, but still might want to use existing JS libraries such as jQuery, this feature is essential.
>
> Just a small point, but Fay's FFI differs from UHC/GHC's in that it natively supports String/Double and functions without needing wrappers and conversions from CString or whatnot. E.g. you write
>
>     addClassWith :: (Double -> String -> Fay String) -> JQuery -> Fay JQuery
>     addClassWith = ffi "%2.addClass(%1)"
>
> and you're already ready to use it. If I recall in UHC last I tried, I had to do some serializing/unserializing for the string types, and make a wrapper function for the callback. Whether it makes any sense for a UHC/GHC-backend to behave like this, I don't know. But people really like it.

On Mon, Nov 12, 2012 at 9:16 AM, Jurriën Stutterheim wrote:

> Hi all,
>
> foreign import js "%1.push(%2)" push :: JSArray a -> a -> IO (JSArray a)

I'm not sure if it's even necessary to extend GHC itself for this. Even though this exact syntax (with the js calling convention name) is not supported, the import pattern is available as a string at compile time [1], so you can easily generate the desired code with a compiler that uses the GHC API. I work on GHCJS [2], a compiler that generates Javascript from STG.

Unfortunately, GHCJS is in a state of flux at the moment so it's a bit hard to come up with a proof of concept implementation at this point. I started a complete rewrite a few months ago, because the old version didn't have the performance I needed. The new version [3] appears to generate much faster code, but a lot of things (including FFI) have not yet been implemented. It's still a bit too early to tell if the new code generator can fully replace the old one.

I would like to add the friendlier FFI syntax later, but as far as I can see, it should be pretty straightforward to do this... (at least compared to supporting many of the other GHC features in JS)

WebWorkers might not be able to do what you need for concurrency, since the ways you can communicate between them are really limited: you have to serialize everything, and there is no shared data. This is why GHCJS has its own scheduler [4] in the RTS.

luite

[1] http://www.haskell.org/ghc/docs/7.6.1/html/libraries/ghc-7.6.1/ForeignCall.h...
[2] GHCJS - https://github.com/ghcjs/ghcjs
[3] GHCJS new code generator - https://github.com/ghcjs/ghcjs/tree/gen2
[4] GHCJS scheduler - https://github.com/ghcjs/ghcjs/blob/master/rts/rts-trampoline.js#L244
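
As a rough illustration of what a scheduler inside a single-threaded runtime does, here is a minimal Haskell sketch of round-robin scheduling over green threads. It is not GHCJS's scheduler and makes no claims about how the linked RTS code works; the `Thread` type, its constructors, and `schedule` are hypothetical names for this example, and each "step" simply produces a string so that the interleaving is visible.

```haskell
-- | A hypothetical green thread: either finished, or one step of work
-- (represented here by the output it produces) plus the rest of the thread.
data Thread = Finished | Step String Thread

-- | Round-robin scheduling: run one step of the thread at the front of the
-- queue, then move its continuation to the back. Finished threads are dropped.
schedule :: [Thread] -> [String]
schedule []                   = []
schedule (Finished : ts)      = schedule ts
schedule (Step out next : ts) = out : schedule (ts ++ [next])
```

With two threads, `schedule [Step "a1" (Step "a2" Finished), Step "b1" Finished]` produces `["a1", "b1", "a2"]`: the threads take turns, and no data needs to be serialized because everything lives in one heap.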

On 13 Nov 2012, at 19:08, Luite Stegeman wrote:

> On Mon, Nov 12, 2012 at 9:16 AM, Jurriën Stutterheim wrote:
>
>> Hi all,
>>
>> foreign import js "%1.push(%2)" push :: JSArray a -> a -> IO (JSArray a)
>
> I'm not sure if it's even necessary to extend GHC itself for this. Even though this exact syntax (with the js calling convention name) is not supported, the import pattern is available as a string at compile time [1], so you can easily generate the desired code with a compiler that uses the GHC API. I work on GHCJS [2], a compiler that generates Javascript from STG.

That's an interesting approach too. If you're writing a new compiler which uses the GHC API, I agree there is probably no real need to create a separate js calling convention, although using a C calling convention for JS code might be a bit confusing. Indeed, the string is sufficient for generating the required code. UHC's string parser can probably easily be ported, as can some of the code generation parts. Does/can cabal-install support GHCJS? I suppose that's a minor advantage of extending GHC itself; you get cabal support almost for free.

> Unfortunately, GHCJS is in a state of flux at the moment so it's a bit hard to come up with a proof of concept implementation at this point. I started a complete rewrite a few months ago, because the old version didn't have the performance I needed. The new version [3] appears to generate much faster code, but a lot of things (including FFI) have not yet been implemented. It's still a bit too early to tell if the new code generator can fully replace the old one.

How big are the JS files generated with either the new or the old code generator? I recall there was a HS -> JS effort out there that generated huge JS files. UHC's output is relatively compact and doesn't grow as fast with bigger programs.

> I would like to add the friendlier FFI syntax later, but as far as I can see, it should be pretty straightforward to do this... (at least compared to supporting many of the other GHC features in JS)

Sounds like it would be pretty straightforward, yes.

> WebWorkers might not be able to do what you need for concurrency, since the ways you can communicate between them are really limited, you have to serialize everything, no shared data. This is why GHCJS has its own scheduler [4] in the RTS.

WebWorkers are quite limited indeed. I'm not yet sure how the serialisation might complicate matters, but it seems that WebWorkers are only really a possible backend for `fork`, and not `forkIO`.

> luite
>
> [1] http://www.haskell.org/ghc/docs/7.6.1/html/libraries/ghc-7.6.1/ForeignCall.h...
> [2] GHCJS - https://github.com/ghcjs/ghcjs
> [3] GHCJS new code generator - https://github.com/ghcjs/ghcjs/tree/gen2
> [4] GHCJS scheduler - https://github.com/ghcjs/ghcjs/blob/master/rts/rts-trampoline.js#L244

Jurriën

> Does/can cabal-install support GHCJS? I suppose that's a minor advantage of extending GHC itself; you get cabal support almost for free.

Yes. There are two GHCJS installation options. One is the standalone option that includes wrappers for cabal and ghc-pkg. You use `ghcjs-cabal` to install packages and see the result with `ghcjs-pkg list`. The standalone compiler can be installed with cabal-install, but it does require you to run `ghcjs-boot` in a configured GHC source tree, to install the core libraries (ghc-prim, base, integer-gmp).

The alternative is the integrated compiler, where you completely replace your existing GHC with one that can output Javascript. You don't get separate package databases this way.

> How big are the JS files generated with either the new or the old code generator? I recall there was a HS -> JS effort out there that generated huge JS files. UHC's output is relatively compact and doesn't grow as fast with bigger programs.

Relatively big for the new generator because I haven't focused on this yet. The generated code has lots of redundant assignments that can be weeded out later with a dataflow analysis pass. The old generator is a bit more compact (similar to the haste compiler). Both versions have a function-level linker that only includes functions that are actually used.

> WebWorkers is quite limited indeed. I'm not yet sure how the serialisation might complicate matters, but it seems that WebWorkers is only really a possible backend for `fork`, and not `forkIO`.

For one, you cannot serialize closures, so it will probably be similar to the restrictions in Cloud Haskell in that you can only call top-level things on the other side (unless you don't use Javascript closures for your Haskell closures; the new GHCJS generator can actually move closures to a WebWorker, at least in theory, but that's not yet implemented).

luite
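
The function-level linking mentioned above (only emitting functions that are actually used) boils down to a reachability computation over a dependency graph. The following Haskell sketch shows that idea only; it is not GHCJS's linker, the `DepGraph` representation and the `reachable` function are assumptions made for this example, and it relies on the `containers` package that ships with GHC.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Name = String

-- | Hypothetical dependency graph: each function name maps to the names of
-- the functions it references.
type DepGraph = Map.Map Name [Name]

-- | All functions reachable from the given roots (e.g. "main"). Only these
-- would need to be emitted into the final JS file.
reachable :: DepGraph -> [Name] -> Set.Set Name
reachable graph = go Set.empty
  where
    go seen [] = seen
    go seen (n : pending)
      | n `Set.member` seen = go seen pending
      | otherwise =
          -- Names without an entry in the graph are treated as leaves.
          let deps = Map.findWithDefault [] n graph
          in  go (Set.insert n seen) (deps ++ pending)
```

Given a graph in which `main` references `f`, `f` references `g`, and an `unused` function also references `g`, calling `reachable` with the root `["main"]` returns the set containing `main`, `f` and `g`, so `unused` would be left out of the output.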

You also need an accomplice web server to host the JS file containing
the JavaScript for the web worker to run. I don't see how you can
"fork" threads without such support.

On 13 November 2012 20:53, Luite Stegeman wrote:

>> Does/can cabal-install support GHCJS? I suppose that's a minor advantage of extending GHC itself; you get cabal support almost for free.
>
> Yes. There are two GHCJS installation options. One is the standalone option that includes wrappers for cabal and ghc-pkg. You use `ghcjs-cabal` to install packages, see the result with `ghcjs-pkg list`. The standalone compiler can be installed with cabal-install, but it does require you to run `ghcjs-boot` in a configured GHC source tree, to install the core libraries (ghc-prim, base, integer-gmp).
>
> The alternative is the integrated compiler, where you completely replace your existing GHC with one that can output Javascript. You don't get separate package databases this way.
>
>> How big are the JS files generated with either the new or the old code generator? I recall there was a HS -> JS effort out there that generated huge JS files. UHC's output is relatively compact and doesn't grow as fast with bigger programs.
>
> Relatively big for the new generator because I haven't focused on this yet. The generated code has lots of redundant assignments that can be weeded out later with a dataflow analysis pass. The old generator is a bit more compact (similar to haste compiler). Both versions have a function-level linker that only includes functions that are actually used.
>
>> WebWorkers is quite limited indeed. I'm not yet sure how the serialisation might complicate matters, but it seems that WebWorkers is only really a possible backend for `fork`, and not `forkIO`.
>
> For one, you cannot serialize closures, so it will probably be similar to the restrictions in Cloud Haskell in that you can only call top-level things on the other side (unless you don't use Javascript closures for your Haskell closures; the new GHCJS generator can actually move closures to a WebWorker, at least in theory, but that's not yet implemented).
>
> luite

participants (5): Christopher Done, J. Stutterheim, Jurriën Stutterheim, Luite Stegeman, Simon Peyton-Jones