
I don’t know whether generating LLVM from STG instead of Cmm would be a better approach. As far as I know, starting from STG is what GHCJS and Eta do for their own targets.
Wouldn't a step from STG to LLVM be much harder (LLVM IR is a pretty low-level representation compared to STG)? There are also a few passes on the Cmm level that seem necessary, e.g., `cmmLayoutStack`.
There is certainly a tradeoff between retaining more high-level information and having to lower it oneself. If I remember luite correctly, he said he had an intermediate format similar to Cmm, just not Cmm itself but something richer, which makes it easier to target JavaScript. The question basically boils down to asking whether Cmm is already too low-level for LLVM; the embedding of word sizes is an example where I think Cmm might be too low-level for LLVM.
Ok, I see. This is quite interesting - I'm wondering if it makes sense to collect thoughts/ideas like that somewhere (e.g., a wiki page with all the issues of using the current Cmm for the LLVM backend, or just adding some comments in the code).
That indeed is an interesting question, to which I don’t have a satisfying answer yet. I’m trying to note these kinds of things down in the code as I go along porting the textual IR gen over to the bitcode IR gen. I would consider this to be a second iteration though. The first step is to get the bitcode pipeline working nicely (this includes having an option to have GHC emit bitcode instead of the assembled object code) and see where that takes us, and then to incrementally improve on that. cheers, moritz