
On Tue, 2005-05-31 at 08:15 +0100, Axel Simon wrote:
On Mon, 2005-05-30 at 19:18 +0100, Duncan Coutts wrote:
[..]
Going back to the lexer, it now produces exactly the same output as the original lexer (including positions and unique names). Sadly it seems to have got quite a bit slower for reasons I don't quite understand. In particular making it monadic (which we need to do because of) seems to make it rather slower. It is now taking 6 seconds rather than 2 and so is now only a little faster than the original lexer. Though on the positive side, if the lexer is taking 6 out of the 8 second total then the parser is only taking 2 seconds, which is quite good.
Ok, I'm impressed, too. But was the parser the culprit? It did use a lot of space, but then most of the time in our current setup is spent in serialisation. So if I understand your intention, you are mainly trying to improve the memory footprint, not the compilation time?
Basically yes. The real problem was the memory use. The existing parser was taking 270MB for the Gtk+ headers while this new one now takes 29MB. I've tried integrating this parser into c2hs and overall, producing the precomp file now runs in 80MB of heap space. In fact a significant minority of that space is only required during the serialisation; the name analysis phase only pushes the memory requirements up to 50MB or so. (I may be wrong about that; it may be that the serialisation is simply forcing the result of the name analysis, which thereby increases the heap use.)

The slowness of the serialisation is a separate problem. But reducing the memory requirements of the other phases makes even that part faster. On my fast Athlon it used to take about a minute to generate the Gtk+ precomp file (and 380MB). It now takes 13 seconds (and 80MB). I guess the improvement to the time taken to do the serialisation is mostly from having to do less GC.

There's still some small difference in the precomp file which I have not yet tracked down (but in my earlier parser tests, the AST seems to be exactly the same, right down to the source locations and unique names).

So I think it's worth trying to get this done for the 0.9.8 gtk2hs release. That should provide reasonable testing and then we can create patches for the mainline c2hs.

Duncan