precompiled headers with c2hs (again)

All,

Just to let people know, we've progressed somewhat on the precompiled headers front. Outwardly, the patch is much like Axel described previously. Internally, we rewrote the serialisation using the Binary module from GHC. This allowed us to use shared strings, which reduces the file size considerably (~20Mb to ~9Mb). It still uses a lazy reading scheme, so while producing the *.precomp file is rather slow, reading it is really quick.

For some reason (as yet undiscovered) the serialisation is very slow and memory hungry. On my machine it takes 16 seconds to parse all of gtk/gtk but 45 seconds to serialise all that to disk.

Our current branch is hosted in the gtk2hs CVS: http://cvs.sourceforge.net/viewcvs.py/gtk2hs/gtk2hs/tools/c2hs/

It is based on c2hs 0.13.4. There are one or two interesting patches in addition to the precompiled headers patch:

http://cvs.sourceforge.net/viewcvs.py/gtk2hs/gtk2hs/tools/c2hs/gen/GenBind.hs?r1=1.1&r2=1.2 This one makes c2hs chase typedef'ed types, so that C function prototypes that use typedef'ed types get proper Haskell types rather than Ptr ().

http://cvs.sourceforge.net/viewcvs.py/gtk2hs/gtk2hs/tools/c2hs/chs/CHS.hs?r1=1.1&r2=1.2 This one makes c2hs understand hierarchical module names.

Duncan
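A minimal sketch of the shared-strings idea, written against the standard Data.Binary package rather than GHC's internal Binary module (the Decl type and all helper names below are invented for illustration): each distinct string is written once in a table, and every occurrence in the AST is stored as a small Int index into it.

import Data.Binary (encode, decode)
import Data.List (nub)
import qualified Data.Map as Map
import qualified Data.ByteString.Lazy as BL

-- Toy AST node with heavily repeated identifier strings.
data Decl = Decl String String deriving Show

-- Write each distinct string once, then store declarations as Int
-- indices into that table; a repeated string costs one Int, not a copy.
encodeDecls :: [Decl] -> BL.ByteString
encodeDecls ds = encode (strs, [ (ix a, ix b) | Decl a b <- ds ])
  where
    strs = nub [ s | Decl a b <- ds, s <- [a, b] ]
    tbl  = Map.fromList (zip strs [0 :: Int ..])
    ix s = tbl Map.! s

-- Rebuild the declarations by looking indices up in the string table.
decodeDecls :: BL.ByteString -> [Decl]
decodeDecls bs = [ Decl (str i) (str j) | (i, j) <- ixs ]
  where
    (strs, ixs) = decode bs :: ([String], [(Int, Int)])
    arr   = Map.fromList (zip [0 :: Int ..] strs)
    str i = arr Map.! i

The real implementation presumably builds the table on the fly while writing; the two-pass version above just makes the space saving explicit.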

Sorry for being quiet recently - too much other stuff going on... On Tue, 2004-12-14 at 03:18 +0000, Duncan Coutts wrote:
All,
Just to let people know, we've progressed somewhat on the precompiled headers front. Outwardly, the patch is much like Axel described previously. Internally, we rewrote the serialisation using the Binary module from GHC. This allowed us to use shared strings, which reduces the file size considerably (~20Mb to ~9Mb). It still uses a lazy reading scheme, so while producing the *.precomp file is rather slow, reading it is really quick.
For some reason (as yet undiscovered) the serialisation is very slow and memory hungry. On my machine it takes 16 seconds to parse all of gtk/gtk but 45 seconds to serialise all that to disk.
My only *guess* would be that to serialise, you force some/all of the semantic analysis of the C AST that usually only occurs lazily for those parts of the header that are needed for the binding of the currently compiled .chs file. It depends on exactly what information you serialise. (After all, when I wrote the module CTrav and those it depends on, I did so under the assumption that, in the case of a large pre-processed header file, only a small fraction of the declarations will be relevant for the current run of c2hs. In other words, I optimised for the individual analysis functions to be reasonably fast, not for the analysis of a *complete* header file to be fast.)
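To make the concern concrete, here is an invented sketch (not c2hs's actual CTrav code) of the lazy-analysis pattern described above: each declaration's analysis result sits behind a thunk in a map and is only paid for when a binding looks that name up.

import qualified Data.Map as Map

-- Toy declaration and analysis-result types.
data CDecl = CDecl { declName :: String, declRawType :: String }
newtype TypeInfo = TypeInfo String

-- Building the table performs no analysis: each entry holds a thunk,
-- evaluated only when some binding actually looks that name up.
symTab :: [CDecl] -> Map.Map String TypeInfo
symTab ds = Map.fromList [ (declName d, analyse d) | d <- ds ]

-- Stand-in for the real (expensive) semantic analysis.
analyse :: CDecl -> TypeInfo
analyse d = TypeInfo ("Ptr " ++ declRawType d)

-- Serialising (or deepseq'ing) the whole table forces every thunk at
-- once, turning the per-lookup cost into a whole-header cost.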
Our current branch is hosted in the gtk2hs cvs: http://cvs.sourceforge.net/viewcvs.py/gtk2hs/gtk2hs/tools/c2hs/
It is based on c2hs 0.13.4. There are one or two interesting patches in addition to the precompiled headers patch:
http://cvs.sourceforge.net/viewcvs.py/gtk2hs/gtk2hs/tools/c2hs/gen/GenBind.hs?r1=1.1&r2=1.2 This one makes c2hs chase typedef'ed types, so that C function prototypes that use typedef'ed types get proper Haskell types rather than Ptr ().
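For illustration, here is what the difference looks like at the FFI level; the C prototype and both imports are invented for this example rather than actual c2hs output.

import Foreign.Ptr (Ptr)

-- Opaque Haskell stand-in for the C struct behind the typedef:
--   typedef struct _GtkWidget GtkWidget;
--   void gtk_widget_show (GtkWidget *widget);
data GtkWidget

-- Without typedef chasing, the parameter type cannot be resolved and
-- the binding degrades to an untyped pointer:
foreign import ccall "gtk_widget_show"
  gtk_widget_show_untyped :: Ptr () -> IO ()

-- With the GenBind patch, the typedef is chased back to the struct and
-- the pointer keeps a usable phantom type:
foreign import ccall "gtk_widget_show"
  gtk_widget_show :: Ptr GtkWidget -> IO ()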
http://cvs.sourceforge.net/viewcvs.py/gtk2hs/gtk2hs/tools/c2hs/chs/CHS.hs?r1=1.1&r2=1.2 This one makes c2hs understand hierarchical module names.
Sounds cool. Cheers, Manuel

On Mon, 2004-12-20 at 21:56 +1100, Manuel M T Chakravarty wrote:
For some reason (as yet undiscovered) the serialisation is very slow and memory hungry. On my machine it takes 16 seconds to parse all of gtk/gtk but 45 seconds to serialise all that to disk.
My only *guess* would be that to serialise, you force some/all of the semantic analysis of the C AST that usually only occurs lazily for those parts of the header that are needed for the binding of the currently compiled .chs file. It depends on exactly what information you serialise.
Actually, it turns out not to be that. It was my first suspicion too, so I generated DeepSeq instances for everything (with DrIFT) and ran that before serialising. I inserted timing points in key places. It turned out that the deepSeq took very little time at all (some time, so the deepSeq was actually working), but the serialisation still took forever. It seems that the serialisation allocates enormous amounts of garbage, which is why it takes so long. Simon M reckons that GHC's Binary module should run in constant space (well, log stack space) when the right optimisations are used. I'll probably have to analyse the optimised Core code to see what's really going on and whether it is allocating anywhere. Duncan
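The experiment described here can be reproduced along the following lines; this sketch substitutes the modern Control.DeepSeq package for the DrIFT-generated instances, and parseHeaders and writePrecomp are hypothetical stand-ins for the real parser and *.precomp writer.

import Control.DeepSeq (deepseq)
import Data.Time.Clock (getCurrentTime, diffUTCTime)

-- Run an action and print how long it took.
timed :: String -> IO a -> IO a
timed label act = do
  t0 <- getCurrentTime
  x  <- act
  t1 <- getCurrentTime
  putStrLn (label ++ ": " ++ show (diffUTCTime t1 t0))
  return x

-- Hypothetical stand-ins for c2hs's parser and *.precomp writer.
parseHeaders :: IO [String]
parseHeaders = return (replicate 1000000 "GtkWidget")

writePrecomp :: FilePath -> [String] -> IO ()
writePrecomp path = writeFile path . unlines

-- Forcing the AST before serialising separates the cost of any
-- deferred analysis from the cost of the serialisation itself.
main :: IO ()
main = do
  ast <- timed "parse" parseHeaders
  _   <- timed "deepseq" (ast `deepseq` return ())
  timed "serialise" (writePrecomp "test.precomp" ast)

Running the compiled binary with +RTS -sstderr also reports total allocation, which is the figure at issue for the garbage question.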
participants (2)
- Duncan Coutts
- Manuel M T Chakravarty