
The idea that I currently like the most is to make it possible to save and load objects in the "GHC heap format". That way, deserialisation could be done with a simple fread() and a fast pointer fixup pass, which would hopefully make running many 'ghc -c' processes as fast as a single 'ghc --make'. This trick is commonly employed in the games industry to speed up load times [1]. Given that Haskell is a garbage-collected language, the implementation will be trickier than in C++ and will have to be done at the RTS level.
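As a rough illustration of the fread()-plus-fixup idea, here is a minimal C sketch. It is not how GHC's RTS lays out its heap: the `Node` type, the `save_image`/`load_image` names and the file format (a count followed by nodes whose pointer field holds an image-relative byte offset plus one, with 0 meaning NULL) are all made up for the example. It also assumes the objects already sit contiguously in memory and that every object has the same layout; a real RTS would have to consult info tables to find the pointer fields.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical object layout: one payload word and one pointer field. */
typedef struct Node {
    int64_t value;
    union {
        struct Node *ptr;  /* in memory: a real pointer */
        uint64_t     off;  /* on disk: byte offset into the image + 1, 0 = NULL */
    } next;
} Node;

/* Write n nodes that sit contiguously at `base`. Pointers are rewritten
   to image-relative offsets, so the file does not depend on addresses. */
int save_image(FILE *f, const Node *base, uint64_t n) {
    if (fwrite(&n, sizeof n, 1, f) != 1) return -1;
    for (uint64_t i = 0; i < n; i++) {
        const Node *p = base[i].next.ptr;
        Node out;
        out.value = base[i].value;
        out.next.off = 0;
        if (p != NULL) {
            assert(p >= base && p < base + n);  /* must stay inside the image */
            out.next.off = (uint64_t)((const char *)p - (const char *)base) + 1;
        }
        if (fwrite(&out, sizeof out, 1, f) != 1) return -1;
    }
    return 0;
}

/* Read the whole image with one fread(), then turn offsets back into
   pointers relative to wherever malloc() happened to place the block. */
Node *load_image(FILE *f, uint64_t *n_out) {
    uint64_t n;
    if (fread(&n, sizeof n, 1, f) != 1 || n == 0) return NULL;
    if (n > SIZE_MAX / sizeof(Node)) return NULL;
    uint64_t bytes = n * sizeof(Node);
    Node *base = malloc((size_t)bytes);
    if (base == NULL) return NULL;
    if (fread(base, sizeof *base, (size_t)n, f) != (size_t)n) {
        free(base);
        return NULL;
    }
    for (uint64_t i = 0; i < n; i++) {          /* the fixup pass */
        uint64_t off = base[i].next.off;
        if (off == 0) {
            base[i].next.ptr = NULL;
            continue;
        }
        if (off - 1 >= bytes || (off - 1) % sizeof(Node) != 0) {
            free(base);                          /* corrupt image */
            return NULL;
        }
        base[i].next.ptr = (Node *)((char *)base + (off - 1));
    }
    *n_out = n;
    return base;
}
```

The load side does no per-object allocation or parsing; its cost is one read plus a linear walk over the pointer fields, which is where the hoped-for speed would come from.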
Is this a good idea? How hard would it be to implement this optimisation?
I believe OCaml does something like this.
Interesting. What does OCaml do in this department? A bit of googling didn't turn up a link.

For many years Chez Scheme had a "saved heaps" capability. It was recently dropped because of the preponderance of SELinux, which randomizes addresses and messes it up, but here's the doc for V7:

http://www.scheme.com/csug7/use.html#g10

I've always wondered why there weren't more language implementations with saved heaps. Under Chez the startup times were amazing (for a 50KLOC compiler, a two-second load would become 4 milliseconds). Google Dart apparently has or will have saved heaps. It seems like an obvious choice (caching initialized heaps) for enormous websites with slow load times like Gmail.

Chez also has pretty fast serialization to a binary "FASL" (fast loading) format, but I'm not sure whether those files were mmap'ed into the heap on load or required some parsing.

The Gamasutra link that Mikhail provided seems to describe a process where the programmer knows exactly what the expected heap representation is for a particular object, and manually creates it. Sounds like walking on thin ice.

Do we know of any memory-safe GC'd language implementations that can dump a single object (rather than the whole heap)? Would they invoke the GC in a special way to trace the structure and copy it into a new region (to make it contiguous)?

Cheers,
-Ryan
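To make the "trace the structure and copy it into a new region" question concrete, here is a small C sketch of a Cheney-style breadth-first copy. It is only an assumption about what such a pass could look like, not a description of any existing GC: the `Obj` layout, the `Region` bookkeeping and the names `forward` and `copy_reachable` are hypothetical. A real collector would keep forwarding pointers in the object headers; this sketch uses a side table with a linear scan so that the original objects are left untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object: one payload field and two pointer fields. */
typedef struct Obj {
    int value;
    struct Obj *left, *right;
} Obj;

typedef struct {
    Obj        *to;    /* contiguous destination region */
    const Obj **from;  /* from[i] is the original of to[i] (forwarding table) */
    size_t      used, cap;
} Region;

/* Return the copy of `src` inside the region, copying it first if needed. */
Obj *forward(Region *r, const Obj *src) {
    if (src == NULL) return NULL;
    for (size_t i = 0; i < r->used; i++)   /* linear scan: fine for a sketch */
        if (r->from[i] == src) return &r->to[i];
    assert(r->used < r->cap);              /* region must be large enough */
    r->from[r->used] = src;
    r->to[r->used] = *src;                 /* fields still point at originals */
    return &r->to[r->used++];
}

/* Copy everything reachable from `root` into `to` (capacity `cap` objects).
   The root's copy ends up at to[0]; returns the number of objects copied. */
size_t copy_reachable(const Obj *root, Obj *to, size_t cap) {
    const Obj **from = malloc(cap * sizeof *from);
    if (from == NULL) return 0;
    Region r = { to, from, 0, cap };
    forward(&r, root);
    /* Scan the copies in order, redirecting their fields into the region.
       Each forward() call may append new copies, which get scanned later. */
    for (size_t scan = 0; scan < r.used; scan++) {
        r.to[scan].left  = forward(&r, r.to[scan].left);
        r.to[scan].right = forward(&r, r.to[scan].right);
    }
    free(from);
    return r.used;
}
```

Because every pointer in the result lands inside the one region, sharing and cycles are preserved and the region could then be written out as a single block.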