GHC compile times (was Re: GHC 6.4.3 is stalled)

On Jul 25, 2006, at 1:34 PM, Christian Maeder wrote:
On our solaris sparc machine compiling our main binary (optimized) takes 3h:38min whereas (only) 55min under linux. At least our sparcs may die out sooner or later.
Interestingly enough, it takes 3+ hours to compile GHC 6.5 on my Intel Core Duo Mac. Why would there be such a huge difference between Linux and Mac OS X on the same architecture? How can I pinpoint the bottleneck?
Thanks, Joel
-- http://wagerlabs.com/

On Tue, 2006-07-25 at 13:45 +0100, Joel Reymont wrote:
On Jul 25, 2006, at 1:34 PM, Christian Maeder wrote:
On our solaris sparc machine compiling our main binary (optimized) takes 3h:38min whereas (only) 55min under linux. At least our sparcs may die out sooner or later.
Interestingly enough, it takes 3+ hours to compile GHC 6.5 on my Intel Core Duo Mac.
BTW, ghc's build system does support parallel make, so if you have more than one CPU you can cut compile times by using make -j2 (assuming you've got 2 CPUs). You can also cut compile times by compiling fewer non-core libraries and by using -fasm rather than -fvia-C. Add these lines to your mk/build.mk (which will not exist unless you create it):
    SRC_HC_OPTS = -O -fasm
    GhcBootLibs = YES
Note that using GhcBootLibs = YES means you only get a minimal set of packages, e.g. no stm, mtl, fgl, QuickCheck, etc. If that's not to your liking then you'll need to customise libraries/Makefile.
Duncan

Duncan,
Thanks for the tip! I'm _really_ interested in why it takes 55 min on Linux and 3+ hours on Mac Intel, though. Any clues?
Thanks, Joel
On Jul 25, 2006, at 2:09 PM, Duncan Coutts wrote:
BTW, ghc's build system does support parallel make, so if you do have more than one CPU then you can cut compile times by using make -j2 (assuming you've got 2 CPUs)

Joel Reymont wrote:
Thanks for the tip! I'm _really_ interested in why it takes 55 min on Linux and 3+ hours on Mac Intel, though. Any clues?
There are a lot of variables in a GHC build; we'd have to be sure that those measurements were taken on completely identical builds, i.e. profiled libs built or not, same optimisation levels, via-C with the same version of gcc (or -fasm), GMP library built or not built, etc. If you think your build is slow, try building it on Windows sometime :-(
Cheers, Simon

On Jul 25, 2006, at 2:57 PM, Simon Marlow wrote:
If you think your build is slow, try building it on Windows sometime :-(
Someone on #haskell also suggested using jhc for a while :D. Still, I'm very curious why ocaml builds fast and ghc builds slow. Is this because profiling the compiler hasn't been a high priority yet? Thanks, Joel -- http://wagerlabs.com/

On Jul 25, 2006, at 2:57 PM, Simon Marlow wrote:
If you think your build is slow, try building it on Windows sometime :-(
Well, I re-built ghc from scratch on my PowerBook G4 1.25GHz with 1GB of memory. It took somewhere north of 7 hours. The MacBook Pro 2GHz looks speedy by comparison and 3+ hours look like nothing.
-- http://wagerlabs.com/

On 26 July 2006 09:41, Joel Reymont wrote:
On Jul 25, 2006, at 2:57 PM, Simon Marlow wrote:
If you think your build is slow, try building it on Windows sometime :-(
Well, I re-built ghc from scratch on my PowerBook G4 1.25GHz with 1GB of memory. It took somewhere north of 7 hours. The MacBook Pro 2GHz looks speedy by comparison and 3+ hours look like nothing.
That's way too long. Next time, it might be a good idea to pipe the log file through something that adds timestamps to each line - I've attached a little Haskell program I use for this. Cheers, Simon
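The attached program is not preserved in this archive. As a stand-in, here is a minimal sketch of the idea, assuming the goal is simply to prefix every line of build output with the wall-clock time at which it arrived. The names `stamp` and `stampLines` and the `%H:%M:%S` format are illustrative choices, not taken from Simon's program.

```haskell
-- Hypothetical sketch, NOT the program attached to the original mail.
-- Reads lines from stdin and echoes each one prefixed with the current time.
import Control.Monad (unless)
import Data.Time (defaultTimeLocale, formatTime, getZonedTime)
import System.IO (BufferMode (LineBuffering), hSetBuffering, isEOF, stdout)

-- Pure helper: put an already-formatted time in front of a log line.
stamp :: String -> String -> String
stamp time line = "[" ++ time ++ "] " ++ line

-- Copy stdin to stdout one line at a time, stamping each line as it is read.
stampLines :: IO ()
stampLines = do
  eof <- isEOF
  unless eof $ do
    line <- getLine
    now  <- getZonedTime
    putStrLn (stamp (formatTime defaultTimeLocale "%H:%M:%S" now) line)
    stampLines

main :: IO ()
main = do
  -- Flush after every line so the stamps stay useful when piped to a file.
  hSetBuffering stdout LineBuffering
  stampLines
```

Assuming this is compiled to an executable called timestamp (a made-up name), a build could be logged with something like: make 2>&1 | ./timestamp > build.log. Large gaps between consecutive stamps then point at the slow steps.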

Simon Marlow wrote:
On 26 July 2006 09:41, Joel Reymont wrote:
On Jul 25, 2006, at 2:57 PM, Simon Marlow wrote:
If you think your build is slow, try building it on Windows sometime :-(
Well, I re-built ghc from scratch on my PowerBook G4 1.25GHz with 1GB of memory. It took somewhere north of 7 hours. The MacBook Pro 2GHz looks speedy by comparison and 3+ hours look like nothing.
That's way too long. Next time, it might be a good idea to pipe the log file through something that adds timestamps to each line - I've attached a little Haskell program I use for this.
Cheers, Simon
When I use DarwinPorts on my PowerBook (1.33GHz, 2GB of RAM) it also takes hours and hours and hours. Next time I will try to get a record of exactly how long.
-- Chris

Joel Reymont
Thanks for the tip! I'm _really_ interested in why it takes 55 min on Linux and 3+ hours on Mac Intel, though. Any clues?
Building a compiler generally reads/touches/creates a very large number of files. So one possibility is the relative efficiency of the OS filesystem implementation. Apple's HFS+ is reputed to be fairly slow, as are the Microsoft filesystems (VFAT, NTFS), at least compared to the various unix-derived filesystems (UFS, ext2 etc). I recall from a few years back that building nhc98 took twice as long under Windows as under linux, on the very same machine, with the same versions of boot-compilers. The only major variable I could think of at the time was VFAT vs ext2. Regards, Malcolm

Joel Reymont
Thanks for the tip! I'm _really_ interested in why it takes 55 min on Linux and 3+ hours on Mac Intel, though. Any clues?
Another thought. The ghc HACKING guide has this to say:
    The GHC build tree is set up so that, by default, it builds a compiler ready for installing and using. That means full optimisation, and the build can take a *long* time. If you unpack your source tree and right away say "./configure; make", expect to have to wait a while. For hacking, you want the build to be quick - quick to build in the first place, and quick to rebuild after making changes. Tuning your build setup can make the difference between several hours to build GHC, and less than an hour. Here's how to do it.
http://cvs.haskell.org/cgi-bin/cvsweb.cgi/~checkout~/fptools/ghc/HACKING?con...
Regards, Malcolm

On Wed, Jul 26, 2006 at 11:54:37AM +0100, Malcolm Wallace wrote:
For hacking, you want the build to be quick - quick to build in the first place, and quick to rebuild after making changes. Tuning your build setup can make the difference between several hours to build GHC, and less than an hour. Here's how to do it.
This reminds me of something. I often use {-# NOINLINE ... #-} with ghc, combined with frugal export lists, to speed up the rebuild process, either when I know something is not going to benefit from being inlined or temporarily when I am making lots of internal changes to a file and want the rebuilds to go as fast as possible. However, whenever I change a data type or class, even if it is not exported, it seems to force a full rebuild of everything that depends on that file. Is there any fundamental reason this can't be fixed? Why do the non-exported classes and data types end up in the hi file anyway (assuming they appear in no exported function's type signature, of course)? Perhaps there could be a mode that means "optimize, but do so in a way that minimizes the need to rebuild anything", so it will do optimization and inlining within a module, but will avoid anything that changes the external interface in a way that will cause dependencies to need to be rebuilt.
John
-- John Meacham - ⑆repetae.net⑆john⑈
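For readers unfamiliar with the style John describes, here is a minimal sketch. The module name `Checksum` and both functions are made up for illustration. How much recompilation this actually saves depends on the GHC version and flags; the thread does not confirm it.

```haskell
-- Hypothetical module; every name here is illustrative.
-- Frugal export list: only 'checksum' is visible to importing modules.
module Checksum (checksum) where

-- Ask GHC not to inline 'checksum' at its use sites. The intent is that
-- edits to its body are then less likely to affect modules that import it.
{-# NOINLINE checksum #-}
checksum :: [Int] -> Int
checksum = foldl step 7

-- Internal helper: absent from the export list, so importers cannot
-- refer to it by name.
step :: Int -> Int -> Int
step acc x = (acc * 31 + x) `mod` 65521
```

The pragma and the export list do different jobs: the export list limits which names importers can use, while NOINLINE limits how much of an exported function's body is offered to them for optimisation.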

| However, whenever I change a data type or class even if they are not
| exported, it seems to force a full rebuild of everything that depends on
| that file. Is there any fundamental reason this can't be fixed? why do
| the non exported classes and data types end up in the hi file anyway
| (assuming they appear in no exported functions type signature of course)
There's no fundamental reason. I think I just thought that it'd be seldom for a data type or class to be defined only internally to a module, and not exported at all. Somewhat more common is for the *implementation* of the data type (i.e. its data constructors) to be internal, but the type itself is exported. So then one would want to have a partial spec in the interface file, giving the kind but not the constructors. Again, I didn’t work on this case. I'll add a Trac feature request, so we don't forget this.
Simon

Simon Peyton-Jones wrote:
| However, whenever I change a data type or class even if they are not
| exported, it seems to force a full rebuild of everything that depends on
| that file. Is there any fundamental reason this can't be fixed? why do
| the non exported classes and data types end up in the hi file anyway
| (assuming they appear in no exported functions type signature of course)
There's no fundamental reason. I think I just thought that it'd be seldom for a data type or class to be defined only internally to a module, and not exported at all.
Somewhat more common is for the *implementation* of the data type (i.e. its data constructors) to be internal, but the type itself is exported. So then one would want to have a partial spec in the interface file, giving the kind but not the constructors. Again, I didn’t work on this case.
That's funny, I was under the impression that we had fixed this at some stage in the past - that is, a data type can be exported without its constructors in an interface. I remember because it caused a bunch of bugs when the code generator couldn't figure out how to evaluate a type because it had no constructors. Maybe we should look at the code :-)
Cheers, Simon
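To make the case under discussion concrete, here is a small sketch of a type exported without its constructors. The module `Stack` and its functions are hypothetical and not taken from GHC's sources. It shows only the source-level pattern, not what GHC writes into the interface file.

```haskell
-- Hypothetical module; names are illustrative.
-- 'Stack' appears in the export list without '(..)', so the type is
-- exported but its constructor 'MkStack' stays private to this module.
module Stack (Stack, empty, push, pop) where

data Stack a = MkStack [a]

-- The empty stack.
empty :: Stack a
empty = MkStack []

-- Put an element on top of the stack.
push :: a -> Stack a -> Stack a
push x (MkStack xs) = MkStack (x : xs)

-- Remove the top element, if there is one.
pop :: Stack a -> Maybe (a, Stack a)
pop (MkStack [])       = Nothing
pop (MkStack (x : xs)) = Just (x, MkStack xs)
```

An importing module can mention `Stack Int` in its own type signatures and call `push` and `pop`, but it cannot pattern match on `MkStack`. This is the situation where, in principle, the interface would only need to record the type's kind rather than its constructors.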
participants (8): Chris Kuklewicz, Duncan Coutts, Joel Reymont, John Meacham, Malcolm Wallace, Simon Marlow, Simon Marlow, Simon Peyton-Jones