
Recently the "go" language was announced at golang.org. There's not a lot in there to make a Haskeller envious, except one really big thing: compilation speed. The go compiler is wonderfully speedy. Does anyone have tips for speeding up ghc? Not using -O2 helps a lot. I spend a lot of time linking (the ghc API drags in huge amounts of ghc code), and I'm hoping the new dynamic linking work will speed that up.

I suppose it should be possible to run the whole thing under the bytecode compiler, and this works fine for rerunning tests: I can stay in ghci, make changes, :r, and rerun. But it runs into trouble as soon as the code wants to link in foreign C. I also recently discovered -fobject-code, which indeed starts compiling right away, cutting out the ghc startup overhead. However, it doesn't appear to help with the final link, so I wind up having to reinvoke ghc anyway.

According to Rob Pike, the main reason for 6g's speed is that in a dependency tree where A depends on B, which depends on C, the interface for B pulls up all the info needed from C, so compiling A only needs to look at B. Would it help ghc at all if it did the same with .hi files? I've heard that ghc does more cross-module inlining than your typical imperative language, but with optimization off maybe we can ignore all that?

I've seen various bits of noise about supporting parallel builds with --make, and it seems to involve making the whole compiler re-entrant, which is non-trivial. Would it be simpler to parallelize pure portions of the compilation, say with parallel strategies? Or just start one ghc per core and have a locking scheme so they don't step on each other's files?
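That last idea can be sketched in Haskell itself: a semaphore bounds the number of concurrent compile jobs to the core count. Here compileModule and the module names are hypothetical stand-ins; a real driver would shell out to ghc -c via System.Process and respect the module dependency graph.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)

-- Hypothetical stand-in for shelling out to "ghc -c M.hs"; a real
-- driver would use System.Process and honour the dependency graph.
compileModule :: String -> IO FilePath
compileModule m = return (m ++ ".o")

-- Compile a batch of independent modules, at most 'cores' at a time,
-- using a semaphore as the locking scheme so workers don't pile up.
parallelCompile :: Int -> [String] -> IO [FilePath]
parallelCompile cores mods = do
  sem   <- newQSem cores
  slots <- mapM (const newEmptyMVar) mods
  mapM_ (\(m, slot) -> forkIO $ do
            waitQSem sem
            obj <- compileModule m
            signalQSem sem
            putMVar slot obj)
        (zip mods slots)
  mapM takeMVar slots          -- collect results in the original order

main :: IO ()
main = parallelCompile 2 ["A", "B", "C"] >>= print
```

Because each worker writes into its own MVar, the results come back in submission order regardless of scheduling: this prints ["A.o","B.o","C.o"].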

Hello Evan, Thursday, November 12, 2009, 4:02:17 AM, you wrote:
Recently the "go" language was announced at golang.org. There's not a lot in there to make a haskeller envious, except one real big one: compilation speed. The go compiler is wonderfully speedy.
Have you seen hugs, for example? I think that ghc is slow because it's written in Haskell and compiled by itself. Hugs provides a good interactive environment and good ghc compatibility; you can use conditional compilation to hide the remaining differences. Unfortunately, many Haskell libs don't support hugs. -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Thu, Nov 12, 2009 at 7:18 AM, Bulat Ziganshin wrote:
Hello Evan,
Thursday, November 12, 2009, 4:02:17 AM, you wrote:
Recently the "go" language was announced at golang.org. There's not a lot in there to make a haskeller envious, except one real big one: compilation speed. The go compiler is wonderfully speedy.
Have you seen hugs, for example? I think that ghc is slow because it's written in Haskell and compiled by itself.
If I understood correctly, Evan is interested in ideas to speed up compilation. As far as I know, hugs is an interpreter, not a compiler.

On Wed, Nov 11, 2009 at 11:22 PM, David Virebayre wrote:
On Thu, Nov 12, 2009 at 7:18 AM, Bulat Ziganshin wrote:
Hello Evan,
Thursday, November 12, 2009, 4:02:17 AM, you wrote:
Recently the "go" language was announced at golang.org. There's not a lot in there to make a haskeller envious, except one real big one: compilation speed. The go compiler is wonderfully speedy.
Have you seen hugs, for example? I think that ghc is slow because it's written in Haskell and compiled by itself.
If I understood correctly, Evan is interested in ideas to speed up compilation. As far as I know, hugs is an interpreter, not a compiler.
Well, the bottom line is a faster "make a change, see it in action" cycle. As I mentioned, ghci's bytecode compiler is pretty good as long as I don't have to recompile the unchanged modules, but I've never been able to get it to work once I have C libraries to link in: it doesn't take the same flags as the real linker (and this is OS X, so there's that funky framework stuff), and no matter how many libraries I try to put in manually it winds up with some missing symbol.

I should give hugs a try, but I suspect it may have the same problem. I also seem to recall it can't save and reload the bytecode for unchanged modules, which is going to be slow no matter how fast the actual compilation is. Hugs is also going to have trouble linking in the ghc API... though to load code at runtime it might be faster and smaller to link in hugs rather than the ghc API.

Hello David, Thursday, November 12, 2009, 10:22:41 AM, you wrote:
Have you seen hugs, for example? I think that ghc is slow because it's written in Haskell and compiled by itself.
If I understood correctly, Evan is interested in ideas to speed up compilation. As far as I know, hugs is an interpreter, not a compiler.
It's impossible to interpret Haskell directly -- how would you do type inference? Hugs, like ghci, is a bytecode interpreter; the difference is their implementation languages, Haskell vs. C. -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bulat Ziganshin wrote:
It's impossible to interpret Haskell directly -- how would you do type inference? Hugs, like ghci, is a bytecode interpreter; the difference is their implementation languages, Haskell vs. C.
We use Standard ML for the Isabelle/HOL theorem prover, and it's interpreted, even has an interactive toplevel. It uses type inference, does it not? In fact, in a not-very-serious discussion at some point of what one could replace javascript with for a browser-embedded language, SML came up. What makes Haskell so different that it can't be interpreted in the SML style? Sincerely, Rafal Kolanski.

Hello Rafal, Thursday, November 12, 2009, 3:10:54 PM, you wrote:
It's impossible to interpret Haskell directly -- how would you do type inference? Hugs, like ghci, is a bytecode interpreter; the difference is their implementation languages, Haskell vs. C.
We use Standard ML for the Isabelle/HOL theorem prover, and it's interpreted, even has an interactive toplevel. It uses type inference, does it not? In fact, in a not-very-serious discussion at some point of what one could replace javascript with for a browser-embedded language, SML came up.
ghc also has an interactive toplevel; it compiles Haskell down to bytecode, though, and type inference is part of that compilation process. AFAIK, OCaml also generates bytecode; I don't know about Isabelle. -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
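One way to see why an interpreter and type inference are compatible: inference is a front-end pass, and once a program has been checked, the back end is free to interpret it rather than compile it. A minimal sketch for a hypothetical toy expression language (not Haskell itself), with checking and evaluation as clearly separate phases:

```haskell
-- A toy typed expression language: type checking runs once, up front;
-- evaluation afterwards is plain interpretation. This is the shape of
-- any interpreter for a type-inferred language.
data Ty = TInt | TBool deriving (Eq, Show)

data Expr
  = Lit Int
  | BoolLit Bool
  | Add Expr Expr
  | If Expr Expr Expr

-- Phase 1: type checking (a degenerate "inference" with no variables).
check :: Expr -> Maybe Ty
check (Lit _)     = Just TInt
check (BoolLit _) = Just TBool
check (Add a b)   = do TInt <- check a
                       TInt <- check b
                       Just TInt
check (If c t e)  = do TBool <- check c
                       tt <- check t
                       te <- check e
                       if tt == te then Just tt else Nothing

-- Phase 2: interpretation of an already-checked term.
data Val = VInt Int | VBool Bool deriving Show

eval :: Expr -> Val
eval (Lit n)     = VInt n
eval (BoolLit b) = VBool b
eval (Add a b)   = let (VInt x, VInt y) = (eval a, eval b) in VInt (x + y)
eval (If c t e)  = case eval c of VBool True -> eval t
                                  _          -> eval e

main :: IO ()
main = case check prog of
         Just ty -> putStrLn (show ty ++ ": " ++ show (eval prog))
         Nothing -> putStrLn "type error"
  where prog = If (BoolLit True) (Add (Lit 1) (Lit 2)) (Lit 0)
```

Running it prints "TInt: VInt 3"; an ill-typed term such as Add (Lit 1) (BoolLit True) is rejected before eval ever sees it.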

Regarding speeding up linking or compilation, IMO the real speedup would come from incremental compilation & linking. It's okay if the initial compilation & linking take a long time, but the duration of subsequent c&l iterations should depend only on the number of changes one makes, not on the total project size.
On Thu, Nov 12, 2009 at 1:10 PM, Rafal Kolanski wrote:
Bulat Ziganshin wrote:
It's impossible to interpret Haskell directly -- how would you do type inference? Hugs, like ghci, is a bytecode interpreter; the difference is their implementation languages, Haskell vs. C.
We use Standard ML for the Isabelle/HOL theorem prover, and it's interpreted, even has an interactive toplevel. It uses type inference, does it not? In fact, in a not-very-serious discussion at some point of what one could replace javascript with for a browser-embedded language, SML came up.
What makes Haskell so different that it can't be interpreted in the SML style?
Sincerely,
Rafal Kolanski.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hello Peter, Thursday, November 12, 2009, 3:26:21 PM, you wrote: "Incremental" is just a word -- what exactly do we mean? ghc, like any other .obj-generating compiler, doesn't recompile unchanged source files (if their dependencies haven't changed either). On the other hand, my old ghc 6.6 recompiles Main.hs if the imported Sub.hs gains a new declaration (even one unused in Main), so that could be improved somehow.
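The improvement hinted at here can be modelled in a few lines: trigger a rebuild only when something the client actually uses has changed. This is a toy model with hypothetical names, not how ghc's real .hi checking is implemented:

```haskell
import qualified Data.Map as Map

-- An interface is a map from exported names to their (rendered) types.
type Iface = Map.Map String String

-- Main should be recompiled only if some name it actually imports from
-- Sub changed type or disappeared; adding an unused declaration to Sub
-- should not trigger a rebuild.
needsRecompile :: [String] -> Iface -> Iface -> Bool
needsRecompile used old new =
  any (\n -> Map.lookup n old /= Map.lookup n new) used

oldSub, newSub :: Iface
oldSub = Map.fromList [("f", "Int -> Int")]
newSub = Map.fromList [("f", "Int -> Int"), ("g", "Bool")]  -- new, unused decl

main :: IO ()
main = print (needsRecompile ["f"] oldSub newSub)  -- False: no rebuild needed
```

With this policy, adding the unused g to Sub leaves Main alone, while changing f's type would flip the answer to True.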
Regarding speeding up linking or compilation, IMO the real speedup you would get from incremental compilation & linking. It's okay if the initial compilation & linking take a long time, but the duration of next c&l iterations should only depend on the number of changes one does, not on the total project size.
On Thu, Nov 12, 2009 at 1:10 PM, Rafal Kolanski wrote:
Bulat Ziganshin wrote:
It's impossible to interpret Haskell directly -- how would you do type inference? Hugs, like ghci, is a bytecode interpreter; the difference is their implementation languages, Haskell vs. C.
We use Standard ML for the Isabelle/HOL theorem prover, and it's interpreted, even has an interactive toplevel. It uses type inference, does it not? In fact, in a not-very-serious discussion at some point of what one could replace javascript with for a browser-embedded language, SML came up.
What makes Haskell so different that it can't be interpreted in the SML style?
Sincerely,
Rafal Kolanski.
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Thu, Nov 12, 2009 at 12:39 PM, Bulat Ziganshin wrote:
Hello Peter, Thursday, November 12, 2009, 3:26:21 PM, you wrote: incremental is just a word. what exactly we mean?
Incremental linking means the general idea of reusing previous linking results, only patching them up with respect to the changed obj files. So it's about reducing link times, not compile times. This has various consequences for executable size etc., so it's not something you'd want to do for release builds, I think...
Here's the documentation for VC's incremental linking option:
http://msdn.microsoft.com/en-us/library/4khtbfyf(VS.80).aspx
--
Sebastian Sylvan

Hi,
I'd really love a faster GHC! I spend hours every day waiting for GHC, so any improvements would be most welcome.
I remember when developing Yhc on a really low-powered computer: it had around 200 modules and loaded from scratch (with all the Prelude etc.) in about 3 seconds on Hugs. ghc --make took about that long to start compiling the first file, and I think a complete compile was around 5 minutes. It's one of the main reasons I stuck with Hugs for so long.
Running GHC in parallel with --make would be nice, but I find on Windows that the link time is the bottleneck for most projects.
Thanks, Neil
2009/11/12 Bulat Ziganshin
Hello Evan,
Thursday, November 12, 2009, 4:02:17 AM, you wrote:
Recently the "go" language was announced at golang.org. There's not a lot in there to make a haskeller envious, except one real big one: compilation speed. The go compiler is wonderfully speedy.
Have you seen hugs, for example? I think that ghc is slow because it's written in Haskell and compiled by itself.
Hugs provides a good interactive environment and good ghc compatibility; you can use conditional compilation to hide the remaining differences. Unfortunately, many Haskell libs don't support hugs.
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Hello Neil, Thursday, November 12, 2009, 1:57:06 PM, you wrote:
I'd really love a faster GHC!
there are a few obvious ideas: 1) use the Binary package for .hi files; 2) allow saving/loading bytecode; 3) allow running a program directly from .hi files, without linking; 4) save the mix of all .hi files as a "program database" using mysql or so. The second one may be useful for hugs too. Also, I once asked you for CPP support in winhugs ;) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Thu, Nov 12, 2009 at 2:57 AM, Neil Mitchell wrote:
Hi,
I'd really love a faster GHC! I spend hours every day waiting for GHC, so any improvements would be most welcome.
Has anyone built a profiling enabled GHC to get data on where GHC spends time during compilation?
I remember when developing Yhc on a really low powered computer, it had around 200 modules and loaded from scratch (with all the Prelude etc) in about 3 seconds on Hugs. ghc --make took about that long to start compiling the first file, and I think a complete compile was around 5 minutes. It's one of the main reasons I stuck with Hugs for so long.
Running GHC in parallel with --make would be nice, but I find on Windows that the link time is the bottleneck for most projects.
Yes, when GHC calls GNU ld, it can be very costly. In my experience, on a linux virtual host I had to build my own GHC to disable split-objs, because having it enabled caused ld to use about 1GB of memory. This is insane for a virtual host. I tried to solve it by adding swap, but that meant linking a trivial Setup.hs took about an hour. In conclusion, improving GNU ld could be a huge win for GHC, at least on linux. Does GHC on windows use GNU ld? Jason

Running GHC in parallel with --make would be nice, but I find on Windows that the link time is the bottleneck for most projects.
Yes, when GHC calls GNU ld, it can be very costly.
This is also my experience. GNU ld is old and slow, and I believe its generality also hurts it; there is a much faster linker called gold, but it's ELF-only. The reference to incremental linking was interesting, but AFAIK GNU and Apple ld don't support it.
In my experience, on a linux virtual host I had to build my own GHC to disable split-objs, because having it enabled caused ld to use about 1GB of memory. This is insane for a virtual host. I tried to solve it by adding swap, but that meant linking a trivial Setup.hs took about an hour.
Oh, this is interesting. I recently stumbled across split-objs, and I gathered it's a special ghc hack to reduce binary size by putting each function in its own obj file (I guess ld is not smart enough to link in only the code that's used?). If it slows down links, would it be worthwhile to disable split-objs for normal builds and turn it on along with -O2 for production builds?
In conclusion, improving GNU ld could be a huge win for GHC, at least on linux. Does GHC on windows use GNU ld?
Improving GNU ld would be a huge win in a lot of places, and the fact that no one has done it (excepting gold, of course) goes to show it's a lot easier said than done!
On Thu, Nov 12, 2009 at 4:58 PM, Richard O'Keefe wrote:
On Nov 12, 2009, at 2:02 PM, Evan Laforge wrote:
Recently the "go" language was announced at golang.org.
It looks a lot like Limbo; does it have Limbo's dynamic loading?
Nope, I don't think it's that similar to Limbo actually, though it does have channels. It reminds me of Haskell in some places, for example no implicit conversions, RTS-multiplexed lightweight threads, and interfaces, which are vaguely similar to typeclasses. The channels are a subset of TChans; its select {} is like a non-nesting orElse restricted to reading from channels.
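The correspondence can be made concrete with the stm package: composing two channel reads with orElse behaves like Go's two-armed select over receives, except that unlike Go's select it nests and works over arbitrary STM actions. A minimal sketch:

```haskell
import Control.Concurrent.STM

-- Go's  select { case v := <-a: ...; case v := <-b: ... }
-- corresponds to composing two channel reads with orElse: if the first
-- read would block (retry), the alternative is tried instead.
selectChan :: TChan a -> TChan a -> STM a
selectChan a b = readTChan a `orElse` readTChan b

main :: IO ()
main = do
  a <- newTChanIO
  b <- newTChanIO
  atomically (writeTChan b "from b")
  v <- atomically (selectChan a b)
  putStrLn v   -- prints "from b": a is empty, so orElse falls through to b
```

Because orElse is itself an STM action, selectChan can be chained over any number of channels, which is the nesting Go's select lacks.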
According to Rob Pike, the main reason for 6g's speed
It's clear that 6g doesn't do as much optimisation as gccgo. It probably doesn't do as much optimisation as GHC. And it certainly doesn't have any kind of generics, let alone type-level programming. I'd say the semantic distance between 'go' and x86 is quite a bit less than that between Haskell and x86. No laziness!
Indeed, the language is closer to the hardware and the type system is simpler. However, ghc can also be run without optimization. I think the main issue is that the designers had compilation speed as a feature from the beginning and implemented some neat tricks to that end, which is why I mentioned how it pulls dependencies up to minimize file reading. Of course, it could be that ghc already accomplishes the same end with --make by just keeping the interfaces in memory, or that file-reading time is dwarfed by compiler cogitation. And a research language needs a flexible, evolving compiler; maybe that's incompatible with a fast one. But fast builds are such a pleasure!
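Pike's trick can be modelled in a few lines: when B is compiled, whatever B exposes from C is folded into B's own interface, so compiling A opens only B's interface and never C's. A toy model with hypothetical names, nothing like ghc's actual .hi format:

```haskell
import qualified Data.Map as Map

type Iface = Map.Map String String  -- exported name -> rendered type

-- When compiling B, fold everything B's clients will need from B's
-- dependencies into B's own interface file; compiling A then reads
-- only B's interface. Map.unions is left-biased, so B's own exports win.
flattenIface :: Iface -> [Iface] -> Iface
flattenIface own deps = Map.unions (own : deps)

cIface, bIface :: Iface
cIface = Map.fromList [("helper", "Int -> Int")]
bIface = flattenIface (Map.fromList [("run", "IO ()")]) [cIface]

main :: IO ()
main = print (Map.lookup "helper" bIface)  -- Just "Int -> Int": no need to open C's interface
```

The price, of course, is that interface files grow with the transitive closure of their dependencies; the payoff is that each compilation reads a constant number of them.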

On 13/11/09 01:52, Evan Laforge wrote:
Running GHC in parallel with --make would be nice, but I find on Windows that the link time is the bottleneck for most projects.
Yes, when GHC calls GNU ld, it can be very costly.
This is also my experience. GNU ld is old and slow, and I believe its generality also hurts it; there is a much faster linker called gold, but it's ELF-only.
For someone like me, who only is interested in ELF on Linux, is there some way of getting GHC to use gold instead of ld?
In conclusion, improving GNU ld could be a huge win for GHC, at least on linux. Does GHC on windows use GNU ld?
Improving GNU ld would be a huge win in a lot of places, and the fact that no one has done it (excepting gold of course) goes to show it's a lot easier said than done!
It could also mean that ld is good enough. Similar to how CVS was good enough for an awfully long time... ;-) /M -- Magnus Therning (OpenPGP: 0xAB4DFBA4) magnus@therning.org Jabber: magnus@therning.org http://therning.org/magnus identi.ca|twitter: magthe

Jason Dagit wrote:
Running GHC in parallel with --make would be nice, but I find on Windows that the link time is the bottleneck for most projects.
Yes, when GHC calls GNU ld, it can be very costly.
I'll add mine: On my Ubuntu systems, linking is nearly instantaneous. On RedHat systems, it takes a noticeable amount of time - perhaps five to ten seconds. I'm not sure what causes the difference; I can try to investigate if it is of interest. -k -- If I haven't seen further, it is by standing in the footprints of giants

Excerpts from Jason Dagit's message of Fri Nov 13 02:25:06 +0100 2009:
On Thu, Nov 12, 2009 at 2:57 AM, Neil Mitchell wrote:
Hi,
I'd really love a faster GHC! I spend hours every day waiting for GHC, so any improvements would be most welcome.
Has anyone built a profiling enabled GHC to get data on where GHC spends time during compilation?
I remember when developing Yhc on a really low powered computer, it had around 200 modules and loaded from scratch (with all the Prelude etc) in about 3 seconds on Hugs. ghc --make took about that long to start compiling the first file, and I think a complete compile was around 5 minutes. It's one of the main reasons I stuck with Hugs for so long.
Running GHC in parallel with --make would be nice, but I find on Windows that the link time is the bottleneck for most projects.
Yes, when GHC calls GNU ld, it can be very costly.
I confirm that I also had this experience on Arch Linux: GNU ld was allocating gigs of memory. However, this is very hard to reproduce. Actually, I've wrapped /usr/bin/ld with a timeout :) -- Nicolas Pouillard http://nicolaspouillard.fr

On Nov 12, 2009, at 2:02 PM, Evan Laforge wrote:
Recently the "go" language was announced at golang.org.
It looks a lot like Limbo; does it have Limbo's dynamic loading?
According to Rob Pike, the main reason for 6g's speed
It's clear that 6g doesn't do as much optimisation as gccgo. It probably doesn't do as much optimisation as GHC. And it certainly doesn't have any kind of generics, let alone type-level programming. I'd say the semantic distance between 'go' and x86 is quite a bit less than that between Haskell and x86. No laziness!

A good trick is to use NOINLINE and restricted module exports to ensure changes in one module don't cause others to be recompiled. A common idiom is something like: module TypeAnalysis (typeAnalyze) where -- the module itself is a fairly large, complicated beast, but it has the single entry point typeAnalyze. By putting a {-# NOINLINE typeAnalyze #-} in there, you can be sure that changes to the module don't, in general, cause other modules to be recompiled. John -- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/
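John's idiom in miniature; the analysis body here is a hypothetical stand-in, and in a real project TypeAnalysis would be its own file rather than squeezed into Main:

```haskell
-- In a real project this lives in its own file:
--   module TypeAnalysis (typeAnalyze) where
-- Exporting a single entry point and marking it NOINLINE means its
-- unfolding never crosses the module boundary, so edits to the module's
-- internals don't force clients to be recompiled for inlining's sake.
module Main where

{-# NOINLINE typeAnalyze #-}
typeAnalyze :: String -> Int
typeAnalyze = analyze . words
  where
    -- hypothetical stand-in for a large, frequently edited analysis
    analyze = length

main :: IO ()
main = print (typeAnalyze "let x = 1 in x")
```

The export list hides every helper, and NOINLINE keeps typeAnalyze's body out of the .hi file, which together make the module's interface stable across internal edits.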
participants (13)
- Bulat Ziganshin
- David Virebayre
- Evan Laforge
- Jason Dagit
- John Meacham
- Ketil Malde
- Magnus Therning
- Neil Mitchell
- Nicolas Pouillard
- Peter Verswyvelen
- Rafal Kolanski
- Richard O'Keefe
- Sebastian Sylvan