GHC 6.8.1 is impressive!

I just finished running Bluespec's regression suite on a version of our tools compiled with ghc 6.8.1. The results were impressive on two fronts:

1. All of our tests (almost 14,000) had the same behavior as with ghc 6.6.1
2. Our Haskell code was roughly 33% faster (relative to ghc 6.6.1). It seems that pointer-tagging made a big difference for our code base (since, if I'm reading the release notes correctly, constructor specialization isn't in yet).

We just wanted to take a moment to say that we were pleasantly surprised by ghc 6.8.1 (so much so that we're going to try and use it for released builds sooner than we would have otherwise planned) and to thank everyone involved for the effort they put into it.

Thank you,

- Ravi Nanavati (and the rest of Bluespec, Inc.)

| I just finished running Bluespec's regression suite on a version of
| our tools compiled with ghc 6.8.1. The results were impressive on two
| fronts:
|
| 1. All of our tests (almost 14,000) had the same behavior as with ghc 6.6.1
| 2. Our Haskell code was roughly 33% faster (relative to ghc 6.6.1). It
| seems that pointer-tagging made a big difference for our code base
| (since, if I'm reading the release notes correctly, constructor
| specialization isn't in yet).
|
| We just wanted to take a moment to say that we were pleasantly
| surprised by ghc 6.8.1

Thanks for taking the time to let us know Ravi.

Constructor specialisation (aka call-pattern specialisation) is in 6.8.1; try -O2 and see what difference that makes. I'd be interested to know.

Simon
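For readers who have not met the term, the following is a minimal sketch of the kind of function that call-pattern specialisation targets. It is written by hand for illustration: the names `lastElem`, `lastCons` and `lastElemSpec` are hypothetical, nothing here comes from Bluespec's code, and whether GHC performs this exact rewrite on a given program depends on the compiler version and flags.

```haskell
module Main where

-- Original definition: the recursive call builds a cons cell (y : ys)
-- only for the callee to take it apart again immediately.
lastElem :: [a] -> Maybe a
lastElem []       = Nothing
lastElem (x : xs) = case xs of
  []       -> Just x
  (y : ys) -> lastElem (y : ys)

-- Hand-written copy of lastElem specialised to the call pattern
-- (y : ys): head and tail arrive as separate arguments, so no cons
-- cell has to be allocated or re-inspected inside the loop.
lastCons :: a -> [a] -> Maybe a
lastCons y ys = case ys of
  []       -> Just y
  (z : zs) -> lastCons z zs

-- Wrapper that dispatches to the specialised loop.
lastElemSpec :: [a] -> Maybe a
lastElemSpec []       = Nothing
lastElemSpec (x : xs) = lastCons x xs

main :: IO ()
main = do
  let inputs = [[], [1], [1, 2, 3]] :: [[Int]]
  -- Both versions should agree on every input.
  print (map lastElem inputs)
  print (map lastElemSpec inputs)
```

Comparing the two definitions shows the idea: the specialised loop never rebuilds the list cell that the next iteration would only pattern-match on.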

On 11/9/07, Simon Peyton-Jones wrote:
> Thanks for taking the time to let us know Ravi.
> Constructor specialisation (aka call-pattern specialisation) is in 6.8.1; try -O2 and see what difference that makes. I'd be interested to know.
In that case, the 33% I cite above includes constructor specialization, since we compile with -O2 anyway. I just wasn't sure that it was in since it wasn't called out in the release notes.

On that topic, does anyone know a clever way to get a more detailed changelog between 6.6.1 and 6.8.1? It would be useful for situations like that when I'm curious about changes that didn't make the release notes.

Thanks,

- Ravi

| In that case, the 33% I cite above includes constructor
| specialization, since we compile with -O2 anyway. I just wasn't sure
| that it was in since it wasn't called out in the release notes.

Probably because it was in 6.6 too, only less polished.

| On
| that topic, does anyone know a clever way to get a more detailed
| changelog between 6.6.1 and 6.8.1? It would be useful for situations
| like that when I'm curious about changes that didn't make the release
| notes.

Well the release notes (not the announcement) are pretty detailed:

http://haskell.org/ghc/docs/6.8.1/html/users_guide/release-6-8-1.html

Beyond that, it's 'darcs changes', but you'll get a *lot* of output!

Simon

I'd like to second that. 6.8 is quite an improvement. Well done!
-- Lennart
On Nov 8, 2007 12:18 PM, Ravi Nanavati wrote:
> I just finished running Bluespec's regression suite on a version of our tools compiled with ghc 6.8.1. The results were impressive on two fronts:
>
> 1. All of our tests (almost 14,000) had the same behavior as with ghc 6.6.1
> 2. Our Haskell code was roughly 33% faster (relative to ghc 6.6.1). It seems that pointer-tagging made a big difference for our code base (since, if I'm reading the release notes correctly, constructor specialization isn't in yet).
>
> We just wanted to take a moment to say that we were pleasantly surprised by ghc 6.8.1 (so much so that we're going to try and use it for released builds sooner than we would have otherwise planned) and to thank everyone involved for the effort they put into it.
>
> Thank you,
>
> - Ravi Nanavati (and the rest of Bluespec, Inc.)

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

I'd like to third that. The main improvement for GF is that the build is 5 times faster, 3 minutes with 6.8.1 instead of 15 minutes with 6.6.1, using -O2 with both. The compiled program runs 13% faster with 6.8.1.

Code: http://www.cs.chalmers.se/Cs/Research/Language-technology/darcs/GF/
Machine: 1.83 GHz MacBook, 2 GB RAM, OS X 10.4.10

/Björn

On Nov 9, 2007, at 8:24, Lennart Augustsson wrote:
> I'd like to second that. 6.8 is quite an improvement. Well done!
> -- Lennart

New ghc sped up my small app (~2000 lines) by ~38%. Nice job!
Anyway, my application is a bit slower when compiled with -O2 compared to -01 only (both with ghc 6.6 and 6.8).
Is that normal?

Peter.

Lennart Augustsson wrote:
> I'd like to second that. 6.8 is quite an improvement. Well done!

O2 mainly switches on two transformations: "liberate case" and "call-pattern specialisation". (I think it also gets passed on to gcc.)

Trying
    -O2 -fno-liberate-case, and
    -O2 -fno-spec-constr
might tell which was making the difference.

Simon

| -----Original Message-----
| From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-bounces@haskell.org] On Behalf Of Peter Hercek
| Sent: 09 November 2007 14:19
| To: glasgow-haskell-users@haskell.org
| Subject: Re: GHC 6.8.1 is impressive!
|
| New ghc sped up my small app (~2000 lines) by ~38%. Nice job!
| Anyway, my application is a bit slower when compiled with -O2
| compared to -01 only (both with ghc 6.6 and 6.8).
| Is that normal?
|
| Peter.
|
| Lennart Augustsson wrote:
| > I'd like to second that. 6.8 is quite an improvement. Well done!
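As a rough illustration of the "liberate case" transformation Simon mentions, here is a hand-written before/after sketch. The GHC User's Guide describes the pass as unrolling a recursive function once in its own right-hand side to avoid repeated case analysis of free variables; the function names below (`sumScaled`, `sumScaledLiberated`) are hypothetical and the "after" version is only an approximation of what the optimiser might produce.

```haskell
module Main where

-- Before: the free variable p is taken apart on every iteration of go.
sumScaled :: (Int, Int) -> [Int] -> Int
sumScaled p = go
  where
    go []       = 0
    go (x : xs) = case p of
      (a, _) -> a * x + go xs

-- After (written by hand): the first iteration is unrolled, so p is
-- scrutinised once and the inner loop uses the unpacked field directly.
-- An empty input still never forces p, just as in the original.
sumScaledLiberated :: (Int, Int) -> [Int] -> Int
sumScaledLiberated p xs0 = case xs0 of
  []       -> 0
  (x : xs) -> case p of
    (a, _) ->
      let loop []       = 0
          loop (y : ys) = a * y + loop ys
      in a * x + loop xs

main :: IO ()
main = do
  -- Both versions should print the same result.
  print (sumScaled (3, 0) [1, 2, 3])
  print (sumScaledLiberated (3, 0) [1, 2, 3])
```

The unrolling duplicates the loop body, which is one plausible reason the pass can occasionally make a program slightly slower rather than faster.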

Each test I mention here is actually 3 or 4 application runs. If there were 4 runs then the first one was discarded, so there are still only 3 results available in one test. The idea is that I discard the first run if it had a significantly higher page fault count.

OK, it is no longer a 100% probability that -O2 is slower than -O with my application. I reran the tests I was running before 3 more times, and in one case the -O2 variant was quicker. Before that I did the comparison about 3 times, so that would indicate that -O2 is slower with about 83% probability (5 out of 6) :-)

The time differences are minuscule, but they do not seem to be the result of good or bad luck alone.

I have optimized the code only once, to reduce memory consumption. Speed was always good enough for me. It is a Gtk2Hs application which draws charts. Data are read from a text file, preprocessed, and a chart is shown.

Ignore real times in the results, since I need to fill in one edit box to run it for a longer time (to process more input data) and differences in my typing speed account for most of the real time differences.

Before each compile the project was cleaned. Options were always like this:
--make -Wall <theOptimizationOptions> <fileList>

The machine:
Windows XP 64bit (running 32 bit Haskell and the app.)
2GiB DDR400 RAM
Athlon XP64 X2 4800+
C&Q disabled (but it does not seem to have impact)

-O -fexcess-precision
real 25.391  user 19.109  system 0.359  cpu 19.469  page_faults 79315
real 25.188  user 19.141  system 0.453  cpu 19.594  page_faults 79314
real 25.000  user 19.031  system 0.375  cpu 19.406  page_faults 79302

-O2 -fexcess-precision
real 24.922  user 19.141  system 0.438  cpu 19.578  page_faults 78550
real 25.266  user 18.984  system 0.484  cpu 19.469  page_faults 78538
real 25.000  user 19.109  system 0.563  cpu 19.672  page_faults 78539

-O2 -fno-liberate-case -fexcess-precision
real 24.516  user 18.844  system 0.453  cpu 19.297  page_faults 79310
real 24.219  user 18.875  system 0.438  cpu 19.313  page_faults 78203
real 24.375  user 18.656  system 0.516  cpu 19.172  page_faults 79305

-O2 -fno-spec-constr -fexcess-precision
real 24.203  user 18.641  system 0.719  cpu 19.359  page_faults 78543
real 24.719  user 18.781  system 0.625  cpu 19.406  page_faults 78536
real 24.688  user 19.000  system 0.500  cpu 19.500  page_faults 78536

So it looks like liberate-case hurts my app a bit and something else in -O2 is helping a bit. But I do not mind since it is quick enough. I just found it interesting that -O2 is not helping.

If you would like some more tests let me know.

Peter.

Simon Peyton-Jones wrote:
> O2 mainly switches on two transformations: "liberate case" and "call-pattern specialisation". (I think it also gets passed on to gcc.)
> Trying -O2 -fno-liberate-case, and -O2 -fno-spec-constr
> might tell which was making the difference.
> Simon
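To make runs like the ones Peter reports above easier to compare, the user times can be summarised per flag set. The sketch below is not part of the original thread: the numbers are copied from Peter's message, while the helper names (`userTimes`, `mean`, `spread`, `report`) are made up for this example. With only three samples per configuration, the summary is indicative at best.

```haskell
module Main where

import Text.Printf (printf)

-- (flags, user-time samples in seconds), copied from the runs above.
-- -fexcess-precision was used in every configuration and is omitted here.
userTimes :: [(String, [Double])]
userTimes =
  [ ("-O",                     [19.109, 19.141, 19.031])
  , ("-O2",                    [19.141, 18.984, 19.109])
  , ("-O2 -fno-liberate-case", [18.844, 18.875, 18.656])
  , ("-O2 -fno-spec-constr",   [18.641, 18.781, 19.000])
  ]

-- Arithmetic mean of a non-empty list of samples.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Difference between the slowest and fastest sample.
spread :: [Double] -> Double
spread xs = maximum xs - minimum xs

-- Print one summary line for a configuration.
report :: (String, [Double]) -> IO ()
report (flags, ts) =
  printf "%-24s mean %.3f s, spread %.3f s\n" flags (mean ts) (spread ts)

main :: IO ()
main = mapM_ report userTimes
```

Printing the spread next to the mean is a deliberate choice: if the gap between two configurations is smaller than the run-to-run spread, the ranking should not be trusted.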

Hello Peter,

Friday, November 9, 2007, 8:47:21 PM, you wrote:
> -O2 -fno-liberate-case -fexcess-precision
> -O2 -fno-spec-constr -fexcess-precision

test also with -O2 -fno-spec-constr -fno-liberate-case -fexcess-precision

--
Best regards,
Bulat mailto:Bulat.Ziganshin@gmail.com

-O2 -fno-liberate-case -fno-spec-constr -fexcess-precision
real 24.500  user 19.172  system 0.359  cpu 19.531  page_faults 79337
real 26.406  user 18.938  system 0.375  cpu 19.313  page_faults 79477
real 28.891  user 19.016  system 0.391  cpu 19.406  page_faults 79357

Peter.

Bulat Ziganshin wrote:
> Hello Peter,
>
> Friday, November 9, 2007, 8:47:21 PM, you wrote:
> > -O2 -fno-liberate-case -fexcess-precision
> > -O2 -fno-spec-constr -fexcess-precision
>
> test also with -O2 -fno-spec-constr -fno-liberate-case -fexcess-precision

-----Original Message-----
From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-bounces@haskell.org] On Behalf Of Peter Hercek
Sent: Friday, November 09, 2007 9:19 AM
To: glasgow-haskell-users@haskell.org
Subject: Re: GHC 6.8.1 is impressive!

> New ghc sped up my small app (~2000 lines) by ~38%. Nice job!
> Anyway, my application is a bit slower when compiled with -O2
> compared to -01 only (both with ghc 6.6 and 6.8).
> Is that normal?

I assume you meant -O1, not -01, in "compared to -01." :)

It's certainly not necessarily abnormal. It is well known that one can create situations where optimizations have a negative impact. The interesting question is, what are the characteristics of this particular application that are atypical, and thus appear to cause atypical optimization behavior?

This type of situation is similar to the situation sometimes seen with multi-processor machines. In most cases, if you add processors, you get better performance. In some cases the reverse results. This is because there is some overhead to multiprocessing, and if a program is structured in such a way that parallelism is impossible, the multiprocessor overhead is all that is left, and performance for that particular application decreases.

To continue the analogy, in most cases one can make minor modifications that will allow the application to use available parallelism. It is likely that small modifications to your program would allow it to take advantage of the increased optimization provided by -O2.

So, as I said, it will be very interesting to discover the reason for the behavior you are experiencing. It might point to ways to make small code modifications to your application to increase performance. It might point to ways to improve the optimization.

That's all premature of course. We don't know enough; to cite one quick example, we don't know whether the optimization behavior is related to ghc behavior or to gcc behavior (although, of course, even if it is related to gcc behavior there could be interesting code generation issues).

It would be fun to discover the details.

Seth Kurtzberg
Software Engineer
Specializing in Security, Reliability, and the Hardware/Software Interface

> Peter.
>
> Lennart Augustsson wrote:
> > I'd like to second that. 6.8 is quite an improvement. Well done!
participants (8)
- Bjorn Bringert
- Bulat Ziganshin
- Lennart Augustsson
- Peter Hercek
- Ravi Nanavati
- Seth Kurtzberg
- Simon Peyton-Jones
- Wolfgang Jeltsch