gcc vs. clang builds of 7.8.3 on OS X

In building the OS X bindist for 7.8.3, I had to choose which of several ways to build it. In particular, I could build it with a newer Xcode, which uses clang, or an older Xcode, which uses gcc. I decided to benchmark the variations with nofib before releasing. Here is what I found...

I compared two candidate builds:

- x86_64 architecture
- targeted at OS X 10.7 and later
- one built with Xcode 5.1 on 10.9, which uses clang
- one built with Xcode 4.5 on 10.8, which uses gcc

I installed both bindists side-by-side on the same machine: a 10.9 machine, with Xcode 5.1, which uses clang. The machine is a MacMini, 2.5GHz Intel Core i5 (dual core, reports as 4 cpus).

Summary:

- clang build was always faster
- non-threaded was -3.2% run-time
- threaded was -7.3% run-time
- clang's improvement in GC run-time was better than -10%
- clang builds were significantly bigger

You can find the details here:

- analysis-Silver-10.9-gcc-vs-clang.html
  http://www.ozonehouse.com/mark/platform/analysis-Silver-10.9-gcc-vs-clang.ht...
- analysis-Silver-10.9-gcc-vs-clang-threaded.html
  http://www.ozonehouse.com/mark/platform/analysis-Silver-10.9-gcc-vs-clang-th...

The only concern is that the binary sizes were significantly bigger: +230%. I haven't investigated further, but I'm wondering whether nofib measures the binaries without stripping them, and whether clang's debugging info is much larger.

Next up... we are evaluating a bindist built with the HPC Mac OS X gcc compiler (based on gcc 4.9)... and preliminary results are looking even better! Stay tuned...

- Mark

I thought clang was slower than gcc because clang doesn't support thread
local variables (in some form we need) and therefore GC performance
suffered a lot on clang.
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Maybe it depends on the version of OS X being used? Maybe TLS works
differently pre 10.8 or 10.9?

I will try to measure on 10.7 later today.

Preliminary numbers for gcc 4.9 are even better than clang: it saves 12% over gcc 4.2 builds. However, the gcc runtime isn't the same as the Apple standard... and so far we are at a loss as to how to package a ghc based on 4.9 that would work for Mac users without gcc 4.9.

- Mark

Why wouldn't it work?
Here's my gcc 4.9 build. I believe it should work on any >= 10.7 system that
has the Xcode CLI tools installed.
Please let me know if it fails!
http://www.wellposed.com/opensource/ghc/releasebuild-unofficial/ghc-7.8.3-x8...

It won't work in our case because the gcc 4.9 build we have references its own C runtime library, which is 4.9-specific and is notably different from what is on a stock Mac.

Imagine if we were to ship a libHSrts.a compiled against the gcc 4.9 libc (and its includes). Now a user without gcc 4.9 on their system installs that bindist. When they compile code that references libc, it'll compile against the system libc (and its includes). If that code is paired with Haskell code (or *is* Haskell code via the FFI) and is then linked with libHSrts.a from the bindist, we have an executable with parts compiled against two different libcs.

This won't work unless the two libcs (and their includes) are ABI compatible, and I don't know whether that holds between gcc 4.9's libc and the libc Apple ships for its systems.

The clang executable size mystery deepens. The sizes are indeed waaaay big:

    7.4M  test-files-clang/test*
    4.5M  test-files-clang/test-stripped*
    1.4M  test-files-gcc/test*
    1.1M  test-files-gcc/test-stripped*

Looking at the load info from the stripped versions, it is all in the main text segment:

                              test-files-clang/load   test-files-gcc/load
    __TEXT.__text           :             3,554,134               833,502
    __TEXT.__stubs          :                   876                   672
    __TEXT.__stub_helper    :                 1,476                 1,136
    __TEXT.__const          :                59,040                32,104
    __TEXT.__cstring        :                24,156                24,900
    __TEXT.__dof_HaskellEv  :                 4,774                 4,774
    __TEXT.__eh_frame       :                22,976                46,664
    __DATA.__got            :                 1,264                   880
    __DATA.__nl_symbol_ptr  :                    16                    16
    __DATA.__la_symbol_ptr  :                 1,168                   896
    __DATA.__mod_init_func  :                     8                     8
    __DATA.__const          :               130,048                79,744
    __DATA.__data           :               231,848                22,904
    __DATA.__common         :                45,924                46,092
    __DATA.__bss            :                   856                   840
    TOTAL SIZE              :             4,078,564             1,095,132

But the compiled sizes are identical:

    1.9K  test-files-clang/Main.o
    1.9K  test-files-gcc/Main.o

And, after dumping the link command and looking up all the libs linked in (identical set in both cases), the clang libs are actually *smaller*:

    13,004,272  ...clang.../lib/ghc-7.8.3/base-4.7.0.1/libHSbase-4.7.0.1.a
       792,352  ...clang.../lib/ghc-7.8.3/ghc-prim-0.3.1.0/libHSghc-prim-0.3.1.0.a
     1,010,824  ...clang.../lib/ghc-7.8.3/integer-gmp-0.5.1.0/libHSinteger-gmp-0.5.1.0.a
        55,816  ...clang.../lib/ghc-7.8.3/rts-1.0/libCffi.a
       565,112  ...clang.../lib/ghc-7.8.3/rts-1.0/libHSrts.a

    24,378,416  ...gcc.../lib/ghc-7.8.3/base-4.7.0.1/libHSbase-4.7.0.1.a
     1,253,176  ...gcc.../lib/ghc-7.8.3/ghc-prim-0.3.1.0/libHSghc-prim-0.3.1.0.a
     1,014,280  ...gcc.../lib/ghc-7.8.3/integer-gmp-0.5.1.0/libHSinteger-gmp-0.5.1.0.a
        57,984  ...gcc.../lib/ghc-7.8.3/rts-1.0/libCffi.a
       556,432  ...gcc.../lib/ghc-7.8.3/rts-1.0/libHSrts.a

So now I'm totally mystified! What is in that 3M of extra text segment?!?!?

- Mark
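To see how lopsided the difference is, the per-section figures quoted above can be ranked by their clang-minus-gcc delta. The following is only a sketch: the byte counts are copied from the message, while the function name `section_deltas` and the name/clang/gcc column layout are made up for illustration.

```shell
#!/bin/sh
# Rank Mach-O sections by (clang size - gcc size), largest first.
# The three columns are: section name, clang bytes, gcc bytes, all
# copied from the stripped-binary load info quoted in the message.
# section_deltas is a hypothetical helper name, not part of any tool.
section_deltas() {
    awk '{ printf "%s %d\n", $1, $2 - $3 }' <<'EOF' | sort -k2,2nr
__TEXT.__text 3554134 833502
__TEXT.__stubs 876 672
__TEXT.__stub_helper 1476 1136
__TEXT.__const 59040 32104
__TEXT.__cstring 24156 24900
__TEXT.__dof_HaskellEv 4774 4774
__TEXT.__eh_frame 22976 46664
__DATA.__got 1264 880
__DATA.__nl_symbol_ptr 16 16
__DATA.__la_symbol_ptr 1168 896
__DATA.__mod_init_func 8 8
__DATA.__const 130048 79744
__DATA.__data 231848 22904
__DATA.__common 45924 46092
__DATA.__bss 856 840
EOF
}

# Show the three sections that grew the most in the clang build.
section_deltas | head -n 3
```

By this arithmetic, `__TEXT.__text` alone accounts for 2,720,632 of the 2,983,432 extra bytes (about 91%), with `__DATA.__data` a distant second, which matches the observation that the growth is almost entirely code in the text segment.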

Found the culprit!!!!!!!

    XCodeVersion=`xcodebuild -version | grep Xcode | sed "s/Xcode //"`

This line in configure doesn't work on a system that has just the Xcode command line tools installed! It also won't work on an OS X system that has some other tool chain (say, via brew) installed. On such systems, it sets XCodeVersion to "". The follow-on code then sets XCodeVersion1 and XCodeVersion2 to "0", and then this code runs, causing the problem:

    SplitObjsBroken=NO
    if test "$TargetOS_CPP" = "darwin"
    then
        # Split objects is broken (#4013) with XCode < 3.2
        if test "$XCodeVersion1" -lt 3
        then
            SplitObjsBroken=YES
        else
            if test "$XCodeVersion1" -eq 3
            then
                if test "$XCodeVersion2" -lt 2
                then
                    SplitObjsBroken=YES
                fi
            fi
        fi
    fi

Alas, it doesn't look like SplitObjsBroken has the logic to allow it to be overridden on the ./configure invocation (anyone know for sure? my autoconf is very rusty....)

Too late here for me to think of a fix....

- Mark
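The failure mode can be reproduced without xcodebuild by feeding the version string in directly. This is a sketch under assumptions, not the actual configure code: the way the version is split into major and minor parts and the fallback to 0 stand in for the follow-on code, which is not quoted in this thread, and both function names are hypothetical. The second function shows one possible guard, which treats an undetectable version as "unknown" instead of as version 0.0.

```shell
#!/bin/sh
# ASSUMPTION: the major/minor split and the fallback to 0 below imitate
# configure's follow-on code, which the thread describes but does not quote.
# split_objs_broken and split_objs_broken_guarded are hypothetical names.

split_objs_broken() {
    version="$1"                  # e.g. "5.1", or "" when detection fails
    major=${version%%.*}          # text before the first dot
    case "$version" in
        *.*) rest=${version#*.}; minor=${rest%%.*} ;;
        *)   minor= ;;
    esac
    test -n "$major" || major=0   # an empty version degrades to 0.0
    test -n "$minor" || minor=0

    broken=NO
    # Split objects is broken (#4013) with XCode < 3.2
    if test "$major" -lt 3
    then
        broken=YES
    elif test "$major" -eq 3 && test "$minor" -lt 2
    then
        broken=YES
    fi
    echo "$broken"
}

# Hypothetical guard: if no version could be detected, do not conclude
# that the toolchain is older than 3.2.
split_objs_broken_guarded() {
    if test -z "$1"
    then
        echo NO
    else
        split_objs_broken "$1"
    fi
}

for v in "" 3.1 3.2 5.1
do
    echo "version='$v' unguarded=$(split_objs_broken "$v") guarded=$(split_objs_broken_guarded "$v")"
done
```

Whether defaulting to NO is the right policy when the version is unknown is a judgment call; the point of the sketch is only that an empty string and a genuinely old Xcode are indistinguishable once both have been collapsed to 0.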

Would it be possible to simply stop supporting Xcode builds that old? #4013 is three years old and Xcode < 3.2 only applies to Mac OS X 10.5 and earlier.
participants (4)
- Bob Ippolito
- Carter Schonwald
- Johan Tibell
- Mark Lentczner