RE: jhc vs ghc and the surprising result involving ghc generated assembly.

On 01 November 2005 16:32, Florian Weimer wrote:
* Simon Marlow:
gcc started generating this rubbish around version 3.4, if I recall correctly. I've tried disabling various optimisations, but can't seem to convince gcc not to generate the extra jump. You don't get this from the native code generator, BTW.
But the comparison is present in the C code. What do you want GCC to do?
I didn't mean to sound overly critical of gcc. But here's what I was complaining about - the code generated by gcc (3.4.x) is as follows:

    Main_zdwfac_info:
    .text
            .align 8
    .text
            movq    (%rbp), %rdx
            cmpq    $1, %rdx
            jne     .L2
            movq    8(%rbp), %r13
            leaq    16(%rbp), %rbp
            movq    (%rbp), %rax
    .L4:
            jmp     *%rax
    .L2:
            movq    %rdx, %rax
            imulq   8(%rbp), %rax
            movq    %rax, 8(%rbp)
            leaq    -1(%rdx), %rax
            movq    %rax, (%rbp)
            movl    $Main_zdwfac_info, %eax
            jmp     .L4

there's an obvious simplification - the last two instructions should be replaced by just

            jmp     Main_zdwfac_info

eliminating one branch and a mov. This occurs all over the place in our code. Whenever a function has more than one computed goto, gcc insists on commoning up the jmp instructions even when it's a really bad idea, like above.

Actually if I add -O2, then I get better code, so perhaps this isn't a real problem. Although gcc still generates this:

    Fac_zdwfac_info:
    .text
            .align 8
            movq    (%rbp), %rdx
            testq   %rdx, %rdx
            jne     .L3
            movq    8(%rbp), %r13
            addq    $16, %rbp
            movq    (%rbp), %rax
            jmp     *%rax
            .p2align 4,,7
    .L3:
            movq    8(%rbp), %rax
            imulq   %rdx, %rax
            decq    %rdx
            movq    %rdx, (%rbp)
            movq    %rax, 8(%rbp)
            movl    $Fac_zdwfac_info, %eax
            jmp     *%rax

and fails to combine the movs with the jmp instruction (we do this simplification ourselves when post-processing the assembly code).

I'll compile up gcc 4 and see what happens with that.

Cheers,
Simon

* Simon Marlow:
gcc started generating this rubbish around version 3.4, if I recall correctly. I've tried disabling various optimisations, but can't seem to convince gcc not to generate the extra jump. You don't get this from the native code generator, BTW.
But the comparison is present in the C code. What do you want GCC to do?
I didn't mean to sound overly critical of gcc.
It didn't come across that way. I just want to construct a test case, so it can be fixed on the GCC side, and see if I can suggest alternatives.
Actually if I add -O2, then I get better code, so perhaps this isn't a real problem. Although gcc still generates this:
            movl    $Fac_zdwfac_info, %eax
            jmp     *%rax
and fails to combine the movs with the jmp instruction (we do this simplification ourselves when post-processing the assembly code).
I agree, GCC should optimize this case. A minimal test case is:

    extern void bar();

    void foo()
    {
      void *p = bar;
      goto *p;
    }

None of the GCC versions I have tried optimizes away the indirect call.
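For contrast with the test case above, here is a minimal sketch of a computed goto whose targets are labels inside the same function, written with the GNU C labels-as-values extension (`&&label` and `goto *p`). The function name `fac_goto` and the table `targets` are made up for illustration, the code assumes a compiler that accepts this extension (GCC or Clang), and `n` is assumed to be non-negative.

```c
#include <stdint.h>

/* Factorial written with computed gotos that stay inside one function.
   Hypothetical example: names are illustrative, n must be >= 0. */
int64_t fac_goto(int64_t n)
{
    /* Table of label addresses; && is the GNU C address-of-label operator. */
    void *targets[] = { &&done, &&step };
    int64_t acc = 1;
    void *p;

dispatch:
    /* Pick the next target: index 0 (done) when n == 0, else index 1 (step). */
    p = targets[n != 0];
    goto *p;

step:
    acc *= n;
    n -= 1;
    goto dispatch;

done:
    return acc;
}
```

Every value that `p` can take here is the address of a label in `fac_goto` itself, unlike `void *p = bar; goto *p;`, where the target is another function.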
I'll compile up gcc 4 and see what happens with that.
The jump target is not propagated, either. Same with 4.1.

However, beginning with GCC 3.4, you can use:

    extern void bar();

    void foo()
    {
      void (*p)(void) = bar;
      p();
    }

And the indirect call is turned into a direct jump. Tail recursive calls and really indirect tail calls are also optimized.

Together with -fomit-frame-pointer, this could give you what you need, without post-processing the generated assembler code (which is desirable because the asm volatile statements inhibit further optimization).

Is it correct that you use indirect gotos across functions? Such gotos aren't supported by GCC and work only by accident.
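As a rough illustration of the function-pointer style suggested above, the following sketch mirrors the shape of the factorial worker from the assembly listings: an explicit stack holding the argument, the accumulator and a return continuation, with every transfer of control written as a call in tail position. All names (`slot`, `sp`, `r1`, `fac_worker`, `stop`, `run_fac`) are hypothetical and are not taken from GHC's generated code. Whether the calls actually become jumps depends on the compiler version and flags; the code is still correct without that optimization, it just uses C stack space proportional to `n`.

```c
#include <stdint.h>

typedef void (*code_ptr)(void);

/* One stack slot holds either a machine word or a code pointer. */
typedef union { int64_t word; code_ptr code; } slot;

static slot stack_mem[8];
static slot *sp;      /* plays the role of %rbp in the listings */
static int64_t r1;    /* plays the role of %r13 (the result register) */

/* Stack layout on entry: sp[0] = n, sp[1] = accumulator, sp[2] = continuation. */
static void fac_worker(void)
{
    int64_t n = sp[0].word;
    if (n == 0) {
        r1 = sp[1].word;          /* return the accumulator */
        sp += 2;                  /* pop n and the accumulator */
        code_ptr k = sp[0].code;  /* continuation is only known at run time */
        k();                      /* genuinely indirect call in tail position */
        return;
    }
    sp[1].word *= n;
    sp[0].word = n - 1;
    code_ptr self = fac_worker;   /* target is known statically */
    self();                       /* call in tail position, candidate for a direct jump */
}

/* Continuation that ends the computation. */
static void stop(void) { }

/* Hypothetical driver: sets up the stack and runs the worker for n >= 0. */
int64_t run_fac(int64_t n)
{
    sp = stack_mem;
    sp[0].word = n;
    sp[1].word = 1;
    sp[2].code = stop;
    fac_worker();
    return r1;
}
```

For example, `run_fac(5)` leaves 120 in `r1` and returns it.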

On Wed, 2005-11-02 at 14:59 +0100, Florian Weimer wrote:
Is it correct that you use indirect gotos across functions? Such gotos aren't supported by GCC and work only by accident.
Even direct gotos aren't universally supported. Some info in Fergus Henderson's paper may be of interest:

http://felix.sourceforge.net/papers/mercury_to_c.ps

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

Is it correct that you use indirect gotos across functions? Such gotos aren't supported by GCC and work only by accident.
Even direct gotos aren't universally supported. Some info in Fergus Henderson's paper may be of interest
This paper seems to be from 1995 or so:

    %DVIPSSource: TeX output 1995.11.29:1656

(Why is it so uncommon to put the publication date on the first page?)

GCC's IL has changed significantly since then; it's not clear if it still applies.

On Wed, 2005-11-02 at 18:05 +0100, Florian Weimer wrote:
Is it correct that you use indirect gotos across functions? Such gotos aren't supported by GCC and work only by accident.
Even direct gotos aren't universally supported. Some info in Fergus Henderson's paper may be of interest
This paper seems to be from 1995 or so:
%DVIPSSource: TeX output 1995.11.29:1656
(Why is it so uncommon to put the publication date on the first page?)
GCC's IL has changed significantly since then; it's not clear if it still applies.
I am using some of it in Felix, and the part I am using seems to work fine on all platforms tested: various versions of g++ under Linux, OSX, Cygwin, and MinGW, possibly more.

The config script checks whether assembler labels are supported; if they are, the indirect jumps 'just work'. Of course the config would have to be built by hand for cross compilation ;(

However, my system obeys a constraint: the runtime conspires to ensure the function containing the target label is entered before the jump is done. The address is calculated by the caller, though. So I don't run into any problems loading the right data section pointer. I suspect Haskell cannot do that, since it would defeat the intended optimisation.

[More precisely, in Felix the technique is used to implement non-local gotos, which can only occur in procedures, not in functions]

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net