diff mbox

[i386] : Expand sibling-tail-calls via accumulator register

Message ID 356718653.5712706.1400794422397.JavaMail.zimbra@redhat.com
State New
Headers show

Commit Message

Kai Tietz May 22, 2014, 9:33 p.m. UTC
Hello,

This patch avoids for sibling-tail-calls the use of pseudo-register.  Instead it uses for load of memory-address the accumulator-register.  By this we avoid that IRA/LRA need to choose a register.  So we reduce register-pressure.

The accumulator-register is always a valid register on tail-call case.  All other registers might be callee-saved, or used for argument-passing.  The only case where we would use the accumulator on call is the variadic-case for x86_64 ABI.  Just that this function never is a candidate for sibling-tail-calls.

ChangeLog

2014-05-22  Kai Tietz  <ktietz@redhat.com>

	* config/i386/i386.c (ix86_expand_call): Enforce for sibcalls
	on memory the use of accumulator-register.

Regression tested for x86_64-unknown-linux-gnu, x86_64-w64-mingw32, and i686-pc-cygwin.
Ok for apply?

Regards,
Kai

Comments

H.J. Lu May 22, 2014, 10:07 p.m. UTC | #1
On Thu, May 22, 2014 at 2:33 PM, Kai Tietz <ktietz@redhat.com> wrote:
> Hello,
>
> This patch avoids for sibling-tail-calls the use of pseudo-register.  Instead it uses for load of memory-address the accumulator-register.  By this we avoid that IRA/LRA need to choose a register.  So we reduce register-pressure.
>
> The accumulator-register is always a valid register on tail-call case.  All other registers might be callee-saved, or used for argument-passing.  The only case where we would use the accumulator on call is the variadic-case for x86_64 ABI.  Just that this function never is a candidate for sibling-tail-calls.
>
> ChangeLog
>
> 2014-05-22  Kai Tietz  <ktietz@redhat.com>
>
>         * config/i386/i386.c (ix86_expand_call): Enforce for sibcalls
>         on memory the use of accumulator-register.
>
> Regression tested for x86_64-unknown-linux-gnu, x86_64-w64-mingw32, and i686-pc-cygwin.
> Ok for apply?
>
> Regards,
> Kai
>
> Index: i386.c
> ===================================================================
> --- i386.c      (Revision 210412)
> +++ i386.c      (Arbeitskopie)
> @@ -24898,8 +24898,19 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call
>            ? !sibcall_insn_operand (XEXP (fnaddr, 0), word_mode)
>            : !call_insn_operand (XEXP (fnaddr, 0), word_mode))
>      {
> +      rtx r;
>        fnaddr = convert_to_mode (word_mode, XEXP (fnaddr, 0), 1);
> -      fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (word_mode, fnaddr));
> +      if (!sibcall)
> +       r = copy_to_mode_reg (word_mode, fnaddr);
> +      else
> +       {
> +         r = gen_rtx_REG (word_mode, AX_REG)

If fnaddr points to a function with variable argument list in
64-bit, AX_REG may be used to store number of FP arguments
passed in registers.

> +         if (! general_operand (fnaddr, VOIDmode))
> +                   fnaddr = force_operand (fnaddr, r);
> +         if (fnaddr != r)
> +           emit_move_insn (r, fnaddr);
> +       }
> +      fnaddr = gen_rtx_MEM (QImode, r);
>      }
>
>    call = gen_rtx_CALL (VOIDmode, fnaddr, callarg1);
Jeff Law May 23, 2014, 6:04 p.m. UTC | #2
On 05/22/14 16:07, H.J. Lu wrote:
> On Thu, May 22, 2014 at 2:33 PM, Kai Tietz <ktietz@redhat.com> wrote:
>> Hello,
>>
>> This patch avoids for sibling-tail-calls the use of pseudo-register.  Instead it uses for load of memory-address the accumulator-register.  By this we avoid that IRA/LRA need to choose a register.  So we reduce register-pressure.
>>
>> The accumulator-register is always a valid register on tail-call case.  All other registers might be callee-saved, or used for argument-passing.  The only case where we would use the accumulator on call is the variadic-case for x86_64 ABI.  Just that this function never is a candidate for sibling-tail-calls.
>>
>> ChangeLog
>>
>> 2014-05-22  Kai Tietz  <ktietz@redhat.com>
>>
>>          * config/i386/i386.c (ix86_expand_call): Enforce for sibcalls
>>          on memory the use of accumulator-register.
>>
>> Regression tested for x86_64-unknown-linux-gnu, x86_64-w64-mingw32, and i686-pc-cygwin.
>> Ok for apply?
>>
>> Regards,
>> Kai
>>
>> Index: i386.c
>> ===================================================================
>> --- i386.c      (Revision 210412)
>> +++ i386.c      (Arbeitskopie)
>> @@ -24898,8 +24898,19 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call
>>             ? !sibcall_insn_operand (XEXP (fnaddr, 0), word_mode)
>>             : !call_insn_operand (XEXP (fnaddr, 0), word_mode))
>>       {
>> +      rtx r;
>>         fnaddr = convert_to_mode (word_mode, XEXP (fnaddr, 0), 1);
>> -      fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (word_mode, fnaddr));
>> +      if (!sibcall)
>> +       r = copy_to_mode_reg (word_mode, fnaddr);
>> +      else
>> +       {
>> +         r = gen_rtx_REG (word_mode, AX_REG)
>
> If fnaddr points to a function with variable argument list in
> 64-bit, AX_REG may be used to store number of FP arguments
> passed in registers.
Right, but as Kai mentioned earlier, a varardic function should have 
been rejected by now as a sibcall target.

Regardless, I think adding a check here shouldn't hurt and makes the 
backend more bullet-proof if the target independent bits get smarter in 
the future.

jeff
Jeff Law May 23, 2014, 6:16 p.m. UTC | #3
On 05/22/14 15:33, Kai Tietz wrote:
> Hello,
>
> This patch avoids for sibling-tail-calls the use of pseudo-register.
> Instead it uses for load of memory-address the accumulator-register.
> By this we avoid that IRA/LRA need to choose a register.  So we
> reduce register-pressure.
>
> The accumulator-register is always a valid register on tail-call
> case.  All other registers might be callee-saved, or used for
> argument-passing.  The only case where we would use the accumulator
> on call is the variadic-case for x86_64 ABI.  Just that this function
> never is a candidate for sibling-tail-calls.
>
> ChangeLog
>
> 2014-05-22  Kai Tietz  <ktietz@redhat.com>
>
> * config/i386/i386.c (ix86_expand_call): Enforce for sibcalls on
> memory the use of accumulator-register.
I'm generally not a fan of explicitly mentioning hard registers in RTL. 
  Though most of the major problems related to doing that have been 
resolved through the years.


>
> Regression tested for x86_64-unknown-linux-gnu, x86_64-w64-mingw32,
> and i686-pc-cygwin. Ok for apply?
In the interest of defensive programming, can you verify that fnaddr 
doesn't refer to a varardic function?  Hmm, I guess we can't get to a 
signature here.  So, never mind.

So I think the way to go is to ensure that the x86 port always rejects 
sibcalls to a varardic target, which I think can be done in 
ix86-function_ok_for_sibcall.

With that change this patch should be OK.  But please post it one more 
time for a final review.

jeff
Richard Henderson May 27, 2014, 3:49 p.m. UTC | #4
On 05/22/2014 02:33 PM, Kai Tietz wrote:
> 	* config/i386/i386.c (ix86_expand_call): Enforce for sibcalls
> 	on memory the use of accumulator-register.

I don't like this at all.

I'm fine with allowing memories that are fully symbolic, e.g.

extern void (*foo)(void);
void f(void) { foo(); }

but any time you've got to use one insn to form the address in %eax, you might
as well have also issued the memory load into %eax.  And if the memory load is
moved earlier, you no longer need to constrain to %eax, but let the register
allocator choose the call-clobbered register to use.


r~
diff mbox

Patch

Index: i386.c
===================================================================
--- i386.c	(Revision 210412)
+++ i386.c	(Arbeitskopie)
@@ -24898,8 +24898,19 @@  ix86_expand_call (rtx retval, rtx fnaddr, rtx call
 	   ? !sibcall_insn_operand (XEXP (fnaddr, 0), word_mode)
 	   : !call_insn_operand (XEXP (fnaddr, 0), word_mode))
     {
+      rtx r;
       fnaddr = convert_to_mode (word_mode, XEXP (fnaddr, 0), 1);
-      fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (word_mode, fnaddr));
+      if (!sibcall)
+	r = copy_to_mode_reg (word_mode, fnaddr);
+      else
+	{
+	  r = gen_rtx_REG (word_mode, AX_REG);
+	  if (! general_operand (fnaddr, VOIDmode))
+		    fnaddr = force_operand (fnaddr, r);
+	  if (fnaddr != r)
+	    emit_move_insn (r, fnaddr);
+	}
+      fnaddr = gen_rtx_MEM (QImode, r);
     }
 
   call = gen_rtx_CALL (VOIDmode, fnaddr, callarg1);