diff mbox

[AArch64] Improve TARGET_LEGITIMIZE_ADDRESS_P hook

Message ID 53DB504B.3030303@arm.com
State New
Headers show

Commit Message

Jiong Wang Aug. 1, 2014, 8:31 a.m. UTC
currently, aarch64 LEGITIMIZE_ADDRESS_P hook will reject all "reg + offset" address given
"offset" is beyond supported range.

while this may be too strict. we should honor the "strict_p" parameter in the hook. before
reload, we accept all offset if it's a frame access, because the offset may change during
later register elimination.

the early reject of "reg + offset" may cause extra registers created, and if that register
live range is across function invoking then callee saved reg needed, thus introduce extra
reg save/restore also.

give a simple example as:

int
test15 (void)
{
   unsigned char a[480];
   initialize_array (a, 480);

   if (a[0] == 0x10)
     return 1;

   return 0;
}

.S before the patch
(-O2 -fPIC)
===
test15:
         sub     sp, sp, #480
         mov     w1, 480
         stp     x29, x30, [sp, -32]!
         add     x29, sp, 0
         str     x19, [sp, 16]
         add     x19, x29, 32
         mov     x0, x19
         bl      initialize_array
         ldrb    w0, [x19]
         ldr     x19, [sp, 16]
         ldp     x29, x30, [sp], 32
         cmp     w0, 16
         cset    w0, eq
         add     sp, sp, 480
         ret

.S after the patch
===
test15:
         stp     x29, x30, [sp, -496]!
         mov     w1, 480
         add     x29, sp, 0
         add     x0, x29, 16
         bl      initialize_array
         ldrb    w0, [x29, 16]
         ldp     x29, x30, [sp], 496
         cmp     w0, 16
         cset    w0, eq
         ret

test done
=========
no regression on aarch64-none-elf bare metal.
bootstrap OK on aarch64.

OK for trunk?

thanks.

gcc/
   * config/aarch64/aarch64.c (aarch64_classify_address): Accept all offset for frame access
   when strict_p is false.

gcc/testsuite
   * gcc.target/aarch64/legitimize_stack_var_before_reload_1.c: New testcase.

Comments

Venkataramanan Kumar Aug. 1, 2014, 10:59 a.m. UTC | #1
Hi Jiong,

(Snip)
+  && (op0 == virtual_stack_vars_rtx
+      || op0 == frame_pointer_rtx
+      || op0 == arg_pointer_rtx)
(Snip)

The above check is means that these are the ways to access the frame.
is it possible to have stack_pointer_rtx has op0?


On 1 August 2014 14:01, Jiong Wang <jiong.wang@arm.com> wrote:
> currently, aarch64 LEGITIMIZE_ADDRESS_P hook will reject all "reg + offset"
> address given
> "offset" is beyond supported range.
>
> while this may be too strict. we should honor the "strict_p" parameter in
> the hook. before
> reload, we accept all offset if it's a frame access, because the offset may
> change during
> later register elimination.
>
> the early reject of "reg + offset" may cause extra registers created, and if
> that register
> live range is across function invoking then callee saved reg needed, thus
> introduce extra
> reg save/restore also.
>
> give a simple example as:
>
> int
> test15 (void)
> {
>   unsigned char a[480];
>   initialize_array (a, 480);
>
>   if (a[0] == 0x10)
>     return 1;
>
>   return 0;
> }
>
> .S before the patch
> (-O2 -fPIC)
> ===
> test15:
>         sub     sp, sp, #480
>         mov     w1, 480
>         stp     x29, x30, [sp, -32]!
>         add     x29, sp, 0
>         str     x19, [sp, 16]
>         add     x19, x29, 32
>         mov     x0, x19
>         bl      initialize_array
>         ldrb    w0, [x19]
>         ldr     x19, [sp, 16]
>         ldp     x29, x30, [sp], 32
>         cmp     w0, 16
>         cset    w0, eq
>         add     sp, sp, 480
>         ret
>
> .S after the patch
> ===
> test15:
>         stp     x29, x30, [sp, -496]!
>         mov     w1, 480
>         add     x29, sp, 0
>         add     x0, x29, 16
>         bl      initialize_array
>         ldrb    w0, [x29, 16]
>         ldp     x29, x30, [sp], 496
>         cmp     w0, 16
>         cset    w0, eq
>         ret
>
> test done
> =========
> no regression on aarch64-none-elf bare metal.
> bootstrap OK on aarch64.
>
> OK for trunk?
>
> thanks.
>
> gcc/
>   * config/aarch64/aarch64.c (aarch64_classify_address): Accept all offset
> for frame access
>   when strict_p is false.
>
> gcc/testsuite
>   * gcc.target/aarch64/legitimize_stack_var_before_reload_1.c: New testcase.
Jiong Wang Aug. 1, 2014, 12:05 p.m. UTC | #2
Hi Venkataramanan,

thanks for review.

On 01/08/14 11:59, Venkataramanan Kumar wrote:
> Hi Jiong,
>
> (Snip)
> +  && (op0 == virtual_stack_vars_rtx
> +      || op0 == frame_pointer_rtx
> +      || op0 == arg_pointer_rtx)
> (Snip)
>
> The above check is means that these are the ways to access the frame.
> is it possible to have stack_pointer_rtx has op0?

I think there is no necessary to add stack_pointer_rtx check here.

because normally all local variables will be addressed using virtual_stack_vars_rtx,
and then rebased to frame_pointer_rtx in instantiate_new_reg, and finally eliminated
to fp/sp during reload depending on whether frame pointer is needed. so

+  && (op0 == virtual_stack_vars_rtx
+      || op0 == frame_pointer_rtx

could actually catch almost all we want.

-- Jiong

>
>
> On 1 August 2014 14:01, Jiong Wang <jiong.wang@arm.com> wrote:
>> currently, aarch64 LEGITIMIZE_ADDRESS_P hook will reject all "reg + offset"
>> address given
>> "offset" is beyond supported range.
>>
>> while this may be too strict. we should honor the "strict_p" parameter in
>> the hook. before
>> reload, we accept all offset if it's a frame access, because the offset may
>> change during
>> later register elimination.
>>
>> the early reject of "reg + offset" may cause extra registers created, and if
>> that register
>> live range is across function invoking then callee saved reg needed, thus
>> introduce extra
>> reg save/restore also.
>>
>> give a simple example as:
>>
>> int
>> test15 (void)
>> {
>>    unsigned char a[480];
>>    initialize_array (a, 480);
>>
>>    if (a[0] == 0x10)
>>      return 1;
>>
>>    return 0;
>> }
>>
>> .S before the patch
>> (-O2 -fPIC)
>> ===
>> test15:
>>          sub     sp, sp, #480
>>          mov     w1, 480
>>          stp     x29, x30, [sp, -32]!
>>          add     x29, sp, 0
>>          str     x19, [sp, 16]
>>          add     x19, x29, 32
>>          mov     x0, x19
>>          bl      initialize_array
>>          ldrb    w0, [x19]
>>          ldr     x19, [sp, 16]
>>          ldp     x29, x30, [sp], 32
>>          cmp     w0, 16
>>          cset    w0, eq
>>          add     sp, sp, 480
>>          ret
>>
>> .S after the patch
>> ===
>> test15:
>>          stp     x29, x30, [sp, -496]!
>>          mov     w1, 480
>>          add     x29, sp, 0
>>          add     x0, x29, 16
>>          bl      initialize_array
>>          ldrb    w0, [x29, 16]
>>          ldp     x29, x30, [sp], 496
>>          cmp     w0, 16
>>          cset    w0, eq
>>          ret
>>
>> test done
>> =========
>> no regression on aarch64-none-elf bare metal.
>> bootstrap OK on aarch64.
>>
>> OK for trunk?
>>
>> thanks.
>>
>> gcc/
>>    * config/aarch64/aarch64.c (aarch64_classify_address): Accept all offset
>> for frame access
>>    when strict_p is false.
>>
>> gcc/testsuite
>>    * gcc.target/aarch64/legitimize_stack_var_before_reload_1.c: New testcase.
Marcus Shawcroft Aug. 1, 2014, 12:20 p.m. UTC | #3
On 1 August 2014 09:31, Jiong Wang <jiong.wang@arm.com> wrote:

> gcc/
>   * config/aarch64/aarch64.c (aarch64_classify_address): Accept all offset
> for frame access
>   when strict_p is false.


This part is OK.

> gcc/testsuite
>   * gcc.target/aarch64/legitimize_stack_var_before_reload_1.c: New testcase.

This test case relies on:
+/* { dg-final { scan-rtl-dump "mem/j/c:QI \\(plus:DI" "expand" } } */
... which is not very specific.  Can we tighten this up to look for
something directly related to the relevant frame access?

/Marcus
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ed80269..c8e5808 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3248,6 +3248,21 @@  aarch64_classify_address (struct aarch64_address_info *info,
     case PLUS:
       op0 = XEXP (x, 0);
       op1 = XEXP (x, 1);
+
+      if (! strict_p
+	  && GET_CODE (op0) == REG
+	  && (op0 == virtual_stack_vars_rtx
+	      || op0 == frame_pointer_rtx
+	      || op0 == arg_pointer_rtx)
+	  && GET_CODE (op1) == CONST_INT)
+	{
+	  info->type = ADDRESS_REG_IMM;
+	  info->base = op0;
+	  info->offset = op1;
+
+	  return true;
+	}
+
       if (GET_MODE_SIZE (mode) != 0
 	  && CONST_INT_P (op1)
 	  && aarch64_base_register_rtx_p (op0, strict_p))
diff --git a/gcc/testsuite/gcc.target/aarch64/legitimize_stack_var_before_reload_1.c b/gcc/testsuite/gcc.target/aarch64/legitimize_stack_var_before_reload_1.c
new file mode 100644
index 0000000..e5f0bd1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/legitimize_stack_var_before_reload_1.c
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+
+extern void initialize_array (unsigned char *, int);
+
+int
+test15 (void)
+{
+  unsigned char a[480];
+
+  initialize_array (a, 480);
+
+  if (a[0] == 0x10)
+    return 1;
+
+  return 0;
+}
+
+/* { dg-final { scan-rtl-dump "mem/j/c:QI \\(plus:DI" "expand" } } */
+
+/* { dg-final { cleanup-rtl-dump "expand" } } */