diff mbox

[AArch64] Improve legitimize_address

Message ID AM5PR0802MB2610FED9736F1EBB2A34EAF783F90@AM5PR0802MB2610.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Wilco Dijkstra Sept. 6, 2016, 1:14 p.m. UTC
Improve aarch64_legitimize_address - avoid splitting the offset if it is
supported.  When we do split, take the mode size into account.  BLKmode
falls into the unaligned case but should be treated like LDP/STP.
This improves codesize slightly due to fewer base address calculations:

int f(int *p) { return p[5000] + p[7000]; }

Now generates:

f:
	add	x0, x0, 16384
	ldr	w1, [x0, 3616]
	ldr	w0, [x0, 11616]
	add	w0, w1, w0
	ret

instead of:

f:
	add	x1, x0, 16384
	add	x0, x0, 24576
	ldr	w1, [x1, 3616]
	ldr	w0, [x0, 3424]
	add	w0, w1, w0
	ret

OK for trunk?

ChangeLog:
2016-09-06  Wilco Dijkstra  <wdijkstr@arm.com>

    gcc/
	* config/aarch64/aarch64.c (aarch64_legitimize_address):
	Avoid use of base_offset if offset already in range.
--

Comments

Richard Earnshaw (lists) Sept. 7, 2016, 12:43 p.m. UTC | #1
On 06/09/16 14:14, Wilco Dijkstra wrote:
> Improve aarch64_legitimize_address - avoid splitting the offset if it is
> supported.  When we do split, take the mode size into account.  BLKmode
> falls into the unaligned case but should be treated like LDP/STP.
> This improves codesize slightly due to fewer base address calculations:
> 
> int f(int *p) { return p[5000] + p[7000]; }
> 
> Now generates:
> 
> f:
> 	add	x0, x0, 16384
> 	ldr	w1, [x0, 3616]
> 	ldr	w0, [x0, 11616]
> 	add	w0, w1, w0
> 	ret
> 
> instead of:
> 
> f:
> 	add	x1, x0, 16384
> 	add	x0, x0, 24576
> 	ldr	w1, [x1, 3616]
> 	ldr	w0, [x0, 3424]
> 	add	w0, w1, w0
> 	ret
> 
> OK for trunk?
> 
> ChangeLog:
> 2016-09-06  Wilco Dijkstra  <wdijkstr@arm.com>
> 
>     gcc/
> 	* config/aarch64/aarch64.c (aarch64_legitimize_address):
> 	Avoid use of base_offset if offset already in range.

OK.

R.

> --
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 27bbdbad8cddc576f9ed4fd0670116bd6d318412..119ff0aecb0c9f88899fa141b2c7f9158281f9c3 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -5058,9 +5058,19 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
>        /* For offsets aren't a multiple of the access size, the limit is
>  	 -256...255.  */
>        else if (offset & (GET_MODE_SIZE (mode) - 1))
> -	base_offset = (offset + 0x100) & ~0x1ff;
> +	{
> +	  base_offset = (offset + 0x100) & ~0x1ff;
> +
> +	  /* BLKmode typically uses LDP of X-registers.  */
> +	  if (mode == BLKmode)
> +	    base_offset = (offset + 512) & ~0x3ff;
> +	}
> +      /* Small negative offsets are supported.  */
> +      else if (IN_RANGE (offset, -256, 0))
> +	base_offset = 0;
> +      /* Use 12-bit offset by access size.  */
>        else
> -	base_offset = offset & ~0xfff;
> +	base_offset = offset & (~0xfff * GET_MODE_SIZE (mode));
>  
>        if (base_offset != 0)
>  	{
>
Christophe Lyon Sept. 7, 2016, 7:50 p.m. UTC | #2
Hi Wilco,

On 7 September 2016 at 14:43, Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
> On 06/09/16 14:14, Wilco Dijkstra wrote:
>> Improve aarch64_legitimize_address - avoid splitting the offset if it is
>> supported.  When we do split, take the mode size into account.  BLKmode
>> falls into the unaligned case but should be treated like LDP/STP.
>> This improves codesize slightly due to fewer base address calculations:
>>
>> int f(int *p) { return p[5000] + p[7000]; }
>>
>> Now generates:
>>
>> f:
>>       add     x0, x0, 16384
>>       ldr     w1, [x0, 3616]
>>       ldr     w0, [x0, 11616]
>>       add     w0, w1, w0
>>       ret
>>
>> instead of:
>>
>> f:
>>       add     x1, x0, 16384
>>       add     x0, x0, 24576
>>       ldr     w1, [x1, 3616]
>>       ldr     w0, [x0, 3424]
>>       add     w0, w1, w0
>>       ret
>>
>> OK for trunk?
>>
>> ChangeLog:
>> 2016-09-06  Wilco Dijkstra  <wdijkstr@arm.com>
>>
>>     gcc/
>>       * config/aarch64/aarch64.c (aarch64_legitimize_address):
>>       Avoid use of base_offset if offset already in range.
>
> OK.
>
> R.

After this patch, I've noticed a regression:
FAIL: gcc.target/aarch64/ldp_vec_64_1.c scan-assembler ldp\td[0-9]+, d[0-9]
You probably need to adjust the scan pattern.

Thanks,

Christophe


>
>> --
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 27bbdbad8cddc576f9ed4fd0670116bd6d318412..119ff0aecb0c9f88899fa141b2c7f9158281f9c3 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -5058,9 +5058,19 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
>>        /* For offsets aren't a multiple of the access size, the limit is
>>        -256...255.  */
>>        else if (offset & (GET_MODE_SIZE (mode) - 1))
>> -     base_offset = (offset + 0x100) & ~0x1ff;
>> +     {
>> +       base_offset = (offset + 0x100) & ~0x1ff;
>> +
>> +       /* BLKmode typically uses LDP of X-registers.  */
>> +       if (mode == BLKmode)
>> +         base_offset = (offset + 512) & ~0x3ff;
>> +     }
>> +      /* Small negative offsets are supported.  */
>> +      else if (IN_RANGE (offset, -256, 0))
>> +     base_offset = 0;
>> +      /* Use 12-bit offset by access size.  */
>>        else
>> -     base_offset = offset & ~0xfff;
>> +     base_offset = offset & (~0xfff * GET_MODE_SIZE (mode));
>>
>>        if (base_offset != 0)
>>       {
>>
>
Wilco Dijkstra Sept. 8, 2016, 11:52 a.m. UTC | #3
Christophe Lyon wrote:
> After this patch, I've noticed a regression:
> FAIL: gcc.target/aarch64/ldp_vec_64_1.c scan-assembler ldp\td[0-9]+, d[0-9]
> You probably need to adjust the scan pattern.

The original code was better and what we should generate. IVOpt chooses to use
indexing eventhough it is more expensive, and that is a latent bug.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65068 which shows the same
cost calculation issue.

Wilco
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 27bbdbad8cddc576f9ed4fd0670116bd6d318412..119ff0aecb0c9f88899fa141b2c7f9158281f9c3 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5058,9 +5058,19 @@  aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
       /* For offsets aren't a multiple of the access size, the limit is
 	 -256...255.  */
       else if (offset & (GET_MODE_SIZE (mode) - 1))
-	base_offset = (offset + 0x100) & ~0x1ff;
+	{
+	  base_offset = (offset + 0x100) & ~0x1ff;
+
+	  /* BLKmode typically uses LDP of X-registers.  */
+	  if (mode == BLKmode)
+	    base_offset = (offset + 512) & ~0x3ff;
+	}
+      /* Small negative offsets are supported.  */
+      else if (IN_RANGE (offset, -256, 0))
+	base_offset = 0;
+      /* Use 12-bit offset by access size.  */
       else
-	base_offset = offset & ~0xfff;
+	base_offset = offset & (~0xfff * GET_MODE_SIZE (mode));
 
       if (base_offset != 0)
 	{