[AArch64] Mark symbols as constant

Message ID AM5PR0802MB26104A35233ABF33CCD45DF783C50@AM5PR0802MB2610.eurprd08.prod.outlook.com
State New

Commit Message

Wilco Dijkstra June 20, 2017, 2:56 p.m. UTC
Richard Earnshaw wrote:

> What testing has this had with -fpic?  I'm not convinced that this
> assertion is true in that case?

I ran the GLIBC tests, which pass. -fpic works since it also forms a
constant address, i.e. instead of:

adrp  x1, global
add x1, x1, :lo12:global

we do:

adrp  x1, :got:global
ldr x1, [x1, :got_lo12:global]

CSEing or rematerializing either sequence works in the same way.

With TLS the resulting addresses are also constant; however, this could
cause rather complex TLS sequences to be rematerialized.  It seems
best to block that.  Updated patch below:


aarch64_legitimate_constant_p currently returns false for symbols,
even though they are always valid constants.  This means LOSYM isn't
CSEd correctly.  If we return true, CSE works better, resulting in
smaller/faster code (0.3% smaller code on SPEC2006).  Avoid this
for TLS symbols since their sequence is complex.

int x0 = 1, x1 = 2, x2 = 3;

int 
f (int x, int y)
{
  x += x1;
  if (x > 100)
    y += x2;
  x += x0;
  return x + y;
}

Before:
	adrp	x3, .LANCHOR0
	add	x4, x3, :lo12:.LANCHOR0
	ldr	w2, [x3, #:lo12:.LANCHOR0]
	add	w0, w0, w2
	cmp	w0, 100
	ble	.L5
	ldr	w2, [x4, 8]
	add	w1, w1, w2
.L5:
	add	x3, x3, :lo12:.LANCHOR0
	ldr	w2, [x3, 4]
	add	w0, w0, w2
	add	w0, w0, w1
	ret

After:
	adrp	x2, .LANCHOR0
	add	x3, x2, :lo12:.LANCHOR0
	ldr	w2, [x2, #:lo12:.LANCHOR0]
	add	w0, w0, w2
	cmp	w0, 100
	ble	.L5
	ldr	w2, [x3, 8]
	add	w1, w1, w2
.L5:
	ldr	w2, [x3, 4]
	add	w0, w0, w2
	add	w0, w0, w1
	ret

Bootstrap OK, OK for commit?

ChangeLog:
2017-06-20  Wilco Dijkstra  <wdijkstr@arm.com>

	* config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
	Return true for non-tls symbols.
--

Comments

Richard Earnshaw (lists) June 21, 2017, 10:01 a.m. UTC | #1
On 20/06/17 15:56, Wilco Dijkstra wrote:
> [...]
> 
> ChangeLog:
> 2017-06-20  Wilco Dijkstra  <wdijkstr@arm.com>
> 
> 	* config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
> 	Return true for non-tls symbols.

OK.

R.

Andreas Schwab June 22, 2017, 3:22 p.m. UTC | #2
On Jun 20 2017, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:

> 	* config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
> 	Return true for non-tls symbols.

This breaks gcc.target/aarch64/reload-valid-spoff.c with -mabi=ilp32:

/opt/gcc/gcc-20170622/gcc/testsuite/gcc.target/aarch64/reload-valid-spoff.c: In function 'arp_file':
/opt/gcc/gcc-20170622/gcc/testsuite/gcc.target/aarch64/reload-valid-spoff.c:63:1: internal compiler error: in plus_constant, at explow.c:88
0x801ebf plus_constant(machine_mode, rtx_def*, long, bool)
	../../gcc/explow.c:88
0x10b66df recog_126
	../../gcc/config/aarch64/aarch64.md:1188
0x10b66df recog_128
	../../gcc/config/aarch64/aarch64.md:2685
0x10b66df recog_129
	../../gcc/config/aarch64/aarch64-simd.md:4348
0x10e851f recog_for_combine_1
	../../gcc/combine.c:11148
0x10e8ba7 recog_for_combine
	../../gcc/combine.c:11404
0x10f8ca7 try_combine
	../../gcc/combine.c:3520
0x10fbcbb combine_instructions
	../../gcc/combine.c:1275
0x10fbcbb rest_of_handle_combine
	../../gcc/combine.c:14653
0x10fbcbb execute
	../../gcc/combine.c:14698

Andreas.
Wilco Dijkstra June 23, 2017, 10:56 a.m. UTC | #3
Andreas Schwab wrote:
>
> This breaks gcc.target/aarch64/reload-valid-spoff.c with -mabi=ilp32:

Indeed, there is an odd ILP32 bug that causes a high/lo_sum to be
generated in SImode during expand:

(insn 15 14 16 4 (set (reg:SI 125)
        (high:SI (symbol_ref/u:DI ("*.LC1") [flags 0x2]))) 
     (nil))
(insn 16 15 17 4 (set (reg:SI 124)
        (lo_sum:SI (reg:SI 125)
            (symbol_ref/u:DI ("*.LC1") [flags 0x2])))

Eventually this may end up as a 64-bit adrp, a 32-bit lo_sum and a load that
expects a 64-bit address...

I have a simple workaround to disable the symbol optimization in ILP32,
but I'm looking into fixing the underlying cause.
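
A minimal sketch of such a workaround (illustrative only; the final
condition may well differ) would gate the new check on !TARGET_ILP32:

```c
  /* Treat symbols as constants.  Avoid TLS symbols as they are complex,
     so spilling them is better than rematerialization.  Sketch: skip
     this for ILP32 until the SImode high/lo_sum expansion is fixed.  */
  if (!TARGET_ILP32 && SYMBOL_REF_P (x) && !SYMBOL_REF_TLS_MODEL (x))
    return true;
```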

Wilco

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5ec6bbfcf484baa4005b8a88cb98d0d04f710877..060cd8476d2954119daac495ecb059c9be73edbe 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -10111,6 +10111,11 @@ aarch64_legitimate_constant_p (machine_mode mode, rtx x)
       && aarch64_valid_symref (XEXP (x, 0), GET_MODE (XEXP (x, 0))))
     return true;
 
+  /* Treat symbols as constants.  Avoid TLS symbols as they are complex,
+     so spilling them is better than rematerialization.  */
+  if (SYMBOL_REF_P (x) && !SYMBOL_REF_TLS_MODEL (x))
+    return true;
+
   return aarch64_constant_address_p (x);
 }