Message ID: AM5PR0802MB26104A35233ABF33CCD45DF783C50@AM5PR0802MB2610.eurprd08.prod.outlook.com
State: New
On 20/06/17 15:56, Wilco Dijkstra wrote:
> Richard Earnshaw wrote:
>
>> What testing has this had with -fpic?  I'm not convinced that this
>> assertion is true in that case?
>
> I ran the GLIBC tests, which pass.  -fpic works since it also forms a
> constant address, i.e. instead of:
>
>         adrp    x1, global
>         add     x1, x1, :lo12:global
>
> we do:
>
>         adrp    x1, :got:global
>         ldr     x1, [x1, :got_lo12:global]
>
> CSEing or rematerializing either sequence works in the same way.
>
> With TLS the resulting addresses are also constant, however this could
> cause rather complex TLS sequences to be rematerialized.  It seems
> best to block that.  Updated patch below:
>
>
> aarch64_legitimate_constant_p currently returns false for symbols,
> even though they are always valid constants.  This means LOSYM isn't
> CSEd correctly.  If we return true, CSE works better, resulting in
> smaller/faster code (0.3% smaller code on SPEC2006).  Avoid this
> for TLS symbols since their sequence is complex.
>
> int x0 = 1, x1 = 2, x2 = 3;
>
> int
> f (int x, int y)
> {
>   x += x1;
>   if (x > 100)
>     y += x2;
>   x += x0;
>   return x + y;
> }
>
> Before:
>         adrp    x3, .LANCHOR0
>         add     x4, x3, :lo12:.LANCHOR0
>         ldr     w2, [x3, #:lo12:.LANCHOR0]
>         add     w0, w0, w2
>         cmp     w0, 100
>         ble     .L5
>         ldr     w2, [x4, 8]
>         add     w1, w1, w2
> .L5:
>         add     x3, x3, :lo12:.LANCHOR0
>         ldr     w2, [x3, 4]
>         add     w0, w0, w2
>         add     w0, w0, w1
>         ret
>
> After:
>         adrp    x2, .LANCHOR0
>         add     x3, x2, :lo12:.LANCHOR0
>         ldr     w2, [x2, #:lo12:.LANCHOR0]
>         add     w0, w0, w2
>         cmp     w0, 100
>         ble     .L5
>         ldr     w2, [x3, 8]
>         add     w1, w1, w2
> .L5:
>         ldr     w2, [x3, 4]
>         add     w0, w0, w2
>         add     w0, w0, w1
>         ret
>
> Bootstrap OK, OK for commit?
>
> ChangeLog:
> 2017-06-20  Wilco Dijkstra  <wdijkstr@arm.com>
>
>         * config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
>         Return true for non-tls symbols.
> --

OK.

R.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 5ec6bbfcf484baa4005b8a88cb98d0d04f710877..060cd8476d2954119daac495ecb059c9be73edbe 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -10111,6 +10111,11 @@ aarch64_legitimate_constant_p (machine_mode mode, rtx x)
>        && aarch64_valid_symref (XEXP (x, 0), GET_MODE (XEXP (x, 0))))
>      return true;
>
> +  /* Treat symbols as constants.  Avoid TLS symbols as they are complex,
> +     so spilling them is better than rematerialization.  */
> +  if (SYMBOL_REF_P (x) && !SYMBOL_REF_TLS_MODEL (x))
> +    return true;
> +
>    return aarch64_constant_address_p (x);
>  }
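For context, aarch64_legitimate_constant_p is the aarch64 implementation of the
TARGET_LEGITIMATE_CONSTANT_P target hook, which RTL passes such as CSE consult
when deciding whether a constant (here, a symbol) may be kept in the
instruction stream and rematerialized rather than spilled.  A minimal sketch of
how the backend registers the hook in aarch64.c; the surrounding lines are
assumed for illustration and are not part of the patch above:

    /* Register the backend's legitimate-constant predicate as the
       target hook; RTL passes query it to decide whether an rtx such
       as a symbol may remain a constant operand.  */
    #undef TARGET_LEGITIMATE_CONSTANT_P
    #define TARGET_LEGITIMATE_CONSTANT_P aarch64_legitimate_constant_p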
On Jun 20 2017, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:

>         * config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
>         Return true for non-tls symbols.

This breaks gcc.target/aarch64/reload-valid-spoff.c with -mabi=ilp32:

/opt/gcc/gcc-20170622/gcc/testsuite/gcc.target/aarch64/reload-valid-spoff.c: In function 'arp_file':
/opt/gcc/gcc-20170622/gcc/testsuite/gcc.target/aarch64/reload-valid-spoff.c:63:1: internal compiler error: in plus_constant, at explow.c:88
0x801ebf plus_constant(machine_mode, rtx_def*, long, bool)
        ../../gcc/explow.c:88
0x10b66df recog_126
        ../../gcc/config/aarch64/aarch64.md:1188
0x10b66df recog_128
        ../../gcc/config/aarch64/aarch64.md:2685
0x10b66df recog_129
        ../../gcc/config/aarch64/aarch64-simd.md:4348
0x10e851f recog_for_combine_1
        ../../gcc/combine.c:11148
0x10e8ba7 recog_for_combine
        ../../gcc/combine.c:11404
0x10f8ca7 try_combine
        ../../gcc/combine.c:3520
0x10fbcbb combine_instructions
        ../../gcc/combine.c:1275
0x10fbcbb rest_of_handle_combine
        ../../gcc/combine.c:14653
0x10fbcbb execute
        ../../gcc/combine.c:14698

Andreas.
Andreas Schwab wrote:
>
> This breaks gcc.target/aarch64/reload-valid-spoff.c with -mabi=ilp32:

Indeed, there is an odd ILP32 bug that causes high/lo_sum to be generated in
SImode during expand:

(insn 15 14 16 4 (set (reg:SI 125)
        (high:SI (symbol_ref/u:DI ("*.LC1") [flags 0x2]))) (nil))
(insn 16 15 17 4 (set (reg:SI 124)
        (lo_sum:SI (reg:SI 125)
            (symbol_ref/u:DI ("*.LC1") [flags 0x2])))

Eventually this may end up as a 64-bit adrp, a 32-bit lo_sum and a load that
expects a 64-bit address...

I have a simple workaround that disables the symbol optimization for ILP32,
but I'm looking into fixing the underlying cause.

Wilco
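As a minimal sketch of the kind of workaround described above (illustrative
only, not the actual follow-up patch), the new check in
aarch64_legitimate_constant_p could simply be skipped under ILP32, using the
existing TARGET_ILP32 flag for -mabi=ilp32:

    /* Hypothetical workaround sketch: keep the symbol rematerialization
       optimization disabled under ILP32 until the SImode high/lo_sum
       expansion issue is fixed.  TARGET_ILP32 is the existing aarch64
       flag for -mabi=ilp32.  */
    if (!TARGET_ILP32
        && SYMBOL_REF_P (x)
        && !SYMBOL_REF_TLS_MODEL (x))
      return true;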
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5ec6bbfcf484baa4005b8a88cb98d0d04f710877..060cd8476d2954119daac495ecb059c9be73edbe 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -10111,6 +10111,11 @@ aarch64_legitimate_constant_p (machine_mode mode, rtx x)
       && aarch64_valid_symref (XEXP (x, 0), GET_MODE (XEXP (x, 0))))
     return true;
 
+  /* Treat symbols as constants.  Avoid TLS symbols as they are complex,
+     so spilling them is better than rematerialization.  */
+  if (SYMBOL_REF_P (x) && !SYMBOL_REF_TLS_MODEL (x))
+    return true;
+
   return aarch64_constant_address_p (x);
 }