Message ID | 87inlfsr6e.fsf@linaro.org |
---|---|
State | New |
Headers | show |
On 05/05/2017 01:23 AM, Richard Sandiford wrote: > If we decide to allocate a call-clobbered register R to a value that > is live across a call, LRA will create a new spill register TMPR, > insert: > > TMPR <- R > > before the call and > > R <- TMPR > > after it. But if we then failed to allocate a register to TMPR, we would > always spill it to the stack, even if R was known to be equivalent to > a constant or to some existing memory location. And on AArch64, we'd > always fail to allocate such a register for 128-bit Advanced SIMD modes, > since no registers of those modes are call-preserved. > > This patch avoids the problem by copying the equivalence information > from the original pseudo to the spill register. It means that the > code for the testcase is as good with -O2 as it is with -O, > whereas previously the -O code was better. > > Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? > > Thanks, > Richard > > > [Based on commit branches/ARM/sve-branch@247248] > > 2017-05-05 Richard Sandiford <richard.sandiford@linaro.org> > > gcc/ > * lra-constraints.c (lra_copy_reg_equiv): New function. > (split_reg): Use it to copy equivalence information from the > original register to the spill register. > > gcc/testsuite/ > * gcc.target/aarch64/spill_1.c: New test. OK. jeff
On 05/05/2017 12:23 AM, Richard Sandiford wrote: > 2017-05-05 Richard Sandiford <richard.sandiford@linaro.org> > > gcc/ > * lra-constraints.c (lra_copy_reg_equiv): New function. > (split_reg): Use it to copy equivalence information from the > original register to the spill register. This patch breaks aarch64 bootstrap. I get a link error for lto1 and f951 godump.o: In function `go_define(unsigned int, char const*)': godump.c:(.text+0x36c): relocation truncated to fit: R_AARCH64_ADR_PREL_LO21 against `.rodata' godump.c:(.text+0x4b4): relocation truncated to fit: R_AARCH64_ADR_PREL_LO21 against `.rodata' The godump.c.271r.ira file looks OK, I see (insn 237 223 225 10 (set (reg/f:DI 489) (high:DI (label_ref 240))) "../../gcc-svn/gcc/godump.c":174 49 {*movdi_aarch64} (expr_list:REG_EQUIV (high:DI (label_ref 240)) (insn_list:REG_LABEL_OPERAND 240 (nil)))) ... (insn 238 115 1157 10 (set (reg/f:DI 490) (lo_sum:DI (reg/f:DI 489) (label_ref 240))) "../../gcc-svn/gcc/godump.c":174 929 {add_losym_di} (expr_list:REG_DEAD (reg/f:DI 489) (expr_list:REG_EQUIV (label_ref 240) (insn_list:REG_LABEL_OPERAND 240 (nil))))) But in the godump.c.272r.reload file I see in a different basic block (insn 1244 76 1161 22 (set (reg/f:DI 7 x7 [490]) (label_ref 240)) "../../gcc-svn/gcc/godump.c":221 49 {*movdi_aarch64} (nil)) which is not OK. This label ref is the address of a jumptable in the rodata section, and can't be loaded with a single instruction. It looks like there needs to be some extra work when rematerializing, to handle equiv values that can't just be copied to a register. I haven't had a chance to step through this in a debugger to see what is going on yet. Jim
On Sun, May 7, 2017 at 10:26 PM, Andrew Pinski <pinskia@gmail.com> wrote: > On Sun, May 7, 2017 at 9:30 PM, Jim Wilson <jim.wilson@linaro.org> wrote: >> On 05/05/2017 12:23 AM, Richard Sandiford wrote: >>> >>> 2017-05-05 Richard Sandiford <richard.sandiford@linaro.org> >>> >>> gcc/ >>> * lra-constraints.c (lra_copy_reg_equiv): New function. >>> (split_reg): Use it to copy equivalence information from the >>> original register to the spill register. >> >> >> This patch breaks aarch64 bootstrap. I get a link error for lto1 and f951 >> >> godump.o: In function `go_define(unsigned int, char const*)': >> godump.c:(.text+0x36c): relocation truncated to fit: >> R_AARCH64_ADR_PREL_LO21 against `.rodata' >> godump.c:(.text+0x4b4): relocation truncated to fit: R_AARCH64_ADR_PREL_LO21 >> against `.rodata' >> >> The godump.c.271r.ira file looks OK, I see >> >> (insn 237 223 225 10 (set (reg/f:DI 489) >> (high:DI (label_ref 240))) "../../gcc-svn/gcc/godump.c":174 49 >> {*movdi_aarch64} >> (expr_list:REG_EQUIV (high:DI (label_ref 240)) >> (insn_list:REG_LABEL_OPERAND 240 (nil)))) >> ... >> (insn 238 115 1157 10 (set (reg/f:DI 490) >> (lo_sum:DI (reg/f:DI 489) >> (label_ref 240))) "../../gcc-svn/gcc/godump.c":174 929 >> {add_losym_di} >> (expr_list:REG_DEAD (reg/f:DI 489) >> (expr_list:REG_EQUIV (label_ref 240) >> (insn_list:REG_LABEL_OPERAND 240 (nil))))) >> >> But in the godump.c.272r.reload file I see in a different basic block >> >> (insn 1244 76 1161 22 (set (reg/f:DI 7 x7 [490]) >> (label_ref 240)) "../../gcc-svn/gcc/godump.c":221 49 >> {*movdi_aarch64} >> (nil)) >> >> which is not OK. This label ref is the address of a jumptable in the rodata >> section, and can't be loaded with a single instruction. It looks like there >> needs to be some extra work when rematerializing, to handle equiv values >> that can't just be copied to a register. > > Hmm, shouldn't have the mov rejected as being invalid? I think that > is one issue with the aarch64 backend there. > > See https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01800.html > > > I can't remember if the following patch was ever submitted or committed. > > Here are my notes about this patch from the internal bug report we got > here at Cavium (back in 2013): > > Switch tables are implemented using the tiny model but they are stored > in the rodata section which means they could overflow the address. Some more notes: (In reply to comment #15) > (In reply to comment #14) > > //(insn 793 35 795 (set (reg/f:DI 2 x2 [450]) > > // (label_ref 456)) 32 {*movdi_aarch64} > > // (insn_list:REG_LABEL_OPERAND 456 (expr_list:REG_EQUAL (label_ref 456) > > // (nil)))) > > adr x2, .L806 // 793 *movdi_aarch64/9 [length = 4] > > Which is generated by the register allocator and we never split it into the > adrp/add again. The register allocator is doing an ok job as the backend said this is a valid pattern. We need a constraint for *movdi_aarch64/*movsi_aarch64 which says this is not a valid pattern and to go through the standard gen_movinsn path. Thanks, Andrew Pinski > > Note this patch most likely won't apply directly either: > > From: Andrew Pinski <apinski@cavium.com> > Date: Thu, 15 Aug 2013 20:42:12 +0000 (-0700) > Subject: 2013-08-11 Andrew Pinski <apinski@cavium.com> > X-Git-Tag: tc47_SDK_3_1_0_build_28~9 > X-Git-Url: http://cagit1.caveonetworks.com/git/?p=toolchain%2Fgcc.git;a=commitdiff_plain;h=1714f167b575956801264db5cd2300203a58e3e9 > > 2013-08-11 Andrew Pinski <apinski@cavium.com> > > Bug #7079 > * config/aarch64/aarch64.md (*movsi_aarch64): Use Ust instead of S constrain. > (*movdi_aarch64): Likewise. > (ldr_got_small_sidi): Add type attribute. > * config/aarch64/constraints.md (Ust): New constraint like S but only > if the symbol is SYMBOL_TINY_ABSOLUTE. > --- > > diff --git a/gcc/ChangeLog.aarch64 b/gcc/ChangeLog.aarch64 > index 3c8ff13..f4e1e91 100644 > --- a/gcc/ChangeLog.aarch64 > +++ b/gcc/ChangeLog.aarch64 > @@ -1,3 +1,12 @@ > +2013-08-11 Andrew Pinski <apinski@cavium.com> > + > + Bug #7079 > + * config/aarch64/aarch64.md (*movsi_aarch64): Use Ust instead of S constrain. > + (*movdi_aarch64): Likewise. > + (ldr_got_small_sidi): Add type attribute. > + * config/aarch64/constraints.md (Ust): New constraint like S but only > + if the symbol is SYMBOL_TINY_ABSOLUTE. > + > 2013-08-14 Yufeng Zhang <yufeng.zhang@arm.com> > > * function.c (assign_parm_find_data_types): Set passed_mode and > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 485bd59..0cd6a41 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -810,7 +810,7 @@ > > (define_insn "*movsi_aarch64" > [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,*w,m, > m,r,r ,*w, r,*w") > - (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,m, > m,rZ,*w,S,Ush,rZ,*w,*w"))] > + (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,m, > m,rZ,*w,Ust,Ush,rZ,*w,*w"))] > "(register_operand (operands[0], SImode) > || aarch64_reg_or_zero (operands[1], SImode))" > "@ > @@ -836,7 +836,7 @@ > > (define_insn "*movdi_aarch64" > [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,*w,m, > m,r,r, *w, r,*w,w") > - (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,m, > m,rZ,*w,S,Ush,rZ,*w,*w,Dd"))] > + (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,m, > m,rZ,*w,Ust,Ush,rZ,*w,*w,Dd"))] > "(register_operand (operands[0], DImode) > || aarch64_reg_or_zero (operands[1], DImode))" > "@ > @@ -3978,6 +3978,7 @@ > "TARGET_ILP32" > "ldr\\t%w0, [%1, #:got_lo12:%a2]" > [(set_attr "v8type" "load1") > + (set_attr "type" "load1") > (set_attr "mode" "DI")] > ) > > diff --git a/gcc/config/aarch64/constraints.md > b/gcc/config/aarch64/constraints.md > index 2570400..a36bdaf 100644 > --- a/gcc/config/aarch64/constraints.md > +++ b/gcc/config/aarch64/constraints.md > @@ -114,6 +114,12 @@ > (and (match_code "const_int") > (match_test "(unsigned) exact_log2 (ival) <= 4"))) > > +(define_constraint "Ust" > + "A constraint that matches an absolute symbolic address; only for > tiny model." > + (and (match_code "const,symbol_ref,label_ref") > + (match_test "aarch64_symbolic_address_p (op) > + && aarch64_classify_symbolic_expression (op, SYMBOL_CONTEXT_ADR) > == SYMBOL_TINY_ABSOLUTE"))) > + > (define_memory_constraint "Q" > "A memory address which uses a single base register with no offset." > (and (match_code "mem") > > >> >> I haven't had a chance to step through this in a debugger to see what is >> going on yet. >> >> Jim >>
Index: gcc/lra-constraints.c =================================================================== --- gcc/lra-constraints.c 2017-04-18 19:52:35.062175087 +0100 +++ gcc/lra-constraints.c 2017-05-05 08:19:18.243479648 +0100 @@ -5394,6 +5394,29 @@ choose_split_class (enum reg_class alloc #endif } +/* Copy any equivalence information from ORIGINAL_REGNO to NEW_REGNO. + It only makes sense to call this function if NEW_REGNO is always + equal to ORIGINAL_REGNO. */ + +static void +lra_copy_reg_equiv (unsigned int new_regno, unsigned int original_regno) +{ + if (!ira_reg_equiv[original_regno].defined_p) + return; + + ira_expand_reg_equiv (); + ira_reg_equiv[new_regno].defined_p = true; + if (ira_reg_equiv[original_regno].memory) + ira_reg_equiv[new_regno].memory + = copy_rtx (ira_reg_equiv[original_regno].memory); + if (ira_reg_equiv[original_regno].constant) + ira_reg_equiv[new_regno].constant + = copy_rtx (ira_reg_equiv[original_regno].constant); + if (ira_reg_equiv[original_regno].invariant) + ira_reg_equiv[new_regno].invariant + = copy_rtx (ira_reg_equiv[original_regno].invariant); +} + /* Do split transformations for insn INSN, which defines or uses ORIGINAL_REGNO. NEXT_USAGE_INSNS specifies which instruction in the EBB next uses ORIGINAL_REGNO; it has the same form as the @@ -5515,6 +5538,7 @@ split_reg (bool before_p, int original_r new_reg = lra_create_new_reg (mode, original_reg, rclass, "split"); reg_renumber[REGNO (new_reg)] = hard_regno; } + int new_regno = REGNO (new_reg); save = emit_spill_move (true, new_reg, original_reg); if (NEXT_INSN (save) != NULL_RTX && !call_save_p) { @@ -5523,7 +5547,7 @@ split_reg (bool before_p, int original_r fprintf (lra_dump_file, " Rejecting split %d->%d resulting in > 2 save insns:\n", - original_regno, REGNO (new_reg)); + original_regno, new_regno); dump_rtl_slim (lra_dump_file, save, NULL, -1, 0); fprintf (lra_dump_file, " ))))))))))))))))))))))))))))))))))))))))))))))))\n"); @@ -5538,18 +5562,24 @@ split_reg (bool before_p, int original_r fprintf (lra_dump_file, " Rejecting split %d->%d " "resulting in > 2 restore insns:\n", - original_regno, REGNO (new_reg)); + original_regno, new_regno); dump_rtl_slim (lra_dump_file, restore, NULL, -1, 0); fprintf (lra_dump_file, " ))))))))))))))))))))))))))))))))))))))))))))))))\n"); } return false; } + /* Transfer equivalence information to the spill register, so that + if we fail to allocate the spill register, we have the option of + rematerializing the original value instead of spilling to the stack. */ + if (!HARD_REGISTER_NUM_P (original_regno) + && mode == PSEUDO_REGNO_MODE (original_regno)) + lra_copy_reg_equiv (new_regno, original_regno); after_p = usage_insns[original_regno].after_p; - lra_reg_info[REGNO (new_reg)].restore_rtx = regno_reg_rtx[original_regno]; - bitmap_set_bit (&check_only_regs, REGNO (new_reg)); + lra_reg_info[new_regno].restore_rtx = regno_reg_rtx[original_regno]; + bitmap_set_bit (&check_only_regs, new_regno); bitmap_set_bit (&check_only_regs, original_regno); - bitmap_set_bit (&lra_split_regs, REGNO (new_reg)); + bitmap_set_bit (&lra_split_regs, new_regno); for (;;) { if (GET_CODE (next_usage_insns) != INSN_LIST) @@ -5565,7 +5595,7 @@ split_reg (bool before_p, int original_r if (lra_dump_file != NULL) { fprintf (lra_dump_file, " Split reuse change %d->%d:\n", - original_regno, REGNO (new_reg)); + original_regno, new_regno); dump_insn_slim (lra_dump_file, as_a <rtx_insn *> (usage_insn)); } } Index: gcc/testsuite/gcc.target/aarch64/spill_1.c =================================================================== --- /dev/null 2017-05-05 07:30:08.141451281 +0100 +++ gcc/testsuite/gcc.target/aarch64/spill_1.c 2017-05-05 08:19:18.244456716 +0100 @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +typedef int v4si __attribute__ ((vector_size (16))); + +void bar (void); +void +foo (void) +{ + v4si x = { 1, 1, 1, 1 }; + asm ("# %0" :: "w" (x)); + bar (); + asm ("# %0" :: "w" (x)); +} + +/* { dg-final { scan-assembler-times {\tmovi\tv[0-9]+\.4s,} 2 } } */ +/* { dg-final { scan-assembler-not {\tldr\t} } } */ +/* { dg-final { scan-assembler-not {\tstr\t} } } */