Message ID: 20140410161004.GG1817@tucnak.redhat.com
State: New
On 04/10/2014 09:10 AM, Jakub Jelinek wrote:
> 2014-04-10  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR rtl-optimization/60663
> 	* cse.c (cse_insn): Set src_volatile on ASM_OPERANDS in
> 	PARALLEL.
>
> 	* gcc.target/arm/pr60663.c: New test.

Ok if it passes.  But you're right that the ARM backend needs that
rtx_costs change too.


r~
On 11 April 2014 00:10, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Apr 01, 2014 at 11:41:12AM +0800, Zhenqiang Chen wrote:
>> Ping?
>>
>> Bootstrap and no make check regression on X86-64.
>>
>> Bootstrap on ARM. In the ARM regression test, there are some new PASSes
>> and FAILs of the debug info checks for gcc.dg/guality/pr36728-1.c and
>> gcc.dg/guality/pr36728-2.c, since the register allocation result is
>> different with the patch. There is no real new FAIL due to the patch.
>
>> > --- a/gcc/cse.c
>> > +++ b/gcc/cse.c
>> > @@ -4280,6 +4280,19 @@ find_sets_in_insn (rtx insn, struct set **psets)
>> > 	    ;
>> > 	  else if (GET_CODE (SET_SRC (y)) == CALL)
>> > 	    ;
>> > +	  else if (GET_CODE (SET_SRC (y)) == ASM_OPERANDS)
>> > +	    {
>> > +	      if (i + 1 < lim)
>> > +		{
>> > +		  rtx n = XVECEXP (x, 0, i + 1);
>> > +		  /* For inline asm with multiple outputs, we cannot
>> > +		     handle the SETs separately.  See PR60663.  */
>> > +		  if (GET_CODE (n) == SET
>> > +		      && GET_CODE (SET_SRC (n)) == ASM_OPERANDS)
>> > +		    break;
>> > +		}
>> > +	      sets[n_sets++].rtl = y;
>> > +	    }
>> > 	  else
>> > 	    sets[n_sets++].rtl = y;
>> >      }
>
> This doesn't look like a correct fix.  First of all, it will not handle
> many of the cases where we should not break the inline asm apart,
> e.g. if you have:
>
> int
> foo (void)
> {
>   unsigned i;
>   asm ("%0" : "=r" (i) : : "r5");
>   return i;
> }
>
> then it will still happily drop the clobber on the floor.
> But also, e.g. volatile asm or asm goto, which is an even stronger reason
> not to fiddle with it too much, is handled not by skipping the sets in
> find_sets_in_insn, but rather just by setting the src_volatile flag.
> So, I'd say a better fix than this is something like the following patch
> (untested yet, but it fixes the testcase).

Thanks. The patch fixes the issue. I tested it on X86-64 and ARM. There
is no regression.

> Or, fix up the insane arm costs for ASM_OPERANDS:
>
>     case ASM_OPERANDS:
>       /* Just a guess.  Cost one insn per input.  */
>       *cost = COSTS_N_INSNS (ASM_OPERANDS_INPUT_LENGTH (x));
>       return true;
>
> I don't think this heuristic is even close to reality most of the
> time; more importantly, for no inputs, just outputs and clobbers, it means
> *cost = 0, and that is why ARM is supposedly the only target now where CSE
> thinks it is worthwhile to break all inline asms without inputs apart.

I will raise a patch to discuss it for the ARM backend.

Thanks!
-Zhenqiang

> 2014-04-10  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR rtl-optimization/60663
> 	* cse.c (cse_insn): Set src_volatile on ASM_OPERANDS in
> 	PARALLEL.
>
> 	* gcc.target/arm/pr60663.c: New test.
>
> --- gcc/cse.c.jj	2014-03-12 10:13:41.000000000 +0100
> +++ gcc/cse.c	2014-04-10 17:21:27.517330918 +0200
> @@ -4642,6 +4642,13 @@ cse_insn (rtx insn)
>  	  && REGNO (dest) >= FIRST_PSEUDO_REGISTER)
>  	sets[i].src_volatile = 1;
>  
> +      /* Also do not record result of a non-volatile inline asm with
> +	 more than one result or with clobbers, we do not want CSE to
> +	 break the inline asm apart.  */
> +      else if (GET_CODE (src) == ASM_OPERANDS
> +	       && GET_CODE (x) == PARALLEL)
> +	sets[i].src_volatile = 1;
> +
>  #if 0
>        /* It is no longer clear why we used to do this, but it doesn't
>  	 appear to still be needed.  So let's try without it since this
> --- gcc/testsuite/gcc.target/arm/pr60663.c.jj	2014-04-10 17:30:04.392608591 +0200
> +++ gcc/testsuite/gcc.target/arm/pr60663.c	2014-04-10 17:29:25.000000000 +0200
> @@ -0,0 +1,11 @@
> +/* PR rtl-optimization/60663 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=armv7-a" } */
> +
> +int
> +foo (void)
> +{
> +  unsigned i, j;
> +  asm ("%0 %1" : "=r" (i), "=r" (j));
> +  return i;
> +}
>
> 	Jakub
On Fri, Apr 11, 2014 at 05:19:59PM +0800, Zhenqiang Chen wrote:
> > Or, fix up the insane arm costs for ASM_OPERANDS:
> >
> >     case ASM_OPERANDS:
> >       /* Just a guess.  Cost one insn per input.  */
> >       *cost = COSTS_N_INSNS (ASM_OPERANDS_INPUT_LENGTH (x));
> >       return true;
> >
> > I don't think this heuristic is even close to reality most of the
> > time; more importantly, for no inputs, just outputs and clobbers, it means
> > *cost = 0, and that is why ARM is supposedly the only target now where CSE
> > thinks it is worthwhile to break all inline asms without inputs apart.
>
> I will raise a patch to discuss it for the ARM backend.

AFAIK Kyrill is already working on such a patch, but it was agreed to put
it into stage1 only and eventually backport it for 4.9.1.

	Jakub
--- gcc/cse.c.jj	2014-03-12 10:13:41.000000000 +0100
+++ gcc/cse.c	2014-04-10 17:21:27.517330918 +0200
@@ -4642,6 +4642,13 @@ cse_insn (rtx insn)
 	  && REGNO (dest) >= FIRST_PSEUDO_REGISTER)
 	sets[i].src_volatile = 1;
 
+      /* Also do not record result of a non-volatile inline asm with
+	 more than one result or with clobbers, we do not want CSE to
+	 break the inline asm apart.  */
+      else if (GET_CODE (src) == ASM_OPERANDS
+	       && GET_CODE (x) == PARALLEL)
+	sets[i].src_volatile = 1;
+
 #if 0
       /* It is no longer clear why we used to do this, but it doesn't
 	 appear to still be needed.  So let's try without it since this
--- gcc/testsuite/gcc.target/arm/pr60663.c.jj	2014-04-10 17:30:04.392608591 +0200
+++ gcc/testsuite/gcc.target/arm/pr60663.c	2014-04-10 17:29:25.000000000 +0200
@@ -0,0 +1,11 @@
+/* PR rtl-optimization/60663 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (void)
+{
+  unsigned i, j;
+  asm ("%0 %1" : "=r" (i), "=r" (j));
+  return i;
+}