Message ID | 20170427073250.GW1809@tucnak |
---|---|
State | New |
Headers | show |
On 04/27/2017 01:32 AM, Jakub Jelinek wrote: > Hi! > > As mentioned in the PR and can be seen on the testcase (too large for > testsuite, with lots of delta reduction I got 48KB *.f90 file still using > a dozen of modules), we miscompile it because we have mem(sp+64) memory > (what %st is loaded from) and are checking whether it is safe to move > earlier in the insn stream, and modified_between_p tells us it is, except > there is a stack pop instruction (i.e. sp autoinc). > And sp autoinc is apparently special in GCC: > /* There are no REG_INC notes for SP. */ Right. It's been the source of numerous problems through the years. One could argue that we should just bite the bullet and add them. The cost can't be that high and it'd avoid these kinds of problems in the future and allow for some code cleanups as well. We could probably scan the IL at the end of auto-inc-dec.c to add the missing notes. I thought I saw a comment once which indicates the rationale behind not including REG_INC notes for pushes/pops, but I can't find it anymore. > > The following patch handles that, plus then undoes that in ix86_agi_dependent > where from what I understood we want the previous behavior - push, pop and > call modifications of SP don't cause AGI stalls for addresses that have > SP base (SP can't appear as index). > > Not really sure about the == stack_pointer_rtx vs. > REG_P () && REGNO () == STACK_POINTER_REGNUM, there is lots of code that > just uses pointer comparisons and others that check REGNO, as an example > of the former e.g. push/pop_operand. So, is SP always shared, or can there > be other REGs with SP regno? SP is supposed to be shared, you should be able to compare against stack_pointer_rtx. > > Other than the ix86_agi_dependent which in my stats was the single case > that hit this difference, I've seen it making a difference e.g. in ifcvt > decisions, but at least the cases I've debugged didn't end up in any code > generation changes. E.g. both x86_64 and i686 libstdc++.so.6 and > libgo.so.11 as the two largest shared libraries built during bootstrap > are identical without/with this patch (objdump -dr is identical that is). > While without the config/i386/i386.c changes there were tons of differences. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2017-04-27 Jakub Jelinek <jakub@redhat.com> > > PR target/79430 > * rtlanal.c (reg_set_p): If reg is a stack_pointer_rtx, also > check for stack push/pop autoinc. > * config/i386/i386.c (ix86_agi_dependent): Return false > if the only reason why modified_in_p returned true is that > addr is SP based and set_insn is a push or pop. THe rtlanal.c changes are fine by me. Uros should chime in on the x86 specific bits. Jeff
On Thu, Apr 27, 2017 at 6:13 PM, Jeff Law <law@redhat.com> wrote: > On 04/27/2017 01:32 AM, Jakub Jelinek wrote: >> >> Hi! >> >> As mentioned in the PR and can be seen on the testcase (too large for >> testsuite, with lots of delta reduction I got 48KB *.f90 file still using >> a dozen of modules), we miscompile it because we have mem(sp+64) memory >> (what %st is loaded from) and are checking whether it is safe to move >> earlier in the insn stream, and modified_between_p tells us it is, except >> there is a stack pop instruction (i.e. sp autoinc). >> And sp autoinc is apparently special in GCC: >> /* There are no REG_INC notes for SP. */ > > Right. It's been the source of numerous problems through the years. One > could argue that we should just bite the bullet and add them. The cost > can't be that high and it'd avoid these kinds of problems in the future and > allow for some code cleanups as well. > > We could probably scan the IL at the end of auto-inc-dec.c to add the > missing notes. > > I thought I saw a comment once which indicates the rationale behind not > including REG_INC notes for pushes/pops, but I can't find it anymore. > >> >> The following patch handles that, plus then undoes that in >> ix86_agi_dependent >> where from what I understood we want the previous behavior - push, pop and >> call modifications of SP don't cause AGI stalls for addresses that have >> SP base (SP can't appear as index). >> >> Not really sure about the == stack_pointer_rtx vs. >> REG_P () && REGNO () == STACK_POINTER_REGNUM, there is lots of code that >> just uses pointer comparisons and others that check REGNO, as an example >> of the former e.g. push/pop_operand. So, is SP always shared, or can >> there >> be other REGs with SP regno? > > SP is supposed to be shared, you should be able to compare against > stack_pointer_rtx. > > >> >> Other than the ix86_agi_dependent which in my stats was the single case >> that hit this difference, I've seen it making a difference e.g. in ifcvt >> decisions, but at least the cases I've debugged didn't end up in any code >> generation changes. E.g. both x86_64 and i686 libstdc++.so.6 and >> libgo.so.11 as the two largest shared libraries built during bootstrap >> are identical without/with this patch (objdump -dr is identical that is). >> While without the config/i386/i386.c changes there were tons of >> differences. >> >> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? >> >> 2017-04-27 Jakub Jelinek <jakub@redhat.com> >> >> PR target/79430 >> * rtlanal.c (reg_set_p): If reg is a stack_pointer_rtx, also >> check for stack push/pop autoinc. >> * config/i386/i386.c (ix86_agi_dependent): Return false >> if the only reason why modified_in_p returned true is that >> addr is SP based and set_insn is a push or pop. > > THe rtlanal.c changes are fine by me. Uros should chime in on the x86 > specific bits. LGTM, with comparison to stack_pointer_rtx, as mentioned by Jeff above. Thanks, Uros.
--- gcc/rtlanal.c.jj 2017-04-26 12:11:04.019878187 +0200 +++ gcc/rtlanal.c 2017-04-26 17:48:14.131705330 +0200 @@ -1221,6 +1221,24 @@ reg_set_p (const_rtx reg, const_rtx insn || find_reg_fusage (insn, CLOBBER, reg))))) return true; + /* There are no REG_INC notes for SP autoinc. */ + if (reg == stack_pointer_rtx && INSN_P (insn)) + { + subrtx_var_iterator::array_type array; + FOR_EACH_SUBRTX_VAR (iter, array, PATTERN (insn), NONCONST) + { + rtx mem = *iter; + if (mem + && MEM_P (mem) + && GET_RTX_CLASS (GET_CODE (XEXP (mem, 0))) == RTX_AUTOINC) + { + if (XEXP (XEXP (mem, 0), 0) == stack_pointer_rtx) + return true; + iter.skip_subrtxes (); + } + } + } + return set_of (reg, insn) != NULL_RTX; } --- gcc/config/i386/i386.c.jj 2017-04-26 17:48:01.108877052 +0200 +++ gcc/config/i386/i386.c 2017-04-26 17:50:44.890717389 +0200 @@ -29243,7 +29243,27 @@ ix86_agi_dependent (rtx_insn *set_insn, if (MEM_P (recog_data.operand[i])) { rtx addr = XEXP (recog_data.operand[i], 0); - return modified_in_p (addr, set_insn) != 0; + if (modified_in_p (addr, set_insn) != 0) + { + /* No AGI stall if SET_INSN is a push or pop and USE_INSN + has SP based memory (unless index reg is modified in a pop). */ + rtx set = single_set (set_insn); + if (set + && (push_operand (SET_DEST (set), GET_MODE (SET_DEST (set))) + || pop_operand (SET_SRC (set), GET_MODE (SET_SRC (set))))) + { + struct ix86_address parts; + if (ix86_decompose_address (addr, &parts) + && REG_P (parts.base) + && REGNO (parts.base) == STACK_POINTER_REGNUM + && (parts.index == NULL_RTX + || MEM_P (SET_DEST (set)) + || !modified_in_p (parts.index, set_insn))) + return false; + } + return true; + } + return false; } return false; }