diff mbox

Check for sp push/pop insns in reg_set_p (PR target/79430)

Message ID 20170427073250.GW1809@tucnak
State New
Headers show

Commit Message

Jakub Jelinek April 27, 2017, 7:32 a.m. UTC
Hi!

As mentioned in the PR and can be seen on the testcase (too large for
testsuite, with lots of delta reduction I got 48KB *.f90 file still using
a dozen of modules), we miscompile it because we have mem(sp+64) memory
(what %st is loaded from) and are checking whether it is safe to move
earlier in the insn stream, and modified_between_p tells us it is, except
there is a stack pop instruction (i.e. sp autoinc).
And sp autoinc is apparently special in GCC:
      /* There are no REG_INC notes for SP.  */
  /* Cannot handle auto inc of the stack.  */
  if (inc_reg == stack_pointer_rtx)
etc. - it is present even on targets that have AUTO_INC_DEC 0 (like
i?86/x86_64), don't have REG_INC notes etc.
reg_set_p currently has:
  /* We can be passed an insn or part of one.  If we are passed an insn,
     check if a side-effect of the insn clobbers REG.  */
  if (INSN_P (insn)
      && (FIND_REG_INC_NOTE (insn, reg)
so it handles insns with REG_INC notes fine, but doesn't know about the
SP special case.

The following patch handles that, plus then undoes that in ix86_agi_dependent
where from what I understood we want the previous behavior - push, pop and
call modifications of SP don't cause AGI stalls for addresses that have
SP base (SP can't appear as index).

Not really sure about the == stack_pointer_rtx vs.
REG_P () && REGNO () == STACK_POINTER_REGNUM, there is lots of code that
just uses pointer comparisons and others that check REGNO, as an example
of the former e.g. push/pop_operand.  So, is SP always shared, or can there
be other REGs with SP regno?

Other than the ix86_agi_dependent which in my stats was the single case
that hit this difference, I've seen it making a difference e.g. in ifcvt
decisions, but at least the cases I've debugged didn't end up in any code
generation changes.  E.g. both x86_64 and i686 libstdc++.so.6 and
libgo.so.11 as the two largest shared libraries built during bootstrap
are identical without/with this patch (objdump -dr is identical that is).
While without the config/i386/i386.c changes there were tons of differences.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-04-27  Jakub Jelinek  <jakub@redhat.com>

	PR target/79430
	* rtlanal.c (reg_set_p): If reg is a stack_pointer_rtx, also
	check for stack push/pop autoinc.
	* config/i386/i386.c (ix86_agi_dependent): Return false
	if the only reason why modified_in_p returned true is that
	addr is SP based and set_insn is a push or pop.


	Jakub

Comments

Jeff Law April 27, 2017, 4:13 p.m. UTC | #1
On 04/27/2017 01:32 AM, Jakub Jelinek wrote:
> Hi!
> 
> As mentioned in the PR and can be seen on the testcase (too large for
> testsuite, with lots of delta reduction I got 48KB *.f90 file still using
> a dozen of modules), we miscompile it because we have mem(sp+64) memory
> (what %st is loaded from) and are checking whether it is safe to move
> earlier in the insn stream, and modified_between_p tells us it is, except
> there is a stack pop instruction (i.e. sp autoinc).
> And sp autoinc is apparently special in GCC:
>        /* There are no REG_INC notes for SP.  */
Right.  It's been the source of numerous problems through the years. 
One could argue that we should just bite the bullet and add them.  The 
cost can't be that high and it'd avoid these kinds of problems in the 
future and allow for some code cleanups as well.

We could probably scan the IL at the end of auto-inc-dec.c to add the 
missing notes.

I thought I saw a comment once which indicates the rationale behind not 
including REG_INC notes for pushes/pops, but I can't find it anymore.

> 
> The following patch handles that, plus then undoes that in ix86_agi_dependent
> where from what I understood we want the previous behavior - push, pop and
> call modifications of SP don't cause AGI stalls for addresses that have
> SP base (SP can't appear as index).
> 
> Not really sure about the == stack_pointer_rtx vs.
> REG_P () && REGNO () == STACK_POINTER_REGNUM, there is lots of code that
> just uses pointer comparisons and others that check REGNO, as an example
> of the former e.g. push/pop_operand.  So, is SP always shared, or can there
> be other REGs with SP regno?
SP is supposed to be shared, you should be able to compare against 
stack_pointer_rtx.


> 
> Other than the ix86_agi_dependent which in my stats was the single case
> that hit this difference, I've seen it making a difference e.g. in ifcvt
> decisions, but at least the cases I've debugged didn't end up in any code
> generation changes.  E.g. both x86_64 and i686 libstdc++.so.6 and
> libgo.so.11 as the two largest shared libraries built during bootstrap
> are identical without/with this patch (objdump -dr is identical that is).
> While without the config/i386/i386.c changes there were tons of differences.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2017-04-27  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/79430
> 	* rtlanal.c (reg_set_p): If reg is a stack_pointer_rtx, also
> 	check for stack push/pop autoinc.
> 	* config/i386/i386.c (ix86_agi_dependent): Return false
> 	if the only reason why modified_in_p returned true is that
> 	addr is SP based and set_insn is a push or pop.
THe rtlanal.c changes are fine by me.  Uros should chime in on the x86 
specific bits.

Jeff
Uros Bizjak May 1, 2017, 9:24 a.m. UTC | #2
On Thu, Apr 27, 2017 at 6:13 PM, Jeff Law <law@redhat.com> wrote:
> On 04/27/2017 01:32 AM, Jakub Jelinek wrote:
>>
>> Hi!
>>
>> As mentioned in the PR and can be seen on the testcase (too large for
>> testsuite, with lots of delta reduction I got 48KB *.f90 file still using
>> a dozen of modules), we miscompile it because we have mem(sp+64) memory
>> (what %st is loaded from) and are checking whether it is safe to move
>> earlier in the insn stream, and modified_between_p tells us it is, except
>> there is a stack pop instruction (i.e. sp autoinc).
>> And sp autoinc is apparently special in GCC:
>>        /* There are no REG_INC notes for SP.  */
>
> Right.  It's been the source of numerous problems through the years. One
> could argue that we should just bite the bullet and add them.  The cost
> can't be that high and it'd avoid these kinds of problems in the future and
> allow for some code cleanups as well.
>
> We could probably scan the IL at the end of auto-inc-dec.c to add the
> missing notes.
>
> I thought I saw a comment once which indicates the rationale behind not
> including REG_INC notes for pushes/pops, but I can't find it anymore.
>
>>
>> The following patch handles that, plus then undoes that in
>> ix86_agi_dependent
>> where from what I understood we want the previous behavior - push, pop and
>> call modifications of SP don't cause AGI stalls for addresses that have
>> SP base (SP can't appear as index).
>>
>> Not really sure about the == stack_pointer_rtx vs.
>> REG_P () && REGNO () == STACK_POINTER_REGNUM, there is lots of code that
>> just uses pointer comparisons and others that check REGNO, as an example
>> of the former e.g. push/pop_operand.  So, is SP always shared, or can
>> there
>> be other REGs with SP regno?
>
> SP is supposed to be shared, you should be able to compare against
> stack_pointer_rtx.
>
>
>>
>> Other than the ix86_agi_dependent which in my stats was the single case
>> that hit this difference, I've seen it making a difference e.g. in ifcvt
>> decisions, but at least the cases I've debugged didn't end up in any code
>> generation changes.  E.g. both x86_64 and i686 libstdc++.so.6 and
>> libgo.so.11 as the two largest shared libraries built during bootstrap
>> are identical without/with this patch (objdump -dr is identical that is).
>> While without the config/i386/i386.c changes there were tons of
>> differences.
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>
>> 2017-04-27  Jakub Jelinek  <jakub@redhat.com>
>>
>>         PR target/79430
>>         * rtlanal.c (reg_set_p): If reg is a stack_pointer_rtx, also
>>         check for stack push/pop autoinc.
>>         * config/i386/i386.c (ix86_agi_dependent): Return false
>>         if the only reason why modified_in_p returned true is that
>>         addr is SP based and set_insn is a push or pop.
>
> THe rtlanal.c changes are fine by me.  Uros should chime in on the x86
> specific bits.

LGTM, with comparison to stack_pointer_rtx, as mentioned by Jeff above.

Thanks,
Uros.
diff mbox

Patch

--- gcc/rtlanal.c.jj	2017-04-26 12:11:04.019878187 +0200
+++ gcc/rtlanal.c	2017-04-26 17:48:14.131705330 +0200
@@ -1221,6 +1221,24 @@  reg_set_p (const_rtx reg, const_rtx insn
 		  || find_reg_fusage (insn, CLOBBER, reg)))))
     return true;
 
+  /* There are no REG_INC notes for SP autoinc.  */
+  if (reg == stack_pointer_rtx && INSN_P (insn))
+    {
+      subrtx_var_iterator::array_type array;
+      FOR_EACH_SUBRTX_VAR (iter, array, PATTERN (insn), NONCONST)
+	{
+	  rtx mem = *iter;
+	  if (mem
+	      && MEM_P (mem)
+	      && GET_RTX_CLASS (GET_CODE (XEXP (mem, 0))) == RTX_AUTOINC)
+	    {
+	      if (XEXP (XEXP (mem, 0), 0) == stack_pointer_rtx)
+		return true;
+	      iter.skip_subrtxes ();
+	    }
+	}
+    }
+
   return set_of (reg, insn) != NULL_RTX;
 }
 
--- gcc/config/i386/i386.c.jj	2017-04-26 17:48:01.108877052 +0200
+++ gcc/config/i386/i386.c	2017-04-26 17:50:44.890717389 +0200
@@ -29243,7 +29243,27 @@  ix86_agi_dependent (rtx_insn *set_insn,
     if (MEM_P (recog_data.operand[i]))
       {
 	rtx addr = XEXP (recog_data.operand[i], 0);
-	return modified_in_p (addr, set_insn) != 0;
+	if (modified_in_p (addr, set_insn) != 0)
+	  {
+	    /* No AGI stall if SET_INSN is a push or pop and USE_INSN
+	       has SP based memory (unless index reg is modified in a pop).  */
+	    rtx set = single_set (set_insn);
+	    if (set
+		&& (push_operand (SET_DEST (set), GET_MODE (SET_DEST (set)))
+		    || pop_operand (SET_SRC (set), GET_MODE (SET_SRC (set)))))
+	      {
+		struct ix86_address parts;
+		if (ix86_decompose_address (addr, &parts)
+		    && REG_P (parts.base)
+		    && REGNO (parts.base) == STACK_POINTER_REGNUM
+		    && (parts.index == NULL_RTX
+			|| MEM_P (SET_DEST (set))
+			|| !modified_in_p (parts.index, set_insn)))
+		  return false;
+	      }
+	    return true;
+	  }
+	return false;
       }
   return false;
 }