Patchwork Fix combiner i3/i2 pattern splitting (PR rtl-optimization/45695, take 2)

login
register
mail settings
Submitter Jakub Jelinek
Date Sept. 20, 2010, 10:29 a.m.
Message ID <20100920102901.GJ1269@tyan-ft48-01.lab.bos.redhat.com>
Download mbox | patch
Permalink /patch/65196/
State New
Headers show

Comments

Jakub Jelinek - Sept. 20, 2010, 10:29 a.m.
On Sat, Sep 18, 2010 at 06:23:45PM +0200, Eric Botcazou wrote:
> > HAVE_cc0 targets would still be buggy.
> 
> But not more than before the fix, right?

Sure, but we shouldn't keep it broken IMHO.

> > If you want I can gather statistics on how many cases would match with the
> > swapped patterns compared to without that.
> 
> Yes, that would be nice, for example on the core compiler itself at -O2, TIA.

The stats gathered across x86_64-linux and i686-linux bootstraps+regtests
were roughly 7000 cases where the two sets matched successfully in the first
chosen order and 1 where it only matched after swapping.  Thus I agree to
remove the code to swap it if CLOBBERS are notticed.

The following patch just fixes the HAVE_cc0 case then and does the removal.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.5?

2010-09-20  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/45695
	* combine.c (try_combine): When splitting a two set pattern,
	make sure the pattern which will be put into i2 doesn't use REGs
	or MEMs set by insns in between i2 and i3.  Don't try to swap the
	two patterns, if the chosen order is not possible, just give up.

	* gcc.c-torture/execute/pr45695.c: New test.



	Jakub
Eric Botcazou - Sept. 20, 2010, 11:20 a.m.
> Sure, but we shouldn't keep it broken IMHO.

On the mainline, but not necessarily on the branch.

> The stats gathered across x86_64-linux and i686-linux bootstraps+regtests
> were roughly 7000 cases where the two sets matched successfully in the
> first chosen order and 1 where it only matched after swapping.  Thus I
> agree to remove the code to swap it if CLOBBERS are notticed.

Thanks.

> The following patch just fixes the HAVE_cc0 case then and does the removal.

It will enable more combining for all targets though, won't it?  If so, I 
think that it isn't appropriate for the branch.  Just replace the 'break' 
with 'undo_all + return' there (and add the missing call to use_crosses_set_p 
for cc0 targets if you want).

> 2010-09-20  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR rtl-optimization/45695
> 	* combine.c (try_combine): When splitting a two set pattern,
> 	make sure the pattern which will be put into i2 doesn't use REGs
> 	or MEMs set by insns in between i2 and i3.  Don't try to swap the
> 	two patterns, if the chosen order is not possible, just give up.
>
> 	* gcc.c-torture/execute/pr45695.c: New test.

OK for mainline only.
H.J. Lu - Jan. 2, 2011, 11:14 p.m.
On Mon, Sep 20, 2010 at 3:29 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Sat, Sep 18, 2010 at 06:23:45PM +0200, Eric Botcazou wrote:
>> > HAVE_cc0 targets would still be buggy.
>>
>> But not more than before the fix, right?
>
> Sure, but we shouldn't keep it broken IMHO.
>
>> > If you want I can gather statistics on how many cases would match with the
>> > swapped patterns compared to without that.
>>
>> Yes, that would be nice, for example on the core compiler itself at -O2, TIA.
>
> The stats gathered across x86_64-linux and i686-linux bootstraps+regtests
> were roughly 7000 cases where the two sets matched successfully in the first
> chosen order and 1 where it only matched after swapping.  Thus I agree to
> remove the code to swap it if CLOBBERS are notticed.
>
> The following patch just fixes the HAVE_cc0 case then and does the removal.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.5?
>
> 2010-09-20  Jakub Jelinek  <jakub@redhat.com>
>
>        PR rtl-optimization/45695
>        * combine.c (try_combine): When splitting a two set pattern,
>        make sure the pattern which will be put into i2 doesn't use REGs
>        or MEMs set by insns in between i2 and i3.  Don't try to swap the
>        two patterns, if the chosen order is not possible, just give up.
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46755#c8


H.J.

Patch

--- gcc/combine.c.jj	2010-09-20 09:34:17.828054811 +0200
+++ gcc/combine.c	2010-09-20 09:40:01.934646399 +0200
@@ -3690,36 +3690,41 @@  try_combine (rtx i3, rtx i2, rtx i1, rtx
 	   && GET_CODE (XVECEXP (newpat, 0, 1)) == SET
 	   && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
 	   && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
-	   && ! use_crosses_set_p (SET_SRC (XVECEXP (newpat, 0, 1)),
-				   DF_INSN_LUID (i2))
 	   && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
 				  XVECEXP (newpat, 0, 0))
 	   && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
 				  XVECEXP (newpat, 0, 1))
 	   && ! (contains_muldiv (SET_SRC (XVECEXP (newpat, 0, 0)))
-		 && contains_muldiv (SET_SRC (XVECEXP (newpat, 0, 1))))
-#ifdef HAVE_cc0
-	   /* We cannot split the parallel into two sets if both sets
-	      reference cc0.  */
-	   && ! (reg_referenced_p (cc0_rtx, XVECEXP (newpat, 0, 0))
-		 && reg_referenced_p (cc0_rtx, XVECEXP (newpat, 0, 1)))
-#endif
-	   )
+		 && contains_muldiv (SET_SRC (XVECEXP (newpat, 0, 1)))))
     {
       /* Normally, it doesn't matter which of the two is done first,
-	 but it does if one references cc0.  In that case, it has to
+	 but the one that references cc0 can't be the second, and
+	 one which uses any regs/memory set in between i2 and i3 can't
 	 be first.  */
+      if (!use_crosses_set_p (SET_SRC (XVECEXP (newpat, 0, 1)),
+			      DF_INSN_LUID (i2))
 #ifdef HAVE_cc0
-      if (reg_referenced_p (cc0_rtx, XVECEXP (newpat, 0, 0)))
+	  && !reg_referenced_p (cc0_rtx, XVECEXP (newpat, 0, 0))
+#endif
+	 )
+	{
+	  newi2pat = XVECEXP (newpat, 0, 1);
+	  newpat = XVECEXP (newpat, 0, 0);
+	}
+      else if (!use_crosses_set_p (SET_SRC (XVECEXP (newpat, 0, 0)),
+				   DF_INSN_LUID (i2))
+#ifdef HAVE_cc0
+	       && !reg_referenced_p (cc0_rtx, XVECEXP (newpat, 0, 1))
+#endif
+	      )
 	{
 	  newi2pat = XVECEXP (newpat, 0, 0);
 	  newpat = XVECEXP (newpat, 0, 1);
 	}
       else
-#endif
 	{
-	  newi2pat = XVECEXP (newpat, 0, 1);
-	  newpat = XVECEXP (newpat, 0, 0);
+	  undo_all ();
+	  return 0;
 	}
 
       i2_code_number = recog_for_combine (&newi2pat, i2, &new_i2_notes);
@@ -3735,44 +3740,11 @@  try_combine (rtx i3, rtx i2, rtx i1, rtx
 		  {
 		    rtx reg = XEXP (XVECEXP (newi2pat, 0, i), 0);
 		    if (reg_overlap_mentioned_p (reg, newpat))
-		      break;
+		      {
+			undo_all ();
+			return 0;
+		      }
 		  }
-
-	      if (i >= 0)
-		{
-		  /* CLOBBERs on newi2pat prevent it going first.
-		     Try the other order of the insns if possible.  */
-		  temp = newpat;
-		  newpat = XVECEXP (newi2pat, 0, 0);
-		  newi2pat = temp;
-#ifdef HAVE_cc0
-		  if (reg_referenced_p (cc0_rtx, newpat))
-		    {
-		      undo_all ();
-		      return 0;
-		    }
-#endif
-
-		  i2_code_number = recog_for_combine (&newi2pat, i2,
-						      &new_i2_notes);
-		  if (i2_code_number < 0)
-		    {
-		      undo_all ();
-		      return 0;
-		    }
-
-		  if (GET_CODE (newi2pat) == PARALLEL)
-		    for (i = XVECLEN (newi2pat, 0) - 1; i >= 0; i--)
-		      if (GET_CODE (XVECEXP (newi2pat, 0, i)) == CLOBBER)
-			{
-			  rtx reg = XEXP (XVECEXP (newi2pat, 0, i), 0);
-			  if (reg_overlap_mentioned_p (reg, newpat))
-			    {
-			      undo_all ();
-			      return 0;
-			    }
-			}
-		}
 	    }
 
 	  insn_code_number = recog_for_combine (&newpat, i3, &new_i3_notes);
--- gcc/testsuite/gcc.c-torture/execute/pr45695.c.jj	2010-09-20 09:35:18.281365460 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr45695.c	2010-09-20 09:35:18.299365813 +0200
@@ -0,0 +1,32 @@ 
+/* PR rtl-optimization/45695 */
+
+extern void abort (void);
+
+__attribute__((noinline)) void
+g (int x)
+{
+  asm volatile ("" : "+r" (x));
+}
+
+__attribute__((noinline)) int
+f (int a, int b, int d)
+{
+  int r = -1;
+  b += d;
+  if (d == a)
+    r = b - d;
+  g (b);
+  return r;
+}
+
+int
+main (void)
+{
+  int l;
+  asm ("" : "=r" (l) : "0" (0));
+  if (f (l + 0, l + 1, l + 4) != -1)
+    abort ();
+  if (f (l + 4, l + 1, l + 4) != 1)
+    abort ();
+  return 0;
+}