diff mbox

[PR79663] Only reversely combine refs for ZERO length chains in predcom

Message ID VI1PR0802MB21767BE3572DFBC65CC99BBFE7530@VI1PR0802MB2176.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Bin Cheng Feb. 23, 2017, 11:24 a.m. UTC
Hi,
This patch resolves spec2k/mgrid regression as reported by PR79663.  Root cause has been
described thoroughly in comment #1/#2 of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79663
This patch handles ZERO/non-ZERO length chains differently and only reversely combines refs
for ZERO length chains.  It also does small improvement with in place list swap.
Bootstrap and test on x86_64 and AArch64, is it OK?

Thanks,
bin

2017-01-21  Bin Cheng  <bin.cheng@arm.com>

	PR tree-optimization/79663
	* tree-predcom.c (combine_chains): Process refs in reverse order
	only for ZERO length chains, and add explaining comment.

Comments

Richard Biener Feb. 23, 2017, 3:02 p.m. UTC | #1
On Thu, Feb 23, 2017 at 12:24 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> This patch resolves spec2k/mgrid regression as reported by PR79663.  Root cause has been
> described thoroughly in comment #1/#2 of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79663
> This patch handles ZERO/non-ZERO length chains differently and only reversely combines refs
> for ZERO length chains.  It also does small improvement with in place list swap.
> Bootstrap and test on x86_64 and AArch64, is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
>
> 2017-01-21  Bin Cheng  <bin.cheng@arm.com>
>
>         PR tree-optimization/79663
>         * tree-predcom.c (combine_chains): Process refs in reverse order
>         only for ZERO length chains, and add explaining comment.
Jeff Law Feb. 23, 2017, 10:01 p.m. UTC | #2
On 02/23/2017 04:24 AM, Bin Cheng wrote:
> Hi,
> This patch resolves spec2k/mgrid regression as reported by PR79663.  Root cause has been
> described thoroughly in comment #1/#2 of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79663
> This patch handles ZERO/non-ZERO length chains differently and only reversely combines refs
> for ZERO length chains.  It also does small improvement with in place list swap.
> Bootstrap and test on x86_64 and AArch64, is it OK?
>
> Thanks,
> bin
>
> 2017-01-21  Bin Cheng  <bin.cheng@arm.com>
>
> 	PR tree-optimization/79663
> 	* tree-predcom.c (combine_chains): Process refs in reverse order
> 	only for ZERO length chains, and add explaining comment.
>
I went ahead and installed this.

jeff
diff mbox

Patch

diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index 9723e9c..57d8f7d 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -2283,7 +2283,7 @@  combine_chains (chain_p ch1, chain_p ch2)
   enum tree_code op = ERROR_MARK;
   bool swap = false;
   chain_p new_chain;
-  unsigned i;
+  int i, j, num;
   gimple *root_stmt;
   tree rslt_type = NULL_TREE;
 
@@ -2305,6 +2305,9 @@  combine_chains (chain_p ch1, chain_p ch2)
 	return NULL;
     }
 
+  ch1->combined = true;
+  ch2->combined = true;
+
   if (swap)
     std::swap (ch1, ch2);
 
@@ -2317,39 +2320,41 @@  combine_chains (chain_p ch1, chain_p ch2)
   new_chain->length = ch1->length;
 
   gimple *insert = NULL;
-  auto_vec<dref> tmp_refs;
-  gcc_assert (ch1->refs.length () == ch2->refs.length ());
-  /* Process in reverse order so dominance point is ready when it comes
-     to the root ref.  */
-  for (i = ch1->refs.length (); i > 0; i--)
-    {
-      r1 = ch1->refs[i - 1];
-      r2 = ch2->refs[i - 1];
+  num = ch1->refs.length ();
+  i = (new_chain->length == 0) ? num - 1 : 0;
+  j = (new_chain->length == 0) ? -1 : 1;
+  /* For ZERO length chain, process refs in reverse order so that dominant
+     position is ready when it comes to the root ref.
+     For non-ZERO length chain, process refs in order.  See PR79663.  */
+  for (; num > 0; num--, i += j)
+    {
+      r1 = ch1->refs[i];
+      r2 = ch2->refs[i];
       nw = XCNEW (struct dref_d);
       nw->distance = r1->distance;
-      nw->stmt = stmt_combining_refs (r1, r2, i == 1 ? insert : NULL);
 
-      /* Record dominance point where root combined stmt should be inserted
-	 for chains with 0 length.  Though all root refs dominate following
-	 refs, it's possible the combined stmt doesn't.  See PR70754.  */
-      if (ch1->length == 0
+      /* For ZERO length chain, insert combined stmt of root ref at dominant
+	 position.  */
+      nw->stmt = stmt_combining_refs (r1, r2, i == 0 ? insert : NULL);
+      /* For ZERO length chain, record dominant position where combined stmt
+	 of root ref should be inserted.  In this case, though all root refs
+	 dominate following ones, it's possible that combined stmt doesn't.
+	 See PR70754.  */
+      if (new_chain->length == 0
 	  && (insert == NULL || stmt_dominates_stmt_p (nw->stmt, insert)))
 	insert = nw->stmt;
 
-      tmp_refs.safe_push (nw);
+      new_chain->refs.safe_push (nw);
     }
-
-  /* Restore the order for new chain's refs.  */
-  for (i = tmp_refs.length (); i > 0; i--)
-    new_chain->refs.safe_push (tmp_refs[i - 1]);
-
-  ch1->combined = true;
-  ch2->combined = true;
-
-  /* For chains with 0 length, has_max_use_after must be true since root
-     combined stmt must dominates others.  */
   if (new_chain->length == 0)
     {
+      /* Restore the order for ZERO length chain's refs.  */
+      num = new_chain->refs.length () >> 1;
+      for (i = 0, j = new_chain->refs.length () - 1; i < num; i++, j--)
+	std::swap (new_chain->refs[i], new_chain->refs[j]);
+
+      /* For ZERO length chain, has_max_use_after must be true since root
+	 combined stmt must dominates others.  */
       new_chain->has_max_use_after = true;
       return new_chain;
     }