PR 71483 - Fix live SLP operations

Message ID D385CFCC.101D6%alan.hayward@arm.com
State New

Commit Message

Alan Hayward June 14, 2016, 2:14 p.m. UTC
In the given testcase, g++ splits a live operation into two scalar statements
and four vector statements.

  _5 = _4 >> 2;
  _7 = (short int) _5;

Is turned into:

  vect__5.32_80 = vect__4.31_76 >> 2;
  vect__5.32_81 = vect__4.31_77 >> 2;
  vect__5.32_82 = vect__4.31_78 >> 2;
  vect__5.32_83 = vect__4.31_79 >> 2;
  vect__7.33_86 = VEC_PACK_TRUNC_EXPR <vect__5.32_80, vect__5.32_81>;
  vect__7.33_87 = VEC_PACK_TRUNC_EXPR <vect__5.32_82, vect__5.32_83>;

_5 is then accessed outside the loop.

This patch ensures that vectorizable_live_operation picks the correct scalar
statement.
I removed the "three possibilities" comment because it was no longer accurate
(it's also possible to have more vector statements than scalar statements),
and the calculation is now much simpler.
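
The new index calculation can be sketched as standalone C. This is only an
illustration of the arithmetic in the patch, not GCC code: the helper names
last_occurrence and lane_bit_offset and the struct lane_pos are hypothetical,
and the values quoted below for this testcase (nunits = 4, 32-bit lanes) are
assumptions about a 128-bit vector target.

```c
/* Hypothetical stand-in for the values the patch computes.  */
struct lane_pos
{
  int vec_entry;  /* Which SLP vector statement holds the value.  */
  int vec_index;  /* Which lane inside that vector.  */
};

/* Treat the num_vec vectors of nunits lanes each as one concatenated
   array in which the num_scalar scalar statements repeat.  The last
   occurrence of scalar slp_index lies in the final group of num_scalar
   lanes.  */
static struct lane_pos
last_occurrence (int num_vec, int nunits, int num_scalar, int slp_index)
{
  int pos = (num_vec * nunits) - num_scalar + slp_index;
  struct lane_pos p = { pos / nunits, pos % nunits };
  return p;
}

/* Bit offset of the lane, mirroring bitstart = bitsize * vec_index.  */
static int
lane_bit_offset (int vec_index, int bitsize)
{
  return vec_index * bitsize;
}
```

Under those assumed values (num_vec = 4, nunits = 4, num_scalar = 2),
slp_index 0 maps to vector 3, lane 2. The removed code computed
scalar_per_vec = num_scalar / num_vec, which is 0 in integer arithmetic for
these values and was then used as a divisor.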

Tested on x86 and aarch64.
Ok to commit?

gcc/
    PR tree-optimization/71483
    * tree-vect-loop.c (vectorizable_live_operation): Pick correct index
    for slp

testsuite/g++.dg/vect
    PR tree-optimization/71483
    * pr71483.c: New


Alan.



Comments

Richard Biener June 15, 2016, 6:38 a.m. UTC | #1
On June 14, 2016 4:14:20 PM GMT+02:00, Alan Hayward <alan.hayward@arm.com> wrote:
>In the given testcase, g++ splits a live operation into two scalar statements
>and four vector statements.
>
>  _5 = _4 >> 2;
>  _7 = (short int) _5;
>
>Is turned into:
>
>  vect__5.32_80 = vect__4.31_76 >> 2;
>  vect__5.32_81 = vect__4.31_77 >> 2;
>  vect__5.32_82 = vect__4.31_78 >> 2;
>  vect__5.32_83 = vect__4.31_79 >> 2;
>  vect__7.33_86 = VEC_PACK_TRUNC_EXPR <vect__5.32_80, vect__5.32_81>;
>  vect__7.33_87 = VEC_PACK_TRUNC_EXPR <vect__5.32_82, vect__5.32_83>;
>
>_5 is then accessed outside the loop.
>
>This patch ensures that vectorizable_live_operation picks the correct scalar
>statement.
>I removed the "three possibilities" comment because it was no longer accurate
>(it's also possible to have more vector statements than scalar statements),
>and the calculation is now much simpler.
>
>Tested on x86 and aarch64.
>Ok to commit?

OK.

Thanks,
Richard.


Patch

diff --git a/gcc/testsuite/g++.dg/vect/pr71483.c b/gcc/testsuite/g++.dg/vect/pr71483.c
new file mode 100644
index 0000000000000000000000000000000000000000..77f879c9a89b8b41ef9dde3c3435918572dc8d01
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr71483.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+int b, c, d;
+short *e;
+void fn1() {
+  for (; b; b--) {
+    d = *e >> 2;
+    *e++ = d;
+    c = *e;
+    *e++ = d;
+  }
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 4c8678505df6ec572b69fd7d12ac55cf4619ece6..a2413bf9c678d11cc2ffd22bc7d984e911831804 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6368,24 +6368,20 @@  vectorizable_live_operation (gimple *stmt,

       int num_scalar = SLP_TREE_SCALAR_STMTS (slp_node).length ();
       int num_vec = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
-      int scalar_per_vec = num_scalar / num_vec;

-      /* There are three possibilites here:
-	 1: All scalar stmts fit in a single vector.
-	 2: All scalar stmts fit multiple times into a single vector.
-	    We must choose the last occurence of stmt in the vector.
-	 3: Scalar stmts are split across multiple vectors.
-	    We must choose the correct vector and mod the lane accordingly.  */
+      /* Get the last occurrence of the scalar index from the concatenation of
+	 all the slp vectors. Calculate which slp vector it is and the index
+	 within.  */
+      int pos = (num_vec * nunits) - num_scalar + slp_index;
+      int vec_entry = pos / nunits;
+      int vec_index = pos % nunits;

       /* Get the correct slp vectorized stmt.  */
-      int vec_entry = slp_index / scalar_per_vec;
       vec_lhs = gimple_get_lhs (SLP_TREE_VEC_STMTS (slp_node)[vec_entry]);

       /* Get entry to use.  */
-      bitstart = build_int_cst (unsigned_type_node,
-				scalar_per_vec - (slp_index % scalar_per_vec));
+      bitstart = build_int_cst (unsigned_type_node, vec_index);
       bitstart = int_const_binop (MULT_EXPR, bitsize, bitstart);
-      bitstart = int_const_binop (MINUS_EXPR, vec_bitsize, bitstart);
     }
   else