Message ID | 20171209095012.GD2353@tucnak |
---|---|
State | New |
Series | Fix vectorization of POINTER_DIFF_EXPR (PR tree-optimization/83338) |
On December 9, 2017 10:50:12 AM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote:
>Hi!
>
>When POINTER_PLUS_EXPR is vectorized, we vectorize it as a vector PLUS_EXPR.
>This (usually? Not sure about targets where sizetype has different
>precision from POINTER_SIZE; maybe those just don't have vector types) works
>because we vectorize pointer variables as vectors of unsigned pointer sized
>integers and if sizetype has the same precision, then it is the same vector
>too, so both operands and result are compatible vectors.
>
>POINTER_DIFF_EXPR is different, the arguments are pointers which get
>vectype of vectors of unsigned pointer sized integers, but the result
>is a corresponding signed type, so vectype_out is vector of signed pointer
>sized integers; those aren't compatible.
>
>So, we can't just vectorize POINTER_DIFF_EXPR as vector MINUS_EXPR, we need
>to VCE the result from vectype to vectype_out.
>
>Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
>trunk?

OK.

Richard.

>2017-12-09  Jakub Jelinek  <jakub@redhat.com>
>
>        PR tree-optimization/83338
>        * tree-vect-stmts.c (vectorizable_operation): Handle POINTER_DIFF_EXPR
>        vectorization as MINUS_EXPR with a subsequent VIEW_CONVERT_EXPR from
>        vector of unsigned integers to vector of signed integers.
>
>        * gcc.dg/vect/pr83338.c: New test.
>
>--- gcc/tree-vect-stmts.c.jj 2017-12-08 12:21:58.000000000 +0100
>+++ gcc/tree-vect-stmts.c 2017-12-09 00:55:17.614147824 +0100
>@@ -5226,7 +5226,7 @@ vectorizable_operation (gimple *stmt, gi
>   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>   tree vectype;
>   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>-  enum tree_code code;
>+  enum tree_code code, orig_code;
>   machine_mode vec_mode;
>   tree new_temp;
>   int op_type;
>@@ -5264,7 +5264,7 @@ vectorizable_operation (gimple *stmt, gi
>   if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
>     return false;
>
>-  code = gimple_assign_rhs_code (stmt);
>+  orig_code = code = gimple_assign_rhs_code (stmt);
>
>   /* For pointer addition and subtraction, we should use the normal
>      plus and minus for the vector operation.  */
>@@ -5455,6 +5455,14 @@ vectorizable_operation (gimple *stmt, gi
>   /* Handle def.  */
>   vec_dest = vect_create_destination_var (scalar_dest, vectype);
>
>+  /* POINTER_DIFF_EXPR has pointer arguments which are vectorized as
>+     vectors with unsigned elements, but the result is signed.  So, we
>+     need to compute the MINUS_EXPR into vectype temporary and
>+     VIEW_CONVERT_EXPR it into the final vectype_out result.  */
>+  tree vec_cvt_dest = NULL_TREE;
>+  if (orig_code == POINTER_DIFF_EXPR)
>+    vec_cvt_dest = vect_create_destination_var (scalar_dest, vectype_out);
>+
>   /* In case the vectorization factor (VF) is bigger than the number
>      of elements that we can fit in a vectype (nunits), we have to generate
>      more than one vector stmt - i.e - we need to "unroll" the
>@@ -5546,6 +5554,15 @@ vectorizable_operation (gimple *stmt, gi
>           new_temp = make_ssa_name (vec_dest, new_stmt);
>           gimple_assign_set_lhs (new_stmt, new_temp);
>           vect_finish_stmt_generation (stmt, new_stmt, gsi);
>+          if (vec_cvt_dest)
>+            {
>+              new_temp = build1 (VIEW_CONVERT_EXPR, vectype_out, new_temp);
>+              new_stmt = gimple_build_assign (vec_cvt_dest, VIEW_CONVERT_EXPR,
>+                                              new_temp);
>+              new_temp = make_ssa_name (vec_cvt_dest, new_stmt);
>+              gimple_assign_set_lhs (new_stmt, new_temp);
>+              vect_finish_stmt_generation (stmt, new_stmt, gsi);
>+            }
>           if (slp_node)
>             SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
>         }
>--- gcc/testsuite/gcc.dg/vect/pr83338.c.jj 2017-12-09 01:00:06.602565622 +0100
>+++ gcc/testsuite/gcc.dg/vect/pr83338.c 2017-12-09 01:00:18.297422116 +0100
>@@ -0,0 +1,10 @@
>+/* PR tree-optimization/83338 */
>+/* { dg-do compile } */
>+
>+void
>+foo (char **p, char **q, __PTRDIFF_TYPE__ *r)
>+{
>+  int i;
>+  for (i = 0; i < 1024; i++)
>+    r[i] = p[i] - q[i];
>+}
>
>
>        Jakub
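The reasoning above (subtract on the unsigned representation, then reinterpret the bits as signed) can be illustrated with a scalar analogue in plain C. This is only a sketch, not vectorizer code: `diff_via_unsigned` is a hypothetical helper, and it assumes a flat address space, a two's complement `ptrdiff_t`, and that `uintptr_t` and `ptrdiff_t` have the same width, as on x86_64-linux and i686-linux. The `memcpy` plays the role that VIEW_CONVERT_EXPR plays in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the unsigned and signed pointer-sized integer types have the
   same width, so the bits of one can be reinterpreted as the other.  */
_Static_assert (sizeof (uintptr_t) == sizeof (ptrdiff_t),
                "sketch assumes equal-width unsigned/signed pointer integers");

/* Hypothetical helper: scalar analogue of a MINUS_EXPR on the unsigned
   representation followed by a bit reinterpretation to the signed type.
   Both pointers are assumed to point into the same array of char.  */
ptrdiff_t
diff_via_unsigned (const char *p, const char *q)
{
  uintptr_t up = (uintptr_t) p;
  uintptr_t uq = (uintptr_t) q;
  uintptr_t udiff = up - uq;    /* Unsigned subtraction, wraps modulo 2^N.  */
  ptrdiff_t sdiff;

  /* Reinterpret the bits without any value conversion.  */
  memcpy (&sdiff, &udiff, sizeof sdiff);
  return sdiff;
}
```

Under those assumptions, for two pointers into the same `char` array the helper returns the same value as `p - q`, including when `p` is below `q` and the unsigned subtraction wraps.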
--- gcc/tree-vect-stmts.c.jj 2017-12-08 12:21:58.000000000 +0100
+++ gcc/tree-vect-stmts.c 2017-12-09 00:55:17.614147824 +0100
@@ -5226,7 +5226,7 @@ vectorizable_operation (gimple *stmt, gi
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   tree vectype;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
-  enum tree_code code;
+  enum tree_code code, orig_code;
   machine_mode vec_mode;
   tree new_temp;
   int op_type;
@@ -5264,7 +5264,7 @@ vectorizable_operation (gimple *stmt, gi
   if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
     return false;
 
-  code = gimple_assign_rhs_code (stmt);
+  orig_code = code = gimple_assign_rhs_code (stmt);
 
   /* For pointer addition and subtraction, we should use the normal
      plus and minus for the vector operation.  */
@@ -5455,6 +5455,14 @@ vectorizable_operation (gimple *stmt, gi
   /* Handle def.  */
   vec_dest = vect_create_destination_var (scalar_dest, vectype);
 
+  /* POINTER_DIFF_EXPR has pointer arguments which are vectorized as
+     vectors with unsigned elements, but the result is signed.  So, we
+     need to compute the MINUS_EXPR into vectype temporary and
+     VIEW_CONVERT_EXPR it into the final vectype_out result.  */
+  tree vec_cvt_dest = NULL_TREE;
+  if (orig_code == POINTER_DIFF_EXPR)
+    vec_cvt_dest = vect_create_destination_var (scalar_dest, vectype_out);
+
   /* In case the vectorization factor (VF) is bigger than the number
      of elements that we can fit in a vectype (nunits), we have to generate
      more than one vector stmt - i.e - we need to "unroll" the
@@ -5546,6 +5554,15 @@ vectorizable_operation (gimple *stmt, gi
           new_temp = make_ssa_name (vec_dest, new_stmt);
           gimple_assign_set_lhs (new_stmt, new_temp);
           vect_finish_stmt_generation (stmt, new_stmt, gsi);
+          if (vec_cvt_dest)
+            {
+              new_temp = build1 (VIEW_CONVERT_EXPR, vectype_out, new_temp);
+              new_stmt = gimple_build_assign (vec_cvt_dest, VIEW_CONVERT_EXPR,
+                                              new_temp);
+              new_temp = make_ssa_name (vec_cvt_dest, new_stmt);
+              gimple_assign_set_lhs (new_stmt, new_temp);
+              vect_finish_stmt_generation (stmt, new_stmt, gsi);
+            }
           if (slp_node)
             SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
         }
--- gcc/testsuite/gcc.dg/vect/pr83338.c.jj 2017-12-09 01:00:06.602565622 +0100
+++ gcc/testsuite/gcc.dg/vect/pr83338.c 2017-12-09 01:00:18.297422116 +0100
@@ -0,0 +1,10 @@
+/* PR tree-optimization/83338 */
+/* { dg-do compile } */
+
+void
+foo (char **p, char **q, __PTRDIFF_TYPE__ *r)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    r[i] = p[i] - q[i];
+}
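
The new test is compile-only, so it only checks that the vectorizer no longer fails on the loop. As a rough sketch of how the same loop could be checked at run time, the fragment below wraps `foo` with a hypothetical checker, `count_mismatches`, which is not part of the patch. It uses `ptrdiff_t` from `<stddef.h>` in place of `__PTRDIFF_TYPE__` and picks inputs so that both negative and positive differences occur.

```c
#include <assert.h>   /* For assert in callers.  */
#include <stddef.h>

#define N 1024

/* Same loop as in pr83338.c, with ptrdiff_t instead of __PTRDIFF_TYPE__.  */
void
foo (char **p, char **q, ptrdiff_t *r)
{
  int i;
  for (i = 0; i < N; i++)
    r[i] = p[i] - q[i];
}

/* Hypothetical runtime check, not part of the patch: returns the number of
   elements whose computed difference does not match the expected value.  */
int
count_mismatches (void)
{
  static char buf[N];
  static char *p[N], *q[N];
  static ptrdiff_t r[N];
  int i, bad = 0;

  for (i = 0; i < N; i++)
    {
      p[i] = buf + i;        /* Ascending pointers into buf.  */
      q[i] = buf + N / 2;    /* Fixed pointer to the middle of buf.  */
    }

  foo (p, q, r);

  /* Expected difference is i - N/2: negative for the first half of the
     array, zero or positive for the second half.  */
  for (i = 0; i < N; i++)
    if (r[i] != (ptrdiff_t) i - N / 2)
      bad++;
  return bad;
}
```

A caller would expect `count_mismatches ()` to return 0 whether or not the loop in `foo` is vectorized.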