Message ID | 20160204164017.GO3017@tucnak.redhat.com |
---|---|
State | New |
Headers | show |
Hi All, Here is updated patch - I came back to move call statements also since masked loads are presented by internal call. I also assume that for the following simple loop for (i = 0; i < n; i++) if (b1[i]) a1[i] = sqrtf(a2[i] * a2[i] + a3[i] * a3[i]); motion must be done for all vector statements in semi-hammock including SQRT. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-02-05 Yuri Rumyantsev <ysrumyan@gmail.com> PR tree-optimization/69652 * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all skipped scalar statements, introduce variable LAST_VUSE to keep vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the begining of current masked store processing, did source re-formatting, skip parsing of debug gimples, stop processing if a gimple with volatile operand has been encountered, save scalar statement with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE iterator, change vuse of all saved scalar statements to LAST_VUSE if it makes sence. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr69652.c: New test. 2016-02-04 19:40 GMT+03:00 Jakub Jelinek <jakub@redhat.com>: > On Thu, Feb 04, 2016 at 05:46:27PM +0300, Yuri Rumyantsev wrote: >> Here is a patch that cures the issues with non-correct vuse for scalar >> statements during code motion, i.e. if vuse of scalar statement is >> vdef of masked store which has been sunk to new basic block, we must >> fix it up. The patch also fixed almost all remarks pointed out by >> Jacub. >> >> Bootstrapping and regression testing on v86-64 did not show any new failures. >> Is it OK for trunk? >> >> ChangeLog: >> 2016-02-04 Yuri Rumyantsev <ysrumyan@gmail.com> >> >> PR tree-optimization/69652 >> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >> skipped scalar statements, introduce variable LAST_VUSE that has >> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >> begining of current masked store processing, did source re-formatting, >> skip parsing of debug gimples, stop processing when call or gimple >> with volatile operand habe been encountered, save scalar statement >> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >> iterator, change vuse of all saved scalar statements to LAST_VUSE if >> it makes sence. >> >> gcc/testsuite/ChangeLog: >> * gcc.dg/torture/pr69652.c: New test. > > Your mailer breaks ChangeLog formatting, so it is hard to check the > formatting of the ChangeLog entry. > > diff --git a/gcc/testsuite/gcc.dg/torture/pr69652.c b/gcc/testsuite/gcc.dg/torture/pr69652.c > new file mode 100644 > index 0000000..91f30cf > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/pr69652.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ > +/* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ > + > +void fn1(double **matrix, int column, int row, int n) > +{ > + int k; > + for (k = 0; k < n; k++) > + if (matrix[row][k] != matrix[column][k]) > + { > + matrix[column][k] = -matrix[column][k]; > + matrix[row][k] = matrix[row][k] - matrix[column][k]; > + } > +} > \ No newline at end of file > > Please make sure the last line of the test is a new-line. > > @@ -6971,6 +6972,8 @@ optimize_mask_stores (struct loop *loop) > gsi_next (&gsi)) > { > stmt = gsi_stmt (gsi); > + if (is_gimple_debug (stmt)) > + continue; > if (is_gimple_call (stmt) > && gimple_call_internal_p (stmt) > && gimple_call_internal_fn (stmt) == IFN_MASK_STORE) > > This is not needed, you do something only for is_gimple_call, > which is never true if is_gimple_debug, so the code used to be fine as is. > > + /* Skip debug sstatements. */ > > s/ss/s/ > > + if (is_gimple_debug (gsi_stmt (gsi))) > + continue; > + stmt1 = gsi_stmt (gsi); > + /* Do not consider writing to memory,volatile and call > > Missing space after , > > + /* Skip scalar statements. */ > + if (!VECTOR_TYPE_P (TREE_TYPE (lhs))) > + { > + /* If scalar statement has vuse we need to modify it > + when another masked store will be sunk. */ > + if (gimple_vuse (stmt1)) > + scalar_vuse.safe_push (stmt1); > continue; > + } > > I don't think it is safe to take for granted that the scalar stmts are all > going to be DCEd, but I could be wrong. > > + /* Check that LHS does not have uses outside of STORE_BB. */ > + res = true; > + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) > + { > + gimple *use_stmt; > + use_stmt = USE_STMT (use_p); > + if (is_gimple_debug (use_stmt)) > + continue; > > Ignoring debug stmts to make decision whether you move or not is > of course the right thing to do. But IMHO you should remember if > you saw any is_gimple_debug stmts in some bool var. > > + if (gimple_bb (use_stmt) != store_bb) > + { > + res = false; > + break; > + } > + } > + if (!res) > + break; > > - if (gimple_vuse (stmt1) > - && gimple_vuse (stmt1) != gimple_vuse (last_store)) > - break; > + if (gimple_vuse (stmt1) > + && gimple_vuse (stmt1) != gimple_vuse (last_store)) > + break; > > + /* Can move STMT1 to STORE_BB. */ > + if (dump_enabled_p ()) > + { > + dump_printf_loc (MSG_NOTE, vect_location, > + "Move stmt to created bb\n"); > + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt1, 0); > + } > > And if yes, invalidate them here, because the move would otherwise > generate invalid IL. > > + gsi_move_before (&gsi_from, &gsi_to); > + /* Shift GSI_TO for further insertion. */ > + gsi_prev (&gsi_to); > + } > + /* Put other masked stores with the same mask to STORE_BB. */ > + if (worklist.is_empty () > + || gimple_call_arg (worklist.last (), 2) != mask > + || worklist.last () != stmt1) > + break; > + last = worklist.pop (); > } > add_phi_arg (phi, gimple_vuse (last_store), e, UNKNOWN_LOCATION); > + /* Mask stores motion could crossing scalar statements with vuse > + which should be corrected. */ > > s/crossing/cross/ > That said, I'm not really sure if without some verification if such > reads are really dead it is safe to skip them and update this way. > Richard? > > + last_vuse = gimple_vuse (last_store); > + while (!scalar_vuse.is_empty ()) > + { > + stmt = scalar_vuse.pop (); > + if (gimple_vuse (stmt) != last_vuse) > + { > + gimple_set_vuse (stmt, last_vuse); > + update_stmt (stmt); > + } > + } > } > } > > Jakub
On Fri, Feb 5, 2016 at 3:54 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: > Hi All, > > Here is updated patch - I came back to move call statements also since > masked loads are presented by internal call. I also assume that for > the following simple loop > for (i = 0; i < n; i++) > if (b1[i]) > a1[i] = sqrtf(a2[i] * a2[i] + a3[i] * a3[i]); > motion must be done for all vector statements in semi-hammock including SQRT. > > Bootstrap and regression testing did not show any new failures. > Is it OK for trunk? The patch is incredibly hard to parse due to the re-indenting. Please consider sending diffs with -b. This issue exposes that you are moving (masked) stores across loads without checking aliasing. In the specific case those loads are dead and thus this is safe but in general I thought we were checking that we are using the same VUSE during the sinking operation. Thus, I'd rather have + /* Check that LHS does not have uses outside of STORE_BB. */ + res = true; + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) + { + gimple *use_stmt; + use_stmt = USE_STMT (use_p); + if (is_gimple_debug (use_stmt)) + continue; + if (gimple_bb (use_stmt) != store_bb) + { + res = false; + break; + } + } also check for the dead code case and DCE those stmts here. Like so: if (has_zero_uses (lhs)) { gsi_remove (&gsi_from, true); continue; } before the above loop. Richard. > ChangeLog: > > 2016-02-05 Yuri Rumyantsev <ysrumyan@gmail.com> > > PR tree-optimization/69652 > * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 > to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all > skipped scalar statements, introduce variable LAST_VUSE to keep > vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the > begining of current masked store processing, did source re-formatting, > skip parsing of debug gimples, stop processing if a gimple with > volatile operand has been encountered, save scalar statement > with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE > iterator, change vuse of all saved scalar statements to LAST_VUSE if > it makes sence. > > gcc/testsuite/ChangeLog: > * gcc.dg/torture/pr69652.c: New test. > > 2016-02-04 19:40 GMT+03:00 Jakub Jelinek <jakub@redhat.com>: >> On Thu, Feb 04, 2016 at 05:46:27PM +0300, Yuri Rumyantsev wrote: >>> Here is a patch that cures the issues with non-correct vuse for scalar >>> statements during code motion, i.e. if vuse of scalar statement is >>> vdef of masked store which has been sunk to new basic block, we must >>> fix it up. The patch also fixed almost all remarks pointed out by >>> Jacub. >>> >>> Bootstrapping and regression testing on v86-64 did not show any new failures. >>> Is it OK for trunk? >>> >>> ChangeLog: >>> 2016-02-04 Yuri Rumyantsev <ysrumyan@gmail.com> >>> >>> PR tree-optimization/69652 >>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >>> skipped scalar statements, introduce variable LAST_VUSE that has >>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >>> begining of current masked store processing, did source re-formatting, >>> skip parsing of debug gimples, stop processing when call or gimple >>> with volatile operand habe been encountered, save scalar statement >>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >>> iterator, change vuse of all saved scalar statements to LAST_VUSE if >>> it makes sence. >>> >>> gcc/testsuite/ChangeLog: >>> * gcc.dg/torture/pr69652.c: New test. >> >> Your mailer breaks ChangeLog formatting, so it is hard to check the >> formatting of the ChangeLog entry. >> >> diff --git a/gcc/testsuite/gcc.dg/torture/pr69652.c b/gcc/testsuite/gcc.dg/torture/pr69652.c >> new file mode 100644 >> index 0000000..91f30cf >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/torture/pr69652.c >> @@ -0,0 +1,14 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ >> +/* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ >> + >> +void fn1(double **matrix, int column, int row, int n) >> +{ >> + int k; >> + for (k = 0; k < n; k++) >> + if (matrix[row][k] != matrix[column][k]) >> + { >> + matrix[column][k] = -matrix[column][k]; >> + matrix[row][k] = matrix[row][k] - matrix[column][k]; >> + } >> +} >> \ No newline at end of file >> >> Please make sure the last line of the test is a new-line. >> >> @@ -6971,6 +6972,8 @@ optimize_mask_stores (struct loop *loop) >> gsi_next (&gsi)) >> { >> stmt = gsi_stmt (gsi); >> + if (is_gimple_debug (stmt)) >> + continue; >> if (is_gimple_call (stmt) >> && gimple_call_internal_p (stmt) >> && gimple_call_internal_fn (stmt) == IFN_MASK_STORE) >> >> This is not needed, you do something only for is_gimple_call, >> which is never true if is_gimple_debug, so the code used to be fine as is. >> >> + /* Skip debug sstatements. */ >> >> s/ss/s/ >> >> + if (is_gimple_debug (gsi_stmt (gsi))) >> + continue; >> + stmt1 = gsi_stmt (gsi); >> + /* Do not consider writing to memory,volatile and call >> >> Missing space after , >> >> + /* Skip scalar statements. */ >> + if (!VECTOR_TYPE_P (TREE_TYPE (lhs))) >> + { >> + /* If scalar statement has vuse we need to modify it >> + when another masked store will be sunk. */ >> + if (gimple_vuse (stmt1)) >> + scalar_vuse.safe_push (stmt1); >> continue; >> + } >> >> I don't think it is safe to take for granted that the scalar stmts are all >> going to be DCEd, but I could be wrong. >> >> + /* Check that LHS does not have uses outside of STORE_BB. */ >> + res = true; >> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) >> + { >> + gimple *use_stmt; >> + use_stmt = USE_STMT (use_p); >> + if (is_gimple_debug (use_stmt)) >> + continue; >> >> Ignoring debug stmts to make decision whether you move or not is >> of course the right thing to do. But IMHO you should remember if >> you saw any is_gimple_debug stmts in some bool var. >> >> + if (gimple_bb (use_stmt) != store_bb) >> + { >> + res = false; >> + break; >> + } >> + } >> + if (!res) >> + break; >> >> - if (gimple_vuse (stmt1) >> - && gimple_vuse (stmt1) != gimple_vuse (last_store)) >> - break; >> + if (gimple_vuse (stmt1) >> + && gimple_vuse (stmt1) != gimple_vuse (last_store)) >> + break; >> >> + /* Can move STMT1 to STORE_BB. */ >> + if (dump_enabled_p ()) >> + { >> + dump_printf_loc (MSG_NOTE, vect_location, >> + "Move stmt to created bb\n"); >> + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt1, 0); >> + } >> >> And if yes, invalidate them here, because the move would otherwise >> generate invalid IL. >> >> + gsi_move_before (&gsi_from, &gsi_to); >> + /* Shift GSI_TO for further insertion. */ >> + gsi_prev (&gsi_to); >> + } >> + /* Put other masked stores with the same mask to STORE_BB. */ >> + if (worklist.is_empty () >> + || gimple_call_arg (worklist.last (), 2) != mask >> + || worklist.last () != stmt1) >> + break; >> + last = worklist.pop (); >> } >> add_phi_arg (phi, gimple_vuse (last_store), e, UNKNOWN_LOCATION); >> + /* Mask stores motion could crossing scalar statements with vuse >> + which should be corrected. */ >> >> s/crossing/cross/ >> That said, I'm not really sure if without some verification if such >> reads are really dead it is safe to skip them and update this way. >> Richard? >> >> + last_vuse = gimple_vuse (last_store); >> + while (!scalar_vuse.is_empty ()) >> + { >> + stmt = scalar_vuse.pop (); >> + if (gimple_vuse (stmt) != last_vuse) >> + { >> + gimple_set_vuse (stmt, last_vuse); >> + update_stmt (stmt); >> + } >> + } >> } >> } >> >> Jakub
Thanks Richard for your comments. I changes algorithm to remove dead scalar statements as you proposed. Bootstrap and regression testing did not show any new failures on x86-64. Is it OK for trunk? Changelog: 2016-02-10 Yuri Rumyantsev <ysrumyan@gmail.com> PR tree-optimization/69652 * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 to nested loop, did source re-formatting, skip debug statements, add check on statement with volatile operand, remove dead scalar statements. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr69652.c: New test. 2016-02-09 15:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>: > On Fri, Feb 5, 2016 at 3:54 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: >> Hi All, >> >> Here is updated patch - I came back to move call statements also since >> masked loads are presented by internal call. I also assume that for >> the following simple loop >> for (i = 0; i < n; i++) >> if (b1[i]) >> a1[i] = sqrtf(a2[i] * a2[i] + a3[i] * a3[i]); >> motion must be done for all vector statements in semi-hammock including SQRT. >> >> Bootstrap and regression testing did not show any new failures. >> Is it OK for trunk? > > The patch is incredibly hard to parse due to the re-indenting. Please > consider sending > diffs with -b. > > This issue exposes that you are moving (masked) stores across loads without > checking aliasing. In the specific case those loads are dead and thus > this is safe > but in general I thought we were checking that we are using the same VUSE > during the sinking operation. > > Thus, I'd rather have > > + /* Check that LHS does not have uses outside of STORE_BB. */ > + res = true; > + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) > + { > + gimple *use_stmt; > + use_stmt = USE_STMT (use_p); > + if (is_gimple_debug (use_stmt)) > + continue; > + if (gimple_bb (use_stmt) != store_bb) > + { > + res = false; > + break; > + } > + } > > also check for the dead code case and DCE those stmts here. Like so: > > if (has_zero_uses (lhs)) > { > gsi_remove (&gsi_from, true); > continue; > } > > before the above loop. > > Richard. > >> ChangeLog: >> >> 2016-02-05 Yuri Rumyantsev <ysrumyan@gmail.com> >> >> PR tree-optimization/69652 >> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >> skipped scalar statements, introduce variable LAST_VUSE to keep >> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >> begining of current masked store processing, did source re-formatting, >> skip parsing of debug gimples, stop processing if a gimple with >> volatile operand has been encountered, save scalar statement >> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >> iterator, change vuse of all saved scalar statements to LAST_VUSE if >> it makes sence. >> >> gcc/testsuite/ChangeLog: >> * gcc.dg/torture/pr69652.c: New test. >> >> 2016-02-04 19:40 GMT+03:00 Jakub Jelinek <jakub@redhat.com>: >>> On Thu, Feb 04, 2016 at 05:46:27PM +0300, Yuri Rumyantsev wrote: >>>> Here is a patch that cures the issues with non-correct vuse for scalar >>>> statements during code motion, i.e. if vuse of scalar statement is >>>> vdef of masked store which has been sunk to new basic block, we must >>>> fix it up. The patch also fixed almost all remarks pointed out by >>>> Jacub. >>>> >>>> Bootstrapping and regression testing on v86-64 did not show any new failures. >>>> Is it OK for trunk? >>>> >>>> ChangeLog: >>>> 2016-02-04 Yuri Rumyantsev <ysrumyan@gmail.com> >>>> >>>> PR tree-optimization/69652 >>>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >>>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >>>> skipped scalar statements, introduce variable LAST_VUSE that has >>>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >>>> begining of current masked store processing, did source re-formatting, >>>> skip parsing of debug gimples, stop processing when call or gimple >>>> with volatile operand habe been encountered, save scalar statement >>>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >>>> iterator, change vuse of all saved scalar statements to LAST_VUSE if >>>> it makes sence. >>>> >>>> gcc/testsuite/ChangeLog: >>>> * gcc.dg/torture/pr69652.c: New test. >>> >>> Your mailer breaks ChangeLog formatting, so it is hard to check the >>> formatting of the ChangeLog entry. >>> >>> diff --git a/gcc/testsuite/gcc.dg/torture/pr69652.c b/gcc/testsuite/gcc.dg/torture/pr69652.c >>> new file mode 100644 >>> index 0000000..91f30cf >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/torture/pr69652.c >>> @@ -0,0 +1,14 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ >>> +/* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ >>> + >>> +void fn1(double **matrix, int column, int row, int n) >>> +{ >>> + int k; >>> + for (k = 0; k < n; k++) >>> + if (matrix[row][k] != matrix[column][k]) >>> + { >>> + matrix[column][k] = -matrix[column][k]; >>> + matrix[row][k] = matrix[row][k] - matrix[column][k]; >>> + } >>> +} >>> \ No newline at end of file >>> >>> Please make sure the last line of the test is a new-line. >>> >>> @@ -6971,6 +6972,8 @@ optimize_mask_stores (struct loop *loop) >>> gsi_next (&gsi)) >>> { >>> stmt = gsi_stmt (gsi); >>> + if (is_gimple_debug (stmt)) >>> + continue; >>> if (is_gimple_call (stmt) >>> && gimple_call_internal_p (stmt) >>> && gimple_call_internal_fn (stmt) == IFN_MASK_STORE) >>> >>> This is not needed, you do something only for is_gimple_call, >>> which is never true if is_gimple_debug, so the code used to be fine as is. >>> >>> + /* Skip debug sstatements. */ >>> >>> s/ss/s/ >>> >>> + if (is_gimple_debug (gsi_stmt (gsi))) >>> + continue; >>> + stmt1 = gsi_stmt (gsi); >>> + /* Do not consider writing to memory,volatile and call >>> >>> Missing space after , >>> >>> + /* Skip scalar statements. */ >>> + if (!VECTOR_TYPE_P (TREE_TYPE (lhs))) >>> + { >>> + /* If scalar statement has vuse we need to modify it >>> + when another masked store will be sunk. */ >>> + if (gimple_vuse (stmt1)) >>> + scalar_vuse.safe_push (stmt1); >>> continue; >>> + } >>> >>> I don't think it is safe to take for granted that the scalar stmts are all >>> going to be DCEd, but I could be wrong. >>> >>> + /* Check that LHS does not have uses outside of STORE_BB. */ >>> + res = true; >>> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) >>> + { >>> + gimple *use_stmt; >>> + use_stmt = USE_STMT (use_p); >>> + if (is_gimple_debug (use_stmt)) >>> + continue; >>> >>> Ignoring debug stmts to make decision whether you move or not is >>> of course the right thing to do. But IMHO you should remember if >>> you saw any is_gimple_debug stmts in some bool var. >>> >>> + if (gimple_bb (use_stmt) != store_bb) >>> + { >>> + res = false; >>> + break; >>> + } >>> + } >>> + if (!res) >>> + break; >>> >>> - if (gimple_vuse (stmt1) >>> - && gimple_vuse (stmt1) != gimple_vuse (last_store)) >>> - break; >>> + if (gimple_vuse (stmt1) >>> + && gimple_vuse (stmt1) != gimple_vuse (last_store)) >>> + break; >>> >>> + /* Can move STMT1 to STORE_BB. */ >>> + if (dump_enabled_p ()) >>> + { >>> + dump_printf_loc (MSG_NOTE, vect_location, >>> + "Move stmt to created bb\n"); >>> + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt1, 0); >>> + } >>> >>> And if yes, invalidate them here, because the move would otherwise >>> generate invalid IL. >>> >>> + gsi_move_before (&gsi_from, &gsi_to); >>> + /* Shift GSI_TO for further insertion. */ >>> + gsi_prev (&gsi_to); >>> + } >>> + /* Put other masked stores with the same mask to STORE_BB. */ >>> + if (worklist.is_empty () >>> + || gimple_call_arg (worklist.last (), 2) != mask >>> + || worklist.last () != stmt1) >>> + break; >>> + last = worklist.pop (); >>> } >>> add_phi_arg (phi, gimple_vuse (last_store), e, UNKNOWN_LOCATION); >>> + /* Mask stores motion could crossing scalar statements with vuse >>> + which should be corrected. */ >>> >>> s/crossing/cross/ >>> That said, I'm not really sure if without some verification if such >>> reads are really dead it is safe to skip them and update this way. >>> Richard? >>> >>> + last_vuse = gimple_vuse (last_store); >>> + while (!scalar_vuse.is_empty ()) >>> + { >>> + stmt = scalar_vuse.pop (); >>> + if (gimple_vuse (stmt) != last_vuse) >>> + { >>> + gimple_set_vuse (stmt, last_vuse); >>> + update_stmt (stmt); >>> + } >>> + } >>> } >>> } >>> >>> Jakub
On Wed, Feb 10, 2016 at 11:26 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: > Thanks Richard for your comments. > I changes algorithm to remove dead scalar statements as you proposed. > > Bootstrap and regression testing did not show any new failures on x86-64. > Is it OK for trunk? Ok. Thanks, Richard. > Changelog: > 2016-02-10 Yuri Rumyantsev <ysrumyan@gmail.com> > > PR tree-optimization/69652 > * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 > to nested loop, did source re-formatting, skip debug statements, > add check on statement with volatile operand, remove dead scalar > statements. > > gcc/testsuite/ChangeLog: > * gcc.dg/torture/pr69652.c: New test. > > > 2016-02-09 15:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>: >> On Fri, Feb 5, 2016 at 3:54 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: >>> Hi All, >>> >>> Here is updated patch - I came back to move call statements also since >>> masked loads are presented by internal call. I also assume that for >>> the following simple loop >>> for (i = 0; i < n; i++) >>> if (b1[i]) >>> a1[i] = sqrtf(a2[i] * a2[i] + a3[i] * a3[i]); >>> motion must be done for all vector statements in semi-hammock including SQRT. >>> >>> Bootstrap and regression testing did not show any new failures. >>> Is it OK for trunk? >> >> The patch is incredibly hard to parse due to the re-indenting. Please >> consider sending >> diffs with -b. >> >> This issue exposes that you are moving (masked) stores across loads without >> checking aliasing. In the specific case those loads are dead and thus >> this is safe >> but in general I thought we were checking that we are using the same VUSE >> during the sinking operation. >> >> Thus, I'd rather have >> >> + /* Check that LHS does not have uses outside of STORE_BB. */ >> + res = true; >> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) >> + { >> + gimple *use_stmt; >> + use_stmt = USE_STMT (use_p); >> + if (is_gimple_debug (use_stmt)) >> + continue; >> + if (gimple_bb (use_stmt) != store_bb) >> + { >> + res = false; >> + break; >> + } >> + } >> >> also check for the dead code case and DCE those stmts here. Like so: >> >> if (has_zero_uses (lhs)) >> { >> gsi_remove (&gsi_from, true); >> continue; >> } >> >> before the above loop. >> >> Richard. >> >>> ChangeLog: >>> >>> 2016-02-05 Yuri Rumyantsev <ysrumyan@gmail.com> >>> >>> PR tree-optimization/69652 >>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >>> skipped scalar statements, introduce variable LAST_VUSE to keep >>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >>> begining of current masked store processing, did source re-formatting, >>> skip parsing of debug gimples, stop processing if a gimple with >>> volatile operand has been encountered, save scalar statement >>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >>> iterator, change vuse of all saved scalar statements to LAST_VUSE if >>> it makes sence. >>> >>> gcc/testsuite/ChangeLog: >>> * gcc.dg/torture/pr69652.c: New test. >>> >>> 2016-02-04 19:40 GMT+03:00 Jakub Jelinek <jakub@redhat.com>: >>>> On Thu, Feb 04, 2016 at 05:46:27PM +0300, Yuri Rumyantsev wrote: >>>>> Here is a patch that cures the issues with non-correct vuse for scalar >>>>> statements during code motion, i.e. if vuse of scalar statement is >>>>> vdef of masked store which has been sunk to new basic block, we must >>>>> fix it up. The patch also fixed almost all remarks pointed out by >>>>> Jacub. >>>>> >>>>> Bootstrapping and regression testing on v86-64 did not show any new failures. >>>>> Is it OK for trunk? >>>>> >>>>> ChangeLog: >>>>> 2016-02-04 Yuri Rumyantsev <ysrumyan@gmail.com> >>>>> >>>>> PR tree-optimization/69652 >>>>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >>>>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >>>>> skipped scalar statements, introduce variable LAST_VUSE that has >>>>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >>>>> begining of current masked store processing, did source re-formatting, >>>>> skip parsing of debug gimples, stop processing when call or gimple >>>>> with volatile operand habe been encountered, save scalar statement >>>>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >>>>> iterator, change vuse of all saved scalar statements to LAST_VUSE if >>>>> it makes sence. >>>>> >>>>> gcc/testsuite/ChangeLog: >>>>> * gcc.dg/torture/pr69652.c: New test. >>>> >>>> Your mailer breaks ChangeLog formatting, so it is hard to check the >>>> formatting of the ChangeLog entry. >>>> >>>> diff --git a/gcc/testsuite/gcc.dg/torture/pr69652.c b/gcc/testsuite/gcc.dg/torture/pr69652.c >>>> new file mode 100644 >>>> index 0000000..91f30cf >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.dg/torture/pr69652.c >>>> @@ -0,0 +1,14 @@ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ >>>> +/* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ >>>> + >>>> +void fn1(double **matrix, int column, int row, int n) >>>> +{ >>>> + int k; >>>> + for (k = 0; k < n; k++) >>>> + if (matrix[row][k] != matrix[column][k]) >>>> + { >>>> + matrix[column][k] = -matrix[column][k]; >>>> + matrix[row][k] = matrix[row][k] - matrix[column][k]; >>>> + } >>>> +} >>>> \ No newline at end of file >>>> >>>> Please make sure the last line of the test is a new-line. >>>> >>>> @@ -6971,6 +6972,8 @@ optimize_mask_stores (struct loop *loop) >>>> gsi_next (&gsi)) >>>> { >>>> stmt = gsi_stmt (gsi); >>>> + if (is_gimple_debug (stmt)) >>>> + continue; >>>> if (is_gimple_call (stmt) >>>> && gimple_call_internal_p (stmt) >>>> && gimple_call_internal_fn (stmt) == IFN_MASK_STORE) >>>> >>>> This is not needed, you do something only for is_gimple_call, >>>> which is never true if is_gimple_debug, so the code used to be fine as is. >>>> >>>> + /* Skip debug sstatements. */ >>>> >>>> s/ss/s/ >>>> >>>> + if (is_gimple_debug (gsi_stmt (gsi))) >>>> + continue; >>>> + stmt1 = gsi_stmt (gsi); >>>> + /* Do not consider writing to memory,volatile and call >>>> >>>> Missing space after , >>>> >>>> + /* Skip scalar statements. */ >>>> + if (!VECTOR_TYPE_P (TREE_TYPE (lhs))) >>>> + { >>>> + /* If scalar statement has vuse we need to modify it >>>> + when another masked store will be sunk. */ >>>> + if (gimple_vuse (stmt1)) >>>> + scalar_vuse.safe_push (stmt1); >>>> continue; >>>> + } >>>> >>>> I don't think it is safe to take for granted that the scalar stmts are all >>>> going to be DCEd, but I could be wrong. >>>> >>>> + /* Check that LHS does not have uses outside of STORE_BB. */ >>>> + res = true; >>>> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) >>>> + { >>>> + gimple *use_stmt; >>>> + use_stmt = USE_STMT (use_p); >>>> + if (is_gimple_debug (use_stmt)) >>>> + continue; >>>> >>>> Ignoring debug stmts to make decision whether you move or not is >>>> of course the right thing to do. But IMHO you should remember if >>>> you saw any is_gimple_debug stmts in some bool var. >>>> >>>> + if (gimple_bb (use_stmt) != store_bb) >>>> + { >>>> + res = false; >>>> + break; >>>> + } >>>> + } >>>> + if (!res) >>>> + break; >>>> >>>> - if (gimple_vuse (stmt1) >>>> - && gimple_vuse (stmt1) != gimple_vuse (last_store)) >>>> - break; >>>> + if (gimple_vuse (stmt1) >>>> + && gimple_vuse (stmt1) != gimple_vuse (last_store)) >>>> + break; >>>> >>>> + /* Can move STMT1 to STORE_BB. */ >>>> + if (dump_enabled_p ()) >>>> + { >>>> + dump_printf_loc (MSG_NOTE, vect_location, >>>> + "Move stmt to created bb\n"); >>>> + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt1, 0); >>>> + } >>>> >>>> And if yes, invalidate them here, because the move would otherwise >>>> generate invalid IL. >>>> >>>> + gsi_move_before (&gsi_from, &gsi_to); >>>> + /* Shift GSI_TO for further insertion. */ >>>> + gsi_prev (&gsi_to); >>>> + } >>>> + /* Put other masked stores with the same mask to STORE_BB. */ >>>> + if (worklist.is_empty () >>>> + || gimple_call_arg (worklist.last (), 2) != mask >>>> + || worklist.last () != stmt1) >>>> + break; >>>> + last = worklist.pop (); >>>> } >>>> add_phi_arg (phi, gimple_vuse (last_store), e, UNKNOWN_LOCATION); >>>> + /* Mask stores motion could crossing scalar statements with vuse >>>> + which should be corrected. */ >>>> >>>> s/crossing/cross/ >>>> That said, I'm not really sure if without some verification if such >>>> reads are really dead it is safe to skip them and update this way. >>>> Richard? >>>> >>>> + last_vuse = gimple_vuse (last_store); >>>> + while (!scalar_vuse.is_empty ()) >>>> + { >>>> + stmt = scalar_vuse.pop (); >>>> + if (gimple_vuse (stmt) != last_vuse) >>>> + { >>>> + gimple_set_vuse (stmt, last_vuse); >>>> + update_stmt (stmt); >>>> + } >>>> + } >>>> } >>>> } >>>> >>>> Jakub
On Wed, Feb 10, 2016 at 2:26 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: > Thanks Richard for your comments. > I changes algorithm to remove dead scalar statements as you proposed. > > Bootstrap and regression testing did not show any new failures on x86-64. > Is it OK for trunk? > > Changelog: > 2016-02-10 Yuri Rumyantsev <ysrumyan@gmail.com> > > PR tree-optimization/69652 > * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 > to nested loop, did source re-formatting, skip debug statements, > add check on statement with volatile operand, remove dead scalar > statements. > > gcc/testsuite/ChangeLog: > * gcc.dg/torture/pr69652.c: New test. > > /* { dg-do compile } */ /* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Any particular reason why it should be in gcc.dg/torture, not in gcc.dg/vect? -O2 here is overridden by -Ox. /* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */
This test simply checks that ICE is not occurred but not any vectorization issues. Best regards. Yuri. 2016-02-28 20:29 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>: > On Wed, Feb 10, 2016 at 2:26 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: >> Thanks Richard for your comments. >> I changes algorithm to remove dead scalar statements as you proposed. >> >> Bootstrap and regression testing did not show any new failures on x86-64. >> Is it OK for trunk? >> >> Changelog: >> 2016-02-10 Yuri Rumyantsev <ysrumyan@gmail.com> >> >> PR tree-optimization/69652 >> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >> to nested loop, did source re-formatting, skip debug statements, >> add check on statement with volatile operand, remove dead scalar >> statements. >> >> gcc/testsuite/ChangeLog: >> * gcc.dg/torture/pr69652.c: New test. >> >> > > /* { dg-do compile } */ > /* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Any particular reason why it should be in gcc.dg/torture, not in > gcc.dg/vect? -O2 here is overridden by -Ox. > > /* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ > > > > -- > H.J.
On Mon, Feb 29, 2016 at 3:53 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: > This test simply checks that ICE is not occurred but not any > vectorization issues. Can we remove /* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ then? H.J. > Best regards. > Yuri. > > 2016-02-28 20:29 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>: >> On Wed, Feb 10, 2016 at 2:26 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: >>> Thanks Richard for your comments. >>> I changes algorithm to remove dead scalar statements as you proposed. >>> >>> Bootstrap and regression testing did not show any new failures on x86-64. >>> Is it OK for trunk? >>> >>> Changelog: >>> 2016-02-10 Yuri Rumyantsev <ysrumyan@gmail.com> >>> >>> PR tree-optimization/69652 >>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >>> to nested loop, did source re-formatting, skip debug statements, >>> add check on statement with volatile operand, remove dead scalar >>> statements. >>> >>> gcc/testsuite/ChangeLog: >>> * gcc.dg/torture/pr69652.c: New test. >>> >>> >> >> /* { dg-do compile } */ >> /* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ >> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> >> Any particular reason why it should be in gcc.dg/torture, not in >> gcc.dg/vect? -O2 here is overridden by -Ox. >> >> /* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ >> >> >> >> -- >> H.J.
On Mon, Feb 29, 2016 at 05:01:52AM -0800, H.J. Lu wrote: > On Mon, Feb 29, 2016 at 3:53 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: > > This test simply checks that ICE is not occurred but not any > > vectorization issues. > > Can we remove > > /* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ > > then? Well, I bet -ffast-math -ftree-vectorize are needed to reproduce the ICE with broken compiler. But, e.g. -O0 -ftree-vectorize doesn't make much sense to test. So, either put it into gcc.dg/pr69652.c with the above mentioned options, or into gcc.dg/vect/ with dg-additional-options "-ffast-math". Jakub
Jacub! Here is patch and ChangeLog to move pr69652.c to /vect directory. Is it OK for trunk. Thanks. Yuri. ChangeLog: 2016-02-29 Yuri Rumyantsev <ysrumyan@gmail.com> PR tree-optimization/69652 gcc/testsuite/ChangeLog: * gcc.dg/torture/pr69652.c: Delete test. * gcc.dg/vect/pr69652.c: New test. 2016-02-29 16:05 GMT+03:00 Jakub Jelinek <jakub@redhat.com>: > On Mon, Feb 29, 2016 at 05:01:52AM -0800, H.J. Lu wrote: >> On Mon, Feb 29, 2016 at 3:53 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote: >> > This test simply checks that ICE is not occurred but not any >> > vectorization issues. >> >> Can we remove >> >> /* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ >> >> then? > > Well, I bet -ffast-math -ftree-vectorize are needed to reproduce the ICE > with broken compiler. But, e.g. -O0 -ftree-vectorize doesn't make much > sense to test. > So, either put it into gcc.dg/pr69652.c with the above mentioned options, > or into gcc.dg/vect/ with dg-additional-options "-ffast-math". > > Jakub
On Mon, Feb 29, 2016 at 05:03:38PM +0300, Yuri Rumyantsev wrote: > 2016-02-29 Yuri Rumyantsev <ysrumyan@gmail.com> > > PR tree-optimization/69652 > gcc/testsuite/ChangeLog: > * gcc.dg/torture/pr69652.c: Delete test. > * gcc.dg/vect/pr69652.c: New test. Ok, with: /* { dg-additional-options "-mavx -ffast-math" { target { i?86-*-* x86_64-*-* } } } */ changed to: /* { dg-additional-options "-ffast-math" } */ /* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ (no reason not to use it in all targets), and if you verify the test fails if you revert the compiler fix and passes otherwise. Jakub
diff --git a/gcc/testsuite/gcc.dg/torture/pr69652.c b/gcc/testsuite/gcc.dg/torture/pr69652.c new file mode 100644 index 0000000..91f30cf --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr69652.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ +/* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ + +void fn1(double **matrix, int column, int row, int n) +{ + int k; + for (k = 0; k < n; k++) + if (matrix[row][k] != matrix[column][k]) + { + matrix[column][k] = -matrix[column][k]; + matrix[row][k] = matrix[row][k] - matrix[column][k]; + } +} \ No newline at end of file