Message ID | 20210902095739.192089-1-guojiufu@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | avoid transform at run until wrap comparesion | expand |
On Thu, 2 Sep 2021, Jiufu Guo wrote: > When transform > {iv0.base, iv0.step} LT/LE {iv1.base, iv1.step} > to > {iv0.base, iv0.step - iv1.step} LT/LE {iv1.base, 0} > > There would be error if 'iv0.step - iv1.step' in negative, > for which means run until wrap/overflow. > > For example: > {1, +, 1} <= {4, +, 3} => {1, +, -2} <= {4, +, 0} > > This patch avoid this kind transform. > > Bootstrap and regtest pass on ppc64le. > Is this ok for trunk? This looks like PR100740, see the discussion starting at https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571570.html We seem to be at a dead end figuring what's exactly required to make the transform valid and I have my doubts that arguing with overflow we're not running in circles. My last attempt was + if (code != NE_EXPR) + { + if (TREE_CODE (step) != INTEGER_CST) + return false; + if (!iv0->no_overflow || !iv1->no_overflow) + return false; + /* The new IV does not overflow if the step has the same + sign and is less in magnitude. */ + if (tree_int_cst_sign_bit (iv0->step) != tree_int_cst_sign_bit (step) + || wi::geu_p (wi::abs (wi::to_wide (step)), + wi::abs (wi::to_wide (iv0->step)))) + return false; + } where your patch at least misses { 1, +, -1 } < { -3, + , -3 } transforming to { 1, +, +2 } < -3, thus a positive step but we're still iterating unti wrap? That is, the sign of the combined step cannot really be the issue at hand. Bin argued my condition is too strict and I agreed, see https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574334.html which is the last mail in the thread sofar (still without a fix :/) Somewhere it was said that we eventually should simply put preconditions into ->assumptions rather than trying to check ->no_overflow and friends, not sure if that's really applicable here. Richard. > BR. > Jiufu > > gcc/ChangeLog: > > 2021-09-02 Jiufu Guo <guojiufu@linux.ibm.com> > > PR tree-optimization/102131 > * tree-ssa-loop-niter.c (number_of_iterations_cond): > Avoid transform until wrap condition > > gcc/testsuite/ChangeLog: > > 2021-09-02 Jiufu Guo <guojiufu@linux.ibm.com> > > PR tree-optimization/102131 > * gcc.dg/pr102131.c: New test. > > --- > gcc/tree-ssa-loop-niter.c | 18 +++++++++ > gcc/testsuite/gcc.dg/pr102131.c | 69 +++++++++++++++++++++++++++++++++ > 2 files changed, 87 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/pr102131.c > > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c > index 7af92d1c893..52ce517af89 100644 > --- a/gcc/tree-ssa-loop-niter.c > +++ b/gcc/tree-ssa-loop-niter.c > @@ -1866,6 +1866,24 @@ number_of_iterations_cond (class loop *loop, > || !iv0->no_overflow || !iv1->no_overflow)) > return false; > > + /* GT/GE has been transformed to LT/LE already. > + cmp_code could be LT, LE or NE > + > + For LE/LT transform > + {iv0.base, iv0.step} LT/LE {iv1.base, iv1.step} > + to > + {iv0.base, iv0.step - iv1.step} LT/LE {iv1.base, 0} > + Negative iv0.step - iv1.step means decreasing until wrap, > + then the transform is not accurate. > + > + For example: > + {1, +, 1} <= {4, +, 3} > + is not same with > + {1, +, -2} <= {4, +, 0} > + */ > + if ((code == LE_EXPR || code == LT_EXPR) && tree_int_cst_sign_bit (step)) > + return false; > + > iv0->step = step; > if (!POINTER_TYPE_P (type)) > iv0->no_overflow = false; > diff --git a/gcc/testsuite/gcc.dg/pr102131.c b/gcc/testsuite/gcc.dg/pr102131.c > new file mode 100644 > index 00000000000..0fcfaa132da > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr102131.c > @@ -0,0 +1,69 @@ > +/* { dg-do run } */ > + > +unsigned int a; > +int > +fun1 () > +{ > + unsigned b = 0; > + int c = 1; > + for (; b < 3; b++) > + { > + while (c < b) > + __builtin_abort (); > + for (a = 0; a < 3; a++) > + c++; > + } > + return 0; > +} > + > +unsigned b; > +int > +fun2 () > +{ > + unsigned c = 0; > + for (a = 0; a < 2; a++) > + for (b = 0; b < 2; b++) > + if (++c < a) > + __builtin_abort (); > + return 0; > +} > + > +int > +fun3 (void) > +{ > + unsigned a, b, c = 0; > + for (a = 0; a < 10; a++) > + { > + for (b = 0; b < 2; b++) > + { > + c++; > + if (c < a) > + { > + __builtin_abort (); > + } > + } > + } > + return 0; > +} > + > +int > +fun4 () > +{ > + unsigned i; > + for (i = 0; i < 3; ++i) > + { > + if (i > i * 2) > + __builtin_abort (); > + } > + return 0; > +} > + > +int > +main () > +{ > + fun1 (); > + fun2 (); > + fun3 (); > + fun4 (); > + return 0; > +} >
On Thu, Sep 2, 2021 at 6:18 PM Richard Biener <rguenther@suse.de> wrote: > > On Thu, 2 Sep 2021, Jiufu Guo wrote: > > > When transform > > {iv0.base, iv0.step} LT/LE {iv1.base, iv1.step} > > to > > {iv0.base, iv0.step - iv1.step} LT/LE {iv1.base, 0} > > > > There would be error if 'iv0.step - iv1.step' in negative, > > for which means run until wrap/overflow. > > > > For example: > > {1, +, 1} <= {4, +, 3} => {1, +, -2} <= {4, +, 0} > > > > This patch avoid this kind transform. > > > > Bootstrap and regtest pass on ppc64le. > > Is this ok for trunk? > > This looks like PR100740, see the discussion starting at > https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571570.html > > We seem to be at a dead end figuring what's exactly required > to make the transform valid and I have my doubts that arguing > with overflow we're not running in circles. > > My last attempt was > > + if (code != NE_EXPR) > + { > + if (TREE_CODE (step) != INTEGER_CST) > + return false; > + if (!iv0->no_overflow || !iv1->no_overflow) > + return false; > + /* The new IV does not overflow if the step has the same > + sign and is less in magnitude. */ > + if (tree_int_cst_sign_bit (iv0->step) != tree_int_cst_sign_bit > (step) > + || wi::geu_p (wi::abs (wi::to_wide (step)), > + wi::abs (wi::to_wide (iv0->step)))) > + return false; > + } > > where your patch at least misses { 1, +, -1 } < { -3, + , -3 } > transforming to { 1, +, +2 } < -3, thus a positive step but > we're still iterating unti wrap? That is, the sign of the > combined step cannot really be the issue at hand. > > Bin argued my condition is too strict and I agreed, see > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574334.html > which is the last mail in the thread sofar (still without a fix :/) > > Somewhere it was said that we eventually should simply put > preconditions into ->assumptions rather than trying to > check ->no_overflow and friends, not sure if that's really I did some experiments which can fix the PRs, however it causes new failures in graphite possibly because of missing cases. However, I didn't have time looking into graphite tests back in time. Thanks, bin > applicable here. > > Richard. > > > BR. > > Jiufu > > > > gcc/ChangeLog: > > > > 2021-09-02 Jiufu Guo <guojiufu@linux.ibm.com> > > > > PR tree-optimization/102131 > > * tree-ssa-loop-niter.c (number_of_iterations_cond): > > Avoid transform until wrap condition > > > > gcc/testsuite/ChangeLog: > > > > 2021-09-02 Jiufu Guo <guojiufu@linux.ibm.com> > > > > PR tree-optimization/102131 > > * gcc.dg/pr102131.c: New test. > > > > --- > > gcc/tree-ssa-loop-niter.c | 18 +++++++++ > > gcc/testsuite/gcc.dg/pr102131.c | 69 +++++++++++++++++++++++++++++++++ > > 2 files changed, 87 insertions(+) > > create mode 100644 gcc/testsuite/gcc.dg/pr102131.c > > > > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c > > index 7af92d1c893..52ce517af89 100644 > > --- a/gcc/tree-ssa-loop-niter.c > > +++ b/gcc/tree-ssa-loop-niter.c > > @@ -1866,6 +1866,24 @@ number_of_iterations_cond (class loop *loop, > > || !iv0->no_overflow || !iv1->no_overflow)) > > return false; > > > > + /* GT/GE has been transformed to LT/LE already. > > + cmp_code could be LT, LE or NE > > + > > + For LE/LT transform > > + {iv0.base, iv0.step} LT/LE {iv1.base, iv1.step} > > + to > > + {iv0.base, iv0.step - iv1.step} LT/LE {iv1.base, 0} > > + Negative iv0.step - iv1.step means decreasing until wrap, > > + then the transform is not accurate. > > + > > + For example: > > + {1, +, 1} <= {4, +, 3} > > + is not same with > > + {1, +, -2} <= {4, +, 0} > > + */ > > + if ((code == LE_EXPR || code == LT_EXPR) && tree_int_cst_sign_bit (step)) > > + return false; > > + > > iv0->step = step; > > if (!POINTER_TYPE_P (type)) > > iv0->no_overflow = false; > > diff --git a/gcc/testsuite/gcc.dg/pr102131.c b/gcc/testsuite/gcc.dg/pr102131.c > > new file mode 100644 > > index 00000000000..0fcfaa132da > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/pr102131.c > > @@ -0,0 +1,69 @@ > > +/* { dg-do run } */ > > + > > +unsigned int a; > > +int > > +fun1 () > > +{ > > + unsigned b = 0; > > + int c = 1; > > + for (; b < 3; b++) > > + { > > + while (c < b) > > + __builtin_abort (); > > + for (a = 0; a < 3; a++) > > + c++; > > + } > > + return 0; > > +} > > + > > +unsigned b; > > +int > > +fun2 () > > +{ > > + unsigned c = 0; > > + for (a = 0; a < 2; a++) > > + for (b = 0; b < 2; b++) > > + if (++c < a) > > + __builtin_abort (); > > + return 0; > > +} > > + > > +int > > +fun3 (void) > > +{ > > + unsigned a, b, c = 0; > > + for (a = 0; a < 10; a++) > > + { > > + for (b = 0; b < 2; b++) > > + { > > + c++; > > + if (c < a) > > + { > > + __builtin_abort (); > > + } > > + } > > + } > > + return 0; > > +} > > + > > +int > > +fun4 () > > +{ > > + unsigned i; > > + for (i = 0; i < 3; ++i) > > + { > > + if (i > i * 2) > > + __builtin_abort (); > > + } > > + return 0; > > +} > > + > > +int > > +main () > > +{ > > + fun1 (); > > + fun2 (); > > + fun3 (); > > + fun4 (); > > + return 0; > > +} > > > > -- > Richard Biener <rguenther@suse.de> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
Richard Biener <rguenther@suse.de> writes: > On Thu, 2 Sep 2021, Jiufu Guo wrote: > >> When transform >> {iv0.base, iv0.step} LT/LE {iv1.base, iv1.step} >> to >> {iv0.base, iv0.step - iv1.step} LT/LE {iv1.base, 0} >> >> There would be error if 'iv0.step - iv1.step' in negative, >> for which means run until wrap/overflow. >> >> For example: >> {1, +, 1} <= {4, +, 3} => {1, +, -2} <= {4, +, 0} >> >> This patch avoid this kind transform. >> >> Bootstrap and regtest pass on ppc64le. >> Is this ok for trunk? > > This looks like PR100740, see the discussion starting at > https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571570.html Oh, thanks! > > We seem to be at a dead end figuring what's exactly required > to make the transform valid and I have my doubts that arguing > with overflow we're not running in circles. > > My last attempt was > > + if (code != NE_EXPR) > + { > + if (TREE_CODE (step) != INTEGER_CST) > + return false; > + if (!iv0->no_overflow || !iv1->no_overflow) > + return false; > + /* The new IV does not overflow if the step has the same > + sign and is less in magnitude. */ > + if (tree_int_cst_sign_bit (iv0->step) != tree_int_cst_sign_bit > (step) > + || wi::geu_p (wi::abs (wi::to_wide (step)), > + wi::abs (wi::to_wide (iv0->step)))) > + return false; > + } > > where your patch at least misses { 1, +, -1 } < { -3, + , -3 } > transforming to { 1, +, +2 } < -3, thus a positive step but > we're still iterating unti wrap? That is, the sign of the > combined step cannot really be the issue at hand. hmm, this transform may be invalid for a few cases. If the "iv0->step - iv1->step" is possitive, there may be some cases are not valid on LE/LT, like the example you said (or {0, +, 5} < {MAX - 30, 3}, iv0 walks faster, but iv1 overflow early). Also it maybe circles or inifinite occur on NE cases. Similary with this patch to check if combined STEP is negative, I find in the links, there is also talking about 'if (tree_int_cst_lt (iv0->step, iv1->step))' For this kind of cases LE/LT, the transformation seems always invalid. right? > > Bin argued my condition is too strict and I agreed, see > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574334.html > which is the last mail in the thread sofar (still without a fix :/) > > Somewhere it was said that we eventually should simply put > preconditions into ->assumptions rather than trying to > check ->no_overflow and friends, not sure if that's really > applicable here. Yeap, adding to ->assumptions could help some kind of cases, the assumption may include "checking on iv0.base/iv1.base". Like the example "{0, +, 5} < {MAX - 30, 3}", it become "{0, +, 2} < {MAX - 30, 0}", this calls number_of_iterations_lt and get a niter, which is not same with the real niter (the original real niter maybe 10: 30/3 or less if iv1 does not overflow). For, "{base0, +, 5} < {base1, 3}", we may get a niter from "{base0, +, 0} < {base1, 3}", and then use 'assumption' which checking if the niter is valid. And for "NE_EXPR", the assumption may be more complicate on possible circles/inifinite loops. BR. Jiufu > > Richard. > >> BR. >> Jiufu >> >> gcc/ChangeLog: >> >> 2021-09-02 Jiufu Guo <guojiufu@linux.ibm.com> >> >> PR tree-optimization/102131 >> * tree-ssa-loop-niter.c (number_of_iterations_cond): >> Avoid transform until wrap condition >> >> gcc/testsuite/ChangeLog: >> >> 2021-09-02 Jiufu Guo <guojiufu@linux.ibm.com> >> >> PR tree-optimization/102131 >> * gcc.dg/pr102131.c: New test. >> >> --- >> gcc/tree-ssa-loop-niter.c | 18 +++++++++ >> gcc/testsuite/gcc.dg/pr102131.c | 69 +++++++++++++++++++++++++++++++++ >> 2 files changed, 87 insertions(+) >> create mode 100644 gcc/testsuite/gcc.dg/pr102131.c >> >> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c >> index 7af92d1c893..52ce517af89 100644 >> --- a/gcc/tree-ssa-loop-niter.c >> +++ b/gcc/tree-ssa-loop-niter.c >> @@ -1866,6 +1866,24 @@ number_of_iterations_cond (class loop *loop, >> || !iv0->no_overflow || !iv1->no_overflow)) >> return false; >> >> + /* GT/GE has been transformed to LT/LE already. >> + cmp_code could be LT, LE or NE >> + >> + For LE/LT transform >> + {iv0.base, iv0.step} LT/LE {iv1.base, iv1.step} >> + to >> + {iv0.base, iv0.step - iv1.step} LT/LE {iv1.base, 0} >> + Negative iv0.step - iv1.step means decreasing until wrap, >> + then the transform is not accurate. >> + >> + For example: >> + {1, +, 1} <= {4, +, 3} >> + is not same with >> + {1, +, -2} <= {4, +, 0} >> + */ >> + if ((code == LE_EXPR || code == LT_EXPR) && tree_int_cst_sign_bit (step)) >> + return false; >> + >> iv0->step = step; >> if (!POINTER_TYPE_P (type)) >> iv0->no_overflow = false; >> diff --git a/gcc/testsuite/gcc.dg/pr102131.c b/gcc/testsuite/gcc.dg/pr102131.c >> new file mode 100644 >> index 00000000000..0fcfaa132da >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/pr102131.c >> @@ -0,0 +1,69 @@ >> +/* { dg-do run } */ >> + >> +unsigned int a; >> +int >> +fun1 () >> +{ >> + unsigned b = 0; >> + int c = 1; >> + for (; b < 3; b++) >> + { >> + while (c < b) >> + __builtin_abort (); >> + for (a = 0; a < 3; a++) >> + c++; >> + } >> + return 0; >> +} >> + >> +unsigned b; >> +int >> +fun2 () >> +{ >> + unsigned c = 0; >> + for (a = 0; a < 2; a++) >> + for (b = 0; b < 2; b++) >> + if (++c < a) >> + __builtin_abort (); >> + return 0; >> +} >> + >> +int >> +fun3 (void) >> +{ >> + unsigned a, b, c = 0; >> + for (a = 0; a < 10; a++) >> + { >> + for (b = 0; b < 2; b++) >> + { >> + c++; >> + if (c < a) >> + { >> + __builtin_abort (); >> + } >> + } >> + } >> + return 0; >> +} >> + >> +int >> +fun4 () >> +{ >> + unsigned i; >> + for (i = 0; i < 3; ++i) >> + { >> + if (i > i * 2) >> + __builtin_abort (); >> + } >> + return 0; >> +} >> + >> +int >> +main () >> +{ >> + fun1 (); >> + fun2 (); >> + fun3 (); >> + fun4 (); >> + return 0; >> +} >>
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 7af92d1c893..52ce517af89 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -1866,6 +1866,24 @@ number_of_iterations_cond (class loop *loop, || !iv0->no_overflow || !iv1->no_overflow)) return false; + /* GT/GE has been transformed to LT/LE already. + cmp_code could be LT, LE or NE + + For LE/LT transform + {iv0.base, iv0.step} LT/LE {iv1.base, iv1.step} + to + {iv0.base, iv0.step - iv1.step} LT/LE {iv1.base, 0} + Negative iv0.step - iv1.step means decreasing until wrap, + then the transform is not accurate. + + For example: + {1, +, 1} <= {4, +, 3} + is not same with + {1, +, -2} <= {4, +, 0} + */ + if ((code == LE_EXPR || code == LT_EXPR) && tree_int_cst_sign_bit (step)) + return false; + iv0->step = step; if (!POINTER_TYPE_P (type)) iv0->no_overflow = false; diff --git a/gcc/testsuite/gcc.dg/pr102131.c b/gcc/testsuite/gcc.dg/pr102131.c new file mode 100644 index 00000000000..0fcfaa132da --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr102131.c @@ -0,0 +1,69 @@ +/* { dg-do run } */ + +unsigned int a; +int +fun1 () +{ + unsigned b = 0; + int c = 1; + for (; b < 3; b++) + { + while (c < b) + __builtin_abort (); + for (a = 0; a < 3; a++) + c++; + } + return 0; +} + +unsigned b; +int +fun2 () +{ + unsigned c = 0; + for (a = 0; a < 2; a++) + for (b = 0; b < 2; b++) + if (++c < a) + __builtin_abort (); + return 0; +} + +int +fun3 (void) +{ + unsigned a, b, c = 0; + for (a = 0; a < 10; a++) + { + for (b = 0; b < 2; b++) + { + c++; + if (c < a) + { + __builtin_abort (); + } + } + } + return 0; +} + +int +fun4 () +{ + unsigned i; + for (i = 0; i < 3; ++i) + { + if (i > i * 2) + __builtin_abort (); + } + return 0; +} + +int +main () +{ + fun1 (); + fun2 (); + fun3 (); + fun4 (); + return 0; +}