Message ID | alpine.LNX.2.20.13.1707182003570.25203@monopod.intra.ispras.ru |
---|---|
State | New |
Headers | show |
On Tue, Jul 18, 2017 at 7:07 PM, Alexander Monakov <amonakov@ispras.ru> wrote: > On Mon, 17 Jul 2017, Alexander Monakov wrote: >> On Mon, 17 Jul 2017, Marc Glisse wrote: >> > > +/* Combine successive multiplications. Similar to above, but handling >> > > + overflow is different. */ >> > > +(simplify >> > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) >> > > + (with { >> > > + bool overflow_p; >> > > + wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p); >> > > + } >> > > + (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type)) >> > >> > I wonder if there are cases where this would cause trouble for saturating >> > integers. The only case I can think of is when @2 is -1, but that's likely >> > simplified to NEGATE_EXPR first. >> >> Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for >> either saturating or sanitized types, just like in the first patch. I think >> wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))' >> works, since as you say it should become a negate/zero anyway? > > Updated patch: > > * match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2). > testsuite: > * gcc.dg/tree-ssa/assoc-2.c: Enhance. > * gcc.dg/tree-ssa/slsr-4.c: Adjust. > > diff --git a/gcc/match.pd b/gcc/match.pd > index 36045f1..0bb5541 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)) > { build_zero_cst (type); }))))) > > +/* Combine successive multiplications. Similar to above, but handling > + overflow is different. */ > +(simplify > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) > + /* More specific rules can handle 0 and -1; skip them here to avoid > + wrong transformations for sanitized and saturating types. */ > + (if (!integer_zerop (@2) && !integer_minus_onep (@2)) I think integer_zerop (@2) should never happen here if you order the pattern after (simplify (mult @0 integer_zerop@1) @1) which I think you do. Why's @1 == -1 ok when sanitizing but not @2 == -1? If you rely on /* Transform x * -1 into -x. */ (simplify (mult @0 integer_minus_onep) (negate @0)) then you need to move that pattern above yours (there seem to be a bunch of related ones like 0 - @1 -> -@1 to move earlier as well). That would leave us with the case of saturating types (the above transforms doesn't care for those). So unless you can come up with a testcase that breaks I'd remove the integer_zerop/integer_minus_onep checks. > + (with { > + bool overflow_p; > + wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p); > + } > + (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type)) I think overflow in the constant multiplication is ok unless TYPE_OVERFLOW_SANITIZED (and can we have a ubsan testcase for that that would otherwise fail?). It's not that we introduce new overflow here. Ok with those changes. Thanks, Richard. > + (mult @0 { wide_int_to_tree (type, mul); }))))) > + > /* Optimize A / A to 1.0 if we don't care about > NaNs or Infinities. */ > (simplify > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c > index a92c882..cc0e9d4 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c > @@ -5,4 +5,15 @@ int f0(int a, int b){ > return a * 33 * b * 55; > } > > -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */ > +int f1(int a){ > + a *= 33; > + return a * 55; > +} > + > +int f2(int a, int b){ > + a *= 33; > + return a * b * 55; > +} > + > +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */ > +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */ > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c > index 17d7b4c..1e943b7 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c > @@ -23,13 +23,9 @@ f (int i) > foo (y); > } > > -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */ > -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */ > -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */ > +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */ > /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */ > -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */ > /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */ > -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */ > -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */ > /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */ > /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */
On Wed, Jul 19, 2017 at 1:28 PM, Richard Biener <richard.guenther@gmail.com> wrote: > On Tue, Jul 18, 2017 at 7:07 PM, Alexander Monakov <amonakov@ispras.ru> wrote: >> On Mon, 17 Jul 2017, Alexander Monakov wrote: >>> On Mon, 17 Jul 2017, Marc Glisse wrote: >>> > > +/* Combine successive multiplications. Similar to above, but handling >>> > > + overflow is different. */ >>> > > +(simplify >>> > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) >>> > > + (with { >>> > > + bool overflow_p; >>> > > + wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p); >>> > > + } >>> > > + (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type)) >>> > >>> > I wonder if there are cases where this would cause trouble for saturating >>> > integers. The only case I can think of is when @2 is -1, but that's likely >>> > simplified to NEGATE_EXPR first. >>> >>> Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for >>> either saturating or sanitized types, just like in the first patch. I think >>> wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))' >>> works, since as you say it should become a negate/zero anyway? >> >> Updated patch: >> >> * match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2). >> testsuite: >> * gcc.dg/tree-ssa/assoc-2.c: Enhance. >> * gcc.dg/tree-ssa/slsr-4.c: Adjust. >> >> diff --git a/gcc/match.pd b/gcc/match.pd >> index 36045f1..0bb5541 100644 >> --- a/gcc/match.pd >> +++ b/gcc/match.pd >> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) >> || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)) >> { build_zero_cst (type); }))))) >> >> +/* Combine successive multiplications. Similar to above, but handling >> + overflow is different. */ >> +(simplify >> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) >> + /* More specific rules can handle 0 and -1; skip them here to avoid >> + wrong transformations for sanitized and saturating types. */ >> + (if (!integer_zerop (@2) && !integer_minus_onep (@2)) > > I think integer_zerop (@2) should never happen here if you order the > pattern after > > (simplify > (mult @0 integer_zerop@1) > @1) > > which I think you do. Why's @1 == -1 ok when sanitizing but not @2 == -1? > If you rely on > > /* Transform x * -1 into -x. */ > (simplify > (mult @0 integer_minus_onep) > (negate @0)) > > then you need to move that pattern above yours (there seem to be a bunch > of related ones like 0 - @1 -> -@1 to move earlier as well). > > That would leave us with the case of saturating types (the above transforms > doesn't care for those). > > So unless you can come up with a testcase that breaks I'd remove the > integer_zerop/integer_minus_onep checks. So for saturating types isn't the issue when @1 and @2 have opposite sign and the inner multiply would have saturated? [how do saturated types behave with sign-changing multiplication/negation anyway?] >> + (with { >> + bool overflow_p; >> + wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p); >> + } >> + (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type)) > > I think overflow in the constant multiplication is ok unless > TYPE_OVERFLOW_SANITIZED > (and can we have a ubsan testcase for that that would otherwise fail?). > It's not that we introduce new overflow here. > > Ok with those changes. > > Thanks, > Richard. > >> + (mult @0 { wide_int_to_tree (type, mul); }))))) >> + >> /* Optimize A / A to 1.0 if we don't care about >> NaNs or Infinities. */ >> (simplify >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c >> index a92c882..cc0e9d4 100644 >> --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c >> @@ -5,4 +5,15 @@ int f0(int a, int b){ >> return a * 33 * b * 55; >> } >> >> -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */ >> +int f1(int a){ >> + a *= 33; >> + return a * 55; >> +} >> + >> +int f2(int a, int b){ >> + a *= 33; >> + return a * b * 55; >> +} >> + >> +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */ >> +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */ >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c >> index 17d7b4c..1e943b7 100644 >> --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c >> @@ -23,13 +23,9 @@ f (int i) >> foo (y); >> } >> >> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */ >> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */ >> -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */ >> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */ >> /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */ >> -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */ >> /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */ >> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */ >> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */ >> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */ >> /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */ >> /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */
On Wed, 19 Jul 2017, Richard Biener wrote: > >> --- a/gcc/match.pd > >> +++ b/gcc/match.pd > >> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > >> || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)) > >> { build_zero_cst (type); }))))) > >> > >> +/* Combine successive multiplications. Similar to above, but handling > >> + overflow is different. */ > >> +(simplify > >> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) > >> + /* More specific rules can handle 0 and -1; skip them here to avoid > >> + wrong transformations for sanitized and saturating types. */ > >> + (if (!integer_zerop (@2) && !integer_minus_onep (@2)) > > > > I think integer_zerop (@2) should never happen here if you order the > > pattern after [...] > > which I think you do. Why's @1 == -1 ok when sanitizing but not @2 == -1? Because we ruled out @2 being 0 or -1. If @2 == 1 then @1 == -1 is fine, otherwise abs(@2) > 1, thus if X * @1 overflows, so does X * (@1 * @2) (assuming type range is from -2**L to 2**L - 1). @2 == -1 is not ok because with @1 == -1 it may lead to wrong result for saturating types if their range can be asymmetrical. It would also conceal intermediate overflow for sanitized ops. > > So unless you can come up with a testcase that breaks I'd remove the > > integer_zerop/integer_minus_onep checks. Well, initially I suggested using 'gcc_checking_assert' to ensure that such a pattern is not triggered under wrong circumstances (such as bisecting match.pd by #if 0'ing parts of it), but I believe Marc said he would prefer plain ifs. I agree with him that we shouldn't just implicitly depend on ordering, but keep the ifs or use asserts. Do you insist? Are there other instances already where GCC relies on ordering within match.pd for correctness? > So for saturating types isn't the issue when @1 and @2 have opposite > sign and the inner multiply would have saturated? No, I think the only special case is @1 == @2 == -1, otherwise either @2 is 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and folding doesn't conceal an intermediate saturation. > [how do saturated > types behave with sign-changing multiplication/negation anyway?] Actually I'm not sure they need to be handled here, afaict only fixed-point types can be saturating, but then wouldn't constant operands be FIXED_CST? (but there are already some instances in match.pd that anticipate TYPE_SATURATING in patterns that match INTEGER_CST) > > I think overflow in the constant multiplication is ok unless > > TYPE_OVERFLOW_SANITIZED > > (and can we have a ubsan testcase for that that would otherwise fail?). > > It's not that we introduce new overflow here. This is to allow deducing that X must be zero. Otherwise, exploiting undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2 overflows (this is what Marc originally suggested), but shouldn't we rather keep X in the IR to later deduce that X is zero? Alexander
On Wed, Jul 19, 2017 at 4:39 PM, Alexander Monakov <amonakov@ispras.ru> wrote: > On Wed, 19 Jul 2017, Richard Biener wrote: >> >> --- a/gcc/match.pd >> >> +++ b/gcc/match.pd >> >> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) >> >> || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)) >> >> { build_zero_cst (type); }))))) >> >> >> >> +/* Combine successive multiplications. Similar to above, but handling >> >> + overflow is different. */ >> >> +(simplify >> >> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) >> >> + /* More specific rules can handle 0 and -1; skip them here to avoid >> >> + wrong transformations for sanitized and saturating types. */ >> >> + (if (!integer_zerop (@2) && !integer_minus_onep (@2)) >> > >> > I think integer_zerop (@2) should never happen here if you order the >> > pattern after [...] >> > which I think you do. Why's @1 == -1 ok when sanitizing but not @2 == -1? > > Because we ruled out @2 being 0 or -1. If @2 == 1 then @1 == -1 is fine, > otherwise abs(@2) > 1, thus if X * @1 overflows, so does X * (@1 * @2) > (assuming type range is from -2**L to 2**L - 1). > > @2 == -1 is not ok because with @1 == -1 it may lead to wrong result for > saturating types if their range can be asymmetrical. It would also > conceal intermediate overflow for sanitized ops. I'm not so much worried about false negatives for ubsan -- it's ubsan implementations fault that it runs so late after folding. >> > So unless you can come up with a testcase that breaks I'd remove the >> > integer_zerop/integer_minus_onep checks. > > Well, initially I suggested using 'gcc_checking_assert' to ensure that such a > pattern is not triggered under wrong circumstances (such as bisecting match.pd > by #if 0'ing parts of it), but I believe Marc said he would prefer plain ifs. > I agree with him that we shouldn't just implicitly depend on ordering, but > keep the ifs or use asserts. Do you insist? Are there other instances already > where GCC relies on ordering within match.pd for correctness? Not for correctness AFAIK but for optimization. IIRC I implemented strict ordering because of a bug. >> So for saturating types isn't the issue when @1 and @2 have opposite >> sign and the inner multiply would have saturated? > > No, I think the only special case is @1 == @2 == -1, otherwise either @2 is > 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and > folding doesn't conceal an intermediate saturation. Ok, but you don't check for @1 == @2 == -1 because you rely on the subtree being folded (we rely on this for optimization, not sure we should rely on this for correctness -- only checking @1 == -1 is if course conservatively correct). Note that with folding X * -1 to -X a related pattern would optimize (mult (negate X) CST) and (negate (mult X CST)). >> [how do saturated >> types behave with sign-changing multiplication/negation anyway?] > > Actually I'm not sure they need to be handled here, afaict only fixed-point > types can be saturating, but then wouldn't constant operands be FIXED_CST? > (but there are already some instances in match.pd that anticipate > TYPE_SATURATING in patterns that match INTEGER_CST) At least /* In a non-aggregate type, indicates a saturating type. */ #define TYPE_SATURATING(NODE) \ (TREE_NOT_CHECK4 (NODE, RECORD_TYPE, UNION_TYPE, QUAL_UNION_TYPE, ARRAY_TYPE)->base.u.bits.saturating_flag) suggests that non-fixed point saturating types are possible. Maybe no such ones exist though. fixed-point math has other issues (but I think at least has symmetric negative/positive ranges). >> > I think overflow in the constant multiplication is ok unless >> > TYPE_OVERFLOW_SANITIZED >> > (and can we have a ubsan testcase for that that would otherwise fail?). >> > It's not that we introduce new overflow here. > > This is to allow deducing that X must be zero. Otherwise, exploiting > undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2 > overflows (this is what Marc originally suggested), but shouldn't we > rather keep X in the IR to later deduce that X is zero? I don't think we should care here. After all we could immediately fold X * CST1 * CST2 to zero if CST1 * CST2 overflows and overflow invokes undefined behavior, right in this reassoc pattern. So -- I'd prefer to just give up for TYPE_SATURATING (as you say currently it can't happen anyway because the only saturating types we have are fixed-point), thus remove the zero/minus_one checks (not worry about false positive in ubsan). Richard. > > Alexander
On Thu, 20 Jul 2017, Richard Biener wrote: > >> So for saturating types isn't the issue when @1 and @2 have opposite > >> sign and the inner multiply would have saturated? > > > > No, I think the only special case is @1 == @2 == -1, otherwise either @2 is > > 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and > > folding doesn't conceal an intermediate saturation. > > Ok, but you don't check for @1 == @2 == -1 because you rely on > the subtree being folded Hm, I don't see why you say that, the patch ensures that @2 != -1 (I agree with Marc that implicitly relying on other folding to happen is not quite ok) (for the avoidance of doubt, "the only special case" was meant in the context of saturating types here) > fixed-point math has other issues (but I think at least > has symmetric negative/positive ranges). I don't think so, n1169.pdf calls out that fixed-point _Fract types may have asymmetric range such that -1.0 is exactly representable but 1.0 is not. > >> > I think overflow in the constant multiplication is ok unless > >> > TYPE_OVERFLOW_SANITIZED > >> > (and can we have a ubsan testcase for that that would otherwise fail?). > >> > It's not that we introduce new overflow here. > > > > This is to allow deducing that X must be zero. Otherwise, exploiting > > undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2 > > overflows (this is what Marc originally suggested), but shouldn't we > > rather keep X in the IR to later deduce that X is zero? > > I don't think we should care here. After all we could immediately fold > X * CST1 * CST2 to zero if CST1 * CST2 overflows and overflow invokes > undefined behavior, right in this reassoc pattern. Note we cannot simply check if CST1 * CST2 overflows, because in case their exact product is -INT_MIN (i.e. INT_MAX+1), the original expression doesn't overflow for X==-1 (except when CST1==INT_MIN && CST2==-1). This was previously mentioned by Marc; sorry for neglecting to mention this special case in my previous email. So the motivation to not to do the transform on overflow is twofold: - unclear how to handle CST1*CST2 == -INT_MIN, if at all; - otherwise, folding X*CST1*CST2 to literal zero would drop X from the IR; but if X is an SSA_NAME, we could instead deduce that X is zero and replace all dominated uses of X by zero (I don't know how real-world-relevant this is) > So -- I'd prefer to just give up for TYPE_SATURATING (as you say currently > it can't happen anyway because the only saturating types we have are > fixed-point), > thus remove the zero/minus_one checks (not worry about false positive in ubsan). OK, I think that can be done (I assume you mean 'false negative in ubsan'), but in light of the above clarification, what change do you want for overflow? Alexander
On Thu, Jul 20, 2017 at 11:39 AM, Alexander Monakov <amonakov@ispras.ru> wrote: > On Thu, 20 Jul 2017, Richard Biener wrote: >> >> So for saturating types isn't the issue when @1 and @2 have opposite >> >> sign and the inner multiply would have saturated? >> > >> > No, I think the only special case is @1 == @2 == -1, otherwise either @2 is >> > 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and >> > folding doesn't conceal an intermediate saturation. >> >> Ok, but you don't check for @1 == @2 == -1 because you rely on >> the subtree being folded > > Hm, I don't see why you say that, the patch ensures that @2 != -1 (I agree with > Marc that implicitly relying on other folding to happen is not quite ok) But not @1 == -1, so not both. > (for the avoidance of doubt, "the only special case" was meant in the > context of saturating types here) > >> fixed-point math has other issues (but I think at least >> has symmetric negative/positive ranges). > > I don't think so, n1169.pdf calls out that fixed-point _Fract types may have > asymmetric range such that -1.0 is exactly representable but 1.0 is not. I stand corrected then. >> >> > I think overflow in the constant multiplication is ok unless >> >> > TYPE_OVERFLOW_SANITIZED >> >> > (and can we have a ubsan testcase for that that would otherwise fail?). >> >> > It's not that we introduce new overflow here. >> > >> > This is to allow deducing that X must be zero. Otherwise, exploiting >> > undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2 >> > overflows (this is what Marc originally suggested), but shouldn't we >> > rather keep X in the IR to later deduce that X is zero? >> >> I don't think we should care here. After all we could immediately fold >> X * CST1 * CST2 to zero if CST1 * CST2 overflows and overflow invokes >> undefined behavior, right in this reassoc pattern. > > Note we cannot simply check if CST1 * CST2 overflows, because in case their > exact product is -INT_MIN (i.e. INT_MAX+1), the original expression doesn't > overflow for X==-1 (except when CST1==INT_MIN && CST2==-1). This was previously > mentioned by Marc; sorry for neglecting to mention this special case in my > previous email. > > So the motivation to not to do the transform on overflow is twofold: > > - unclear how to handle CST1*CST2 == -INT_MIN, if at all; > - otherwise, folding X*CST1*CST2 to literal zero would drop X from the > IR; but if X is an SSA_NAME, we could instead deduce that X is zero and > replace all dominated uses of X by zero (I don't know how real-world-relevant > this is) > >> So -- I'd prefer to just give up for TYPE_SATURATING (as you say currently >> it can't happen anyway because the only saturating types we have are >> fixed-point), >> thus remove the zero/minus_one checks (not worry about false positive in ubsan). > > OK, I think that can be done (I assume you mean 'false negative in ubsan'), > but in light of the above clarification, what change do you want for overflow? Ok, let's leave overflow handling as in your patch. It looks like fold_binary associate: case does the same. It does look like an unnecessary restriction but we follow existing practice there. Richard. > Alexander
diff --git a/gcc/match.pd b/gcc/match.pd index 36045f1..0bb5541 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)) { build_zero_cst (type); }))))) +/* Combine successive multiplications. Similar to above, but handling + overflow is different. */ +(simplify + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) + /* More specific rules can handle 0 and -1; skip them here to avoid + wrong transformations for sanitized and saturating types. */ + (if (!integer_zerop (@2) && !integer_minus_onep (@2)) + (with { + bool overflow_p; + wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p); + } + (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type)) + (mult @0 { wide_int_to_tree (type, mul); }))))) + /* Optimize A / A to 1.0 if we don't care about NaNs or Infinities. */ (simplify diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c index a92c882..cc0e9d4 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c @@ -5,4 +5,15 @@ int f0(int a, int b){ return a * 33 * b * 55; } -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */ +int f1(int a){ + a *= 33; + return a * 55; +} + +int f2(int a, int b){ + a *= 33; + return a * b * 55; +} + +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c index 17d7b4c..1e943b7 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c @@ -23,13 +23,9 @@ f (int i) foo (y); } -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */ -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */ -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */ +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */ /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */ -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */ /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */ -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */