diff mbox

match.pd: reassociate multiplications with constants

Message ID alpine.LNX.2.20.13.1707182003570.25203@monopod.intra.ispras.ru
State New
Headers show

Commit Message

Alexander Monakov July 18, 2017, 5:07 p.m. UTC
On Mon, 17 Jul 2017, Alexander Monakov wrote:
> On Mon, 17 Jul 2017, Marc Glisse wrote:
> > > +/* Combine successive multiplications.  Similar to above, but handling
> > > +   overflow is different.  */
> > > +(simplify
> > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
> > > + (with {
> > > +   bool overflow_p;
> > > +   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
> > > +  }
> > > +  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
> > 
> > I wonder if there are cases where this would cause trouble for saturating
> > integers. The only case I can think of is when @2 is -1, but that's likely
> > simplified to NEGATE_EXPR first.
> 
> Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for
> either saturating or sanitized types, just like in the first patch. I think
> wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))'
> works, since as you say it should become a negate/zero anyway?

Updated patch:

	* match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2).
testsuite:
	* gcc.dg/tree-ssa/assoc-2.c: Enhance.
	* gcc.dg/tree-ssa/slsr-4.c: Adjust.

Comments

Richard Biener July 19, 2017, 11:28 a.m. UTC | #1
On Tue, Jul 18, 2017 at 7:07 PM, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Mon, 17 Jul 2017, Alexander Monakov wrote:
>> On Mon, 17 Jul 2017, Marc Glisse wrote:
>> > > +/* Combine successive multiplications.  Similar to above, but handling
>> > > +   overflow is different.  */
>> > > +(simplify
>> > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
>> > > + (with {
>> > > +   bool overflow_p;
>> > > +   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
>> > > +  }
>> > > +  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
>> >
>> > I wonder if there are cases where this would cause trouble for saturating
>> > integers. The only case I can think of is when @2 is -1, but that's likely
>> > simplified to NEGATE_EXPR first.
>>
>> Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for
>> either saturating or sanitized types, just like in the first patch. I think
>> wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))'
>> works, since as you say it should become a negate/zero anyway?
>
> Updated patch:
>
>         * match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2).
> testsuite:
>         * gcc.dg/tree-ssa/assoc-2.c: Enhance.
>         * gcc.dg/tree-ssa/slsr-4.c: Adjust.
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 36045f1..0bb5541 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>          || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
>       { build_zero_cst (type); })))))
>
> +/* Combine successive multiplications.  Similar to above, but handling
> +   overflow is different.  */
> +(simplify
> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
> + /* More specific rules can handle 0 and -1; skip them here to avoid
> +    wrong transformations for sanitized and saturating types.  */
> + (if (!integer_zerop (@2) && !integer_minus_onep (@2))

I think integer_zerop (@2) should never happen here if you order the
pattern after

(simplify
 (mult @0 integer_zerop@1)
 @1)

which I think you do.  Why's @1 == -1 ok when sanitizing but not @2 == -1?
If you rely on

/* Transform x * -1 into -x.  */
(simplify
 (mult @0 integer_minus_onep)
 (negate @0))

then you need to move that pattern above yours (there seem to be a bunch
of related ones like 0 - @1 -> -@1 to move earlier as well).

That would leave us with the case of saturating types (the above transforms
doesn't care for those).

So unless you can come up with a testcase that breaks I'd remove the
integer_zerop/integer_minus_onep checks.

> +  (with {
> +    bool overflow_p;
> +    wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
> +   }
> +   (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))

I think overflow in the constant multiplication is ok unless
TYPE_OVERFLOW_SANITIZED
(and can we have a ubsan testcase for that that would otherwise fail?).
It's not that we introduce new overflow here.

Ok with those changes.

Thanks,
Richard.

> +    (mult @0 { wide_int_to_tree (type, mul); })))))
> +
>  /* Optimize A / A to 1.0 if we don't care about
>     NaNs or Infinities.  */
>  (simplify
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
> index a92c882..cc0e9d4 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
> @@ -5,4 +5,15 @@ int f0(int a, int b){
>    return a * 33 * b * 55;
>  }
>
> -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */
> +int f1(int a){
> +  a *= 33;
> +  return a * 55;
> +}
> +
> +int f2(int a, int b){
> +  a *= 33;
> +  return a * b * 55;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */
> +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
> index 17d7b4c..1e943b7 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
> @@ -23,13 +23,9 @@ f (int i)
>    foo (y);
>  }
>
> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */
> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */
>  /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */
>  /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */
> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */
>  /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */
>  /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */
Richard Biener July 19, 2017, 11:32 a.m. UTC | #2
On Wed, Jul 19, 2017 at 1:28 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 18, 2017 at 7:07 PM, Alexander Monakov <amonakov@ispras.ru> wrote:
>> On Mon, 17 Jul 2017, Alexander Monakov wrote:
>>> On Mon, 17 Jul 2017, Marc Glisse wrote:
>>> > > +/* Combine successive multiplications.  Similar to above, but handling
>>> > > +   overflow is different.  */
>>> > > +(simplify
>>> > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
>>> > > + (with {
>>> > > +   bool overflow_p;
>>> > > +   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
>>> > > +  }
>>> > > +  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
>>> >
>>> > I wonder if there are cases where this would cause trouble for saturating
>>> > integers. The only case I can think of is when @2 is -1, but that's likely
>>> > simplified to NEGATE_EXPR first.
>>>
>>> Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for
>>> either saturating or sanitized types, just like in the first patch. I think
>>> wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))'
>>> works, since as you say it should become a negate/zero anyway?
>>
>> Updated patch:
>>
>>         * match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2).
>> testsuite:
>>         * gcc.dg/tree-ssa/assoc-2.c: Enhance.
>>         * gcc.dg/tree-ssa/slsr-4.c: Adjust.
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 36045f1..0bb5541 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>          || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
>>       { build_zero_cst (type); })))))
>>
>> +/* Combine successive multiplications.  Similar to above, but handling
>> +   overflow is different.  */
>> +(simplify
>> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
>> + /* More specific rules can handle 0 and -1; skip them here to avoid
>> +    wrong transformations for sanitized and saturating types.  */
>> + (if (!integer_zerop (@2) && !integer_minus_onep (@2))
>
> I think integer_zerop (@2) should never happen here if you order the
> pattern after
>
> (simplify
>  (mult @0 integer_zerop@1)
>  @1)
>
> which I think you do.  Why's @1 == -1 ok when sanitizing but not @2 == -1?
> If you rely on
>
> /* Transform x * -1 into -x.  */
> (simplify
>  (mult @0 integer_minus_onep)
>  (negate @0))
>
> then you need to move that pattern above yours (there seem to be a bunch
> of related ones like 0 - @1 -> -@1 to move earlier as well).
>
> That would leave us with the case of saturating types (the above transforms
> doesn't care for those).
>
> So unless you can come up with a testcase that breaks I'd remove the
> integer_zerop/integer_minus_onep checks.

So for saturating types isn't the issue when @1 and @2 have opposite
sign and the inner multiply would have saturated?  [how do saturated
types behave with sign-changing multiplication/negation anyway?]

>> +  (with {
>> +    bool overflow_p;
>> +    wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
>> +   }
>> +   (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
>
> I think overflow in the constant multiplication is ok unless
> TYPE_OVERFLOW_SANITIZED
> (and can we have a ubsan testcase for that that would otherwise fail?).
> It's not that we introduce new overflow here.
>
> Ok with those changes.
>
> Thanks,
> Richard.
>
>> +    (mult @0 { wide_int_to_tree (type, mul); })))))
>> +
>>  /* Optimize A / A to 1.0 if we don't care about
>>     NaNs or Infinities.  */
>>  (simplify
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
>> index a92c882..cc0e9d4 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
>> @@ -5,4 +5,15 @@ int f0(int a, int b){
>>    return a * 33 * b * 55;
>>  }
>>
>> -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */
>> +int f1(int a){
>> +  a *= 33;
>> +  return a * 55;
>> +}
>> +
>> +int f2(int a, int b){
>> +  a *= 33;
>> +  return a * b * 55;
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */
>> +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
>> index 17d7b4c..1e943b7 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
>> @@ -23,13 +23,9 @@ f (int i)
>>    foo (y);
>>  }
>>
>> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */
>> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */
>> -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */
>> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */
>>  /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */
>> -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */
>>  /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */
>> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */
>> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */
>>  /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */
>>  /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */
Alexander Monakov July 19, 2017, 2:39 p.m. UTC | #3
On Wed, 19 Jul 2017, Richard Biener wrote:
> >> --- a/gcc/match.pd
> >> +++ b/gcc/match.pd
> >> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >>          || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
> >>       { build_zero_cst (type); })))))
> >>
> >> +/* Combine successive multiplications.  Similar to above, but handling
> >> +   overflow is different.  */
> >> +(simplify
> >> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
> >> + /* More specific rules can handle 0 and -1; skip them here to avoid
> >> +    wrong transformations for sanitized and saturating types.  */
> >> + (if (!integer_zerop (@2) && !integer_minus_onep (@2))
> >
> > I think integer_zerop (@2) should never happen here if you order the
> > pattern after [...]
> > which I think you do.  Why's @1 == -1 ok when sanitizing but not @2 == -1?

Because we ruled out @2 being 0 or -1. If @2 == 1 then @1 == -1 is fine,
otherwise abs(@2) > 1, thus if X * @1 overflows, so does X * (@1 * @2)
(assuming type range is from -2**L to 2**L - 1).

@2 == -1 is not ok because with @1 == -1 it may lead to wrong result for
saturating types if their range can be asymmetrical.  It would also
conceal intermediate overflow for sanitized ops.

> > So unless you can come up with a testcase that breaks I'd remove the
> > integer_zerop/integer_minus_onep checks.

Well, initially I suggested using 'gcc_checking_assert' to ensure that such a
pattern is not triggered under wrong circumstances (such as bisecting match.pd
by #if 0'ing parts of it), but I believe Marc said he would prefer plain ifs.
I agree with him that we shouldn't just implicitly depend on ordering, but
keep the ifs or use asserts.  Do you insist?  Are there other instances already
where GCC relies on ordering within match.pd for correctness?

> So for saturating types isn't the issue when @1 and @2 have opposite
> sign and the inner multiply would have saturated?

No, I think the only special case is @1 == @2 == -1, otherwise either @2 is
0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and
folding doesn't conceal an intermediate saturation.

> [how do saturated
> types behave with sign-changing multiplication/negation anyway?]

Actually I'm not sure they need to be handled here, afaict only fixed-point
types can be saturating, but then wouldn't constant operands be FIXED_CST?
(but there are already some instances in match.pd that anticipate
TYPE_SATURATING in patterns that match INTEGER_CST)

> > I think overflow in the constant multiplication is ok unless
> > TYPE_OVERFLOW_SANITIZED
> > (and can we have a ubsan testcase for that that would otherwise fail?).
> > It's not that we introduce new overflow here.

This is to allow deducing that X must be zero. Otherwise, exploiting
undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2
overflows (this is what Marc originally suggested), but shouldn't we
rather keep X in the IR to later deduce that X is zero?

Alexander
Richard Biener July 20, 2017, 8:41 a.m. UTC | #4
On Wed, Jul 19, 2017 at 4:39 PM, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Wed, 19 Jul 2017, Richard Biener wrote:
>> >> --- a/gcc/match.pd
>> >> +++ b/gcc/match.pd
>> >> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> >>          || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
>> >>       { build_zero_cst (type); })))))
>> >>
>> >> +/* Combine successive multiplications.  Similar to above, but handling
>> >> +   overflow is different.  */
>> >> +(simplify
>> >> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
>> >> + /* More specific rules can handle 0 and -1; skip them here to avoid
>> >> +    wrong transformations for sanitized and saturating types.  */
>> >> + (if (!integer_zerop (@2) && !integer_minus_onep (@2))
>> >
>> > I think integer_zerop (@2) should never happen here if you order the
>> > pattern after [...]
>> > which I think you do.  Why's @1 == -1 ok when sanitizing but not @2 == -1?
>
> Because we ruled out @2 being 0 or -1. If @2 == 1 then @1 == -1 is fine,
> otherwise abs(@2) > 1, thus if X * @1 overflows, so does X * (@1 * @2)
> (assuming type range is from -2**L to 2**L - 1).
>
> @2 == -1 is not ok because with @1 == -1 it may lead to wrong result for
> saturating types if their range can be asymmetrical.  It would also
> conceal intermediate overflow for sanitized ops.

I'm not so much worried about false negatives for ubsan -- it's ubsan
implementations
fault that it runs so late after folding.

>> > So unless you can come up with a testcase that breaks I'd remove the
>> > integer_zerop/integer_minus_onep checks.
>
> Well, initially I suggested using 'gcc_checking_assert' to ensure that such a
> pattern is not triggered under wrong circumstances (such as bisecting match.pd
> by #if 0'ing parts of it), but I believe Marc said he would prefer plain ifs.
> I agree with him that we shouldn't just implicitly depend on ordering, but
> keep the ifs or use asserts.  Do you insist?  Are there other instances already
> where GCC relies on ordering within match.pd for correctness?

Not for correctness AFAIK but for optimization.  IIRC I implemented strict
ordering because of a bug.

>> So for saturating types isn't the issue when @1 and @2 have opposite
>> sign and the inner multiply would have saturated?
>
> No, I think the only special case is @1 == @2 == -1, otherwise either @2 is
> 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and
> folding doesn't conceal an intermediate saturation.

Ok, but you don't check for @1 == @2 == -1 because you rely on
the subtree being folded (we rely on this for optimization, not sure we should
rely on this for correctness -- only checking @1 == -1 is if course
conservatively
correct).

Note that with folding X * -1 to -X a related pattern would optimize
(mult (negate X) CST)
and (negate (mult X CST)).

>> [how do saturated
>> types behave with sign-changing multiplication/negation anyway?]
>
> Actually I'm not sure they need to be handled here, afaict only fixed-point
> types can be saturating, but then wouldn't constant operands be FIXED_CST?
> (but there are already some instances in match.pd that anticipate
> TYPE_SATURATING in patterns that match INTEGER_CST)

At least

/* In a non-aggregate type, indicates a saturating type.  */
#define TYPE_SATURATING(NODE) \
  (TREE_NOT_CHECK4 (NODE, RECORD_TYPE, UNION_TYPE, QUAL_UNION_TYPE,
ARRAY_TYPE)->base.u.bits.saturating_flag)

suggests that non-fixed point saturating types are possible.  Maybe no such
ones exist though.  fixed-point math has other issues (but I think at least
has symmetric negative/positive ranges).

>> > I think overflow in the constant multiplication is ok unless
>> > TYPE_OVERFLOW_SANITIZED
>> > (and can we have a ubsan testcase for that that would otherwise fail?).
>> > It's not that we introduce new overflow here.
>
> This is to allow deducing that X must be zero. Otherwise, exploiting
> undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2
> overflows (this is what Marc originally suggested), but shouldn't we
> rather keep X in the IR to later deduce that X is zero?

I don't think we should care here.  After all we could immediately fold
X * CST1 * CST2 to zero if CST1 * CST2 overflows and overflow invokes
undefined behavior, right in this reassoc pattern.

So -- I'd prefer to just give up for TYPE_SATURATING (as you say currently
it can't happen anyway because the only saturating types we have are
fixed-point),
thus remove the zero/minus_one checks (not worry about false positive in ubsan).

Richard.

>
> Alexander
Alexander Monakov July 20, 2017, 9:39 a.m. UTC | #5
On Thu, 20 Jul 2017, Richard Biener wrote:
> >> So for saturating types isn't the issue when @1 and @2 have opposite
> >> sign and the inner multiply would have saturated?
> >
> > No, I think the only special case is @1 == @2 == -1, otherwise either @2 is
> > 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and
> > folding doesn't conceal an intermediate saturation.
> 
> Ok, but you don't check for @1 == @2 == -1 because you rely on
> the subtree being folded

Hm, I don't see why you say that, the patch ensures that @2 != -1 (I agree with
Marc that implicitly relying on other folding to happen is not quite ok)

(for the avoidance of doubt, "the only special case" was meant in the
context of saturating types here)

> fixed-point math has other issues (but I think at least
> has symmetric negative/positive ranges).

I don't think so, n1169.pdf calls out that fixed-point _Fract types may have
asymmetric range such that -1.0 is exactly representable but 1.0 is not.

> >> > I think overflow in the constant multiplication is ok unless
> >> > TYPE_OVERFLOW_SANITIZED
> >> > (and can we have a ubsan testcase for that that would otherwise fail?).
> >> > It's not that we introduce new overflow here.
> >
> > This is to allow deducing that X must be zero. Otherwise, exploiting
> > undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2
> > overflows (this is what Marc originally suggested), but shouldn't we
> > rather keep X in the IR to later deduce that X is zero?
> 
> I don't think we should care here.  After all we could immediately fold
> X * CST1 * CST2 to zero if CST1 * CST2 overflows and overflow invokes
> undefined behavior, right in this reassoc pattern.

Note we cannot simply check if CST1 * CST2 overflows, because in case their
exact product is -INT_MIN (i.e. INT_MAX+1), the original expression doesn't
overflow for X==-1 (except when CST1==INT_MIN && CST2==-1).  This was previously
mentioned by Marc; sorry for neglecting to mention this special case in my
previous email.

So the motivation to not to do the transform on overflow is twofold:

- unclear how to handle CST1*CST2 == -INT_MIN, if at all;
- otherwise, folding X*CST1*CST2 to literal zero would drop X from the
  IR; but if X is an SSA_NAME, we could instead deduce that X is zero and
  replace all dominated uses of X by zero (I don't know how real-world-relevant
  this is)

> So -- I'd prefer to just give up for TYPE_SATURATING (as you say currently
> it can't happen anyway because the only saturating types we have are
> fixed-point),
> thus remove the zero/minus_one checks (not worry about false positive in ubsan).

OK, I think that can be done (I assume you mean 'false negative in ubsan'),
but in light of the above clarification, what change do you want for overflow?

Alexander
Richard Biener July 20, 2017, 10:50 a.m. UTC | #6
On Thu, Jul 20, 2017 at 11:39 AM, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Thu, 20 Jul 2017, Richard Biener wrote:
>> >> So for saturating types isn't the issue when @1 and @2 have opposite
>> >> sign and the inner multiply would have saturated?
>> >
>> > No, I think the only special case is @1 == @2 == -1, otherwise either @2 is
>> > 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and
>> > folding doesn't conceal an intermediate saturation.
>>
>> Ok, but you don't check for @1 == @2 == -1 because you rely on
>> the subtree being folded
>
> Hm, I don't see why you say that, the patch ensures that @2 != -1 (I agree with
> Marc that implicitly relying on other folding to happen is not quite ok)

But not @1 == -1, so not both.

> (for the avoidance of doubt, "the only special case" was meant in the
> context of saturating types here)
>
>> fixed-point math has other issues (but I think at least
>> has symmetric negative/positive ranges).
>
> I don't think so, n1169.pdf calls out that fixed-point _Fract types may have
> asymmetric range such that -1.0 is exactly representable but 1.0 is not.

I stand corrected then.

>> >> > I think overflow in the constant multiplication is ok unless
>> >> > TYPE_OVERFLOW_SANITIZED
>> >> > (and can we have a ubsan testcase for that that would otherwise fail?).
>> >> > It's not that we introduce new overflow here.
>> >
>> > This is to allow deducing that X must be zero. Otherwise, exploiting
>> > undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2
>> > overflows (this is what Marc originally suggested), but shouldn't we
>> > rather keep X in the IR to later deduce that X is zero?
>>
>> I don't think we should care here.  After all we could immediately fold
>> X * CST1 * CST2 to zero if CST1 * CST2 overflows and overflow invokes
>> undefined behavior, right in this reassoc pattern.
>
> Note we cannot simply check if CST1 * CST2 overflows, because in case their
> exact product is -INT_MIN (i.e. INT_MAX+1), the original expression doesn't
> overflow for X==-1 (except when CST1==INT_MIN && CST2==-1).  This was previously
> mentioned by Marc; sorry for neglecting to mention this special case in my
> previous email.
>
> So the motivation to not to do the transform on overflow is twofold:
>
> - unclear how to handle CST1*CST2 == -INT_MIN, if at all;
> - otherwise, folding X*CST1*CST2 to literal zero would drop X from the
>   IR; but if X is an SSA_NAME, we could instead deduce that X is zero and
>   replace all dominated uses of X by zero (I don't know how real-world-relevant
>   this is)
>
>> So -- I'd prefer to just give up for TYPE_SATURATING (as you say currently
>> it can't happen anyway because the only saturating types we have are
>> fixed-point),
>> thus remove the zero/minus_one checks (not worry about false positive in ubsan).
>
> OK, I think that can be done (I assume you mean 'false negative in ubsan'),
> but in light of the above clarification, what change do you want for overflow?

Ok, let's leave overflow handling as in your patch.  It looks like fold_binary
associate: case does the same.  It does look like an unnecessary restriction
but we follow existing practice there.

Richard.

> Alexander
diff mbox

Patch

diff --git a/gcc/match.pd b/gcc/match.pd
index 36045f1..0bb5541 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -283,6 +283,20 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
         || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
      { build_zero_cst (type); })))))

+/* Combine successive multiplications.  Similar to above, but handling
+   overflow is different.  */
+(simplify
+ (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
+ /* More specific rules can handle 0 and -1; skip them here to avoid
+    wrong transformations for sanitized and saturating types.  */
+ (if (!integer_zerop (@2) && !integer_minus_onep (@2))
+  (with {
+    bool overflow_p;
+    wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
+   }
+   (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
+    (mult @0 { wide_int_to_tree (type, mul); })))))
+
 /* Optimize A / A to 1.0 if we don't care about
    NaNs or Infinities.  */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
index a92c882..cc0e9d4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
@@ -5,4 +5,15 @@  int f0(int a, int b){
   return a * 33 * b * 55;
 }

-/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */
+int f1(int a){
+  a *= 33;
+  return a * 55;
+}
+
+int f2(int a, int b){
+  a *= 33;
+  return a * b * 55;
+}
+
+/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
index 17d7b4c..1e943b7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
@@ -23,13 +23,9 @@  f (int i)
   foo (y);
 }
 
-/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */
+/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */
 /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */