diff mbox

Prevent extract_muldiv from introducing an overflow (PR sanitizer/80800)

Message ID 20170519103754.GA3335@redhat.com
State New
Headers show

Commit Message

Marek Polacek May 19, 2017, 10:37 a.m. UTC
On Fri, May 19, 2017 at 09:58:45AM +0200, Richard Biener wrote:
> On Fri, 19 May 2017, Marek Polacek wrote:
> 
> > extract_muldiv folds 
> > 
> >   (n * 10000 * z) * 50
> > 
> > to
> > 
> >   (n * 500000) * z
> > 
> > which is a wrong transformation to do, because it may introduce an overflow.
> > This resulted in a ubsan false positive.  So we should just disable this
> > folding altogether.  Does the approach I took make sense?
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> Didn't dig very far to identify extract_muldiv, but I guess it's either
> of the following recursions that trigger?
> 
>       /* If we can extract our operation from the LHS, do so and return a
>          new operation.  Likewise for the RHS from a MULT_EXPR.  
> Otherwise,
>          do something only if the second operand is a constant.  */
>       if (same_p
>           && (t1 = extract_muldiv (op0, c, code, wide_type,
>                                    strict_overflow_p)) != 0)
>         return fold_build2 (tcode, ctype, fold_convert (ctype, t1),
>                             fold_convert (ctype, op1));
>       else if (tcode == MULT_EXPR && code == MULT_EXPR
>                && (t1 = extract_muldiv (op1, c, code, wide_type,
>                                         strict_overflow_p)) != 0)
>         return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
>                             fold_convert (ctype, t1));

Exactly.  extract_muldiv first gets (n * 10000 * z) * 50 so it tries
to fold 50 with (subexpressions) of (n * 10000 * z).  So it then tries
(n * 10000) * 50, and then n * 50 and then 10000 * 50 which finally
works out, so it uses 50000 and removes the outermost multiplication.

> thus I'd simply guard them with TYPE_OVERFLOW_WRAPS ().

That works, too.  I was afraid it'd disable too much folding.

> In the end I think the whole extract_muldiv mess should be truncated
> down to what its name suggest - identifying and removing mul-div
> cancellations.

Would be nice.  I had trouble wrapping my head around it.

> It's for example not clear whether the recursion above assumes
> TYPE_OVERFLOW_UNDEFINED (it passes a wide_type .. widening is only
> ok if there's no overflow).

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-05-19  Marek Polacek  <polacek@redhat.com>

	PR sanitizer/80800
	* fold-const.c (extract_muldiv_1) <case TRUNC_DIV_EXPR>: Add
	TYPE_OVERFLOW_WRAPS checks.

	* c-c++-common/ubsan/pr80800.c: New test.
	* c-c++-common/Wduplicated-branches-1.c: Adjust an expression.



	Marek

Comments

Richard Biener May 19, 2017, 10:44 a.m. UTC | #1
On Fri, 19 May 2017, Marek Polacek wrote:

> On Fri, May 19, 2017 at 09:58:45AM +0200, Richard Biener wrote:
> > On Fri, 19 May 2017, Marek Polacek wrote:
> > 
> > > extract_muldiv folds 
> > > 
> > >   (n * 10000 * z) * 50
> > > 
> > > to
> > > 
> > >   (n * 500000) * z
> > > 
> > > which is a wrong transformation to do, because it may introduce an overflow.
> > > This resulted in a ubsan false positive.  So we should just disable this
> > > folding altogether.  Does the approach I took make sense?
> > > 
> > > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > 
> > Didn't dig very far to identify extract_muldiv, but I guess it's either
> > of the following recursions that trigger?
> > 
> >       /* If we can extract our operation from the LHS, do so and return a
> >          new operation.  Likewise for the RHS from a MULT_EXPR.  
> > Otherwise,
> >          do something only if the second operand is a constant.  */
> >       if (same_p
> >           && (t1 = extract_muldiv (op0, c, code, wide_type,
> >                                    strict_overflow_p)) != 0)
> >         return fold_build2 (tcode, ctype, fold_convert (ctype, t1),
> >                             fold_convert (ctype, op1));
> >       else if (tcode == MULT_EXPR && code == MULT_EXPR
> >                && (t1 = extract_muldiv (op1, c, code, wide_type,
> >                                         strict_overflow_p)) != 0)
> >         return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
> >                             fold_convert (ctype, t1));
> 
> Exactly.  extract_muldiv first gets (n * 10000 * z) * 50 so it tries
> to fold 50 with (subexpressions) of (n * 10000 * z).  So it then tries
> (n * 10000) * 50, and then n * 50 and then 10000 * 50 which finally
> works out, so it uses 50000 and removes the outermost multiplication.
> 
> > thus I'd simply guard them with TYPE_OVERFLOW_WRAPS ().
> 
> That works, too.  I was afraid it'd disable too much folding.
> 
> > In the end I think the whole extract_muldiv mess should be truncated
> > down to what its name suggest - identifying and removing mul-div
> > cancellations.
> 
> Would be nice.  I had trouble wrapping my head around it.

Yeah, happens everytime I need to chase a bug in it...

> > It's for example not clear whether the recursion above assumes
> > TYPE_OVERFLOW_UNDEFINED (it passes a wide_type .. widening is only
> > ok if there's no overflow).
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2017-05-19  Marek Polacek  <polacek@redhat.com>
> 
> 	PR sanitizer/80800
> 	* fold-const.c (extract_muldiv_1) <case TRUNC_DIV_EXPR>: Add
> 	TYPE_OVERFLOW_WRAPS checks.
> 
> 	* c-c++-common/ubsan/pr80800.c: New test.
> 	* c-c++-common/Wduplicated-branches-1.c: Adjust an expression.
> 
> 
> diff --git gcc/fold-const.c gcc/fold-const.c
> index 19aa722..736552c 100644
> --- gcc/fold-const.c
> +++ gcc/fold-const.c
> @@ -6281,11 +6281,13 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, tree wide_type,
>  	 new operation.  Likewise for the RHS from a MULT_EXPR.  Otherwise,
>  	 do something only if the second operand is a constant.  */
>        if (same_p
> +	  && TYPE_OVERFLOW_WRAPS (ctype)
>  	  && (t1 = extract_muldiv (op0, c, code, wide_type,
>  				   strict_overflow_p)) != 0)
>  	return fold_build2 (tcode, ctype, fold_convert (ctype, t1),
>  			    fold_convert (ctype, op1));
>        else if (tcode == MULT_EXPR && code == MULT_EXPR
> +	       && TYPE_OVERFLOW_WRAPS (ctype)
>  	       && (t1 = extract_muldiv (op1, c, code, wide_type,
>  					strict_overflow_p)) != 0)
>  	return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
> diff --git gcc/testsuite/c-c++-common/Wduplicated-branches-1.c gcc/testsuite/c-c++-common/Wduplicated-branches-1.c
> index c0b93fc..7c5062d 100644
> --- gcc/testsuite/c-c++-common/Wduplicated-branches-1.c
> +++ gcc/testsuite/c-c++-common/Wduplicated-branches-1.c
> @@ -89,7 +89,7 @@ f (int i, int *p)
>    if (i == 8) /* { dg-warning "this condition has identical branches" } */
>      return i * 8 * i * 8;
>    else
> -    return 8 * i * 8 * i;
> +    return i * 8 * i * 8;
>  
>  
>    if (i == 9) /* { dg-warning "this condition has identical branches" } */
> diff --git gcc/testsuite/c-c++-common/ubsan/pr80800.c gcc/testsuite/c-c++-common/ubsan/pr80800.c
> index e69de29..992c136 100644
> --- gcc/testsuite/c-c++-common/ubsan/pr80800.c
> +++ gcc/testsuite/c-c++-common/ubsan/pr80800.c
> @@ -0,0 +1,25 @@
> +/* PR sanitizer/80800 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=undefined -fsanitize-undefined-trap-on-error" } */
> +
> +int n = 20000;
> +int z = 0;
> +
> +int
> +fn1 (void)
> +{
> +  return (n * 10000 * z) * 50;
> +}
> +
> +int
> +fn2 (void)
> +{
> +  return (10000 * n * z) * 50;
> +}
> +
> +int
> +main ()
> +{
> +  fn1 ();
> +  fn2 ();
> +}
> 
> 	Marek
> 
>
Alexander Monakov May 19, 2017, 10:57 a.m. UTC | #2
On Fri, 19 May 2017, Richard Biener wrote:

> On Fri, 19 May 2017, Marek Polacek wrote:
> 
> > On Fri, May 19, 2017 at 09:58:45AM +0200, Richard Biener wrote:
> > > On Fri, 19 May 2017, Marek Polacek wrote:
> > > 
> > > > extract_muldiv folds 
> > > > 
> > > >   (n * 10000 * z) * 50
> > > > 
> > > > to
> > > > 
> > > >   (n * 500000) * z
> > > > 
> > > > which is a wrong transformation to do, because it may introduce an overflow.
> > > > This resulted in a ubsan false positive.  So we should just disable this
> > > > folding altogether.  Does the approach I took make sense?

I think it's possible to keep this folding, note that it's valid to transform to

    (n * 1 * z) * 500000

(i.e. accumulate multiplications on the outermost factor)

> > > > 
> > > > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > > 
> > > Didn't dig very far to identify extract_muldiv, but I guess it's either
> > > of the following recursions that trigger?
> > > 
> > >       /* If we can extract our operation from the LHS, do so and return a
> > >          new operation.  Likewise for the RHS from a MULT_EXPR.  
> > > Otherwise,
> > >          do something only if the second operand is a constant.  */
> > >       if (same_p
> > >           && (t1 = extract_muldiv (op0, c, code, wide_type,
> > >                                    strict_overflow_p)) != 0)
> > >         return fold_build2 (tcode, ctype, fold_convert (ctype, t1),
> > >                             fold_convert (ctype, op1));
> > >       else if (tcode == MULT_EXPR && code == MULT_EXPR
> > >                && (t1 = extract_muldiv (op1, c, code, wide_type,
> > >                                         strict_overflow_p)) != 0)
> > >         return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
> > >                             fold_convert (ctype, t1));
> > 
> > Exactly.  extract_muldiv first gets (n * 10000 * z) * 50 so it tries
> > to fold 50 with (subexpressions) of (n * 10000 * z).  So it then tries
> > (n * 10000) * 50, and then n * 50 and then 10000 * 50 which finally
> > works out, so it uses 50000 and removes the outermost multiplication.

so would it be possible to adjust things here to remove the innermost
multiplication instead?

Alexander
Marek Polacek May 19, 2017, 3:27 p.m. UTC | #3
On Fri, May 19, 2017 at 01:57:24PM +0300, Alexander Monakov wrote:
> On Fri, 19 May 2017, Richard Biener wrote:
> 
> > On Fri, 19 May 2017, Marek Polacek wrote:
> > 
> > > On Fri, May 19, 2017 at 09:58:45AM +0200, Richard Biener wrote:
> > > > On Fri, 19 May 2017, Marek Polacek wrote:
> > > > 
> > > > > extract_muldiv folds 
> > > > > 
> > > > >   (n * 10000 * z) * 50
> > > > > 
> > > > > to
> > > > > 
> > > > >   (n * 500000) * z
> > > > > 
> > > > > which is a wrong transformation to do, because it may introduce an overflow.
> > > > > This resulted in a ubsan false positive.  So we should just disable this
> > > > > folding altogether.  Does the approach I took make sense?
> 
> I think it's possible to keep this folding, note that it's valid to transform to
> 
>     (n * 1 * z) * 500000
> 
> (i.e. accumulate multiplications on the outermost factor)
> 
> > > > > 
> > > > > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > > > 
> > > > Didn't dig very far to identify extract_muldiv, but I guess it's either
> > > > of the following recursions that trigger?
> > > > 
> > > >       /* If we can extract our operation from the LHS, do so and return a
> > > >          new operation.  Likewise for the RHS from a MULT_EXPR.  
> > > > Otherwise,
> > > >          do something only if the second operand is a constant.  */
> > > >       if (same_p
> > > >           && (t1 = extract_muldiv (op0, c, code, wide_type,
> > > >                                    strict_overflow_p)) != 0)
> > > >         return fold_build2 (tcode, ctype, fold_convert (ctype, t1),
> > > >                             fold_convert (ctype, op1));
> > > >       else if (tcode == MULT_EXPR && code == MULT_EXPR
> > > >                && (t1 = extract_muldiv (op1, c, code, wide_type,
> > > >                                         strict_overflow_p)) != 0)
> > > >         return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
> > > >                             fold_convert (ctype, t1));
> > > 
> > > Exactly.  extract_muldiv first gets (n * 10000 * z) * 50 so it tries
> > > to fold 50 with (subexpressions) of (n * 10000 * z).  So it then tries
> > > (n * 10000) * 50, and then n * 50 and then 10000 * 50 which finally
> > > works out, so it uses 50000 and removes the outermost multiplication.
> 
> so would it be possible to adjust things here to remove the innermost
> multiplication instead?

I think I'd rather not expand this function any more, sorry.

	Marek
Alexander Monakov May 19, 2017, 3:47 p.m. UTC | #4
On Fri, 19 May 2017, Marek Polacek wrote:
> > I think it's possible to keep this folding, note that it's valid to transform to
> > 
> >     (n * 1 * z) * 500000
> > 
> > (i.e. accumulate multiplications on the outermost factor)

(to be precise, if the multiplication is done in a signed type and the middle
constant factor was a negated power of two, the sign change needs to remain:

    a * -4 * b * 2

needs to be transformed to

    a * -1 * b * 8 )

> > so would it be possible to adjust things here to remove the innermost
> > multiplication instead?
> 
> I think I'd rather not expand this function any more, sorry.

I'd be happy to look into that myself, if the idea sounds feasible and desirable.
I've never looked at this code before, so I'd appreciate a quick yay-or-nay
before diving in.  Richard, can you share your opinion on this point?

Thanks.
Alexander
Richard Biener May 19, 2017, 4:15 p.m. UTC | #5
On May 19, 2017 5:47:10 PM GMT+02:00, Alexander Monakov <amonakov@ispras.ru> wrote:
>On Fri, 19 May 2017, Marek Polacek wrote:
>> > I think it's possible to keep this folding, note that it's valid to
>transform to
>> > 
>> >     (n * 1 * z) * 500000
>> > 
>> > (i.e. accumulate multiplications on the outermost factor)
>
>(to be precise, if the multiplication is done in a signed type and the
>middle
>constant factor was a negated power of two, the sign change needs to
>remain:
>
>    a * -4 * b * 2
>
>needs to be transformed to
>
>    a * -1 * b * 8 )
>
>> > so would it be possible to adjust things here to remove the
>innermost
>> > multiplication instead?
>> 
>> I think I'd rather not expand this function any more, sorry.
>
>I'd be happy to look into that myself, if the idea sounds feasible and
>desirable.
>I've never looked at this code before, so I'd appreciate a quick
>yay-or-nay
>before diving in.  Richard, can you share your opinion on this point?

I'd rather extend the fold-binary associate: case to handle this, or do this via match.pd pattern(s).

Of course in the end we miss to associate signed integer ops in the reassoc pass...

Richard.

>Thanks.
>Alexander
Joseph Myers May 19, 2017, 5:26 p.m. UTC | #6
On Fri, 19 May 2017, Alexander Monakov wrote:

> On Fri, 19 May 2017, Marek Polacek wrote:
> > > I think it's possible to keep this folding, note that it's valid to transform to
> > > 
> > >     (n * 1 * z) * 500000
> > > 
> > > (i.e. accumulate multiplications on the outermost factor)
> 
> (to be precise, if the multiplication is done in a signed type and the middle
> constant factor was a negated power of two, the sign change needs to remain:
> 
>     a * -4 * b * 2
> 
> needs to be transformed to
> 
>     a * -1 * b * 8 )

Shouldn't that only be the case if the middle constant was -1 and the 
outer constant was 1?  If a * -4 * b is INT_MIN, a * b won't overflow and 
so a * b * -8 should be a safe transformation.

(You also need to avoid overflows in accumulating things on the outermost 
factor.)
Alexander Monakov May 19, 2017, 7:42 p.m. UTC | #7
On Fri, 19 May 2017, Joseph Myers wrote:
> On Fri, 19 May 2017, Alexander Monakov wrote:
> > (to be precise, if the multiplication is done in a signed type and the middle
> > constant factor was a negated power of two, the sign change needs to remain:
> > 
> >     a * -4 * b * 2
> > 
> > needs to be transformed to
> > 
> >     a * -1 * b * 8 )
> 
> Shouldn't that only be the case if the middle constant was -1 and the 
> outer constant was 1?  If a * -4 * b is INT_MIN, a * b won't overflow and 
> so a * b * -8 should be a safe transformation.

Indeed, I should have considered the situation more carefully.  Thank you for
the correction.

Alexander
Richard Biener May 24, 2017, 7:56 a.m. UTC | #8
On Fri, 19 May 2017, Alexander Monakov wrote:

> On Fri, 19 May 2017, Joseph Myers wrote:
> > On Fri, 19 May 2017, Alexander Monakov wrote:
> > > (to be precise, if the multiplication is done in a signed type and the middle
> > > constant factor was a negated power of two, the sign change needs to remain:
> > > 
> > >     a * -4 * b * 2
> > > 
> > > needs to be transformed to
> > > 
> > >     a * -1 * b * 8 )
> > 
> > Shouldn't that only be the case if the middle constant was -1 and the 
> > outer constant was 1?  If a * -4 * b is INT_MIN, a * b won't overflow and 
> > so a * b * -8 should be a safe transformation.
> 
> Indeed, I should have considered the situation more carefully.  Thank you for
> the correction.

Feel free to sumbit a patch improving the situation (as said, preferably
not in extract_muldiv).  Or open an enhancement bugreport refering to the
above constraints.

Richard.
diff mbox

Patch

diff --git gcc/fold-const.c gcc/fold-const.c
index 19aa722..736552c 100644
--- gcc/fold-const.c
+++ gcc/fold-const.c
@@ -6281,11 +6281,13 @@  extract_muldiv_1 (tree t, tree c, enum tree_code code, tree wide_type,
 	 new operation.  Likewise for the RHS from a MULT_EXPR.  Otherwise,
 	 do something only if the second operand is a constant.  */
       if (same_p
+	  && TYPE_OVERFLOW_WRAPS (ctype)
 	  && (t1 = extract_muldiv (op0, c, code, wide_type,
 				   strict_overflow_p)) != 0)
 	return fold_build2 (tcode, ctype, fold_convert (ctype, t1),
 			    fold_convert (ctype, op1));
       else if (tcode == MULT_EXPR && code == MULT_EXPR
+	       && TYPE_OVERFLOW_WRAPS (ctype)
 	       && (t1 = extract_muldiv (op1, c, code, wide_type,
 					strict_overflow_p)) != 0)
 	return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
diff --git gcc/testsuite/c-c++-common/Wduplicated-branches-1.c gcc/testsuite/c-c++-common/Wduplicated-branches-1.c
index c0b93fc..7c5062d 100644
--- gcc/testsuite/c-c++-common/Wduplicated-branches-1.c
+++ gcc/testsuite/c-c++-common/Wduplicated-branches-1.c
@@ -89,7 +89,7 @@  f (int i, int *p)
   if (i == 8) /* { dg-warning "this condition has identical branches" } */
     return i * 8 * i * 8;
   else
-    return 8 * i * 8 * i;
+    return i * 8 * i * 8;
 
 
   if (i == 9) /* { dg-warning "this condition has identical branches" } */
diff --git gcc/testsuite/c-c++-common/ubsan/pr80800.c gcc/testsuite/c-c++-common/ubsan/pr80800.c
index e69de29..992c136 100644
--- gcc/testsuite/c-c++-common/ubsan/pr80800.c
+++ gcc/testsuite/c-c++-common/ubsan/pr80800.c
@@ -0,0 +1,25 @@ 
+/* PR sanitizer/80800 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=undefined -fsanitize-undefined-trap-on-error" } */
+
+int n = 20000;
+int z = 0;
+
+int
+fn1 (void)
+{
+  return (n * 10000 * z) * 50;
+}
+
+int
+fn2 (void)
+{
+  return (10000 * n * z) * 50;
+}
+
+int
+main ()
+{
+  fn1 ();
+  fn2 ();
+}