diff mbox

Convert manual unsigned +/- overflow checking into {ADD,SUB}_OVERFLOW (PR target/67089)

Message ID 20151125083627.GU5675@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek Nov. 25, 2015, 8:36 a.m. UTC
On Wed, Nov 25, 2015 at 08:56:45AM +0100, Marc Glisse wrote:
> >This is the GIMPLE side of Richard's i?86 uadd/usub overflow
> >testing improvements.  If unsigned addition or subtraction
> >result is used both normally and in a GIMPLE_COND/COND_EXPR/tcc_comparison
> >that tests if unsigned overflow happened, the patch replaces it shortly
> >before expansion with {ADD,SUB}_OVERFLOW, so that RTL expansion can generate
> >better code on it.
> 
> If I test a+b<a and don't use a+b anywhere else, don't we also want to use
> the OVERFLOW things so we can expand to test the carry flag? That is, I am
> not convinced we want to punt on has_single_use for add_overflow. For
> sub_overflow with a single use of y-z, I guess y-z>y should become z>y, and
> going through a rewrite with sub_overflow neither helps nor hinders that.
> Actually, writing z>y is something the user is not unlikely to have done
> himself, and walking through the uses of y or z should not be hard, so I
> guess it could make sense to rewrite y-z>y to z>y always in match.pd and
> only look for the second form in math-opts.

Incremental diff for also handling the single use case if it is overflow
check is below.  But we already generate good code without it for the
x+y<x or x+y<y cases (and they aren't really problematic, as they are single
use), and while it is true that for x-y>x case the incremental patch below
improves the generated code right now, as you said it is better to rewrite
those as y>x and as it is a single use, it is easier to do it in match.pd.
So, I'd prefer to add that transformation and not use {ADD,SUB}_OVERFLOW
for those cases, because we get good enough code without increasing the IL
size, eating more memory etc.

> I was thinking more match.pd to transform a+b<a and sccvn to somehow CSE a+b
> with add_overflow(a,b), but your patch seems to work well with simpler code,
> that's cool :-)
> 
> And it shouldn't be too hard to add a few more later, to detect widening
> operations that are only used for overflow testing, although the form of
> such tests is much less universal among users.



	Jakub

Comments

Richard Biener Nov. 25, 2015, 8:45 a.m. UTC | #1
On Wed, 25 Nov 2015, Jakub Jelinek wrote:

> On Wed, Nov 25, 2015 at 08:56:45AM +0100, Marc Glisse wrote:
> > >This is the GIMPLE side of Richard's i?86 uadd/usub overflow
> > >testing improvements.  If unsigned addition or subtraction
> > >result is used both normally and in a GIMPLE_COND/COND_EXPR/tcc_comparison
> > >that tests if unsigned overflow happened, the patch replaces it shortly
> > >before expansion with {ADD,SUB}_OVERFLOW, so that RTL expansion can generate
> > >better code on it.
> > 
> > If I test a+b<a and don't use a+b anywhere else, don't we also want to use
> > the OVERFLOW things so we can expand to test the carry flag? That is, I am
> > not convinced we want to punt on has_single_use for add_overflow. For
> > sub_overflow with a single use of y-z, I guess y-z>y should become z>y, and
> > going through a rewrite with sub_overflow neither helps nor hinders that.
> > Actually, writing z>y is something the user is not unlikely to have done
> > himself, and walking through the uses of y or z should not be hard, so I
> > guess it could make sense to rewrite y-z>y to z>y always in match.pd and
> > only look for the second form in math-opts.
> 
> Incremental diff for also handling the single use case if it is overflow
> check is below.  But we already generate good code without it for the
> x+y<x or x+y<y cases (and they aren't really problematic, as they are single
> use), and while it is true that for x-y>x case the incremental patch below
> improves the generated code right now, as you said it is better to rewrite
> those as y>x and as it is a single use, it is easier to do it in match.pd.
> So, I'd prefer to add that transformation and not use {ADD,SUB}_OVERFLOW
> for those cases, because we get good enough code without increasing the IL
> size, eating more memory etc.
> 
> > I was thinking more match.pd to transform a+b<a and sccvn to somehow CSE a+b
> > with add_overflow(a,b), but your patch seems to work well with simpler code,
> > that's cool :-)

Yeah, I think the current patch aids RTL expansion best.

> > And it shouldn't be too hard to add a few more later, to detect widening
> > operations that are only used for overflow testing, although the form of
> > such tests is much less universal among users.

> --- gcc/tree-ssa-math-opts.c.jj	2015-11-24 17:00:10.000000000 +0100
> +++ gcc/tree-ssa-math-opts.c	2015-11-25 09:25:31.781087597 +0100
> @@ -3586,7 +3586,6 @@ match_uaddsub_overflow (gimple_stmt_iter
>    tree type = TREE_TYPE (lhs);
>    use_operand_p use_p;
>    imm_use_iterator iter;
> -  bool use_seen = false;
>    bool ovf_use_seen = false;
>    gimple *use_stmt;
>  
> @@ -3594,7 +3593,6 @@ match_uaddsub_overflow (gimple_stmt_iter
>    if (!INTEGRAL_TYPE_P (type)
>        || !TYPE_UNSIGNED (type)
>        || has_zero_uses (lhs)
> -      || has_single_use (lhs)
>        || optab_handler (code == PLUS_EXPR ? uaddv4_optab : usubv4_optab,
>  			TYPE_MODE (type)) == CODE_FOR_nothing)
>      return false;
> @@ -3606,14 +3604,13 @@ match_uaddsub_overflow (gimple_stmt_iter
>  	continue;
>  
>        if (uaddsub_overflow_check_p (stmt, use_stmt))
> -	ovf_use_seen = true;
> -      else
> -	use_seen = true;
> -      if (ovf_use_seen && use_seen)
> -	break;
> +	{
> +	  ovf_use_seen = true;
> +	  break;
> +	}
>      }
>  
> -  if (!ovf_use_seen || !use_seen)
> +  if (!ovf_use_seen)
>      return false;
>  
>    tree ctype = build_complex_type (type);

Ok with me as well.

Thanks,
Richard.

> 
> 	Jakub
> 
>
Marc Glisse Nov. 25, 2015, 8:58 a.m. UTC | #2
On Wed, 25 Nov 2015, Jakub Jelinek wrote:

> On Wed, Nov 25, 2015 at 08:56:45AM +0100, Marc Glisse wrote:
>>> This is the GIMPLE side of Richard's i?86 uadd/usub overflow
>>> testing improvements.  If unsigned addition or subtraction
>>> result is used both normally and in a GIMPLE_COND/COND_EXPR/tcc_comparison
>>> that tests if unsigned overflow happened, the patch replaces it shortly
>>> before expansion with {ADD,SUB}_OVERFLOW, so that RTL expansion can generate
>>> better code on it.
>>
>> If I test a+b<a and don't use a+b anywhere else, don't we also want to use
>> the OVERFLOW things so we can expand to test the carry flag? That is, I am
>> not convinced we want to punt on has_single_use for add_overflow. For
>> sub_overflow with a single use of y-z, I guess y-z>y should become z>y, and
>> going through a rewrite with sub_overflow neither helps nor hinders that.
>> Actually, writing z>y is something the user is not unlikely to have done
>> himself, and walking through the uses of y or z should not be hard, so I
>> guess it could make sense to rewrite y-z>y to z>y always in match.pd and
>> only look for the second form in math-opts.
>
> Incremental diff for also handling the single use case if it is overflow
> check is below.  But we already generate good code without it for the
> x+y<x or x+y<y cases (and they aren't really problematic, as they are single
> use), and while it is true that for x-y>x case the incremental patch below
> improves the generated code right now, as you said it is better to rewrite
> those as y>x and as it is a single use, it is easier to do it in match.pd.
> So, I'd prefer to add that transformation and not use {ADD,SUB}_OVERFLOW
> for those cases, because we get good enough code without increasing the IL
> size, eating more memory etc.

I guess it got lost in my text, but if a user writes:

unsigned diff = a - b;
if (b > a) { /* overflow */ ... }
else { ... }

Your patch will not detect it. It seems that replacing x-y>x with y>x 
could be done whether it is single use or not, and we could look for the 
pattern above instead of the one you currently have for sub_overflow. The 
main change would be that now it isn't obvious where to insert the 
sub_overflow, since one operation doesn't obviously dominate the other :-(
Jakub Jelinek Nov. 25, 2015, 9:04 a.m. UTC | #3
On Wed, Nov 25, 2015 at 09:58:15AM +0100, Marc Glisse wrote:
> I guess it got lost in my text, but if a user writes:
> 
> unsigned diff = a - b;
> if (b > a) { /* overflow */ ... }
> else { ... }
> 
> Your patch will not detect it. It seems that replacing x-y>x with y>x could

Sorry, already committed the patch (without incremental that hasn't been
tested anyway).
It is true that the patch does not detect this, but it is harder that way.
What if it is
if (b > a) ...
// Huge amount of code
r = a - b;
?  Trying to emit the subtraction above the comparison would then very
likely increase register preassure.  So, I'd really prefer doing x-y>x to y>x
only for single use.

	Jakub
Richard Biener Nov. 25, 2015, 9:08 a.m. UTC | #4
On Wed, 25 Nov 2015, Jakub Jelinek wrote:

> On Wed, Nov 25, 2015 at 09:58:15AM +0100, Marc Glisse wrote:
> > I guess it got lost in my text, but if a user writes:
> > 
> > unsigned diff = a - b;
> > if (b > a) { /* overflow */ ... }
> > else { ... }
> > 
> > Your patch will not detect it. It seems that replacing x-y>x with y>x could
> 
> Sorry, already committed the patch (without incremental that hasn't been
> tested anyway).
> It is true that the patch does not detect this, but it is harder that way.
> What if it is
> if (b > a) ...
> // Huge amount of code
> r = a - b;
> ?  Trying to emit the subtraction above the comparison would then very
> likely increase register preassure.  So, I'd really prefer doing x-y>x to y>x
> only for single use.

I think that's ok for now.  For the above case you'd need to do sth
similar to what cse sincos / divmod do, with an added "cost" check
(same basic-block?).

Richard.
Marc Glisse Nov. 25, 2015, 11:19 a.m. UTC | #5
On Wed, 25 Nov 2015, Jakub Jelinek wrote:

> On Wed, Nov 25, 2015 at 09:58:15AM +0100, Marc Glisse wrote:
>> I guess it got lost in my text, but if a user writes:
>>
>> unsigned diff = a - b;
>> if (b > a) { /* overflow */ ... }
>> else { ... }
>>
>> Your patch will not detect it. It seems that replacing x-y>x with y>x could
>
> Sorry, already committed the patch (without incremental that hasn't been
> tested anyway).

Sorry, I never meant to imply that your patch was wrong in any way or 
should be blocked, I like it. I was only discussing possible future 
improvements...

> It is true that the patch does not detect this, but it is harder that way.
> What if it is
> if (b > a) ...
> // Huge amount of code
> r = a - b;
> ?  Trying to emit the subtraction above the comparison would then very
> likely increase register preassure.

The same is true whether we write it b > a or (a - b) > a (I don't think 
PRE + SCCVN avoid increasing register pressure).

> So, I'd really prefer doing x-y>x to y>x only for single use.

Ok (for now).
Jakub Jelinek Nov. 25, 2015, 11:23 a.m. UTC | #6
On Wed, Nov 25, 2015 at 12:19:04PM +0100, Marc Glisse wrote:
> On Wed, 25 Nov 2015, Jakub Jelinek wrote:
> 
> >On Wed, Nov 25, 2015 at 09:58:15AM +0100, Marc Glisse wrote:
> >>I guess it got lost in my text, but if a user writes:
> >>
> >>unsigned diff = a - b;
> >>if (b > a) { /* overflow */ ... }
> >>else { ... }
> >>
> >>Your patch will not detect it. It seems that replacing x-y>x with y>x could
> >
> >Sorry, already committed the patch (without incremental that hasn't been
> >tested anyway).
> 
> Sorry, I never meant to imply that your patch was wrong in any way or should
> be blocked, I like it. I was only discussing possible future improvements...

BTW, the primary reason for the patch has been a code quality regression,
and I bet that is only for the case of if (diff > a), otherwise combiner
with the problematic subtraction with overflow patterns wouldn't be able to
find anything.  rth fixed it only for the case where users explicitly use
the new __builtin_*_overflow builtins, and the patch has been trying to fix
the regression even when not written that way.

> The same is true whether we write it b > a or (a - b) > a (I don't think PRE
> + SCCVN avoid increasing register pressure).
> 
> >So, I'd really prefer doing x-y>x to y>x only for single use.
> 
> Ok (for now).

Do you plan to work on that (my match.pd experience is smaller than yours),
or should I add to my todo list?

	Jakub
Marc Glisse Nov. 25, 2015, 9:08 p.m. UTC | #7
On Wed, 25 Nov 2015, Jakub Jelinek wrote:

>> The same is true whether we write it b > a or (a - b) > a (I don't think PRE
>> + SCCVN avoid increasing register pressure).
>>
>>> So, I'd really prefer doing x-y>x to y>x only for single use.
>>
>> Ok (for now).
>
> Do you plan to work on that (my match.pd experience is smaller than yours),
> or should I add to my todo list?

Are we talking stage 3 or next stage 1? If you want something for stage 3, 
I think you'll have to do it, it shouldn't be much longer than

(for cmp (gt le)
  (simplify
   (cmp (minus:s @0 @1) @0)
   (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
    (cmp @1 @0))))

and a similar one for x<y-x. (I don't think TYPE_UNSIGNED needs protection 
against floats or whatever, but I could be wrong)
Richard Biener Nov. 26, 2015, 8:50 a.m. UTC | #8
On Wed, 25 Nov 2015, Marc Glisse wrote:

> On Wed, 25 Nov 2015, Jakub Jelinek wrote:
> 
> > > The same is true whether we write it b > a or (a - b) > a (I don't think
> > > PRE
> > > + SCCVN avoid increasing register pressure).
> > > 
> > > > So, I'd really prefer doing x-y>x to y>x only for single use.
> > > 
> > > Ok (for now).
> > 
> > Do you plan to work on that (my match.pd experience is smaller than yours),
> > or should I add to my todo list?
> 
> Are we talking stage 3 or next stage 1? If you want something for stage 3, I
> think you'll have to do it, it shouldn't be much longer than
> 
> (for cmp (gt le)
>  (simplify
>   (cmp (minus:s @0 @1) @0)
>   (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
>    (cmp @1 @0))))
> 
> and a similar one for x<y-x. (I don't think TYPE_UNSIGNED needs protection
> against floats or whatever, but I could be wrong)

floats should be fine here.  But eventually saturating types need to
be excluded?

Note that the :s on the minus has no effect as the result is a
single operation.  If you want to restrict this nevertheless
you need a && single_use (@2) and capture the minus with
(minus@2 @0 @1) instead.

Richard.
diff mbox

Patch

--- gcc/tree-ssa-math-opts.c.jj	2015-11-24 17:00:10.000000000 +0100
+++ gcc/tree-ssa-math-opts.c	2015-11-25 09:25:31.781087597 +0100
@@ -3586,7 +3586,6 @@  match_uaddsub_overflow (gimple_stmt_iter
   tree type = TREE_TYPE (lhs);
   use_operand_p use_p;
   imm_use_iterator iter;
-  bool use_seen = false;
   bool ovf_use_seen = false;
   gimple *use_stmt;
 
@@ -3594,7 +3593,6 @@  match_uaddsub_overflow (gimple_stmt_iter
   if (!INTEGRAL_TYPE_P (type)
       || !TYPE_UNSIGNED (type)
       || has_zero_uses (lhs)
-      || has_single_use (lhs)
       || optab_handler (code == PLUS_EXPR ? uaddv4_optab : usubv4_optab,
 			TYPE_MODE (type)) == CODE_FOR_nothing)
     return false;
@@ -3606,14 +3604,13 @@  match_uaddsub_overflow (gimple_stmt_iter
 	continue;
 
       if (uaddsub_overflow_check_p (stmt, use_stmt))
-	ovf_use_seen = true;
-      else
-	use_seen = true;
-      if (ovf_use_seen && use_seen)
-	break;
+	{
+	  ovf_use_seen = true;
+	  break;
+	}
     }
 
-  if (!ovf_use_seen || !use_seen)
+  if (!ovf_use_seen)
     return false;
 
   tree ctype = build_complex_type (type);