diff mbox series

[middle-end/88784] Middle end is missing some optimizations about unsigned

Message ID 1561615913-22109-1-git-send-email-helijia@linux.ibm.com
State New
Headers show
Series [middle-end/88784] Middle end is missing some optimizations about unsigned | expand

Commit Message

Li Jia He June 27, 2019, 6:11 a.m. UTC
Hi,

According to the optimizable case described by Qi Feng on
issue 88784, we can combine the cases into the following:

1. x >  y  &&  x != XXX_MIN  -->   x > y
2. x >  y  &&  x == XXX_MIN  -->   false
3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN

4. x <  y  &&  x != XXX_MAX  -->   x < y
5. x <  y  &&  x == XXX_MAX  -->   false
6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX

7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
8. x <= y  ||  x != XXX_MIN  -->   true
9. x <= y  ||  x == XXX_MIN  -->   x <= y

10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
11. x >= y  ||  x != XXX_MAX  -->   true
12. x >= y  ||  x == XXX_MAX  -->   x >= y

Note: XXX_MIN represents the minimum value of type x.
      XXX_MAX represents the maximum value of type x.

Here we don't need to care about whether the operation is
signed or unsigned.  For example, in the below equation:

'x >  y  &&  x != XXX_MIN  -->   x > y'

If the x type is signed int and XXX_MIN is INT_MIN, we can
optimize it to 'x > y'.  However, if the type of x is unsigned
int and XXX_MIN is 0, we can still optimize it to 'x > y'.

The regression testing for the patch was done on GCC mainline on

    powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regressions.  Is it OK for trunk ?

Thanks,
Lijia He

gcc/ChangeLog

2019-06-27  Li Jia He  <helijia@linux.ibm.com>
	    Qi Feng  <ffengqi@linux.ibm.com>

	PR middle-end/88784
	* gimple-fold.c (and_comparisons_contain_equal_operands): New function.
	(and_comparisons_1): Use and_comparisons_contain_equal_operands.
	(or_comparisons_contain_equal_operands): New function.
	(or_comparisons_1): Use or_comparisons_contain_equal_operands.

gcc/testsuite/ChangeLog

2019-06-27  Li Jia He  <helijia@linux.ibm.com>
	    Qi Feng  <ffengqi@linux.ibm.com>

	PR middle-end/88784
	* gcc.dg/pr88784-1.c: New testcase.
	* gcc.dg/pr88784-2.c: New testcase.
	* gcc.dg/pr88784-3.c: New testcase.
	* gcc.dg/pr88784-4.c: New testcase.
	* gcc.dg/pr88784-5.c: New testcase.
	* gcc.dg/pr88784-6.c: New testcase.
	* gcc.dg/pr88784-7.c: New testcase.
	* gcc.dg/pr88784-8.c: New testcase.
	* gcc.dg/pr88784-9.c: New testcase.
	* gcc.dg/pr88784-10.c: New testcase.
	* gcc.dg/pr88784-11.c: New testcase.
	* gcc.dg/pr88784-12.c: New testcase.
---
 gcc/gimple-fold.c                 | 124 ++++++++++++++++++++++++++++++
 gcc/testsuite/gcc.dg/pr88784-1.c  |  30 ++++++++
 gcc/testsuite/gcc.dg/pr88784-10.c |  32 ++++++++
 gcc/testsuite/gcc.dg/pr88784-11.c |  30 ++++++++
 gcc/testsuite/gcc.dg/pr88784-12.c |  30 ++++++++
 gcc/testsuite/gcc.dg/pr88784-2.c  |  30 ++++++++
 gcc/testsuite/gcc.dg/pr88784-3.c  |  32 ++++++++
 gcc/testsuite/gcc.dg/pr88784-4.c  |  32 ++++++++
 gcc/testsuite/gcc.dg/pr88784-5.c  |  31 ++++++++
 gcc/testsuite/gcc.dg/pr88784-6.c  |  31 ++++++++
 gcc/testsuite/gcc.dg/pr88784-7.c  |  31 ++++++++
 gcc/testsuite/gcc.dg/pr88784-8.c  |  31 ++++++++
 gcc/testsuite/gcc.dg/pr88784-9.c  |  32 ++++++++
 13 files changed, 496 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-10.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-11.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-12.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-3.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-4.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-5.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-6.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-7.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-8.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-9.c

Comments

Jeff Law June 27, 2019, 3:48 p.m. UTC | #1
On 6/27/19 12:11 AM, Li Jia He wrote:
> Hi,
> 
> According to the optimizable case described by Qi Feng on
> issue 88784, we can combine the cases into the following:
> 
> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> 2. x >  y  &&  x == XXX_MIN  -->   false
> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> 
> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> 5. x <  y  &&  x == XXX_MAX  -->   false
> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> 
> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> 8. x <= y  ||  x != XXX_MIN  -->   true
> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> 
> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> 11. x >= y  ||  x != XXX_MAX  -->   true
> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> 
> Note: XXX_MIN represents the minimum value of type x.
>       XXX_MAX represents the maximum value of type x.
> 
> Here we don't need to care about whether the operation is
> signed or unsigned.  For example, in the below equation:
> 
> 'x >  y  &&  x != XXX_MIN  -->   x > y'
> 
> If the x type is signed int and XXX_MIN is INT_MIN, we can
> optimize it to 'x > y'.  However, if the type of x is unsigned
> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> 
> The regression testing for the patch was done on GCC mainline on
> 
>     powerpc64le-unknown-linux-gnu (Power 9 LE)
> 
> with no regressions.  Is it OK for trunk ?
> 
> Thanks,
> Lijia He
> 
> gcc/ChangeLog
> 
> 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
> 	    Qi Feng  <ffengqi@linux.ibm.com>
> 
> 	PR middle-end/88784
> 	* gimple-fold.c (and_comparisons_contain_equal_operands): New function.
> 	(and_comparisons_1): Use and_comparisons_contain_equal_operands.
> 	(or_comparisons_contain_equal_operands): New function.
> 	(or_comparisons_1): Use or_comparisons_contain_equal_operands.
Would this be better done via match.pd?  ISTM this transformation would
be well suited for that framework.

jeff
Li Jia He June 28, 2019, 4:55 a.m. UTC | #2
On 2019/6/27 11:48 PM, Jeff Law wrote:
> On 6/27/19 12:11 AM, Li Jia He wrote:
>> Hi,
>>
>> According to the optimizable case described by Qi Feng on
>> issue 88784, we can combine the cases into the following:
>>
>> 1. x >  y  &&  x != XXX_MIN  -->   x > y
>> 2. x >  y  &&  x == XXX_MIN  -->   false
>> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>>
>> 4. x <  y  &&  x != XXX_MAX  -->   x < y
>> 5. x <  y  &&  x == XXX_MAX  -->   false
>> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>>
>> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
>> 8. x <= y  ||  x != XXX_MIN  -->   true
>> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>>
>> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
>> 11. x >= y  ||  x != XXX_MAX  -->   true
>> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>>
>> Note: XXX_MIN represents the minimum value of type x.
>>        XXX_MAX represents the maximum value of type x.
>>
>> Here we don't need to care about whether the operation is
>> signed or unsigned.  For example, in the below equation:
>>
>> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>>
>> If the x type is signed int and XXX_MIN is INT_MIN, we can
>> optimize it to 'x > y'.  However, if the type of x is unsigned
>> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>>
>> The regression testing for the patch was done on GCC mainline on
>>
>>      powerpc64le-unknown-linux-gnu (Power 9 LE)
>>
>> with no regressions.  Is it OK for trunk ?
>>
>> Thanks,
>> Lijia He
>>
>> gcc/ChangeLog
>>
>> 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
>> 	    Qi Feng  <ffengqi@linux.ibm.com>
>>
>> 	PR middle-end/88784
>> 	* gimple-fold.c (and_comparisons_contain_equal_operands): New function.
>> 	(and_comparisons_1): Use and_comparisons_contain_equal_operands.
>> 	(or_comparisons_contain_equal_operands): New function.
>> 	(or_comparisons_1): Use or_comparisons_contain_equal_operands.
> Would this be better done via match.pd?  ISTM this transformation would
> be well suited for that framework.

Hi, Jeff

I did this because of the following test case:
`
_Bool comp(unsigned x, unsigned y)
{
   return x > y && x != 0;
}
`
The gimple file dumped on the power platform is:
`
comp (unsigned int x, unsigned int y)
{
   _Bool D.2837;
   int iftmp.0;

   if (x > y) goto <D.2841>; else goto <D.2839>;
   <D.2841>:
   if (x != 0) goto <D.2842>; else goto <D.2839>;
   <D.2842>:
   iftmp.0 = 1;
   goto <D.2840>;
   <D.2839>:
   iftmp.0 = 0;
   <D.2840>:
   D.2837 = (_Bool) iftmp.0;
   return D.2837;
}
`
However, the gimple file dumped on x86 is
`
comp (unsigned int x, unsigned int y)
{
   _Bool D.2837;

   _1 = x > y;
   _2 = x != 0;
   _3 = _1 & _2;
   _4 = (int) _3;
   D.2837 = (_Bool) _4;
   return D.2837;
}
`

The reason for the inconsistency between these two behaviors is param
logical-op-non-short-circuit.  If we add the pattern to the match.pd
file, we can only optimize the situation in which the statement is in
the same basic block (logical-op-non-short-circuit=1, x86).  But for
a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
can't handle this situation.

Another reason is that I found out maybe_fold_and_comparisons and
maybe_fold_or_comparisons are not only called by ifcombine pass but
also by reassoc pass. Using this method can basically unify param
logical-op-non-short-circuit=0 or 1.

Thanks,
Lijia He

> 
> jeff
>
Andrew Pinski June 28, 2019, 5:02 p.m. UTC | #3
On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote:
>
>
>
> On 2019/6/27 11:48 PM, Jeff Law wrote:
> > On 6/27/19 12:11 AM, Li Jia He wrote:
> >> Hi,
> >>
> >> According to the optimizable case described by Qi Feng on
> >> issue 88784, we can combine the cases into the following:
> >>
> >> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> >> 2. x >  y  &&  x == XXX_MIN  -->   false
> >> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> >>
> >> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> >> 5. x <  y  &&  x == XXX_MAX  -->   false
> >> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> >>
> >> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> >> 8. x <= y  ||  x != XXX_MIN  -->   true
> >> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> >>
> >> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> >> 11. x >= y  ||  x != XXX_MAX  -->   true
> >> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> >>
> >> Note: XXX_MIN represents the minimum value of type x.
> >>        XXX_MAX represents the maximum value of type x.
> >>
> >> Here we don't need to care about whether the operation is
> >> signed or unsigned.  For example, in the below equation:
> >>
> >> 'x >  y  &&  x != XXX_MIN  -->   x > y'
> >>
> >> If the x type is signed int and XXX_MIN is INT_MIN, we can
> >> optimize it to 'x > y'.  However, if the type of x is unsigned
> >> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> >>
> >> The regression testing for the patch was done on GCC mainline on
> >>
> >>      powerpc64le-unknown-linux-gnu (Power 9 LE)
> >>
> >> with no regressions.  Is it OK for trunk ?
> >>
> >> Thanks,
> >> Lijia He
> >>
> >> gcc/ChangeLog
> >>
> >> 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
> >>          Qi Feng  <ffengqi@linux.ibm.com>
> >>
> >>      PR middle-end/88784
> >>      * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
> >>      (and_comparisons_1): Use and_comparisons_contain_equal_operands.
> >>      (or_comparisons_contain_equal_operands): New function.
> >>      (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > Would this be better done via match.pd?  ISTM this transformation would
> > be well suited for that framework.
>
> Hi, Jeff
>
> I did this because of the following test case:
> `
> _Bool comp(unsigned x, unsigned y)
> {
>    return x > y && x != 0;
> }
> `
> The gimple file dumped on the power platform is:
> `
> comp (unsigned int x, unsigned int y)
> {
>    _Bool D.2837;
>    int iftmp.0;
>
>    if (x > y) goto <D.2841>; else goto <D.2839>;
>    <D.2841>:
>    if (x != 0) goto <D.2842>; else goto <D.2839>;
>    <D.2842>:
>    iftmp.0 = 1;
>    goto <D.2840>;
>    <D.2839>:
>    iftmp.0 = 0;
>    <D.2840>:
>    D.2837 = (_Bool) iftmp.0;
>    return D.2837;
> }
> `
> However, the gimple file dumped on x86 is
> `
> comp (unsigned int x, unsigned int y)
> {
>    _Bool D.2837;
>
>    _1 = x > y;
>    _2 = x != 0;
>    _3 = _1 & _2;
>    _4 = (int) _3;
>    D.2837 = (_Bool) _4;
>    return D.2837;
> }
> `
>
> The reason for the inconsistency between these two behaviors is param
> logical-op-non-short-circuit.  If we add the pattern to the match.pd
> file, we can only optimize the situation in which the statement is in
> the same basic block (logical-op-non-short-circuit=1, x86).  But for
> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> can't handle this situation.
>
> Another reason is that I found out maybe_fold_and_comparisons and
> maybe_fold_or_comparisons are not only called by ifcombine pass but
> also by reassoc pass. Using this method can basically unify param
> logical-op-non-short-circuit=0 or 1.


As mentioned before ifcombine pass should be using gimple-match
instead of fold_build.  Try converting ifcombine over to gimple-match
infrastructure and add these to match.pd.
NOTE tree-ssa-phiopt should also be moved over to gimple-match but
that is a different issue.


Thanks,
Andrew Pinski

>
> Thanks,
> Lijia He
>
> >
> > jeff
> >
>
Richard Biener July 1, 2019, 7:30 a.m. UTC | #4
On Fri, 28 Jun 2019, Andrew Pinski wrote:

> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote:
> >
> >
> >
> > On 2019/6/27 11:48 PM, Jeff Law wrote:
> > > On 6/27/19 12:11 AM, Li Jia He wrote:
> > >> Hi,
> > >>
> > >> According to the optimizable case described by Qi Feng on
> > >> issue 88784, we can combine the cases into the following:
> > >>
> > >> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> > >> 2. x >  y  &&  x == XXX_MIN  -->   false
> > >> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> > >>
> > >> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> > >> 5. x <  y  &&  x == XXX_MAX  -->   false
> > >> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> > >>
> > >> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> > >> 8. x <= y  ||  x != XXX_MIN  -->   true
> > >> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> > >>
> > >> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> > >> 11. x >= y  ||  x != XXX_MAX  -->   true
> > >> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> > >>
> > >> Note: XXX_MIN represents the minimum value of type x.
> > >>        XXX_MAX represents the maximum value of type x.
> > >>
> > >> Here we don't need to care about whether the operation is
> > >> signed or unsigned.  For example, in the below equation:
> > >>
> > >> 'x >  y  &&  x != XXX_MIN  -->   x > y'
> > >>
> > >> If the x type is signed int and XXX_MIN is INT_MIN, we can
> > >> optimize it to 'x > y'.  However, if the type of x is unsigned
> > >> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> > >>
> > >> The regression testing for the patch was done on GCC mainline on
> > >>
> > >>      powerpc64le-unknown-linux-gnu (Power 9 LE)
> > >>
> > >> with no regressions.  Is it OK for trunk ?
> > >>
> > >> Thanks,
> > >> Lijia He
> > >>
> > >> gcc/ChangeLog
> > >>
> > >> 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
> > >>          Qi Feng  <ffengqi@linux.ibm.com>
> > >>
> > >>      PR middle-end/88784
> > >>      * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
> > >>      (and_comparisons_1): Use and_comparisons_contain_equal_operands.
> > >>      (or_comparisons_contain_equal_operands): New function.
> > >>      (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > > Would this be better done via match.pd?  ISTM this transformation would
> > > be well suited for that framework.
> >
> > Hi, Jeff
> >
> > I did this because of the following test case:
> > `
> > _Bool comp(unsigned x, unsigned y)
> > {
> >    return x > y && x != 0;
> > }
> > `
> > The gimple file dumped on the power platform is:
> > `
> > comp (unsigned int x, unsigned int y)
> > {
> >    _Bool D.2837;
> >    int iftmp.0;
> >
> >    if (x > y) goto <D.2841>; else goto <D.2839>;
> >    <D.2841>:
> >    if (x != 0) goto <D.2842>; else goto <D.2839>;
> >    <D.2842>:
> >    iftmp.0 = 1;
> >    goto <D.2840>;
> >    <D.2839>:
> >    iftmp.0 = 0;
> >    <D.2840>:
> >    D.2837 = (_Bool) iftmp.0;
> >    return D.2837;
> > }
> > `
> > However, the gimple file dumped on x86 is
> > `
> > comp (unsigned int x, unsigned int y)
> > {
> >    _Bool D.2837;
> >
> >    _1 = x > y;
> >    _2 = x != 0;
> >    _3 = _1 & _2;
> >    _4 = (int) _3;
> >    D.2837 = (_Bool) _4;
> >    return D.2837;
> > }
> > `
> >
> > The reason for the inconsistency between these two behaviors is param
> > logical-op-non-short-circuit.  If we add the pattern to the match.pd
> > file, we can only optimize the situation in which the statement is in
> > the same basic block (logical-op-non-short-circuit=1, x86).  But for
> > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> > can't handle this situation.
> >
> > Another reason is that I found out maybe_fold_and_comparisons and
> > maybe_fold_or_comparisons are not only called by ifcombine pass but
> > also by reassoc pass. Using this method can basically unify param
> > logical-op-non-short-circuit=0 or 1.
> 
> 
> As mentioned before ifcombine pass should be using gimple-match
> instead of fold_build.  Try converting ifcombine over to gimple-match
> infrastructure and add these to match.pd.

Yes, I mentioned that in the PR.  The issue is that at the moment
to combine x > y with x <= y you'd have to build GENERIC trees
for both or temporary GIMPLE assign with a SSA def (and then feed
that into the GENERIC or GIMPLE match.pd path).

maybe_fold_and/or_comparisons handle two exploded binary expressions
while the current match.pd entries handle at most one exploded one
(the outermost then, either AND or OR).  But it would be definitely
doable to auto-generate maybe_fold_and/or_comparisons from match.pd
patterns which is what I'd ultimatively suggest to do (in some more
generalized form maybe).  Either with a separate genmatch invocation
or as part of the --gimple processing (not sure what is more feasible
here).

I told Li Jia He that I don't expect him to do this work.

Note I didn't review the actual patch yet.

Thanks,
Richard.
Li Jia He July 2, 2019, 7:41 a.m. UTC | #5
On 2019/7/1 3:30 PM, Richard Biener wrote:
> On Fri, 28 Jun 2019, Andrew Pinski wrote:
> 
>> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote:
>>>
>>>
>>>
>>> On 2019/6/27 11:48 PM, Jeff Law wrote:
>>>> On 6/27/19 12:11 AM, Li Jia He wrote:
>>>>> Hi,
>>>>>
>>>>> According to the optimizable case described by Qi Feng on
>>>>> issue 88784, we can combine the cases into the following:
>>>>>
>>>>> 1. x >  y  &&  x != XXX_MIN  -->   x > y
>>>>> 2. x >  y  &&  x == XXX_MIN  -->   false
>>>>> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>>>>>
>>>>> 4. x <  y  &&  x != XXX_MAX  -->   x < y
>>>>> 5. x <  y  &&  x == XXX_MAX  -->   false
>>>>> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>>>>>
>>>>> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
>>>>> 8. x <= y  ||  x != XXX_MIN  -->   true
>>>>> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>>>>>
>>>>> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
>>>>> 11. x >= y  ||  x != XXX_MAX  -->   true
>>>>> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>>>>>
>>>>> Note: XXX_MIN represents the minimum value of type x.
>>>>>         XXX_MAX represents the maximum value of type x.
>>>>>
>>>>> Here we don't need to care about whether the operation is
>>>>> signed or unsigned.  For example, in the below equation:
>>>>>
>>>>> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>>>>>
>>>>> If the x type is signed int and XXX_MIN is INT_MIN, we can
>>>>> optimize it to 'x > y'.  However, if the type of x is unsigned
>>>>> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>>>>>
>>>>> The regression testing for the patch was done on GCC mainline on
>>>>>
>>>>>       powerpc64le-unknown-linux-gnu (Power 9 LE)
>>>>>
>>>>> with no regressions.  Is it OK for trunk ?
>>>>>
>>>>> Thanks,
>>>>> Lijia He
>>>>>
>>>>> gcc/ChangeLog
>>>>>
>>>>> 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
>>>>>           Qi Feng  <ffengqi@linux.ibm.com>
>>>>>
>>>>>       PR middle-end/88784
>>>>>       * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
>>>>>       (and_comparisons_1): Use and_comparisons_contain_equal_operands.
>>>>>       (or_comparisons_contain_equal_operands): New function.
>>>>>       (or_comparisons_1): Use or_comparisons_contain_equal_operands.
>>>> Would this be better done via match.pd?  ISTM this transformation would
>>>> be well suited for that framework.
>>>
>>> Hi, Jeff
>>>
>>> I did this because of the following test case:
>>> `
>>> _Bool comp(unsigned x, unsigned y)
>>> {
>>>     return x > y && x != 0;
>>> }
>>> `
>>> The gimple file dumped on the power platform is:
>>> `
>>> comp (unsigned int x, unsigned int y)
>>> {
>>>     _Bool D.2837;
>>>     int iftmp.0;
>>>
>>>     if (x > y) goto <D.2841>; else goto <D.2839>;
>>>     <D.2841>:
>>>     if (x != 0) goto <D.2842>; else goto <D.2839>;
>>>     <D.2842>:
>>>     iftmp.0 = 1;
>>>     goto <D.2840>;
>>>     <D.2839>:
>>>     iftmp.0 = 0;
>>>     <D.2840>:
>>>     D.2837 = (_Bool) iftmp.0;
>>>     return D.2837;
>>> }
>>> `
>>> However, the gimple file dumped on x86 is
>>> `
>>> comp (unsigned int x, unsigned int y)
>>> {
>>>     _Bool D.2837;
>>>
>>>     _1 = x > y;
>>>     _2 = x != 0;
>>>     _3 = _1 & _2;
>>>     _4 = (int) _3;
>>>     D.2837 = (_Bool) _4;
>>>     return D.2837;
>>> }
>>> `
>>>
>>> The reason for the inconsistency between these two behaviors is param
>>> logical-op-non-short-circuit.  If we add the pattern to the match.pd
>>> file, we can only optimize the situation in which the statement is in
>>> the same basic block (logical-op-non-short-circuit=1, x86).  But for
>>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
>>> can't handle this situation.
>>>
>>> Another reason is that I found out maybe_fold_and_comparisons and
>>> maybe_fold_or_comparisons are not only called by ifcombine pass but
>>> also by reassoc pass. Using this method can basically unify param
>>> logical-op-non-short-circuit=0 or 1.
>>
>>
>> As mentioned before ifcombine pass should be using gimple-match
>> instead of fold_build.  Try converting ifcombine over to gimple-match
>> infrastructure and add these to match.pd.
> 
> Yes, I mentioned that in the PR.  The issue is that at the moment
> to combine x > y with x <= y you'd have to build GENERIC trees
> for both or temporary GIMPLE assign with a SSA def (and then feed
> that into the GENERIC or GIMPLE match.pd path).

Hi,

I did some experimentation using ‘temporary GIMPLE with a SSA def (and 
then feed that into the GIMPLE match.pd path’.  Could we consider the 
code in the attachment(I did a test and the code took effect)?

Thanks,
Lijia He

> 
> maybe_fold_and/or_comparisons handle two exploded binary expressions
> while the current match.pd entries handle at most one exploded one
> (the outermost then, either AND or OR).  But it would be definitely
> doable to auto-generate maybe_fold_and/or_comparisons from match.pd
> patterns which is what I'd ultimatively suggest to do (in some more
> generalized form maybe).  Either with a separate genmatch invocation
> or as part of the --gimple processing (not sure what is more feasible
> here).
> 
> I told Li Jia He that I don't expect him to do this work.
> 
> Note I didn't review the actual patch yet.
> 
> Thanks,
> Richard.
>
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index dfb31a02078..9974b491626 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -5789,6 +5789,12 @@ and_comparisons_1 (enum tree_code code1, tree op1a, tree op1b,
   return NULL_TREE;
 }
 
+static tree
+do_valueize (tree t)
+{
+  return t;
+}
+
 /* Try to simplify the AND of two comparisons, specified by
    (OP1A CODE1 OP1B) and (OP2B CODE2 OP2B), respectively.
    If this can be simplified to a single expression (without requiring
@@ -5800,11 +5806,39 @@ tree
 maybe_fold_and_comparisons (enum tree_code code1, tree op1a, tree op1b,
 			    enum tree_code code2, tree op2a, tree op2b)
 {
-  tree t = and_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b);
+  tree t1 = fold_build2 (code1, boolean_type_node, op1a, op1b);
+  tree t2 = fold_build2 (code2, boolean_type_node, op2a, op2b);
+
+  tree new_lhs1 = make_ssa_name (TREE_TYPE (t1));
+  tree new_lhs2 = make_ssa_name (TREE_TYPE (t2));
+
+  gassign *gassign1 = gimple_build_assign (new_lhs1, t1);
+  gassign *gassign2 = gimple_build_assign (new_lhs2, t2);
+
+  tree t = gimple_simplify (TRUTH_AND_EXPR, boolean_type_node,
+			    gimple_assign_lhs (gassign1),
+			    gimple_assign_lhs (gassign2), NULL, do_valueize);
+
+  if (!t)
+    {
+      t = gimple_simplify (TRUTH_AND_EXPR, boolean_type_node,
+			   gimple_assign_lhs (gassign2),
+			   gimple_assign_lhs (gassign1), NULL, do_valueize);
+    }
+
   if (t)
-    return t;
-  else
-    return and_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b);
+    {
+      gimple *def = SSA_NAME_DEF_STMT (t);
+
+      if (!is_gimple_assign (def)
+	  || TREE_CODE_CLASS (gimple_assign_rhs_code (def)) != tcc_comparison)
+	return NULL_TREE;
+
+      return fold_build2 (gimple_assign_rhs_code (def), boolean_type_node,
+			  gimple_assign_rhs1 (def), gimple_assign_rhs2 (def));
+    }
+
+  return NULL_TREE;
 }
 
 /* Helper function for or_comparisons_1:  try to simplify the OR of the
diff --git a/gcc/match.pd b/gcc/match.pd
index f8e35e96d22..21a147d0ff1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1859,6 +1859,101 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     { wide_int_to_tree (type, (wi::to_wide (@1)
 			       & (bitpos / BITS_PER_UNIT))); }))))
 
+/* x >  y  &&  x != XXX_MIN  -->  x > y  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))))
+    @3)))
+
+/* x >  y  &&  x == XXX_MIN  -->  false  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (gt:c @0 @1) (eq @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))))
+    { boolean_false_node; })))
+
+/* x <=  y  &&  x == XXX_MIN  -->  x == XXX_MIN  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (le:c @0 @1) (eq@3 @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))))
+    @3)))
+
+/* x <  y  &&  x != XXX_MAX  -->  x < y  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (lt:c@3 @0 @1) (ne @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2)))))
+    @3)))
+
+/* x <  y  &&  x == XXX_MAX  -->  false  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (lt:c @0 @1) (eq @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2)))))
+    { boolean_false_node; })))
+
+/* x >=  y  &&  x == XXX_MAX  -->  x == XXX_MAX  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (ge:c @0 @1) (eq@3 @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2)))))
+    @3)))
+
+/* x >  y  ||  x != XXX_MIN   -->  x != XXX_MIN  */
+(for or (truth_or bit_ior)
+ (simplify
+  (or:c (gt:c @0 @1) (ne@3 @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2))))
+    @3)))
+
+/* x <=  y  ||  x != XXX_MIN   -->  true  */
+(for or (truth_or bit_ior)
+ (simplify
+  (or:c (le:c @0 @1) (ne @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2))))
+    { boolean_true_node; })))
+
+/* x <=  y  ||  x == XXX_MIN   -->  x <= y  */
+(for or (truth_or bit_ior)
+ (simplify
+  (or:c (le:c@3 @0 @1) (eq @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2))))
+    @3)))
+
+/* x <  y  ||  x != XXX_MAX   -->  x != XXX_MAX  */
+(for or (truth_or bit_ior)
+ (simplify
+  (or:c (lt:c @0 @1) (ne@3 @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2))))
+    @3)))
+
+/* x >=  y  ||  x != XXX_MAX   -->  true  */
+(for or (truth_or bit_ior)
+ (simplify
+  (or:c (ge:c @0 @1) (ne @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2))))
+    { boolean_true_node; })))
+
+/* x >=  y  ||  x == XXX_MAX   -->  x >= y  */
+(for or (truth_or bit_ior)
+ (simplify
+  (or:c (ge:c@3 @0 @1) (eq @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))
+       && wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2))))
+    @3)))
 
 /* We can't reassociate at all for saturating types.  */
 (if (!TYPE_SATURATING (type))
Richard Biener July 2, 2019, 8:09 a.m. UTC | #6
On Tue, 2 Jul 2019, Li Jia He wrote:

> 
> 
> On 2019/7/1 3:30 PM, Richard Biener wrote:
> > On Fri, 28 Jun 2019, Andrew Pinski wrote:
> > 
> > > On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote:
> > > > 
> > > > 
> > > > 
> > > > On 2019/6/27 11:48 PM, Jeff Law wrote:
> > > > > On 6/27/19 12:11 AM, Li Jia He wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > According to the optimizable case described by Qi Feng on
> > > > > > issue 88784, we can combine the cases into the following:
> > > > > > 
> > > > > > 1. x >  y  &&  x != XXX_MIN  -->   x > y
> > > > > > 2. x >  y  &&  x == XXX_MIN  -->   false
> > > > > > 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> > > > > > 
> > > > > > 4. x <  y  &&  x != XXX_MAX  -->   x < y
> > > > > > 5. x <  y  &&  x == XXX_MAX  -->   false
> > > > > > 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> > > > > > 
> > > > > > 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> > > > > > 8. x <= y  ||  x != XXX_MIN  -->   true
> > > > > > 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> > > > > > 
> > > > > > 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> > > > > > 11. x >= y  ||  x != XXX_MAX  -->   true
> > > > > > 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> > > > > > 
> > > > > > Note: XXX_MIN represents the minimum value of type x.
> > > > > >         XXX_MAX represents the maximum value of type x.
> > > > > > 
> > > > > > Here we don't need to care about whether the operation is
> > > > > > signed or unsigned.  For example, in the below equation:
> > > > > > 
> > > > > > 'x >  y  &&  x != XXX_MIN  -->   x > y'
> > > > > > 
> > > > > > If the x type is signed int and XXX_MIN is INT_MIN, we can
> > > > > > optimize it to 'x > y'.  However, if the type of x is unsigned
> > > > > > int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> > > > > > 
> > > > > > The regression testing for the patch was done on GCC mainline on
> > > > > > 
> > > > > >       powerpc64le-unknown-linux-gnu (Power 9 LE)
> > > > > > 
> > > > > > with no regressions.  Is it OK for trunk ?
> > > > > > 
> > > > > > Thanks,
> > > > > > Lijia He
> > > > > > 
> > > > > > gcc/ChangeLog
> > > > > > 
> > > > > > 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
> > > > > >           Qi Feng  <ffengqi@linux.ibm.com>
> > > > > > 
> > > > > >       PR middle-end/88784
> > > > > >       * gimple-fold.c (and_comparisons_contain_equal_operands): New
> > > > > > function.
> > > > > >       (and_comparisons_1): Use
> > > > > > and_comparisons_contain_equal_operands.
> > > > > >       (or_comparisons_contain_equal_operands): New function.
> > > > > >       (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > > > > Would this be better done via match.pd?  ISTM this transformation
> > > > > would
> > > > > be well suited for that framework.
> > > > 
> > > > Hi, Jeff
> > > > 
> > > > I did this because of the following test case:
> > > > `
> > > > _Bool comp(unsigned x, unsigned y)
> > > > {
> > > >     return x > y && x != 0;
> > > > }
> > > > `
> > > > The gimple file dumped on the power platform is:
> > > > `
> > > > comp (unsigned int x, unsigned int y)
> > > > {
> > > >     _Bool D.2837;
> > > >     int iftmp.0;
> > > > 
> > > >     if (x > y) goto <D.2841>; else goto <D.2839>;
> > > >     <D.2841>:
> > > >     if (x != 0) goto <D.2842>; else goto <D.2839>;
> > > >     <D.2842>:
> > > >     iftmp.0 = 1;
> > > >     goto <D.2840>;
> > > >     <D.2839>:
> > > >     iftmp.0 = 0;
> > > >     <D.2840>:
> > > >     D.2837 = (_Bool) iftmp.0;
> > > >     return D.2837;
> > > > }
> > > > `
> > > > However, the gimple file dumped on x86 is
> > > > `
> > > > comp (unsigned int x, unsigned int y)
> > > > {
> > > >     _Bool D.2837;
> > > > 
> > > >     _1 = x > y;
> > > >     _2 = x != 0;
> > > >     _3 = _1 & _2;
> > > >     _4 = (int) _3;
> > > >     D.2837 = (_Bool) _4;
> > > >     return D.2837;
> > > > }
> > > > `
> > > > 
> > > > The reason for the inconsistency between these two behaviors is param
> > > > logical-op-non-short-circuit.  If we add the pattern to the match.pd
> > > > file, we can only optimize the situation in which the statement is in
> > > > the same basic block (logical-op-non-short-circuit=1, x86).  But for
> > > > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> > > > can't handle this situation.
> > > > 
> > > > Another reason is that I found out maybe_fold_and_comparisons and
> > > > maybe_fold_or_comparisons are not only called by ifcombine pass but
> > > > also by reassoc pass. Using this method can basically unify param
> > > > logical-op-non-short-circuit=0 or 1.
> > > 
> > > 
> > > As mentioned before ifcombine pass should be using gimple-match
> > > instead of fold_build.  Try converting ifcombine over to gimple-match
> > > infrastructure and add these to match.pd.
> > 
> > Yes, I mentioned that in the PR.  The issue is that at the moment
> > to combine x > y with x <= y you'd have to build GENERIC trees
> > for both or temporary GIMPLE assign with a SSA def (and then feed
> > that into the GENERIC or GIMPLE match.pd path).
> 
> Hi,
> 
> I did some experimentation using ‘temporary GIMPLE with a SSA def (and then
> feed that into the GIMPLE match.pd path’.  Could we consider the code in the
> attachment(I did a test and the code took effect)?

No, that's excessive overhead IMHO - the whole point of these functions
originally was to avoid build2 (...) on both conditionals.  If we
now create two SSA names and two GIMPLE statements that's too much.

As said it shouldn't be rocket science to teach genmatch to
auto-generate those functions from match.pd patterns but it certainly
is some work (less so for somebody with genmatch knowledge).
I'll give it a poke...

Richard.

> Thanks,
> Lijia He
> 
> > 
> > maybe_fold_and/or_comparisons handle two exploded binary expressions
> > while the current match.pd entries handle at most one exploded one
> > (the outermost then, either AND or OR).  But it would be definitely
> > doable to auto-generate maybe_fold_and/or_comparisons from match.pd
> > patterns which is what I'd ultimatively suggest to do (in some more
> > generalized form maybe).  Either with a separate genmatch invocation
> > or as part of the --gimple processing (not sure what is more feasible
> > here).
> > 
> > I told Li Jia He that I don't expect him to do this work.
> > 
> > Note I didn't review the actual patch yet.
> > 
> > Thanks,
> > Richard.
> > 
>
Richard Biener July 2, 2019, 8:51 a.m. UTC | #7
On Tue, 2 Jul 2019, Richard Biener wrote:

> On Tue, 2 Jul 2019, Li Jia He wrote:
> 
> > 
> > 
> > On 2019/7/1 3:30 PM, Richard Biener wrote:
> > > On Fri, 28 Jun 2019, Andrew Pinski wrote:
> > > 
> > > > On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > On 2019/6/27 11:48 PM, Jeff Law wrote:
> > > > > > On 6/27/19 12:11 AM, Li Jia He wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > According to the optimizable case described by Qi Feng on
> > > > > > > issue 88784, we can combine the cases into the following:
> > > > > > > 
> > > > > > > 1. x >  y  &&  x != XXX_MIN  -->   x > y
> > > > > > > 2. x >  y  &&  x == XXX_MIN  -->   false
> > > > > > > 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> > > > > > > 
> > > > > > > 4. x <  y  &&  x != XXX_MAX  -->   x < y
> > > > > > > 5. x <  y  &&  x == XXX_MAX  -->   false
> > > > > > > 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> > > > > > > 
> > > > > > > 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> > > > > > > 8. x <= y  ||  x != XXX_MIN  -->   true
> > > > > > > 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> > > > > > > 
> > > > > > > 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> > > > > > > 11. x >= y  ||  x != XXX_MAX  -->   true
> > > > > > > 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> > > > > > > 
> > > > > > > Note: XXX_MIN represents the minimum value of type x.
> > > > > > >         XXX_MAX represents the maximum value of type x.
> > > > > > > 
> > > > > > > Here we don't need to care about whether the operation is
> > > > > > > signed or unsigned.  For example, in the below equation:
> > > > > > > 
> > > > > > > 'x >  y  &&  x != XXX_MIN  -->   x > y'
> > > > > > > 
> > > > > > > If the x type is signed int and XXX_MIN is INT_MIN, we can
> > > > > > > optimize it to 'x > y'.  However, if the type of x is unsigned
> > > > > > > int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> > > > > > > 
> > > > > > > The regression testing for the patch was done on GCC mainline on
> > > > > > > 
> > > > > > >       powerpc64le-unknown-linux-gnu (Power 9 LE)
> > > > > > > 
> > > > > > > with no regressions.  Is it OK for trunk ?
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Lijia He
> > > > > > > 
> > > > > > > gcc/ChangeLog
> > > > > > > 
> > > > > > > 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
> > > > > > >           Qi Feng  <ffengqi@linux.ibm.com>
> > > > > > > 
> > > > > > >       PR middle-end/88784
> > > > > > >       * gimple-fold.c (and_comparisons_contain_equal_operands): New
> > > > > > > function.
> > > > > > >       (and_comparisons_1): Use
> > > > > > > and_comparisons_contain_equal_operands.
> > > > > > >       (or_comparisons_contain_equal_operands): New function.
> > > > > > >       (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > > > > > Would this be better done via match.pd?  ISTM this transformation
> > > > > > would
> > > > > > be well suited for that framework.
> > > > > 
> > > > > Hi, Jeff
> > > > > 
> > > > > I did this because of the following test case:
> > > > > `
> > > > > _Bool comp(unsigned x, unsigned y)
> > > > > {
> > > > >     return x > y && x != 0;
> > > > > }
> > > > > `
> > > > > The gimple file dumped on the power platform is:
> > > > > `
> > > > > comp (unsigned int x, unsigned int y)
> > > > > {
> > > > >     _Bool D.2837;
> > > > >     int iftmp.0;
> > > > > 
> > > > >     if (x > y) goto <D.2841>; else goto <D.2839>;
> > > > >     <D.2841>:
> > > > >     if (x != 0) goto <D.2842>; else goto <D.2839>;
> > > > >     <D.2842>:
> > > > >     iftmp.0 = 1;
> > > > >     goto <D.2840>;
> > > > >     <D.2839>:
> > > > >     iftmp.0 = 0;
> > > > >     <D.2840>:
> > > > >     D.2837 = (_Bool) iftmp.0;
> > > > >     return D.2837;
> > > > > }
> > > > > `
> > > > > However, the gimple file dumped on x86 is
> > > > > `
> > > > > comp (unsigned int x, unsigned int y)
> > > > > {
> > > > >     _Bool D.2837;
> > > > > 
> > > > >     _1 = x > y;
> > > > >     _2 = x != 0;
> > > > >     _3 = _1 & _2;
> > > > >     _4 = (int) _3;
> > > > >     D.2837 = (_Bool) _4;
> > > > >     return D.2837;
> > > > > }
> > > > > `
> > > > > 
> > > > > The reason for the inconsistency between these two behaviors is param
> > > > > logical-op-non-short-circuit.  If we add the pattern to the match.pd
> > > > > file, we can only optimize the situation in which the statement is in
> > > > > the same basic block (logical-op-non-short-circuit=1, x86).  But for
> > > > > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> > > > > can't handle this situation.
> > > > > 
> > > > > Another reason is that I found out maybe_fold_and_comparisons and
> > > > > maybe_fold_or_comparisons are not only called by ifcombine pass but
> > > > > also by reassoc pass. Using this method can basically unify param
> > > > > logical-op-non-short-circuit=0 or 1.
> > > > 
> > > > 
> > > > As mentioned before ifcombine pass should be using gimple-match
> > > > instead of fold_build.  Try converting ifcombine over to gimple-match
> > > > infrastructure and add these to match.pd.
> > > 
> > > Yes, I mentioned that in the PR.  The issue is that at the moment
> > > to combine x > y with x <= y you'd have to build GENERIC trees
> > > for both or temporary GIMPLE assign with a SSA def (and then feed
> > > that into the GENERIC or GIMPLE match.pd path).
> > 
> > Hi,
> > 
> > I did some experimentation using ‘temporary GIMPLE with a SSA def (and then
> > feed that into the GIMPLE match.pd path’.  Could we consider the code in the
> > attachment(I did a test and the code took effect)?
> 
> No, that's excessive overhead IMHO - the whole point of these functions
> originally was to avoid build2 (...) on both conditionals.  If we
> now create two SSA names and two GIMPLE statements that's too much.
> 
> As said it shouldn't be rocket science to teach genmatch to
> auto-generate those functions from match.pd patterns but it certainly
> is some work (less so for somebody with genmatch knowledge).
> I'll give it a poke...

OK, it's not so easy.  I guess we could lower the cost of building
SSA names / gimple stmts significantly if we allocated them on the
stack like via

gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 * 
sizeof (tree));
memset (stmt1, 0, ...);
gimple_set_code (stmt1, GIMPLE_ASSIGN);
gimple_set_num_ops (stmt1, 3);
gimple_init_singleton (stmt1);

gimple_stmt_iterator gsi = gsi_for_stmt (stmt1);
gimple_assign_set_rhs_with_ops (&gsi, actual operation...);

tree lhs1 = XALLOCA (tree_ssa_name);
memset (lhs1, 0, sizeof (tree_ssa_name));
TREE_SET_CODE (lhs1, SSA_NAME);
TREE_TYPE (lhs1) = ...;

gimple_assing_set_lhs (stmt1, lhs1);

it's all a bit ugly and some factoring in the generic gimple machinery
might easen this kind of hacks, but well...

With the above you could then use

  gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */,
		      boolean_type_node, lhs1, lhs2);
  if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges))
    ... successfully simplified into 'op' ...

with the simplification path then extracting the simplified result
from 'op'.  Note this deliberately leaves out passing a gimple_seq
as storage for more complex simplification results to match
existing behavior.

Your patch has the GC allocation and SSA name (and namespace!)
allocation overhead and most definitely misses releasing the
SSA names you allocated.

Note with the above hack the simplified result has to be checked
for mentions of lhs1 or lhs2 - a case we have to reject because
their definitions are transitional.

Richard.

> Richard.
> 
> > Thanks,
> > Lijia He
> > 
> > > 
> > > maybe_fold_and/or_comparisons handle two exploded binary expressions
> > > while the current match.pd entries handle at most one exploded one
> > > (the outermost then, either AND or OR).  But it would be definitely
> > > doable to auto-generate maybe_fold_and/or_comparisons from match.pd
> > > patterns which is what I'd ultimatively suggest to do (in some more
> > > generalized form maybe).  Either with a separate genmatch invocation
> > > or as part of the --gimple processing (not sure what is more feasible
> > > here).
> > > 
> > > I told Li Jia He that I don't expect him to do this work.
> > > 
> > > Note I didn't review the actual patch yet.
> > > 
> > > Thanks,
> > > Richard.
> > > 
> > 
> 
>
Li Jia He July 16, 2019, 6:35 a.m. UTC | #8
On 2019/7/2 4:51 PM, Richard Biener wrote:
> On Tue, 2 Jul 2019, Richard Biener wrote:
> 
>> On Tue, 2 Jul 2019, Li Jia He wrote:
>>
>>>
>>>
>>> On 2019/7/1 3:30 PM, Richard Biener wrote:
>>>> On Fri, 28 Jun 2019, Andrew Pinski wrote:
>>>>
>>>>> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2019/6/27 11:48 PM, Jeff Law wrote:
>>>>>>> On 6/27/19 12:11 AM, Li Jia He wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> According to the optimizable case described by Qi Feng on
>>>>>>>> issue 88784, we can combine the cases into the following:
>>>>>>>>
>>>>>>>> 1. x >  y  &&  x != XXX_MIN  -->   x > y
>>>>>>>> 2. x >  y  &&  x == XXX_MIN  -->   false
>>>>>>>> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>>>>>>>>
>>>>>>>> 4. x <  y  &&  x != XXX_MAX  -->   x < y
>>>>>>>> 5. x <  y  &&  x == XXX_MAX  -->   false
>>>>>>>> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>>>>>>>>
>>>>>>>> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
>>>>>>>> 8. x <= y  ||  x != XXX_MIN  -->   true
>>>>>>>> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>>>>>>>>
>>>>>>>> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
>>>>>>>> 11. x >= y  ||  x != XXX_MAX  -->   true
>>>>>>>> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>>>>>>>>
>>>>>>>> Note: XXX_MIN represents the minimum value of type x.
>>>>>>>>          XXX_MAX represents the maximum value of type x.
>>>>>>>>
>>>>>>>> Here we don't need to care about whether the operation is
>>>>>>>> signed or unsigned.  For example, in the below equation:
>>>>>>>>
>>>>>>>> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>>>>>>>>
>>>>>>>> If the x type is signed int and XXX_MIN is INT_MIN, we can
>>>>>>>> optimize it to 'x > y'.  However, if the type of x is unsigned
>>>>>>>> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>>>>>>>>
>>>>>>>> The regression testing for the patch was done on GCC mainline on
>>>>>>>>
>>>>>>>>        powerpc64le-unknown-linux-gnu (Power 9 LE)
>>>>>>>>
>>>>>>>> with no regressions.  Is it OK for trunk ?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Lijia He
>>>>>>>>
>>>>>>>> gcc/ChangeLog
>>>>>>>>
>>>>>>>> 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
>>>>>>>>            Qi Feng  <ffengqi@linux.ibm.com>
>>>>>>>>
>>>>>>>>        PR middle-end/88784
>>>>>>>>        * gimple-fold.c (and_comparisons_contain_equal_operands): New
>>>>>>>> function.
>>>>>>>>        (and_comparisons_1): Use
>>>>>>>> and_comparisons_contain_equal_operands.
>>>>>>>>        (or_comparisons_contain_equal_operands): New function.
>>>>>>>>        (or_comparisons_1): Use or_comparisons_contain_equal_operands.
>>>>>>> Would this be better done via match.pd?  ISTM this transformation
>>>>>>> would
>>>>>>> be well suited for that framework.
>>>>>>
>>>>>> Hi, Jeff
>>>>>>
>>>>>> I did this because of the following test case:
>>>>>> `
>>>>>> _Bool comp(unsigned x, unsigned y)
>>>>>> {
>>>>>>      return x > y && x != 0;
>>>>>> }
>>>>>> `
>>>>>> The gimple file dumped on the power platform is:
>>>>>> `
>>>>>> comp (unsigned int x, unsigned int y)
>>>>>> {
>>>>>>      _Bool D.2837;
>>>>>>      int iftmp.0;
>>>>>>
>>>>>>      if (x > y) goto <D.2841>; else goto <D.2839>;
>>>>>>      <D.2841>:
>>>>>>      if (x != 0) goto <D.2842>; else goto <D.2839>;
>>>>>>      <D.2842>:
>>>>>>      iftmp.0 = 1;
>>>>>>      goto <D.2840>;
>>>>>>      <D.2839>:
>>>>>>      iftmp.0 = 0;
>>>>>>      <D.2840>:
>>>>>>      D.2837 = (_Bool) iftmp.0;
>>>>>>      return D.2837;
>>>>>> }
>>>>>> `
>>>>>> However, the gimple file dumped on x86 is
>>>>>> `
>>>>>> comp (unsigned int x, unsigned int y)
>>>>>> {
>>>>>>      _Bool D.2837;
>>>>>>
>>>>>>      _1 = x > y;
>>>>>>      _2 = x != 0;
>>>>>>      _3 = _1 & _2;
>>>>>>      _4 = (int) _3;
>>>>>>      D.2837 = (_Bool) _4;
>>>>>>      return D.2837;
>>>>>> }
>>>>>> `
>>>>>>
>>>>>> The reason for the inconsistency between these two behaviors is param
>>>>>> logical-op-non-short-circuit.  If we add the pattern to the match.pd
>>>>>> file, we can only optimize the situation in which the statement is in
>>>>>> the same basic block (logical-op-non-short-circuit=1, x86).  But for
>>>>>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
>>>>>> can't handle this situation.
>>>>>>
>>>>>> Another reason is that I found out maybe_fold_and_comparisons and
>>>>>> maybe_fold_or_comparisons are not only called by ifcombine pass but
>>>>>> also by reassoc pass. Using this method can basically unify param
>>>>>> logical-op-non-short-circuit=0 or 1.
>>>>>
>>>>>
>>>>> As mentioned before ifcombine pass should be using gimple-match
>>>>> instead of fold_build.  Try converting ifcombine over to gimple-match
>>>>> infrastructure and add these to match.pd.
>>>>
>>>> Yes, I mentioned that in the PR.  The issue is that at the moment
>>>> to combine x > y with x <= y you'd have to build GENERIC trees
>>>> for both or temporary GIMPLE assign with a SSA def (and then feed
>>>> that into the GENERIC or GIMPLE match.pd path).
>>>
>>> Hi,
>>>
>>> I did some experimentation using ‘temporary GIMPLE with a SSA def (and then
>>> feed that into the GIMPLE match.pd path’.  Could we consider the code in the
>>> attachment(I did a test and the code took effect)?
>>
>> No, that's excessive overhead IMHO - the whole point of these functions
>> originally was to avoid build2 (...) on both conditionals.  If we
>> now create two SSA names and two GIMPLE statements that's too much.
>>
>> As said it shouldn't be rocket science to teach genmatch to
>> auto-generate those functions from match.pd patterns but it certainly
>> is some work (less so for somebody with genmatch knowledge).
>> I'll give it a poke...
> 
> OK, it's not so easy.  I guess we could lower the cost of building
> SSA names / gimple stmts significantly if we allocated them on the
> stack like via
> 
> gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 *
> sizeof (tree));
> memset (stmt1, 0, ...);
> gimple_set_code (stmt1, GIMPLE_ASSIGN);
> gimple_set_num_ops (stmt1, 3);
> gimple_init_singleton (stmt1);
> 
> gimple_stmt_iterator gsi = gsi_for_stmt (stmt1);
> gimple_assign_set_rhs_with_ops (&gsi, actual operation...);
> 
> tree lhs1 = XALLOCA (tree_ssa_name);
> memset (lhs1, 0, sizeof (tree_ssa_name));
> TREE_SET_CODE (lhs1, SSA_NAME);
> TREE_TYPE (lhs1) = ...;
> 
> gimple_assing_set_lhs (stmt1, lhs1);
> 
> it's all a bit ugly and some factoring in the generic gimple machinery
> might easen this kind of hacks, but well...
> 
> With the above you could then use
> 
>    gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */,
> 		      boolean_type_node, lhs1, lhs2);
>    if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges))
>      ... successfully simplified into 'op' ...
> 
> with the simplification path then extracting the simplified result
> from 'op'.  Note this deliberately leaves out passing a gimple_seq
> as storage for more complex simplification results to match
> existing behavior.
> 
> Your patch has the GC allocation and SSA name (and namespace!)
> allocation overhead and most definitely misses releasing the
> SSA names you allocated.
> 
> Note with the above hack the simplified result has to be checked
> for mentions of lhs1 or lhs2 - a case we have to reject because
> their definitions are transitional.
> 

Hi,

   I made some changes based on the recommendations. Would you like to
   help me to see it again ? Sorry, it took so long time to provide the
   patch.

   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
	The reason is that I did not found a good way to handle the
	optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
	Maybe I missing some important information about match.pd.
	2. The gimple_resimplify2 function is not used.  Since stmt1,
	stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
	question with the value on the stack as the return value ?
	I may have misunderstood Richard's intention.

Thanks,
Lijia He

> Richard.
> 
>> Richard.
>>
>>> Thanks,
>>> Lijia He
>>>
>>>>
>>>> maybe_fold_and/or_comparisons handle two exploded binary expressions
>>>> while the current match.pd entries handle at most one exploded one
>>>> (the outermost then, either AND or OR).  But it would be definitely
>>>> doable to auto-generate maybe_fold_and/or_comparisons from match.pd
>>>> patterns which is what I'd ultimatively suggest to do (in some more
>>>> generalized form maybe).  Either with a separate genmatch invocation
>>>> or as part of the --gimple processing (not sure what is more feasible
>>>> here).
>>>>
>>>> I told Li Jia He that I don't expect him to do this work.
>>>>
>>>> Note I didn't review the actual patch yet.
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>
>>
>>
>
From 2699516000abdef60b06b203d703f80b8905cf09 Mon Sep 17 00:00:00 2001
From: Li Jia He <helijia@linux.ibm.com>
Date: Mon, 15 Jul 2019 00:30:25 -0500
Subject: [PATCH 1/2] Auto-generate maybe_fold_and/or_comparisons from match.pd

Hi,

This patch try to auto-generate maybe_fold_and/or_comparisons from match.pd.
We try to build tempory assign with SSA def and then feed that into the GIMPLE
match.pd path.

In order to lower the cost of building SSA names / gimple stmts significantly,
we need to allocated them on the stack.  This will cause the code a bit ugly.

The regression testing for the patch was done on GCC mainline on

	powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regressions.  Is it OK for trunk ?

Thanks,
Lijia He

gcc/ChangeLog

2019-07-16  Li Jia He  <helijia@linux.ibm.com>

	* gimple.h (gimple_set_code): Declare.
	(gimple_size): Likewise.
	* gimple.c (gimple_set_code): Remove static and inline restrictions.
	(gimple_size): Likewise.
	* gimple-fold.c (maybe_fold_comparisons_from_match_pd): New function.
	(maybe_fold_and_comparisons): Modify and_comparisons_1 invocation and
	call maybe_fold_comparisons_from_match_pd.
	(maybe_fold_or_comparisons): Modify or_comparisons_1 invocation and
	call maybe_fold_comparisons_from_match_pd.
---
 gcc/gimple-fold.c | 122 ++++++++++++++++++++++++++++++++++++++++++----
 gcc/gimple.c      |   4 +-
 gcc/gimple.h      |   2 +
 3 files changed, 117 insertions(+), 11 deletions(-)

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index dfb31a02078..a5c2790361f 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -5789,6 +5789,91 @@ and_comparisons_1 (enum tree_code code1, tree op1a, tree op1b,
   return NULL_TREE;
 }
 
+/* Helper function for maybe_fold_and_comparisons and maybe_fold_or_comparisons
+   : try to simplify the AND/OR of the ssa variable VAR with the comparison
+   specified by (OP2A CODE2 OP2B) from match.pd.  Return NULL_EXPR if we can't
+   simplify this to a single expression.  As we are going to lower the cost
+   of building SSA names / gimple stmts significantly, we need to allocate
+   them ont the stack.  This will cause the code to be a bit ugly.  */
+
+static tree
+maybe_fold_comparisons_from_match_pd (enum tree_code code, enum tree_code code1,
+				      tree op1a, tree op1b,
+				      enum tree_code code2, tree op2a,
+				      tree op2b)
+{
+  /* Allocate gimple stmt1 on the stack.  */
+  gimple *stmt1 = (gimple *) (XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN)
+						  + 2 * sizeof (tree)));
+  memset (stmt1, 0, gimple_size (GIMPLE_ASSIGN) + 2 * sizeof (tree));
+  gimple_set_code (stmt1, GIMPLE_ASSIGN);
+  gimple_set_num_ops (stmt1, 3);
+  gimple_init_singleton (stmt1);
+
+  /* Allocate gimple stmt2 on the stack.  */
+  gimple *stmt2 = (gimple *) (XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN)
+						  + 2 * sizeof (tree)));
+  memset (stmt2, 0, gimple_size (GIMPLE_ASSIGN) + 2 * sizeof (tree));
+  gimple_set_code (stmt2, GIMPLE_ASSIGN);
+  gimple_set_num_ops (stmt2, 3);
+  gimple_init_singleton (stmt2);
+
+  gimple_assign_set_rhs_code (stmt1, code1);
+  gimple_assign_set_rhs1 (stmt1, op1a);
+  gimple_assign_set_rhs2 (stmt1, op1b);
+
+  gimple_assign_set_rhs_code (stmt2, code2);
+  gimple_assign_set_rhs1 (stmt2, op2a);
+  gimple_assign_set_rhs2 (stmt2, op2b);
+
+  /* Allocate SSA names(lhs1) on the stack.  */
+  tree lhs1 = XALLOCA (tree_node);
+  memset (lhs1, 0, sizeof (tree_node));
+  TREE_SET_CODE (lhs1, SSA_NAME);
+  TREE_TYPE (lhs1) = boolean_type_node;
+
+  /* Allocate SSA names(lhs2) on the stack.  */
+  tree lhs2 = XALLOCA (tree_node);
+  memset (lhs2, 0, sizeof (tree_node));
+  TREE_SET_CODE (lhs2, SSA_NAME);
+  TREE_TYPE (lhs2) = boolean_type_node;
+
+  gimple_assign_set_lhs (stmt1, lhs1);
+  gimple_assign_set_lhs (stmt2, lhs2);
+
+  /* Call the interface function of match.pd to simplify the expression.  */
+  tree t = gimple_simplify (code, boolean_type_node, gimple_assign_lhs (stmt1),
+			    gimple_assign_lhs (stmt2), NULL,
+			    follow_all_ssa_edges);
+
+  if (t)
+    {
+      switch (TREE_CODE (t))
+	{
+	  case SSA_NAME:
+	    {
+	      gimple *def = SSA_NAME_DEF_STMT (t);
+
+	      if (!is_gimple_assign (def)
+		  || TREE_CODE_CLASS (gimple_assign_rhs_code (def))
+		    != tcc_comparison)
+		return NULL_TREE;
+
+	      return fold_build2 (gimple_assign_rhs_code (def),
+				  boolean_type_node, gimple_assign_rhs1 (def),
+				  gimple_assign_rhs2 (def));
+	  }
+	case INTEGER_CST:
+	  /* Fold expression to boolean_true_node or boolean_false_node.  */
+	  return t;
+	default:
+	  return NULL_TREE;
+	}
+    }
+
+  return NULL_TREE;
+}
+
 /* Try to simplify the AND of two comparisons, specified by
    (OP1A CODE1 OP1B) and (OP2B CODE2 OP2B), respectively.
    If this can be simplified to a single expression (without requiring
@@ -5800,11 +5885,21 @@ tree
 maybe_fold_and_comparisons (enum tree_code code1, tree op1a, tree op1b,
 			    enum tree_code code2, tree op2a, tree op2b)
 {
-  tree t = and_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b);
-  if (t)
+  if (tree t = and_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b))
     return t;
-  else
-    return and_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b);
+
+  if (tree t = and_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b))
+    return t;
+
+  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code1, op1a,
+						     op1b, code2, op2a, op2b))
+    return t;
+
+  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code2, op2a,
+						     op2b, code1, op1a, op1b))
+    return t;
+
+  return NULL_TREE;
 }
 
 /* Helper function for or_comparisons_1:  try to simplify the OR of the
@@ -6264,13 +6359,22 @@ tree
 maybe_fold_or_comparisons (enum tree_code code1, tree op1a, tree op1b,
 			   enum tree_code code2, tree op2a, tree op2b)
 {
-  tree t = or_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b);
-  if (t)
+  if (tree t = or_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b))
+    return t;
+
+  if (tree t = or_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b))
+    return t;
+
+  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_IOR_EXPR, code1, op1a,
+						     op1b, code2, op2a, op2b))
+    return t;
+
+  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_IOR_EXPR, code2, op2a,
+						     op2b, code1, op1a, op1b))
     return t;
-  else
-    return or_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b);
-}
 
+  return NULL_TREE;
+}
 
 /* Fold STMT to a constant using VALUEIZE to valueize SSA names.
 
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 63c8d5e85ae..6769bd9edec 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -101,7 +101,7 @@ const enum gimple_code gcond::code_;
 
 /* Set the code for statement G to CODE.  */
 
-static inline void
+void
 gimple_set_code (gimple *g, enum gimple_code code)
 {
   g->code = code;
@@ -110,7 +110,7 @@ gimple_set_code (gimple *g, enum gimple_code code)
 /* Return the number of bytes needed to hold a GIMPLE statement with
    code CODE.  */
 
-static inline size_t
+size_t
 gimple_size (enum gimple_code code)
 {
   return gsstruct_code_size[gss_for_code (code)];
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 47070e7f409..a32708ad349 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1447,6 +1447,8 @@ extern enum gimple_statement_structure_enum const gss_for_code_[];
    of comminucating the profile info to the builtin expanders.  */
 extern gimple *currently_expanding_gimple_stmt;
 
+void gimple_set_code (gimple *g, enum gimple_code code);
+size_t gimple_size (enum gimple_code code);
 gimple *gimple_alloc (enum gimple_code, unsigned CXX_MEM_STAT_INFO);
 greturn *gimple_build_return (tree);
 void gimple_call_reset_alias_info (gcall *);
Martin Liška Aug. 30, 2019, 10:18 a.m. UTC | #9
@Richi: PING^1

On 7/16/19 8:35 AM, Li Jia He wrote:
> 
> 
> On 2019/7/2 4:51 PM, Richard Biener wrote:
>> On Tue, 2 Jul 2019, Richard Biener wrote:
>>
>>> On Tue, 2 Jul 2019, Li Jia He wrote:
>>>
>>>>
>>>>
>>>> On 2019/7/1 3:30 PM, Richard Biener wrote:
>>>>> On Fri, 28 Jun 2019, Andrew Pinski wrote:
>>>>>
>>>>>> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2019/6/27 11:48 PM, Jeff Law wrote:
>>>>>>>> On 6/27/19 12:11 AM, Li Jia He wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> According to the optimizable case described by Qi Feng on
>>>>>>>>> issue 88784, we can combine the cases into the following:
>>>>>>>>>
>>>>>>>>> 1. x >  y  &&  x != XXX_MIN  -->   x > y
>>>>>>>>> 2. x >  y  &&  x == XXX_MIN  -->   false
>>>>>>>>> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>>>>>>>>>
>>>>>>>>> 4. x <  y  &&  x != XXX_MAX  -->   x < y
>>>>>>>>> 5. x <  y  &&  x == XXX_MAX  -->   false
>>>>>>>>> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>>>>>>>>>
>>>>>>>>> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
>>>>>>>>> 8. x <= y  ||  x != XXX_MIN  -->   true
>>>>>>>>> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>>>>>>>>>
>>>>>>>>> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
>>>>>>>>> 11. x >= y  ||  x != XXX_MAX  -->   true
>>>>>>>>> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>>>>>>>>>
>>>>>>>>> Note: XXX_MIN represents the minimum value of type x.
>>>>>>>>>          XXX_MAX represents the maximum value of type x.
>>>>>>>>>
>>>>>>>>> Here we don't need to care about whether the operation is
>>>>>>>>> signed or unsigned.  For example, in the below equation:
>>>>>>>>>
>>>>>>>>> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>>>>>>>>>
>>>>>>>>> If the x type is signed int and XXX_MIN is INT_MIN, we can
>>>>>>>>> optimize it to 'x > y'.  However, if the type of x is unsigned
>>>>>>>>> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>>>>>>>>>
>>>>>>>>> The regression testing for the patch was done on GCC mainline on
>>>>>>>>>
>>>>>>>>>        powerpc64le-unknown-linux-gnu (Power 9 LE)
>>>>>>>>>
>>>>>>>>> with no regressions.  Is it OK for trunk ?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Lijia He
>>>>>>>>>
>>>>>>>>> gcc/ChangeLog
>>>>>>>>>
>>>>>>>>> 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
>>>>>>>>>            Qi Feng  <ffengqi@linux.ibm.com>
>>>>>>>>>
>>>>>>>>>        PR middle-end/88784
>>>>>>>>>        * gimple-fold.c (and_comparisons_contain_equal_operands): New
>>>>>>>>> function.
>>>>>>>>>        (and_comparisons_1): Use
>>>>>>>>> and_comparisons_contain_equal_operands.
>>>>>>>>>        (or_comparisons_contain_equal_operands): New function.
>>>>>>>>>        (or_comparisons_1): Use or_comparisons_contain_equal_operands.
>>>>>>>> Would this be better done via match.pd?  ISTM this transformation
>>>>>>>> would
>>>>>>>> be well suited for that framework.
>>>>>>>
>>>>>>> Hi, Jeff
>>>>>>>
>>>>>>> I did this because of the following test case:
>>>>>>> `
>>>>>>> _Bool comp(unsigned x, unsigned y)
>>>>>>> {
>>>>>>>      return x > y && x != 0;
>>>>>>> }
>>>>>>> `
>>>>>>> The gimple file dumped on the power platform is:
>>>>>>> `
>>>>>>> comp (unsigned int x, unsigned int y)
>>>>>>> {
>>>>>>>      _Bool D.2837;
>>>>>>>      int iftmp.0;
>>>>>>>
>>>>>>>      if (x > y) goto <D.2841>; else goto <D.2839>;
>>>>>>>      <D.2841>:
>>>>>>>      if (x != 0) goto <D.2842>; else goto <D.2839>;
>>>>>>>      <D.2842>:
>>>>>>>      iftmp.0 = 1;
>>>>>>>      goto <D.2840>;
>>>>>>>      <D.2839>:
>>>>>>>      iftmp.0 = 0;
>>>>>>>      <D.2840>:
>>>>>>>      D.2837 = (_Bool) iftmp.0;
>>>>>>>      return D.2837;
>>>>>>> }
>>>>>>> `
>>>>>>> However, the gimple file dumped on x86 is
>>>>>>> `
>>>>>>> comp (unsigned int x, unsigned int y)
>>>>>>> {
>>>>>>>      _Bool D.2837;
>>>>>>>
>>>>>>>      _1 = x > y;
>>>>>>>      _2 = x != 0;
>>>>>>>      _3 = _1 & _2;
>>>>>>>      _4 = (int) _3;
>>>>>>>      D.2837 = (_Bool) _4;
>>>>>>>      return D.2837;
>>>>>>> }
>>>>>>> `
>>>>>>>
>>>>>>> The reason for the inconsistency between these two behaviors is param
>>>>>>> logical-op-non-short-circuit.  If we add the pattern to the match.pd
>>>>>>> file, we can only optimize the situation in which the statement is in
>>>>>>> the same basic block (logical-op-non-short-circuit=1, x86).  But for
>>>>>>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
>>>>>>> can't handle this situation.
>>>>>>>
>>>>>>> Another reason is that I found out maybe_fold_and_comparisons and
>>>>>>> maybe_fold_or_comparisons are not only called by ifcombine pass but
>>>>>>> also by reassoc pass. Using this method can basically unify param
>>>>>>> logical-op-non-short-circuit=0 or 1.
>>>>>>
>>>>>>
>>>>>> As mentioned before ifcombine pass should be using gimple-match
>>>>>> instead of fold_build.  Try converting ifcombine over to gimple-match
>>>>>> infrastructure and add these to match.pd.
>>>>>
>>>>> Yes, I mentioned that in the PR.  The issue is that at the moment
>>>>> to combine x > y with x <= y you'd have to build GENERIC trees
>>>>> for both or temporary GIMPLE assign with a SSA def (and then feed
>>>>> that into the GENERIC or GIMPLE match.pd path).
>>>>
>>>> Hi,
>>>>
>>>> I did some experimentation using ‘temporary GIMPLE with a SSA def (and then
>>>> feed that into the GIMPLE match.pd path’.  Could we consider the code in the
>>>> attachment(I did a test and the code took effect)?
>>>
>>> No, that's excessive overhead IMHO - the whole point of these functions
>>> originally was to avoid build2 (...) on both conditionals.  If we
>>> now create two SSA names and two GIMPLE statements that's too much.
>>>
>>> As said it shouldn't be rocket science to teach genmatch to
>>> auto-generate those functions from match.pd patterns but it certainly
>>> is some work (less so for somebody with genmatch knowledge).
>>> I'll give it a poke...
>>
>> OK, it's not so easy.  I guess we could lower the cost of building
>> SSA names / gimple stmts significantly if we allocated them on the
>> stack like via
>>
>> gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 *
>> sizeof (tree));
>> memset (stmt1, 0, ...);
>> gimple_set_code (stmt1, GIMPLE_ASSIGN);
>> gimple_set_num_ops (stmt1, 3);
>> gimple_init_singleton (stmt1);
>>
>> gimple_stmt_iterator gsi = gsi_for_stmt (stmt1);
>> gimple_assign_set_rhs_with_ops (&gsi, actual operation...);
>>
>> tree lhs1 = XALLOCA (tree_ssa_name);
>> memset (lhs1, 0, sizeof (tree_ssa_name));
>> TREE_SET_CODE (lhs1, SSA_NAME);
>> TREE_TYPE (lhs1) = ...;
>>
>> gimple_assing_set_lhs (stmt1, lhs1);
>>
>> it's all a bit ugly and some factoring in the generic gimple machinery
>> might easen this kind of hacks, but well...
>>
>> With the above you could then use
>>
>>    gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */,
>>               boolean_type_node, lhs1, lhs2);
>>    if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges))
>>      ... successfully simplified into 'op' ...
>>
>> with the simplification path then extracting the simplified result
>> from 'op'.  Note this deliberately leaves out passing a gimple_seq
>> as storage for more complex simplification results to match
>> existing behavior.
>>
>> Your patch has the GC allocation and SSA name (and namespace!)
>> allocation overhead and most definitely misses releasing the
>> SSA names you allocated.
>>
>> Note with the above hack the simplified result has to be checked
>> for mentions of lhs1 or lhs2 - a case we have to reject because
>> their definitions are transitional.
>>
> 
> Hi,
> 
>   I made some changes based on the recommendations. Would you like to
>   help me to see it again ? Sorry, it took so long time to provide the
>   patch.
> 
>   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
>     The reason is that I did not found a good way to handle the
>     optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
>     Maybe I missing some important information about match.pd.
>     2. The gimple_resimplify2 function is not used.  Since stmt1,
>     stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
>     question with the value on the stack as the return value ?
>     I may have misunderstood Richard's intention.
> 
> Thanks,
> Lijia He
> 
>> Richard.
>>
>>> Richard.
>>>
>>>> Thanks,
>>>> Lijia He
>>>>
>>>>>
>>>>> maybe_fold_and/or_comparisons handle two exploded binary expressions
>>>>> while the current match.pd entries handle at most one exploded one
>>>>> (the outermost then, either AND or OR).  But it would be definitely
>>>>> doable to auto-generate maybe_fold_and/or_comparisons from match.pd
>>>>> patterns which is what I'd ultimatively suggest to do (in some more
>>>>> generalized form maybe).  Either with a separate genmatch invocation
>>>>> or as part of the --gimple processing (not sure what is more feasible
>>>>> here).
>>>>>
>>>>> I told Li Jia He that I don't expect him to do this work.
>>>>>
>>>>> Note I didn't review the actual patch yet.
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>
>>>
>>>
>>
Richard Biener Sept. 5, 2019, 1:01 p.m. UTC | #10
On Tue, 16 Jul 2019, Li Jia He wrote:

> Hi,
> 
>   I made some changes based on the recommendations. Would you like to
>   help me to see it again ? Sorry, it took so long time to provide the
>   patch.
> 
>   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
> 	The reason is that I did not found a good way to handle the
> 	optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
> 	Maybe I missing some important information about match.pd.
> 	2. The gimple_resimplify2 function is not used.  Since stmt1,
> 	stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
> 	question with the value on the stack as the return value ?
> 	I may have misunderstood Richard's intention.

Sorry for the delay in reviewing.

Rather than exporting gimple_set_code and gimple_size I'd split
out a

void
gimple_init (gimple *, enum gimple_code code, unsigned nops);

from gimple_alloc (changing that to GC allocate not cleared
memory) doing all of the actual initialization.  Then the
allocation would go like

gimple *stmt1 = (gimple *)XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN, 
3));
gimple_init (stmt1, GIMPLE_ASSIGN, 3);

with an overload for gimple_size to account for # of ops.

+  /* Allocate SSA names(lhs1) on the stack.  */
+  tree lhs1 = XALLOCA (tree_node);

you can use (tree) XALLOCA (tree_ssa_name) here

+  /* Call the interface function of match.pd to simplify the expression.  
*/
+  tree t = gimple_simplify (code, boolean_type_node, gimple_assign_lhs 
(stmt1),
+                           gimple_assign_lhs (stmt2), NULL,
+                           follow_all_ssa_edges);
+
+  if (t)

As I told Martin offline the appropriate function to use is

 You'd do

  gimple_match_op op (gimple_match_cond::UNCOND, code,
         boolean_type_node, gimple_assign_lhs (stmt1),
gimple_assign_lhs (stmt2));
  if (op->resimplify (NULL, follow_all_ssa_edges))
   {
      if (gimple_simplified_result_is_gimple_val (res_op))
        .. got a constant or SSA name ..
      else if (res_op->code.is_tree_code ()
                 && TREE_CODE_CLASS ((tree_code)res_op->code)) ==
tcc_comparison)
        ... got a comparison res_op->op[0] res_op->code res_op->op[1] ...

so you get the outermost expression back decomposed.

Otherwise with you passing NULL as the gimple_seq you'll miss quite
some simplification cases.

You also have to watch out for the result containing the LHS of one
of your temporary stmts.

+  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code1, 
op1a,
+                                                    op1b, code2, op2a, 
op2b))
+    return t;
+
+  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code2, 
op2a,
+                                                    op2b, code1, op1a, 
op1b))
+    return t;

with match.pd rules you shouldn't need to call this twice, once with
swapped operands.

Otherwise this first patch looks like what I'd have done and we
can build upon it.

Not sure if you or Martin wants to improve it according to my
comments.

Thanks,
Richard.
Richard Biener Sept. 5, 2019, 1:17 p.m. UTC | #11
On Tue, 16 Jul 2019, Li Jia He wrote:

> Hi,
> 
>   I made some changes based on the recommendations. Would you like to
>   help me to see it again ? Sorry, it took so long time to provide the
>   patch.
> 
>   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
> 	The reason is that I did not found a good way to handle the
> 	optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
> 	Maybe I missing some important information about match.pd.
> 	2. The gimple_resimplify2 function is not used.  Since stmt1,
> 	stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
> 	question with the value on the stack as the return value ?
> 	I may have misunderstood Richard's intention.

And now for the match.pd patch.

+/* x >  y  &&  x != XXX_MIN  -->  x > y  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P 
(TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))))
+    @3)))
+
+/* x >  y  &&  x == XXX_MIN  -->  false  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (gt:c @0 @1) (eq @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P 
(TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))))
+    { boolean_false_node; })))

you could merge those two via

 (for eqne (eq ne)
  (for and (....
   (simplify
    (and:c (gt:c @0 @1) (eqne @0 INTEGER_CST@2))
    (if (...)
     (switch
      (if (eqne == NE_EXPR)
       @3)
      (if (eqne == EQ_EXPR)
       { constant_boolean_node (false, type); }))))

notice using constant_boolean_node (false, type); instead of
boolean_false_node.  I suspect more unification is possible.

Also you could do

(match min_value
 INTEGER_CST
 (if (INTEGRAL_TYPE_P (type)
      && wi::eq_p (wi::to_wide (t), wi::min_value (type)))))

and then write

 (simplify
  (and:c (gt:c @0 @1) (eq @0 min_value))
  (...

Your

(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P
(TREE_TYPE(@1))

is redundant, it's enough to check either @0 or @1 given they
have to be compatible for the gt operation.  Note you probably
want to use

  (and:c (gt:c @0 @1) (eq @@0 min_value))

and verify that types_match (@1, @0) because when @0 are a constant
(and (eq @0 min_value) is not folded which can happen) then they
might have different types and thus you could have
(SHORT_MAX > intvar) && (SHORT_MAX == SHORT_MAX)

That said, the patterns can be quite a bit simplified I think.

Richard.
Martin Liška Sept. 5, 2019, 6:47 p.m. UTC | #12
On 9/5/19 3:01 PM, Richard Biener wrote:
> Not sure if you or Martin wants to improve it according to my
> comments.

Yes please. I'm working on that based on the review you provided.

Thanks,
Martin
Martin Liška Sept. 5, 2019, 6:47 p.m. UTC | #13
On 9/5/19 3:17 PM, Richard Biener wrote:
> That said, the patterns can be quite a bit simplified I think.

I will take care of it as well.

Martin
Richard Biener Sept. 6, 2019, 8:01 a.m. UTC | #14
On Thu, 5 Sep 2019, Martin Liška wrote:

> On 9/5/19 3:01 PM, Richard Biener wrote:
> > Not sure if you or Martin wants to improve it according to my
> > comments.
> 
> Yes please. I'm working on that based on the review you provided.

Oh, and it just occured to me since we're doing single_use checks
on SSA names in match.pd that you need to initialize the
SSA_NAME_IMM_USE_NODE () as well - see make_ssa_name_fn, probably
makes sense to split out a short helper for that.

Richard.
Martin Liška Sept. 6, 2019, 8:04 a.m. UTC | #15
On 9/6/19 10:01 AM, Richard Biener wrote:
> On Thu, 5 Sep 2019, Martin Liška wrote:
> 
>> On 9/5/19 3:01 PM, Richard Biener wrote:
>>> Not sure if you or Martin wants to improve it according to my
>>> comments.
>>
>> Yes please. I'm working on that based on the review you provided.
> 
> Oh, and it just occured to me since we're doing single_use checks
> on SSA names in match.pd that you need to initialize the
> SSA_NAME_IMM_USE_NODE () as well - see make_ssa_name_fn, probably
> makes sense to split out a short helper for that.

Yes, I already hit the ICE :)

Martin

> 
> Richard.
>
diff mbox series

Patch

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index dfb31a02078..6d2472d2fcb 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -5535,6 +5535,62 @@  and_var_with_comparison_1 (gimple *stmt,
   return NULL_TREE;
 }
 
+/* Try to simplify the AND of two comparisons defined by
+   (OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively.
+   This optimization needs to satisfy op1a equal to op2a
+   or op1b equal to op2a.  If this can be done without
+   constructing an intermediate value, return the resulting
+   tree; otherwise NULL_TREE is returned.  */
+
+static tree
+and_comparisons_contain_equal_operands (enum tree_code code1, tree op1a,
+					tree op1b, enum tree_code code2,
+					tree op2a, tree op2b)
+{
+  /* Transform ((Y1 CODE1 X) AND (X CODE2 Y2)) to
+     ((X CODE1' Y1) AND (X CODE2 Y2)).  */
+  if (!operand_equal_p (op1a, op1b, 0) && operand_equal_p (op1b, op2a, 0)
+      && !operand_equal_p (op2a, op2b, 0))
+    return and_comparisons_contain_equal_operands (swap_tree_comparison (code1),
+						   op1b, op1a, code2, op2a,
+						   op2b);
+
+  tree op1a_type = TREE_TYPE (op1a);
+  tree op1b_type = TREE_TYPE (op1b);
+  tree op2b_type = TREE_TYPE (op2b);
+
+  if (INTEGRAL_TYPE_P (op1a_type) && INTEGRAL_TYPE_P (op1b_type)
+      && operand_equal_p (op1a, op2a, 0) && TREE_CODE (op2b) == INTEGER_CST)
+    {
+      if (wi::eq_p (wi::to_wide (op2b), wi::min_value (op2b_type)))
+	{
+	  /* x > y && x != XXX_MIN --> x > y  */
+	  if (code1 == GT_EXPR && code2 == NE_EXPR)
+	    return fold_build2 (code1, boolean_type_node, op1a, op1b);
+	  /* x > y && x == XXX_MIN --> false  */
+	  if (code1 == GT_EXPR && code2 == EQ_EXPR)
+	    return boolean_false_node;
+	  /* x <= y && x == XXX_MIN --> x == XXX_MIN  */
+	  if (code1 == LE_EXPR && code2 == EQ_EXPR)
+	    return fold_build2 (code2, boolean_type_node, op2a, op2b);
+	}
+      else if (wi::eq_p (wi::to_wide (op2b), wi::max_value (op2b_type)))
+	{
+	  /* x < y && x != XXX_MAX --> x < y  */
+	  if (code1 == LT_EXPR && code2 == NE_EXPR)
+	    return fold_build2 (code1, boolean_type_node, op1a, op1b);
+	  /* x < y && x == XXX_MAX --> false  */
+	  if (code1 == LT_EXPR && code2 == EQ_EXPR)
+	    return boolean_false_node;
+	  /* x >= y && x == XXX_MAX --> x == XXX_MAX  */
+	  if (code1 == GE_EXPR && code2 == EQ_EXPR)
+	    return fold_build2 (code2, boolean_type_node, op2a, op2b);
+	}
+    }
+
+  return NULL_TREE;
+}
+
 /* Try to simplify the AND of two comparisons defined by
    (OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively.
    If this can be done without constructing an intermediate value,
@@ -5546,6 +5602,12 @@  static tree
 and_comparisons_1 (enum tree_code code1, tree op1a, tree op1b,
 		   enum tree_code code2, tree op2a, tree op2b)
 {
+  /* Try to optimize ((x CODE1 y1) AND (x CODE2 y2))
+     and ((y1 CODE1 x) AND (x CODE2 y2)).  */
+  if (tree t = and_comparisons_contain_equal_operands (code1, op1a, op1b, code2,
+						       op2a, op2b))
+    return t;
+
   tree truth_type = truth_type_for (TREE_TYPE (op1a));
 
   /* First check for ((x CODE1 y) AND (x CODE2 y)).  */
@@ -5999,6 +6061,62 @@  or_var_with_comparison_1 (gimple *stmt,
   return NULL_TREE;
 }
 
+/* Try to simplify the OR of two comparisons defined by
+   (OP1A CODE1 OP1B) or (OP2A CODE2 OP2B), respectively.
+   This optimization needs to satisfy op1a equal to op2a
+   or op1b equal to op2a.  If this can be done without
+   constructing an intermediate value, return the resulting
+   tree; otherwise NULL_TREE is returned.  */
+
+static tree
+or_comparisons_contain_equal_operands (enum tree_code code1, tree op1a,
+				       tree op1b, enum tree_code code2,
+				       tree op2a, tree op2b)
+{
+  /* Transform ((Y1 CODE1 X) OR (X CODE2 Y2)) to
+     ((X CODE1' Y1) OR (X CODE2 Y2)).  */
+  if (!operand_equal_p (op1a, op1b, 0) && operand_equal_p (op1b, op2a, 0)
+      && !operand_equal_p (op2a, op2b, 0))
+    return or_comparisons_contain_equal_operands (swap_tree_comparison (code1),
+						  op1b, op1a, code2, op2a,
+						  op2b);
+
+  tree op1a_type = TREE_TYPE (op1a);
+  tree op1b_type = TREE_TYPE (op1b);
+  tree op2b_type = TREE_TYPE (op2b);
+
+  if (INTEGRAL_TYPE_P (op1a_type) && INTEGRAL_TYPE_P (op1b_type)
+      && operand_equal_p (op1a, op2a, 0) && TREE_CODE (op2b) == INTEGER_CST)
+    {
+      if (wi::eq_p (wi::to_wide (op2b), wi::min_value (op2b_type)))
+	{
+	  /* x > y || x != XXX_MIN --> x != XXX_MIN  */
+	  if (code1 == GT_EXPR && code2 == NE_EXPR)
+	    return fold_build2 (code2, boolean_type_node, op2a, op2b);
+	  /* x <= y || x != XXX_MIN --> true  */
+	  if (code1 == LE_EXPR && code2 == NE_EXPR)
+	    return boolean_true_node;
+	  /* x <= y || x == XXX_MIN --> x <= y  */
+	  if (code1 == LE_EXPR && code2 == EQ_EXPR)
+	    return fold_build2 (code1, boolean_type_node, op1a, op1b);
+	}
+      else if (wi::eq_p (wi::to_wide (op2b), wi::max_value (op2b_type)))
+	{
+	  /* x < y || x != XXX_MAX --> x != XXX_MAX  */
+	  if (code1 == LT_EXPR && code2 == NE_EXPR)
+	    return fold_build2 (code2, boolean_type_node, op2a, op2b);
+	  /* x >= y || x != XXX_MAX --> true  */
+	  if (code1 == GE_EXPR && code2 == NE_EXPR)
+	    return boolean_true_node;
+	  /* x >= y || x == XXX_MAX --> x >= y  */
+	  if (code1 == GE_EXPR && code2 == EQ_EXPR)
+	    return fold_build2 (code1, boolean_type_node, op1a, op1b);
+	}
+    }
+
+  return NULL_TREE;
+}
+
 /* Try to simplify the OR of two comparisons defined by
    (OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively.
    If this can be done without constructing an intermediate value,
@@ -6010,6 +6128,12 @@  static tree
 or_comparisons_1 (enum tree_code code1, tree op1a, tree op1b,
 		  enum tree_code code2, tree op2a, tree op2b)
 {
+  /* Try to optimize ((x CODE1 y1) OR (x CODE2 y2))
+     and ((y1 CODE1 x) OR (x CODE2 y2)).  */
+  if (tree t = or_comparisons_contain_equal_operands (code1, op1a, op1b, code2,
+						      op2a, op2b))
+    return t;
+
   tree truth_type = truth_type_for (TREE_TYPE (op1a));
 
   /* First check for ((x CODE1 y) OR (x CODE2 y)).  */
diff --git a/gcc/testsuite/gcc.dg/pr88784-1.c b/gcc/testsuite/gcc.dg/pr88784-1.c
new file mode 100644
index 00000000000..067d40d0739
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-1.c
@@ -0,0 +1,30 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x > y && x != 0 --> x > y */
+  return x > y && x != 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x < y && x != UINT_MAX --> x < y */
+  return x < y && x != UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x > y && x != INT_MIN --> x > y */
+  return x > y && x != INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x < y && x != INT_MAX --> x < y */
+  return x < y && x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-10.c b/gcc/testsuite/gcc.dg/pr88784-10.c
new file mode 100644
index 00000000000..fa185d680c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-10.c
@@ -0,0 +1,32 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x <= y || x != 0 --> true */
+  return x <= y || x != 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x >= y || x != UINT_MAX --> true */
+  return x >= y || x != UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x <= y || x != INT_MIN --> true */
+  return x <= y || x != INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x >= y || x != INT_MAX --> true */
+  return x >= y || x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " <= " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " >= " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-11.c b/gcc/testsuite/gcc.dg/pr88784-11.c
new file mode 100644
index 00000000000..4465910efbb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-11.c
@@ -0,0 +1,30 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x <= y || x == 0 --> x <= y */
+  return x <= y || x == 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x >= y || x == UINT_MAX --> x >= y */
+  return x >= y || x == UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x <= y || x == INT_MIN --> x <= y */
+  return x <= y || x == INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x >= y || x == INT_MAX --> x >= y */
+  return x >= y || x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " == " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-12.c b/gcc/testsuite/gcc.dg/pr88784-12.c
new file mode 100644
index 00000000000..477bba07821
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-12.c
@@ -0,0 +1,30 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x <= y || x == 0 --> x <= y */
+  return x <= y || x == 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x >= y || x == UINT_MAX --> x >= y */
+  return x >= y || x == UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x <= y || x == INT_MIN --> x <= y */
+  return x <= y || x == INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x >= y || x == INT_MAX --> x >= y */
+  return x >= y || x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " == " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-2.c b/gcc/testsuite/gcc.dg/pr88784-2.c
new file mode 100644
index 00000000000..265ddc7ceeb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-2.c
@@ -0,0 +1,30 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x > y && x != 0 --> x > y */
+  return x > y && x != 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x < y && x != UINT_MAX --> x < y */
+  return x < y && x != UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x > y && x != INT_MIN --> x > y */
+  return x > y && x != INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x < y && x != INT_MAX --> x < y */
+  return x < y && x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-3.c b/gcc/testsuite/gcc.dg/pr88784-3.c
new file mode 100644
index 00000000000..be2ce315e60
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-3.c
@@ -0,0 +1,32 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x > y && x == 0 --> false */
+  return x > y && x == 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x < y && x == UINT_MAX --> false */
+  return x < y && x == UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x > y && x == INT_MIN --> false */
+  return x > y && x == INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x < y && x == INT_MAX --> false */
+  return x < y && x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " == " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " > " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " < " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-4.c b/gcc/testsuite/gcc.dg/pr88784-4.c
new file mode 100644
index 00000000000..b927e712464
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-4.c
@@ -0,0 +1,32 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x > y && x == 0 --> false */
+  return x > y && x == 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x < y && x == UINT_MAX --> false */
+  return x < y && x == UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x > y && x == INT_MIN --> false */
+  return x > y && x == INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x < y && x == INT_MAX --> false */
+  return x < y && x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " == " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " > " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " < " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-5.c b/gcc/testsuite/gcc.dg/pr88784-5.c
new file mode 100644
index 00000000000..c6a349d7c75
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-5.c
@@ -0,0 +1,31 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x <= y && x == 0 --> x == 0 */
+  return x <= y && x == 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x >= y && x == UINT_MAX --> x == UINT_MAX */
+  return x >= y && x == UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x <= y && x == INT_MIN --> x == INT_MIN */
+  return x <= y && x == INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x >= y && x == INT_MAX --> x == INT_MAX */
+  return x >= y && x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " <= " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " >= " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-6.c b/gcc/testsuite/gcc.dg/pr88784-6.c
new file mode 100644
index 00000000000..5b5d2221bf0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-6.c
@@ -0,0 +1,31 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x <= y && x == 0 --> x == 0 */
+  return x <= y && x == 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x >= y && x == UINT_MAX --> x == UINT_MAX */
+  return x >= y && x == UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x <= y && x == INT_MIN --> x == INT_MIN */
+  return x <= y && x == INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x >= y && x == INT_MAX --> x == INT_MAX */
+  return x >= y && x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " <= " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " >= " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-7.c b/gcc/testsuite/gcc.dg/pr88784-7.c
new file mode 100644
index 00000000000..937d2d26593
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-7.c
@@ -0,0 +1,31 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x > y || x != 0 --> x != 0 */
+  return x > y || x != 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x < y || x != UINT_MAX --> x != UINT_MAX */
+  return x < y || x != UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x > y || x != INT_MIN --> x != INT_MIN */
+  return x > y || x != INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x < y || x != INT_MAX --> x != INT_MAX */
+  return x < y || x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " > " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " < " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-8.c b/gcc/testsuite/gcc.dg/pr88784-8.c
new file mode 100644
index 00000000000..7f5c845eb27
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-8.c
@@ -0,0 +1,31 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x > y || x != 0 --> x != 0 */
+  return x > y || x != 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x < y || x != UINT_MAX --> x != UINT_MAX */
+  return x < y || x != UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x > y || x != INT_MIN --> x != INT_MIN */
+  return x > y || x != INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x < y || x != INT_MAX --> x != INT_MAX */
+  return x < y || x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " > " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " < " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-9.c b/gcc/testsuite/gcc.dg/pr88784-9.c
new file mode 100644
index 00000000000..94f62d94ede
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-9.c
@@ -0,0 +1,32 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x <= y || x != 0 --> true */
+  return x <= y || x != 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x >= y || x != UINT_MAX --> true */
+  return x >= y || x != UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x <= y || x != INT_MIN --> true */
+  return x <= y || x != INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x >= y || x != INT_MAX --> true */
+  return x >= y || x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " <= " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " >= " "ifcombine" } } */