Message ID | 1561615913-22109-1-git-send-email-helijia@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [middle-end/88784] Middle end is missing some optimizations about unsigned | expand |
On 6/27/19 12:11 AM, Li Jia He wrote: > Hi, > > According to the optimizable case described by Qi Feng on > issue 88784, we can combine the cases into the following: > > 1. x > y && x != XXX_MIN --> x > y > 2. x > y && x == XXX_MIN --> false > 3. x <= y && x == XXX_MIN --> x == XXX_MIN > > 4. x < y && x != XXX_MAX --> x < y > 5. x < y && x == XXX_MAX --> false > 6. x >= y && x == XXX_MAX --> x == XXX_MAX > > 7. x > y || x != XXX_MIN --> x != XXX_MIN > 8. x <= y || x != XXX_MIN --> true > 9. x <= y || x == XXX_MIN --> x <= y > > 10. x < y || x != XXX_MAX --> x != UXXX_MAX > 11. x >= y || x != XXX_MAX --> true > 12. x >= y || x == XXX_MAX --> x >= y > > Note: XXX_MIN represents the minimum value of type x. > XXX_MAX represents the maximum value of type x. > > Here we don't need to care about whether the operation is > signed or unsigned. For example, in the below equation: > > 'x > y && x != XXX_MIN --> x > y' > > If the x type is signed int and XXX_MIN is INT_MIN, we can > optimize it to 'x > y'. However, if the type of x is unsigned > int and XXX_MIN is 0, we can still optimize it to 'x > y'. > > The regression testing for the patch was done on GCC mainline on > > powerpc64le-unknown-linux-gnu (Power 9 LE) > > with no regressions. Is it OK for trunk ? > > Thanks, > Lijia He > > gcc/ChangeLog > > 2019-06-27 Li Jia He <helijia@linux.ibm.com> > Qi Feng <ffengqi@linux.ibm.com> > > PR middle-end/88784 > * gimple-fold.c (and_comparisons_contain_equal_operands): New function. > (and_comparisons_1): Use and_comparisons_contain_equal_operands. > (or_comparisons_contain_equal_operands): New function. > (or_comparisons_1): Use or_comparisons_contain_equal_operands. Would this be better done via match.pd? ISTM this transformation would be well suited for that framework. jeff
On 2019/6/27 11:48 PM, Jeff Law wrote: > On 6/27/19 12:11 AM, Li Jia He wrote: >> Hi, >> >> According to the optimizable case described by Qi Feng on >> issue 88784, we can combine the cases into the following: >> >> 1. x > y && x != XXX_MIN --> x > y >> 2. x > y && x == XXX_MIN --> false >> 3. x <= y && x == XXX_MIN --> x == XXX_MIN >> >> 4. x < y && x != XXX_MAX --> x < y >> 5. x < y && x == XXX_MAX --> false >> 6. x >= y && x == XXX_MAX --> x == XXX_MAX >> >> 7. x > y || x != XXX_MIN --> x != XXX_MIN >> 8. x <= y || x != XXX_MIN --> true >> 9. x <= y || x == XXX_MIN --> x <= y >> >> 10. x < y || x != XXX_MAX --> x != UXXX_MAX >> 11. x >= y || x != XXX_MAX --> true >> 12. x >= y || x == XXX_MAX --> x >= y >> >> Note: XXX_MIN represents the minimum value of type x. >> XXX_MAX represents the maximum value of type x. >> >> Here we don't need to care about whether the operation is >> signed or unsigned. For example, in the below equation: >> >> 'x > y && x != XXX_MIN --> x > y' >> >> If the x type is signed int and XXX_MIN is INT_MIN, we can >> optimize it to 'x > y'. However, if the type of x is unsigned >> int and XXX_MIN is 0, we can still optimize it to 'x > y'. >> >> The regression testing for the patch was done on GCC mainline on >> >> powerpc64le-unknown-linux-gnu (Power 9 LE) >> >> with no regressions. Is it OK for trunk ? >> >> Thanks, >> Lijia He >> >> gcc/ChangeLog >> >> 2019-06-27 Li Jia He <helijia@linux.ibm.com> >> Qi Feng <ffengqi@linux.ibm.com> >> >> PR middle-end/88784 >> * gimple-fold.c (and_comparisons_contain_equal_operands): New function. >> (and_comparisons_1): Use and_comparisons_contain_equal_operands. >> (or_comparisons_contain_equal_operands): New function. >> (or_comparisons_1): Use or_comparisons_contain_equal_operands. > Would this be better done via match.pd? ISTM this transformation would > be well suited for that framework. Hi, Jeff I did this because of the following test case: ` _Bool comp(unsigned x, unsigned y) { return x > y && x != 0; } ` The gimple file dumped on the power platform is: ` comp (unsigned int x, unsigned int y) { _Bool D.2837; int iftmp.0; if (x > y) goto <D.2841>; else goto <D.2839>; <D.2841>: if (x != 0) goto <D.2842>; else goto <D.2839>; <D.2842>: iftmp.0 = 1; goto <D.2840>; <D.2839>: iftmp.0 = 0; <D.2840>: D.2837 = (_Bool) iftmp.0; return D.2837; } ` However, the gimple file dumped on x86 is ` comp (unsigned int x, unsigned int y) { _Bool D.2837; _1 = x > y; _2 = x != 0; _3 = _1 & _2; _4 = (int) _3; D.2837 = (_Bool) _4; return D.2837; } ` The reason for the inconsistency between these two behaviors is param logical-op-non-short-circuit. If we add the pattern to the match.pd file, we can only optimize the situation in which the statement is in the same basic block (logical-op-non-short-circuit=1, x86). But for a cross-basic block (logical-op-non-short-circuit=0, power), match.pd can't handle this situation. Another reason is that I found out maybe_fold_and_comparisons and maybe_fold_or_comparisons are not only called by ifcombine pass but also by reassoc pass. Using this method can basically unify param logical-op-non-short-circuit=0 or 1. Thanks, Lijia He > > jeff >
On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote: > > > > On 2019/6/27 11:48 PM, Jeff Law wrote: > > On 6/27/19 12:11 AM, Li Jia He wrote: > >> Hi, > >> > >> According to the optimizable case described by Qi Feng on > >> issue 88784, we can combine the cases into the following: > >> > >> 1. x > y && x != XXX_MIN --> x > y > >> 2. x > y && x == XXX_MIN --> false > >> 3. x <= y && x == XXX_MIN --> x == XXX_MIN > >> > >> 4. x < y && x != XXX_MAX --> x < y > >> 5. x < y && x == XXX_MAX --> false > >> 6. x >= y && x == XXX_MAX --> x == XXX_MAX > >> > >> 7. x > y || x != XXX_MIN --> x != XXX_MIN > >> 8. x <= y || x != XXX_MIN --> true > >> 9. x <= y || x == XXX_MIN --> x <= y > >> > >> 10. x < y || x != XXX_MAX --> x != UXXX_MAX > >> 11. x >= y || x != XXX_MAX --> true > >> 12. x >= y || x == XXX_MAX --> x >= y > >> > >> Note: XXX_MIN represents the minimum value of type x. > >> XXX_MAX represents the maximum value of type x. > >> > >> Here we don't need to care about whether the operation is > >> signed or unsigned. For example, in the below equation: > >> > >> 'x > y && x != XXX_MIN --> x > y' > >> > >> If the x type is signed int and XXX_MIN is INT_MIN, we can > >> optimize it to 'x > y'. However, if the type of x is unsigned > >> int and XXX_MIN is 0, we can still optimize it to 'x > y'. > >> > >> The regression testing for the patch was done on GCC mainline on > >> > >> powerpc64le-unknown-linux-gnu (Power 9 LE) > >> > >> with no regressions. Is it OK for trunk ? > >> > >> Thanks, > >> Lijia He > >> > >> gcc/ChangeLog > >> > >> 2019-06-27 Li Jia He <helijia@linux.ibm.com> > >> Qi Feng <ffengqi@linux.ibm.com> > >> > >> PR middle-end/88784 > >> * gimple-fold.c (and_comparisons_contain_equal_operands): New function. > >> (and_comparisons_1): Use and_comparisons_contain_equal_operands. > >> (or_comparisons_contain_equal_operands): New function. > >> (or_comparisons_1): Use or_comparisons_contain_equal_operands. > > Would this be better done via match.pd? ISTM this transformation would > > be well suited for that framework. > > Hi, Jeff > > I did this because of the following test case: > ` > _Bool comp(unsigned x, unsigned y) > { > return x > y && x != 0; > } > ` > The gimple file dumped on the power platform is: > ` > comp (unsigned int x, unsigned int y) > { > _Bool D.2837; > int iftmp.0; > > if (x > y) goto <D.2841>; else goto <D.2839>; > <D.2841>: > if (x != 0) goto <D.2842>; else goto <D.2839>; > <D.2842>: > iftmp.0 = 1; > goto <D.2840>; > <D.2839>: > iftmp.0 = 0; > <D.2840>: > D.2837 = (_Bool) iftmp.0; > return D.2837; > } > ` > However, the gimple file dumped on x86 is > ` > comp (unsigned int x, unsigned int y) > { > _Bool D.2837; > > _1 = x > y; > _2 = x != 0; > _3 = _1 & _2; > _4 = (int) _3; > D.2837 = (_Bool) _4; > return D.2837; > } > ` > > The reason for the inconsistency between these two behaviors is param > logical-op-non-short-circuit. If we add the pattern to the match.pd > file, we can only optimize the situation in which the statement is in > the same basic block (logical-op-non-short-circuit=1, x86). But for > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd > can't handle this situation. > > Another reason is that I found out maybe_fold_and_comparisons and > maybe_fold_or_comparisons are not only called by ifcombine pass but > also by reassoc pass. Using this method can basically unify param > logical-op-non-short-circuit=0 or 1. As mentioned before ifcombine pass should be using gimple-match instead of fold_build. Try converting ifcombine over to gimple-match infrastructure and add these to match.pd. NOTE tree-ssa-phiopt should also be moved over to gimple-match but that is a different issue. Thanks, Andrew Pinski > > Thanks, > Lijia He > > > > > jeff > > >
On Fri, 28 Jun 2019, Andrew Pinski wrote: > On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote: > > > > > > > > On 2019/6/27 11:48 PM, Jeff Law wrote: > > > On 6/27/19 12:11 AM, Li Jia He wrote: > > >> Hi, > > >> > > >> According to the optimizable case described by Qi Feng on > > >> issue 88784, we can combine the cases into the following: > > >> > > >> 1. x > y && x != XXX_MIN --> x > y > > >> 2. x > y && x == XXX_MIN --> false > > >> 3. x <= y && x == XXX_MIN --> x == XXX_MIN > > >> > > >> 4. x < y && x != XXX_MAX --> x < y > > >> 5. x < y && x == XXX_MAX --> false > > >> 6. x >= y && x == XXX_MAX --> x == XXX_MAX > > >> > > >> 7. x > y || x != XXX_MIN --> x != XXX_MIN > > >> 8. x <= y || x != XXX_MIN --> true > > >> 9. x <= y || x == XXX_MIN --> x <= y > > >> > > >> 10. x < y || x != XXX_MAX --> x != UXXX_MAX > > >> 11. x >= y || x != XXX_MAX --> true > > >> 12. x >= y || x == XXX_MAX --> x >= y > > >> > > >> Note: XXX_MIN represents the minimum value of type x. > > >> XXX_MAX represents the maximum value of type x. > > >> > > >> Here we don't need to care about whether the operation is > > >> signed or unsigned. For example, in the below equation: > > >> > > >> 'x > y && x != XXX_MIN --> x > y' > > >> > > >> If the x type is signed int and XXX_MIN is INT_MIN, we can > > >> optimize it to 'x > y'. However, if the type of x is unsigned > > >> int and XXX_MIN is 0, we can still optimize it to 'x > y'. > > >> > > >> The regression testing for the patch was done on GCC mainline on > > >> > > >> powerpc64le-unknown-linux-gnu (Power 9 LE) > > >> > > >> with no regressions. Is it OK for trunk ? > > >> > > >> Thanks, > > >> Lijia He > > >> > > >> gcc/ChangeLog > > >> > > >> 2019-06-27 Li Jia He <helijia@linux.ibm.com> > > >> Qi Feng <ffengqi@linux.ibm.com> > > >> > > >> PR middle-end/88784 > > >> * gimple-fold.c (and_comparisons_contain_equal_operands): New function. > > >> (and_comparisons_1): Use and_comparisons_contain_equal_operands. > > >> (or_comparisons_contain_equal_operands): New function. > > >> (or_comparisons_1): Use or_comparisons_contain_equal_operands. > > > Would this be better done via match.pd? ISTM this transformation would > > > be well suited for that framework. > > > > Hi, Jeff > > > > I did this because of the following test case: > > ` > > _Bool comp(unsigned x, unsigned y) > > { > > return x > y && x != 0; > > } > > ` > > The gimple file dumped on the power platform is: > > ` > > comp (unsigned int x, unsigned int y) > > { > > _Bool D.2837; > > int iftmp.0; > > > > if (x > y) goto <D.2841>; else goto <D.2839>; > > <D.2841>: > > if (x != 0) goto <D.2842>; else goto <D.2839>; > > <D.2842>: > > iftmp.0 = 1; > > goto <D.2840>; > > <D.2839>: > > iftmp.0 = 0; > > <D.2840>: > > D.2837 = (_Bool) iftmp.0; > > return D.2837; > > } > > ` > > However, the gimple file dumped on x86 is > > ` > > comp (unsigned int x, unsigned int y) > > { > > _Bool D.2837; > > > > _1 = x > y; > > _2 = x != 0; > > _3 = _1 & _2; > > _4 = (int) _3; > > D.2837 = (_Bool) _4; > > return D.2837; > > } > > ` > > > > The reason for the inconsistency between these two behaviors is param > > logical-op-non-short-circuit. If we add the pattern to the match.pd > > file, we can only optimize the situation in which the statement is in > > the same basic block (logical-op-non-short-circuit=1, x86). But for > > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd > > can't handle this situation. > > > > Another reason is that I found out maybe_fold_and_comparisons and > > maybe_fold_or_comparisons are not only called by ifcombine pass but > > also by reassoc pass. Using this method can basically unify param > > logical-op-non-short-circuit=0 or 1. > > > As mentioned before ifcombine pass should be using gimple-match > instead of fold_build. Try converting ifcombine over to gimple-match > infrastructure and add these to match.pd. Yes, I mentioned that in the PR. The issue is that at the moment to combine x > y with x <= y you'd have to build GENERIC trees for both or temporary GIMPLE assign with a SSA def (and then feed that into the GENERIC or GIMPLE match.pd path). maybe_fold_and/or_comparisons handle two exploded binary expressions while the current match.pd entries handle at most one exploded one (the outermost then, either AND or OR). But it would be definitely doable to auto-generate maybe_fold_and/or_comparisons from match.pd patterns which is what I'd ultimatively suggest to do (in some more generalized form maybe). Either with a separate genmatch invocation or as part of the --gimple processing (not sure what is more feasible here). I told Li Jia He that I don't expect him to do this work. Note I didn't review the actual patch yet. Thanks, Richard.
On 2019/7/1 3:30 PM, Richard Biener wrote: > On Fri, 28 Jun 2019, Andrew Pinski wrote: > >> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote: >>> >>> >>> >>> On 2019/6/27 11:48 PM, Jeff Law wrote: >>>> On 6/27/19 12:11 AM, Li Jia He wrote: >>>>> Hi, >>>>> >>>>> According to the optimizable case described by Qi Feng on >>>>> issue 88784, we can combine the cases into the following: >>>>> >>>>> 1. x > y && x != XXX_MIN --> x > y >>>>> 2. x > y && x == XXX_MIN --> false >>>>> 3. x <= y && x == XXX_MIN --> x == XXX_MIN >>>>> >>>>> 4. x < y && x != XXX_MAX --> x < y >>>>> 5. x < y && x == XXX_MAX --> false >>>>> 6. x >= y && x == XXX_MAX --> x == XXX_MAX >>>>> >>>>> 7. x > y || x != XXX_MIN --> x != XXX_MIN >>>>> 8. x <= y || x != XXX_MIN --> true >>>>> 9. x <= y || x == XXX_MIN --> x <= y >>>>> >>>>> 10. x < y || x != XXX_MAX --> x != UXXX_MAX >>>>> 11. x >= y || x != XXX_MAX --> true >>>>> 12. x >= y || x == XXX_MAX --> x >= y >>>>> >>>>> Note: XXX_MIN represents the minimum value of type x. >>>>> XXX_MAX represents the maximum value of type x. >>>>> >>>>> Here we don't need to care about whether the operation is >>>>> signed or unsigned. For example, in the below equation: >>>>> >>>>> 'x > y && x != XXX_MIN --> x > y' >>>>> >>>>> If the x type is signed int and XXX_MIN is INT_MIN, we can >>>>> optimize it to 'x > y'. However, if the type of x is unsigned >>>>> int and XXX_MIN is 0, we can still optimize it to 'x > y'. >>>>> >>>>> The regression testing for the patch was done on GCC mainline on >>>>> >>>>> powerpc64le-unknown-linux-gnu (Power 9 LE) >>>>> >>>>> with no regressions. Is it OK for trunk ? >>>>> >>>>> Thanks, >>>>> Lijia He >>>>> >>>>> gcc/ChangeLog >>>>> >>>>> 2019-06-27 Li Jia He <helijia@linux.ibm.com> >>>>> Qi Feng <ffengqi@linux.ibm.com> >>>>> >>>>> PR middle-end/88784 >>>>> * gimple-fold.c (and_comparisons_contain_equal_operands): New function. >>>>> (and_comparisons_1): Use and_comparisons_contain_equal_operands. >>>>> (or_comparisons_contain_equal_operands): New function. >>>>> (or_comparisons_1): Use or_comparisons_contain_equal_operands. >>>> Would this be better done via match.pd? ISTM this transformation would >>>> be well suited for that framework. >>> >>> Hi, Jeff >>> >>> I did this because of the following test case: >>> ` >>> _Bool comp(unsigned x, unsigned y) >>> { >>> return x > y && x != 0; >>> } >>> ` >>> The gimple file dumped on the power platform is: >>> ` >>> comp (unsigned int x, unsigned int y) >>> { >>> _Bool D.2837; >>> int iftmp.0; >>> >>> if (x > y) goto <D.2841>; else goto <D.2839>; >>> <D.2841>: >>> if (x != 0) goto <D.2842>; else goto <D.2839>; >>> <D.2842>: >>> iftmp.0 = 1; >>> goto <D.2840>; >>> <D.2839>: >>> iftmp.0 = 0; >>> <D.2840>: >>> D.2837 = (_Bool) iftmp.0; >>> return D.2837; >>> } >>> ` >>> However, the gimple file dumped on x86 is >>> ` >>> comp (unsigned int x, unsigned int y) >>> { >>> _Bool D.2837; >>> >>> _1 = x > y; >>> _2 = x != 0; >>> _3 = _1 & _2; >>> _4 = (int) _3; >>> D.2837 = (_Bool) _4; >>> return D.2837; >>> } >>> ` >>> >>> The reason for the inconsistency between these two behaviors is param >>> logical-op-non-short-circuit. If we add the pattern to the match.pd >>> file, we can only optimize the situation in which the statement is in >>> the same basic block (logical-op-non-short-circuit=1, x86). But for >>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd >>> can't handle this situation. >>> >>> Another reason is that I found out maybe_fold_and_comparisons and >>> maybe_fold_or_comparisons are not only called by ifcombine pass but >>> also by reassoc pass. Using this method can basically unify param >>> logical-op-non-short-circuit=0 or 1. >> >> >> As mentioned before ifcombine pass should be using gimple-match >> instead of fold_build. Try converting ifcombine over to gimple-match >> infrastructure and add these to match.pd. > > Yes, I mentioned that in the PR. The issue is that at the moment > to combine x > y with x <= y you'd have to build GENERIC trees > for both or temporary GIMPLE assign with a SSA def (and then feed > that into the GENERIC or GIMPLE match.pd path). Hi, I did some experimentation using ‘temporary GIMPLE with a SSA def (and then feed that into the GIMPLE match.pd path’. Could we consider the code in the attachment(I did a test and the code took effect)? Thanks, Lijia He > > maybe_fold_and/or_comparisons handle two exploded binary expressions > while the current match.pd entries handle at most one exploded one > (the outermost then, either AND or OR). But it would be definitely > doable to auto-generate maybe_fold_and/or_comparisons from match.pd > patterns which is what I'd ultimatively suggest to do (in some more > generalized form maybe). Either with a separate genmatch invocation > or as part of the --gimple processing (not sure what is more feasible > here). > > I told Li Jia He that I don't expect him to do this work. > > Note I didn't review the actual patch yet. > > Thanks, > Richard. > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index dfb31a02078..9974b491626 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -5789,6 +5789,12 @@ and_comparisons_1 (enum tree_code code1, tree op1a, tree op1b, return NULL_TREE; } +static tree +do_valueize (tree t) +{ + return t; +} + /* Try to simplify the AND of two comparisons, specified by (OP1A CODE1 OP1B) and (OP2B CODE2 OP2B), respectively. If this can be simplified to a single expression (without requiring @@ -5800,11 +5806,39 @@ tree maybe_fold_and_comparisons (enum tree_code code1, tree op1a, tree op1b, enum tree_code code2, tree op2a, tree op2b) { - tree t = and_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b); + tree t1 = fold_build2 (code1, boolean_type_node, op1a, op1b); + tree t2 = fold_build2 (code2, boolean_type_node, op2a, op2b); + + tree new_lhs1 = make_ssa_name (TREE_TYPE (t1)); + tree new_lhs2 = make_ssa_name (TREE_TYPE (t2)); + + gassign *gassign1 = gimple_build_assign (new_lhs1, t1); + gassign *gassign2 = gimple_build_assign (new_lhs2, t2); + + tree t = gimple_simplify (TRUTH_AND_EXPR, boolean_type_node, + gimple_assign_lhs (gassign1), + gimple_assign_lhs (gassign2), NULL, do_valueize); + + if (!t) + { + t = gimple_simplify (TRUTH_AND_EXPR, boolean_type_node, + gimple_assign_lhs (gassign2), + gimple_assign_lhs (gassign1), NULL, do_valueize); + } + if (t) - return t; - else - return and_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b); + { + gimple *def = SSA_NAME_DEF_STMT (t); + + if (!is_gimple_assign (def) + || TREE_CODE_CLASS (gimple_assign_rhs_code (def)) != tcc_comparison) + return NULL_TREE; + + return fold_build2 (gimple_assign_rhs_code (def), boolean_type_node, + gimple_assign_rhs1 (def), gimple_assign_rhs2 (def)); + } + + return NULL_TREE; } /* Helper function for or_comparisons_1: try to simplify the OR of the diff --git a/gcc/match.pd b/gcc/match.pd index f8e35e96d22..21a147d0ff1 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1859,6 +1859,101 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) { wide_int_to_tree (type, (wi::to_wide (@1) & (bitpos / BITS_PER_UNIT))); })))) +/* x > y && x != XXX_MIN --> x > y */ +(for and (truth_and bit_and) + (simplify + (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2))))) + @3))) + +/* x > y && x == XXX_MIN --> false */ +(for and (truth_and bit_and) + (simplify + (and:c (gt:c @0 @1) (eq @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2))))) + { boolean_false_node; }))) + +/* x <= y && x == XXX_MIN --> x == XXX_MIN */ +(for and (truth_and bit_and) + (simplify + (and:c (le:c @0 @1) (eq@3 @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2))))) + @3))) + +/* x < y && x != XXX_MAX --> x < y */ +(for and (truth_and bit_and) + (simplify + (and:c (lt:c@3 @0 @1) (ne @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && (wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2))))) + @3))) + +/* x < y && x == XXX_MAX --> false */ +(for and (truth_and bit_and) + (simplify + (and:c (lt:c @0 @1) (eq @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && (wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2))))) + { boolean_false_node; }))) + +/* x >= y && x == XXX_MAX --> x == XXX_MAX */ +(for and (truth_and bit_and) + (simplify + (and:c (ge:c @0 @1) (eq@3 @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && (wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2))))) + @3))) + +/* x > y || x != XXX_MIN --> x != XXX_MIN */ +(for or (truth_or bit_ior) + (simplify + (or:c (gt:c @0 @1) (ne@3 @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))) + @3))) + +/* x <= y || x != XXX_MIN --> true */ +(for or (truth_or bit_ior) + (simplify + (or:c (le:c @0 @1) (ne @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))) + { boolean_true_node; }))) + +/* x <= y || x == XXX_MIN --> x <= y */ +(for or (truth_or bit_ior) + (simplify + (or:c (le:c@3 @0 @1) (eq @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))) + @3))) + +/* x < y || x != XXX_MAX --> x != XXX_MAX */ +(for or (truth_or bit_ior) + (simplify + (or:c (lt:c @0 @1) (ne@3 @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2)))) + @3))) + +/* x >= y || x != XXX_MAX --> true */ +(for or (truth_or bit_ior) + (simplify + (or:c (ge:c @0 @1) (ne @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2)))) + { boolean_true_node; }))) + +/* x >= y || x == XXX_MAX --> x >= y */ +(for or (truth_or bit_ior) + (simplify + (or:c (ge:c@3 @0 @1) (eq @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && wi::eq_p (wi::to_wide (@2), wi::max_value (TREE_TYPE (@2)))) + @3))) /* We can't reassociate at all for saturating types. */ (if (!TYPE_SATURATING (type))
On Tue, 2 Jul 2019, Li Jia He wrote: > > > On 2019/7/1 3:30 PM, Richard Biener wrote: > > On Fri, 28 Jun 2019, Andrew Pinski wrote: > > > > > On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote: > > > > > > > > > > > > > > > > On 2019/6/27 11:48 PM, Jeff Law wrote: > > > > > On 6/27/19 12:11 AM, Li Jia He wrote: > > > > > > Hi, > > > > > > > > > > > > According to the optimizable case described by Qi Feng on > > > > > > issue 88784, we can combine the cases into the following: > > > > > > > > > > > > 1. x > y && x != XXX_MIN --> x > y > > > > > > 2. x > y && x == XXX_MIN --> false > > > > > > 3. x <= y && x == XXX_MIN --> x == XXX_MIN > > > > > > > > > > > > 4. x < y && x != XXX_MAX --> x < y > > > > > > 5. x < y && x == XXX_MAX --> false > > > > > > 6. x >= y && x == XXX_MAX --> x == XXX_MAX > > > > > > > > > > > > 7. x > y || x != XXX_MIN --> x != XXX_MIN > > > > > > 8. x <= y || x != XXX_MIN --> true > > > > > > 9. x <= y || x == XXX_MIN --> x <= y > > > > > > > > > > > > 10. x < y || x != XXX_MAX --> x != UXXX_MAX > > > > > > 11. x >= y || x != XXX_MAX --> true > > > > > > 12. x >= y || x == XXX_MAX --> x >= y > > > > > > > > > > > > Note: XXX_MIN represents the minimum value of type x. > > > > > > XXX_MAX represents the maximum value of type x. > > > > > > > > > > > > Here we don't need to care about whether the operation is > > > > > > signed or unsigned. For example, in the below equation: > > > > > > > > > > > > 'x > y && x != XXX_MIN --> x > y' > > > > > > > > > > > > If the x type is signed int and XXX_MIN is INT_MIN, we can > > > > > > optimize it to 'x > y'. However, if the type of x is unsigned > > > > > > int and XXX_MIN is 0, we can still optimize it to 'x > y'. > > > > > > > > > > > > The regression testing for the patch was done on GCC mainline on > > > > > > > > > > > > powerpc64le-unknown-linux-gnu (Power 9 LE) > > > > > > > > > > > > with no regressions. Is it OK for trunk ? > > > > > > > > > > > > Thanks, > > > > > > Lijia He > > > > > > > > > > > > gcc/ChangeLog > > > > > > > > > > > > 2019-06-27 Li Jia He <helijia@linux.ibm.com> > > > > > > Qi Feng <ffengqi@linux.ibm.com> > > > > > > > > > > > > PR middle-end/88784 > > > > > > * gimple-fold.c (and_comparisons_contain_equal_operands): New > > > > > > function. > > > > > > (and_comparisons_1): Use > > > > > > and_comparisons_contain_equal_operands. > > > > > > (or_comparisons_contain_equal_operands): New function. > > > > > > (or_comparisons_1): Use or_comparisons_contain_equal_operands. > > > > > Would this be better done via match.pd? ISTM this transformation > > > > > would > > > > > be well suited for that framework. > > > > > > > > Hi, Jeff > > > > > > > > I did this because of the following test case: > > > > ` > > > > _Bool comp(unsigned x, unsigned y) > > > > { > > > > return x > y && x != 0; > > > > } > > > > ` > > > > The gimple file dumped on the power platform is: > > > > ` > > > > comp (unsigned int x, unsigned int y) > > > > { > > > > _Bool D.2837; > > > > int iftmp.0; > > > > > > > > if (x > y) goto <D.2841>; else goto <D.2839>; > > > > <D.2841>: > > > > if (x != 0) goto <D.2842>; else goto <D.2839>; > > > > <D.2842>: > > > > iftmp.0 = 1; > > > > goto <D.2840>; > > > > <D.2839>: > > > > iftmp.0 = 0; > > > > <D.2840>: > > > > D.2837 = (_Bool) iftmp.0; > > > > return D.2837; > > > > } > > > > ` > > > > However, the gimple file dumped on x86 is > > > > ` > > > > comp (unsigned int x, unsigned int y) > > > > { > > > > _Bool D.2837; > > > > > > > > _1 = x > y; > > > > _2 = x != 0; > > > > _3 = _1 & _2; > > > > _4 = (int) _3; > > > > D.2837 = (_Bool) _4; > > > > return D.2837; > > > > } > > > > ` > > > > > > > > The reason for the inconsistency between these two behaviors is param > > > > logical-op-non-short-circuit. If we add the pattern to the match.pd > > > > file, we can only optimize the situation in which the statement is in > > > > the same basic block (logical-op-non-short-circuit=1, x86). But for > > > > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd > > > > can't handle this situation. > > > > > > > > Another reason is that I found out maybe_fold_and_comparisons and > > > > maybe_fold_or_comparisons are not only called by ifcombine pass but > > > > also by reassoc pass. Using this method can basically unify param > > > > logical-op-non-short-circuit=0 or 1. > > > > > > > > > As mentioned before ifcombine pass should be using gimple-match > > > instead of fold_build. Try converting ifcombine over to gimple-match > > > infrastructure and add these to match.pd. > > > > Yes, I mentioned that in the PR. The issue is that at the moment > > to combine x > y with x <= y you'd have to build GENERIC trees > > for both or temporary GIMPLE assign with a SSA def (and then feed > > that into the GENERIC or GIMPLE match.pd path). > > Hi, > > I did some experimentation using ‘temporary GIMPLE with a SSA def (and then > feed that into the GIMPLE match.pd path’. Could we consider the code in the > attachment(I did a test and the code took effect)? No, that's excessive overhead IMHO - the whole point of these functions originally was to avoid build2 (...) on both conditionals. If we now create two SSA names and two GIMPLE statements that's too much. As said it shouldn't be rocket science to teach genmatch to auto-generate those functions from match.pd patterns but it certainly is some work (less so for somebody with genmatch knowledge). I'll give it a poke... Richard. > Thanks, > Lijia He > > > > > maybe_fold_and/or_comparisons handle two exploded binary expressions > > while the current match.pd entries handle at most one exploded one > > (the outermost then, either AND or OR). But it would be definitely > > doable to auto-generate maybe_fold_and/or_comparisons from match.pd > > patterns which is what I'd ultimatively suggest to do (in some more > > generalized form maybe). Either with a separate genmatch invocation > > or as part of the --gimple processing (not sure what is more feasible > > here). > > > > I told Li Jia He that I don't expect him to do this work. > > > > Note I didn't review the actual patch yet. > > > > Thanks, > > Richard. > > >
On Tue, 2 Jul 2019, Richard Biener wrote: > On Tue, 2 Jul 2019, Li Jia He wrote: > > > > > > > On 2019/7/1 3:30 PM, Richard Biener wrote: > > > On Fri, 28 Jun 2019, Andrew Pinski wrote: > > > > > > > On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote: > > > > > > > > > > > > > > > > > > > > On 2019/6/27 11:48 PM, Jeff Law wrote: > > > > > > On 6/27/19 12:11 AM, Li Jia He wrote: > > > > > > > Hi, > > > > > > > > > > > > > > According to the optimizable case described by Qi Feng on > > > > > > > issue 88784, we can combine the cases into the following: > > > > > > > > > > > > > > 1. x > y && x != XXX_MIN --> x > y > > > > > > > 2. x > y && x == XXX_MIN --> false > > > > > > > 3. x <= y && x == XXX_MIN --> x == XXX_MIN > > > > > > > > > > > > > > 4. x < y && x != XXX_MAX --> x < y > > > > > > > 5. x < y && x == XXX_MAX --> false > > > > > > > 6. x >= y && x == XXX_MAX --> x == XXX_MAX > > > > > > > > > > > > > > 7. x > y || x != XXX_MIN --> x != XXX_MIN > > > > > > > 8. x <= y || x != XXX_MIN --> true > > > > > > > 9. x <= y || x == XXX_MIN --> x <= y > > > > > > > > > > > > > > 10. x < y || x != XXX_MAX --> x != UXXX_MAX > > > > > > > 11. x >= y || x != XXX_MAX --> true > > > > > > > 12. x >= y || x == XXX_MAX --> x >= y > > > > > > > > > > > > > > Note: XXX_MIN represents the minimum value of type x. > > > > > > > XXX_MAX represents the maximum value of type x. > > > > > > > > > > > > > > Here we don't need to care about whether the operation is > > > > > > > signed or unsigned. For example, in the below equation: > > > > > > > > > > > > > > 'x > y && x != XXX_MIN --> x > y' > > > > > > > > > > > > > > If the x type is signed int and XXX_MIN is INT_MIN, we can > > > > > > > optimize it to 'x > y'. However, if the type of x is unsigned > > > > > > > int and XXX_MIN is 0, we can still optimize it to 'x > y'. > > > > > > > > > > > > > > The regression testing for the patch was done on GCC mainline on > > > > > > > > > > > > > > powerpc64le-unknown-linux-gnu (Power 9 LE) > > > > > > > > > > > > > > with no regressions. Is it OK for trunk ? > > > > > > > > > > > > > > Thanks, > > > > > > > Lijia He > > > > > > > > > > > > > > gcc/ChangeLog > > > > > > > > > > > > > > 2019-06-27 Li Jia He <helijia@linux.ibm.com> > > > > > > > Qi Feng <ffengqi@linux.ibm.com> > > > > > > > > > > > > > > PR middle-end/88784 > > > > > > > * gimple-fold.c (and_comparisons_contain_equal_operands): New > > > > > > > function. > > > > > > > (and_comparisons_1): Use > > > > > > > and_comparisons_contain_equal_operands. > > > > > > > (or_comparisons_contain_equal_operands): New function. > > > > > > > (or_comparisons_1): Use or_comparisons_contain_equal_operands. > > > > > > Would this be better done via match.pd? ISTM this transformation > > > > > > would > > > > > > be well suited for that framework. > > > > > > > > > > Hi, Jeff > > > > > > > > > > I did this because of the following test case: > > > > > ` > > > > > _Bool comp(unsigned x, unsigned y) > > > > > { > > > > > return x > y && x != 0; > > > > > } > > > > > ` > > > > > The gimple file dumped on the power platform is: > > > > > ` > > > > > comp (unsigned int x, unsigned int y) > > > > > { > > > > > _Bool D.2837; > > > > > int iftmp.0; > > > > > > > > > > if (x > y) goto <D.2841>; else goto <D.2839>; > > > > > <D.2841>: > > > > > if (x != 0) goto <D.2842>; else goto <D.2839>; > > > > > <D.2842>: > > > > > iftmp.0 = 1; > > > > > goto <D.2840>; > > > > > <D.2839>: > > > > > iftmp.0 = 0; > > > > > <D.2840>: > > > > > D.2837 = (_Bool) iftmp.0; > > > > > return D.2837; > > > > > } > > > > > ` > > > > > However, the gimple file dumped on x86 is > > > > > ` > > > > > comp (unsigned int x, unsigned int y) > > > > > { > > > > > _Bool D.2837; > > > > > > > > > > _1 = x > y; > > > > > _2 = x != 0; > > > > > _3 = _1 & _2; > > > > > _4 = (int) _3; > > > > > D.2837 = (_Bool) _4; > > > > > return D.2837; > > > > > } > > > > > ` > > > > > > > > > > The reason for the inconsistency between these two behaviors is param > > > > > logical-op-non-short-circuit. If we add the pattern to the match.pd > > > > > file, we can only optimize the situation in which the statement is in > > > > > the same basic block (logical-op-non-short-circuit=1, x86). But for > > > > > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd > > > > > can't handle this situation. > > > > > > > > > > Another reason is that I found out maybe_fold_and_comparisons and > > > > > maybe_fold_or_comparisons are not only called by ifcombine pass but > > > > > also by reassoc pass. Using this method can basically unify param > > > > > logical-op-non-short-circuit=0 or 1. > > > > > > > > > > > > As mentioned before ifcombine pass should be using gimple-match > > > > instead of fold_build. Try converting ifcombine over to gimple-match > > > > infrastructure and add these to match.pd. > > > > > > Yes, I mentioned that in the PR. The issue is that at the moment > > > to combine x > y with x <= y you'd have to build GENERIC trees > > > for both or temporary GIMPLE assign with a SSA def (and then feed > > > that into the GENERIC or GIMPLE match.pd path). > > > > Hi, > > > > I did some experimentation using ‘temporary GIMPLE with a SSA def (and then > > feed that into the GIMPLE match.pd path’. Could we consider the code in the > > attachment(I did a test and the code took effect)? > > No, that's excessive overhead IMHO - the whole point of these functions > originally was to avoid build2 (...) on both conditionals. If we > now create two SSA names and two GIMPLE statements that's too much. > > As said it shouldn't be rocket science to teach genmatch to > auto-generate those functions from match.pd patterns but it certainly > is some work (less so for somebody with genmatch knowledge). > I'll give it a poke... OK, it's not so easy. I guess we could lower the cost of building SSA names / gimple stmts significantly if we allocated them on the stack like via gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 * sizeof (tree)); memset (stmt1, 0, ...); gimple_set_code (stmt1, GIMPLE_ASSIGN); gimple_set_num_ops (stmt1, 3); gimple_init_singleton (stmt1); gimple_stmt_iterator gsi = gsi_for_stmt (stmt1); gimple_assign_set_rhs_with_ops (&gsi, actual operation...); tree lhs1 = XALLOCA (tree_ssa_name); memset (lhs1, 0, sizeof (tree_ssa_name)); TREE_SET_CODE (lhs1, SSA_NAME); TREE_TYPE (lhs1) = ...; gimple_assing_set_lhs (stmt1, lhs1); it's all a bit ugly and some factoring in the generic gimple machinery might easen this kind of hacks, but well... With the above you could then use gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */, boolean_type_node, lhs1, lhs2); if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges)) ... successfully simplified into 'op' ... with the simplification path then extracting the simplified result from 'op'. Note this deliberately leaves out passing a gimple_seq as storage for more complex simplification results to match existing behavior. Your patch has the GC allocation and SSA name (and namespace!) allocation overhead and most definitely misses releasing the SSA names you allocated. Note with the above hack the simplified result has to be checked for mentions of lhs1 or lhs2 - a case we have to reject because their definitions are transitional. Richard. > Richard. > > > Thanks, > > Lijia He > > > > > > > > maybe_fold_and/or_comparisons handle two exploded binary expressions > > > while the current match.pd entries handle at most one exploded one > > > (the outermost then, either AND or OR). But it would be definitely > > > doable to auto-generate maybe_fold_and/or_comparisons from match.pd > > > patterns which is what I'd ultimatively suggest to do (in some more > > > generalized form maybe). Either with a separate genmatch invocation > > > or as part of the --gimple processing (not sure what is more feasible > > > here). > > > > > > I told Li Jia He that I don't expect him to do this work. > > > > > > Note I didn't review the actual patch yet. > > > > > > Thanks, > > > Richard. > > > > > > >
On 2019/7/2 4:51 PM, Richard Biener wrote: > On Tue, 2 Jul 2019, Richard Biener wrote: > >> On Tue, 2 Jul 2019, Li Jia He wrote: >> >>> >>> >>> On 2019/7/1 3:30 PM, Richard Biener wrote: >>>> On Fri, 28 Jun 2019, Andrew Pinski wrote: >>>> >>>>> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 2019/6/27 11:48 PM, Jeff Law wrote: >>>>>>> On 6/27/19 12:11 AM, Li Jia He wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> According to the optimizable case described by Qi Feng on >>>>>>>> issue 88784, we can combine the cases into the following: >>>>>>>> >>>>>>>> 1. x > y && x != XXX_MIN --> x > y >>>>>>>> 2. x > y && x == XXX_MIN --> false >>>>>>>> 3. x <= y && x == XXX_MIN --> x == XXX_MIN >>>>>>>> >>>>>>>> 4. x < y && x != XXX_MAX --> x < y >>>>>>>> 5. x < y && x == XXX_MAX --> false >>>>>>>> 6. x >= y && x == XXX_MAX --> x == XXX_MAX >>>>>>>> >>>>>>>> 7. x > y || x != XXX_MIN --> x != XXX_MIN >>>>>>>> 8. x <= y || x != XXX_MIN --> true >>>>>>>> 9. x <= y || x == XXX_MIN --> x <= y >>>>>>>> >>>>>>>> 10. x < y || x != XXX_MAX --> x != UXXX_MAX >>>>>>>> 11. x >= y || x != XXX_MAX --> true >>>>>>>> 12. x >= y || x == XXX_MAX --> x >= y >>>>>>>> >>>>>>>> Note: XXX_MIN represents the minimum value of type x. >>>>>>>> XXX_MAX represents the maximum value of type x. >>>>>>>> >>>>>>>> Here we don't need to care about whether the operation is >>>>>>>> signed or unsigned. For example, in the below equation: >>>>>>>> >>>>>>>> 'x > y && x != XXX_MIN --> x > y' >>>>>>>> >>>>>>>> If the x type is signed int and XXX_MIN is INT_MIN, we can >>>>>>>> optimize it to 'x > y'. However, if the type of x is unsigned >>>>>>>> int and XXX_MIN is 0, we can still optimize it to 'x > y'. >>>>>>>> >>>>>>>> The regression testing for the patch was done on GCC mainline on >>>>>>>> >>>>>>>> powerpc64le-unknown-linux-gnu (Power 9 LE) >>>>>>>> >>>>>>>> with no regressions. Is it OK for trunk ? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Lijia He >>>>>>>> >>>>>>>> gcc/ChangeLog >>>>>>>> >>>>>>>> 2019-06-27 Li Jia He <helijia@linux.ibm.com> >>>>>>>> Qi Feng <ffengqi@linux.ibm.com> >>>>>>>> >>>>>>>> PR middle-end/88784 >>>>>>>> * gimple-fold.c (and_comparisons_contain_equal_operands): New >>>>>>>> function. >>>>>>>> (and_comparisons_1): Use >>>>>>>> and_comparisons_contain_equal_operands. >>>>>>>> (or_comparisons_contain_equal_operands): New function. >>>>>>>> (or_comparisons_1): Use or_comparisons_contain_equal_operands. >>>>>>> Would this be better done via match.pd? ISTM this transformation >>>>>>> would >>>>>>> be well suited for that framework. >>>>>> >>>>>> Hi, Jeff >>>>>> >>>>>> I did this because of the following test case: >>>>>> ` >>>>>> _Bool comp(unsigned x, unsigned y) >>>>>> { >>>>>> return x > y && x != 0; >>>>>> } >>>>>> ` >>>>>> The gimple file dumped on the power platform is: >>>>>> ` >>>>>> comp (unsigned int x, unsigned int y) >>>>>> { >>>>>> _Bool D.2837; >>>>>> int iftmp.0; >>>>>> >>>>>> if (x > y) goto <D.2841>; else goto <D.2839>; >>>>>> <D.2841>: >>>>>> if (x != 0) goto <D.2842>; else goto <D.2839>; >>>>>> <D.2842>: >>>>>> iftmp.0 = 1; >>>>>> goto <D.2840>; >>>>>> <D.2839>: >>>>>> iftmp.0 = 0; >>>>>> <D.2840>: >>>>>> D.2837 = (_Bool) iftmp.0; >>>>>> return D.2837; >>>>>> } >>>>>> ` >>>>>> However, the gimple file dumped on x86 is >>>>>> ` >>>>>> comp (unsigned int x, unsigned int y) >>>>>> { >>>>>> _Bool D.2837; >>>>>> >>>>>> _1 = x > y; >>>>>> _2 = x != 0; >>>>>> _3 = _1 & _2; >>>>>> _4 = (int) _3; >>>>>> D.2837 = (_Bool) _4; >>>>>> return D.2837; >>>>>> } >>>>>> ` >>>>>> >>>>>> The reason for the inconsistency between these two behaviors is param >>>>>> logical-op-non-short-circuit. If we add the pattern to the match.pd >>>>>> file, we can only optimize the situation in which the statement is in >>>>>> the same basic block (logical-op-non-short-circuit=1, x86). But for >>>>>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd >>>>>> can't handle this situation. >>>>>> >>>>>> Another reason is that I found out maybe_fold_and_comparisons and >>>>>> maybe_fold_or_comparisons are not only called by ifcombine pass but >>>>>> also by reassoc pass. Using this method can basically unify param >>>>>> logical-op-non-short-circuit=0 or 1. >>>>> >>>>> >>>>> As mentioned before ifcombine pass should be using gimple-match >>>>> instead of fold_build. Try converting ifcombine over to gimple-match >>>>> infrastructure and add these to match.pd. >>>> >>>> Yes, I mentioned that in the PR. The issue is that at the moment >>>> to combine x > y with x <= y you'd have to build GENERIC trees >>>> for both or temporary GIMPLE assign with a SSA def (and then feed >>>> that into the GENERIC or GIMPLE match.pd path). >>> >>> Hi, >>> >>> I did some experimentation using ‘temporary GIMPLE with a SSA def (and then >>> feed that into the GIMPLE match.pd path’. Could we consider the code in the >>> attachment(I did a test and the code took effect)? >> >> No, that's excessive overhead IMHO - the whole point of these functions >> originally was to avoid build2 (...) on both conditionals. If we >> now create two SSA names and two GIMPLE statements that's too much. >> >> As said it shouldn't be rocket science to teach genmatch to >> auto-generate those functions from match.pd patterns but it certainly >> is some work (less so for somebody with genmatch knowledge). >> I'll give it a poke... > > OK, it's not so easy. I guess we could lower the cost of building > SSA names / gimple stmts significantly if we allocated them on the > stack like via > > gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 * > sizeof (tree)); > memset (stmt1, 0, ...); > gimple_set_code (stmt1, GIMPLE_ASSIGN); > gimple_set_num_ops (stmt1, 3); > gimple_init_singleton (stmt1); > > gimple_stmt_iterator gsi = gsi_for_stmt (stmt1); > gimple_assign_set_rhs_with_ops (&gsi, actual operation...); > > tree lhs1 = XALLOCA (tree_ssa_name); > memset (lhs1, 0, sizeof (tree_ssa_name)); > TREE_SET_CODE (lhs1, SSA_NAME); > TREE_TYPE (lhs1) = ...; > > gimple_assing_set_lhs (stmt1, lhs1); > > it's all a bit ugly and some factoring in the generic gimple machinery > might easen this kind of hacks, but well... > > With the above you could then use > > gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */, > boolean_type_node, lhs1, lhs2); > if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges)) > ... successfully simplified into 'op' ... > > with the simplification path then extracting the simplified result > from 'op'. Note this deliberately leaves out passing a gimple_seq > as storage for more complex simplification results to match > existing behavior. > > Your patch has the GC allocation and SSA name (and namespace!) > allocation overhead and most definitely misses releasing the > SSA names you allocated. > > Note with the above hack the simplified result has to be checked > for mentions of lhs1 or lhs2 - a case we have to reject because > their definitions are transitional. > Hi, I made some changes based on the recommendations. Would you like to help me to see it again ? Sorry, it took so long time to provide the patch. Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1. The reason is that I did not found a good way to handle the optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd. Maybe I missing some important information about match.pd. 2. The gimple_resimplify2 function is not used. Since stmt1, stmt2, lhs1 and lhs2 are allocated on the stack, Is there a question with the value on the stack as the return value ? I may have misunderstood Richard's intention. Thanks, Lijia He > Richard. > >> Richard. >> >>> Thanks, >>> Lijia He >>> >>>> >>>> maybe_fold_and/or_comparisons handle two exploded binary expressions >>>> while the current match.pd entries handle at most one exploded one >>>> (the outermost then, either AND or OR). But it would be definitely >>>> doable to auto-generate maybe_fold_and/or_comparisons from match.pd >>>> patterns which is what I'd ultimatively suggest to do (in some more >>>> generalized form maybe). Either with a separate genmatch invocation >>>> or as part of the --gimple processing (not sure what is more feasible >>>> here). >>>> >>>> I told Li Jia He that I don't expect him to do this work. >>>> >>>> Note I didn't review the actual patch yet. >>>> >>>> Thanks, >>>> Richard. >>>> >>> >> >> > From 2699516000abdef60b06b203d703f80b8905cf09 Mon Sep 17 00:00:00 2001 From: Li Jia He <helijia@linux.ibm.com> Date: Mon, 15 Jul 2019 00:30:25 -0500 Subject: [PATCH 1/2] Auto-generate maybe_fold_and/or_comparisons from match.pd Hi, This patch try to auto-generate maybe_fold_and/or_comparisons from match.pd. We try to build tempory assign with SSA def and then feed that into the GIMPLE match.pd path. In order to lower the cost of building SSA names / gimple stmts significantly, we need to allocated them on the stack. This will cause the code a bit ugly. The regression testing for the patch was done on GCC mainline on powerpc64le-unknown-linux-gnu (Power 9 LE) with no regressions. Is it OK for trunk ? Thanks, Lijia He gcc/ChangeLog 2019-07-16 Li Jia He <helijia@linux.ibm.com> * gimple.h (gimple_set_code): Declare. (gimple_size): Likewise. * gimple.c (gimple_set_code): Remove static and inline restrictions. (gimple_size): Likewise. * gimple-fold.c (maybe_fold_comparisons_from_match_pd): New function. (maybe_fold_and_comparisons): Modify and_comparisons_1 invocation and call maybe_fold_comparisons_from_match_pd. (maybe_fold_or_comparisons): Modify or_comparisons_1 invocation and call maybe_fold_comparisons_from_match_pd. --- gcc/gimple-fold.c | 122 ++++++++++++++++++++++++++++++++++++++++++---- gcc/gimple.c | 4 +- gcc/gimple.h | 2 + 3 files changed, 117 insertions(+), 11 deletions(-) diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index dfb31a02078..a5c2790361f 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -5789,6 +5789,91 @@ and_comparisons_1 (enum tree_code code1, tree op1a, tree op1b, return NULL_TREE; } +/* Helper function for maybe_fold_and_comparisons and maybe_fold_or_comparisons + : try to simplify the AND/OR of the ssa variable VAR with the comparison + specified by (OP2A CODE2 OP2B) from match.pd. Return NULL_EXPR if we can't + simplify this to a single expression. As we are going to lower the cost + of building SSA names / gimple stmts significantly, we need to allocate + them ont the stack. This will cause the code to be a bit ugly. */ + +static tree +maybe_fold_comparisons_from_match_pd (enum tree_code code, enum tree_code code1, + tree op1a, tree op1b, + enum tree_code code2, tree op2a, + tree op2b) +{ + /* Allocate gimple stmt1 on the stack. */ + gimple *stmt1 = (gimple *) (XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + + 2 * sizeof (tree))); + memset (stmt1, 0, gimple_size (GIMPLE_ASSIGN) + 2 * sizeof (tree)); + gimple_set_code (stmt1, GIMPLE_ASSIGN); + gimple_set_num_ops (stmt1, 3); + gimple_init_singleton (stmt1); + + /* Allocate gimple stmt2 on the stack. */ + gimple *stmt2 = (gimple *) (XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + + 2 * sizeof (tree))); + memset (stmt2, 0, gimple_size (GIMPLE_ASSIGN) + 2 * sizeof (tree)); + gimple_set_code (stmt2, GIMPLE_ASSIGN); + gimple_set_num_ops (stmt2, 3); + gimple_init_singleton (stmt2); + + gimple_assign_set_rhs_code (stmt1, code1); + gimple_assign_set_rhs1 (stmt1, op1a); + gimple_assign_set_rhs2 (stmt1, op1b); + + gimple_assign_set_rhs_code (stmt2, code2); + gimple_assign_set_rhs1 (stmt2, op2a); + gimple_assign_set_rhs2 (stmt2, op2b); + + /* Allocate SSA names(lhs1) on the stack. */ + tree lhs1 = XALLOCA (tree_node); + memset (lhs1, 0, sizeof (tree_node)); + TREE_SET_CODE (lhs1, SSA_NAME); + TREE_TYPE (lhs1) = boolean_type_node; + + /* Allocate SSA names(lhs2) on the stack. */ + tree lhs2 = XALLOCA (tree_node); + memset (lhs2, 0, sizeof (tree_node)); + TREE_SET_CODE (lhs2, SSA_NAME); + TREE_TYPE (lhs2) = boolean_type_node; + + gimple_assign_set_lhs (stmt1, lhs1); + gimple_assign_set_lhs (stmt2, lhs2); + + /* Call the interface function of match.pd to simplify the expression. */ + tree t = gimple_simplify (code, boolean_type_node, gimple_assign_lhs (stmt1), + gimple_assign_lhs (stmt2), NULL, + follow_all_ssa_edges); + + if (t) + { + switch (TREE_CODE (t)) + { + case SSA_NAME: + { + gimple *def = SSA_NAME_DEF_STMT (t); + + if (!is_gimple_assign (def) + || TREE_CODE_CLASS (gimple_assign_rhs_code (def)) + != tcc_comparison) + return NULL_TREE; + + return fold_build2 (gimple_assign_rhs_code (def), + boolean_type_node, gimple_assign_rhs1 (def), + gimple_assign_rhs2 (def)); + } + case INTEGER_CST: + /* Fold expression to boolean_true_node or boolean_false_node. */ + return t; + default: + return NULL_TREE; + } + } + + return NULL_TREE; +} + /* Try to simplify the AND of two comparisons, specified by (OP1A CODE1 OP1B) and (OP2B CODE2 OP2B), respectively. If this can be simplified to a single expression (without requiring @@ -5800,11 +5885,21 @@ tree maybe_fold_and_comparisons (enum tree_code code1, tree op1a, tree op1b, enum tree_code code2, tree op2a, tree op2b) { - tree t = and_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b); - if (t) + if (tree t = and_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b)) return t; - else - return and_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b); + + if (tree t = and_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b)) + return t; + + if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code1, op1a, + op1b, code2, op2a, op2b)) + return t; + + if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code2, op2a, + op2b, code1, op1a, op1b)) + return t; + + return NULL_TREE; } /* Helper function for or_comparisons_1: try to simplify the OR of the @@ -6264,13 +6359,22 @@ tree maybe_fold_or_comparisons (enum tree_code code1, tree op1a, tree op1b, enum tree_code code2, tree op2a, tree op2b) { - tree t = or_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b); - if (t) + if (tree t = or_comparisons_1 (code1, op1a, op1b, code2, op2a, op2b)) + return t; + + if (tree t = or_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b)) + return t; + + if (tree t = maybe_fold_comparisons_from_match_pd (BIT_IOR_EXPR, code1, op1a, + op1b, code2, op2a, op2b)) + return t; + + if (tree t = maybe_fold_comparisons_from_match_pd (BIT_IOR_EXPR, code2, op2a, + op2b, code1, op1a, op1b)) return t; - else - return or_comparisons_1 (code2, op2a, op2b, code1, op1a, op1b); -} + return NULL_TREE; +} /* Fold STMT to a constant using VALUEIZE to valueize SSA names. diff --git a/gcc/gimple.c b/gcc/gimple.c index 63c8d5e85ae..6769bd9edec 100644 --- a/gcc/gimple.c +++ b/gcc/gimple.c @@ -101,7 +101,7 @@ const enum gimple_code gcond::code_; /* Set the code for statement G to CODE. */ -static inline void +void gimple_set_code (gimple *g, enum gimple_code code) { g->code = code; @@ -110,7 +110,7 @@ gimple_set_code (gimple *g, enum gimple_code code) /* Return the number of bytes needed to hold a GIMPLE statement with code CODE. */ -static inline size_t +size_t gimple_size (enum gimple_code code) { return gsstruct_code_size[gss_for_code (code)]; diff --git a/gcc/gimple.h b/gcc/gimple.h index 47070e7f409..a32708ad349 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -1447,6 +1447,8 @@ extern enum gimple_statement_structure_enum const gss_for_code_[]; of comminucating the profile info to the builtin expanders. */ extern gimple *currently_expanding_gimple_stmt; +void gimple_set_code (gimple *g, enum gimple_code code); +size_t gimple_size (enum gimple_code code); gimple *gimple_alloc (enum gimple_code, unsigned CXX_MEM_STAT_INFO); greturn *gimple_build_return (tree); void gimple_call_reset_alias_info (gcall *);
@Richi: PING^1 On 7/16/19 8:35 AM, Li Jia He wrote: > > > On 2019/7/2 4:51 PM, Richard Biener wrote: >> On Tue, 2 Jul 2019, Richard Biener wrote: >> >>> On Tue, 2 Jul 2019, Li Jia He wrote: >>> >>>> >>>> >>>> On 2019/7/1 3:30 PM, Richard Biener wrote: >>>>> On Fri, 28 Jun 2019, Andrew Pinski wrote: >>>>> >>>>>> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2019/6/27 11:48 PM, Jeff Law wrote: >>>>>>>> On 6/27/19 12:11 AM, Li Jia He wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> According to the optimizable case described by Qi Feng on >>>>>>>>> issue 88784, we can combine the cases into the following: >>>>>>>>> >>>>>>>>> 1. x > y && x != XXX_MIN --> x > y >>>>>>>>> 2. x > y && x == XXX_MIN --> false >>>>>>>>> 3. x <= y && x == XXX_MIN --> x == XXX_MIN >>>>>>>>> >>>>>>>>> 4. x < y && x != XXX_MAX --> x < y >>>>>>>>> 5. x < y && x == XXX_MAX --> false >>>>>>>>> 6. x >= y && x == XXX_MAX --> x == XXX_MAX >>>>>>>>> >>>>>>>>> 7. x > y || x != XXX_MIN --> x != XXX_MIN >>>>>>>>> 8. x <= y || x != XXX_MIN --> true >>>>>>>>> 9. x <= y || x == XXX_MIN --> x <= y >>>>>>>>> >>>>>>>>> 10. x < y || x != XXX_MAX --> x != UXXX_MAX >>>>>>>>> 11. x >= y || x != XXX_MAX --> true >>>>>>>>> 12. x >= y || x == XXX_MAX --> x >= y >>>>>>>>> >>>>>>>>> Note: XXX_MIN represents the minimum value of type x. >>>>>>>>> XXX_MAX represents the maximum value of type x. >>>>>>>>> >>>>>>>>> Here we don't need to care about whether the operation is >>>>>>>>> signed or unsigned. For example, in the below equation: >>>>>>>>> >>>>>>>>> 'x > y && x != XXX_MIN --> x > y' >>>>>>>>> >>>>>>>>> If the x type is signed int and XXX_MIN is INT_MIN, we can >>>>>>>>> optimize it to 'x > y'. However, if the type of x is unsigned >>>>>>>>> int and XXX_MIN is 0, we can still optimize it to 'x > y'. >>>>>>>>> >>>>>>>>> The regression testing for the patch was done on GCC mainline on >>>>>>>>> >>>>>>>>> powerpc64le-unknown-linux-gnu (Power 9 LE) >>>>>>>>> >>>>>>>>> with no regressions. Is it OK for trunk ? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Lijia He >>>>>>>>> >>>>>>>>> gcc/ChangeLog >>>>>>>>> >>>>>>>>> 2019-06-27 Li Jia He <helijia@linux.ibm.com> >>>>>>>>> Qi Feng <ffengqi@linux.ibm.com> >>>>>>>>> >>>>>>>>> PR middle-end/88784 >>>>>>>>> * gimple-fold.c (and_comparisons_contain_equal_operands): New >>>>>>>>> function. >>>>>>>>> (and_comparisons_1): Use >>>>>>>>> and_comparisons_contain_equal_operands. >>>>>>>>> (or_comparisons_contain_equal_operands): New function. >>>>>>>>> (or_comparisons_1): Use or_comparisons_contain_equal_operands. >>>>>>>> Would this be better done via match.pd? ISTM this transformation >>>>>>>> would >>>>>>>> be well suited for that framework. >>>>>>> >>>>>>> Hi, Jeff >>>>>>> >>>>>>> I did this because of the following test case: >>>>>>> ` >>>>>>> _Bool comp(unsigned x, unsigned y) >>>>>>> { >>>>>>> return x > y && x != 0; >>>>>>> } >>>>>>> ` >>>>>>> The gimple file dumped on the power platform is: >>>>>>> ` >>>>>>> comp (unsigned int x, unsigned int y) >>>>>>> { >>>>>>> _Bool D.2837; >>>>>>> int iftmp.0; >>>>>>> >>>>>>> if (x > y) goto <D.2841>; else goto <D.2839>; >>>>>>> <D.2841>: >>>>>>> if (x != 0) goto <D.2842>; else goto <D.2839>; >>>>>>> <D.2842>: >>>>>>> iftmp.0 = 1; >>>>>>> goto <D.2840>; >>>>>>> <D.2839>: >>>>>>> iftmp.0 = 0; >>>>>>> <D.2840>: >>>>>>> D.2837 = (_Bool) iftmp.0; >>>>>>> return D.2837; >>>>>>> } >>>>>>> ` >>>>>>> However, the gimple file dumped on x86 is >>>>>>> ` >>>>>>> comp (unsigned int x, unsigned int y) >>>>>>> { >>>>>>> _Bool D.2837; >>>>>>> >>>>>>> _1 = x > y; >>>>>>> _2 = x != 0; >>>>>>> _3 = _1 & _2; >>>>>>> _4 = (int) _3; >>>>>>> D.2837 = (_Bool) _4; >>>>>>> return D.2837; >>>>>>> } >>>>>>> ` >>>>>>> >>>>>>> The reason for the inconsistency between these two behaviors is param >>>>>>> logical-op-non-short-circuit. If we add the pattern to the match.pd >>>>>>> file, we can only optimize the situation in which the statement is in >>>>>>> the same basic block (logical-op-non-short-circuit=1, x86). But for >>>>>>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd >>>>>>> can't handle this situation. >>>>>>> >>>>>>> Another reason is that I found out maybe_fold_and_comparisons and >>>>>>> maybe_fold_or_comparisons are not only called by ifcombine pass but >>>>>>> also by reassoc pass. Using this method can basically unify param >>>>>>> logical-op-non-short-circuit=0 or 1. >>>>>> >>>>>> >>>>>> As mentioned before ifcombine pass should be using gimple-match >>>>>> instead of fold_build. Try converting ifcombine over to gimple-match >>>>>> infrastructure and add these to match.pd. >>>>> >>>>> Yes, I mentioned that in the PR. The issue is that at the moment >>>>> to combine x > y with x <= y you'd have to build GENERIC trees >>>>> for both or temporary GIMPLE assign with a SSA def (and then feed >>>>> that into the GENERIC or GIMPLE match.pd path). >>>> >>>> Hi, >>>> >>>> I did some experimentation using ‘temporary GIMPLE with a SSA def (and then >>>> feed that into the GIMPLE match.pd path’. Could we consider the code in the >>>> attachment(I did a test and the code took effect)? >>> >>> No, that's excessive overhead IMHO - the whole point of these functions >>> originally was to avoid build2 (...) on both conditionals. If we >>> now create two SSA names and two GIMPLE statements that's too much. >>> >>> As said it shouldn't be rocket science to teach genmatch to >>> auto-generate those functions from match.pd patterns but it certainly >>> is some work (less so for somebody with genmatch knowledge). >>> I'll give it a poke... >> >> OK, it's not so easy. I guess we could lower the cost of building >> SSA names / gimple stmts significantly if we allocated them on the >> stack like via >> >> gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 * >> sizeof (tree)); >> memset (stmt1, 0, ...); >> gimple_set_code (stmt1, GIMPLE_ASSIGN); >> gimple_set_num_ops (stmt1, 3); >> gimple_init_singleton (stmt1); >> >> gimple_stmt_iterator gsi = gsi_for_stmt (stmt1); >> gimple_assign_set_rhs_with_ops (&gsi, actual operation...); >> >> tree lhs1 = XALLOCA (tree_ssa_name); >> memset (lhs1, 0, sizeof (tree_ssa_name)); >> TREE_SET_CODE (lhs1, SSA_NAME); >> TREE_TYPE (lhs1) = ...; >> >> gimple_assing_set_lhs (stmt1, lhs1); >> >> it's all a bit ugly and some factoring in the generic gimple machinery >> might easen this kind of hacks, but well... >> >> With the above you could then use >> >> gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */, >> boolean_type_node, lhs1, lhs2); >> if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges)) >> ... successfully simplified into 'op' ... >> >> with the simplification path then extracting the simplified result >> from 'op'. Note this deliberately leaves out passing a gimple_seq >> as storage for more complex simplification results to match >> existing behavior. >> >> Your patch has the GC allocation and SSA name (and namespace!) >> allocation overhead and most definitely misses releasing the >> SSA names you allocated. >> >> Note with the above hack the simplified result has to be checked >> for mentions of lhs1 or lhs2 - a case we have to reject because >> their definitions are transitional. >> > > Hi, > > I made some changes based on the recommendations. Would you like to > help me to see it again ? Sorry, it took so long time to provide the > patch. > > Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1. > The reason is that I did not found a good way to handle the > optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd. > Maybe I missing some important information about match.pd. > 2. The gimple_resimplify2 function is not used. Since stmt1, > stmt2, lhs1 and lhs2 are allocated on the stack, Is there a > question with the value on the stack as the return value ? > I may have misunderstood Richard's intention. > > Thanks, > Lijia He > >> Richard. >> >>> Richard. >>> >>>> Thanks, >>>> Lijia He >>>> >>>>> >>>>> maybe_fold_and/or_comparisons handle two exploded binary expressions >>>>> while the current match.pd entries handle at most one exploded one >>>>> (the outermost then, either AND or OR). But it would be definitely >>>>> doable to auto-generate maybe_fold_and/or_comparisons from match.pd >>>>> patterns which is what I'd ultimatively suggest to do (in some more >>>>> generalized form maybe). Either with a separate genmatch invocation >>>>> or as part of the --gimple processing (not sure what is more feasible >>>>> here). >>>>> >>>>> I told Li Jia He that I don't expect him to do this work. >>>>> >>>>> Note I didn't review the actual patch yet. >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>> >>> >>> >>
On Tue, 16 Jul 2019, Li Jia He wrote: > Hi, > > I made some changes based on the recommendations. Would you like to > help me to see it again ? Sorry, it took so long time to provide the > patch. > > Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1. > The reason is that I did not found a good way to handle the > optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd. > Maybe I missing some important information about match.pd. > 2. The gimple_resimplify2 function is not used. Since stmt1, > stmt2, lhs1 and lhs2 are allocated on the stack, Is there a > question with the value on the stack as the return value ? > I may have misunderstood Richard's intention. Sorry for the delay in reviewing. Rather than exporting gimple_set_code and gimple_size I'd split out a void gimple_init (gimple *, enum gimple_code code, unsigned nops); from gimple_alloc (changing that to GC allocate not cleared memory) doing all of the actual initialization. Then the allocation would go like gimple *stmt1 = (gimple *)XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN, 3)); gimple_init (stmt1, GIMPLE_ASSIGN, 3); with an overload for gimple_size to account for # of ops. + /* Allocate SSA names(lhs1) on the stack. */ + tree lhs1 = XALLOCA (tree_node); you can use (tree) XALLOCA (tree_ssa_name) here + /* Call the interface function of match.pd to simplify the expression. */ + tree t = gimple_simplify (code, boolean_type_node, gimple_assign_lhs (stmt1), + gimple_assign_lhs (stmt2), NULL, + follow_all_ssa_edges); + + if (t) As I told Martin offline the appropriate function to use is You'd do gimple_match_op op (gimple_match_cond::UNCOND, code, boolean_type_node, gimple_assign_lhs (stmt1), gimple_assign_lhs (stmt2)); if (op->resimplify (NULL, follow_all_ssa_edges)) { if (gimple_simplified_result_is_gimple_val (res_op)) .. got a constant or SSA name .. else if (res_op->code.is_tree_code () && TREE_CODE_CLASS ((tree_code)res_op->code)) == tcc_comparison) ... got a comparison res_op->op[0] res_op->code res_op->op[1] ... so you get the outermost expression back decomposed. Otherwise with you passing NULL as the gimple_seq you'll miss quite some simplification cases. You also have to watch out for the result containing the LHS of one of your temporary stmts. + if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code1, op1a, + op1b, code2, op2a, op2b)) + return t; + + if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code2, op2a, + op2b, code1, op1a, op1b)) + return t; with match.pd rules you shouldn't need to call this twice, once with swapped operands. Otherwise this first patch looks like what I'd have done and we can build upon it. Not sure if you or Martin wants to improve it according to my comments. Thanks, Richard.
On Tue, 16 Jul 2019, Li Jia He wrote: > Hi, > > I made some changes based on the recommendations. Would you like to > help me to see it again ? Sorry, it took so long time to provide the > patch. > > Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1. > The reason is that I did not found a good way to handle the > optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd. > Maybe I missing some important information about match.pd. > 2. The gimple_resimplify2 function is not used. Since stmt1, > stmt2, lhs1 and lhs2 are allocated on the stack, Is there a > question with the value on the stack as the return value ? > I may have misunderstood Richard's intention. And now for the match.pd patch. +/* x > y && x != XXX_MIN --> x > y */ +(for and (truth_and bit_and) + (simplify + (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2))))) + @3))) + +/* x > y && x == XXX_MIN --> false */ +(for and (truth_and bit_and) + (simplify + (and:c (gt:c @0 @1) (eq @0 INTEGER_CST@2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) + && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2))))) + { boolean_false_node; }))) you could merge those two via (for eqne (eq ne) (for and (.... (simplify (and:c (gt:c @0 @1) (eqne @0 INTEGER_CST@2)) (if (...) (switch (if (eqne == NE_EXPR) @3) (if (eqne == EQ_EXPR) { constant_boolean_node (false, type); })))) notice using constant_boolean_node (false, type); instead of boolean_false_node. I suspect more unification is possible. Also you could do (match min_value INTEGER_CST (if (INTEGRAL_TYPE_P (type) && wi::eq_p (wi::to_wide (t), wi::min_value (type))))) and then write (simplify (and:c (gt:c @0 @1) (eq @0 min_value)) (... Your (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) is redundant, it's enough to check either @0 or @1 given they have to be compatible for the gt operation. Note you probably want to use (and:c (gt:c @0 @1) (eq @@0 min_value)) and verify that types_match (@1, @0) because when @0 are a constant (and (eq @0 min_value) is not folded which can happen) then they might have different types and thus you could have (SHORT_MAX > intvar) && (SHORT_MAX == SHORT_MAX) That said, the patterns can be quite a bit simplified I think. Richard.
On 9/5/19 3:01 PM, Richard Biener wrote: > Not sure if you or Martin wants to improve it according to my > comments. Yes please. I'm working on that based on the review you provided. Thanks, Martin
On 9/5/19 3:17 PM, Richard Biener wrote:
> That said, the patterns can be quite a bit simplified I think.
I will take care of it as well.
Martin
On Thu, 5 Sep 2019, Martin Liška wrote: > On 9/5/19 3:01 PM, Richard Biener wrote: > > Not sure if you or Martin wants to improve it according to my > > comments. > > Yes please. I'm working on that based on the review you provided. Oh, and it just occured to me since we're doing single_use checks on SSA names in match.pd that you need to initialize the SSA_NAME_IMM_USE_NODE () as well - see make_ssa_name_fn, probably makes sense to split out a short helper for that. Richard.
On 9/6/19 10:01 AM, Richard Biener wrote: > On Thu, 5 Sep 2019, Martin Liška wrote: > >> On 9/5/19 3:01 PM, Richard Biener wrote: >>> Not sure if you or Martin wants to improve it according to my >>> comments. >> >> Yes please. I'm working on that based on the review you provided. > > Oh, and it just occured to me since we're doing single_use checks > on SSA names in match.pd that you need to initialize the > SSA_NAME_IMM_USE_NODE () as well - see make_ssa_name_fn, probably > makes sense to split out a short helper for that. Yes, I already hit the ICE :) Martin > > Richard. >
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index dfb31a02078..6d2472d2fcb 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -5535,6 +5535,62 @@ and_var_with_comparison_1 (gimple *stmt, return NULL_TREE; } +/* Try to simplify the AND of two comparisons defined by + (OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively. + This optimization needs to satisfy op1a equal to op2a + or op1b equal to op2a. If this can be done without + constructing an intermediate value, return the resulting + tree; otherwise NULL_TREE is returned. */ + +static tree +and_comparisons_contain_equal_operands (enum tree_code code1, tree op1a, + tree op1b, enum tree_code code2, + tree op2a, tree op2b) +{ + /* Transform ((Y1 CODE1 X) AND (X CODE2 Y2)) to + ((X CODE1' Y1) AND (X CODE2 Y2)). */ + if (!operand_equal_p (op1a, op1b, 0) && operand_equal_p (op1b, op2a, 0) + && !operand_equal_p (op2a, op2b, 0)) + return and_comparisons_contain_equal_operands (swap_tree_comparison (code1), + op1b, op1a, code2, op2a, + op2b); + + tree op1a_type = TREE_TYPE (op1a); + tree op1b_type = TREE_TYPE (op1b); + tree op2b_type = TREE_TYPE (op2b); + + if (INTEGRAL_TYPE_P (op1a_type) && INTEGRAL_TYPE_P (op1b_type) + && operand_equal_p (op1a, op2a, 0) && TREE_CODE (op2b) == INTEGER_CST) + { + if (wi::eq_p (wi::to_wide (op2b), wi::min_value (op2b_type))) + { + /* x > y && x != XXX_MIN --> x > y */ + if (code1 == GT_EXPR && code2 == NE_EXPR) + return fold_build2 (code1, boolean_type_node, op1a, op1b); + /* x > y && x == XXX_MIN --> false */ + if (code1 == GT_EXPR && code2 == EQ_EXPR) + return boolean_false_node; + /* x <= y && x == XXX_MIN --> x == XXX_MIN */ + if (code1 == LE_EXPR && code2 == EQ_EXPR) + return fold_build2 (code2, boolean_type_node, op2a, op2b); + } + else if (wi::eq_p (wi::to_wide (op2b), wi::max_value (op2b_type))) + { + /* x < y && x != XXX_MAX --> x < y */ + if (code1 == LT_EXPR && code2 == NE_EXPR) + return fold_build2 (code1, boolean_type_node, op1a, op1b); + /* x < y && x == XXX_MAX --> false */ + if (code1 == LT_EXPR && code2 == EQ_EXPR) + return boolean_false_node; + /* x >= y && x == XXX_MAX --> x == XXX_MAX */ + if (code1 == GE_EXPR && code2 == EQ_EXPR) + return fold_build2 (code2, boolean_type_node, op2a, op2b); + } + } + + return NULL_TREE; +} + /* Try to simplify the AND of two comparisons defined by (OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively. If this can be done without constructing an intermediate value, @@ -5546,6 +5602,12 @@ static tree and_comparisons_1 (enum tree_code code1, tree op1a, tree op1b, enum tree_code code2, tree op2a, tree op2b) { + /* Try to optimize ((x CODE1 y1) AND (x CODE2 y2)) + and ((y1 CODE1 x) AND (x CODE2 y2)). */ + if (tree t = and_comparisons_contain_equal_operands (code1, op1a, op1b, code2, + op2a, op2b)) + return t; + tree truth_type = truth_type_for (TREE_TYPE (op1a)); /* First check for ((x CODE1 y) AND (x CODE2 y)). */ @@ -5999,6 +6061,62 @@ or_var_with_comparison_1 (gimple *stmt, return NULL_TREE; } +/* Try to simplify the OR of two comparisons defined by + (OP1A CODE1 OP1B) or (OP2A CODE2 OP2B), respectively. + This optimization needs to satisfy op1a equal to op2a + or op1b equal to op2a. If this can be done without + constructing an intermediate value, return the resulting + tree; otherwise NULL_TREE is returned. */ + +static tree +or_comparisons_contain_equal_operands (enum tree_code code1, tree op1a, + tree op1b, enum tree_code code2, + tree op2a, tree op2b) +{ + /* Transform ((Y1 CODE1 X) OR (X CODE2 Y2)) to + ((X CODE1' Y1) OR (X CODE2 Y2)). */ + if (!operand_equal_p (op1a, op1b, 0) && operand_equal_p (op1b, op2a, 0) + && !operand_equal_p (op2a, op2b, 0)) + return or_comparisons_contain_equal_operands (swap_tree_comparison (code1), + op1b, op1a, code2, op2a, + op2b); + + tree op1a_type = TREE_TYPE (op1a); + tree op1b_type = TREE_TYPE (op1b); + tree op2b_type = TREE_TYPE (op2b); + + if (INTEGRAL_TYPE_P (op1a_type) && INTEGRAL_TYPE_P (op1b_type) + && operand_equal_p (op1a, op2a, 0) && TREE_CODE (op2b) == INTEGER_CST) + { + if (wi::eq_p (wi::to_wide (op2b), wi::min_value (op2b_type))) + { + /* x > y || x != XXX_MIN --> x != XXX_MIN */ + if (code1 == GT_EXPR && code2 == NE_EXPR) + return fold_build2 (code2, boolean_type_node, op2a, op2b); + /* x <= y || x != XXX_MIN --> true */ + if (code1 == LE_EXPR && code2 == NE_EXPR) + return boolean_true_node; + /* x <= y || x == XXX_MIN --> x <= y */ + if (code1 == LE_EXPR && code2 == EQ_EXPR) + return fold_build2 (code1, boolean_type_node, op1a, op1b); + } + else if (wi::eq_p (wi::to_wide (op2b), wi::max_value (op2b_type))) + { + /* x < y || x != XXX_MAX --> x != XXX_MAX */ + if (code1 == LT_EXPR && code2 == NE_EXPR) + return fold_build2 (code2, boolean_type_node, op2a, op2b); + /* x >= y || x != XXX_MAX --> true */ + if (code1 == GE_EXPR && code2 == NE_EXPR) + return boolean_true_node; + /* x >= y || x == XXX_MAX --> x >= y */ + if (code1 == GE_EXPR && code2 == EQ_EXPR) + return fold_build2 (code1, boolean_type_node, op1a, op1b); + } + } + + return NULL_TREE; +} + /* Try to simplify the OR of two comparisons defined by (OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively. If this can be done without constructing an intermediate value, @@ -6010,6 +6128,12 @@ static tree or_comparisons_1 (enum tree_code code1, tree op1a, tree op1b, enum tree_code code2, tree op2a, tree op2b) { + /* Try to optimize ((x CODE1 y1) OR (x CODE2 y2)) + and ((y1 CODE1 x) OR (x CODE2 y2)). */ + if (tree t = or_comparisons_contain_equal_operands (code1, op1a, op1b, code2, + op2a, op2b)) + return t; + tree truth_type = truth_type_for (TREE_TYPE (op1a)); /* First check for ((x CODE1 y) OR (x CODE2 y)). */ diff --git a/gcc/testsuite/gcc.dg/pr88784-1.c b/gcc/testsuite/gcc.dg/pr88784-1.c new file mode 100644 index 00000000000..067d40d0739 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-1.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */ + +#include <limits.h> + +_Bool and1(unsigned x, unsigned y) +{ + /* x > y && x != 0 --> x > y */ + return x > y && x != 0; +} + +_Bool and2(unsigned x, unsigned y) +{ + /* x < y && x != UINT_MAX --> x < y */ + return x < y && x != UINT_MAX; +} + +_Bool and3(signed x, signed y) +{ + /* x > y && x != INT_MIN --> x > y */ + return x > y && x != INT_MIN; +} + +_Bool and4(signed x, signed y) +{ + /* x < y && x != INT_MAX --> x < y */ + return x < y && x != INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-10.c b/gcc/testsuite/gcc.dg/pr88784-10.c new file mode 100644 index 00000000000..fa185d680c2 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-10.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */ + +#include <limits.h> + +_Bool or1(unsigned x, unsigned y) +{ + /* x <= y || x != 0 --> true */ + return x <= y || x != 0; +} + +_Bool or2(unsigned x, unsigned y) +{ + /* x >= y || x != UINT_MAX --> true */ + return x >= y || x != UINT_MAX; +} + +_Bool or3(signed x, signed y) +{ + /* x <= y || x != INT_MIN --> true */ + return x <= y || x != INT_MIN; +} + +_Bool or4(signed x, signed y) +{ + /* x >= y || x != INT_MAX --> true */ + return x >= y || x != INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " != " "dce3" } } */ +/* { dg-final { scan-tree-dump-not " <= " "dce3" } } */ +/* { dg-final { scan-tree-dump-not " >= " "dce3" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-11.c b/gcc/testsuite/gcc.dg/pr88784-11.c new file mode 100644 index 00000000000..4465910efbb --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-11.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */ + +#include <limits.h> + +_Bool or1(unsigned x, unsigned y) +{ + /* x <= y || x == 0 --> x <= y */ + return x <= y || x == 0; +} + +_Bool or2(unsigned x, unsigned y) +{ + /* x >= y || x == UINT_MAX --> x >= y */ + return x >= y || x == UINT_MAX; +} + +_Bool or3(signed x, signed y) +{ + /* x <= y || x == INT_MIN --> x <= y */ + return x <= y || x == INT_MIN; +} + +_Bool or4(signed x, signed y) +{ + /* x >= y || x == INT_MAX --> x >= y */ + return x >= y || x == INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " == " "ifcombine" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-12.c b/gcc/testsuite/gcc.dg/pr88784-12.c new file mode 100644 index 00000000000..477bba07821 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-12.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */ + +#include <limits.h> + +_Bool or1(unsigned x, unsigned y) +{ + /* x <= y || x == 0 --> x <= y */ + return x <= y || x == 0; +} + +_Bool or2(unsigned x, unsigned y) +{ + /* x >= y || x == UINT_MAX --> x >= y */ + return x >= y || x == UINT_MAX; +} + +_Bool or3(signed x, signed y) +{ + /* x <= y || x == INT_MIN --> x <= y */ + return x <= y || x == INT_MIN; +} + +_Bool or4(signed x, signed y) +{ + /* x >= y || x == INT_MAX --> x >= y */ + return x >= y || x == INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " == " "dce3" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-2.c b/gcc/testsuite/gcc.dg/pr88784-2.c new file mode 100644 index 00000000000..265ddc7ceeb --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-2.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */ + +#include <limits.h> + +_Bool and1(unsigned x, unsigned y) +{ + /* x > y && x != 0 --> x > y */ + return x > y && x != 0; +} + +_Bool and2(unsigned x, unsigned y) +{ + /* x < y && x != UINT_MAX --> x < y */ + return x < y && x != UINT_MAX; +} + +_Bool and3(signed x, signed y) +{ + /* x > y && x != INT_MIN --> x > y */ + return x > y && x != INT_MIN; +} + +_Bool and4(signed x, signed y) +{ + /* x < y && x != INT_MAX --> x < y */ + return x < y && x != INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " != " "dce3" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-3.c b/gcc/testsuite/gcc.dg/pr88784-3.c new file mode 100644 index 00000000000..be2ce315e60 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-3.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */ + +#include <limits.h> + +_Bool and1(unsigned x, unsigned y) +{ + /* x > y && x == 0 --> false */ + return x > y && x == 0; +} + +_Bool and2(unsigned x, unsigned y) +{ + /* x < y && x == UINT_MAX --> false */ + return x < y && x == UINT_MAX; +} + +_Bool and3(signed x, signed y) +{ + /* x > y && x == INT_MIN --> false */ + return x > y && x == INT_MIN; +} + +_Bool and4(signed x, signed y) +{ + /* x < y && x == INT_MAX --> false */ + return x < y && x == INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " == " "ifcombine" } } */ +/* { dg-final { scan-tree-dump-not " > " "ifcombine" } } */ +/* { dg-final { scan-tree-dump-not " < " "ifcombine" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-4.c b/gcc/testsuite/gcc.dg/pr88784-4.c new file mode 100644 index 00000000000..b927e712464 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-4.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */ + +#include <limits.h> + +_Bool and1(unsigned x, unsigned y) +{ + /* x > y && x == 0 --> false */ + return x > y && x == 0; +} + +_Bool and2(unsigned x, unsigned y) +{ + /* x < y && x == UINT_MAX --> false */ + return x < y && x == UINT_MAX; +} + +_Bool and3(signed x, signed y) +{ + /* x > y && x == INT_MIN --> false */ + return x > y && x == INT_MIN; +} + +_Bool and4(signed x, signed y) +{ + /* x < y && x == INT_MAX --> false */ + return x < y && x == INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " == " "dce3" } } */ +/* { dg-final { scan-tree-dump-not " > " "dce3" } } */ +/* { dg-final { scan-tree-dump-not " < " "dce3" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-5.c b/gcc/testsuite/gcc.dg/pr88784-5.c new file mode 100644 index 00000000000..c6a349d7c75 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-5.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */ + +#include <limits.h> + +_Bool and1(unsigned x, unsigned y) +{ + /* x <= y && x == 0 --> x == 0 */ + return x <= y && x == 0; +} + +_Bool and2(unsigned x, unsigned y) +{ + /* x >= y && x == UINT_MAX --> x == UINT_MAX */ + return x >= y && x == UINT_MAX; +} + +_Bool and3(signed x, signed y) +{ + /* x <= y && x == INT_MIN --> x == INT_MIN */ + return x <= y && x == INT_MIN; +} + +_Bool and4(signed x, signed y) +{ + /* x >= y && x == INT_MAX --> x == INT_MAX */ + return x >= y && x == INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " <= " "ifcombine" } } */ +/* { dg-final { scan-tree-dump-not " >= " "ifcombine" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-6.c b/gcc/testsuite/gcc.dg/pr88784-6.c new file mode 100644 index 00000000000..5b5d2221bf0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-6.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */ + +#include <limits.h> + +_Bool and1(unsigned x, unsigned y) +{ + /* x <= y && x == 0 --> x == 0 */ + return x <= y && x == 0; +} + +_Bool and2(unsigned x, unsigned y) +{ + /* x >= y && x == UINT_MAX --> x == UINT_MAX */ + return x >= y && x == UINT_MAX; +} + +_Bool and3(signed x, signed y) +{ + /* x <= y && x == INT_MIN --> x == INT_MIN */ + return x <= y && x == INT_MIN; +} + +_Bool and4(signed x, signed y) +{ + /* x >= y && x == INT_MAX --> x == INT_MAX */ + return x >= y && x == INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " <= " "dce3" } } */ +/* { dg-final { scan-tree-dump-not " >= " "dce3" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-7.c b/gcc/testsuite/gcc.dg/pr88784-7.c new file mode 100644 index 00000000000..937d2d26593 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-7.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */ + +#include <limits.h> + +_Bool or1(unsigned x, unsigned y) +{ + /* x > y || x != 0 --> x != 0 */ + return x > y || x != 0; +} + +_Bool or2(unsigned x, unsigned y) +{ + /* x < y || x != UINT_MAX --> x != UINT_MAX */ + return x < y || x != UINT_MAX; +} + +_Bool or3(signed x, signed y) +{ + /* x > y || x != INT_MIN --> x != INT_MIN */ + return x > y || x != INT_MIN; +} + +_Bool or4(signed x, signed y) +{ + /* x < y || x != INT_MAX --> x != INT_MAX */ + return x < y || x != INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " > " "ifcombine" } } */ +/* { dg-final { scan-tree-dump-not " < " "ifcombine" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-8.c b/gcc/testsuite/gcc.dg/pr88784-8.c new file mode 100644 index 00000000000..7f5c845eb27 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-8.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */ + +#include <limits.h> + +_Bool or1(unsigned x, unsigned y) +{ + /* x > y || x != 0 --> x != 0 */ + return x > y || x != 0; +} + +_Bool or2(unsigned x, unsigned y) +{ + /* x < y || x != UINT_MAX --> x != UINT_MAX */ + return x < y || x != UINT_MAX; +} + +_Bool or3(signed x, signed y) +{ + /* x > y || x != INT_MIN --> x != INT_MIN */ + return x > y || x != INT_MIN; +} + +_Bool or4(signed x, signed y) +{ + /* x < y || x != INT_MAX --> x != INT_MAX */ + return x < y || x != INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " > " "dce3" } } */ +/* { dg-final { scan-tree-dump-not " < " "dce3" } } */ diff --git a/gcc/testsuite/gcc.dg/pr88784-9.c b/gcc/testsuite/gcc.dg/pr88784-9.c new file mode 100644 index 00000000000..94f62d94ede --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88784-9.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */ + +#include <limits.h> + +_Bool or1(unsigned x, unsigned y) +{ + /* x <= y || x != 0 --> true */ + return x <= y || x != 0; +} + +_Bool or2(unsigned x, unsigned y) +{ + /* x >= y || x != UINT_MAX --> true */ + return x >= y || x != UINT_MAX; +} + +_Bool or3(signed x, signed y) +{ + /* x <= y || x != INT_MIN --> true */ + return x <= y || x != INT_MIN; +} + +_Bool or4(signed x, signed y) +{ + /* x >= y || x != INT_MAX --> true */ + return x >= y || x != INT_MAX; +} + +/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */ +/* { dg-final { scan-tree-dump-not " <= " "ifcombine" } } */ +/* { dg-final { scan-tree-dump-not " >= " "ifcombine" } } */