Message ID | CAEwic4a-PGR=+h--OT4Saz+XuW5drEePqQq8Y-4WN=F1L4A1rg@mail.gmail.com |
---|---|
State | New |
Headers | show |
Hi, On Tue, 2 Aug 2011, Kai Tietz wrote: > this patch improves the ability of reassociation pass to simplifiy > more complex bitwise-binary > operations with comparisons. We break-up for this patch statements > like (X | Y) != 0 to X != 0 | Y != 0, > and (X | Y) == 0 to expanded X == 0 & Y == 0. > Additionally we expand logical-not expressions like ~(A & B) -> ~A | > ~B, ~(A & B) -> ~A | ~B, and > ~(A ^ B) -> A ^ ~B. These expansion are just temporary for this pass > and getting later by fold > reversed again back to their original form. Implement all of this in the normal reassoc machinery that already exists. Don't implement your own walker (which btw is superlinear because you recurse into both operands). If no simplifications were possible you have to fold back the NOTs into the shorter sequences again, which conveniently reassoc already can do for negates and PLUS/MINUS chains. Hence extend the existing support for arithmetic operations to logical operations. Ciao, Michael.
2011/8/3 Michael Matz <matz@suse.de>: > Hi, > > On Tue, 2 Aug 2011, Kai Tietz wrote: > >> this patch improves the ability of reassociation pass to simplifiy >> more complex bitwise-binary >> operations with comparisons. We break-up for this patch statements >> like (X | Y) != 0 to X != 0 | Y != 0, >> and (X | Y) == 0 to expanded X == 0 & Y == 0. >> Additionally we expand logical-not expressions like ~(A & B) -> ~A | >> ~B, ~(A & B) -> ~A | ~B, and >> ~(A ^ B) -> A ^ ~B. These expansion are just temporary for this pass >> and getting later by fold >> reversed again back to their original form. > > Implement all of this in the normal reassoc machinery that already exists. > Don't implement your own walker (which btw is superlinear because you > recurse into both operands). If no simplifications were possible you have > to fold back the NOTs into the shorter sequences again, which conveniently > reassoc already can do for negates and PLUS/MINUS chains. > > Hence extend the existing support for arithmetic operations to logical > operations. > > > Ciao, > Michael. What you mean by existing machinery for negate expression here? This machinery doen't work in this case and additionally doesn't provide the opportunity to do a complete reassociation rewrite of bitwise-expression-chains. Eg: the case (~(a | c) & (b & ~d)) would be expanded (by code in patch) to ~a & ~c & b & ~d. This intermediate result is good to inspect doubles, or inverted optimizations. On rebuilding of tree the result gets transformed (or should) to ~(a | c | d) & b. This opportunity is just present because we unwinded the intial tree. Classical folding pass isn't able to actual detect or do that on gimpled-trees. For comparisons it is somewhat the same, just that bitwise-binary-chain is then for sure boolean-typed. So as example: (a == 1 & b != 0) & ~(a <= 1 | b == 0) would be transformed by the code in patch to a == 1 & b != 0 & a > 1 & b != 0. So the final tree is able to transform this into a >=1 & b != 0. Again without expansion and flattening of bitwise tree, we aren't able to detect and combine that. I thought about using the same mechanism as for negate, but it doesn't work and has major weaknesses for bitwise-binary operations. The negate logic is here a predicate, but the unwinding and flattening of bitwise trees isn't a predicate. Regards, Kai
On Wed, Aug 3, 2011 at 3:32 PM, Kai Tietz <ktietz70@googlemail.com> wrote: > 2011/8/3 Michael Matz <matz@suse.de>: >> Hi, >> >> On Tue, 2 Aug 2011, Kai Tietz wrote: >> >>> this patch improves the ability of reassociation pass to simplifiy >>> more complex bitwise-binary >>> operations with comparisons. We break-up for this patch statements >>> like (X | Y) != 0 to X != 0 | Y != 0, >>> and (X | Y) == 0 to expanded X == 0 & Y == 0. >>> Additionally we expand logical-not expressions like ~(A & B) -> ~A | >>> ~B, ~(A & B) -> ~A | ~B, and >>> ~(A ^ B) -> A ^ ~B. These expansion are just temporary for this pass >>> and getting later by fold >>> reversed again back to their original form. >> >> Implement all of this in the normal reassoc machinery that already exists. >> Don't implement your own walker (which btw is superlinear because you >> recurse into both operands). If no simplifications were possible you have >> to fold back the NOTs into the shorter sequences again, which conveniently >> reassoc already can do for negates and PLUS/MINUS chains. >> >> Hence extend the existing support for arithmetic operations to logical >> operations. >> >> >> Ciao, >> Michael. > > What you mean by existing machinery for negate expression here? This > machinery doen't work in this case and additionally doesn't provide > the opportunity to do a complete reassociation rewrite of > bitwise-expression-chains. > > Eg: the case (~(a | c) & (b & ~d)) would be expanded (by code in > patch) to ~a & ~c & b & ~d. > This intermediate result is good to inspect doubles, or inverted optimizations. > On rebuilding of tree the result gets transformed (or should) to ~(a | > c | d) & b. It depends on the context whether a conjunctive or a disjunctive normal form is what you want. As you are mixing two operation kinds reassoc isn't the perfect place to deal with this (yet). You don't seem to stop when single-use chains end (which is where reassoc will give up) and even visit stmts multiple times that way. You need to at least do this "unfolding" in a lot more controlled manner. Richard. > This opportunity is just present because we unwinded the intial tree. > Classical folding pass isn't able to actual detect or do that on > gimpled-trees. > > For comparisons it is somewhat the same, just that > bitwise-binary-chain is then for sure boolean-typed. So as example: > (a == 1 & b != 0) & ~(a <= 1 | b == 0) would be transformed by the > code in patch to a == 1 & b != 0 & a > 1 & b != 0. So the final tree > is able to transform this into a >=1 & b != 0. Again without > expansion and flattening of bitwise tree, we aren't able to detect and > combine that. > > I thought about using the same mechanism as for negate, but it doesn't > work and has major weaknesses for bitwise-binary operations. The > negate logic is here a predicate, but the unwinding and flattening of > bitwise trees isn't a predicate. > > Regards, > Kai >
Hi, On Wed, 3 Aug 2011, Kai Tietz wrote: > > Implement all of this in the normal reassoc machinery that already > > exists. Don't implement your own walker (which btw is superlinear > > because you recurse into both operands). If no simplifications were > > possible you have to fold back the NOTs into the shorter sequences > > again, which conveniently reassoc already can do for negates and > > PLUS/MINUS chains. > > > > Hence extend the existing support for arithmetic operations to logical > > operations. > > What you mean by existing machinery for negate expression here? tree-reassoc can handle chains of NEGATE/PLUS/MINUS/MULT/DIV expressions, including undistribution (A*X + B*X -> (A+B)*X). It does have some deficiencies with mixed chains (e.g. mixture of PLUS and MULT expressions). Now, in the usual mathematical parlance we have these correspondences: NOT == NEGATE AND == MULT IOR == PLUS hence, if you extend the existing reassoc machinery to first deal nicely with mixed chains and then handle NOT/AND/IOR like their arithmetic counterparts you will have improved reassoc for arithmetic operations _and_ extended it to also handle logical operation chains. Without hacking a completely different way of walking and collection operands for logicals in parallel to the one for arithmetics. > This machinery doen't work in this case That's why you have to extend it. > Eg: the case (~(a | c) & (b & ~d)) would be expanded (by code in > patch) to ~a & ~c & b & ~d. That's what I mean with better handling of mixed chains. -(a+b) is already (sometimes) rewritten into -a + -b (see negate_value). That's just slightly different from rewriting ~(a|b) into ~a & ~b. > This intermediate result is good to inspect doubles, or inverted > optimizations. On rebuilding of tree the result gets transformed (or > should) to ~(a | c | d) & b. And that would be done by undistribute_ops_list, if it were properly extended. > This opportunity is just present because we unwinded the intial tree. > Classical folding pass isn't able to actual detect or do that on > gimpled-trees. That's why you do this whole excercise in tree-reassoc, where it indeed belongs. My point is that you should extend the existing means to support your operations, instead of implementing a similar-but-different approach in parallel. > I thought about using the same mechanism as for negate, but it doesn't > work and has major weaknesses for bitwise-binary operations. The negate > logic is here a predicate, but the unwinding and flattening of bitwise > trees isn't a predicate. What you descrive as "flattening" is actually building a normal form. reassoc uses a vector of operand extries to hold such normal form (as said, not yet dealing with mixed chains). linearize_expr_tree does the flattening, and you would have to extend it (and its associated helper routines) to handle mixed chains and then you logical operations. Once a linearized array of operands (possibly of level two to handle mixed chains) exists undistribute_ops_list and optimize_ops_list will optimize those operands again, and rewrite_expr_tree will produce statements implementing the so optimized operand list. You will have to extend that too for your logical operations (and possibly compares). Ciao, Michael.
2011/8/3 Michael Matz <matz@suse.de>: > Hi, > > On Wed, 3 Aug 2011, Kai Tietz wrote: > >> > Implement all of this in the normal reassoc machinery that already >> > exists. Don't implement your own walker (which btw is superlinear >> > because you recurse into both operands). If no simplifications were >> > possible you have to fold back the NOTs into the shorter sequences >> > again, which conveniently reassoc already can do for negates and >> > PLUS/MINUS chains. >> > >> > Hence extend the existing support for arithmetic operations to logical >> > operations. >> >> What you mean by existing machinery for negate expression here? > > tree-reassoc can handle chains of NEGATE/PLUS/MINUS/MULT/DIV expressions, > including undistribution (A*X + B*X -> (A+B)*X). It does have some > deficiencies with mixed chains (e.g. mixture of PLUS and MULT > expressions). > > Now, in the usual mathematical parlance we have these correspondences: > NOT == NEGATE > AND == MULT > IOR == PLUS > hence, if you extend the existing reassoc machinery to first deal nicely > with mixed chains and then handle NOT/AND/IOR like their arithmetic > counterparts you will have improved reassoc for arithmetic operations > _and_ extended it to also handle logical operation chains. Without > hacking a completely different way of walking and collection operands for > logicals in parallel to the one for arithmetics. > >> This machinery doen't work in this case > > That's why you have to extend it. The issue about this machinery is that it assumes that the statement itself gets transformed, but for normalized form of invert of bitwise operations it is essential to perform invert operation on the operands, too. >> Eg: the case (~(a | c) & (b & ~d)) would be expanded (by code in >> patch) to ~a & ~c & b & ~d. > > That's what I mean with better handling of mixed chains. -(a+b) is > already (sometimes) rewritten into -a + -b (see negate_value). That's > just slightly different from rewriting ~(a|b) into ~a & ~b. Yes, it affects just one operand. And its weakness is that a (which might be a more complex statement) doesn't get folded for the negate operation here. Eg. a = 1 - d; c = - (a + b); should become d - b - 1 and optimized form of (d - (b + 1)). But AFAIR the code the thing is different what break_up_substract does. It modifies (a - b) -> a + (-b), which is for sure worth to simplify and ease arithmetic plus optimization. But doesn't match the point you are talking about. >> This intermediate result is good to inspect doubles, or inverted >> optimizations. On rebuilding of tree the result gets transformed (or >> should) to ~(a | c | d) & b. > > And that would be done by undistribute_ops_list, if it were properly > extended. > >> This opportunity is just present because we unwinded the intial tree. >> Classical folding pass isn't able to actual detect or do that on >> gimpled-trees. > > That's why you do this whole excercise in tree-reassoc, where it indeed > belongs. My point is that you should extend the existing means to support > your operations, instead of implementing a similar-but-different approach > in parallel. As I tried to pointed out before is the approach in reassoc only well working for single statements (not to talk here about non-single-used statements, which seems to be something my patch needs to handle better - a Richard said, it might walks statements so multiple times). But for an invert the operands and the expression-codes are changing, too. As later reassoc pass just walk the tree and combines simliar operation-codes in its vector by rank, and later optimization just happens within this range, it is essential to feed this vector with proper (normalized) expression-codes. >> I thought about using the same mechanism as for negate, but it doesn't >> work and has major weaknesses for bitwise-binary operations. The negate >> logic is here a predicate, but the unwinding and flattening of bitwise >> trees isn't a predicate. > > What you descrive as "flattening" is actually building a normal form. > reassoc uses a vector of operand extries to hold such normal form (as > said, not yet dealing with mixed chains). linearize_expr_tree does the > flattening, and you would have to extend it (and its associated helper > routines) to handle mixed chains and then you logical operations. Well, it does this in a very limited approach and is strictly bound to an underlying statement. > Once a linearized array of operands (possibly of level two to handle mixed > chains) exists undistribute_ops_list and optimize_ops_list will optimize > those operands again, and rewrite_expr_tree will produce statements > implementing the so optimized operand list. You will have to extend that > too for your logical operations (and possibly compares). > > > Ciao, > Michael. Actually it might be also a way to rewrite reassociation pass in a way that it just operates on normalized predicated trees made out of the orignal tree. Operates on it, and then write it out in a denormalized form. Such a tree would need information flags for inverted, negated, code, operands (maybe also altered), Which means that such a tree gets easily more complex. Nevertheless the creation of final result out of a normalized predicated tree would mean a re-write of tree anyway - even if there might be in some rare cases the opportunity to reuse present statements and avoid their rewriting. Regards, Kai
Hi, On Wed, 3 Aug 2011, Kai Tietz wrote: > >> This machinery doen't work in this case > > > > That's why you have to extend it. > > The issue about this machinery is that it assumes that the statement > itself gets transformed, but for normalized form of invert of bitwise > operations it is essential to perform invert operation on the operands, > too. Yes, and that is what break_up_subtract_bb and friends do for negates. In particular if the defining statement is of the right type itself (right now simply a PLUS) it propagates the negate into the operands of that operation. Extend it to handle AND/IOR/NOT and you're a step further. > >> Eg: the case (~(a | c) & (b & ~d)) would be expanded (by code in > >> patch) to ~a & ~c & b & ~d. > > > > That's what I mean with better handling of mixed chains. -(a+b) is > > already (sometimes) rewritten into -a + -b (see negate_value). That's > > just slightly different from rewriting ~(a|b) into ~a & ~b. > > Yes, it affects just one operand. No, it doesn't. > And its weakness is that a (which might be a more complex statement) > doesn't get folded for the negate operation here. Eg. a = 1 - d; c = - > (a + b); should become d - b - 1 and optimized form of (d - (b + 1)). The folding you're talking about is done by undistribute_ops_list, optimize_ops_list and repropagate_negates. I'm asking you to follow the same general scheme of tree-ssa-reassoc which is: 1) enabling transformations to make collecting operands easier (this possibly changes and adds statements, that's the breakup_subtract things) 2) collect operands of operation chains (this would sometimes require remembering some context like top level and second level operation in your case) 3) optimize that list of operands 4) transform this list into statements again 5) undo transformations done in 1) if they turned out to be unprofitable I'm fairly certain that the transformations you want can be done with this scheme. In particular (as said multiple times) you are missing step (5) even though you unconditionally do step (1). And you don't do steps 2-4 at all. > But AFAIR the code the thing is different what break_up_substract does. > It modifies (a - b) -> a + (-b), which is for sure worth to simplify and > ease arithmetic plus optimization. But doesn't match the point you are > talking about. I'm not sure what you think my point is. It is: reuse existing schemes to do things, extending them as necessary to support your usecases. > As I tried to pointed out before is the approach in reassoc only well > working for single statements That's completely wrong. reassoc was precisely implemented to handle a chain of arithmetic operations, i.e. multiple statements. > But for an invert the operands and the expression-codes are changing, > too. Right, that's why you (possibly) might want to handle chains of mixing AND/IOR operations, though I don't see those happening in your testcases. After the negates are folded into atomics (i.e. negates of AND and IOR are folded into operands) you always have a chain of either AND or IOR (with the atomics being negated or not). That can trivially be handled with the currently implemented scheme. > As later reassoc pass just walk the tree and combines simliar > operation-codes in its vector by rank, and later optimization just > happens within this range, it is essential to feed this vector with > proper (normalized) expression-codes. That's step (1) above, and for arithmetics done by break_up_subtract. > Actually it might be also a way to rewrite reassociation pass in a way > that it just operates on normalized predicated trees made out of the > orignal tree. Operates on it, and then write it out in a denormalized > form. Such a tree would need information flags for inverted, negated, > code, operands (maybe also altered), Which means that such a tree gets > easily more complex. Right, but ultimately it's the way forward. Though, as I said, right now your testcase only have one common reducing operation. > Nevertheless the creation of final result out of a normalized > predicated tree would mean a re-write of tree anyway - even if there > might be in some rare cases the opportunity to reuse present > statements and avoid their rewriting. The reusing of statements in tree-reassoc is merely an artifact, it's not an inherent design decision. Ciao, Michael.
Index: gcc/gcc/tree-ssa-reassoc.c =================================================================== --- gcc.orig/gcc/tree-ssa-reassoc.c +++ gcc/gcc/tree-ssa-reassoc.c @@ -41,6 +41,10 @@ along with GCC; see the file COPYING3. #include "cfgloop.h" #include "flags.h" +/* Forwarders. */ +static gimple build_and_add_sum (tree, tree, tree, enum tree_code); +static void remove_visited_stmt_chain (tree); + /* This is a simple global reassociation pass. It is, in part, based on the LLVM pass of the same name (They do some things more/less than we do, in different orders, etc). @@ -48,7 +52,9 @@ along with GCC; see the file COPYING3. It consists of five steps: 1. Breaking up subtract operations into addition + negate, where - it would promote the reassociation of adds. + it would promote the reassociation of adds. Additionally breaking + up combined expression made out of boolean-typed bitwise expressions + for improving simplification. 2. Left linearization of the expression trees, so that (A+B)+(C+D) becomes (((A+B)+C)+D), which is easier for us to rewrite later. @@ -554,6 +560,233 @@ get_unary_op (tree name, enum tree_code return NULL_TREE; } +/* Create a temorary register expression with type TYPE, tree code CODE, and + operands OP1 and OP2. If REF_DEF is a valid gimple statement, we use its + location for new generated temporary. + Function returns left-hand-side of new generated temporary register. */ +static tree +make_new_tmp_statement (tree type, enum tree_code code, tree op1, tree op2, + gimple ref_def) +{ + gimple sum; + tree tmpvar = create_tmp_reg (type, NULL); + add_referenced_var (tmpvar); + sum = build_and_add_sum (tmpvar, op1, op2, code); + if (ref_def) + gimple_set_location (sum, gimple_location (ref_def)); + return gimple_get_lhs (sum); +} + +/* Perfrom on tree LHS with optional definition statement EPXR + a logic-not operation. TYPE is of kind boolean with a 1-bit + precision. */ +static tree +expand_not_bitwise_binary (tree type, tree lhs, gimple expr) +{ + enum tree_code code = ERROR_MARK; + tree op1, op2; + gimple s1 = NULL, s2 = NULL; + + if (expr && is_gimple_assign (expr)) + code = gimple_assign_rhs_code (expr); + else if (TREE_CODE (lhs) == INTEGER_CST) + return fold_build1 (BIT_NOT_EXPR, type, lhs); + + /* ~(~X) -> X. */ + if (code == BIT_NOT_EXPR) + return gimple_assign_rhs1 (expr); + /* Invert comparison if possible, otherwise fall through to + default case. */ + else if (TREE_CODE_CLASS (code) == tcc_comparison) + { + enum tree_code ncode; + ncode = invert_tree_comparison (code, + HONOR_NANS (TYPE_MODE (type))); + if (ncode != ERROR_MARK) + return make_new_tmp_statement (type, ncode, + gimple_assign_rhs1 (expr), + gimple_assign_rhs2 (expr), + expr); + } + /* ~(A & B) -> ~A | ~B. */ + else if (code == BIT_AND_EXPR) + { + op1 = gimple_assign_rhs1 (expr); + if (TREE_CODE (op1) == SSA_NAME) + s1 = SSA_NAME_DEF_STMT (op1); + op2 = gimple_assign_rhs2 (expr); + if (TREE_CODE (op2) == SSA_NAME) + s2 = SSA_NAME_DEF_STMT (op2); + op1 = expand_not_bitwise_binary (type, op1, s1); + op2 = expand_not_bitwise_binary (type, op2, s2); + if (TREE_CODE (op2) == INTEGER_CST && integer_zerop (op2)) + return op1; + else if (TREE_CODE (op2) == INTEGER_CST && integer_all_onesp (op2)) + return op2; + return make_new_tmp_statement (type, BIT_IOR_EXPR, op1, op2, expr); + } + /* ~(A | B) -> ~A & ~B. */ + else if (code == BIT_IOR_EXPR) + { + op1 = gimple_assign_rhs1 (expr); + if (TREE_CODE (op1) == SSA_NAME) + s1 = SSA_NAME_DEF_STMT (op1); + op2 = gimple_assign_rhs2 (expr); + if (TREE_CODE (op2) == SSA_NAME) + s2 = SSA_NAME_DEF_STMT (op2); + op1 = expand_not_bitwise_binary (type, op1, s1); + op2 = expand_not_bitwise_binary (type, op2, s2); + if (TREE_CODE (op2) == INTEGER_CST && integer_zerop (op2)) + return op2; + else if (TREE_CODE (op2) == INTEGER_CST && integer_all_onesp (op2)) + return op1; + return make_new_tmp_statement (type, BIT_AND_EXPR, op1, op2, expr); + } + /* ~(A ^ B) -> ~A ^ B. Handle here special cases for B being + an integer constant, or being a logical-not. */ + else if (code == BIT_XOR_EXPR) + { + op1 = gimple_assign_rhs1 (expr); + if (TREE_CODE (op1) == SSA_NAME) + s1 = SSA_NAME_DEF_STMT (op1); + op2 = gimple_assign_rhs2 (expr); + if (TREE_CODE (op2) == SSA_NAME) + s2 = SSA_NAME_DEF_STMT (op2); + if (TREE_CODE (op2) != INTEGER_CST + && (!s2 || !is_gimple_assign (s2) + || gimple_assign_rhs_code (s2) != BIT_NOT_EXPR)) + op1 = expand_not_bitwise_binary (type, op1, s1); + else + op2 = expand_not_bitwise_binary (type, op2, s2); + + if (TREE_CODE (op2) == INTEGER_CST && integer_zerop (op2)) + return op1; + else if (TREE_CODE (op2) == INTEGER_CST && integer_all_onesp (op2)) + return make_new_tmp_statement (type, BIT_NOT_EXPR, op1, NULL_TREE, + expr); + return make_new_tmp_statement (type, BIT_XOR_EXPR, op1, op2, expr); + } + + /* Default case lhs -> ~lhs */ + return make_new_tmp_statement (type, BIT_NOT_EXPR, lhs, NULL_TREE, expr); +} + +/* Break up statement STMT if it is a combined expressions made out of + bitwise operations. Handle expansion of (A | B) !=/== 0, and ~(A op B). */ +static bool +break_up_bitwise_combined_stmt (gimple stmt) +{ + tree op1, op2; + gimple op1_def, op2_def; + enum tree_code code = gimple_assign_rhs_code (stmt); + tree type = TREE_TYPE (gimple_assign_lhs (stmt)); + gimple_stmt_iterator gsi; + bool ret; + + /* Don't do anything for none integral type. */ + if (!INTEGRAL_TYPE_P (type)) + return false; + + op1 = gimple_assign_rhs1 (stmt); + + op1_def = NULL; + if (TREE_CODE (op1) != SSA_NAME + || !(op1_def = SSA_NAME_DEF_STMT (op1)) + || !is_gimple_assign (op1_def)) + op1_def = NULL; + + if (op1_def && (code == NE_EXPR || code == EQ_EXPR)) + { + tree zero = gimple_assign_rhs2 (stmt); + tree old_op = op1; + + /* Check that right-hand operand has integral type and + not a boolean. And see if it is constant zero valued. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (zero)) + || TREE_CODE (TREE_TYPE (zero)) == BOOLEAN_TYPE + || !integer_zerop (zero)) + return false; + /* Is left-hand operand bitwise-or expression? */ + if (gimple_assign_rhs_code (op1_def) != BIT_IOR_EXPR) + return false; + op1 = make_new_tmp_statement (type, code, + gimple_assign_rhs1 (op1_def), zero, + stmt); + op2 = make_new_tmp_statement (type, code, + gimple_assign_rhs2 (op1_def), zero, + stmt); + + gsi = gsi_for_stmt (stmt); + gimple_assign_set_rhs_with_ops (&gsi, (code == NE_EXPR ? BIT_IOR_EXPR + : BIT_AND_EXPR), + op1, op2); + update_stmt (gsi_stmt (gsi)); + remove_visited_stmt_chain (old_op); + return true; + } + /* Handle expansion for logical-not ~X. */ + else if (op1_def && code == BIT_NOT_EXPR + && TREE_CODE (type) == BOOLEAN_TYPE + && TYPE_PRECISION (type) == 1) + { + tree old_op; + enum tree_code inner_code = gimple_assign_rhs_code (op1_def); + if (inner_code != BIT_AND_EXPR && inner_code != BIT_IOR_EXPR + && inner_code != BIT_XOR_EXPR) + return false; + old_op = op1; + op1 = gimple_assign_rhs1 (op1_def); + op2 = gimple_assign_rhs2 (op1_def); + op1_def = op2_def = NULL; + if (TREE_CODE (op1) != SSA_NAME + || (op1_def = SSA_NAME_DEF_STMT (op1)) == NULL + || !is_gimple_assign (op1_def)) + op1_def = NULL; + if (TREE_CODE (op2) != SSA_NAME + || (op2_def = SSA_NAME_DEF_STMT (op2)) == NULL + || !is_gimple_assign (op2_def)) + op2_def = NULL; + if (inner_code == BIT_XOR_EXPR) + { + if (TREE_CODE (op2) != INTEGER_CST + && (!op2_def || !is_gimple_assign (op2_def) + || gimple_assign_rhs_code (op2_def) != BIT_NOT_EXPR)) + op1 = expand_not_bitwise_binary (type, op1, op1_def); + else + op2 = expand_not_bitwise_binary (type, op2, op2_def); + } + else + { + op1 = expand_not_bitwise_binary (type, op1, op1_def); + op2 = expand_not_bitwise_binary (type, op2, op2_def); + inner_code = (inner_code == BIT_AND_EXPR ? BIT_IOR_EXPR + : BIT_AND_EXPR); + } + gsi = gsi_for_stmt (stmt); + gimple_assign_set_rhs_with_ops (&gsi, inner_code, op1, op2); + update_stmt (gsi_stmt (gsi)); + remove_visited_stmt_chain (old_op); + return true; + } + /* Is CODE a bitwise-binary operation X op Y? */ + else if (code != BIT_AND_EXPR && code != BIT_IOR_EXPR + && code != BIT_XOR_EXPR) + return false; + + /* See if X needs to be expanded. */ + ret = false; + if (op1_def) + ret = break_up_bitwise_combined_stmt (op1_def); + + /* See if Y needs to be expanded. */ + op2 = gimple_assign_rhs2 (stmt); + if (TREE_CODE (op2) != SSA_NAME + || !(op2_def = SSA_NAME_DEF_STMT (op2)) + || !is_gimple_assign (op2_def)) + return ret; + return break_up_bitwise_combined_stmt (op2_def) | ret; +} + /* If CURR and LAST are a pair of ops that OPCODE allows us to eliminate through equivalences, do so, remove them from OPS, and return true. Otherwise, return false. */ @@ -1015,8 +1248,8 @@ zero_one_operation (tree *def, enum tree while (1); } -/* Builds one statement performing OP1 OPCODE OP2 using TMPVAR for - the result. Places the statement after the definition of either +/* Builds one statement performing OP1 OPCODE OP2, OPCODE op1 using TMPVAR + for the result. Places the statement after the definition of either OP1 or OP2. Returns the new statement. */ static gimple @@ -1035,7 +1268,7 @@ build_and_add_sum (tree tmpvar, tree op1 /* Find an insertion place and insert. */ if (TREE_CODE (op1) == SSA_NAME) op1def = SSA_NAME_DEF_STMT (op1); - if (TREE_CODE (op2) == SSA_NAME) + if (op2 && TREE_CODE (op2) == SSA_NAME) op2def = SSA_NAME_DEF_STMT (op2); if ((!op1def || gimple_nop_p (op1def)) && (!op2def || gimple_nop_p (op2def))) @@ -2133,6 +2366,17 @@ can_reassociate_p (tree op) we want to break up k = t - q, but we won't until we've transformed q = b - r, which won't be broken up until we transform b = c - d. + Break up comparison !=/== 0 operations of bitwise-or operations for + being able to optimize within combined conditions. + (A | B) != 0 -> (A != 0) || (B != 0) + (A | B) == 0 -> (A == 0) && (B != 0) + + Break up logical-not expressions of bitwise boolean-typed and/or/xor + operations for being able to optimize wihin combined conditions. + ~(A | B) -> ~A | ~B + ~(A & B) -> ~A & ~B + ~(A ^ B) -> A ^ ~B (special case if B is a constant) + En passant, clear the GIMPLE visited flag on every statement. */ static void @@ -2141,21 +2385,32 @@ break_up_subtract_bb (basic_block bb) gimple_stmt_iterator gsi; basic_block son; - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); ) { gimple stmt = gsi_stmt (gsi); gimple_set_visited (stmt, false); - - if (!is_gimple_assign (stmt) - || !can_reassociate_p (gimple_assign_lhs (stmt))) + if (!is_gimple_assign (stmt)) + { + gsi_next (&gsi); + continue; + } + if (break_up_bitwise_combined_stmt (stmt)) continue; + if (!can_reassociate_p (gimple_assign_lhs (stmt))) + { + gsi_next (&gsi); + continue; + } /* Look for simple gimple subtract operations. */ if (gimple_assign_rhs_code (stmt) == MINUS_EXPR) { if (!can_reassociate_p (gimple_assign_rhs1 (stmt)) || !can_reassociate_p (gimple_assign_rhs2 (stmt))) - continue; + { + gsi_next (&gsi); + continue; + } /* Check for a subtract used only in an addition. If this is the case, transform it into add of a negate for better @@ -2167,6 +2422,7 @@ break_up_subtract_bb (basic_block bb) else if (gimple_assign_rhs_code (stmt) == NEGATE_EXPR && can_reassociate_p (gimple_assign_rhs1 (stmt))) VEC_safe_push (tree, heap, plus_negates, gimple_assign_lhs (stmt)); + gsi_next (&gsi); } for (son = first_dom_son (CDI_DOMINATORS, bb); son; Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */ + +int foo (int a, int b, int c, int d) +{ + int r1 = a != 0 & c != 0 & b != 0; + int r2 = a == 0 | b != 0 | d == 0; + return (r1 != 0 & r2 == 0); +} + +/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized"} } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */ + +int foo (int a, int b, int c, int d) +{ + int r1 = a != 0 & c != 0 & b != 0; + int r2 = a == 0 | b != 0 | d == 0; + return (r1 == 0 | r2 != 0); +} + +/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized"} } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-26.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-26.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */ + +int foo (int a, int b, int c, int d) +{ + int r1 = (a | b | c) == 0; + int r2 = (a | d | c) != 0 | b == 0; + return (r1 == 0 | r2 != 0); +} + +/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized"} } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */