Message ID | 56997381.7010101@redhat.com |
---|---|
State | New |
Headers | show |
On Fri, Jan 15, 2016 at 03:32:33PM -0700, Jeff Law wrote: > +bool > +ssa_name_has_boolean_range (tree op) > +{ > + gcc_assert (TREE_CODE (op) == SSA_NAME); > + > + /* Boolean types always have a range [0..1]. */ > + if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE) > + return true; > + > + /* An integral type with a single bit of precision. */ > + if (INTEGRAL_TYPE_P (TREE_TYPE (op)) > + && TYPE_UNSIGNED (TREE_TYPE (op)) > + && TYPE_PRECISION (TREE_TYPE (op)) == 1) > + return true; > + > + /* An integral type with more precision, but the object > + only takes on values [0..1] as determined by VRP > + analysis. */ > + if (INTEGRAL_TYPE_P (TREE_TYPE (op)) > + && (TYPE_PRECISION (TREE_TYPE (op)) > 1 > + || TYPE_UNSIGNED (TREE_TYPE (op))) I think this || TYPE_UNSIGNED (TREE_TYPE (op)) is useless. Because, if TYPE_PRECISION (TREE_TYPE (op)) > 1, then both signed and unsigned is fine, and if precision is 1, then already the earlier if handled it, and precision 0 is hopefully invalid. > + && wi::eq_p (get_nonzero_bits (op), 1)) > + return true; > + > + return false; > +} Jakub
Jeff Law <law@redhat.com> writes: > commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0 > Author: Jeff Law <law@tor.usersys.redhat.com> > Date: Fri Jan 15 17:15:24 2016 -0500 > > PR tree-optimization/69270 > * tree-ssanames.c (ssa_name_has_boolean_range): Moved here from > tree-ssa-dom.c. Improve test for [0..1] ranve from VRP. > * tree-ssa-dom.c (ssa_name_has_boolean_range): Remove. > * tree-ssanames.h (ssa_name_has_boolean_range): Prototype. > * tree-ssa-uncprop.c (associate_equivalences_with_edges): Use > ssa_name_has_boolean_range and constant_boolean_node. > > PR tree-optimization/69270 > * gcc.dg/tree-ssa/pr69270-2.c: New test. > * gcc.dg/tree-ssa/pr69270-3.c: New test. This breaks gcc.target/aarch64/tst_3.c. //.tune generic .type f1, %function f1: - tst x0, 1 - csinc w0, w0, wzr, eq + ands w1, w0, 1 + csel w0, w1, w0, ne ret .size f1, .-f1 Andreas.
On 18/01/16 11:31, Andreas Schwab wrote: > Jeff Law <law@redhat.com> writes: > >> commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0 >> Author: Jeff Law <law@tor.usersys.redhat.com> >> Date: Fri Jan 15 17:15:24 2016 -0500 >> >> PR tree-optimization/69270 >> * tree-ssanames.c (ssa_name_has_boolean_range): Moved here from >> tree-ssa-dom.c. Improve test for [0..1] ranve from VRP. >> * tree-ssa-dom.c (ssa_name_has_boolean_range): Remove. >> * tree-ssanames.h (ssa_name_has_boolean_range): Prototype. >> * tree-ssa-uncprop.c (associate_equivalences_with_edges): Use >> ssa_name_has_boolean_range and constant_boolean_node. >> >> PR tree-optimization/69270 >> * gcc.dg/tree-ssa/pr69270-2.c: New test. >> * gcc.dg/tree-ssa/pr69270-3.c: New test. > This breaks gcc.target/aarch64/tst_3.c. > > //.tune generic > .type f1, %function > f1: > - tst x0, 1 > - csinc w0, w0, wzr, eq > + ands w1, w0, 1 > + csel w0, w1, w0, ne > ret > .size f1, .-f1 The two sequences look equally valid to me. Instead of doing an and-compare followed by a conditional increment we do an and-compare followed by a conditional select (without discarding the result of the and). So the testcase should be adjusted. I'll do it. Thanks, Kyrill > Andreas. >
On Mon, Jan 18, 2016 at 11:38:37AM +0000, Kyrill Tkachov wrote: > On 18/01/16 11:31, Andreas Schwab wrote: > >Jeff Law <law@redhat.com> writes: > > > >>commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0 > >>Author: Jeff Law <law@tor.usersys.redhat.com> > >>Date: Fri Jan 15 17:15:24 2016 -0500 > >> > >> PR tree-optimization/69270 > >> * tree-ssanames.c (ssa_name_has_boolean_range): Moved here from > >> tree-ssa-dom.c. Improve test for [0..1] ranve from VRP. > >> * tree-ssa-dom.c (ssa_name_has_boolean_range): Remove. > >> * tree-ssanames.h (ssa_name_has_boolean_range): Prototype. > >> * tree-ssa-uncprop.c (associate_equivalences_with_edges): Use > >> ssa_name_has_boolean_range and constant_boolean_node. > >> PR tree-optimization/69270 > >> * gcc.dg/tree-ssa/pr69270-2.c: New test. > >> * gcc.dg/tree-ssa/pr69270-3.c: New test. > >This breaks gcc.target/aarch64/tst_3.c. > > > > //.tune generic > > .type f1, %function > > f1: > >- tst x0, 1 > >- csinc w0, w0, wzr, eq > >+ ands w1, w0, 1 > >+ csel w0, w1, w0, ne > > ret > > .size f1, .-f1 > > The two sequences look equally valid to me. > Instead of doing an and-compare followed by a conditional increment > we do an and-compare followed by a conditional select (without discarding > the result of the and). > So the testcase should be adjusted. > I'll do it. IMHO please wait for the resolution of PR69320 here. Jakub
On 18/01/16 11:49, Jakub Jelinek wrote: > On Mon, Jan 18, 2016 at 11:38:37AM +0000, Kyrill Tkachov wrote: >> On 18/01/16 11:31, Andreas Schwab wrote: >>> Jeff Law <law@redhat.com> writes: >>> >>>> commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0 >>>> Author: Jeff Law <law@tor.usersys.redhat.com> >>>> Date: Fri Jan 15 17:15:24 2016 -0500 >>>> >>>> PR tree-optimization/69270 >>>> * tree-ssanames.c (ssa_name_has_boolean_range): Moved here from >>>> tree-ssa-dom.c. Improve test for [0..1] ranve from VRP. >>>> * tree-ssa-dom.c (ssa_name_has_boolean_range): Remove. >>>> * tree-ssanames.h (ssa_name_has_boolean_range): Prototype. >>>> * tree-ssa-uncprop.c (associate_equivalences_with_edges): Use >>>> ssa_name_has_boolean_range and constant_boolean_node. >>>> PR tree-optimization/69270 >>>> * gcc.dg/tree-ssa/pr69270-2.c: New test. >>>> * gcc.dg/tree-ssa/pr69270-3.c: New test. >>> This breaks gcc.target/aarch64/tst_3.c. >>> >>> //.tune generic >>> .type f1, %function >>> f1: >>> - tst x0, 1 >>> - csinc w0, w0, wzr, eq >>> + ands w1, w0, 1 >>> + csel w0, w1, w0, ne >>> ret >>> .size f1, .-f1 >> The two sequences look equally valid to me. >> Instead of doing an and-compare followed by a conditional increment >> we do an and-compare followed by a conditional select (without discarding >> the result of the and). >> So the testcase should be adjusted. >> I'll do it. > IMHO please wait for the resolution of PR69320 here. Ok, but both sequences should be ok for aarch64 here. The function in the test is: int f1 (int x) { if (x & 1) return 1; return x; } and the "return 1" should just be changed to "return 2" to avoid whatever optimiser decided to perform that transformation. Kyrill
On Fri, Jan 15, 2016 at 2:32 PM, Jeff Law <law@redhat.com> wrote: > On 01/14/2016 11:14 AM, Jeff Law wrote: >> >> On 01/14/2016 12:49 AM, Jakub Jelinek wrote: >>> >>> On Thu, Jan 14, 2016 at 08:46:43AM +0100, Jakub Jelinek wrote: >>>> >>>> On Thu, Jan 14, 2016 at 12:38:52AM -0700, Jeff Law wrote: >>>>> >>>>> + /* An integral type with more precision, but the object >>>>> + only takes on values [0..1] as determined by VRP >>>>> + analysis. */ >>>>> + wide_int min, max; >>>>> + if (INTEGRAL_TYPE_P (TREE_TYPE (op)) >>>>> + && get_range_info (op, &min, &max) == VR_RANGE >>>>> + && wi::eq_p (min, 0) >>>>> + && wi::eq_p (max, 1)) >>>>> + return true; >>>> >>>> >>>> You could use and/or: >>>> if (INTEGRAL_TYPE_P (TREE_TYPE (op)) && wi::eq_p (get_nonzero_bits >>>> (op), 1)) >>>> set_range_info for VR_RANGE should usually update also the non-zero >>>> bits, but >>>> set_nonzero_bits does not update the recorded range. >>> >>> >>> Though, that would need to be limited to TYPE_PRECISION (TREE_TYPE >>> (op)) > 1 >>> or TYPE_UNSIGNED. >> >> Quite surprisingly, this does seem to do better fairly often. Usually >> it's just getting more constants into the PHI nodes without further >> improvements. However occasionally I see a PHI that turns into a >> constant, which is then propagated to a use where we're able to simplify >> some arithmetic/logical. >> >> Unfortunately it doesn't make a bit of difference in the final output, >> so something post DOM was picking up these anyway (most likely VRP2). >> But I'm a fan of getting stuff optimized earlier in the pipeline when >> it's reasonable to do so, and this seems reasonable. >> >> Limiting to TYPE_PRECISION > 1 or TYPE_UNSIGNED ought to be trivial. > > So further testing did show some code regressions from this improvement. > Specifically we were clearly better at propagating boolean values derived > from test conditions into PHIs (and ultimately into real statements as > well). That was the purpose of the patch. > > Where we took a small step backwards was the out-of-ssa translation and RTL > expansion. A constant argument in a PHI generates a constant load at RTL > time. We have uncprop to detect cases where there are already objects > holding the value we want and just before out-of-ssa we un-propagate the > constant. When the object holding the value we want coalesces with the LHS > of the PHI (which is most of the time) we win. > > uncprop wasn't catching these new cases where we'd propagated constants more > aggressively into PHI nodes. This patch fixes that problem. > > In all, this is a very small improvement in the generated code. It may > ultimately prove more useful in the future to drive path partitioning. > > There's two small tests. One verifies we're able to propagate more > constants per the original intent of the patch. The other verifies we > un-propagate as well. > > > Bootstrapped and regression tested on x86_64. Installed on the trunk. > > jeff > > > > commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0 > Author: Jeff Law <law@tor.usersys.redhat.com> > Date: Fri Jan 15 17:15:24 2016 -0500 > > PR tree-optimization/69270 > * tree-ssanames.c (ssa_name_has_boolean_range): Moved here from > tree-ssa-dom.c. Improve test for [0..1] ranve from VRP. > * tree-ssa-dom.c (ssa_name_has_boolean_range): Remove. > * tree-ssanames.h (ssa_name_has_boolean_range): Prototype. > * tree-ssa-uncprop.c (associate_equivalences_with_edges): Use > ssa_name_has_boolean_range and constant_boolean_node. > > PR tree-optimization/69270 > * gcc.dg/tree-ssa/pr69270-2.c: New test. > * gcc.dg/tree-ssa/pr69270-3.c: New test. > This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70005 H.J.
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index e3dc328..409e981 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,13 @@ +2016-01-15 Jeff Law <law@redhat.com> + + PR tree-optimization/69270 + * tree-ssanames.c (ssa_name_has_boolean_range): Moved here from + tree-ssa-dom.c. Improve test for [0..1] ranve from VRP. + * tree-ssa-dom.c (ssa_name_has_boolean_range): Remove. + * tree-ssanames.h (ssa_name_has_boolean_range): Prototype. + * tree-ssa-uncprop.c (associate_equivalences_with_edges): Use + ssa_name_has_boolean_range and constant_boolean_node. + 2016-01-15 Vladimir Makarov <vmakarov@redhat.com> PR rtl-optimization/69030 diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 29291a2..d9a9246 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,9 @@ +2016-01-15 Jeff Law <law@redhat.com> + + PR tree-optimization/69270 + * gcc.dg/tree-ssa/pr69270-2.c: New test. + * gcc.dg/tree-ssa/pr69270-3.c: New test. + 2016-01-15 Paul Thomas <pault@gcc.gnu.org> PR fortran/49630 diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-2.c new file mode 100644 index 0000000..15c7bdd --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-2.c @@ -0,0 +1,52 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dom3-details -w" } */ + +/* There should be a reference to usecount that turn into + constants. */ +/* { dg-final { scan-tree-dump-times "Replaced .usecount_\[0-9\]+. with constant .1." 1 "dom3"} } */ + +/* And an assignment using usecount ought to fold down to constants. */ +/* { dg-final { scan-tree-dump-times "Folded to: usecount_\[0-9\]+ = 2;" 1 "dom3"} } */ + +/* The arithmetic using usecount should be gone, except for the one in the + details debugging. */ +/* { dg-final { scan-tree-dump-times "usecount_\[0-9\]+ = usecount_\[0-9\]+ . 1;" 1 "dom3"} } */ + +typedef union tree_node *tree; +typedef union gimple_statement_d *gimple; +extern const int tree_code_type[]; +union tree_node +{ + int code:16; +}; +typedef struct immediate_use_iterator_d +{ +} +imm_use_iterator; +void +insert_debug_temp_for_var_def (gimple stmt) +{ + gimple def_stmt = ((void *) 0); + int usecount = 0; + tree value = ((void *) 0); + for (; arf ();) + { + if (!gimple_debug_bind_p (stmt)) + continue; + if (usecount++) + break; + unsigned char no_value = 0; + if (!gimple_bb (def_stmt)) + no_value = 1; + if (!no_value) + value = gimple_assign_rhs_to_tree (); + } + if (value) + { + if ((tree_code_type[(int) (((value)->code))] == 42) + || (usecount == 1 && (is_gimple_min_invariant (value)))) + value = unshare_expr (value); + } +} + + diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c new file mode 100644 index 0000000..89735f6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-uncprop-details -w" } */ + +/* We're looking for a constant argument a PHI node. There + should only be one if we unpropagate correctly. */ +/* { dg-final { scan-tree-dump-times ", 1" 1 "uncprop1"} } */ + +typedef long unsigned int size_t; +typedef union gimple_statement_d *gimple; +unsigned char +propagate_with_phi () +{ + gimple use_stmt; + unsigned char phi_inserted; + phi_inserted = 0; + for (; !end_imm_use_stmt_p (); next_imm_use_stmt ()) + { + if (!(arf () == 10 && boo () == 20)) + continue; + if (!phi_inserted) + phi_inserted = 1; + else + update_stmt (); + } +} + diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c index f2257b3..8298637 100644 --- a/gcc/tree-ssa-dom.c +++ b/gcc/tree-ssa-dom.c @@ -316,39 +316,6 @@ record_conditions (struct edge_info *edge_info, tree cond, tree inverted) edge_info->cond_equivalences.safe_push (c); } -/* Return TRUE is OP, an SSA_NAME has a range of values [0..1], false - otherwise. - - This can be because it is a boolean type, any unsigned integral - type with a single bit of precision, or has known range of [0..1] - via VRP analysis. */ - -static bool -ssa_name_has_boolean_range (tree op) -{ - /* Boolean types always have a range [0..1]. */ - if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE) - return true; - - /* An integral type with a single bit of precision. */ - if (INTEGRAL_TYPE_P (TREE_TYPE (op)) - && TYPE_UNSIGNED (TREE_TYPE (op)) - && TYPE_PRECISION (TREE_TYPE (op)) == 1) - return true; - - /* An integral type with more precision, but the object - only takes on values [0..1] as determined by VRP - analysis. */ - wide_int min, max; - if (INTEGRAL_TYPE_P (TREE_TYPE (op)) - && get_range_info (op, &min, &max) == VR_RANGE - && wi::eq_p (min, 0) - && wi::eq_p (max, 1)) - return true; - - return false; -} - /* We have finished optimizing BB, record any information implied by taking a specific outgoing edge from BB. */ diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c index 4b57578..307bb1f 100644 --- a/gcc/tree-ssa-uncprop.c +++ b/gcc/tree-ssa-uncprop.c @@ -94,23 +94,26 @@ associate_equivalences_with_edges (void) can record an equivalence for OP0 rather than COND. */ if (TREE_CODE (op0) == SSA_NAME && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op0) - && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE + && ssa_name_has_boolean_range (op0) && is_gimple_min_invariant (op1)) { + tree true_val = constant_boolean_node (true, TREE_TYPE (op0)); + tree false_val = constant_boolean_node (false, + TREE_TYPE (op0)); if (code == EQ_EXPR) { equivalency = XNEW (struct edge_equivalency); equivalency->lhs = op0; equivalency->rhs = (integer_zerop (op1) - ? boolean_false_node - : boolean_true_node); + ? false_val + : true_val); true_edge->aux = equivalency; equivalency = XNEW (struct edge_equivalency); equivalency->lhs = op0; equivalency->rhs = (integer_zerop (op1) - ? boolean_true_node - : boolean_false_node); + ? true_val + : false_val); false_edge->aux = equivalency; } else @@ -118,15 +121,15 @@ associate_equivalences_with_edges (void) equivalency = XNEW (struct edge_equivalency); equivalency->lhs = op0; equivalency->rhs = (integer_zerop (op1) - ? boolean_true_node - : boolean_false_node); + ? true_val + : false_val); true_edge->aux = equivalency; equivalency = XNEW (struct edge_equivalency); equivalency->lhs = op0; equivalency->rhs = (integer_zerop (op1) - ? boolean_false_node - : boolean_true_node); + ? false_val + : true_val); false_edge->aux = equivalency; } } diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c index 82866b2..b6f72e2 100644 --- a/gcc/tree-ssanames.c +++ b/gcc/tree-ssanames.c @@ -411,6 +411,40 @@ get_nonzero_bits (const_tree name) return ri->get_nonzero_bits (); } +/* Return TRUE is OP, an SSA_NAME has a range of values [0..1], false + otherwise. + + This can be because it is a boolean type, any unsigned integral + type with a single bit of precision, or has known range of [0..1] + via VRP analysis. */ + +bool +ssa_name_has_boolean_range (tree op) +{ + gcc_assert (TREE_CODE (op) == SSA_NAME); + + /* Boolean types always have a range [0..1]. */ + if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE) + return true; + + /* An integral type with a single bit of precision. */ + if (INTEGRAL_TYPE_P (TREE_TYPE (op)) + && TYPE_UNSIGNED (TREE_TYPE (op)) + && TYPE_PRECISION (TREE_TYPE (op)) == 1) + return true; + + /* An integral type with more precision, but the object + only takes on values [0..1] as determined by VRP + analysis. */ + if (INTEGRAL_TYPE_P (TREE_TYPE (op)) + && (TYPE_PRECISION (TREE_TYPE (op)) > 1 + || TYPE_UNSIGNED (TREE_TYPE (op))) + && wi::eq_p (get_nonzero_bits (op), 1)) + return true; + + return false; +} + /* We no longer need the SSA_NAME expression VAR, release it so that it may be reused. diff --git a/gcc/tree-ssanames.h b/gcc/tree-ssanames.h index d8ce684..c81b1a1 100644 --- a/gcc/tree-ssanames.h +++ b/gcc/tree-ssanames.h @@ -75,6 +75,7 @@ extern enum value_range_type get_range_info (const_tree, wide_int *, wide_int *); extern void set_nonzero_bits (tree, const wide_int_ref &); extern wide_int get_nonzero_bits (const_tree); +extern bool ssa_name_has_boolean_range (tree); extern void init_ssanames (struct function *, int); extern void fini_ssanames (struct function *); extern void ssanames_print_statistics (void);