From patchwork Mon Dec 4 05:59:34 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Law X-Patchwork-Id: 844118 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-468417-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="O2ZZ0xYD"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3yqvNN50Txz9s9Y for ; Mon, 4 Dec 2017 16:59:48 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:message-id:date:mime-version:content-type; q=dns; s= default; b=UJyldeiOU/x9EOVUK3KqcBE0X3YMW/MKlVk1fEYS/8k1y9gLhTumr FpcStdzRXPmlAKMEM65Mx2PyCQ8z5QHC12z7APPmVLciZtfbWE1tlbosrlxbZcSF BYXolZua4dk09GQK0NlBsEixKLTPnaQSZxLPXp34Ta6+nYAd3aGTHM= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:message-id:date:mime-version:content-type; s= default; bh=fVJo4g3zTiM5XNK5PyfyMzSoCV4=; b=O2ZZ0xYDvmuC+A5p6lEH XpF3xgYOcFPkY14WTDCQWwkVOnqc006HwmFujXEY9vWgbLg7pqtHe9eoYnNyt4e8 BBpi8VrPwuqIdNWnCAsvd7cMDwqmFh0sDcyXhyZXhtL29H8igK9JikFgFohyWRFb GluQC6HZApd4SN6rttf/QHg= Received: (qmail 14571 invoked by alias); 4 Dec 2017 05:59:41 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 14554 invoked by uid 89); 4 Dec 2017 05:59:40 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_HELO_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=nowhere, examining, pps, pulling X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 04 Dec 2017 05:59:37 +0000 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id DA0FF81DF6 for ; Mon, 4 Dec 2017 05:59:36 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-2.rdu2.redhat.com [10.10.112.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id ED5D25D9C6 for ; Mon, 4 Dec 2017 05:59:35 +0000 (UTC) From: Jeff Law Subject: [RFA][PATCH][tree-optimization/78496] 03/03 Embed and exploit EVRP analysis within DOM To: gcc-patches Message-ID: <811b15c2-2116-a8bd-1648-3446a7e10fc4@redhat.com> Date: Sun, 3 Dec 2017 22:59:34 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 X-IsSubscribed: yes And finally, the meat of the fix for pr78496. This is largely the patch that was posed right when stage1 closed with just minor updates. This patch embeds evrp analysis into DOM and exploits it in the minimal way necessary to fix pr78496 without regressing other tests in the testsuite. The key effect we're looking for here is to pick up a slew of jump threads in DOM2 by exploiting the range information during jump threading. These aren't handled well in the standard tree-vrp jump threading -- we could abuse ASSERT_EXPRs further, but it'd just be ugly and probably computationally expensive. The ranges we want fall out naturally from a dominator walk, hence all the recent work around pulling out EVRP analysis into a little module other passes could reuse. With these changes we pick up all the missing jump thread opportunities in DOM for pr78496. Sadly, I've been able to pull together an automated test that I like. About the best I could come up with would be to verify that a large number of jump threads were realized during DOM2. But that still seems rather fragile. I also looked at examining the results of PRE, but the partial redundancies that were originally the source of complaints in that BZ have already been fixed. I also looked at perhaps looking for PHIs with constant args and then perhaps trying to filter those, but got nowhere. So there's no direct test for pr78496. Sigh. Running EVRP analysis in DOM had an unintended side effect. Namely that SSA_NAMEs that got created after the EVRP optimization pass would have range information attached to them. That caused two warning regressions with -O1 code in the C++ and Fortran testsuites. The problem is the ranges attached to the new SSA_NAMES were very wide and there was code left on an unexecutable path that called an allocation routine (C++) and a memset (Fortran) using those names as arguments. The uber-wide ranges in those circumstances triggered the false positive warnings. By exploiting the EVRP data during the standard part of DOM to optimize conditionals, we're able to prove the paths in question simply aren't executable. We remove them and the warnings go away. This work compromises builtin-unreachable-6. So the original test it kept and -fno-tree-dominator-opts is added to keep it from being compromised. A new builtin-unreachable-6a test is created to very that DOM does indeed remove all the __builtin_unreachable paths. This work also results in us optimizing 20030922-2.c again (conditional equivalences). EVRP analysis will note the conditional equivalence. We don't propagate anything from EVRP analysis at this time, but we do use it to try and simplify GIMPLE_CONDs to a compile-time constant which is what happens in this case. I plan to check this in if/when the first two patches are approved. Jeff ps. The x_vr_values hack is scheduled to go away as we remove tree-vrp.c's threading implementation in gcc-9 -- we'll be able to drop the simplification callback and instead have simplification live within the dom_opt_dom_walker class where it'll have access to vr_values. The analogous version in tree-vrp.c just gets deleted at that time. pps. I know there's a bogus propagation patch that I need to go back and re-check based on comments from Richi. That's definitely in the queue. * gimple-ssa-evrp-analyze.h (evrp_range_analyzer::get_vr_values): Simplify. * gimple-ssa-evrp-analyze.c: Corresponding changes. * tree-ssa-dom.c: Include alloc-pool.h, tree-vrp.h, vr-values.h and gimple-ssa-evrp-analyze.h. (dom_opt_dom_walker class): Add evrp_range_analyzer member. (simplify_stmt_for_jump_threading): Copy a blob of code from tree-vrp.c to use ranges to simplify statements. (dom_opt_dom_walker::before_dom_children): Call evrp_range_analyzer::{enter,record_ranges_from_stmt} methods. (dom_opt_dom_walker::after_dom_children): Similarly for evrp_range_analyzer::leave. (dom_opt_dom_walker::optimize_stmt): Use EVRP ranges to optimize conditionals. * gcc.dg/builtin-unreachable-6.c: Disable DOM. * gcc.dg/builtin-unreachable-6a.c: New test. * gcc.dg/tree-ssa/20030922-1.c: No longer XFAIL. * gcc.dg/ssa-dom-branch-1.c: Tweak expected output. diff --git a/gcc/gimple-ssa-evrp-analyze.h b/gcc/gimple-ssa-evrp-analyze.h index 6216a90..4783e6f 100644 --- a/gcc/gimple-ssa-evrp-analyze.h +++ b/gcc/gimple-ssa-evrp-analyze.h @@ -51,13 +51,7 @@ class evrp_range_analyzer true, then we are transferring ownership. Else we keep ownership. This should be converted to a unique_ptr. */ - class vr_values *get_vr_values (bool transfer) - { - class vr_values *x = vr_values; - if (transfer) - vr_values = NULL; - return x; - } + class vr_values *get_vr_values (void) { return vr_values; } private: DISABLE_COPY_AND_ASSIGN (evrp_range_analyzer); diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c index a554cf9..609ce38 100644 --- a/gcc/gimple-ssa-evrp.c +++ b/gcc/gimple-ssa-evrp.c @@ -68,7 +68,7 @@ class evrp_dom_walker : public dom_walker public: evrp_dom_walker () : dom_walker (CDI_DOMINATORS), - evrp_folder (evrp_range_analyzer.get_vr_values (false)) + evrp_folder (evrp_range_analyzer.get_vr_values ()) { need_eh_cleanup = BITMAP_ALLOC (NULL); } diff --git a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c index 2f8ca36..1915dd1 100644 --- a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c +++ b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-fab1" } */ +/* { dg-options "-O2 -fdump-tree-fab1 -fno-tree-dominator-opts" } */ void foo (int b, int c) diff --git a/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c b/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c new file mode 100644 index 0000000..f527f2e --- /dev/null +++ b/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c @@ -0,0 +1,7 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-fab1" } */ + +#include "builtin-unreachable-6.c" + +/* { dg-final { scan-tree-dump-times "lab:" 1 "fab1" } } */ +/* { dg-final { scan-tree-dump-not "__builtin_unreachable" "fab1" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c index 172f203..16c79da 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c @@ -20,6 +20,4 @@ rgn_rank (rtx insn1, rtx insn2) } /* There should be two IF conditionals. */ -/* We no longer record the conditional equivalence by design, thus we - are unable to simplify the 3rd conditional at compile time. */ -/* { dg-final { scan-tree-dump-times "if " 2 "dom2" { xfail *-*-* } } } */ +/* { dg-final { scan-tree-dump-times "if " 2 "dom2" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c index 3c15296..fae5bde 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c @@ -19,9 +19,10 @@ try_combine (rtx i1, rtx newpat) else if (i1 && foo ()); } -/* There should be two tests against i1. One from the hash table - dumps, one in the code itself. */ -/* { dg-final { scan-tree-dump-times "if .i1_" 2 "dom2"} } */ +/* There should be four tests against i1. One from the hash table + dumps, one from the EVRP analyzer one from EVRP evaluation and one + in the code itself. */ +/* { dg-final { scan-tree-dump-times "if .i1_" 4 "dom2"} } */ /* There should be no actual jump threads realized by DOM. The legitimize jump threads are handled in VRP and those discovered diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c index 916d661..59a7d87 100644 --- a/gcc/tree-ssa-dom.c +++ b/gcc/tree-ssa-dom.c @@ -46,6 +46,10 @@ along with GCC; see the file COPYING3. If not see #include "gimplify.h" #include "tree-cfgcleanup.h" #include "dbgcnt.h" +#include "alloc-pool.h" +#include "tree-vrp.h" +#include "vr-values.h" +#include "gimple-ssa-evrp-analyze.h" /* This file implements optimizations on the dominator tree. */ @@ -584,6 +588,9 @@ private: class const_and_copies *m_const_and_copies; class avail_exprs_stack *m_avail_exprs_stack; + /* VRP data. */ + class evrp_range_analyzer evrp_range_analyzer; + /* Dummy condition to avoid creating lots of throw away statements. */ gcond *m_dummy_cond; @@ -835,6 +842,9 @@ make_pass_dominator (gcc::context *ctxt) return new pass_dominator (ctxt); } +/* A hack until we remove threading from tree-vrp.c and bring the + simplification routine into the dom_opt_dom_walker class. */ +static class vr_values *x_vr_values; /* A trivial wrapper so that we can present the generic jump threading code with a simple API for simplifying statements. */ @@ -844,7 +854,95 @@ simplify_stmt_for_jump_threading (gimple *stmt, class avail_exprs_stack *avail_exprs_stack, basic_block bb ATTRIBUTE_UNUSED) { - return avail_exprs_stack->lookup_avail_expr (stmt, false, true); + /* First query our hash table to see if the the expression is available + there. A non-NULL return value will be either a constant or another + SSA_NAME. */ + tree cached_lhs = avail_exprs_stack->lookup_avail_expr (stmt, false, true); + if (cached_lhs) + return cached_lhs; + + /* If the hash table query failed, query VRP information. This is + essentially the same as tree-vrp's simplification routine. The + copy in tree-vrp is scheduled for removal in gcc-9. */ + if (gcond *cond_stmt = dyn_cast (stmt)) + { + cached_lhs + = x_vr_values->vrp_evaluate_conditional (gimple_cond_code (cond_stmt), + gimple_cond_lhs (cond_stmt), + gimple_cond_rhs (cond_stmt), + within_stmt); + return cached_lhs; + } + + if (gswitch *switch_stmt = dyn_cast (stmt)) + { + tree op = gimple_switch_index (switch_stmt); + if (TREE_CODE (op) != SSA_NAME) + return NULL_TREE; + + value_range *vr = x_vr_values->get_value_range (op); + if ((vr->type != VR_RANGE && vr->type != VR_ANTI_RANGE) + || symbolic_range_p (vr)) + return NULL_TREE; + + if (vr->type == VR_RANGE) + { + size_t i, j; + + find_case_label_range (switch_stmt, vr->min, vr->max, &i, &j); + + if (i == j) + { + tree label = gimple_switch_label (switch_stmt, i); + + if (CASE_HIGH (label) != NULL_TREE + ? (tree_int_cst_compare (CASE_LOW (label), vr->min) <= 0 + && tree_int_cst_compare (CASE_HIGH (label), vr->max) >= 0) + : (tree_int_cst_equal (CASE_LOW (label), vr->min) + && tree_int_cst_equal (vr->min, vr->max))) + return label; + + if (i > j) + return gimple_switch_label (switch_stmt, 0); + } + } + + if (vr->type == VR_ANTI_RANGE) + { + unsigned n = gimple_switch_num_labels (switch_stmt); + tree min_label = gimple_switch_label (switch_stmt, 1); + tree max_label = gimple_switch_label (switch_stmt, n - 1); + + /* The default label will be taken only if the anti-range of the + operand is entirely outside the bounds of all the (non-default) + case labels. */ + if (tree_int_cst_compare (vr->min, CASE_LOW (min_label)) <= 0 + && (CASE_HIGH (max_label) != NULL_TREE + ? tree_int_cst_compare (vr->max, CASE_HIGH (max_label)) >= 0 + : tree_int_cst_compare (vr->max, CASE_LOW (max_label)) >= 0)) + return gimple_switch_label (switch_stmt, 0); + } + return NULL_TREE; + } + + if (gassign *assign_stmt = dyn_cast (stmt)) + { + tree lhs = gimple_assign_lhs (assign_stmt); + if (TREE_CODE (lhs) == SSA_NAME + && (INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + || POINTER_TYPE_P (TREE_TYPE (lhs))) + && stmt_interesting_for_vrp (stmt)) + { + edge dummy_e; + tree dummy_tree; + value_range new_vr = VR_INITIALIZER; + x_vr_values->extract_range_from_stmt (stmt, &dummy_e, + &dummy_tree, &new_vr); + if (range_int_cst_singleton_p (&new_vr)) + return new_vr.min; + } + } + return NULL; } /* Valueize hook for gimple_fold_stmt_to_constant_1. */ @@ -1310,6 +1408,8 @@ dom_opt_dom_walker::before_dom_children (basic_block bb) if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "\n\nOptimizing block #%d\n\n", bb->index); + evrp_range_analyzer.enter (bb); + /* Push a marker on the stacks of local information so that we know how far to unwind when we finalize this block. */ m_avail_exprs_stack->push_marker (); @@ -1332,7 +1432,10 @@ dom_opt_dom_walker::before_dom_children (basic_block bb) edge taken_edge = NULL; for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) - taken_edge = this->optimize_stmt (bb, gsi); + { + evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi)); + taken_edge = this->optimize_stmt (bb, gsi); + } /* Now prepare to process dominated blocks. */ record_edge_info (bb); @@ -1350,13 +1453,16 @@ dom_opt_dom_walker::before_dom_children (basic_block bb) void dom_opt_dom_walker::after_dom_children (basic_block bb) { + x_vr_values = evrp_range_analyzer.get_vr_values (); thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies, m_avail_exprs_stack, simplify_stmt_for_jump_threading); + x_vr_values = NULL; /* These remove expressions local to BB from the tables. */ m_avail_exprs_stack->pop_to_marker (); m_const_and_copies->pop_to_marker (); + evrp_range_analyzer.leave (bb); } /* Search for redundant computations in STMT. If any are found, then @@ -1902,6 +2008,31 @@ dom_opt_dom_walker::optimize_stmt (basic_block bb, gimple_stmt_iterator si) integer_zero_node)); gimple_set_modified (stmt, true); } + else if (TREE_CODE (lhs) == SSA_NAME) + { + /* Exploiting EVRP data is not yet fully integrated into DOM + but we need to do something for this case to avoid regressing + udr4.f90 and new1.C which have unexecutable blocks with + undefined behavior that get diagnosed if they're left in the + IL because we've attached range information to new + SSA_NAMES. */ + edge taken_edge = NULL; + evrp_range_analyzer.vrp_visit_cond_stmt (as_a (stmt), + &taken_edge); + if (taken_edge) + { + if (taken_edge->flags & EDGE_TRUE_VALUE) + gimple_cond_make_true (as_a (stmt)); + else if (taken_edge->flags & EDGE_FALSE_VALUE) + gimple_cond_make_false (as_a (stmt)); + else + gcc_unreachable (); + gimple_set_modified (stmt, true); + update_stmt (stmt); + cfg_altered = true; + return taken_edge; + } + } } update_stmt_if_modified (stmt);