X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 116861
Message-ID: <4E839DE5.4070709@mentor.com>
Date: Thu, 29 Sep 2011 00:21:25 +0200
From: Tom de Vries
To: Richard Guenther
Cc: Tom de Vries, "gcc-patches@gcc.gnu.org", Maxim Kuvyrkov
Subject: Re: [PATCH, PR43814] Assume function arguments of pointer type are aligned.
References: <4E787543.1090009@codesourcery.com>

On 09/24/2011 11:31 AM, Richard Guenther wrote:
> On Tue, Sep 20, 2011 at 1:13 PM, Tom de Vries wrote:
>> Hi Richard,
>>
>> I have a patch for PR43814. It introduces an option that assumes that function
>> arguments of pointer type are aligned, and uses that information in
>> tree-ssa-ccp. This enables the memcpy in pr43814-2.c to be inlined.
>>
>> I tested the patch successfully on-by-default on x86_64 and i686 (both
>> gcc-only builds).
>>
>> I also tested the patch on-by-default for ARM (gcc/glibc build). The patch
>> generated wrong code for uselocale.c:
>> ...
>> glibc/locale/locale.h:
>> ...
>> /* This value can be passed to `uselocale' and may be returned by
>>    it.  Passing this value to any other function has undefined behavior.  */
>> # define LC_GLOBAL_LOCALE       ((__locale_t) -1L)
>> ...
>> glibc/locale/uselocale.c:
>> ...
>> locale_t
>> __uselocale (locale_t newloc)
>> {
>>   locale_t oldloc = _NL_CURRENT_LOCALE;
>>
>>   if (newloc != NULL)
>>     {
>>       const locale_t locobj
>>         = newloc == LC_GLOBAL_LOCALE ? &_nl_global_locale : newloc;
>> ...
>> The assumption that function arguments of pointer type are aligned allowed
>> the test 'newloc == LC_GLOBAL_LOCALE' to evaluate to false.
>> But the use of ((__locale_t) -1L) as a function argument in uselocale
>> violates that assumption.
>>
>> Fixing the definition of LC_GLOBAL_LOCALE allowed the gcc tests to run
>> without regressions for ARM.
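
[ To make the quoted failure concrete: the miscompile reduces to comparing a
  pointer parameter against a deliberately misaligned constant.  A minimal
  sketch, written for this illustration and not taken from the mail or from
  glibc: ]

struct S { int i; };

int
f (struct S *p)
{
  /* If pointer arguments are assumed aligned, CCP knows the low bits of P
     are zero, while (struct S *) -1L has its low bits set, so this
     comparison folds to false -- wrong code for a caller that really does
     pass (struct S *) -1L, as uselocale does with LC_GLOBAL_LOCALE.  */
  return p == (struct S *) -1L;
}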
>>
>> Furthermore, the patch fixes ipa-sra-2.c and ipa-sra-6.c regressions on ARM,
>> discussed here:
>> - http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00930.html
>> - http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00459.html
>>
>> But, since glibc uses this construct currently, the option is off-by-default
>> for now.
>>
>> OK for trunk?
>
> No, I don't like to have an option to control this.  And given the issue
> you spotted it doesn't look like the best idea either.  This area in GCC
> is simply too fragile right now :/
>
> It would be nice if we could extend IPA-CP to propagate alignment
> information though.
>
> And at some point devise a reliable way for frontends to communicate
> alignment constraints the middle-end can rely on (well, yes, you could
> argue that's what TYPE_ALIGN is about, and I thought that maybe we
> can unconditionally rely on TYPE_ALIGN for pointer PARM_DECLs
> at least - your example shows we can't).
>
> In the end I'd probably say the patch is ok without the option (thus
> turned on by default), but if LC_GLOBAL_LOCALE is part of the
> glibc ABI then we clearly can't do this.
>

I removed the flag, and enabled the optimization now with the aligned
attribute.

Bootstrapped and tested on x86_64 (gcc-only build), and built and reg-tested
on ARM (gcc + glibc build); no issues found.

OK for trunk?

I intend to send a follow-up patch that introduces a target hook
function_pointers_aligned, which is false by default and true by default for
-mandroid.  I asked Google to test their Android codebase with the
optimization on by default.  Would such a target hook be acceptable?

> Richard.
>

Thanks,
- Tom

2011-09-29  Tom de Vries

	PR target/43814
	* tree-ssa-ccp.c (get_align_value): New function, factored out of
	get_value_from_alignment.
	(get_value_from_alignment): Use get_align_value.
	(get_value_for_expr): Use get_align_value to handle alignment of
	function argument pointers with TYPE_USER_ALIGN.
	* gcc/testsuite/gcc.dg/pr43814.c: New test.
	* gcc/testsuite/gcc.target/arm/pr43814-2.c: New test.

Index: gcc/tree-cfgcleanup.c
===================================================================
--- gcc/tree-cfgcleanup.c	(revision 173703)
+++ gcc/tree-cfgcleanup.c	(working copy)
@@ -641,6 +641,552 @@ cleanup_omp_return (basic_block bb)
   return true;
 }
 
+/* Returns true if S contains (I1, I2).  */
+
+static bool
+int_int_splay_lookup (splay_tree s, unsigned int i1, unsigned int i2)
+{
+  splay_tree_node node;
+
+  if (s == NULL)
+    return false;
+
+  node = splay_tree_lookup (s, i1);
+  return node && node->value == i2;
+}
+
+/* Attempts to insert (I1, I2) into *S.  Returns true if successful.
+   Allocates *S if necessary.  */
+
+static bool
+int_int_splay_insert (splay_tree *s, unsigned int i1, unsigned int i2)
+{
+  if (*s != NULL)
+    {
+      /* Check for existing element, which would otherwise be silently
+         overwritten by splay_tree_insert.  */
+      if (splay_tree_lookup (*s, i1))
+        return false;
+    }
+  else
+    *s = splay_tree_new (splay_tree_compare_ints, 0, 0);
+
+  splay_tree_insert (*s, i1, i2);
+  return true;
+}
+
+/* Returns 0 if (NODE->key, NODE->value) is an element of S.  Otherwise,
+   returns 1.  */
+
+static int
+int_int_splay_node_contained_in (splay_tree_node node, void *s)
+{
+  splay_tree_node snode = splay_tree_lookup ((splay_tree) s, node->key);
+  return (!snode || node->value != snode->value) ? 1 : 0;
+}
+
+/* Returns true if all elements of S1 are also in S2.  */
+
+static bool
+int_int_splay_contained_in (splay_tree s1, splay_tree s2)
+{
+  if (s1 == NULL)
+    return true;
+  if (s2 == NULL)
+    return false;
+  return splay_tree_foreach (s1, int_int_splay_node_contained_in, s2) == 0;
+}
+
+typedef splay_tree equiv_t;
+
+/* Returns true if EQUIV contains (SSA_NAME_VERSION (VAL1),
+   SSA_NAME_VERSION (VAL2)).  */
+
+static bool
+equiv_lookup (equiv_t equiv, tree val1, tree val2)
+{
+  if (val1 == NULL_TREE || val2 == NULL_TREE
+      || TREE_CODE (val1) != SSA_NAME || TREE_CODE (val2) != SSA_NAME)
+    return false;
+
+  return int_int_splay_lookup (equiv, SSA_NAME_VERSION (val1),
+                               SSA_NAME_VERSION (val2));
+}
+
+/* Attempts to insert (SSA_NAME_VERSION (VAL1), SSA_NAME_VERSION (VAL2)) into
+   EQUIV, provided they are defined in BB1 and BB2.  Returns true if
+   successful.  Allocates *EQUIV if necessary.  */
+
+static bool
+equiv_insert (equiv_t *equiv, tree val1, tree val2,
+              basic_block bb1, basic_block bb2)
+{
+  if (val1 == NULL_TREE || val2 == NULL_TREE
+      || TREE_CODE (val1) != SSA_NAME || TREE_CODE (val2) != SSA_NAME
+      || gimple_bb (SSA_NAME_DEF_STMT (val1)) != bb1
+      || gimple_bb (SSA_NAME_DEF_STMT (val2)) != bb2)
+    return false;
+
+  return int_int_splay_insert (equiv, SSA_NAME_VERSION (val1),
+                               SSA_NAME_VERSION (val2));
+}
+
+/* Returns true if all elements of S1 are also in S2.  */
+
+static bool
+equiv_contained_in (equiv_t s1, equiv_t s2)
+{
+  return int_int_splay_contained_in (s1, s2);
+}
+
+/* Init equiv_t *S.  */
+
+static void
+equiv_init (equiv_t *s)
+{
+  *s = NULL;
+}
+
+/* Delete equiv_t *S and reinit.  */
+
+static void
+equiv_delete (equiv_t *s)
+{
+  if (!*s)
+    return;
+
+  splay_tree_delete (*s);
+  *s = NULL;
+}
+
+/* Check whether S1 and S2 are equal, considering the fields in
+   gimple_statement_base.  Ignores fields uid, location, bb, and block.  */
+
+static bool
+gimple_base_equal_p (gimple s1, gimple s2)
+{
+  if (gimple_code (s1) != gimple_code (s2))
+    return false;
+
+  if (gimple_no_warning_p (s1) != gimple_no_warning_p (s2))
+    return false;
+
+  /* For pass-local flags visited and plf we would like to be more aggressive.
+     But that means we must have a way to find out whether the flags are
+     currently in use or not.  */
+  if (gimple_visited_p (s1) != gimple_visited_p (s2))
+    return false;
+
+  if (is_gimple_assign (s1)
+      && (gimple_assign_nontemporal_move_p (s1)
+          != gimple_assign_nontemporal_move_p (s2)))
+    return false;
+
+  if (gimple_plf (s1, GF_PLF_1) != gimple_plf (s2, GF_PLF_1))
+    return false;
+
+  if (gimple_plf (s1, GF_PLF_2) != gimple_plf (s2, GF_PLF_2))
+    return false;
+
+  /* The modified field is set when allocating, but only reset for the first
+     time once ssa_operands_active.  So before ssa_operands_active, the field
+     signals that the ssa operands have not been scanned, and after
+     ssa_operands_active it signals that the ssa operands might be invalid.
+     We check here only for the latter case.  */
+  if (ssa_operands_active ()
+      && (gimple_modified_p (s1) || gimple_modified_p (s2)))
+    return false;
+
+  if (gimple_has_volatile_ops (s1) != gimple_has_volatile_ops (s2))
+    return false;
+
+  if (s1->gsbase.subcode != s2->gsbase.subcode)
+    return false;
+
+  if (gimple_num_ops (s1) != gimple_num_ops (s2))
+    return false;
+
+  return true;
+}
+
+/* Return true if P1 and P2 can be considered equal.  */
+
+static bool
+pt_solution_equal_p (struct pt_solution *p1, struct pt_solution *p2)
+{
+  if (pt_solution_empty_p (p1) != pt_solution_empty_p (p2))
+    return false;
+  if (pt_solution_empty_p (p1))
+    return true;
+
+  /* TODO: make this less conservative.  */
+  return (p1->anything && p2->anything);
+}
+
+/* Return true if gimple statements S1 and S2 are equal.  EQUIV contains
+   pairs of local defs that can be considered equivalent at entry, and if
+   they are equal, at exit also contains the defs and vdefs of S1 and S2.  */
+
+static bool
+gimple_equal_p (equiv_t *equiv, gimple s1, gimple s2)
+{
+  unsigned int i;
+  enum gimple_statement_structure_enum gss;
+  tree lhs1, lhs2;
+  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
+
+  /* Handle omp gimples conservatively.  */
+  if (is_gimple_omp (s1) || is_gimple_omp (s2))
+    return false;
+
+  if (!gimple_base_equal_p (s1, s2))
+    return false;
+
+  gss = gimple_statement_structure (s1);
+  switch (gss)
+    {
+    case GSS_CALL:
+      if (!pt_solution_equal_p (gimple_call_use_set (s1),
+                                gimple_call_use_set (s2))
+          || !pt_solution_equal_p (gimple_call_clobber_set (s1),
+                                   gimple_call_clobber_set (s2))
+          || !gimple_call_same_target_p (s1, s2))
+        return false;
+      /* Fallthru.  */
+
+    case GSS_WITH_MEM_OPS_BASE:
+    case GSS_WITH_MEM_OPS:
+      {
+        tree vdef1 = gimple_vdef (s1), vdef2 = gimple_vdef (s2);
+        tree vuse1 = gimple_vuse (s1), vuse2 = gimple_vuse (s2);
+        if (vuse1 != vuse2 && !equiv_lookup (*equiv, vuse1, vuse2))
+          return false;
+        if (vdef1 != vdef2 && !equiv_insert (equiv, vdef1, vdef2, bb1, bb2))
+          return false;
+      }
+      /* Fallthru.  */
+
+    case GSS_WITH_OPS:
+      /* Ignore gimple_def_ops and gimple_use_ops.  They are duplicates of
+         gimple_vdef, gimple_vuse and gimple_ops, and checked elsewhere.  */
+      /* Fallthru.  */
+
+    case GSS_BASE:
+      break;
+
+    default:
+      return false;
+    }
+
+  /* Find lhs.  */
+  lhs1 = gimple_get_lhs (s1);
+  lhs2 = gimple_get_lhs (s2);
+
+  /* Handle ops.  */
+  for (i = 0; i < gimple_num_ops (s1); ++i)
+    {
+      tree t1 = gimple_op (s1, i);
+      tree t2 = gimple_op (s2, i);
+      if (t1 == NULL_TREE && t2 == NULL_TREE)
+        continue;
+      if (t1 == NULL_TREE || t2 == NULL_TREE)
+        return false;
+      /* Skip lhs.  */
+      if (lhs1 == t1 && i == 0)
+        continue;
+      if (!operand_equal_p (t1, t2, 0) && !equiv_lookup (*equiv, t1, t2))
+        return false;
+    }
+
+  /* Handle lhs.  */
+  if (lhs1 != lhs2 && !equiv_insert (equiv, lhs1, lhs2, bb1, bb2))
+    return false;
+
+  return true;
+}
+
+/* Return true if BB1 and BB2 contain the same non-debug gimple statements,
+   and if the def pairs in PHI_EQUIV are found to be equivalent defs in BB1
+   and BB2.  */
+
+static bool
+bb_gimple_equal_p (equiv_t phi_equiv, basic_block bb1, basic_block bb2)
+{
+  gimple_stmt_iterator gsi1 = gsi_start_nondebug_bb (bb1);
+  gimple_stmt_iterator gsi2 = gsi_start_nondebug_bb (bb2);
+  bool end1, end2;
+  equiv_t equiv;
+  bool equal = true;
+
+  end1 = gsi_end_p (gsi1);
+  end2 = gsi_end_p (gsi2);
+
+  /* Don't handle empty blocks, these are handled elsewhere in the
+     cleanup.  */
+  if (end1 || end2)
+    return false;
+
+  /* TODO: handle blocks with phi-nodes.  We'll have to find corresponding
+     phi-nodes in bb1 and bb2, with the same alternatives for the same
+     preds.  */
+  if (phi_nodes (bb1) != NULL || phi_nodes (bb2) != NULL)
+    return false;
+
+  equiv_init (&equiv);
+  while (true)
+    {
+      if (end1 && end2)
+        break;
+      if (end1 || end2
+          || !gimple_equal_p (&equiv, gsi_stmt (gsi1), gsi_stmt (gsi2)))
+        {
+          equal = false;
+          break;
+        }
+
+      gsi_next_nondebug (&gsi1);
+      gsi_next_nondebug (&gsi2);
+      end1 = gsi_end_p (gsi1);
+      end2 = gsi_end_p (gsi2);
+    }
+
+  /* equiv now contains all bb1,bb2 def pairs which are equivalent.
+     Check if the phi alternatives are indeed equivalent.  */
+  equal = equal && equiv_contained_in (phi_equiv, equiv);
+
+  equiv_delete (&equiv);
+
+  return equal;
+}
+
+/* Resets all debug statements in BBUSE that have uses that are not
+   dominated by their defs.  */
+
+static void
+update_debug_stmts (basic_block bbuse)
+{
+  use_operand_p use_p;
+  ssa_op_iter oi;
+  basic_block bbdef;
+  gimple stmt, def_stmt;
+  gimple_stmt_iterator gsi;
+  tree name;
+
+  for (gsi = gsi_start_bb (bbuse); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      stmt = gsi_stmt (gsi);
+      if (!is_gimple_debug (stmt))
+        continue;
+      gcc_assert (gimple_debug_bind_p (stmt));
+
+      gcc_assert (dom_info_available_p (CDI_DOMINATORS));
+
+      FOR_EACH_PHI_OR_STMT_USE (use_p, stmt, oi, SSA_OP_USE)
+        {
+          name = USE_FROM_PTR (use_p);
+          gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+          def_stmt = SSA_NAME_DEF_STMT (name);
+          gcc_assert (def_stmt != NULL);
+
+          bbdef = gimple_bb (def_stmt);
+          if (bbdef == NULL || bbuse == bbdef
+              || dominated_by_p (CDI_DOMINATORS, bbuse, bbdef))
+            continue;
+
+          gimple_debug_bind_reset_value (stmt);
+          update_stmt (stmt);
+        }
+    }
+}
+
+/* E1 and E2 have a common dest.  Detect if E1->src and E2->src are
+   duplicates, and if so, redirect the predecessor edges of E1->src to
+   E2->src and remove E1->src.  Returns true if any changes were made.  */
+
+static bool
+cleanup_duplicate_preds_1 (equiv_t phi_equiv, edge e1, edge e2)
+{
+  edge pred_edge;
+  basic_block bb1, bb2, pred;
+  basic_block bb_dom = NULL, bb2_dom = NULL;
+  unsigned int i;
+  basic_block bb = e1->dest;
+  gcc_assert (bb == e2->dest);
+
+  if (e1->flags != e2->flags)
+    return false;
+
+  bb1 = e1->src;
+  bb2 = e2->src;
+
+  /* TODO: We could allow multiple successor edges here, as long as bb1 and
+     bb2 have the same successors.  */
+  if (EDGE_COUNT (bb1->succs) != 1 || EDGE_COUNT (bb2->succs) != 1)
+    return false;
+
+  if (!bb_gimple_equal_p (phi_equiv, bb1, bb2))
+    return false;
+
+  if (dump_file)
+    fprintf (dump_file, "cleanup_duplicate_preds: "
+             "cleaning up <bb %d>, duplicate of <bb %d>\n", bb1->index,
+             bb2->index);
+
+  /* Calculate the changes to be made to the dominator info.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    {
+      /* Calculate bb2_dom.  */
+      bb2_dom = nearest_common_dominator (CDI_DOMINATORS, bb2, bb1);
+      if (bb2_dom == bb1 || bb2_dom == bb2)
+        bb2_dom = get_immediate_dominator (CDI_DOMINATORS, bb2_dom);
+
+      /* Calculate bb_dom.  */
+      bb_dom = get_immediate_dominator (CDI_DOMINATORS, bb);
+      if (bb == bb2)
+        bb_dom = bb2_dom;
+      else if (bb_dom == bb1 || bb_dom == bb2)
+        bb_dom = bb2;
+      else
+        {
+          /* Count the predecessors of bb (other than bb1 or bb2), not
+             dominated by bb.  If there are none, merging bb1 and bb2 will
+             mean that bb2 dominates bb.  */
+          int not_dominated = 0;
+          for (i = 0; i < EDGE_COUNT (bb->preds); ++i)
+            {
+              pred_edge = EDGE_PRED (bb, i);
+              pred = pred_edge->src;
+              if (pred == bb1 || pred == bb2)
+                continue;
+              if (dominated_by_p (CDI_DOMINATORS, pred, bb))
+                continue;
+              not_dominated++;
+            }
+          if (not_dominated == 0)
+            bb_dom = bb2;
+        }
+    }
+
+  /* Redirect the incoming edges of bb1 to bb2.  */
+  for (i = EDGE_COUNT (bb1->preds); i > 0; --i)
+    {
+      pred_edge = EDGE_PRED (bb1, i - 1);
+      pred = pred_edge->src;
+      pred_edge = redirect_edge_and_branch (pred_edge, bb2);
+      gcc_assert (pred_edge != NULL);
+      /* The set of successors of pred has changed.  */
+      bitmap_set_bit (cfgcleanup_altered_bbs, pred->index);
+    }
+
+  /* The set of predecessors has changed for both bb and bb2.  */
+  bitmap_set_bit (cfgcleanup_altered_bbs, bb->index);
+  bitmap_set_bit (cfgcleanup_altered_bbs, bb2->index);
+
+  /* bb1 has no incoming edges anymore, and has become unreachable.  */
+  delete_basic_block (bb1);
+  bitmap_clear_bit (cfgcleanup_altered_bbs, bb1->index);
+
+  /* Update dominator info.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    {
+      /* Note: update order is relevant.  */
+      set_immediate_dominator (CDI_DOMINATORS, bb2, bb2_dom);
+      if (bb != bb2)
+        set_immediate_dominator (CDI_DOMINATORS, bb, bb_dom);
+      verify_dominators (CDI_DOMINATORS);
+    }
+
+  /* Reset invalidated debug statements.  */
+  update_debug_stmts (bb2);
+
+  return true;
+}
+
+/* Returns whether for all phis in E1->dest the phi alternatives for E1 and
+   E2 are either:
+   - equal, or
+   - defined locally in E1->src and E2->src.
+   In the latter case, register the alternatives in *PHI_EQUIV.  */
+
+static bool
+same_or_local_phi_alternatives (equiv_t *phi_equiv, edge e1, edge e2)
+{
+  int n1 = e1->dest_idx;
+  int n2 = e2->dest_idx;
+  gimple_stmt_iterator gsi;
+  basic_block dest = e1->dest;
+  gcc_assert (dest == e2->dest);
+
+  for (gsi = gsi_start_phis (dest); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple phi = gsi_stmt (gsi);
+      tree val1 = gimple_phi_arg_def (phi, n1);
+      tree val2 = gimple_phi_arg_def (phi, n2);
+
+      gcc_assert (val1 != NULL_TREE);
+      gcc_assert (val2 != NULL_TREE);
+
+      if (operand_equal_for_phi_arg_p (val1, val2))
+        continue;
+
+      if (!equiv_insert (phi_equiv, val1, val2, e1->src, e2->src))
+        return false;
+    }
+
+  return true;
+}
+
+/* Detect duplicate predecessor blocks of BB and clean them up.  Return true
+   if any changes were made.  */
+
+static bool
+cleanup_duplicate_preds (basic_block bb)
+{
+  edge e1, e2, e1_swapped, e2_swapped;
+  unsigned int i, j, n;
+  equiv_t phi_equiv;
+  bool changed;
+
+  if (optimize < 2)
+    return false;
+
+  n = EDGE_COUNT (bb->preds);
+
+  for (i = 0; i < n; ++i)
+    {
+      e1 = EDGE_PRED (bb, i);
+      if (e1->flags & EDGE_COMPLEX)
+        continue;
+      for (j = i + 1; j < n; ++j)
+        {
+          e2 = EDGE_PRED (bb, j);
+          if (e2->flags & EDGE_COMPLEX)
+            continue;
+
+          /* Block e1->src might be deleted.  If bb and e1->src are the same
+             block, delete e2->src instead, by swapping e1 and e2.  */
+          e1_swapped = (bb == e1->src) ? e2 : e1;
+          e2_swapped = (bb == e1->src) ? e1 : e2;
+
+          /* For all phis in bb, the phi alternatives for e1 and e2 need to
+             have the same value.  */
+          equiv_init (&phi_equiv);
+          if (same_or_local_phi_alternatives (&phi_equiv, e1_swapped,
+                                              e2_swapped))
+            /* Collapse e1->src and e2->src if they are duplicates.  */
+            changed = cleanup_duplicate_preds_1 (phi_equiv, e1_swapped,
+                                                 e2_swapped);
+          else
+            changed = false;
+
+          equiv_delete (&phi_equiv);
+
+          if (changed)
+            return true;
+        }
+    }
+
+  return false;
+}
+
 /* Tries to cleanup cfg in basic block BB.  Returns true if anything
    changes.  */
 
@@ -668,6 +1210,9 @@ cleanup_tree_cfg_bb (basic_block bb)
       return true;
     }
 
+  if (cleanup_duplicate_preds (bb))
+    return true;
+
   return retval;
 }
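
[ For illustration: the kind of CFG the new cleanup_duplicate_preds targets.
  A minimal sketch, written for this note rather than taken from the patch or
  its testsuite -- 'foo' and 'bar' are made-up names: ]

int bar (int);

int
foo (int c, int b, int d)
{
  int r;

  /* Both arms compute the same statement sequence, so the two predecessor
     blocks of the join block are duplicates.  The phi for R at the join has
     alternatives defined locally in each predecessor; they are registered as
     equivalent defs (same_or_local_phi_alternatives), the blocks compare
     equal (bb_gimple_equal_p), and one block is removed after its incoming
     edges are redirected to the other (cleanup_duplicate_preds_1).  */
  if (c)
    r = b + d;
  else
    r = b + d;

  return bar (r);
}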