[PR43814] Assume function arguments of pointer type are aligned.

Submitter Tom de Vries
Date Sept. 28, 2011, 10:21 p.m.
Message ID <4E839DE5.4070709@mentor.com>
Permalink /patch/116861/
State New

Comments

Tom de Vries - Sept. 28, 2011, 10:21 p.m.
On 09/24/2011 11:31 AM, Richard Guenther wrote:
> On Tue, Sep 20, 2011 at 1:13 PM, Tom de Vries <vries@codesourcery.com> wrote:
>> Hi Richard,
>>
>> I have a patch for PR43814. It introduces an option that assumes that function
>> arguments of pointer type are aligned, and uses that information in
>> tree-ssa-ccp. This enables the memcpy in pr43814-2.c to be inlined.
>>
>> I tested the patch successfully on-by-default on x86_64 and i686 (both gcc only
>> builds).
>>
>> I also tested the patch on-by-default for ARM (gcc/glibc build). The patch
>> generated wrong code for uselocale.c:
>> ...
>> glibc/locale/locale.h:
>> ...
>> /* This value can be passed to `uselocale' and may be returned by
>>   it. Passing this value to any other function has undefined behavior.  */
>> # define LC_GLOBAL_LOCALE       ((__locale_t) -1L)
>> ...
>> glibc/locale/uselocale.c:
>> ...
>> locale_t
>> __uselocale (locale_t newloc)
>> {
>>  locale_t oldloc = _NL_CURRENT_LOCALE;
>>
>>  if (newloc != NULL)
>>    {
>>      const locale_t locobj
>>        = newloc == LC_GLOBAL_LOCALE ? &_nl_global_locale : newloc;
>>
>> ...
>> The assumption that function arguments of pointer type are aligned allowed the
>> test 'newloc == LC_GLOBAL_LOCALE' to evaluate to false.
>> But the use of ((__locale_t) -1L) as a function argument in uselocale violates
>> that assumption.
>>
>> Fixing the definition of LC_GLOBAL_LOCALE allowed the gcc tests to run without
>> regressions for ARM.
>>
>> Furthermore, the patch fixes ipa-sra-2.c and ipa-sra-6.c regressions on ARM,
>> discussed here:
>> - http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00930.html
>> - http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00459.html
>>
>> But, since glibc uses this construct currently, the option is off-by-default for
>> now.
>>
>> OK for trunk?
> 
> No, I don't like to have an option to control this.  And given the issue
> you spotted it doesn't look like the best idea either.  This area in GCC
> is simply too fragile right now :/
> 
> It would be nice if we could extend IPA-CP to propagate alignment
> information though.
> 
> And at some point devise a reliable way for frontends to communicate
> alignment constraints the middle-end can rely on (well, yes, you could
> argue that's what TYPE_ALIGN is about, and I thought that maybe we
> can unconditionally rely on TYPE_ALIGN for pointer PARM_DECLs
> at least - your example shows we can't).
> 
> In the end I'd probably say the patch is ok without the option (thus
> turned on by default), but if LC_GLOBAL_LOCALE is part of the
> glibc ABI then we clearly can't do this.
> 

I removed the flag, and instead enabled the optimization only for pointers whose
type carries the aligned attribute.

Bootstrapped and tested on x86_64 (gcc-only build), and built and reg-tested on
ARM (gcc + glibc build); no issues found.

OK for trunk?

I intend to send a follow-up patch that introduces a target hook
function_pointers_aligned, false by default and true by default for -mandroid.
I have asked Google to test their Android codebase with the optimization on by
default. Would such a target hook be acceptable?

> Richard.
> 

Thanks,
- Tom

2011-09-29  Tom de Vries  <tom@codesourcery.com>

	PR target/43814
	* tree-ssa-ccp.c (get_align_value): New function, factored out of
	get_value_from_alignment.
	(get_value_from_alignment): Use get_align_value.
	(get_value_for_expr): Use get_align_value to handle alignment of
	function argument pointers with TYPE_USER_ALIGN.

	* gcc/testsuite/gcc.dg/pr43814.c: New test.
	* gcc/testsuite/gcc.target/arm/pr43814-2.c: New test.

Patch

Index: gcc/tree-cfgcleanup.c
===================================================================
--- gcc/tree-cfgcleanup.c	(revision 173703)
+++ gcc/tree-cfgcleanup.c	(working copy)
@@ -641,6 +641,552 @@  cleanup_omp_return (basic_block bb)
   return true;
 }
 
+/* Returns true if S contains (I1, I2).  */
+
+static bool
+int_int_splay_lookup (splay_tree s, unsigned int i1, unsigned int i2)
+{
+  splay_tree_node node;
+
+  if (s == NULL)
+    return false;
+
+  node = splay_tree_lookup (s, i1);
+  return node && node->value == i2;
+}
+
+/* Attempts to insert (I1, I2) into *S.  Returns true if successful.
+   Allocates *S if necessary.  */
+
+static bool
+int_int_splay_insert (splay_tree *s, unsigned int i1, unsigned int i2)
+{
+  if (*s != NULL)
+    {
+      /* Check for existing element, which would otherwise be silently
+	 overwritten by splay_tree_insert.  */
+      if (splay_tree_lookup (*s, i1))
+	return false;
+    }
+  else
+    *s = splay_tree_new (splay_tree_compare_ints, 0, 0);
+
+  splay_tree_insert (*s, i1, i2);
+  return true;
+}
+
+/* Returns 0 if (NODE->key, NODE->value) is an element of S.  Otherwise,
+   returns 1.  */
+
+static int
+int_int_splay_node_contained_in (splay_tree_node node, void *s)
+{
+  splay_tree_node snode = splay_tree_lookup ((splay_tree)s, node->key);
+  return (!snode || node->value != snode->value) ? 1 : 0;
+}
+
+/* Returns true if all elements of S1 are also in S2.  */
+
+static bool
+int_int_splay_contained_in (splay_tree s1, splay_tree s2)
+{
+  if (s1 == NULL)
+    return true;
+  if (s2 == NULL)
+    return false;
+  return splay_tree_foreach (s1, int_int_splay_node_contained_in, s2) == 0;
+}
+
+typedef splay_tree equiv_t;
+
+/* Returns true if EQUIV contains (SSA_NAME_VERSION (VAL1),
+                                   SSA_NAME_VERSION (VAL2)).  */
+
+static bool
+equiv_lookup (equiv_t equiv, tree val1, tree val2)
+{
+  if (val1 == NULL_TREE || val2 == NULL_TREE
+      || TREE_CODE (val1) != SSA_NAME || TREE_CODE (val2) != SSA_NAME)
+    return false;
+
+  return int_int_splay_lookup (equiv, SSA_NAME_VERSION (val1),
+			       SSA_NAME_VERSION (val2));
+}
+
+/* Attempts to insert (SSA_NAME_VERSION (VAL1), SSA_NAME_VERSION (VAL2)) into
+   EQUIV, provided they are defined in BB1 and BB2.  Returns true if successful.
+   Allocates *EQUIV if necessary.  */
+
+static bool
+equiv_insert (equiv_t *equiv, tree val1, tree val2,
+	      basic_block bb1, basic_block bb2)
+{
+  if (val1 == NULL_TREE || val2 == NULL_TREE
+      || TREE_CODE (val1) != SSA_NAME || TREE_CODE (val2) != SSA_NAME
+      || gimple_bb (SSA_NAME_DEF_STMT (val1)) != bb1
+      || gimple_bb (SSA_NAME_DEF_STMT (val2)) != bb2)
+    return false;
+
+  return int_int_splay_insert (equiv, SSA_NAME_VERSION (val1),
+			       SSA_NAME_VERSION (val2));
+}
+
+/* Returns true if all elements of S1 are also in S2.  */
+
+static bool
+equiv_contained_in (equiv_t s1, equiv_t s2)
+{
+  return int_int_splay_contained_in (s1, s2);
+}
+
+/* Init equiv_t *S.  */
+
+static void
+equiv_init (equiv_t *s)
+{
+  *s = NULL;
+}
+
+/* Delete equiv_t *S and reinit.  */
+
+static void
+equiv_delete (equiv_t *s)
+{
+  if (!*s)
+    return;
+
+  splay_tree_delete (*s);
+  *s = NULL;
+}
+
+/* Check whether S1 and S2 are equal, considering the fields in
+   gimple_statement_base.  Ignores fields uid, location, bb, and block.  */
+
+static bool
+gimple_base_equal_p (gimple s1, gimple s2)
+{
+  if (gimple_code (s1) != gimple_code (s2))
+    return false;
+
+  if (gimple_no_warning_p (s1) != gimple_no_warning_p (s2))
+    return false;
+
+  /* For pass-local flags visited and plf we would like to be more aggressive.
+     But that means we must have a way to find out whether the flags are
+     currently in use or not.  */
+  if (gimple_visited_p (s1) != gimple_visited_p (s2))
+    return false;
+
+  if (is_gimple_assign (s1)
+      && (gimple_assign_nontemporal_move_p (s1)
+          != gimple_assign_nontemporal_move_p (s2)))
+    return false;
+
+  if (gimple_plf (s1, GF_PLF_1) != gimple_plf (s2, GF_PLF_1))
+    return false;
+
+  if (gimple_plf (s1, GF_PLF_2) != gimple_plf (s2, GF_PLF_2))
+    return false;
+
+  /* The modified field is set when allocating, but only reset for the first
+     time once ssa_operands_active.  So before ssa_operands_active, the field
+     signals that the ssa operands have not been scanned, and after
+     ssa_operands_active it signals that the ssa operands might be invalid.
+     We check here only for the latter case.  */
+  if (ssa_operands_active ()
+      && (gimple_modified_p (s1) || gimple_modified_p (s2)))
+    return false;
+
+  if (gimple_has_volatile_ops (s1) != gimple_has_volatile_ops (s2))
+    return false;
+
+  if (s1->gsbase.subcode != s2->gsbase.subcode)
+    return false;
+
+  if (gimple_num_ops (s1) != gimple_num_ops (s2))
+    return false;
+
+  return true;
+}
+
+/* Return true if P1 and P2 can be considered equal.  */
+
+static bool
+pt_solution_equal_p (struct pt_solution *p1, struct pt_solution *p2)
+{
+  if (pt_solution_empty_p (p1) != pt_solution_empty_p (p2))
+    return false;
+  if (pt_solution_empty_p (p1))
+    return true;
+
+  /* TODO: make this less conservative.  */
+  return (p1->anything && p2->anything);
+}
+
+/* Return true if gimple statements S1 and S2 are equal.  EQUIV contains pairs
+   of local defs that can be considered equivalent at entry; if the statements
+   are equal, at exit it also contains the def and vdef pairs of S1 and S2.  */
+
+static bool
+gimple_equal_p (equiv_t *equiv, gimple s1, gimple s2)
+{
+  unsigned int i;
+  enum gimple_statement_structure_enum gss;
+  tree lhs1, lhs2;
+  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
+
+  /* Handle omp gimples conservatively.  */
+  if (is_gimple_omp (s1) || is_gimple_omp (s2))
+    return false;
+
+  if (!gimple_base_equal_p (s1, s2))
+    return false;
+
+  gss = gimple_statement_structure (s1);
+  switch (gss)
+    {
+    case GSS_CALL:
+      if (!pt_solution_equal_p (gimple_call_use_set (s1),
+				gimple_call_use_set (s2))
+	  || !pt_solution_equal_p (gimple_call_clobber_set (s1),
+				   gimple_call_clobber_set (s2))
+	  || !gimple_call_same_target_p (s1, s2))
+	return false;
+      /* Fallthru.  */
+
+    case GSS_WITH_MEM_OPS_BASE:
+    case GSS_WITH_MEM_OPS:
+      {
+	tree vdef1 = gimple_vdef (s1), vdef2 = gimple_vdef (s2);
+	tree vuse1 = gimple_vuse (s1), vuse2 = gimple_vuse (s2);
+	if (vuse1 != vuse2 && !equiv_lookup (*equiv, vuse1, vuse2))
+	  return false;
+	if (vdef1 != vdef2 && !equiv_insert (equiv, vdef1, vdef2, bb1, bb2))
+	  return false;
+      }
+      /* Fallthru.  */
+
+    case GSS_WITH_OPS:
+      /* Ignore gimple_def_ops and gimple_use_ops.  They are duplicates of
+	 gimple_vdef, gimple_vuse and gimple_ops, and checked elsewhere.  */
+      /* Fallthru.  */
+
+    case GSS_BASE:
+      break;
+
+    default:
+      return false;
+    }
+
+  /* Find lhs.  */
+  lhs1 = gimple_get_lhs (s1);
+  lhs2 = gimple_get_lhs (s2);
+
+  /* Handle ops.  */
+  for (i = 0; i < gimple_num_ops (s1); ++i)
+    {
+      tree t1 = gimple_op (s1, i);
+      tree t2 = gimple_op (s2, i);
+      if (t1 == NULL_TREE && t2 == NULL_TREE)
+        continue;
+      if (t1 == NULL_TREE || t2 == NULL_TREE)
+        return false;
+      /* Skip lhs.  */
+      if (lhs1 == t1 && i == 0)
+	continue;
+      if (!operand_equal_p (t1, t2, 0) && !equiv_lookup (*equiv, t1, t2))
+	return false;
+    }
+
+  /* Handle lhs.  */
+  if (lhs1 != lhs2 && !equiv_insert (equiv, lhs1, lhs2, bb1, bb2))
+    return false;
+
+  return true;
+}
+
+/* Return true if BB1 and BB2 contain the same non-debug gimple statements, and
+   if the def pairs in PHI_EQUIV are found to be equivalent defs in BB1 and
+   BB2.  */
+
+static bool
+bb_gimple_equal_p (equiv_t phi_equiv, basic_block bb1, basic_block bb2)
+{
+  gimple_stmt_iterator gsi1 = gsi_start_nondebug_bb (bb1);
+  gimple_stmt_iterator gsi2 = gsi_start_nondebug_bb (bb2);
+  bool end1, end2;
+  equiv_t equiv;
+  bool equal = true;
+
+  end1 = gsi_end_p (gsi1);
+  end2 = gsi_end_p (gsi2);
+
+  /* Don't handle empty blocks, these are handled elsewhere in the cleanup.  */
+  if (end1 || end2)
+    return false;
+
+  /* TODO: handle blocks with phi-nodes.  We'll have to find corresponding
+     phi-nodes in bb1 and bb2, with the same alternatives for the same
+     preds.  */
+  if (phi_nodes (bb1) != NULL || phi_nodes (bb2) != NULL)
+    return false;
+
+  equiv_init (&equiv);
+  while (true)
+    {
+      if (end1 && end2)
+        break;
+      if (end1 || end2
+	  || !gimple_equal_p (&equiv, gsi_stmt (gsi1), gsi_stmt (gsi2)))
+	{
+	  equal = false;
+	  break;
+	}
+
+      gsi_next_nondebug (&gsi1);
+      gsi_next_nondebug (&gsi2);
+      end1 = gsi_end_p (gsi1);
+      end2 = gsi_end_p (gsi2);
+    }
+
+  /* equiv now contains all bb1,bb2 def pairs which are equivalent.
+     Check if the phi alternatives are indeed equivalent.  */
+  equal = equal && equiv_contained_in (phi_equiv, equiv);
+
+  equiv_delete (&equiv);
+
+  return equal;
+}
+
+/* Resets all debug statements in BBUSE that have uses that are not
+   dominated by their defs.  */
+
+static void
+update_debug_stmts (basic_block bbuse)
+{
+  use_operand_p use_p;
+  ssa_op_iter oi;
+  basic_block bbdef;
+  gimple stmt, def_stmt;
+  gimple_stmt_iterator gsi;
+  tree name;
+
+  for (gsi = gsi_start_bb (bbuse); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      stmt = gsi_stmt (gsi);
+      if (!is_gimple_debug (stmt))
+	continue;
+      gcc_assert (gimple_debug_bind_p (stmt));
+
+      gcc_assert (dom_info_available_p (CDI_DOMINATORS));
+
+      FOR_EACH_PHI_OR_STMT_USE (use_p, stmt, oi, SSA_OP_USE)
+	{
+	  name = USE_FROM_PTR (use_p);
+	  gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+	  def_stmt = SSA_NAME_DEF_STMT (name);
+	  gcc_assert (def_stmt != NULL);
+
+	  bbdef = gimple_bb (def_stmt);
+	  if (bbdef == NULL || bbuse == bbdef
+	      || dominated_by_p (CDI_DOMINATORS, bbuse, bbdef))
+	    continue;
+
+	  gimple_debug_bind_reset_value (stmt);
+	  update_stmt (stmt);
+	}
+    }
+}
+
+/* E1 and E2 have a common dest.  Detect if E1->src and E2->src are duplicates,
+   and if so, redirect the predecessor edges of E1->src to E2->src and remove
+   E1->src.  Returns true if any changes were made.  */
+
+static bool
+cleanup_duplicate_preds_1 (equiv_t phi_equiv, edge e1, edge e2)
+{
+  edge pred_edge;
+  basic_block bb1, bb2, pred;
+  basic_block bb_dom = NULL, bb2_dom = NULL;
+  unsigned int i;
+  basic_block bb = e1->dest;
+  gcc_assert (bb == e2->dest);
+
+  if (e1->flags != e2->flags)
+    return false;
+
+  bb1 = e1->src;
+  bb2 = e2->src;
+
+  /* TODO: We could allow multiple successor edges here, as long as bb1 and bb2
+     have the same successors.  */
+  if (EDGE_COUNT (bb1->succs) != 1 || EDGE_COUNT (bb2->succs) != 1)
+    return false;
+
+  if (!bb_gimple_equal_p (phi_equiv, bb1, bb2))
+    return false;
+
+  if (dump_file)
+    fprintf (dump_file, "cleanup_duplicate_preds: "
+             "cleaning up <bb %d>, duplicate of <bb %d>\n", bb1->index,
+             bb2->index);
+
+  /* Calculate the changes to be made to the dominator info.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    {
+      /* Calculate bb2_dom.  */
+      bb2_dom = nearest_common_dominator (CDI_DOMINATORS, bb2, bb1);
+      if (bb2_dom == bb1 || bb2_dom == bb2)
+        bb2_dom = get_immediate_dominator (CDI_DOMINATORS, bb2_dom);
+
+      /* Calculate bb_dom.  */
+      bb_dom = get_immediate_dominator (CDI_DOMINATORS, bb);
+      if (bb == bb2)
+        bb_dom = bb2_dom;
+      else if (bb_dom == bb1 || bb_dom == bb2)
+        bb_dom = bb2;
+      else
+        {
+          /* Count the predecessors of bb (other than bb1 or bb2) that are not
+             dominated by bb.  If there are none, merging bb1 and bb2 will mean
+             that bb2 dominates bb.  */
+          int not_dominated = 0;
+          for (i = 0; i < EDGE_COUNT (bb->preds); ++i)
+            {
+              pred_edge = EDGE_PRED (bb, i);
+              pred = pred_edge->src;
+              if (pred == bb1 || pred == bb2)
+                continue;
+              if (dominated_by_p (CDI_DOMINATORS, pred, bb))
+                continue;
+              not_dominated++;
+            }
+          if (not_dominated == 0)
+            bb_dom = bb2;
+        }
+    }
+
+  /* Redirect the incoming edges of bb1 to bb2.  */
+  for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i)
+    {
+      pred_edge = EDGE_PRED (bb1, i - 1);
+      pred = pred_edge->src;
+      pred_edge = redirect_edge_and_branch (pred_edge, bb2);
+      gcc_assert (pred_edge != NULL);
+      /* The set of successors of pred have changed.  */
+      bitmap_set_bit (cfgcleanup_altered_bbs, pred->index);
+    }
+
+  /* The set of predecessors has changed for both bb and bb2.  */
+  bitmap_set_bit (cfgcleanup_altered_bbs, bb->index);
+  bitmap_set_bit (cfgcleanup_altered_bbs, bb2->index);
+
+  /* bb1 has no incoming edges anymore, and has become unreachable.  */
+  delete_basic_block (bb1);
+  bitmap_clear_bit (cfgcleanup_altered_bbs, bb1->index);
+
+  /* Update dominator info.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    {
+      /* Note: update order is relevant.  */
+      set_immediate_dominator (CDI_DOMINATORS, bb2, bb2_dom);
+      if (bb != bb2)
+	set_immediate_dominator (CDI_DOMINATORS, bb, bb_dom);
+      verify_dominators (CDI_DOMINATORS);
+    }
+
+  /* Reset invalidated debug statements.  */
+  update_debug_stmts (bb2);
+
+  return true;
+}
+
+/* Returns whether for all phis in E1->dest the phi alternatives for E1 and
+   E2 are either:
+   - equal, or
+   - defined locally in E1->src and E2->src.
+   In the latter case, register the alternatives in *PHI_EQUIV.  */
+
+static bool
+same_or_local_phi_alternatives (equiv_t *phi_equiv, edge e1, edge e2)
+{
+  int n1 = e1->dest_idx;
+  int n2 = e2->dest_idx;
+  gimple_stmt_iterator gsi;
+  basic_block dest = e1->dest;
+  gcc_assert (dest == e2->dest);
+
+  for (gsi = gsi_start_phis (dest); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple phi = gsi_stmt (gsi);
+      tree val1 = gimple_phi_arg_def (phi, n1);
+      tree val2 = gimple_phi_arg_def (phi, n2);
+
+      gcc_assert (val1 != NULL_TREE);
+      gcc_assert (val2 != NULL_TREE);
+
+      if (operand_equal_for_phi_arg_p (val1, val2))
+	continue;
+
+      if (!equiv_insert (phi_equiv, val1, val2, e1->src, e2->src))
+	return false;
+    }
+
+  return true;
+}
+
+/* Detect duplicate predecessor blocks of BB and clean them up.  Return true if
+   any changes were made.  */
+
+static bool
+cleanup_duplicate_preds (basic_block bb)
+{
+  edge e1, e2, e1_swapped, e2_swapped;
+  unsigned int i, j, n;
+  equiv_t phi_equiv;
+  bool changed;
+
+  if (optimize < 2)
+    return false;
+
+  n = EDGE_COUNT (bb->preds);
+
+  for (i = 0; i < n; ++i)
+    {
+      e1 = EDGE_PRED (bb, i);
+      if (e1->flags & EDGE_COMPLEX)
+	continue;
+      for (j = i + 1; j < n; ++j)
+        {
+          e2 = EDGE_PRED (bb, j);
+	  if (e2->flags & EDGE_COMPLEX)
+	    continue;
+
+	  /* Block e1->src might be deleted.  If bb and e1->src are the same
+	     block, delete e2->src instead, by swapping e1 and e2.  */
+	  e1_swapped = (bb == e1->src) ? e2 : e1;
+	  e2_swapped = (bb == e1->src) ? e1 : e2;
+
+	  /* For all phis in bb, the phi alternatives for e1 and e2 need to have
+	     the same value.  */
+	  equiv_init (&phi_equiv);
+	  if (same_or_local_phi_alternatives (&phi_equiv, e1_swapped, e2_swapped))
+	    /* Collapse e1->src and e2->src if they are duplicates.  */
+	    changed = cleanup_duplicate_preds_1 (phi_equiv, e1_swapped, e2_swapped);
+	  else
+	    changed = false;
+
+	  equiv_delete (&phi_equiv);
+
+	  if (changed)
+	    return true;
+        }
+    }
+
+  return false;
+}
+
 /* Tries to cleanup cfg in basic block BB.  Returns true if anything
    changes.  */
 
@@ -668,6 +1210,9 @@  cleanup_tree_cfg_bb (basic_block bb)
       return true;
     }
 
+  if (cleanup_duplicate_preds (bb))
+    return true;
+
   return retval;
 }