Fix PR91482
diff mbox series

Message ID alpine.LSU.2.20.1908201403150.32458@zhemvz.fhfr.qr
State New
Headers show
Series
  • Fix PR91482
Related show

Commit Message

Richard Biener Aug. 20, 2019, 12:16 p.m. UTC
Excessive use of __builtin_assume_aligned can cause missed optimizations
because those calls are propagation barriers.  The following removes
those that are redundant and provide no extra information, on the
testcase allowng store-merging to apply.

Since the bit lattice and the const/copy lattice are merged
we cannot track this during CCP propagation (make a copy out
of the redundant call and propagate that out).  Thus I apply
it in the CCP specific folding routine called during
substitute_and_fold only.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-08-20  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/91482
	* tree-ssa-ccp.c (ccp_folder::fold_stmt): Remove useless
	BUILT_IN_ASSUME_ALIGNED calls.

	* gcc.dg/tree-ssa/pr91482.c: New testcase.

Comments

Richard Biener Aug. 21, 2019, 11:44 a.m. UTC | #1
On Tue, 20 Aug 2019, Richard Biener wrote:

> 
> Excessive use of __builtin_assume_aligned can cause missed optimizations
> because those calls are propagation barriers.  The following removes
> those that are redundant and provide no extra information, on the
> testcase allowng store-merging to apply.
> 
> Since the bit lattice and the const/copy lattice are merged
> we cannot track this during CCP propagation (make a copy out
> of the redundant call and propagate that out).  Thus I apply
> it in the CCP specific folding routine called during
> substitute_and_fold only.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

The following is what I have applied.

Richard.

2019-08-21  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/91482
	* tree-ssa-ccp.c (ccp_folder::fold_stmt): Remove useless
	BUILT_IN_ASSUME_ALIGNED calls.

	* gcc.dg/tree-ssa/pr91482.c: New testcase.

Index: gcc/tree-ssa-ccp.c
===================================================================
--- gcc/tree-ssa-ccp.c	(revision 274750)
+++ gcc/tree-ssa-ccp.c	(working copy)
@@ -2315,6 +2315,32 @@ ccp_folder::fold_stmt (gimple_stmt_itera
 	      }
           }
 
+	/* If there's no extra info from an assume_aligned call,
+	   drop it so it doesn't act as otherwise useless dataflow
+	   barrier.  */
+	if (gimple_call_builtin_p (stmt, BUILT_IN_ASSUME_ALIGNED))
+	  {
+	    tree ptr = gimple_call_arg (stmt, 0);
+	    ccp_prop_value_t ptrval = get_value_for_expr (ptr, true);
+	    if (ptrval.lattice_val == CONSTANT
+		&& TREE_CODE (ptrval.value) == INTEGER_CST
+		&& ptrval.mask != 0)
+	      {
+		ccp_prop_value_t val
+		  = bit_value_assume_aligned (stmt, NULL_TREE, ptrval, false);
+		unsigned int ptralign = least_bit_hwi (ptrval.mask.to_uhwi ());
+		unsigned int align = least_bit_hwi (val.mask.to_uhwi ());
+		if (ptralign == align
+		    && ((TREE_INT_CST_LOW (ptrval.value) & (align - 1))
+			== (TREE_INT_CST_LOW (val.value) & (align - 1))))
+		  {
+		    bool res = update_call_from_tree (gsi, ptr);
+		    gcc_assert (res);
+		    return true;
+		  }
+	      }
+	  }
+
 	/* Propagate into the call arguments.  Compared to replace_uses_in
 	   this can use the argument slot types for type verification
 	   instead of the current argument type.  We also can safely
Index: gcc/testsuite/gcc.dg/tree-ssa/pr91482.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr91482.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr91482.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1 -fdump-tree-store-merging" } */
+
+void write64 (void *p)
+{
+  unsigned *p1 = (unsigned *) __builtin_assume_aligned (p, 8);
+  *p1++ = 0;
+  unsigned *p2 = (unsigned *) __builtin_assume_aligned (p1, 4);
+  *p2++ = 1;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_assume_aligned" 1 "ccp1" } } */
+/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 2 stores" "store-merging" { target lp64 } } } */

Patch
diff mbox series

Index: gcc/tree-ssa-ccp.c
===================================================================
--- gcc/tree-ssa-ccp.c	(revision 274670)
+++ gcc/tree-ssa-ccp.c	(working copy)
@@ -2315,6 +2315,27 @@  ccp_folder::fold_stmt (gimple_stmt_itera
 	      }
           }
 
+	/* If there's no extra info from an assume_aligned call,
+	   drop it so it doesn't act as otherwise useless dataflow
+	   barrier.  */
+	if (gimple_call_builtin_p (stmt, BUILT_IN_ASSUME_ALIGNED))
+	  {
+	    tree ptr = gimple_call_arg (stmt, 0);
+	    ccp_prop_value_t ptrval = get_value_for_expr (ptr, true);
+	    ccp_prop_value_t val = bit_value_assume_aligned (stmt, NULL_TREE,
+							     ptrval, false);
+	    unsigned int ptralign = least_bit_hwi (ptrval.mask.to_uhwi ());
+	    unsigned int align = least_bit_hwi (val.mask.to_uhwi ());
+	    if (ptralign == align
+		&& ((TREE_INT_CST_LOW (ptrval.value) & (align - 1))
+		    == (TREE_INT_CST_LOW (val.value) & (align - 1))))
+	      {
+		bool res = update_call_from_tree (gsi, ptr);
+		gcc_assert (res);
+		return true;
+	      }
+	  }
+
 	/* Propagate into the call arguments.  Compared to replace_uses_in
 	   this can use the argument slot types for type verification
 	   instead of the current argument type.  We also can safely
Index: gcc/testsuite/gcc.dg/tree-ssa/pr91482.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr91482.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr91482.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1 -fdump-tree-store-merging" } */
+
+void write64 (void *p)
+{
+  unsigned *p1 = (unsigned *) __builtin_assume_aligned (p, 8);
+  *p1++ = 0;
+  unsigned *p2 = (unsigned *) __builtin_assume_aligned (p1, 4);
+  *p2++ = 1;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_assume_aligned" 1 "ccp1" } } */
+/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 2 stores" "store-merging" { target lp64 } } } */