diff mbox

Prevent cselib substitution of FP, SP, SFP

Message ID 4F048AA2.30805@arm.com
State New
Headers show

Commit Message

Marcus Shawcroft Jan. 4, 2012, 5:21 p.m. UTC
Alias analysis by DSE based on CSELIB expansion assumes that
references to the stack frame from different base registers (ie FP, SP)
never alias.

The comment block in cselib explains that cselib does not allow
substitution of FP, SP or SFP specifically in order not to break DSE.

However, the logic implemented in CSELIB uses two mutually recursive
functions to perform cse expansion, cselib_expand_value_rtx_1() and
expand_loc(). The first explicitly checks and rejects an attempt to
expand FP, SFP, SP, otherwise it calls expand_loc() to choose and
expansion for a value from the list of available expansions.

expand_loc() does not implement the equivalent logic for
SFP, FP and SP, rather it always prefers the first available value
expansion, falling back to a simple regno expansion.

Therefore given a value that can be expanded to either FP or another
value, expand_loc() will always reject FP in favour of the available
value expansion.

This patch ensures that expand_loc() prefers to substitute for a frame
register in order to preserve the behaviour expected by DSE.

This issue was found while developing the ARM aarch64 backend where
gcc.c-torture/execute/vector-2.c is mis-compiled with 
-fno-omit-frame-pointer. The issue has not been observed on another 
target. The patch has been regressed on x86 (and aarch64).

Patch attached, proposed ChangeLog:

2012-01-04  Marcus Shawcroft  <marcus.shawcroft@arm.com>

	* cselib.c (expand_loc): Prefer a frame register
	substitution.


/Marcus
diff mbox

Patch

diff --git a/gcc/cselib.c b/gcc/cselib.c
index fc86ef1..7159ca4 100644
--- a/gcc/cselib.c
+++ b/gcc/cselib.c
@@ -1331,11 +1331,32 @@  cselib_lookup_mem (rtx x, int create)
   return mem_elt;
 }
 
-/* Search thru the possible substitutions in P.  We prefer a non reg
-   substitution because this allows us to expand the tree further.  If
-   we find, just a reg, take the lowest regno.  There may be several
-   non-reg results, we just take the first one because they will all
-   expand to the same place.  */
+/* Search through the possible substitutions in P.
+
+   We prefer in order:
+   1) STACK_POINTER_REGNUM, FRAME_POINTER or the HARD_FRAME_POINTER.
+   2) A non reg substitution because this allows us to expand the tree further.
+   3) The lowest regno.
+
+   We are not willing to substitute the STACK_POINTER_REGNUM,
+   FRAME_POINTER or the HARD_FRAME_POINTER. This is requirement of
+   DSE.
+
+   These expansions confuse the code that notices that stores
+   into the frame go dead at the end of the function and that the
+   frame is not affected by calls to subroutines.
+
+   If STACK_POINTER_REGNUM substitution is allowed, DSE will think
+   that parameter pushing also goes dead which is wrong.
+
+   If FRAME_POINTER or HARD_FRAME_POINTER substitution is allowed then
+   you lose the opportunity to make the frame assumptions.
+
+   If other potential uses need to substitute these frame registers we
+   should add a parameter to control this behavior.
+
+   There may be several non register results, we just take the first
+   one because they will all expand to the same place.  */
 
 static rtx
 expand_loc (struct elt_loc_list *p, struct expand_value_data *evd,
@@ -1345,7 +1366,24 @@  expand_loc (struct elt_loc_list *p, struct expand_value_data *evd,
   unsigned int regno = UINT_MAX;
   struct elt_loc_list *p_in = p;
 
-  for (; p; p = p -> next)
+  /* Prefer a frame related register over any other option.  */
+  for (p = p_in; p; p = p -> next)
+    {
+      if ((REG_P (p->loc))
+	  && (REGNO (p->loc) < regno)
+	  && !bitmap_bit_p (evd->regs_active, REGNO (p->loc)))
+	{
+	  unsigned int candidate_regno = REGNO (p->loc);
+	  if (candidate_regno == STACK_POINTER_REGNUM
+	      || candidate_regno == FRAME_POINTER_REGNUM
+	      || candidate_regno == HARD_FRAME_POINTER_REGNUM)
+	    {
+	      return p->loc;
+	    }
+	}
+    }
+
+  for (p = p_in; p; p = p -> next)
     {
       /* Avoid infinite recursion trying to expand a reg into a
 	 the same reg.  */