diff mbox

[RFC] PR69195, Reload confused by invalid reg equivs

Message ID 20160304145430.GD9617@bubble.grove.modra.org
State New
Headers show

Commit Message

Alan Modra March 4, 2016, 2:54 p.m. UTC
This is a fix for two testcases that show reload replacing pseudos
that don't get hard regs, with their equivalent mem initialization,
but failing to address the mem properly.  The short story is that ira
analysis creates reg equivalence info for use by reload, based on
register lifetimes that are invalidated by ira itself deleting dead
insns.

My first attempt to fix this problem was to just run
delete_trivially_dead_insns early in ira.  That's enough to cure the
testcase failures.  However, ira also deletes unreachable blocks (that
are only recognized as such after ira reg equivalence analysis), and
I figure it is possible that deleting those insns may similarly affect
register lifetimes.

So what this patch does is revalidate the reg equivalences after insns
have been deleted, recreating them as if insns had been deleted before
any analysis occurred.  The patch has been bootstrapped and regression
tested on powerpc64le-linux.  Do we go with this approach, or just run
simple delete_trivially_dead_insns early?  Or something else?

gcc/
	PR rtl-optimization/69195
	* ira.c (pdx_subregs): New static, extracted from update_equiv_regs.
	(set_paradoxical_subreg): Delete pdx_subregs param.
	(add_store_equivs): New function, extracted from..
	(update_equiv_regs): ..here.  Don't create and free pdx_subregs or
	reg_equiv.
	(validate_equiv_regs): New function.
	(ira): Create and free pdx_subregs and reg_equiv.  After deleting
	insns, call validate_equiv_regs.  Call setup_reg_equiv later.
	(setup_reg_equiv): Protect against deleted insns.
gcc/testsuite/
	* gcc.dg/pr69195.c: New.
	* gcc.dg/pr69238.c: New.

Comments

Bernd Schmidt March 4, 2016, 5:39 p.m. UTC | #1
On 03/04/2016 03:54 PM, Alan Modra wrote:
> This is a fix for two testcases that show reload replacing pseudos
> that don't get hard regs, with their equivalent mem initialization,
> but failing to address the mem properly.  The short story is that ira
> analysis creates reg equivalence info for use by reload, based on
> register lifetimes that are invalidated by ira itself deleting dead
> insns.
>
> My first attempt to fix this problem was to just run
> delete_trivially_dead_insns early in ira.  That's enough to cure the
> testcase failures.  However, ira also deletes unreachable blocks (that
> are only recognized as such after ira reg equivalence analysis), and
> I figure it is possible that deleting those insns may similarly affect
> register lifetimes.
>
> So what this patch does is revalidate the reg equivalences after insns
> have been deleted, recreating them as if insns had been deleted before
> any analysis occurred.  The patch has been bootstrapped and regression
> tested on powerpc64le-linux.  Do we go with this approach, or just run
> simple delete_trivially_dead_insns early?  Or something else?

I've managed to reproduce this, and I think your analysis of the problem 
is correct. So the patch is probably ok from the point of you of "will 
it work". I can probably be convinced to approve it as-is, but I wonder 
if you'd be prepared to try something else first.

This whole area in IRA is turning into spaghetti a little bit. I would 
prefer that we schedule a new small optimization pass beforehand (either 
as a real pass or just a function call) that does only the 
label-substituting trick from update_equiv_regs, then removes dead code 
as necessary. When we then get to IRA and setting up equiv regs, we 
should no longer get to a point where we need to reverify equivalences 
or update regstat.


Bernd
Alan Modra March 7, 2016, 2:46 p.m. UTC | #2
On Fri, Mar 04, 2016 at 06:39:54PM +0100, Bernd Schmidt wrote:
> I've managed to reproduce this, and I think your analysis of the problem is
> correct. So the patch is probably ok from the point of you of "will it
> work". I can probably be convinced to approve it as-is, but I wonder if
> you'd be prepared to try something else first.
> 
> This whole area in IRA is turning into spaghetti a little bit. I would
> prefer that we schedule a new small optimization pass beforehand (either as
> a real pass or just a function call) that does only the label-substituting
> trick from update_equiv_regs, then removes dead code as necessary. When we
> then get to IRA and setting up equiv regs, we should no longer get to a
> point where we need to reverify equivalences or update regstat.

I had the same thought myself, but wondered what I'd be getting into
so didn't mention the idea.  ;-)  I'll have a go.  (Not sure if I'd
trust me to get it right though, for stage 4.)

Something for later:  It seems to my untrained eye that more than just
the label substitution should be separated out.  All of the latter
half of update_equiv_regs doing global combine and moving insns could
be better separated from the rest of ira.  It is to some extent, by
use of reg_equiv[] vs. ira_reg_equiv[], but shares the REG_EQUIV
notes.  That means the REG_EQUIV notes created by ira need to take
into account ira, lra and reload behaviour and are thus rather more
restrictive than needed for the mini-combine pass.  eg. See the
comment about calls in validate_equiv_mem.

Incidentally, an all-langs bootstrap and regression test of gcc
triggers delete_unreachable_blocks just once for x86_64 and never for
powerpc64le.  The file is testsuite/gfortran.dg/pr46755.f, which of
course isn't that surprising since the pr46755 fix added the
delete_unreachable_blocks call.
diff mbox

Patch

diff --git a/gcc/ira.c b/gcc/ira.c
index 0973258..047e164 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3284,11 +3284,15 @@  no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSED,
     }
 }
 
+/* Use pdx_subregs to show whether a reg is used in a paradoxical
+   subreg.  */
+static  bool *pdx_subregs;
+
 /* Check whether the SUBREG is a paradoxical subreg and set the result
    in PDX_SUBREGS.  */
 
 static void
-set_paradoxical_subreg (rtx_insn *insn, bool *pdx_subregs)
+set_paradoxical_subreg (rtx_insn *insn)
 {
   subrtx_iterator::array_type array;
   FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
@@ -3319,6 +3323,100 @@  adjust_cleared_regs (rtx loc, const_rtx old_rtx ATTRIBUTE_UNUSED, void *data)
   return NULL_RTX;
 }
 
+/* For insns that set a MEM to the contents of a REG that is only used
+   in a single basic block, see if the register is always equivalent
+   to that memory location and if moving the store from INSN to the
+   insn that sets REG is safe.  If so, put a REG_EQUIV note on the
+   initializing insn.
+
+   Don't add a REG_EQUIV note if the insn already has one.  The
+   existing REG_EQUIV is likely more useful than the one we are
+   adding.
+
+   If one of the regs in the address has reg_equiv[REGNO].replace set,
+   then we can't add this REG_EQUIV note.  The reg_equiv[REGNO].replace
+   optimization may move the set of this register immediately before
+   insn, which puts it after reg_equiv[REGNO].init_insns, and hence the
+   mention in the REG_EQUIV note would be to an uninitialized pseudo.  */
+static void
+add_store_equivs (void)
+{
+  for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
+    {
+      if (! INSN_P (insn))
+	continue;
+
+      rtx set = single_set (insn);
+      if (! set)
+	continue;
+
+      rtx dest = SET_DEST (set);
+      rtx src = SET_SRC (set);
+      unsigned int regno;
+      rtx_insn *init_insn;
+
+      if (MEM_P (dest)
+	  && REG_P (src)
+	  && (regno = REGNO (src)) >= FIRST_PSEUDO_REGISTER
+	  && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
+	  && DF_REG_DEF_COUNT (regno) == 1
+	  && ! pdx_subregs[regno]
+	  && reg_equiv[regno].init_insns != 0
+	  && (init_insn = reg_equiv[regno].init_insns->insn ()) != 0
+	  && ! find_reg_note (init_insn, REG_EQUIV, NULL_RTX)
+	  && ! contains_replace_regs (XEXP (dest, 0))
+	  && validate_equiv_mem (init_insn, src, dest)
+	  && ! memref_used_between_p (dest, init_insn, insn)
+	  /* Attaching a REG_EQUIV note will fail if INIT_INSN has
+	     multiple sets.  */
+	  && set_unique_reg_note (init_insn, REG_EQUIV, copy_rtx (dest)))
+	{
+	  /* This insn makes the equivalence, not the one initializing
+	     the register.  */
+	  ira_reg_equiv[regno].init_insns
+	    = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
+	  df_notes_rescan (init_insn);
+	}
+    }
+}
+
+/* After deleting dead insns, check that REG_EQUIV notes are still
+   valid.  */
+static void
+validate_equiv_regs (void)
+{
+  bool done_something = false;
+  unsigned int max = vec_safe_length (reg_equivs);
+
+  for (unsigned int regno = FIRST_PSEUDO_REGISTER; regno < max; regno++)
+    {
+      for (rtx_insn_list *elem = reg_equiv[regno].init_insns;
+	   elem;
+	   elem = elem->next ())
+	{
+	  rtx_insn *insn = elem->insn ();
+	  rtx set, note;
+
+	  if (insn != 0
+	      && ! insn->deleted ()
+	      && (set = single_set (insn)) != 0
+	      && REG_P (SET_DEST (set))
+	      && REGNO (SET_DEST (set)) == regno
+	      && MEM_P (SET_SRC (set))
+	      && (note = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != 0
+	      && rtx_equal_p (SET_SRC (set), XEXP (note, 0))
+	      && ! validate_equiv_mem (insn, SET_DEST (set), SET_SRC (set)))
+	    {
+	      remove_note (insn, note);
+	      done_something = true;
+	    }
+	}
+    }
+
+  if (done_something)
+    add_store_equivs ();
+}
+
 /* Nonzero if we recorded an equivalence for a LABEL_REF.  */
 static int recorded_label_ref;
 
@@ -3341,17 +3439,11 @@  update_equiv_regs (void)
   basic_block bb;
   int loop_depth;
   bitmap cleared_regs;
-  bool *pdx_subregs;
 
   /* We need to keep track of whether or not we recorded a LABEL_REF so
      that we know if the jump optimizer needs to be rerun.  */
   recorded_label_ref = 0;
 
-  /* Use pdx_subregs to show whether a reg is used in a paradoxical
-     subreg.  */
-  pdx_subregs = XCNEWVEC (bool, max_regno);
-
-  reg_equiv = XCNEWVEC (struct equivalence, max_regno);
   grow_reg_equivs ();
 
   init_alias_analysis ();
@@ -3363,7 +3455,7 @@  update_equiv_regs (void)
   FOR_EACH_BB_FN (bb, cfun)
     FOR_BB_INSNS (bb, insn)
       if (NONDEBUG_INSN_P (insn))
-	set_paradoxical_subreg (insn, pdx_subregs);
+	set_paradoxical_subreg (insn);
 
   /* Scan the insns and find which registers have equivalences.  Do this
      in a separate scan of the insns because (due to -fcse-follow-jumps)
@@ -3625,65 +3717,7 @@  update_equiv_regs (void)
 
   /* A second pass, to gather additional equivalences with memory.  This needs
      to be done after we know which registers we are going to replace.  */
-
-  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
-    {
-      rtx set, src, dest;
-      unsigned regno;
-
-      if (! INSN_P (insn))
-	continue;
-
-      set = single_set (insn);
-      if (! set)
-	continue;
-
-      dest = SET_DEST (set);
-      src = SET_SRC (set);
-
-      /* If this sets a MEM to the contents of a REG that is only used
-	 in a single basic block, see if the register is always equivalent
-	 to that memory location and if moving the store from INSN to the
-	 insn that set REG is safe.  If so, put a REG_EQUIV note on the
-	 initializing insn.
-
-	 Don't add a REG_EQUIV note if the insn already has one.  The existing
-	 REG_EQUIV is likely more useful than the one we are adding.
-
-	 If one of the regs in the address has reg_equiv[REGNO].replace set,
-	 then we can't add this REG_EQUIV note.  The reg_equiv[REGNO].replace
-	 optimization may move the set of this register immediately before
-	 insn, which puts it after reg_equiv[REGNO].init_insns, and hence
-	 the mention in the REG_EQUIV note would be to an uninitialized
-	 pseudo.  */
-
-      if (MEM_P (dest) && REG_P (src)
-	  && (regno = REGNO (src)) >= FIRST_PSEUDO_REGISTER
-	  && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
-	  && DF_REG_DEF_COUNT (regno) == 1
-	  && reg_equiv[regno].init_insns != NULL
-	  && reg_equiv[regno].init_insns->insn () != NULL
-	  && ! find_reg_note (XEXP (reg_equiv[regno].init_insns, 0),
-			      REG_EQUIV, NULL_RTX)
-	  && ! contains_replace_regs (XEXP (dest, 0))
-	  && ! pdx_subregs[regno])
-	{
-	  rtx_insn *init_insn =
-	    as_a <rtx_insn *> (XEXP (reg_equiv[regno].init_insns, 0));
-	  if (validate_equiv_mem (init_insn, src, dest)
-	      && ! memref_used_between_p (dest, init_insn, insn)
-	      /* Attaching a REG_EQUIV note will fail if INIT_INSN has
-		 multiple sets.  */
-	      && set_unique_reg_note (init_insn, REG_EQUIV, copy_rtx (dest)))
-	    {
-	      /* This insn makes the equivalence, not the one initializing
-		 the register.  */
-	      ira_reg_equiv[regno].init_insns
-		= gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
-	      df_notes_rescan (init_insn);
-	    }
-	}
-    }
+  add_store_equivs ();
 
   cleared_regs = BITMAP_ALLOC (NULL);
   /* Now scan all regs killed in an insn to see if any of them are
@@ -3858,8 +3892,6 @@  update_equiv_regs (void)
   /* Clean up.  */
 
   end_alias_analysis ();
-  free (reg_equiv);
-  free (pdx_subregs);
   return recorded_label_ref;
 }
 
@@ -3882,11 +3914,12 @@  setup_reg_equiv (void)
       {
 	next_elem = elem->next ();
 	insn = elem->insn ();
-	set = single_set (insn);
 	
 	/* Init insns can set up equivalence when the reg is a destination or
 	   a source (in this case the destination is memory).  */
-	if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
+	if (! insn->deleted ()
+	    && (set = single_set (insn)) != 0
+	    && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
 	  {
 	    if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL)
 	      {
@@ -5090,7 +5123,6 @@  ira (FILE *f)
 {
   bool loops_p;
   int ira_max_point_before_emit;
-  int rebuild_p;
   bool saved_flag_caller_saves = flag_caller_saves;
   enum ira_region saved_flag_ira_region = flag_ira_region;
 
@@ -5184,10 +5216,9 @@  ira (FILE *f)
   if (resize_reg_info () && flag_ira_loop_pressure)
     ira_set_pseudo_classes (true, ira_dump_file);
 
-  rebuild_p = update_equiv_regs ();
-  setup_reg_equiv ();
-  setup_reg_equiv_init ();
-
+  pdx_subregs = XCNEWVEC (bool, max_regno);
+  reg_equiv = XCNEWVEC (struct equivalence, max_regno);
+  bool rebuild_p = update_equiv_regs ();
   bool update_regstat = false;
 
   if (optimize && rebuild_p)
@@ -5210,6 +5241,14 @@  ira (FILE *f)
       update_regstat = true;
     }
 
+  if (update_regstat)
+    validate_equiv_regs ();
+  free (reg_equiv);
+  free (pdx_subregs);
+
+  setup_reg_equiv ();
+  setup_reg_equiv_init ();
+
   /* It is not worth to do such improvement when we use a simple
      allocation because of -O0 usage or because the function is too
      big.  */
diff --git a/gcc/testsuite/gcc.dg/pr69195.c b/gcc/testsuite/gcc.dg/pr69195.c
new file mode 100644
index 0000000..af373a1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69195.c
@@ -0,0 +1,27 @@ 
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-dce -fno-forward-propagate" } */
+
+void __attribute__ ((noinline, noclone))
+foo (int *a, int n)
+{
+  int *lasta = a + n;
+  for (; a != lasta; a++)
+    {
+      *a *= 2;
+      a[1] = a[-1] + a[-2];
+    }
+}
+
+int
+main ()
+{
+  int a[16] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
+  int r[16] = { 1, 2, 6, 6, 16, 24, 44, 80,
+		136, 248, 432, 768, 1360, 2400, 4256, 3760 };
+  unsigned i;
+  foo (&a[2], 13);
+  for (i = 0; i < 8; ++i)
+    if (a[i] != r[i])
+      __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr69238.c b/gcc/testsuite/gcc.dg/pr69238.c
new file mode 100644
index 0000000..3538e63
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69238.c
@@ -0,0 +1,28 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-dce -fno-forward-propagate -fno-rerun-cse-after-loop -funroll-loops" } */
+
+
+#define N 32
+
+short sa[N];
+short sb[N];
+int ia[N];
+int ib[N];
+
+int __attribute__ ((noinline, noclone))
+main1 (int n)
+{
+  int i;
+  for (i = 0; i < n; i++)
+    {
+      sa[i+7] = sb[i];
+      ia[i+3] = ib[i+1];
+    }
+  return 0;
+}
+
+int
+main (void)
+{ 
+  return main1 (N-7);
+}