diff mbox

Out-of-order update of new_spill_reg_store[]

Message ID 87botqioeu.fsf@firetop.home
State New
Headers show

Commit Message

Richard Sandiford Oct. 9, 2011, 8:01 a.m. UTC
This patch fixes an ordering problem in reload: the output reloads are
emitted in reverse operand order, but new_spill_reg_store[] is updated
in forward reload order.  This causes problems if the same register is
used for two reloads.

I saw this hit on mips64-linux-gnu/-mabi=64 as a failure in
execute/scal-to-vec1.c at -O3.  The reloads were:

Reloads for insn # 580
Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p
        reload_reg_rtx: (reg:SI 5 $5)
Reload 1: reload_out (SI) = (reg:SI 32 $f0 [1655])
        MD1_REG, RELOAD_FOR_OUTPUT (opnum = 0)
        reload_out_reg: (reg:SI 32 $f0 [1655])
        reload_reg_rtx: (reg:SI 65 lo)
        secondary_out_reload = 0

Reload 2: reload_out (SI) = (reg:SI 1656)
        GR_REGS, RELOAD_FOR_OUTPUT (opnum = 3)
        reload_out_reg: (reg:SI 1656)
        reload_reg_rtx: (reg:SI 5 $5)

So $5 is first stored in 1656 (operand 3), then $5 is used a secondary
reload in copying LO to $f0 (operand 0, reg 1655).  The next and final
use of 1655 ends up inheriting this second reload of $5, so we try to
delete the original output copy.  The problem is that we delete the
wrong one: we delete the store of $5 to 1656 rather than the copy of
$5 to 1655/$f0.

The fix I went for is to clear new_spill_reg_store[] for all reloads
as a separate pass (rather than in the main do_{input,output}_reload
loop), then only allow new_spill_store_reg[] to be set if the associated
reload register reaches the end of the reload sequence.

emit_input_reloads has:

      /* Output a special code sequence for this case, and forget about
	 spill reg information.  */
      new_spill_reg_store[REGNO (reloadreg)] = NULL;
      inc_for_reload (reloadreg, oldequiv, rl->out, rl->inc);

I think this store is redundant: emit_reload_insns should already have
cleared it, beth before and after the patch.  (The code was originally:

      /* Output a special code sequence for this case.  */
      new_spill_reg_store[REGNO (reloadreg)]
	= inc_for_reload (reloadreg, oldequiv, rl->out,
			  rl->inc);

but was changed because we can't inherit auto-inc reloads as easily
as that.  So the nullification came from an existing new_spill_reg_store[]
assignment, rather than being added explicitly.)

Also, emit_reload_insns has two blocks to record inheritance information:
one for spill registers and one for non-spill registers.  The spill version
checks that the reload register reaches the end of the sequence, and I think
the non-spill version should too.

Tested on mips64-linux-gnu and x86_64-linux-gnu.  It fixes the testcase
(by deleting the correct instruction -- the inheritance still happens).
Bernd, Uli, does this look OK?

Richard


gcc/
	* reload1.c (reload_regs_reach_end_p): Replace with...
	(reload_reg_rtx_reaches_end_p): ...this function.
	(new_spill_reg_store): Update commentary.
	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
	before setting new_spill_reg_store.
	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
	Also use reload_reg_rtx_reaches_end_p when recording inheritance
	information for non-spill reload registers.

Comments

Bernd Schmidt Oct. 11, 2011, 10:37 a.m. UTC | #1
I'm not completely following this yet, so please bear with me...

On 10/09/11 10:01, Richard Sandiford wrote:
> Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p
>         reload_reg_rtx: (reg:SI 5 $5)
> Reload 1: reload_out (SI) = (reg:SI 32 $f0 [1655])
>         MD1_REG, RELOAD_FOR_OUTPUT (opnum = 0)
>         reload_out_reg: (reg:SI 32 $f0 [1655])
>         reload_reg_rtx: (reg:SI 65 lo)
>         secondary_out_reload = 0
> 
> Reload 2: reload_out (SI) = (reg:SI 1656)
>         GR_REGS, RELOAD_FOR_OUTPUT (opnum = 3)
>         reload_out_reg: (reg:SI 1656)
>         reload_reg_rtx: (reg:SI 5 $5)
> 
> So $5 is first stored in 1656 (operand 3), then $5 is used a secondary
> reload in copying LO to $f0 (operand 0, reg 1655).  The next and final
> use of 1655 ends up inheriting this second reload of $5, so we try to
> delete the original output copy.  The problem is that we delete the
> wrong one: we delete the store of $5 to 1656 rather than the copy of
> $5 to 1655/$f0.

So, reload 1 inherited from somewhere else rather than using reg $5 from
its secondary reload? Where do we try to delete the insn, and what's the
state of the spill_reg_store data at that point?

> The fix I went for is to clear new_spill_reg_store[] for all reloads
> as a separate pass (rather than in the main do_{input,output}_reload
> loop), then only allow new_spill_store_reg[] to be set if the associated
> reload register reaches the end of the reload sequence.

In this case, reload 0 is emitted after reload 2, so it reaches the end.
Correct? What would happen if the 0/1 pair and 2 were swapped?


Bernd
Richard Sandiford Oct. 11, 2011, 12:35 p.m. UTC | #2
Bernd Schmidt <bernds@codesourcery.com> writes:
> On 10/09/11 10:01, Richard Sandiford wrote:
>> Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p
>>         reload_reg_rtx: (reg:SI 5 $5)
>> Reload 1: reload_out (SI) = (reg:SI 32 $f0 [1655])
>>         MD1_REG, RELOAD_FOR_OUTPUT (opnum = 0)
>>         reload_out_reg: (reg:SI 32 $f0 [1655])
>>         reload_reg_rtx: (reg:SI 65 lo)
>>         secondary_out_reload = 0
>> 
>> Reload 2: reload_out (SI) = (reg:SI 1656)
>>         GR_REGS, RELOAD_FOR_OUTPUT (opnum = 3)
>>         reload_out_reg: (reg:SI 1656)
>>         reload_reg_rtx: (reg:SI 5 $5)
>> 
>> So $5 is first stored in 1656 (operand 3), then $5 is used a secondary
>> reload in copying LO to $f0 (operand 0, reg 1655).  The next and final
>> use of 1655 ends up inheriting this second reload of $5, so we try to
>> delete the original output copy.  The problem is that we delete the
>> wrong one: we delete the store of $5 to 1656 rather than the copy of
>> $5 to 1655/$f0.
>
> So, reload 1 inherited from somewhere else rather than using reg $5 from
> its secondary reload? Where do we try to delete the insn, and what's the
> state of the spill_reg_store data at that point?

No, reload 1 is inherited by a later instruction.  And it's inherited
correctly, in terms of the register contents being what we expect.
(Reload 1 is the one that survives to the end of the instruction's
reload sequence.  Reload 2, in contrast, is clobbered by reload 1,
so could not be inherited.  So when we record inheritance information
in emit_reload_insns, reload_reg_reaches_end_p correctly stops us
from recording reload 2 but allows us to record reload 1.)

The problem is that we record the wrong instruction for reload 1.
We say that reload 1 is performed by the instruction that performs
reload 2.  So spill_reg_store[] contains the instruction for reload 2
rather than the instruction for reload 1.  We delete it in
delete_output_reload at the point of inheritance.

>> The fix I went for is to clear new_spill_reg_store[] for all reloads
>> as a separate pass (rather than in the main do_{input,output}_reload
>> loop), then only allow new_spill_store_reg[] to be set if the associated
>> reload register reaches the end of the reload sequence.
>
> In this case, reload 0 is emitted after reload 2, so it reaches the end.
> Correct?

Right.  And this is a defined order: output reloads are emitted in reverse
operand order (operand N-1, ..., operand 0). reload_reg_reaches_end_p
and reload_reg_free_p both rely on this property.  In this case, reload 2
is associated with operand 3, while reload 1 is associated with operand 0,
so reload 2 has to come first.

> What would happen if the 0/1 pair and 2 were swapped?

If only the reload indices changed (so that reload 2 became 0) then the
reloads would still be emitted in the current order (because the order
is tied to the operand number rather than the reload number).  And the
patch would also record the same instruction in spill_reg_store[].
I think current trunk would also behave correctly in this case,
because the reload numbers happen to match the insn order.

If the operand numbers were swapped, then the order of the output
reloads would be too.  This register selection would then be invalid.
We would have a secondary reload that uses $5 being emitted while the $5
output from the main instruction is still live.  But reload_reg_free_p
knows to avoid this situation.

Richard
Bernd Schmidt Oct. 12, 2011, 12:42 p.m. UTC | #3
On 10/11/11 14:35, Richard Sandiford wrote:
> No, reload 1 is inherited by a later instruction.  And it's inherited
> correctly, in terms of the register contents being what we expect.
> (Reload 1 is the one that survives to the end of the instruction's
> reload sequence.  Reload 2, in contrast, is clobbered by reload 1,
> so could not be inherited.  So when we record inheritance information
> in emit_reload_insns, reload_reg_reaches_end_p correctly stops us
> from recording reload 2 but allows us to record reload 1.)
> 
> The problem is that we record the wrong instruction for reload 1.
> We say that reload 1 is performed by the instruction that performs
> reload 2.  So spill_reg_store[] contains the instruction for reload 2
> rather than the instruction for reload 1.  We delete it in
> delete_output_reload at the point of inheritance.

Ok. So, would the minimal fix of testing !new_spill_reg_store[..] before
writing to it also work? Seems to me this would cope with the
out-of-order writes by only allowing the first.

If so, then I think I'd prefer that, but we could gcc_assert
(reload_reg_reaches_end (..)) as a bit of a verification of that function.


Bernd
Richard Sandiford Oct. 12, 2011, 8:56 p.m. UTC | #4
Bernd Schmidt <bernds@codesourcery.com> writes:
> On 10/11/11 14:35, Richard Sandiford wrote:
>> No, reload 1 is inherited by a later instruction.  And it's inherited
>> correctly, in terms of the register contents being what we expect.
>> (Reload 1 is the one that survives to the end of the instruction's
>> reload sequence.  Reload 2, in contrast, is clobbered by reload 1,
>> so could not be inherited.  So when we record inheritance information
>> in emit_reload_insns, reload_reg_reaches_end_p correctly stops us
>> from recording reload 2 but allows us to record reload 1.)
>> 
>> The problem is that we record the wrong instruction for reload 1.
>> We say that reload 1 is performed by the instruction that performs
>> reload 2.  So spill_reg_store[] contains the instruction for reload 2
>> rather than the instruction for reload 1.  We delete it in
>> delete_output_reload at the point of inheritance.
>
> Ok. So, would the minimal fix of testing !new_spill_reg_store[..] before
> writing to it also work? Seems to me this would cope with the
> out-of-order writes by only allowing the first.
>
> If so, then I think I'd prefer that, but we could gcc_assert
> (reload_reg_reaches_end (..)) as a bit of a verification of that function.

I don't think the assert would be safe.  We could have similar reuse in
cases where the first reload (in rld order) is a double-register value
starting in $4 and the second reload uses just $5.  In that case, the first
reload will have set new_spill_reg_store[4], so new_spill_reg_store[5] will
still be null.  But $5 in the second reload won't survive until the end of
the sequence.  So we'd try to set new_spill_reg_store[5] and trip the assert.

IMO it's a choice between just checking for null and not asserting
(if that really is safe, since we'll be storing instructions that
don't actually reach the end of the reload sequence), or checking
reload_reg_reaches_end.  I prefer the second, since it seems more
direct, and matches the corresponding code in emit_reload_insns.

Richard
Bernd Schmidt Oct. 17, 2011, 1:52 p.m. UTC | #5
> gcc/
> 	* reload1.c (reload_regs_reach_end_p): Replace with...
> 	(reload_reg_rtx_reaches_end_p): ...this function.
> 	(new_spill_reg_store): Update commentary.
> 	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
> 	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
> 	before setting new_spill_reg_store.
> 	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
> 	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
> 	Also use reload_reg_rtx_reaches_end_p when recording inheritance
> 	information for non-spill reload registers.

Just an update to say that based on our discussion I think the general
approach is OK, but I'm still trying to figure out what exactly this
piece of code is doing, and whether the changes to it make sense:

> @@ -8329,30 +8329,33 @@ emit_reload_insns (struct insn_chain *ch
>  		 the storing insn so that we may delete this insn with
>  		 delete_output_reload.  */
>  	      src_reg = reload_reg_rtx_for_output[r];
> -
> -	      /* If this is an optional reload, try to find the source reg
> -		 from an input reload.  */
> -	      if (! src_reg)
> +	      if (src_reg
> +		  && reload_reg_rtx_reaches_end_p (src_reg, r))
> +		store_insn = new_spill_reg_store[REGNO (src_reg)];
> +	      else
>  		{
> +		  /* If this is an optional reload, try to find the
> +		     source reg from an input reload.  */
>  		  rtx set = single_set (insn);
>  		  if (set && SET_DEST (set) == rld[r].out)
>  		    {
>  		      int k;
> +		      rtx cand;
>  
>  		      src_reg = SET_SRC (set);
>  		      store_insn = insn;
>  		      for (k = 0; k < n_reloads; k++)
> -			{
> -			  if (rld[k].in == src_reg)
> -			    {
> -			      src_reg = reload_reg_rtx_for_input[k];
> -			      break;
> -			    }
> -			}
> +			if (rld[k].in == src_reg)
> +			  {
> +			    cand = reload_reg_rtx_for_input[k];
> +			    if (reload_reg_rtx_reaches_end_p (cand, k))
> +			      {
> +				src_reg = cand;
> +				break;
> +			      }
> +			  }
>  		    }
>  		}
> -	      else
> -		store_insn = new_spill_reg_store[REGNO (src_reg)];
>  	      if (src_reg && REG_P (src_reg)
>  		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
>  		{


Bernd
diff mbox

Patch

Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c	2011-10-08 16:32:26.000000000 +0100
+++ gcc/reload1.c	2011-10-08 16:32:26.000000000 +0100
@@ -5499,15 +5499,15 @@  reload_reg_reaches_end_p (unsigned int r
 }
 
 /* Like reload_reg_reaches_end_p, but check that the condition holds for
-   every register in the range [REGNO, REGNO + NREGS).  */
+   every register in REG.  */
 
 static bool
-reload_regs_reach_end_p (unsigned int regno, int nregs, int reloadnum)
+reload_reg_rtx_reaches_end_p (rtx reg, int reloadnum)
 {
-  int i;
+  unsigned int i;
 
-  for (i = 0; i < nregs; i++)
-    if (!reload_reg_reaches_end_p (regno + i, reloadnum))
+  for (i = REGNO (reg); i < END_REGNO (reg); i++)
+    if (!reload_reg_reaches_end_p (i, reloadnum))
       return false;
   return true;
 }
@@ -7052,7 +7052,9 @@  static rtx operand_reload_insns = 0;
 static rtx other_operand_reload_insns = 0;
 static rtx other_output_reload_insns[MAX_RECOG_OPERANDS];
 
-/* Values to be put in spill_reg_store are put here first.  */
+/* Values to be put in spill_reg_store are put here first.  Instructions
+   must only be placed here if the associated reload register reaches
+   the end of the instruction's reload sequence.  */
 static rtx new_spill_reg_store[FIRST_PSEUDO_REGISTER];
 static HARD_REG_SET reg_reloaded_died;
 
@@ -7213,9 +7215,7 @@  emit_input_reload_insns (struct insn_cha
 
       /* Prevent normal processing of this reload.  */
       special = 1;
-      /* Output a special code sequence for this case, and forget about
-	 spill reg information.  */
-      new_spill_reg_store[REGNO (reloadreg)] = NULL;
+      /* Output a special code sequence for this case.  */
       inc_for_reload (reloadreg, oldequiv, rl->out, rl->inc);
     }
 
@@ -7736,14 +7736,14 @@  emit_output_reload_insns (struct insn_ch
 		    rld[s].out_reg = rl->out_reg;
 		    set = single_set (next);
 		    if (set && SET_SRC (set) == s_reg
-			&& ! new_spill_reg_store[REGNO (s_reg)])
+			&& reload_reg_rtx_reaches_end_p (s_reg, s))
 		      {
 			SET_HARD_REG_BIT (reg_is_output_reload,
 					  REGNO (s_reg));
 			new_spill_reg_store[REGNO (s_reg)] = next;
 		      }
 		  }
-		else
+		else if (reload_reg_rtx_reaches_end_p (rl_reg_rtx, j))
 		  new_spill_reg_store[REGNO (rl_reg_rtx)] = p;
 	      }
 	  }
@@ -8003,6 +8003,15 @@  emit_reload_insns (struct insn_chain *ch
       debug_reload_to_stream (dump_file);
     }
 
+  for (j = 0; j < n_reloads; j++)
+    if (rld[j].reg_rtx && HARD_REGISTER_P (rld[j].reg_rtx))
+      {
+	unsigned int i;
+
+	for (i = REGNO (rld[j].reg_rtx); i < END_REGNO (rld[j].reg_rtx); i++)
+	  new_spill_reg_store[i] = 0;
+      }
+
   /* Now output the instructions to copy the data into and out of the
      reload registers.  Do these in the order that the reloads were reported,
      since reloads of base and index registers precede reloads of operands
@@ -8010,14 +8019,6 @@  emit_reload_insns (struct insn_chain *ch
 
   for (j = 0; j < n_reloads; j++)
     {
-      if (rld[j].reg_rtx && HARD_REGISTER_P (rld[j].reg_rtx))
-	{
-	  unsigned int i;
-
-	  for (i = REGNO (rld[j].reg_rtx); i < END_REGNO (rld[j].reg_rtx); i++)
-	    new_spill_reg_store[i] = 0;
-	}
-
       do_input_reload (chain, rld + j, j);
       do_output_reload (chain, rld + j, j);
     }
@@ -8143,15 +8144,13 @@  emit_reload_insns (struct insn_chain *ch
 			 && GET_CODE (rld[r].out) != PRE_MODIFY))))
 	    {
 	      rtx reg;
-	      enum machine_mode mode;
-	      int regno, nregs;
 
 	      reg = reload_reg_rtx_for_output[r];
-	      mode = GET_MODE (reg);
-	      regno = REGNO (reg);
-	      nregs = hard_regno_nregs[regno][mode];
-	      if (reload_regs_reach_end_p (regno, nregs, r))
+	      if (reload_reg_rtx_reaches_end_p (reg, r))
 		{
+		  enum machine_mode mode = GET_MODE (reg);
+		  int regno = REGNO (reg);
+		  int nregs = hard_regno_nregs[regno][mode];
 		  rtx out = (REG_P (rld[r].out)
 			     ? rld[r].out
 			     : rld[r].out_reg
@@ -8215,20 +8214,21 @@  emit_reload_insns (struct insn_chain *ch
 		   && !reg_set_p (reload_reg_rtx_for_input[r], PATTERN (insn)))
 	    {
 	      rtx reg;
-	      enum machine_mode mode;
-	      int regno, nregs;
 
 	      reg = reload_reg_rtx_for_input[r];
-	      mode = GET_MODE (reg);
-	      regno = REGNO (reg);
-	      nregs = hard_regno_nregs[regno][mode];
-	      if (reload_regs_reach_end_p (regno, nregs, r))
+	      if (reload_reg_rtx_reaches_end_p (reg, r))
 		{
+		  enum machine_mode mode;
+		  int regno;
+		  int nregs;
 		  int in_regno;
 		  int in_nregs;
 		  rtx in;
 		  bool piecemeal;
 
+		  mode = GET_MODE (reg);
+		  regno = REGNO (reg);
+		  nregs = hard_regno_nregs[regno][mode];
 		  if (REG_P (rld[r].in)
 		      && REGNO (rld[r].in) >= FIRST_PSEUDO_REGISTER)
 		    in = rld[r].in;
@@ -8329,30 +8329,33 @@  emit_reload_insns (struct insn_chain *ch
 		 the storing insn so that we may delete this insn with
 		 delete_output_reload.  */
 	      src_reg = reload_reg_rtx_for_output[r];
-
-	      /* If this is an optional reload, try to find the source reg
-		 from an input reload.  */
-	      if (! src_reg)
+	      if (src_reg
+		  && reload_reg_rtx_reaches_end_p (src_reg, r))
+		store_insn = new_spill_reg_store[REGNO (src_reg)];
+	      else
 		{
+		  /* If this is an optional reload, try to find the
+		     source reg from an input reload.  */
 		  rtx set = single_set (insn);
 		  if (set && SET_DEST (set) == rld[r].out)
 		    {
 		      int k;
+		      rtx cand;
 
 		      src_reg = SET_SRC (set);
 		      store_insn = insn;
 		      for (k = 0; k < n_reloads; k++)
-			{
-			  if (rld[k].in == src_reg)
-			    {
-			      src_reg = reload_reg_rtx_for_input[k];
-			      break;
-			    }
-			}
+			if (rld[k].in == src_reg)
+			  {
+			    cand = reload_reg_rtx_for_input[k];
+			    if (reload_reg_rtx_reaches_end_p (cand, k))
+			      {
+				src_reg = cand;
+				break;
+			      }
+			  }
 		    }
 		}
-	      else
-		store_insn = new_spill_reg_store[REGNO (src_reg)];
 	      if (src_reg && REG_P (src_reg)
 		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
 		{