Patchwork Out-of-order update of new_spill_reg_store[]

login
register
mail settings
Submitter Richard Sandiford
Date Dec. 4, 2011, 10:42 a.m.
Message ID <87aa78fwj8.fsf@firetop.home>
Download mbox | patch
Permalink /patch/129137/
State New
Headers show

Comments

Richard Sandiford - Dec. 4, 2011, 10:42 a.m.
Back to this...

Bernd Schmidt <bernds@codesourcery.com> writes:
>> gcc/
>> 	* reload1.c (reload_regs_reach_end_p): Replace with...
>> 	(reload_reg_rtx_reaches_end_p): ...this function.
>> 	(new_spill_reg_store): Update commentary.
>> 	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
>> 	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
>> 	before setting new_spill_reg_store.
>> 	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
>> 	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
>> 	Also use reload_reg_rtx_reaches_end_p when recording inheritance
>> 	information for non-spill reload registers.
>
> Just an update to say that based on our discussion I think the general
> approach is OK, but I'm still trying to figure out what exactly this
> piece of code is doing, and whether the changes to it make sense:
>
>> @@ -8329,30 +8329,33 @@ emit_reload_insns (struct insn_chain *ch
>>  		 the storing insn so that we may delete this insn with
>>  		 delete_output_reload.  */
>>  	      src_reg = reload_reg_rtx_for_output[r];
>> -
>> -	      /* If this is an optional reload, try to find the source reg
>> -		 from an input reload.  */
>> -	      if (! src_reg)
>> +	      if (src_reg
>> +		  && reload_reg_rtx_reaches_end_p (src_reg, r))
>> +		store_insn = new_spill_reg_store[REGNO (src_reg)];
>> +	      else
>>  		{
>> +		  /* If this is an optional reload, try to find the
>> +		     source reg from an input reload.  */
>>  		  rtx set = single_set (insn);
>>  		  if (set && SET_DEST (set) == rld[r].out)
>>  		    {
>>  		      int k;
>> +		      rtx cand;
>>  
>>  		      src_reg = SET_SRC (set);
>>  		      store_insn = insn;
>>  		      for (k = 0; k < n_reloads; k++)
>> -			{
>> -			  if (rld[k].in == src_reg)
>> -			    {
>> -			      src_reg = reload_reg_rtx_for_input[k];
>> -			      break;
>> -			    }
>> -			}
>> +			if (rld[k].in == src_reg)
>> +			  {
>> +			    cand = reload_reg_rtx_for_input[k];
>> +			    if (reload_reg_rtx_reaches_end_p (cand, k))
>> +			      {
>> +				src_reg = cand;
>> +				break;
>> +			      }
>> +			  }
>>  		    }
>>  		}
>> -	      else
>> -		store_insn = new_spill_reg_store[REGNO (src_reg)];
>>  	      if (src_reg && REG_P (src_reg)
>>  		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
>>  		{

Yeah, I was in two minds what to do here.  AIUI, the code:

	  if (!HARD_REGISTER_NUM_P (out_regno))
	    {
	      rtx src_reg, store_insn = NULL_RTX;

	      reg_last_reload_reg[out_regno] = 0;

	      /* If we can find a hard register that is stored, record
		 the storing insn so that we may delete this insn with
		 delete_output_reload.  */
	      src_reg = reload_reg_rtx_for_output[r];

	      /* If this is an optional reload, try to find the source reg
		 from an input reload.  */
	      if (! src_reg)
		{
		  rtx set = single_set (insn);
		  if (set && SET_DEST (set) == rld[r].out)
		    {
		      int k;

[A]		      src_reg = SET_SRC (set);
		      store_insn = insn;
		      for (k = 0; k < n_reloads; k++)
			{
			  if (rld[k].in == src_reg)
			    {
[B]			      src_reg = reload_reg_rtx_for_input[k];
			      break;
			    }
			}
		    }
		}
	      else
[C]		store_insn = new_spill_reg_store[REGNO (src_reg)];
	      if (src_reg && REG_P (src_reg)
		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
		{
                  ...record inheritance for out <- src_reg...

is coping with three cases:

 [C] we have an output reload for a non-spill register.  E.g. the
     pre-reload instruction might be:

        (set (reg P1) (.... (reg H1) ...))
             REG_DEAD: H1

     (H = hard register, P = pseudo register), where the P1 and H1
     operands have matching constraints.  In this case we might use
     H1 as the reload register for (reg P1).

     This is the case that really does need reload_reg_rtx_reaches_end_p.
     E.g. the SET above could be in parallel with another SET that needs
     a secondary output reload.  It's possible in principle for H1 to be
     used as the secondary reload register there too, so that the final
     sequence looks like:

       (parallel [(set (reg H1 [orig P1]) (... (reg H1) ...))
                  (set (reg H2 [orig P2]) (...))]
       (set (reg P1) (reg H1))
       (set (reg H1) (reg H2))
       (set (reg P2) (reg H1))

     (I say "in principle" because I don't know whether there is code
     to take advantage of this for non-spill registers like H1.
     But it would be a valid choice.)

     Either way, the idea is that new_spill_reg_store[R] is only valid
     for reload registers R that reach the end of the reload sequence.
     We really should check that H1 reaches the end before using
     new_spill_reg_store[H1].

 [A] (but not [B]).  I think this is for optional reloads that we
     decided not to make, in cases where the source of the set needs
     no reloads.  E.g. the pre-reload instruction might be:

       (set (reg P1) (reg H1))

     and P1 has an optional output reload that we decided not to make.
     Here we record that H1 holds P1 as a possible inheritance.
     If inheritance causes the P1 <- H1 move to become unnecessary,
     we can delete it as dead.

     There doesn't seem to be any kind of "reaches end" check, but I
     suppose the hope is that instructions like this are going to be so
     simple that there's no point.  Although I'm not sure whether for:

          (parallel [(set (reg P1) (reg H1))
                     (clobber (scratch))])

     we'd ever consider using H1 for the scratch in cases where the
     clobber isn't an earlyclobber.

     Either way, this is unrelated to the original problem of
     new_spill_reg_store being out of whack.  reload_reg_rtx_reaches_end_p
     wouldn't be the right thing to check, since we don't have a reload
     register to test in this case.  Because there's no "off the shelf"
     fix, it's probably best to leave it be until we have a testcase
     that exhibits a problem.

     The patch doesn't change this case.  

 [B] I think this is similar to [A], but for cases where the source
     is reloaded.  E.g. the pre-reload instruction might be:

       (set (reg P1) (reg P2))

     where P1 has an optional reload that we decided not to make and
     P2 is reloaded into H2.  The final sequence would look like:

       (set (reg H2) (reg P2))
       (set (reg P1) (reg H2))

     and the code would record P1 <- H2 for inheritance.

     There does seem to be a real danger here that H2 could be reused
     for clobbers (except again for earlyclobbers), so it seemed best
     to test reload_reg_rtx_reaches_end_p.  However, this change was
     by inspection, and isn't directly related to the new handling
     of new_spill_reg_store[], since the inherited "reload" is
     actually the reloaded instruction itself.

     If H2 doesn't reach the end, the patch falls back to [A].
     This means that if the original source is a hard register
     of the wrong class (instead of a pseudo as in the example above),
     we can continue to record an inheritance for that.

     I suppose it could be argued that this in itself has a risk
     (although I think a smaller one) because of the aforementioned
     lack of a "reaches end" test in [A].  So while the effect of the
     reload_reg_rtx_reaches_end_p check is to stop [B] from trumping [A]
     in cases where [B] is definitely invalid, it doesn't do anything to
     fix any problems that there might be in [A].

     Which is why this is a quandry.  After the patch for [C],
     I think this is the only remaining case in which we record
     inheritance for a reload register without checking whether
     the reload register reaches the end.  (Note that the "i >= 0"
     block above this one already applies the check consistently.)
     So on the one hand, it seems wrong to leave an obvious hole
     (especially one that will lead to silent wrong-code bugs),
     but on the other, it's always dangerous changing reload by
     inspection.

If you prefer changing as little as possible, then the version below
just does [C].  Also tested on mips64-linux-gnu.

Richard


gcc/
	* reload1.c (reload_regs_reach_end_p): Replace with...
	(reload_reg_rtx_reaches_end_p): ...this function.
	(new_spill_reg_store): Update commentary.
	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
	before setting new_spill_reg_store.
	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
	Also use reload_reg_rtx_reaches_end_p when reading new_spill_reg_store
	for non-spill reload registers.

Patch

Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c	2011-11-27 10:06:56.000000000 +0000
+++ gcc/reload1.c	2011-12-03 14:29:26.000000000 +0000
@@ -5505,15 +5505,15 @@  reload_reg_reaches_end_p (unsigned int r
 }
 
 /* Like reload_reg_reaches_end_p, but check that the condition holds for
-   every register in the range [REGNO, REGNO + NREGS).  */
+   every register in REG.  */
 
 static bool
-reload_regs_reach_end_p (unsigned int regno, int nregs, int reloadnum)
+reload_reg_rtx_reaches_end_p (rtx reg, int reloadnum)
 {
-  int i;
+  unsigned int i;
 
-  for (i = 0; i < nregs; i++)
-    if (!reload_reg_reaches_end_p (regno + i, reloadnum))
+  for (i = REGNO (reg); i < END_REGNO (reg); i++)
+    if (!reload_reg_reaches_end_p (i, reloadnum))
       return false;
   return true;
 }
@@ -7058,7 +7058,9 @@  static rtx operand_reload_insns = 0;
 static rtx other_operand_reload_insns = 0;
 static rtx other_output_reload_insns[MAX_RECOG_OPERANDS];
 
-/* Values to be put in spill_reg_store are put here first.  */
+/* Values to be put in spill_reg_store are put here first.  Instructions
+   must only be placed here if the associated reload register reaches
+   the end of the instruction's reload sequence.  */
 static rtx new_spill_reg_store[FIRST_PSEUDO_REGISTER];
 static HARD_REG_SET reg_reloaded_died;
 
@@ -7219,9 +7221,7 @@  emit_input_reload_insns (struct insn_cha
 
       /* Prevent normal processing of this reload.  */
       special = 1;
-      /* Output a special code sequence for this case, and forget about
-	 spill reg information.  */
-      new_spill_reg_store[REGNO (reloadreg)] = NULL;
+      /* Output a special code sequence for this case.  */
       inc_for_reload (reloadreg, oldequiv, rl->out, rl->inc);
     }
 
@@ -7742,14 +7742,14 @@  emit_output_reload_insns (struct insn_ch
 		    rld[s].out_reg = rl->out_reg;
 		    set = single_set (next);
 		    if (set && SET_SRC (set) == s_reg
-			&& ! new_spill_reg_store[REGNO (s_reg)])
+			&& reload_reg_rtx_reaches_end_p (s_reg, s))
 		      {
 			SET_HARD_REG_BIT (reg_is_output_reload,
 					  REGNO (s_reg));
 			new_spill_reg_store[REGNO (s_reg)] = next;
 		      }
 		  }
-		else
+		else if (reload_reg_rtx_reaches_end_p (rl_reg_rtx, j))
 		  new_spill_reg_store[REGNO (rl_reg_rtx)] = p;
 	      }
 	  }
@@ -8009,6 +8009,15 @@  emit_reload_insns (struct insn_chain *ch
       debug_reload_to_stream (dump_file);
     }
 
+  for (j = 0; j < n_reloads; j++)
+    if (rld[j].reg_rtx && HARD_REGISTER_P (rld[j].reg_rtx))
+      {
+	unsigned int i;
+
+	for (i = REGNO (rld[j].reg_rtx); i < END_REGNO (rld[j].reg_rtx); i++)
+	  new_spill_reg_store[i] = 0;
+      }
+
   /* Now output the instructions to copy the data into and out of the
      reload registers.  Do these in the order that the reloads were reported,
      since reloads of base and index registers precede reloads of operands
@@ -8016,14 +8025,6 @@  emit_reload_insns (struct insn_chain *ch
 
   for (j = 0; j < n_reloads; j++)
     {
-      if (rld[j].reg_rtx && HARD_REGISTER_P (rld[j].reg_rtx))
-	{
-	  unsigned int i;
-
-	  for (i = REGNO (rld[j].reg_rtx); i < END_REGNO (rld[j].reg_rtx); i++)
-	    new_spill_reg_store[i] = 0;
-	}
-
       do_input_reload (chain, rld + j, j);
       do_output_reload (chain, rld + j, j);
     }
@@ -8149,15 +8150,13 @@  emit_reload_insns (struct insn_chain *ch
 			 && GET_CODE (rld[r].out) != PRE_MODIFY))))
 	    {
 	      rtx reg;
-	      enum machine_mode mode;
-	      int regno, nregs;
 
 	      reg = reload_reg_rtx_for_output[r];
-	      mode = GET_MODE (reg);
-	      regno = REGNO (reg);
-	      nregs = hard_regno_nregs[regno][mode];
-	      if (reload_regs_reach_end_p (regno, nregs, r))
+	      if (reload_reg_rtx_reaches_end_p (reg, r))
 		{
+		  enum machine_mode mode = GET_MODE (reg);
+		  int regno = REGNO (reg);
+		  int nregs = hard_regno_nregs[regno][mode];
 		  rtx out = (REG_P (rld[r].out)
 			     ? rld[r].out
 			     : rld[r].out_reg
@@ -8221,20 +8220,21 @@  emit_reload_insns (struct insn_chain *ch
 		   && !reg_set_p (reload_reg_rtx_for_input[r], PATTERN (insn)))
 	    {
 	      rtx reg;
-	      enum machine_mode mode;
-	      int regno, nregs;
 
 	      reg = reload_reg_rtx_for_input[r];
-	      mode = GET_MODE (reg);
-	      regno = REGNO (reg);
-	      nregs = hard_regno_nregs[regno][mode];
-	      if (reload_regs_reach_end_p (regno, nregs, r))
+	      if (reload_reg_rtx_reaches_end_p (reg, r))
 		{
+		  enum machine_mode mode;
+		  int regno;
+		  int nregs;
 		  int in_regno;
 		  int in_nregs;
 		  rtx in;
 		  bool piecemeal;
 
+		  mode = GET_MODE (reg);
+		  regno = REGNO (reg);
+		  nregs = hard_regno_nregs[regno][mode];
 		  if (REG_P (rld[r].in)
 		      && REGNO (rld[r].in) >= FIRST_PSEUDO_REGISTER)
 		    in = rld[r].in;
@@ -8336,10 +8336,17 @@  emit_reload_insns (struct insn_chain *ch
 		 delete_output_reload.  */
 	      src_reg = reload_reg_rtx_for_output[r];
 
-	      /* If this is an optional reload, try to find the source reg
-		 from an input reload.  */
-	      if (! src_reg)
+	      if (src_reg)
+		{
+		  if (reload_reg_rtx_reaches_end_p (src_reg, r))
+		    store_insn = new_spill_reg_store[REGNO (src_reg)];
+		  else
+		    src_reg = NULL_RTX;
+		}
+	      else
 		{
+		  /* If this is an optional reload, try to find the
+		     source reg from an input reload.  */
 		  rtx set = single_set (insn);
 		  if (set && SET_DEST (set) == rld[r].out)
 		    {
@@ -8357,8 +8364,6 @@  emit_reload_insns (struct insn_chain *ch
 			}
 		    }
 		}
-	      else
-		store_insn = new_spill_reg_store[REGNO (src_reg)];
 	      if (src_reg && REG_P (src_reg)
 		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
 		{