diff mbox

[RFC] Preferred rename register in regrename pass

Message ID B5E67142681B53468FAF6B7C31356562441BCDA8@hhmail02.hh.imgtec.org
State New
Headers show

Commit Message

Robert Suchanek Oct. 9, 2015, 7:10 a.m. UTC
Hi Bernd,

Thanks for the comments, much appreciated. Comments inlined and a reworked
patch attached.

> On 09/17/2015 04:38 PM, Robert Suchanek wrote:
> > We came across a situation for MIPS64 where moves for sign-extension were
> > not converted into a nop because of IRA spilled some of the allocnos and
> > assigned different hard register for the output operand in the move.
> > LRA is not fixing this up as most likely the move was not introduced by
> > the LRA itself.  I found it hard to fix this in LRA and looked at
> > an alternative solution where regrename pass appeared to be the best
> candidate.
> 
> For reference, please post examples of the insn pattern(s) where you
> would hope to get an improvement. Do they use matching constraints
> between the input and output operands in at least one alternative?

It all started because of 'extendmn2' SPN on MIPS64 where IRA broke the tie 
between input and output registers causing the move to stay around.  Generally,
I was expecting some of the moves to disappear.  Initially I thought
that the approach would benefit for matching constraints but whilst reworking
and more detailed analysis I couldn't find a case. 

> So this does look like something that could be addressed in regrename,
> but I think the patch is not quite the way to do it.
> 
> > +/* Return a preferred rename register for HEAD.  */
> 
> Function comments ideally ought to be a little more detailed. Preferred
> how and why?

Noted.  Will provide better description in the future.
> 
> > +static int
> > +find_preferred_rename_reg (du_head_p head)
> > +{
> > +  struct du_chain *this_du;
> > +  int preferred_reg = -1;
> > +
> > +  for (this_du = head->first; this_du; this_du = this_du->next_use)
> 
> This loop seems to search for the insn where the chain terminates (i.e.
> the register dies). It seems strange to do this here rather than during
> the initial scan in record_out_operands where we visit every insn and
> already look for REG_DEAD notes.

At the time, I thought it seemed to ok to do this separately. 
> 
> > +      rtx note;
> > +      insn_rr_info *p;
> > +
> > +      /* The preferred rename register is an output register iff an input
> > +	 register dies in an instruction but the candidate must be validated by
> > +	 check_new_reg_p.  */
> > +      for (note = REG_NOTES (this_du->insn); note; note = XEXP (note, 1))
> > +	if (insn_rr.exists()
> > +	    && REG_NOTE_KIND (note) == REG_DEAD
> > +	    && REGNO (XEXP (note, 0)) == head->regno
> > +	    && (p = &insn_rr[INSN_UID (this_du->insn)])
> > +	    && p->op_info)
> > +	  {
> > +	    int i;
> > +	    for (i = 0; i < p->op_info->n_chains; i++)
> > +	      {
> > +		struct du_head *next_head = p->op_info->heads[i];
> > +		if (head != next_head)
> 
> Here you're not actually verifying the chosen preferred reg is an
> output? Is the use of plain "p->op_info" (which is actually an array)
> intentional as a guess that operand 0 is the output? I'm not thrilled
> with this, and at the very least it should be "p->op_info[0]." to avoid
> reader confusion.
> It's also not verifying that this is indeed a case where choosing a
> preferred reg has a beneficial effect at all.

I realized that the check done in find_rename_reg should have been moved here as,
indeed, it is rather confusing. This is more like finding a candidate rather
than validating it.
AFAICS, "p->op_info" is a pointer to struct operand_rr_info and it can be null
if a chain is opened but not used in a BB.  It appears this happens if
a register is live across BB but not used in the currently processed one.

> 
> The use of insn_rr would probably also be unnecessary if this was done
> during the scan phase.
> 
> > +		    preferred_reg = next_head->regno;
> 
> The problem here is that there's an ordering issue. What if next_head
> gets renamed afterwards? The choice of preferred register hasn't bought
> us anything in that case.
> 
> For all these reasons I'd suggest a different approach, looking for such
> situations during the scan. Try to detect a situation where
>   * we have a REG_DEAD note for an existing chain
>   * the insn fulfils certain conditions (i.e. it's a move, or maybe one
>     of the alternatives has a matching constraint). After all, there's
>     not much point in tying if the reg that dies was used in a memory
>     address.
>   * a new chain is started for a single output
> Then, instead of picking a best register, mark the two chains as tied.
> Then, when choosing a rename register, see if a tied chain already was
> renamed, and try to pick the same register first.

I had something similar in mind at beginning but I didn't see how this would fit into
the scan. I think I have now better understanding of the regrename pass, thus,
the attached patch is likely the way go to.

I limited the conditions to moves only as I didn't see any tying for
matching constraints.

> 
> > @@ -1826,7 +1900,7 @@ regrename_optimize (void)
> >     df_analyze ();
> >     df_set_flags (DF_DEFER_INSN_RESCAN);
> >
> > -  regrename_init (false);
> > +  regrename_init (true);
> 
> It would be good to avoid this as it makes the renamer more expensive. I
> expect that if you follow the strategy described above, this won't be
> necessary.
> 
> > -  struct du_chain *chains[MAX_REGS_PER_ADDRESS];
> > -  struct du_head *heads[MAX_REGS_PER_ADDRESS];
> > +  vec<struct du_chain *> chains;
> > +  vec<struct du_head *> heads;
> 
> Given that MAX_REGS_PER_ADDRESS tends to be 1 or 2 this appears to make
> things more heavyweight, especially with the extra loop needed to free
> the vecs. If possible, try to avoid this. (Again, AFAICS this
> information shouldn't really be necessary for what you're trying to do).

It happens that MIPS defines the macro to 1 and one of the DSP tests in Dejagnu
failed. I wasn't sure if the macro used here is the correct one or rather
it should be relying on NREGS but this cannot be determined statically.
Alternatively, a new macro like MAX_REGS_PER_OPERAND would probably better
but now this change is redundant.

> 
> 
> Bernd

> > I'll let Bernd comment on the patch itself.  But I would say that if
> > you're setting up cases where we can tie the source/dest of an extension
> > together, then it's a good thing.  It'll cause more of them to turn into
> > NOPs and it'll make the redundant extension elimination pass more
> > effective as well.
>
> Not if you do it in regrename though, as it's run very late in the pipeline.
> If you want to make REE more effective, this would need to be done during RA.
>
> -- 
> Eric Botcazou

My immediate thought on this is to add a target hook and enable/disable this
on a port basis if this hurts performance.  I tried to fix this in IRA using
a similar technique as in this patch but couldn't find the right place
to apply this.  I'll come back to this if I have some more time. 

Regards,
Robert

gcc/
	* regrename.c (create_new_chain): Initialize terminated_dead,
	renamed and tied_chain.
	(find_best_rename_reg): Pick and check register from the tied chain.
	(regrename_do_replace): Mark head as renamed.
	(scan_rtx_reg): Tie chains in move insns.  Set terminate_dead flag.
	* regrename.h (struct du_head): Add tied_chain, renamed and
	terminated_dead members.
---
 gcc/regrename.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++-
 gcc/regrename.h |  6 ++++++
 2 files changed, 53 insertions(+), 1 deletion(-)

Comments

Bernd Schmidt Oct. 9, 2015, 11:06 a.m. UTC | #1
Hi Robert,
> gcc/
> 	* regrename.c (create_new_chain): Initialize terminated_dead,
> 	renamed and tied_chain.
> 	(find_best_rename_reg): Pick and check register from the tied chain.
> 	(regrename_do_replace): Mark head as renamed.
> 	(scan_rtx_reg): Tie chains in move insns.  Set terminate_dead flag.
> 	* regrename.h (struct du_head): Add tied_chain, renamed and
> 	terminated_dead members.

Thanks - this looks a lot better already. You didn't say how it was 
bootstrapped and tested; please include this information for future 
submissions. For a patch like this, some data on the improvement you got 
would also be appreciated.

I'd still like to investigate the possibility of further simplification:

> +	    {
> +	      /* Find the input chain.  */
> +	      for (i = c->id - 1; id_to_chain.iterate (i, &head); i--)
> +		if (head->last && head->last->insn == insn
> +		    && head->terminated_dead)
> +		  {
> +		    gcc_assert (head->regno == REGNO (recog_data.operand[1]));
> +		    c->tied_chain = head;
> +		    head->tied_chain = c;
> +
> +		    if (dump_file)
> +		      fprintf (dump_file, "Tying chain %s (%d) with %s (%d)\n",
> +			       reg_names[c->regno], c->id,
> +			       reg_names[head->regno], head->id);
> +		    /* Once tied, we're done.  */
> +		    break;
> +		  }
> +	    }
> +	}
> +
This looks like it's a little more complicated than necessary. Couldn't 
you add a static var "terminated_this_insn" which gets initialized to 
NULL and set when a reg dies, and then you check this here rather than 
having a loop? That would also eliminate the new "terminated_dead" field.

Other than that I'm pretty happy with this.


Bernd
Robert Suchanek Oct. 9, 2015, 11:20 a.m. UTC | #2
Hi Bernd,

> Hi Robert,
> > gcc/
> > 	* regrename.c (create_new_chain): Initialize terminated_dead,
> > 	renamed and tied_chain.
> > 	(find_best_rename_reg): Pick and check register from the tied chain.
> > 	(regrename_do_replace): Mark head as renamed.
> > 	(scan_rtx_reg): Tie chains in move insns.  Set terminate_dead flag.
> > 	* regrename.h (struct du_head): Add tied_chain, renamed and
> > 	terminated_dead members.
> 
> Thanks - this looks a lot better already. You didn't say how it was
> bootstrapped and tested; please include this information for future
> submissions. For a patch like this, some data on the improvement you got
> would also be appreciated.

Ah, sorry.  I bootstrapped on x86_64-unknown-linux-gnu and ran the Dejagnu
with -frename-registers.  All looked fine.  As for the data, I'll do 
the comparison and will update this thread by next week.

> 
> I'd still like to investigate the possibility of further simplification:
> 
> > +	    {
> > +	      /* Find the input chain.  */
> > +	      for (i = c->id - 1; id_to_chain.iterate (i, &head); i--)
> > +		if (head->last && head->last->insn == insn
> > +		    && head->terminated_dead)
> > +		  {
> > +		    gcc_assert (head->regno == REGNO (recog_data.operand[1]));
> > +		    c->tied_chain = head;
> > +		    head->tied_chain = c;
> > +
> > +		    if (dump_file)
> > +		      fprintf (dump_file, "Tying chain %s (%d) with %s (%d)\n",
> > +			       reg_names[c->regno], c->id,
> > +			       reg_names[head->regno], head->id);
> > +		    /* Once tied, we're done.  */
> > +		    break;
> > +		  }
> > +	    }
> > +	}
> > +
> This looks like it's a little more complicated than necessary. Couldn't
> you add a static var "terminated_this_insn" which gets initialized to
> NULL and set when a reg dies, and then you check this here rather than
> having a loop? That would also eliminate the new "terminated_dead" field.

That is a good idea. I'll add the changes and update together
with the results.

> Other than that I'm pretty happy with this.
> 
> 
> Bernd

Regards,
Robert
diff mbox

Patch

diff --git a/gcc/regrename.c b/gcc/regrename.c
index 6517f4e..7356a20 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -230,6 +230,9 @@  create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   head->nregs = this_nregs;
   head->need_caller_save_reg = 0;
   head->cannot_rename = 0;
+  head->terminated_dead = 0;
+  head->renamed = 0;
+  head->tied_chain = NULL;
 
   id_to_chain.safe_push (head);
   head->id = current_id++;
@@ -373,6 +376,13 @@  find_best_rename_reg (du_head_p this_head, enum reg_class super_class,
   preferred_class
     = (enum reg_class) targetm.preferred_rename_class (super_class);
 
+  /* Pick and check the register from the tied chain iff the tied chain
+     is not renamed.  */
+  if (this_head->tied_chain && !this_head->tied_chain->renamed
+      && check_new_reg_p (old_reg, this_head->tied_chain->regno,
+			  this_head, *unavailable))
+    return this_head->tied_chain->regno;
+
   /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass
      over registers that belong to PREFERRED_CLASS and try to find the
      best register within the class.  If that failed, we iterate in
@@ -952,6 +962,7 @@  regrename_do_replace (struct du_head *head, int reg)
     }
 
   mode = GET_MODE (*head->first->loc);
+  head->renamed = 1;
   head->regno = reg;
   head->nregs = hard_regno_nregs[reg][mode];
 }
@@ -1035,7 +1046,40 @@  scan_rtx_reg (rtx insn, rtx *loc, enum reg_class cl, enum scan_actions action,
   if (action == mark_write)
     {
       if (type == OP_OUT)
-	create_new_chain (this_regno, this_nregs, loc, insn, cl);
+	{
+	  int i;
+	  du_head_p c;
+	  du_head_p head;
+	  rtx pat = PATTERN (insn);
+
+	  c = create_new_chain (this_regno, this_nregs, loc, insn, cl);
+
+	  /* We try to tie chains in a move instruction for
+	     a single output.  */
+	  if (recog_data.n_operands == 2
+	      && GET_CODE (pat) == SET
+	      && GET_CODE (SET_DEST (pat)) == REG
+	      && GET_CODE (SET_SRC (pat)) == REG)
+	    {
+	      /* Find the input chain.  */
+	      for (i = c->id - 1; id_to_chain.iterate (i, &head); i--)
+		if (head->last && head->last->insn == insn
+		    && head->terminated_dead)
+		  {
+		    gcc_assert (head->regno == REGNO (recog_data.operand[1]));
+		    c->tied_chain = head;
+		    head->tied_chain = c;
+
+		    if (dump_file)
+		      fprintf (dump_file, "Tying chain %s (%d) with %s (%d)\n",
+			       reg_names[c->regno], c->id,
+			       reg_names[head->regno], head->id);
+		    /* Once tied, we're done.  */
+		    break;
+		  }
+	    }
+	}
+
       return;
     }
 
@@ -1143,6 +1187,8 @@  scan_rtx_reg (rtx insn, rtx *loc, enum reg_class cl, enum scan_actions action,
 		SET_HARD_REG_BIT (live_hard_regs, head->regno + nregs);
 	    }
 
+	  if (action == terminate_dead)
+	    (*p)->terminated_dead = 1;
 	  *p = next;
 	  if (dump_file)
 	    fprintf (dump_file,
diff --git a/gcc/regrename.h b/gcc/regrename.h
index 9a611f0..a61c5bd 100644
--- a/gcc/regrename.h
+++ b/gcc/regrename.h
@@ -28,6 +28,8 @@  struct du_head
   struct du_head *next_chain;
   /* The first and last elements of this chain.  */
   struct du_chain *first, *last;
+  /* The chain that this chain is tied to.  */
+  struct du_head *tied_chain;
   /* Describes the register being tracked.  */
   unsigned regno;
   int nregs;
@@ -45,6 +47,10 @@  struct du_head
      such as the SET_DEST of a CALL_INSN or an asm operand that used
      to be a hard register.  */
   unsigned int cannot_rename:1;
+  /* Nonzero if the chain is renamed.  */
+  unsigned int renamed:1;
+  /* Nonzero if the chain is marked as dead.  */
+  unsigned int terminated_dead:1;
 };
 
 typedef struct du_head *du_head_p;