diff mbox series

RFC: lra-constraints.c and TARGET_HARD_REGNO_CALL_PART_CLOBBERED question/patch

Message ID 1531339634.31833.51.camel@cavium.com
State New
Headers show
Series RFC: lra-constraints.c and TARGET_HARD_REGNO_CALL_PART_CLOBBERED question/patch | expand

Commit Message

Steve Ellcey July 11, 2018, 8:07 p.m. UTC
I have a reload/register allocation question and possible patch.  While
working on the Aarch64 SIMD ABI[1] I ran into a problem where GCC was
saving and restoring registers that it did not need to.  I tracked it
down to lra-constraints.c and its use of
targetm.hard_regno_call_part_clobbered on instructions that are not
calls.  Specifically need_for_call_save_p would check this macro even
when the instruction in question (unknown to need_for_call_save_p)
was not a call instruction.

This seems wrong to me and I was wondering if anyone more familiar
with the register allocator and reload could look at this patch and
tell me if it seems reasonable or not.  It passed bootstrap and I
am running tests now.  I am just wondering if there is any reason why
this target function would need to be called on non-call instructions
or if doing so is just an oversight/bug.

Steve Ellcey
sellcey@cavium.com


[1] https://gcc.gnu.org/ml/gcc/2018-07/msg00012.html


2018-07-11  Steve Ellcey  <sellcey@cavium.com>

	* lra-constraints.c (need_for_call_save_p): Add insn argument
	and only check targetm.hard_regno_call_part_clobbered on calls.
	(need_for_split_p): Add insn argument, pass to need_for_call_save_p.
	(split_reg): Pass insn to need_for_call_save_p.
	(split_if_necessary): Pass curr_insn to need_for_split_p.
	(inherit_in_ebb): Ditto.

Comments

Jeff Law July 11, 2018, 9:54 p.m. UTC | #1
On 07/11/2018 02:07 PM, Steve Ellcey wrote:
> I have a reload/register allocation question and possible patch.  While
> working on the Aarch64 SIMD ABI[1] I ran into a problem where GCC was
> saving and restoring registers that it did not need to.  I tracked it
> down to lra-constraints.c and its use of
> targetm.hard_regno_call_part_clobbered on instructions that are not
> calls.  Specifically need_for_call_save_p would check this macro even
> when the instruction in question (unknown to need_for_call_save_p)
> was not a call instruction.
> 
> This seems wrong to me and I was wondering if anyone more familiar
> with the register allocator and reload could look at this patch and
> tell me if it seems reasonable or not.  It passed bootstrap and I
> am running tests now.  I am just wondering if there is any reason why
> this target function would need to be called on non-call instructions
> or if doing so is just an oversight/bug.
> 
> Steve Ellcey
> sellcey@cavium.com
> 
> 
> [1] https://gcc.gnu.org/ml/gcc/2018-07/msg00012.html
> 
> 
> 2018-07-11  Steve Ellcey  <sellcey@cavium.com>
> 
> 	* lra-constraints.c (need_for_call_save_p): Add insn argument
> 	and only check targetm.hard_regno_call_part_clobbered on calls.
> 	(need_for_split_p): Add insn argument, pass to need_for_call_save_p.
> 	(split_reg): Pass insn to need_for_call_save_p.
> 	(split_if_necessary): Pass curr_insn to need_for_split_p.
> 	(inherit_in_ebb): Ditto.
Various target have calls which are exposed as INSNs rather than as
CALL_INSNs.   So we need to check that hook on all insns.

You can probably see this in action with the TLS insns on aarch64.

jeff
Richard Sandiford July 12, 2018, 6:17 a.m. UTC | #2
Jeff Law <law@redhat.com> writes:
> On 07/11/2018 02:07 PM, Steve Ellcey wrote:
>> I have a reload/register allocation question and possible patch.  While
>> working on the Aarch64 SIMD ABI[1] I ran into a problem where GCC was
>> saving and restoring registers that it did not need to.  I tracked it
>> down to lra-constraints.c and its use of
>> targetm.hard_regno_call_part_clobbered on instructions that are not
>> calls.  Specifically need_for_call_save_p would check this macro even
>> when the instruction in question (unknown to need_for_call_save_p)
>> was not a call instruction.
>> 
>> This seems wrong to me and I was wondering if anyone more familiar
>> with the register allocator and reload could look at this patch and
>> tell me if it seems reasonable or not.  It passed bootstrap and I
>> am running tests now.  I am just wondering if there is any reason why
>> this target function would need to be called on non-call instructions
>> or if doing so is just an oversight/bug.
>> 
>> Steve Ellcey
>> sellcey@cavium.com
>> 
>> 
>> [1] https://gcc.gnu.org/ml/gcc/2018-07/msg00012.html
>> 
>> 
>> 2018-07-11  Steve Ellcey  <sellcey@cavium.com>
>> 
>> 	* lra-constraints.c (need_for_call_save_p): Add insn argument
>> 	and only check targetm.hard_regno_call_part_clobbered on calls.
>> 	(need_for_split_p): Add insn argument, pass to need_for_call_save_p.
>> 	(split_reg): Pass insn to need_for_call_save_p.
>> 	(split_if_necessary): Pass curr_insn to need_for_split_p.
>> 	(inherit_in_ebb): Ditto.
> Various target have calls which are exposed as INSNs rather than as
> CALL_INSNs.   So we need to check that hook on all insns.
>
> You can probably see this in action with the TLS insns on aarch64.

Not sure whether it's that: I think other code does only consider
hard_regno_call_part_clobbered on calls.  But as it stands
need_for_call_save_p is checking whether there's a call somewhere
inbetween the current instruction and the last use in the EBB:

/* Return true if we need a caller save/restore for pseudo REGNO which
   was assigned to a hard register.  */
static inline bool
need_for_call_save_p (int regno)
{
  lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
  return (usage_insns[regno].calls_num < calls_num
...
}

So it only calls targetm.hard_regno_call_part_clobbered if such a
call is known to exist somewhere between the two references to
regno (although we don't have the calls themselves to hand).

Thanks,
Richard
Steve Ellcey July 12, 2018, 10:59 p.m. UTC | #3
On Thu, 2018-07-12 at 07:17 +0100, Richard Sandiford wrote:

> So it only calls targetm.hard_regno_call_part_clobbered if such a
> call is known to exist somewhere between the two references to
> regno (although we don't have the calls themselves to hand).
> 
> Thanks,
> Richard

Having the specific calls is the problem here because the normal ABI
and the SIMD ABI are going to have different values for caller_save
and part_clobbered.  But I think I have found a better way to address
this.

Looking at 'need_for_call_save_p', when 'flag_ipa_ra' is set, we look
at 'actual_call_used_reg_set' to see if we need to save a
register.  But even if a particular register doesn't show up as used,
if 'hard_regno_call_part_clobbered' is set for that register we are
going to return true and say we need a save/restore of the
register.  On Aarch64, if I 'really' didn't use the register I
shouldn't need to save/restore it.  If I used it but it is callee saved
then on 'normal' functions I may need to save/restore it (to protect
the upper bits) but on SIMD functions I do not need to save/restore it
at all because the callee will save/restore the entire register and not
just the lower 64 bits.

I think that the patch I want to create is: when 'flag_ipa_ra' is true,
remove the check of 'hard_regno_call_part_clobbered' from
'need_for_call_save_p'.  Instead, check
'hard_regno_call_part_clobbered' in 'process_bb_lives' (where we know
exactly what function is being called) and use that to set
'actual_call_used_reg_set'.  In process_bb_live we know exactly what
function we are calling and can check to see if it is a 'normal'
function or a SIMD function.

Steve Ellcey
sellcey@cavium.com
diff mbox series

Patch

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 7eeec76..7fc8e7f 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -5344,7 +5344,7 @@  inherit_reload_reg (bool def_p, int original_regno,
 /* Return true if we need a caller save/restore for pseudo REGNO which
    was assigned to a hard register.  */
 static inline bool
-need_for_call_save_p (int regno)
+need_for_call_save_p (int regno, rtx_insn *insn)
 {
   lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
   return (usage_insns[regno].calls_num < calls_num
@@ -5354,7 +5354,7 @@  need_for_call_save_p (int regno)
 	       ? lra_reg_info[regno].actual_call_used_reg_set
 	       : call_used_reg_set,
 	       PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
-	      || (targetm.hard_regno_call_part_clobbered
+	      || (CALL_P (insn) && targetm.hard_regno_call_part_clobbered
 		  (reg_renumber[regno], PSEUDO_REGNO_MODE (regno)))));
 }
 
@@ -5374,7 +5374,8 @@  static bitmap_head ebb_global_regs;
    assignment pass because of too many generated moves which will be
    probably removed in the undo pass.  */
 static inline bool
-need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno)
+need_for_split_p (HARD_REG_SET potential_reload_hard_regs,
+		  int regno, rtx_insn *insn)
 {
   int hard_regno = regno < FIRST_PSEUDO_REGISTER ? regno : reg_renumber[regno];
 
@@ -5416,7 +5417,8 @@  need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno)
 	       || (regno >= FIRST_PSEUDO_REGISTER
 		   && lra_reg_info[regno].nrefs > 3
 		   && bitmap_bit_p (&ebb_global_regs, regno))))
-	  || (regno >= FIRST_PSEUDO_REGISTER && need_for_call_save_p (regno)));
+	  || (regno >= FIRST_PSEUDO_REGISTER
+	      && need_for_call_save_p (regno, insn)));
 }
 
 /* Return class for the split pseudo created from original pseudo with
@@ -5536,7 +5538,7 @@  split_reg (bool before_p, int original_regno, rtx_insn *insn,
       nregs = hard_regno_nregs (hard_regno, mode);
       rclass = lra_get_allocno_class (original_regno);
       original_reg = regno_reg_rtx[original_regno];
-      call_save_p = need_for_call_save_p (original_regno);
+      call_save_p = need_for_call_save_p (original_regno, insn);
     }
   lra_assert (hard_regno >= 0);
   if (lra_dump_file != NULL)
@@ -5759,7 +5761,7 @@  split_if_necessary (int regno, machine_mode mode,
 	     && INSN_UID (next_usage_insns) < max_uid)
 	    || (GET_CODE (next_usage_insns) == INSN_LIST
 		&& (INSN_UID (XEXP (next_usage_insns, 0)) < max_uid)))
-	&& need_for_split_p (potential_reload_hard_regs, regno + i)
+	&& need_for_split_p (potential_reload_hard_regs, regno + i, insn)
 	&& split_reg (before_p, regno + i, insn, next_usage_insns, NULL))
     res = true;
   return res;
@@ -6529,7 +6531,8 @@  inherit_in_ebb (rtx_insn *head, rtx_insn *tail)
 		  && usage_insns[j].check == curr_usage_insns_check
 		  && (next_usage_insns = usage_insns[j].insns) != NULL_RTX)
 		{
-		  if (need_for_split_p (potential_reload_hard_regs, j))
+		  if (need_for_split_p (potential_reload_hard_regs, j,
+					curr_insn))
 		    {
 		      if (lra_dump_file != NULL && head_p)
 			{