diff mbox series

Rework the legitimize_address_displacement hook

Message ID 878tf4yjt3.fsf@linaro.org
State New
Headers show
Series Rework the legitimize_address_displacement hook | expand

Commit Message

Richard Sandiford Nov. 17, 2017, 4:03 p.m. UTC
This patch:

- tweaks the handling of legitimize_address_displacement
  so that it gets called before rather than after the address has
  been expanded.  This means that we're no longer at the mercy
  of LRA being able to interpret the expanded instructions.

- passes the original offset to legitimize_address_displacement.

- adds SVE support to the AArch64 implementation of
  legitimize_address_displacement.

Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
and powerpc64le-linux-gnu.  OK to install?

Richard


2017-11-17  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* target.def (legitimize_address_displacement): Take the original
	offset as a poly_int.
	* targhooks.h (default_legitimize_address_displacement): Update
	accordingly.
	* targhooks.c (default_legitimize_address_displacement): Likewise.
	* doc/tm.texi: Regenerate.
	* lra-constraints.c (base_plus_disp_to_reg): Take the displacement
	as an argument, moving assert of ad->disp == ad->disp_term to...
	(process_address_1): ...here.  Update calls to base_plus_disp_to_reg.
	Try calling targetm.legitimize_address_displacement before expanding
	the address rather than afterwards, and adjust for the new interface.
	* config/aarch64/aarch64.c (aarch64_legitimize_address_displacement):
	Match the new hook interface.  Handle SVE addresses.
	* config/sh/sh.c (sh_legitimize_address_displacement): Make the
	new hook interface.

Comments

Jeff Law Nov. 17, 2017, 7:45 p.m. UTC | #1
On 11/17/2017 09:03 AM, Richard Sandiford wrote:
> This patch:
> 
> - tweaks the handling of legitimize_address_displacement
>   so that it gets called before rather than after the address has
>   been expanded.  This means that we're no longer at the mercy
>   of LRA being able to interpret the expanded instructions.
> 
> - passes the original offset to legitimize_address_displacement.
> 
> - adds SVE support to the AArch64 implementation of
>   legitimize_address_displacement.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  <richard.sandiford@linaro.org>
> 	    Alan Hayward  <alan.hayward@arm.com>
> 	    David Sherwood  <david.sherwood@arm.com>
> 
> gcc/
> 	* target.def (legitimize_address_displacement): Take the original
> 	offset as a poly_int.
> 	* targhooks.h (default_legitimize_address_displacement): Update
> 	accordingly.
> 	* targhooks.c (default_legitimize_address_displacement): Likewise.
> 	* doc/tm.texi: Regenerate.
> 	* lra-constraints.c (base_plus_disp_to_reg): Take the displacement
> 	as an argument, moving assert of ad->disp == ad->disp_term to...
> 	(process_address_1): ...here.  Update calls to base_plus_disp_to_reg.
> 	Try calling targetm.legitimize_address_displacement before expanding
> 	the address rather than afterwards, and adjust for the new interface.
> 	* config/aarch64/aarch64.c (aarch64_legitimize_address_displacement):
> 	Match the new hook interface.  Handle SVE addresses.
> 	* config/sh/sh.c (sh_legitimize_address_displacement): Make the
> 	new hook interface.
It depends on poly, so it can't go in yet.  But if/when poly is
approved, the target independent bits are fine as are the sh changes.
I'll let the aarch64 maintainers chime in on those bits.

Jeff
James Greenhalgh Jan. 7, 2018, 8:45 p.m. UTC | #2
On Fri, Nov 17, 2017 at 07:45:28PM +0000, Jeff Law wrote:
> On 11/17/2017 09:03 AM, Richard Sandiford wrote:
> > This patch:
> > 
> > - tweaks the handling of legitimize_address_displacement
> >   so that it gets called before rather than after the address has
> >   been expanded.  This means that we're no longer at the mercy
> >   of LRA being able to interpret the expanded instructions.
> > 
> > - passes the original offset to legitimize_address_displacement.
> > 
> > - adds SVE support to the AArch64 implementation of
> >   legitimize_address_displacement.
> > 
> > Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> > and powerpc64le-linux-gnu.  OK to install?
> > 
> > Richard
> > 
> > 
> > 2017-11-17  Richard Sandiford  <richard.sandiford@linaro.org>
> > 	    Alan Hayward  <alan.hayward@arm.com>
> > 	    David Sherwood  <david.sherwood@arm.com>
> > 
> > gcc/
> > 	* target.def (legitimize_address_displacement): Take the original
> > 	offset as a poly_int.
> > 	* targhooks.h (default_legitimize_address_displacement): Update
> > 	accordingly.
> > 	* targhooks.c (default_legitimize_address_displacement): Likewise.
> > 	* doc/tm.texi: Regenerate.
> > 	* lra-constraints.c (base_plus_disp_to_reg): Take the displacement
> > 	as an argument, moving assert of ad->disp == ad->disp_term to...
> > 	(process_address_1): ...here.  Update calls to base_plus_disp_to_reg.
> > 	Try calling targetm.legitimize_address_displacement before expanding
> > 	the address rather than afterwards, and adjust for the new interface.
> > 	* config/aarch64/aarch64.c (aarch64_legitimize_address_displacement):
> > 	Match the new hook interface.  Handle SVE addresses.
> > 	* config/sh/sh.c (sh_legitimize_address_displacement): Make the
> > 	new hook interface.
> It depends on poly, so it can't go in yet.  But if/when poly is
> approved, the target independent bits are fine as are the sh changes.
> I'll let the aarch64 maintainers chime in on those bits.

> > Index: gcc/config/aarch64/aarch64.c
> > ===================================================================
> > --- gcc/config/aarch64/aarch64.c	2017-11-17 16:01:54.798206825 +0000
> > +++ gcc/config/aarch64/aarch64.c	2017-11-17 16:01:54.992032358 +0000
> > @@ -5902,32 +5902,78 @@ aarch64_legitimate_address_p (machine_mo
> >    return aarch64_classify_address (&addr, x, mode, strict_p, type);
> >  }
> >  
> > -/* Split an out-of-range address displacement into a base and offset.
> > -   Use 4KB range for 1- and 2-byte accesses and a 16KB range otherwise
> > -   to increase opportunities for sharing the base address of different sizes.
> > -   Unaligned accesses use the signed 9-bit range, TImode/TFmode use
> > -   the intersection of signed scaled 7-bit and signed 9-bit offset.  */
> > +/* Implement TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT.  */
> > +

The new comment is not especially descriptive of the arguments for this hook.
If you could update it, that would be helpful, though I appreciate it will
be documented at the definition of the hook itself.

Otherwise, this is OK.

Thanks,
James
diff mbox series

Patch

Index: gcc/target.def
===================================================================
--- gcc/target.def	2017-11-17 16:01:54.798206825 +0000
+++ gcc/target.def	2017-11-17 16:01:54.994775173 +0000
@@ -5615,13 +5615,16 @@  as SH, this hook can be used to avoid ex
    displacement addressing.  */
 DEFHOOK
 (legitimize_address_displacement,
- "A target hook which returns @code{true} if *@var{disp} is\n\
-legitimezed to valid address displacement with subtracting *@var{offset}\n\
-at memory mode @var{mode}.\n\
-The default version of this target hook returns @code{false}.\n\
-This hook will benefit machines with limited base plus displacement\n\
-addressing.",
- bool, (rtx *disp, rtx *offset, machine_mode mode),
+ "This hook tries to split address offset @var{orig_offset} into\n\
+two parts: one that should be added to the base address to create\n\
+a local anchor point, and an additional offset that can be applied\n\
+to the anchor to address a value of mode @var{mode}.  The idea is that\n\
+the local anchor could be shared by other accesses to nearby locations.\n\
+\n\
+The hook returns true if it succeeds, storing the offset of the\n\
+anchor from the base in @var{offset1} and the offset of the final address\n\
+from the anchor in @var{offset2}.  The default implementation returns false.",
+ bool, (rtx *offset1, rtx *offset2, poly_int64 orig_offset, machine_mode mode),
  default_legitimize_address_displacement)
 
 /* This target hook allows the backend to perform additional
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h	2017-11-17 16:01:54.798206825 +0000
+++ gcc/targhooks.h	2017-11-17 16:01:54.995689444 +0000
@@ -25,7 +25,7 @@  extern bool default_legitimate_address_p
 extern void default_external_libcall (rtx);
 extern rtx default_legitimize_address (rtx, rtx, machine_mode);
 extern bool default_legitimize_address_displacement (rtx *, rtx *,
-						     machine_mode);
+						     poly_int64, machine_mode);
 extern bool default_const_not_ok_for_debug_p (rtx);
 
 extern int default_unspec_may_trap_p (const_rtx, unsigned);
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	2017-11-17 16:01:54.798206825 +0000
+++ gcc/targhooks.c	2017-11-17 16:01:54.994775173 +0000
@@ -171,9 +171,8 @@  default_legitimize_address (rtx x, rtx o
 }
 
 bool
-default_legitimize_address_displacement (rtx *disp ATTRIBUTE_UNUSED,
-					 rtx *offset ATTRIBUTE_UNUSED,
-					 machine_mode mode ATTRIBUTE_UNUSED)
+default_legitimize_address_displacement (rtx *, rtx *, poly_int64,
+					 machine_mode)
 {
   return false;
 }
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2017-11-17 16:01:54.798206825 +0000
+++ gcc/doc/tm.texi	2017-11-17 16:01:54.993860901 +0000
@@ -2894,13 +2894,16 @@  machines with non orthogonal register us
 as SH, this hook can be used to avoid excessive spilling.
 @end deftypefn
 
-@deftypefn {Target Hook} bool TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT (rtx *@var{disp}, rtx *@var{offset}, machine_mode @var{mode})
-A target hook which returns @code{true} if *@var{disp} is
-legitimezed to valid address displacement with subtracting *@var{offset}
-at memory mode @var{mode}.
-The default version of this target hook returns @code{false}.
-This hook will benefit machines with limited base plus displacement
-addressing.
+@deftypefn {Target Hook} bool TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT (rtx *@var{offset1}, rtx *@var{offset2}, poly_int64 @var{orig_offset}, machine_mode @var{mode})
+This hook tries to split address offset @var{orig_offset} into
+two parts: one that should be added to the base address to create
+a local anchor point, and an additional offset that can be applied
+to the anchor to address a value of mode @var{mode}.  The idea is that
+the local anchor could be shared by other accesses to nearby locations.
+
+The hook returns true if it succeeds, storing the offset of the
+anchor from the base in @var{offset1} and the offset of the final address
+from the anchor in @var{offset2}.  The default implementation returns false.
 @end deftypefn
 
 @deftypefn {Target Hook} reg_class_t TARGET_SPILL_CLASS (reg_class_t, @var{machine_mode})
Index: gcc/lra-constraints.c
===================================================================
--- gcc/lra-constraints.c	2017-11-17 16:01:54.798206825 +0000
+++ gcc/lra-constraints.c	2017-11-17 16:01:54.994775173 +0000
@@ -3053,19 +3053,19 @@  base_to_reg (struct address_info *ad)
   return new_inner;
 }
 
-/* Make reload base reg + disp from address AD.  Return the new pseudo.  */
+/* Make reload base reg + DISP from address AD.  Return the new pseudo.  */
 static rtx
-base_plus_disp_to_reg (struct address_info *ad)
+base_plus_disp_to_reg (struct address_info *ad, rtx disp)
 {
   enum reg_class cl;
   rtx new_reg;
 
-  lra_assert (ad->base == ad->base_term && ad->disp == ad->disp_term);
+  lra_assert (ad->base == ad->base_term);
   cl = base_reg_class (ad->mode, ad->as, ad->base_outer_code,
 		       get_index_code (ad));
   new_reg = lra_create_new_reg (GET_MODE (*ad->base_term), NULL_RTX,
 				cl, "base + disp");
-  lra_emit_add (new_reg, *ad->base_term, *ad->disp_term);
+  lra_emit_add (new_reg, *ad->base_term, disp);
   return new_reg;
 }
 
@@ -3415,12 +3415,30 @@  process_address_1 (int nop, bool check_o
 	 displacements, so reloading into an index register would
 	 not necessarily be a win.  */
       if (new_reg == NULL_RTX)
-        new_reg = base_plus_disp_to_reg (&ad);
+	{
+	  /* See if the target can split the displacement into a
+	     legitimate new displacement from a local anchor.  */
+	  gcc_assert (ad.disp == ad.disp_term);
+	  poly_int64 orig_offset;
+	  rtx offset1, offset2;
+	  if (poly_int_rtx_p (*ad.disp, &orig_offset)
+	      && targetm.legitimize_address_displacement (&offset1, &offset2,
+							  orig_offset,
+							  ad.mode))
+	    {
+	      new_reg = base_plus_disp_to_reg (&ad, offset1);
+	      new_reg = gen_rtx_PLUS (GET_MODE (new_reg), new_reg, offset2);
+	    }
+	  else
+	    new_reg = base_plus_disp_to_reg (&ad, *ad.disp);
+	}
       insns = get_insns ();
       last_insn = get_last_insn ();
       /* If we generated at least two insns, try last insn source as
 	 an address.  If we succeed, we generate one less insn.  */
-      if (last_insn != insns && (set = single_set (last_insn)) != NULL_RTX
+      if (REG_P (new_reg)
+	  && last_insn != insns
+	  && (set = single_set (last_insn)) != NULL_RTX
 	  && GET_CODE (SET_SRC (set)) == PLUS
 	  && REG_P (XEXP (SET_SRC (set), 0))
 	  && CONSTANT_P (XEXP (SET_SRC (set), 1)))
@@ -3440,32 +3458,6 @@  process_address_1 (int nop, bool check_o
 	      delete_insns_since (PREV_INSN (last_insn));
 	    }
 	}
-      /* Try if target can split displacement into legitimite new disp
-	 and offset.  If it's the case, we replace the last insn with
-	 insns for base + offset => new_reg and set new_reg + new disp
-	 to *ad.inner.  */
-      last_insn = get_last_insn ();
-      if ((set = single_set (last_insn)) != NULL_RTX
-	  && GET_CODE (SET_SRC (set)) == PLUS
-	  && REG_P (XEXP (SET_SRC (set), 0))
-	  && REGNO (XEXP (SET_SRC (set), 0)) < FIRST_PSEUDO_REGISTER
-	  && CONST_INT_P (XEXP (SET_SRC (set), 1)))
-	{
-	  rtx addend, disp = XEXP (SET_SRC (set), 1);
-	  if (targetm.legitimize_address_displacement (&disp, &addend,
-						       ad.mode))
-	    {
-	      rtx_insn *new_insns;
-	      start_sequence ();
-	      lra_emit_add (new_reg, XEXP (SET_SRC (set), 0), addend);
-	      new_insns = get_insns ();
-	      end_sequence ();
-	      new_reg = gen_rtx_PLUS (Pmode, new_reg, disp);
-	      delete_insns_since (PREV_INSN (last_insn));
-	      add_insn (new_insns);
-	      insns = get_insns ();
-	    }
-	}
       end_sequence ();
       emit_insn (insns);
       *ad.inner = new_reg;
@@ -3474,7 +3466,8 @@  process_address_1 (int nop, bool check_o
     {
       /* base + scale * index + disp => new base + scale * index,
 	 case (1) above.  */
-      new_reg = base_plus_disp_to_reg (&ad);
+      gcc_assert (ad.disp == ad.disp_term);
+      new_reg = base_plus_disp_to_reg (&ad, *ad.disp);
       *ad.inner = simplify_gen_binary (PLUS, GET_MODE (new_reg),
 				       new_reg, *ad.index);
     }
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2017-11-17 16:01:54.798206825 +0000
+++ gcc/config/aarch64/aarch64.c	2017-11-17 16:01:54.992032358 +0000
@@ -5902,32 +5902,78 @@  aarch64_legitimate_address_p (machine_mo
   return aarch64_classify_address (&addr, x, mode, strict_p, type);
 }
 
-/* Split an out-of-range address displacement into a base and offset.
-   Use 4KB range for 1- and 2-byte accesses and a 16KB range otherwise
-   to increase opportunities for sharing the base address of different sizes.
-   Unaligned accesses use the signed 9-bit range, TImode/TFmode use
-   the intersection of signed scaled 7-bit and signed 9-bit offset.  */
+/* Implement TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT.  */
+
 static bool
-aarch64_legitimize_address_displacement (rtx *disp, rtx *off, machine_mode mode)
+aarch64_legitimize_address_displacement (rtx *offset1, rtx *offset2,
+					 poly_int64 orig_offset,
+					 machine_mode mode)
 {
   HOST_WIDE_INT size;
   if (GET_MODE_SIZE (mode).is_constant (&size))
     {
-      HOST_WIDE_INT offset = INTVAL (*disp);
-      HOST_WIDE_INT base;
+      HOST_WIDE_INT const_offset, second_offset;
+
+      /* A general SVE offset is A * VQ + B.  Remove the A component from
+	 coefficient 0 in order to get the constant B.  */
+      const_offset = orig_offset.coeffs[0] - orig_offset.coeffs[1];
 
+      /* Split an out-of-range address displacement into a base and
+	 offset.  Use 4KB range for 1- and 2-byte accesses and a 16KB
+	 range otherwise to increase opportunities for sharing the base
+	 address of different sizes.  Unaligned accesses use the signed
+	 9-bit range, TImode/TFmode use the intersection of signed
+	 scaled 7-bit and signed 9-bit offset.  */
       if (mode == TImode || mode == TFmode)
-	base = (offset + 0x100) & ~0x1f8;
-      else if ((offset & (size - 1)) != 0)
-	base = (offset + 0x100) & ~0x1ff;
+	second_offset = ((const_offset + 0x100) & 0x1f8) - 0x100;
+      else if ((const_offset & (size - 1)) != 0)
+	second_offset = ((const_offset + 0x100) & 0x1ff) - 0x100;
+      else
+	second_offset = const_offset & (size < 4 ? 0xfff : 0x3ffc);
+
+      if (second_offset == 0 || must_eq (orig_offset, second_offset))
+	return false;
+
+      /* Split the offset into second_offset and the rest.  */
+      *offset1 = gen_int_mode (orig_offset - second_offset, Pmode);
+      *offset2 = gen_int_mode (second_offset, Pmode);
+      return true;
+    }
+  else
+    {
+      /* Get the mode we should use as the basis of the range.  For structure
+	 modes this is the mode of one vector.  */
+      unsigned int vec_flags = aarch64_classify_vector_mode (mode);
+      machine_mode step_mode
+	= (vec_flags & VEC_STRUCT) != 0 ? SVE_BYTE_MODE : mode;
+
+      /* Get the "mul vl" multiplier we'd like to use.  */
+      HOST_WIDE_INT factor = GET_MODE_SIZE (step_mode).coeffs[1];
+      HOST_WIDE_INT vnum = orig_offset.coeffs[1] / factor;
+      if (vec_flags & VEC_SVE_DATA)
+	/* LDR supports a 9-bit range, but the move patterns for
+	   structure modes require all vectors to be in range of the
+	   same base.  The simplest way of accomodating that while still
+	   promoting reuse of anchor points between different modes is
+	   to use an 8-bit range unconditionally.  */
+	vnum = ((vnum + 128) & 255) - 128;
       else
-	base = offset & ~(size < 4 ? 0xfff : 0x3ffc);
+	/* Predicates are only handled singly, so we might as well use
+	   the full range.  */
+	vnum = ((vnum + 256) & 511) - 256;
+      if (vnum == 0)
+	return false;
+
+      /* Convert the "mul vl" multiplier into a byte offset.  */
+      poly_int64 second_offset = GET_MODE_SIZE (step_mode) * vnum;
+      if (must_eq (second_offset, orig_offset))
+	return false;
 
-      *off = GEN_INT (base);
-      *disp = GEN_INT (offset - base);
+      /* Split the offset into second_offset and the rest.  */
+      *offset1 = gen_int_mode (orig_offset - second_offset, Pmode);
+      *offset2 = gen_int_mode (second_offset, Pmode);
       return true;
     }
-  return false;
 }
 
 /* Return the binary representation of floating point constant VALUE in INTVAL.
Index: gcc/config/sh/sh.c
===================================================================
--- gcc/config/sh/sh.c	2017-11-17 16:01:54.798206825 +0000
+++ gcc/config/sh/sh.c	2017-11-17 16:01:54.992946630 +0000
@@ -269,7 +269,8 @@  static bool sh_legitimate_address_p (mac
 static rtx sh_legitimize_address (rtx, rtx, machine_mode);
 static rtx sh_delegitimize_address (rtx);
 static bool sh_cannot_substitute_mem_equiv_p (rtx);
-static bool sh_legitimize_address_displacement (rtx *, rtx *, machine_mode);
+static bool sh_legitimize_address_displacement (rtx *, rtx *,
+						poly_int64, machine_mode);
 static int scavenge_reg (HARD_REG_SET *s);
 
 static rtx sh_struct_value_rtx (tree, int);
@@ -11395,20 +11396,21 @@  sh_cannot_substitute_mem_equiv_p (rtx)
   return true;
 }
 
-/* Return true if DISP can be legitimized.  */
+/* Implement TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT.  */
 static bool
-sh_legitimize_address_displacement (rtx *disp, rtx *offs,
+sh_legitimize_address_displacement (rtx *offset1, rtx *offset2,
+				    poly_int64 orig_offset,
 				    machine_mode mode)
 {
   if ((TARGET_FPU_DOUBLE && mode == DFmode)
       || (TARGET_SH2E && mode == SFmode))
     return false;
 
-  struct disp_adjust adj = sh_find_mov_disp_adjust (mode, INTVAL (*disp));
+  struct disp_adjust adj = sh_find_mov_disp_adjust (mode, orig_offset);
   if (adj.offset_adjust != NULL_RTX && adj.mov_disp != NULL_RTX)
     {
-      *disp = adj.mov_disp;
-      *offs = adj.offset_adjust;
+      *offset1 = adj.offset_adjust;
+      *offset2 = adj.mov_disp;
       return true;
     }