[RFC,AArch64,PR,63304] Handle literal pools for functions > 1 MiB in size.

Message ID 20150727143235.GA19415@e105545-lin
State New

Commit Message

Ramana Radhakrishnan July 27, 2015, 2:33 p.m. UTC
Hi,

	This patch appears to fix the issue in PR63304, where we have
functions that are larger than 1 MiB. The idea is to use adrp / ldr or
adrp / add instruction sequences to address the literal pools, under the
control of a command line option. The patch as attached turns this
feature on by default, as that is the mode I've used for testing.
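
To illustrate the change, here is a minimal sketch of the kind of code
affected, with the old and new literal pool addressing sequences shown
in the comment. The register choices and the exact sequences are
illustrative only, not taken from the patch:

    /* Illustrative only: a floating point constant normally goes into
       the literal pool and is loaded with a single PC relative literal
       load, whose range is +/- 1 MiB.  Once the function body grows
       beyond that range, the load can no longer reach the pool entry.

         old sequence (PC relative literal load, +/- 1 MiB):
             ldr   d0, .LC0

         new sequence with this patch (adrp / ldr, 4 GiB range):
             adrp  x1, .LC0
             ldr   d0, [x1, #:lo12:.LC0]  */

    double
    scale (double x)
    {
      return x * 3.14159265358979323846;
    }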

My thought is that we turn this on by default on trunk but keep it
disabled by default on the release branches, so that the feature gets
some serious testing while it bakes on trunk.

As a follow-up I would like to see whether estimate_num_insns or
something else can give us a heuristic to turn this on only for "large"
functions. After all, the incidence of such functions is quite low in
real life, so maybe we should restrict this as much as possible: this
code generation implies an extra integer register for addressing every
floating point and vector constant, and that is not great in code that
may already have high register pressure.
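
As a rough sketch of the kind of heuristic I have in mind (the function
name and threshold below are hypothetical and not part of the patch;
estimate_num_insns_fn is just one possible way of estimating size):

    /* Hypothetical sketch only: enable the non PC relative literal
       pool expansion when the current function looks "large".  The
       name, the threshold and the choice of size estimate here are
       assumptions for illustration.  */

    #include "tree-inline.h"  /* estimate_num_insns_fn, eni_size_weights.  */

    /* Roughly 1 MiB of code at 4 bytes per instruction.  */
    #define AARCH64_LARGE_FUNCTION_INSNS 262144

    static bool
    aarch64_large_function_p (void)
    {
      if (!current_function_decl)
        return false;
      return (estimate_num_insns_fn (current_function_decl, &eni_size_weights)
              > AARCH64_LARGE_FUNCTION_INSNS);
    }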

Tested on aarch64-none-elf with no regressions. Bootstrapped and
regression tested on aarch64-none-linux-gnu. Additionally, a test run
of SPEC2k showed no issues other than a miscompare for eon, which also
appeared in the base run. Thus, for the minute, I'm reasonably
confident this is sane.

I will also note that this patch will need rebasing on top of Kyrill's
work on target attributes and thus cannot be final for trunk; however,
this version is still applicable to all the release branches.

More testing is still underway, but in the meantime I'd like to put
this up for comments, please.

regards
Ramana

<DATE>  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

	PR target/63304
	* config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
	nopcrelative_literal_loads.
	(aarch64_classify_address): Likewise.
	(aarch64_constant_pool_reload_icode): Define.
	(aarch64_secondary_reload): Handle secondary reloads for
	literal pools.
	(aarch64_override_options): Handle nopcrelative_literal_loads.
	(aarch64_classify_symbol): Handle nopcrelative_literal_loads.
	* config/aarch64/aarch64.md (aarch64_reload_movcp<ALLTF:mode><P:mode>):
	Define.
	(aarch64_reload_movcp<VALL:mode><P:mode>): Likewise.
	* config/aarch64/aarch64.opt: New option mnopc-relative-literal-loads.
	* config/aarch64/predicates.md (aarch64_constant_pool_symref): New
	predicate.
	* doc/invoke.texi (mnopc-relative-literal-loads): Document.
---
 gcc/config/aarch64/aarch64.c     | 102 +++++++++++++++++++++++++++++++++++++--
 gcc/config/aarch64/aarch64.md    |  26 ++++++++++
 gcc/config/aarch64/aarch64.opt   |   4 ++
 gcc/config/aarch64/predicates.md |   4 ++
 gcc/doc/invoke.texi              |   8 +++
 5 files changed, 140 insertions(+), 4 deletions(-)

Comments

Marcus Shawcroft Aug. 27, 2015, 2:07 p.m. UTC | #1
On 27 July 2015 at 15:33, Ramana Radhakrishnan
<ramana.radhakrishnan@foss.arm.com> wrote:

> <DATE>  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
>
>         PR target/63304
>         * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
>         nopcrelative_literal_loads.
>         (aarch64_classify_address): Likewise.
>         (aarch64_constant_pool_reload_icode): Define.
>         (aarch64_secondary_reload): Handle secondary reloads for
>         literal pools.
>         (aarch64_override_options): Handle nopcrelative_literal_loads.
>         (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
>         * config/aarch64/aarch64.md (aarch64_reload_movcp<ALLTF:mode><P:mode>):
>         Define.
>         (aarch64_reload_movcp<VALL:mode><P:mode>): Likewise.
>         * config/aarch64/aarch64.opt: New option mnopc-relative-literal-loads
>         * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
>         predicate.
>         * doc/invoke.texi (mnopc-relative-literal-loads): Document.

This looks OK to me. It needs rebasing, but OK if the rebase is
trivial.  Default on is fine.  Hold off on the backports for a couple
of weeks.
Cheers
/Marcus

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 020f63c..f37a031 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1612,11 +1612,27 @@  aarch64_expand_mov_immediate (rtx dest, rtx imm)
 	      aarch64_emit_move (dest, base);
 	      return;
 	    }
+
 	  mem = force_const_mem (ptr_mode, imm);
 	  gcc_assert (mem);
+
+	  /* If we aren't generating PC relative literals, then
+	     we need to expand the literal pool access carefully.
+	     This is something that needs to be done in a number
+	     of places, so could well live as a separate function.  */
+	  if (nopcrelative_literal_loads)
+	    {
+	      gcc_assert (can_create_pseudo_p ());
+	      base = gen_reg_rtx (ptr_mode);
+	      aarch64_expand_mov_immediate (base, XEXP (mem, 0));
+	      mem = gen_rtx_MEM (ptr_mode, base);
+	    }
+
 	  if (mode != ptr_mode)
 	    mem = gen_rtx_ZERO_EXTEND (mode, mem);
+
 	  emit_insn (gen_rtx_SET (dest, mem));
+
 	  return;
 
         case SYMBOL_SMALL_TLSGD:
@@ -3728,9 +3744,10 @@  aarch64_classify_address (struct aarch64_address_info *info,
 	  rtx sym, addend;
 
 	  split_const (x, &sym, &addend);
-	  return (GET_CODE (sym) == LABEL_REF
-		  || (GET_CODE (sym) == SYMBOL_REF
-		      && CONSTANT_POOL_ADDRESS_P (sym)));
+	  return ((GET_CODE (sym) == LABEL_REF
+		   || (GET_CODE (sym) == SYMBOL_REF
+		       && CONSTANT_POOL_ADDRESS_P (sym)
+		       && !nopcrelative_literal_loads)));
 	}
       return false;
 
@@ -4918,12 +4935,69 @@  aarch64_legitimize_reload_address (rtx *x_p,
 }
 
 
+/* Return the reload icode required for a constant pool in mode.  */
+static enum insn_code
+aarch64_constant_pool_reload_icode (machine_mode mode)
+{
+  switch (mode)
+    {
+    case SFmode:
+      return CODE_FOR_aarch64_reload_movcpsfdi;
+
+    case DFmode:
+      return CODE_FOR_aarch64_reload_movcpdfdi;
+
+    case TFmode:
+      return CODE_FOR_aarch64_reload_movcptfdi;
+
+    case V8QImode:
+      return CODE_FOR_aarch64_reload_movcpv8qidi;
+
+    case V16QImode:
+      return CODE_FOR_aarch64_reload_movcpv16qidi;
+
+    case V4HImode:
+      return CODE_FOR_aarch64_reload_movcpv4hidi;
+
+    case V8HImode:
+      return CODE_FOR_aarch64_reload_movcpv8hidi;
+
+    case V2SImode:
+      return CODE_FOR_aarch64_reload_movcpv2sidi;
+
+    case V4SImode:
+      return CODE_FOR_aarch64_reload_movcpv4sidi;
+
+    case V2DImode:
+      return CODE_FOR_aarch64_reload_movcpv2didi;
+
+    case V2DFmode:
+      return CODE_FOR_aarch64_reload_movcpv2dfdi;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  gcc_unreachable ();
+}
 static reg_class_t
 aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
 			  reg_class_t rclass,
 			  machine_mode mode,
 			  secondary_reload_info *sri)
 {
+
+  /* If we have to disable direct literal pool loads and stores because the
+     function is too big, then we need a scratch register.  */
+  if (MEM_P (x) && GET_CODE (x) == SYMBOL_REF && CONSTANT_POOL_ADDRESS_P (x)
+      && (SCALAR_FLOAT_MODE_P (GET_MODE (x))
+	  || targetm.vector_mode_supported_p (GET_MODE (x)))
+      && nopcrelative_literal_loads)
+    {
+      sri->icode = aarch64_constant_pool_reload_icode (mode);
+      return NO_REGS;
+    }
+
   /* Without the TARGET_SIMD instructions we cannot move a Q register
      to a Q register directly.  We need a scratch.  */
   if (REG_P (x) && (mode == TFmode || mode == TImode) && mode == GET_MODE (x)
@@ -7543,6 +7617,17 @@  aarch64_override_options (void)
 #endif
     }
 
+  if (nopcrelative_literal_loads == 2)
+    nopcrelative_literal_loads = 0;
+
+  /* In the tiny memory model it makes no sense
+     to disallow non PC relative literal pool loads
+     as many other things will break anyway.  */
+  if (nopcrelative_literal_loads
+      && (aarch64_cmodel == AARCH64_CMODEL_TINY
+	  || aarch64_cmodel == AARCH64_CMODEL_TINY_PIC))
+    nopcrelative_literal_loads = 0;
+
   aarch64_register_fma_steering ();
 
   aarch64_override_options_after_change ();
@@ -7697,7 +7782,16 @@  aarch64_classify_symbol (rtx x, rtx offset,
   if (GET_CODE (x) == SYMBOL_REF)
     {
       if (aarch64_cmodel == AARCH64_CMODEL_LARGE)
-	  return SYMBOL_FORCE_TO_MEM;
+	{
+	  /* This is alright even in PIC code as the constant
+	     pool reference is always PC relative and within
+	     the same translation unit.  */
+	  if (nopcrelative_literal_loads
+	      && CONSTANT_POOL_ADDRESS_P (x))
+	    return SYMBOL_SMALL_ABSOLUTE;
+	  else
+	    return SYMBOL_FORCE_TO_MEM;
+	}
 
       if (aarch64_tls_symbol_p (x))
 	return aarch64_classify_tls_symbol (x);
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 66fd9ca..a8a274b 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4231,6 +4231,32 @@ 
 ;; -------------------------------------------------------------------
 ;; Reload support
 ;; -------------------------------------------------------------------
+;; Reload Scalar Floating point modes from constant pool.
+;; The AArch64 port doesn't have __int128 constant move support.
+(define_expand "aarch64_reload_movcp<ALLTF:mode><P:mode>"
+ [(set (match_operand:ALLTF 0 "register_operand" "=w")
+       (mem:ALLTF (match_operand 1 "aarch64_constant_pool_symref" "S")))
+  (clobber (match_operand:P 2 "register_operand" "=&r"))]
+ "TARGET_FLOAT && nopcrelative_literal_loads"
+ {
+   aarch64_expand_mov_immediate (operands[2], XEXP (operands[1], 0));
+   emit_move_insn (operands[0], gen_rtx_MEM (<ALLTF:MODE>mode, operands[2]));
+   DONE;
+ }
+)
+
+;; Reload Vector modes from constant pool.
+(define_expand "aarch64_reload_movcp<VALL:mode><P:mode>"
+ [(set (match_operand:VALL 0 "register_operand" "=w")
+       (mem:VALL (match_operand 1 "aarch64_constant_pool_symref" "S")))
+  (clobber (match_operand:P 2 "register_operand" "=&r"))]
+ "TARGET_FLOAT && nopcrelative_literal_loads"
+ {
+   aarch64_expand_mov_immediate (operands[2], XEXP (operands[1], 0));
+   emit_move_insn (operands[0], gen_rtx_MEM (<VALL:MODE>mode, operands[2]));
+   DONE;
+ }
+)
 
 (define_expand "aarch64_reload_mov<mode>"
   [(set (match_operand:TX 0 "register_operand" "=w")
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 98ef9f6..70ade04 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -124,3 +124,7 @@  Enum(aarch64_abi) String(ilp32) Value(AARCH64_ABI_ILP32)
 
 EnumValue
 Enum(aarch64_abi) String(lp64) Value(AARCH64_ABI_LP64)
+
+mnopc-relative-literal-loads
+Target Report Save Var(nopcrelative_literal_loads) Init(1)
+No PC relative literal loads.
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 3979209..7b852a4 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -362,3 +362,7 @@ 
 (define_predicate "aarch64_simd_shift_imm_bitsize_di"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 64)")))
+
+(define_predicate "aarch64_constant_pool_symref"
+   (and (match_code "symbol_ref")
+	(match_test "CONSTANT_POOL_ADDRESS_P (op)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 55c2659..1cc0caf 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12496,6 +12496,14 @@  for @var{string} in this option are not guaranteed to be consistent
 across releases.
 
 This option is only intended to be useful when developing GCC.
+
+@item -mnopc-relative-literal-loads
+@opindex mnopcrelativeliteralloads
+Disable PC relative literal loads. If this option is used, literal
+pools are assumed to have a range of up to 4GiB and an appropriate
+instruction sequence is used. This option has no impact when used
+with @option{-mcmodel=tiny}.
+
 @end table
 
 @subsubsection @option{-march} and @option{-mcpu} Feature Modifiers