[mid-end] Add a hook to support telling the mid-end when to probe the stack [patch (2/6)]

Message ID 20180711112058.GA15773@arm.com
State New
Headers show
Series
  • [mid-end] Add a hook to support telling the mid-end when to probe the stack [patch (2/6)]
Related show

Commit Message

Tamar Christina July 11, 2018, 11:21 a.m.
Hi All,

This patch adds a hook to tell the mid-end about the probing requirements of the
target.  On AArch64 we allow a specific range for which no probing needs to
be done.  This same range is also the amount that will have to be probed up when
a probe is needed after dropping the stack.

Defining this probe comes with the extra requirement that the outgoing arguments
size of any function that uses alloca and stack clash be at the very least 8
bytes.  With this invariant we can skip doing the zero checks for alloca and
save some code.

A simplified version of the AArch64 stack frame is:

   +-----------------------+                                              
   |                       |                                                 
   |                       |                                              
   |                       |                                              
   +-----------------------+                                              
   |LR                     |                                              
   +-----------------------+                                              
   |FP                     |                                              
   +-----------------------+                                              
   |dynamic allocations    | -\      probe range hook effects these       
   +-----------------------+   --\   and ensures that outgoing stack      
   |padding                |      -- args is always > 8 when alloca.      
   +-----------------------+  ---/   Which means it's always safe to probe
   |outgoing stack args    |-/       at SP                                
   +-----------------------+                                              
                                                                                                           

This allows us to generate better code than without the hook without affecting
other targets.

With this patch I am also removing the stack_clash_protection_final_dynamic_probe
hook which was added specifically for AArch64 but that is no longer needed.

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues.
Both targets were tested with stack clash on and off by default.

Ok for trunk?

Thanks,
Tamar

gcc/
2018-07-11  Tamar Christina  <tamar.christina@arm.com>

	PR target/86486
	* explow.c (anti_adjust_stack_and_probe_stack_clash): Support custom
	probe ranges.
	* target.def (stack_clash_protection_alloca_probe_range): New.
	(stack_clash_protection_final_dynamic_probe): Remove.
	* targhooks.h (default_stack_clash_protection_alloca_probe_range) New.
	(default_stack_clash_protection_final_dynamic_probe): Remove.
	* targhooks.c: Likewise.
	* doc/tm.texi.in (TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE): New.
	(TARGET_STACK_CLASH_PROTECTION_FINAL_DYNAMIC_PROBE): Remove.
	* doc/tm.texi: Regenerate.

--

Comments

Jeff Law July 11, 2018, 6:52 p.m. | #1
On 07/11/2018 05:21 AM, Tamar Christina wrote:
> Hi All,
> 
> This patch adds a hook to tell the mid-end about the probing requirements of the
> target.  On AArch64 we allow a specific range for which no probing needs to
> be done.  This same range is also the amount that will have to be probed up when
> a probe is needed after dropping the stack.
> 
> Defining this probe comes with the extra requirement that the outgoing arguments
> size of any function that uses alloca and stack clash be at the very least 8
> bytes.  With this invariant we can skip doing the zero checks for alloca and
> save some code.
> 
> A simplified version of the AArch64 stack frame is:
> 
>    +-----------------------+                                              
>    |                       |                                                 
>    |                       |                                              
>    |                       |                                              
>    +-----------------------+                                              
>    |LR                     |                                              
>    +-----------------------+                                              
>    |FP                     |                                              
>    +-----------------------+                                              
>    |dynamic allocations    | -\      probe range hook effects these       
>    +-----------------------+   --\   and ensures that outgoing stack      
>    |padding                |      -- args is always > 8 when alloca.      
>    +-----------------------+  ---/   Which means it's always safe to probe
>    |outgoing stack args    |-/       at SP                                
>    +-----------------------+                                              
>                                                                                                            
> 
> This allows us to generate better code than without the hook without affecting
> other targets.
> 
> With this patch I am also removing the stack_clash_protection_final_dynamic_probe
> hook which was added specifically for AArch64 but that is no longer needed.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues.
> Both targets were tested with stack clash on and off by default.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/
> 2018-07-11  Tamar Christina  <tamar.christina@arm.com>
> 
> 	PR target/86486
> 	* explow.c (anti_adjust_stack_and_probe_stack_clash): Support custom
> 	probe ranges.
> 	* target.def (stack_clash_protection_alloca_probe_range): New.
> 	(stack_clash_protection_final_dynamic_probe): Remove.
> 	* targhooks.h (default_stack_clash_protection_alloca_probe_range) New.
> 	(default_stack_clash_protection_final_dynamic_probe): Remove.
> 	* targhooks.c: Likewise.
> 	* doc/tm.texi.in (TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE): New.
> 	(TARGET_STACK_CLASH_PROTECTION_FINAL_DYNAMIC_PROBE): Remove.
> 	* doc/tm.texi: Regenerate.
>
The control flow is a bit convoluted here, but after a few false starts
where I thought this was wrong, I think it's OK.

Jeff

Patch

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 1c5a80920f17694f1119696ec40faef1452fe1c1..138d905023b449a7a239fb32c07f5c8551d5380f 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -3450,8 +3450,12 @@  GCC computed the default from the values of the above macros and you will
 normally not need to override that default.
 @end defmac
 
-@deftypefn {Target Hook} bool TARGET_STACK_CLASH_PROTECTION_FINAL_DYNAMIC_PROBE (rtx @var{residual})
-Some targets make optimistic assumptions about the state of stack probing when they emit their prologues.  On such targets a probe into the end of any dynamically allocated space is likely required for safety against stack clash style attacks.  Define this variable to return nonzero if such a probe is required or zero otherwise.  You need not define this macro if it would always have the value zero.
+@deftypefn {Target Hook} HOST_WIDE_INT TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE (void)
+Some targets have an ABI defined interval for which no probing needs to be done.
+When a probe does need to be done this same interval is used as the probe distance up when doing stack clash protection for alloca.
+On such targets this value can be set to override the default probing up interval.
+Define this variable to return nonzero if such a probe range is required or zero otherwise.  Defining this hook also requires your functions which make use of alloca to have at least 8 byesof outgoing arguments.  If this is not the case the stack will be corrupted.
+You need not define this macro if it would always have the value zero.
 @end deftypefn
 
 @need 2000
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index bf2c64e15dba1b95179cfe682523f29bb8fa1151..8c62ae06b6be2fe338e3614917dc5f5b1f4fd4aa 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2841,7 +2841,7 @@  GCC computed the default from the values of the above macros and you will
 normally not need to override that default.
 @end defmac
 
-@hook TARGET_STACK_CLASH_PROTECTION_FINAL_DYNAMIC_PROBE
+@hook TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE
 
 @need 2000
 @node Frame Registers
diff --git a/gcc/explow.c b/gcc/explow.c
index 9a6182ac5c5f6b7a0d001ad83ab822aa58036f4a..be19945dc5721f3e72bb66aeeb855b3b0a551287 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1959,10 +1959,21 @@  anti_adjust_stack_and_probe_stack_clash (rtx size)
 
   /* We can get here with a constant size on some targets.  */
   rtx rounded_size, last_addr, residual;
-  HOST_WIDE_INT probe_interval;
+  HOST_WIDE_INT probe_interval, probe_range;
+  bool target_probe_range_p = false;
   compute_stack_clash_protection_loop_data (&rounded_size, &last_addr,
 					    &residual, &probe_interval, size);
 
+  /* Get the back-end specific probe ranges.  */
+  probe_range = targetm.stack_clash_protection_alloca_probe_range ();
+  target_probe_range_p = probe_range != 0;
+  gcc_assert (probe_range >= 0);
+
+  /* If no back-end specific range defined, default to the top of the newly
+     allocated range.  */
+  if (probe_range == 0)
+    probe_range = probe_interval - GET_MODE_SIZE (word_mode);
+
   if (rounded_size != CONST0_RTX (Pmode))
     {
       if (CONST_INT_P (rounded_size)
@@ -1973,13 +1984,12 @@  anti_adjust_stack_and_probe_stack_clash (rtx size)
 	       i += probe_interval)
 	    {
 	      anti_adjust_stack (GEN_INT (probe_interval));
-
 	      /* The prologue does not probe residuals.  Thus the offset
 		 here to probe just beyond what the prologue had already
 		 allocated.  */
 	      emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
-					       (probe_interval
-						- GET_MODE_SIZE (word_mode))));
+					       probe_range));
+
 	      emit_insn (gen_blockage ());
 	    }
 	}
@@ -1993,10 +2003,10 @@  anti_adjust_stack_and_probe_stack_clash (rtx size)
 	  anti_adjust_stack (GEN_INT (probe_interval));
 
 	  /* The prologue does not probe residuals.  Thus the offset here
-	     to probe just beyond what the prologue had already allocated.  */
+	     to probe just beyond what the prologue had already
+	     allocated.  */
 	  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
-					   (probe_interval
-					    - GET_MODE_SIZE (word_mode))));
+					   probe_range));
 
 	  emit_stack_clash_protection_probe_loop_end (loop_lab, end_loop,
 						      last_addr, rotate_loop);
@@ -2011,48 +2021,55 @@  anti_adjust_stack_and_probe_stack_clash (rtx size)
 	 hold live data.  Furthermore, we do not want to probe into the
 	 red zone.
 
-	 Go ahead and just guard the probe at *sp on RESIDUAL != 0 at
-	 runtime if RESIDUAL is not a compile time constant.  */
+	 If TARGET_PROBE_RANGE_P then the target has promised it's safe to
+	 probe at offset 0.  In which case we no longer have to check for
+	 RESIDUAL == 0.  However we still need to probe at the right offset
+	 when RESIDUAL > PROBE_RANGE, in which case we probe at PROBE_RANGE.
+
+	 If !TARGET_PROBE_RANGE_P then go ahead and just guard the probe at *sp
+	 on RESIDUAL != 0 at runtime if RESIDUAL is not a compile time constant.
+	 */
+      anti_adjust_stack (residual);
+
       if (!CONST_INT_P (residual))
 	{
 	  label = gen_label_rtx ();
-	  emit_cmp_and_jump_insns (residual, CONST0_RTX (GET_MODE (residual)),
-				   EQ, NULL_RTX, Pmode, 1, label);
-	}
+	  rtx_code op = target_probe_range_p ? LT : EQ;
+	  rtx probe_cmp_value = target_probe_range_p
+	    ? gen_rtx_CONST_INT (GET_MODE (residual), probe_range)
+	    : CONST0_RTX (GET_MODE (residual));
 
-      rtx x = force_reg (Pmode, plus_constant (Pmode, residual,
-					       -GET_MODE_SIZE (word_mode)));
-      anti_adjust_stack (residual);
-      emit_stack_probe (gen_rtx_PLUS (Pmode, stack_pointer_rtx, x));
-      emit_insn (gen_blockage ());
-      if (!CONST_INT_P (residual))
-	emit_label (label);
-    }
+	  if (target_probe_range_p)
+	    emit_stack_probe (stack_pointer_rtx);
 
-  /* Some targets make optimistic assumptions in their prologues about
-     how the caller may have probed the stack.  Make sure we honor
-     those assumptions when needed.  */
-  if (size != CONST0_RTX (Pmode)
-      && targetm.stack_clash_protection_final_dynamic_probe (residual))
-    {
-      /* SIZE could be zero at runtime and in that case *sp could hold
-	 live data.  Furthermore, we don't want to probe into the red
-	 zone.
+	  emit_cmp_and_jump_insns (residual, probe_cmp_value,
+				   op, NULL_RTX, Pmode, 1, label);
+	}
 
-	 Go ahead and just guard the probe at *sp on SIZE != 0 at runtime
-	 if SIZE is not a compile time constant.  */
-      rtx label = NULL_RTX;
-      if (!CONST_INT_P (size))
+      rtx x = NULL_RTX;
+
+      /* If RESIDUAL isn't a constant and TARGET_PROBE_RANGE_P then we probe up
+	 by the ABI defined safe value.  */
+      if (!CONST_INT_P (residual) && target_probe_range_p)
+	x = GEN_INT (probe_range);
+      /* If RESIDUAL is a constant but smaller than the ABI defined safe value,
+	 we still want to probe up, but the safest amount if a word.  */
+      else if (target_probe_range_p)
 	{
-	  label = gen_label_rtx ();
-	  emit_cmp_and_jump_insns (size, CONST0_RTX (GET_MODE (size)),
-				   EQ, NULL_RTX, Pmode, 1, label);
+	  if (INTVAL (residual) <= probe_range)
+	    x = GEN_INT (GET_MODE_SIZE (word_mode));
+	  else
+	    x = GEN_INT (probe_range);
 	}
+      else
+      /* If nothing else, probe at the top of the new allocation.  */
+	x = plus_constant (Pmode, residual, -GET_MODE_SIZE (word_mode));
+
+      emit_stack_probe (gen_rtx_PLUS (Pmode, stack_pointer_rtx, x));
 
-      emit_stack_probe (stack_pointer_rtx);
       emit_insn (gen_blockage ());
-      if (!CONST_INT_P (size))
-	emit_label (label);
+      if (!CONST_INT_P (residual))
+	  emit_label (label);
     }
 }
 
diff --git a/gcc/target.def b/gcc/target.def
index 472a593c3466d5cd48a9c8c1d9c875cb22848f60..1ce86ce57cf3774537a309ac9ea17a6c215153c8 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5824,10 +5824,17 @@  these registers when the target switches are opposed to them.)",
  hook_void_void)
 
 DEFHOOK
-(stack_clash_protection_final_dynamic_probe,
- "Some targets make optimistic assumptions about the state of stack probing when they emit their prologues.  On such targets a probe into the end of any dynamically allocated space is likely required for safety against stack clash style attacks.  Define this variable to return nonzero if such a probe is required or zero otherwise.  You need not define this macro if it would always have the value zero.",
- bool, (rtx residual),
- default_stack_clash_protection_final_dynamic_probe)
+(stack_clash_protection_alloca_probe_range,
+ "Some targets have an ABI defined interval for which no probing needs to be done.\n\
+When a probe does need to be done this same interval is used as the probe distance \
+up when doing stack clash protection for alloca.\n\
+On such targets this value can be set to override the default probing up interval.\n\
+Define this variable to return nonzero if such a probe range is required or zero otherwise.  \
+Defining this hook also requires your functions which make use of alloca to have at least 8 byes\
+of outgoing arguments.  If this is not the case the stack will be corrupted.\n\
+You need not define this macro if it would always have the value zero.",
+ HOST_WIDE_INT, (void),
+ default_stack_clash_protection_alloca_probe_range)
 
 
 /* Functions specific to the C family of frontends.  */
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index b6a8fa544f709ad1e5a63495fb97f321041ab727..49d89904145e88f3db6021ee508458564e955758 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -280,7 +280,7 @@  extern unsigned int default_min_arithmetic_precision (void);
 
 extern enum flt_eval_method
 default_excess_precision (enum excess_precision_type ATTRIBUTE_UNUSED);
-extern bool default_stack_clash_protection_final_dynamic_probe (rtx);
+extern HOST_WIDE_INT default_stack_clash_protection_alloca_probe_range (void);
 extern void default_select_early_remat_modes (sbitmap);
 
 #endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 51b0dcac111cba45413223e3d44bf565956d23e4..ebd42a8580ba704715e550abf81f03f0fc325e24 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -2285,8 +2285,10 @@  default_excess_precision (enum excess_precision_type ATTRIBUTE_UNUSED)
   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
 }
 
-bool
-default_stack_clash_protection_final_dynamic_probe (rtx residual ATTRIBUTE_UNUSED)
+/* Default implementation for
+  TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE.  */
+HOST_WIDE_INT
+default_stack_clash_protection_alloca_probe_range (void)
 {
   return 0;
 }