
[2/3,aarch64] Implement support for __builtin_load_no_speculate.

Message ID 1197003135ea0b0aeecd038a07db6b787a7db6f6.1515072356.git.Richard.Earnshaw@arm.com
State New
Series Add __builtin_load_no_speculate

Commit Message

Richard Earnshaw (lists) Jan. 4, 2018, 1:58 p.m. UTC
This patch implements support for __builtin_load_no_speculate on
AArch64.  On this architecture we inhibit speculation by emitting a
combination of a CSEL and a hint instruction (CSDB) that ensures the
CSEL is fully resolved when the operands to the CSEL may involve a
speculative load.

	* config/aarch64/aarch64.c (aarch64_print_operand): Handle zero passed
	to 'H' operand qualifier.
	(aarch64_inhibit_load_speculation): New function.
	(TARGET_INHIBIT_LOAD_SPECULATION): Redefine.
	* config/aarch64/aarch64.md (UNSPECV_NOSPECULATE): New unspec_volatile
	code.
	(nospeculate<ALLI:mode>, nospeculateti): New patterns.
---
 gcc/config/aarch64/aarch64.c  | 92 +++++++++++++++++++++++++++++++++++++++++++
 gcc/config/aarch64/aarch64.md | 28 +++++++++++++
 2 files changed, 120 insertions(+)

Comments

Richard Biener Jan. 5, 2018, 9:51 a.m. UTC | #1
On Thu, Jan 4, 2018 at 2:58 PM, Richard Earnshaw
<Richard.Earnshaw@arm.com> wrote:
>
> This patch implements support for __builtin_load_no_speculate on
> AArch64.  On this architecture we inhibit speculation by emitting a
> combination of a CSEL and a hint instruction (CSDB) that ensures the
> CSEL is fully resolved when the operands to the CSEL may involve a
> speculative load.

Missing a high-level explanation here, but it looks like you do something like

  if (access is not in bounds)
    no-speculate-access;
  else
     regular access;

with the upper/lower bounds.  Is this because 'no-speculate-access' is
prohibitively expensive even when factoring in the condition computation
and the condjump?  (Trying to reverse-engineer what the actual assembly
will look like from the patch; there isn't even a testcase :/)

>         * config/aarch64/aarch64.c (aarch64_print_operand): Handle zero passed
>         to 'H' operand qualifier.
>         (aarch64_inhibit_load_speculation): New function.
>         (TARGET_INHIBIT_LOAD_SPECULATION): Redefine.
>         * config/aarch64/aarch64.md (UNSPECV_NOSPECULATE): New unspec_volatile
>         code.
>         (nospeculate<ALLI:mode>, nospeculateti): New patterns.
> ---
>  gcc/config/aarch64/aarch64.c  | 92 +++++++++++++++++++++++++++++++++++++++++++
>  gcc/config/aarch64/aarch64.md | 28 +++++++++++++
>  2 files changed, 120 insertions(+)
>
Richard Earnshaw (lists) Jan. 5, 2018, 10:48 a.m. UTC | #2
On 05/01/18 09:51, Richard Biener wrote:
> On Thu, Jan 4, 2018 at 2:58 PM, Richard Earnshaw
> <Richard.Earnshaw@arm.com> wrote:
>>
>> This patch implements support for __builtin_load_no_speculate on
>> AArch64.  On this architecture we inhibit speculation by emitting a
>> combination of a CSEL and a hint instruction (CSDB) that ensures the
>> CSEL is fully resolved when the operands to the CSEL may involve a
>> speculative load.
> 
> Missing a high-level explanation here, but it looks like you do something like
> 
>   if (access is not in bounds)
>     no-speculate-access;
>   else
>      regular access;

For aarch64 we end up with a sequence like

	CMP	'bounds'
	B<cond> out-of-range
	LDR	x1, [ptr]
out-of-range:
	CSEL	x1, FAILVAL, x1, <cond>
	CSDB

The CSEL+CSDB combination ensures that even if we do fetch x1 from an
out-of-range location, the value won't persist beyond the end of the
code sequence and thus can't be used for further speculation.
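
For the single-bound sequence above, a source-level equivalent would be
something like the following (illustrative only; it assumes, as the
hook's handling of a null bound suggests, that the unused lower bound
may be omitted by passing NULL):

  #include <stddef.h>

  /* Only an upper bound is checked here, so the hook emits a single
     CMP before the conditional branch; with both bounds present it
     emits a CMP/CCMP pair instead.  Names are invented.  */
  unsigned char
  lookup (unsigned char *table, unsigned char *limit, size_t idx)
  {
    return __builtin_load_no_speculate (&table[idx], NULL, limit,
                                        0, &table[idx]);
  }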

> 
> with the upper/lower bounds.  Is this because 'no-speculate-access' is
> prohibitively expensive even when factoring in the condition computation
> and the condjump?  (Trying to reverse-engineer what the actual assembly
> will look like from the patch;

It's all in the white paper.

> there isn't even a testcase :/)
> 
>>         * config/aarch64/aarch64.c (aarch64_print_operand): Handle zero passed
>>         to 'H' operand qualifier.
>>         (aarch64_inhibit_load_speculation): New function.
>>         (TARGET_INHIBIT_LOAD_SPECULATION): Redefine.
>>         * config/aarch64/aarch64.md (UNSPECV_NOSPECULATE): New unspec_volatile
>>         code.
>>         (nospeculate<ALLI:mode>, nospeculateti): New patterns.
>> ---
>>  gcc/config/aarch64/aarch64.c  | 92 +++++++++++++++++++++++++++++++++++++++++++
>>  gcc/config/aarch64/aarch64.md | 28 +++++++++++++
>>  2 files changed, 120 insertions(+)
>>
Jeff Law Jan. 5, 2018, 4:31 p.m. UTC | #3
On 01/05/2018 03:48 AM, Richard Earnshaw (lists) wrote:
> On 05/01/18 09:51, Richard Biener wrote:
>> On Thu, Jan 4, 2018 at 2:58 PM, Richard Earnshaw
>> <Richard.Earnshaw@arm.com> wrote:
>>>
>>> This patch implements support for __builtin_load_no_speculate on
>>> AArch64.  On this architecture we inhibit speculation by emitting a
>>> combination of a CSEL and a hint instruction (CSDB) that ensures the
>>> CSEL is fully resolved when the operands to the CSEL may involve a
>>> speculative load.
>>
>> Missing a high-level explanation here, but it looks like you do something like
>>
>>   if (access is not in bounds)
>>     no-speculate-access;
>>   else
>>      regular access;
> 
> For aarch64 we end up with a sequence like
> 
> 	CMP	'bounds'
> 	B<cond> out-of-range
> 	LDR	x1, [ptr]
> out-of-range:
> 	CSEL	x1, FAILVAL, x1, <cond>
> 	CSDB
> 
> The CSEL+CSDB combination ensures that even if we do fetch x1 from an
> out-of-range location, the value won't persist beyond the end of the
> code sequence and thus can't be used for further speculation.
If I try to look at this architecture-independently, the code starting at
out-of-range is just a conditional move and a fence.  Given those
primitives, any target ought to be able to define
__builtin_load_no_speculate -- which is obviously important :-)
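
A portable sketch of that fallback shape (illustrative only, not GCC
code; the asm statement below is just a compiler barrier standing in
for the target's real speculation barrier, e.g. CSDB on AArch64):

  /* Bounds check, conditional select, then a barrier so the selected
     value -- not a speculatively loaded one -- is what later code
     sees.  */
  static inline int
  load_no_speculate_fallback (const int *ptr, const int *lower,
                              const int *upper, int failval)
  {
    int inbounds = ptr >= lower && ptr < upper;
    int value = inbounds ? *ptr : failval;  /* conditional move */
    __asm__ volatile ("" ::: "memory");     /* stand-in for the fence */
    return value;
  }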

Jeff

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 93e9d9f9..7410921 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5315,6 +5315,14 @@  aarch64_print_operand (FILE *f, rtx x, int code)
       break;
 
     case 'H':
+       /* Print the higher numbered register of a pair (TImode) of regs.  */
+      if (x == const0_rtx
+	  || (CONST_DOUBLE_P (x) && aarch64_float_const_zero_rtx_p (x)))
+	{
+	  asm_fprintf (f, "xzr");
+	  break;
+	}
+
       if (!REG_P (x) || !GP_REGNUM_P (REGNO (x) + 1))
 	{
 	  output_operand_lossage ("invalid operand for '%%%c'", code);
@@ -15115,6 +15123,87 @@  aarch64_sched_can_speculate_insn (rtx_insn *insn)
     }
 }
 
+static rtx
+aarch64_inhibit_load_speculation (machine_mode mode, rtx result, rtx mem,
+				  rtx lower_bound, rtx upper_bound,
+				  rtx fail_result, rtx cmpptr)
+{
+  rtx cond, comparison;
+  rtx target = gen_reg_rtx (mode);
+  rtx tgt2 = result;
+
+  if (!register_operand (cmpptr, ptr_mode))
+    cmpptr = force_reg (ptr_mode, cmpptr);
+
+  if (!register_operand (tgt2, mode))
+    tgt2 = gen_reg_rtx (mode);
+
+  if (upper_bound == NULL)
+    {
+      if (!register_operand (lower_bound, ptr_mode))
+	lower_bound = force_reg (ptr_mode, lower_bound);
+
+      cond = aarch64_gen_compare_reg (LTU, cmpptr, lower_bound);
+      comparison = gen_rtx_LTU (VOIDmode, cond, const0_rtx);
+    }
+  else if (lower_bound == NULL)
+    {
+      if (!register_operand (upper_bound, ptr_mode))
+	upper_bound = force_reg (ptr_mode, upper_bound);
+
+      cond = aarch64_gen_compare_reg (GEU, cmpptr, upper_bound);
+      comparison = gen_rtx_GEU (VOIDmode, cond, const0_rtx);
+    }
+  else
+    {
+      if (!register_operand (lower_bound, ptr_mode))
+	lower_bound = force_reg (ptr_mode, lower_bound);
+
+      if (!register_operand (upper_bound, ptr_mode))
+	upper_bound = force_reg (ptr_mode, upper_bound);
+
+      rtx cond1 = aarch64_gen_compare_reg (GEU, cmpptr, lower_bound);
+      rtx comparison1 = gen_rtx_GEU (ptr_mode, cond1, const0_rtx);
+      rtx failcond = GEN_INT (aarch64_get_condition_code (comparison1)^1);
+      cond = gen_rtx_REG (CCmode, CC_REGNUM);
+      if (ptr_mode == SImode)
+	emit_insn (gen_ccmpsi (cond1, cond, cmpptr, upper_bound, comparison1,
+			       failcond));
+      else
+	emit_insn (gen_ccmpdi (cond1, cond, cmpptr, upper_bound, comparison1,
+			       failcond));
+      comparison = gen_rtx_GEU (VOIDmode, cond, const0_rtx);
+    }
+
+  rtx_code_label *label = gen_label_rtx ();
+  emit_jump_insn (gen_condjump (comparison, cond, label));
+  emit_move_insn (target, mem);
+  emit_label (label);
+
+  insn_code icode;
+
+  switch (mode)
+    {
+    case E_QImode: icode = CODE_FOR_nospeculateqi; break;
+    case E_HImode: icode = CODE_FOR_nospeculatehi; break;
+    case E_SImode: icode = CODE_FOR_nospeculatesi; break;
+    case E_DImode: icode = CODE_FOR_nospeculatedi; break;
+    case E_TImode: icode = CODE_FOR_nospeculateti; break;
+    default:
+      gcc_unreachable ();
+    }
+
+  if (! insn_operand_matches (icode, 4, fail_result))
+    fail_result = force_reg (mode, fail_result);
+
+  emit_insn (GEN_FCN (icode) (tgt2, comparison, cond, target, fail_result));
+
+  if (tgt2 != result)
+    emit_move_insn (result, tgt2);
+
+  return result;
+}
+
 /* Target-specific selftests.  */
 
 #if CHECKING_P
@@ -15554,6 +15643,9 @@  aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_CONSTANT_ALIGNMENT
 #define TARGET_CONSTANT_ALIGNMENT aarch64_constant_alignment
 
+#undef TARGET_INHIBIT_LOAD_SPECULATION
+#define TARGET_INHIBIT_LOAD_SPECULATION aarch64_inhibit_load_speculation
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f1e2a07..1a1f398 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -153,6 +153,7 @@ 
     UNSPECV_SET_FPSR		; Represent assign of FPSR content.
     UNSPECV_BLOCKAGE		; Represent a blockage
     UNSPECV_PROBE_STACK_RANGE	; Represent stack range probing.
+    UNSPECV_NOSPECULATE		; Inhibit speculation
   ]
 )
 
@@ -5797,6 +5798,33 @@ 
   DONE;
 })
 
+(define_insn "nospeculate<ALLI:mode>"
+  [(set (match_operand:ALLI 0 "register_operand" "=r")
+        (unspec_volatile:ALLI
+         [(match_operator 1 "aarch64_comparison_operator"
+	   [(match_operand 2 "cc_register" "") (const_int 0)])
+	  (match_operand:ALLI 3 "register_operand" "r")
+	  (match_operand:ALLI 4 "aarch64_reg_or_zero" "rZ")]
+	 UNSPECV_NOSPECULATE))]
+  ""
+  "csel\\t%<w>0, %<w>3, %<w>4, %M1\;hint\t#0x14\t// CSDB"
+  [(set_attr "type" "csel")
+   (set_attr "length" "8")]
+)
+
+(define_insn "nospeculateti"
+  [(set (match_operand:TI 0 "register_operand" "=r")
+        (unspec_volatile:TI
+         [(match_operator 1 "aarch64_comparison_operator"
+	   [(match_operand 2 "cc_register" "") (const_int 0)])
+	  (match_operand:TI 3 "register_operand" "r")
+	  (match_operand:TI 4 "aarch64_reg_or_zero" "rZ")]
+	 UNSPECV_NOSPECULATE))]
+  ""
+  "csel\\t%x0, %x3, %x4, %M1\;csel\\t%H0, %H3, %H4, %M1\;hint\t#0x14\t// CSDB"
+  [(set_attr "type" "csel")
+   (set_attr "length" "12")]
+)
 ;; AdvSIMD Stuff
 (include "aarch64-simd.md")