
[18/X,libsanitizer] Add in MTE stubs

Message ID HE1PR0802MB22517284F34B82162856F2E8E07E0@HE1PR0802MB2251.eurprd08.prod.outlook.com
State New

Commit Message

Matthew Malcomson Nov. 5, 2019, 11:34 a.m. UTC
This patch in the series is for demonstration only: it adds stubs where
MTE support would be implemented.

We also add a new flag to request memory tagging as a sanitizer option.
The new flag is `-fsanitize=memtag`, which matches the flag Clang uses to
request memory tagging.

At the moment all implementations are dummies of some sort: the generated
assembly uses `mov` instead of `irg`, `add` instead of `addg`, and `sub`
instead of `subg`.  This should mean the binaries behave the same as MTE
binaries except that tags are ignored.

With a hardware implementation of memory tagging, checks are done
automatically, so adding HWASAN_CHECK is not needed.  This means that the
`hwasan` pass is not needed.
Similarly, much of the `sanopt` pass is not needed when compiling for
hardware memory tagging, though there is still a need to handle
HWASAN_MARK.

This patch gives backends extra control over how a tag is stored in a
pointer and how many real-memory bytes are represented by each byte in
the shadow space.

One final difference between memtag and hwasan is that memtag can't use
the ASAN_POISON optimisation.
That optimisation replaces accesses to a variable that has just been
poisoned with an internal function that reports an error without needing
to check the access.

It provides no benefit for memtag, since there is no instruction that can
report a memory fault other than by mis-tagging some memory and
attempting to access it.

The optimisation is hence disabled for memory tagging: it provides no
benefit, and keeping it would require every backend wanting this feature
to implement a similar dummy hook.

gcc/ChangeLog:

2019-11-05  Matthew Malcomson  <matthew.malcomson@arm.com>

	* asan.c (hwasan_tag_init): Choose initialisation value based on
	memtag vs hwasan.
	(memory_tagging_p): Check for either hwaddress or memtag.
	(hwasan_emit_prologue): Account for memtag.
	(hwasan_emit_uncolour_frame): Account for memtag.
	(hwasan_finish_file): Assert not called for memtag.
	(hwasan_expand_check_ifn): Assert not called for memtag.
	(gate_hwasan): Don't run when have memtag.
	* asan.h (HWASAN_TAG_SIZE): Use backend hook if memtag.
	(HWASAN_TAG_GRANULE_SIZE): Use backend hook if memtag.
	(HWASAN_SHIFT): New.
	(HWASAN_SHIFT_RTX): New.
	(HWASAN_TAG_SHIFT_SIZE): New.
	* builtins.c (expand_builtin_alloca): Extra TODO comment.
	(expand_stack_restore): Extra TODO comment.
	* cfgexpand.c (expand_stack_vars): Only bother untagging bases
	for hwasan.
	* config/aarch64/aarch64.c (aarch64_classify_address): Account
	for addtag unspec marker.
	(aarch64_has_memtag_isa): New hook.
	(aarch64_tag_memory): Add dummy hook.
	(aarch64_gentag): Add dummy hook.
	(aarch64_addtag): New hook.
	(aarch64_addtag_force_operand): New hook.
	(TARGET_MEMTAG_HAS_MEMORY_TAGGING): New.
	(TARGET_MEMTAG_TAG): New.
	(TARGET_MEMTAG_GENTAG): New.
	(TARGET_MEMTAG_ADDTAG): New.
	(TARGET_MEMTAG_ADDTAG_FORCE_OPERAND): New.
	* config/aarch64/aarch64.h (AARCH64_ISA_MEMTAG): New macro.
	* config/aarch64/aarch64.md (random_tag, plain_offset_tagdi):
	New.
	(unspec enum): Add GENTAG and ADDTAG markers.
	* config/aarch64/predicates.md (aarch64_MTE_add_temp,
	aarch64_MTE_tag_offset, aarch64_MTE_value_offset): New.
	* doc/tm.texi: Document new hooks.
	* doc/tm.texi.in: Document new hooks.
	* flag-types.h (enum sanitize_code): Add MEMTAG enum.
	* gcc.c (sanitize_spec_function): Account for MEMTAG option.
	* internal-fn.c (expand_HWASAN_MARK): Account for memtag.
	* opts.c (finish_options): Ensure MEMTAG conflicts with ASAN,
	HWASAN, and THREAD.
	(finish_options): Turn on stack tagging for memtag.
	(sanitizer_opts): Add MEMTAG option.
	* target.def (targetm.memtag.has_memory_tagging): New.
	(targetm.memtag.tag_size): New.
	(targetm.memtag.granule_size): New.
	(targetm.memtag.copy_tag): New.
	(targetm.memtag.tag): New.
	* targhooks.c (default_memtag_has_memory_tagging): New.
	(default_memtag_tag_size): New.
	(default_memtag_granule_size): New.
	(default_memtag_copy_tag): New.
	* targhooks.h (default_memtag_tag_size): New decl.
	(default_memtag_granule_size): New decl.
	(default_memtag_copy_tag): New decl.
	* tree-ssa.c (execute_update_addresses_taken): Avoid ASAN_POISON
	optimisation for memtag.

gcc/testsuite/ChangeLog:

2019-11-05  Matthew Malcomson  <matthew.malcomson@arm.com>

	* gcc.dg/hwasan/poly-int-stack-vars.c: New test.



###############     Attachment also inlined for ease of reply    ###############
diff --git a/gcc/asan.h b/gcc/asan.h
index ff6adf2391ee1602a3c15755312a04f82d6369ce..71dbaee708d0e64911f568503655478b8720f494 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -27,10 +27,10 @@ extern void hwasan_finish_file (void);
 extern void hwasan_record_base (rtx);
 extern uint8_t hwasan_current_tag ();
 extern void hwasan_increment_tag ();
+extern rtx hwasan_extract_tag (rtx);
 extern rtx hwasan_with_tag (rtx, poly_int64);
 extern void hwasan_tag_init ();
 extern rtx hwasan_create_untagged_base (rtx);
-extern rtx hwasan_extract_tag (rtx tagged_pointer);
 extern rtx hwasan_base ();
 extern void hwasan_emit_prologue (rtx *, rtx *, poly_int64 *, uint8_t *, size_t);
 extern rtx_insn *hwasan_emit_uncolour_frame (rtx, rtx, rtx_insn *);
@@ -95,8 +95,12 @@ extern hash_set <tree> *asan_used_labels;
 /* NOTE: The values below define an ABI and are hard-coded to these values in
    libhwasan, hence they can't be changed independently here.  */
 /* How many bits are used to store a tag in a pointer.
-   HWASAN uses the entire top byte of a pointer (i.e. 8 bits).  */
-#define HWASAN_TAG_SIZE 8
+   HWASAN uses the entire top byte of a pointer (i.e. 8 bits).
+   For aarch64 MTE we have 4 bits per colour and that is advertised by the
+   backend hook.  */
+#define HWASAN_TAG_SIZE (sanitize_flags_p (SANITIZE_MEMTAG) \
+			 ? targetm.memtag.tag_size () \
+			 : 8)
 /* Tag Granule of HWASAN shadow stack.
    This is the size in real memory that each byte in the shadow memory refers
    to.  I.e. if a variable is X bytes long in memory then it's colour in shadow
@@ -105,7 +109,12 @@ extern hash_set <tree> *asan_used_labels;
    that are neighbours in memory and share a tag granule would need to share
    the same colour (the shared tag granule can only store one colour).  */
 #define HWASAN_TAG_SHIFT_SIZE 4
-#define HWASAN_TAG_GRANULE_SIZE (1ULL << HWASAN_TAG_SHIFT_SIZE)
+#define HWASAN_TAG_GRANULE_SIZE (sanitize_flags_p (SANITIZE_MEMTAG) \
+				 ? targetm.memtag.granule_size () \
+				 : (1ULL << HWASAN_TAG_SHIFT_SIZE))
+
+/* The following HWASAN_* macros are only used for HWASAN (not MEMTAG), which
+   is why there is no predicate.  */
 /* Define the tag for the stack background.
    This defines what colour the stack pointer will be and hence what tag all
    variables that are not given special tags are (e.g. spilled registers,
diff --git a/gcc/asan.c b/gcc/asan.c
index ef7c90e3358c8fa880b8e4002996f27541c26953..5769d1236908e6d8c75018f04f855928665e4126 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1480,7 +1480,8 @@ asan_redzone_buffer::flush_if_full (void)
 bool
 memory_tagging_p ()
 {
-    return sanitize_flags_p (SANITIZE_HWADDRESS);
+    return sanitize_flags_p (SANITIZE_HWADDRESS)
+      || sanitize_flags_p (SANITIZE_MEMTAG);
 }
 
 /* Are we tagging the stack?  */
@@ -3952,7 +3953,9 @@ hwasan_tag_init ()
   asan_used_labels = NULL;
 
   hwasan_base_ptr = NULL_RTX;
-  tag_offset = HWASAN_STACK_BACKGROUND + 1;
+  tag_offset = sanitize_flags_p (SANITIZE_MEMTAG)
+    ? 0
+    : HWASAN_STACK_BACKGROUND + 1;
 }
 
 tree
@@ -4220,20 +4223,31 @@ hwasan_emit_prologue (rtx *bases,
       if (size.is_constant (&tmp))
 	gcc_assert (tmp % HWASAN_TAG_GRANULE_SIZE == 0);
 
-      rtx ret = init_one_libfunc ("__hwasan_tag_memory");
-      rtx base_tag = hwasan_extract_tag (bases[i]);
-      /* In the case of tag overflow we would want modulo wrapping -- which
-	 should be given from the `plus_constant` in QImode.  */
-      rtx tag_colour = plus_constant (QImode, base_tag, tags[i]);
-      emit_library_call (ret,
-			 LCT_NORMAL,
-			 VOIDmode,
-			 plus_constant (ptr_mode, untagged_bases[i], bot),
-			 ptr_mode,
-			 tag_colour,
-			 QImode,
-			 gen_int_mode (size, ptr_mode),
-			 ptr_mode);
+      if (sanitize_flags_p (SANITIZE_HWADDRESS))
+	{
+	  rtx ret = init_one_libfunc ("__hwasan_tag_memory");
+	  rtx base_tag = hwasan_extract_tag (bases[i]);
+	  /* In the case of tag overflow we would want modulo wrapping -- which
+	     should be given from the `plus_constant` in QImode.  */
+	  rtx tag_colour = plus_constant (QImode, base_tag, tags[i]);
+	  emit_library_call (ret,
+			     LCT_NORMAL,
+			     VOIDmode,
+			     plus_constant (ptr_mode, untagged_bases[i], bot),
+			     ptr_mode,
+			     tag_colour,
+			     QImode,
+			     gen_int_mode (size, ptr_mode),
+			     ptr_mode);
+	}
+      else
+	{
+	  gcc_assert (sanitize_flags_p (SANITIZE_MEMTAG));
+	  targetm.memtag.tag (bases[i],
+			      bot,
+			      tags[i],
+			      gen_int_mode (size, ptr_mode));
+	}
     }
 }
 
@@ -4264,11 +4278,20 @@ hwasan_emit_uncolour_frame (rtx dynamic, rtx vars, rtx_insn *before)
   rtx size_rtx = expand_simple_binop (Pmode, MINUS, top_rtx, bot_rtx,
 				  NULL_RTX, /* unsignedp = */0, OPTAB_DIRECT);
 
-  rtx ret = init_one_libfunc ("__hwasan_tag_memory");
-  emit_library_call (ret, LCT_NORMAL, VOIDmode,
-      bot_rtx, ptr_mode,
-      const0_rtx, QImode,
-      size_rtx, ptr_mode);
+  if (sanitize_flags_p (SANITIZE_HWADDRESS))
+    {
+      rtx ret = init_one_libfunc ("__hwasan_tag_memory");
+      emit_library_call (ret, LCT_NORMAL, VOIDmode,
+			 bot_rtx, ptr_mode,
+			 const0_rtx, QImode,
+			 size_rtx, ptr_mode);
+    }
+  else
+    {
+      gcc_assert (sanitize_flags_p (SANITIZE_MEMTAG));
+      targetm.memtag.copy_tag (bot_rtx, stack_pointer_rtx);
+      targetm.memtag.tag (bot_rtx, 0, 0, size_rtx);
+    }
 
   do_pending_stack_adjust ();
   rtx_insn *insns = get_insns ();
@@ -4301,6 +4324,8 @@ static GTY(()) tree hwasan_ctor_statements;
 void
 hwasan_finish_file (void)
 {
+  gcc_assert (sanitize_flags_p (SANITIZE_HWADDRESS));
+
   /* Do not emit constructor initialisation for the kernel.
      (the kernel has its own initialisation already).  */
   if (flag_sanitize & SANITIZE_KERNEL_HWADDRESS)
@@ -4355,6 +4380,7 @@ hwasan_check_func (bool is_store, bool recover_p, HOST_WIDE_INT size_in_bytes,
 bool
 hwasan_expand_check_ifn (gimple_stmt_iterator *iter, bool)
 {
+  gcc_assert (sanitize_flags_p (SANITIZE_HWADDRESS));
   gimple *g = gsi_stmt (*iter);
   location_t loc = gimple_location (g);
   bool recover_p;
@@ -4448,7 +4474,7 @@ hwasan_expand_mark_ifn (gimple_stmt_iterator *)
 bool
 gate_hwasan ()
 {
-  return memory_tagging_p ();
+  return sanitize_flags_p (SANITIZE_HWADDRESS);
 }
 
 namespace {
diff --git a/gcc/builtins.c b/gcc/builtins.c
index f8063c138a340a06d45b01c9bb7f43caf75e78b2..416ee2b631d22ffab0ca428bc7fec9127382ef3e 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5391,6 +5391,17 @@ expand_builtin_frame_address (tree fndecl, tree exp)
 static rtx
 expand_builtin_alloca (tree exp)
 {
+  /* TODO For hardware memory tagging we will need to call the backend to tag
+     this memory since the `hwasan` pass will not be run.
+
+     The `hwasan` pass is mainly to add HWASAN_CHECK internal functions where
+     checks should be made.  With hardware memory tagging the checks are done
+     automatically by the architecture.
+
+     The `hwasan` pass also modifies the behaviour of the alloca builtin
+     function in a target-independent manner, but when memory tagging is
+     handled by the backend it is more convenient to handle the tagging in the
+     alloca hook.  */
   rtx op0;
   rtx result;
   unsigned int align;
@@ -7012,6 +7023,9 @@ expand_builtin_set_thread_pointer (tree exp)
 static void
 expand_stack_restore (tree var)
 {
+  /* TODO If memory tagging is enabled through the hardware we need to uncolour
+     the stack from where we are to where we're going. (i.e. colour in the
+     background stack colour).  */
   rtx_insn *prev;
   rtx sa = expand_normal (var);
 
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 1e4d171e651ae10c5829e52248629c04b03c19f1..53b4658aa74c1e369fb139b0a29cdb6dea41dc3b 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1043,6 +1043,8 @@ public:
 
   /* HWASAN records the poly_int64 so it can handle any stack variable.  */
   auto_vec<poly_int64> hwasan_vec;
+  /* HWASAN needs to record untagged base pointers when there isn't hardware
+     memory tagging enabled by the architecture.  */
   auto_vec<rtx> hwasan_untagged_base_vec;
   auto_vec<rtx> hwasan_base_vec;
 
@@ -1177,7 +1179,8 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	      gcc_assert (stack_vars[i].alignb >= HWASAN_TAG_GRANULE_SIZE);
 	      offset = alloc_stack_frame_space (0, HWASAN_TAG_GRANULE_SIZE);
 	      data->hwasan_vec.safe_push (offset);
-	      data->hwasan_untagged_base_vec.safe_push (virtual_stack_vars_rtx);
+	      if (sanitize_flags_p (SANITIZE_HWADDRESS))
+		data->hwasan_untagged_base_vec.safe_push (virtual_stack_vars_rtx);
 	    }
 	  /* ASAN description strings don't yet have a syntax for expressing
 	     polynomial offsets.  */
@@ -1291,10 +1294,18 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	      /* An object with a large alignment requirement means that the
 		 alignment requirement is greater than the required alignment
 		 for tags.  */
-	      if (!large_untagged_base)
-		large_untagged_base = hwasan_create_untagged_base (large_base);
 	      data->hwasan_vec.safe_push (large_alloc);
-	      data->hwasan_untagged_base_vec.safe_push (large_untagged_base);
+
+	      if (sanitize_flags_p (SANITIZE_HWADDRESS))
+	      {
+		/* We only need to record the untagged bases for HWASAN, since
+		   the runtime library for that doesn't accept tagged pointers.
+		   For hardware implementations of memory tagging there is no
+		   use of recording these untagged versions.  */
+		if (!large_untagged_base)
+		  large_untagged_base = hwasan_create_untagged_base (large_base);
+		data->hwasan_untagged_base_vec.safe_push (large_untagged_base);
+	      }
 	    }
 	  offset = large_alloc;
 	  large_alloc += stack_vars[i].size;
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 7bbeed453cf87382b1776ff52991b5cf6ab9204e..7f23d377308f3b517e4ae08eba3a56e8c6565e8a 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -247,6 +247,7 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_RCPC8_4	   (aarch64_isa_flags & AARCH64_FL_RCPC8_4)
 #define AARCH64_ISA_V8_5	   (aarch64_isa_flags & AARCH64_FL_V8_5)
 #define AARCH64_ISA_TME		   (aarch64_isa_flags & AARCH64_FL_TME)
+#define AARCH64_ISA_MEMTAG	   (aarch64_isa_flags & AARCH64_FL_MEMTAG)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c556bcd1c37c3c4fdd9a829a28ee4ff56819b89e..a21b5918859305dd6301ac7cb3a4e16271b3cb10 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7459,6 +7459,12 @@ aarch64_classify_address (struct aarch64_address_info *info,
       && (code != POST_INC && code != REG))
     return false;
 
+  /* MTE unspec is not a valid address directly.  It must first be put into a
+     register.  */
+  if (GET_CODE (x) == UNSPEC
+      && XINT (x, 1) == UNSPEC_ADDTAG)
+    return false;
+
   gcc_checking_assert (GET_MODE (x) == VOIDmode
 		       || SCALAR_INT_MODE_P (GET_MODE (x)));
 
@@ -20281,6 +20287,76 @@ aarch64_can_tag_addresses ()
   return true;
 }
 
+/* Implement TARGET_MEMTAG_HAS_MEMORY_TAGGING.  We support automatic memory
+   tagging and tag checking if we have AARCH64_ISA_MEMTAG.  */
+bool
+aarch64_has_memtag_isa ()
+{
+  return AARCH64_ISA_MEMTAG;
+}
+
+/* Implement TARGET_MEMTAG_TAG for AArch64. This is only available when
+   AARCH64_ISA_MEMTAG is available.  TODO Eventually we would just want
+   something to emit a loop of STG or ST2G.  Currently unimplemented.  */
+void
+aarch64_tag_memory (rtx tagged_start, poly_int64 address_offset, uint8_t tag_offset,
+		    rtx size)
+{
+  return;
+}
+
+void
+aarch64_gentag (rtx a, rtx b)
+{
+  if ( ! AARCH64_ISA_MEMTAG)
+    return default_memtag_gentag (a, b);
+
+  emit_insn (gen_random_tag (a, b));
+}
+
+rtx
+aarch64_addtag (rtx base, poly_int64 addr_offset, uint8_t tag_offset)
+{
+  /* Handle problems like the offset is too large by creating  */
+  if (! AARCH64_ISA_MEMTAG)
+    return default_memtag_addtag (base, addr_offset, tag_offset);
+
+  /* If the tag offset is zero then leave it as a PLUS.
+     This can be optimised easier by the RTL backends.  */
+  if (tag_offset == 0)
+    return plus_constant (Pmode, base, addr_offset);
+  return gen_rtx_UNSPEC (DImode,
+			 gen_rtvec (3,
+				    base,
+				    gen_int_mode (addr_offset, DImode),
+				    GEN_INT (tag_offset)),
+			 UNSPEC_ADDTAG);
+}
+
+rtx
+aarch64_addtag_force_operand (rtx oper, rtx target)
+{
+  if (GET_CODE (oper) == UNSPEC
+      && XINT (oper, 1) == UNSPEC_ADDTAG)
+    {
+      rtx base = XVECEXP (oper, 0, 0);
+      rtx offset = XVECEXP (oper, 0, 1);
+      rtx tag_offset = XVECEXP (oper, 0, 2);
+      if (! aarch64_MTE_value_offset (offset, DImode))
+	{
+	  rtx newreg = gen_reg_rtx (DImode);
+	  emit_insn (gen_adddi3 (newreg, base, offset));
+	  offset = const0_rtx;
+	  base = newreg;
+	}
+
+      rtx temp_reg = (target && REG_P (target)) ? target : gen_reg_rtx (DImode);
+      emit_insn (gen_plain_offset_tagdi (temp_reg, base, offset, tag_offset));
+      return temp_reg;
+    }
+  return NULL_RTX;
+}
+
 /* Implement TARGET_ASM_FILE_END for AArch64.  This adds the AArch64 GNU NOTE
    section at the end if needed.  */
 #define GNU_PROPERTY_AARCH64_FEATURE_1_AND	0xc0000000
@@ -20851,6 +20927,21 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_MEMTAG_CAN_TAG_ADDRESSES
 #define TARGET_MEMTAG_CAN_TAG_ADDRESSES aarch64_can_tag_addresses
 
+#undef TARGET_MEMTAG_HAS_MEMORY_TAGGING
+#define TARGET_MEMTAG_HAS_MEMORY_TAGGING aarch64_has_memtag_isa
+
+#undef TARGET_MEMTAG_TAG
+#define TARGET_MEMTAG_TAG aarch64_tag_memory
+
+#undef TARGET_MEMTAG_GENTAG
+#define TARGET_MEMTAG_GENTAG aarch64_gentag
+
+#undef TARGET_MEMTAG_ADDTAG
+#define TARGET_MEMTAG_ADDTAG aarch64_addtag
+
+#undef TARGET_MEMTAG_ADDTAG_FORCE_OPERAND
+#define TARGET_MEMTAG_ADDTAG_FORCE_OPERAND aarch64_addtag_force_operand
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index e4f9005c27f6f57efba31004389dbed9fd91a360..880d2b40d09b9b229e03aa9bf56ce5ae77a0d350 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -243,6 +243,8 @@ (define_c_enum "unspec" [
     UNSPEC_SPECULATION_TRACKER
     UNSPEC_COPYSIGN
     UNSPEC_TTEST		; Represent transaction test.
+    UNSPEC_GENTAG
+    UNSPEC_ADDTAG
 ])
 
 (define_c_enum "unspecv" [
@@ -445,6 +447,26 @@ (define_expand "cbranch<mode>4"
   "
 )
 
+(define_insn "random_tag"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")] UNSPEC_GENTAG))]
+  "AARCH64_ISA_MEMTAG"
+  "mov\\t%0, %1 // irg\\t%0, %1"
+)
+
+(define_insn "plain_offset_tagdi"
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
+    (unspec:DI
+	[(match_operand:DI 1 "register_operand" "r,r")
+	 (match_operand:DI 2 "aarch64_MTE_value_offset" "I,J")
+	 (match_operand:DI 3 "aarch64_MTE_tag_offset" "i,i")]
+      UNSPEC_ADDTAG))]
+  "AARCH64_ISA_MEMTAG"
+  "@
+  add\\t%0, %1, %2     // addg\\t%0, %1, %2, %3
+  sub\\t%0, %1, #%n2   // subg\\t%0, %1, #%n2, %3"
+)
+
 (define_expand "cbranchcc4"
   [(set (pc) (if_then_else
 	      (match_operator 0 "aarch64_comparison_operator"
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index d8c377994d6f11a58683b19d7ae9d594e5033561..ede9aa49ef14b8cc453098beac613cc3ed181718 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -144,6 +144,18 @@ (define_predicate "aarch64_pluslong_immediate"
   (and (match_code "const_int")
        (match_test "(INTVAL (op) < 0xffffff && INTVAL (op) > -0xffffff)")))
 
+(define_predicate "aarch64_MTE_add_temp"
+  (ior (match_code "const_int") (match_code "const_poly_int")))
+
+(define_predicate "aarch64_MTE_tag_offset"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 16)")))
+
+(define_predicate "aarch64_MTE_value_offset"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 1008) && !(INTVAL (op) & 0xf)")))
+
+
 (define_predicate "aarch64_pluslong_strict_immedate"
   (and (match_operand 0 "aarch64_pluslong_immediate")
        (not (match_operand 0 "aarch64_plus_immediate"))))
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 99949ba0b9317a89019ab5a6d9383e89f2d6ce3c..e5f83932b1d0c93b97e58b4e2cdc57f45617bfa3 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2976,6 +2976,25 @@ A target hook which lets a backend compute the set of pressure classes to  be us
 True if backend architecture naturally supports ignoring the top byte of pointers.  This feature means that -fsanitize=hwaddress can work.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_MEMTAG_HAS_MEMORY_TAGGING ()
+True if backend architecture naturally supports tagging addresses and checking those tags.  This feature means that -fsanitize=memtag can work.
+@end deftypefn
+
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_TAG_SIZE ()
+Return the size in bits of a tag for this platform.
+@end deftypefn
+
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_GRANULE_SIZE ()
+Return how many bytes in real memory each byte in shadow memory represents.
+I.e. one byte in shadow memory being colour 1 implies the associated
+targetm.memtag.granule_size () bytes in real memory must all be accessed by
+pointers tagged as 1.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_MEMTAG_COPY_TAG (rtx @var{to}, rtx @var{from})
+Emit insns to copy the tag in FROM to TO.
+@end deftypefn
+
 @deftypefn {Target Hook} rtx TARGET_MEMTAG_ADDTAG (rtx @var{base}, poly_int64 @var{addr_offset}, uint8_t @var{tag_offset})
 Emit an RTX representing BASE offset in value by ADDR_OFFSET and in tag by TAG_OFFSET.
 The resulting RTX must either be a valid memory address or be able to get
@@ -2996,6 +3015,15 @@ This hook is most often implemented by emitting instructions to put the
 expression into a pseudo register, then returning that pseudo register.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_MEMTAG_TAG (rtx @var{tagged_start}, poly_int64 @var{address_offset}, uint8_t @var{tag_offset}, rtx @var{size})
+This function should emit an RTX to colour memory.
+It's given arguments TAGGED_START, ADDRESS_OFFSET, TAG_OFFSET, SIZE, where
+TAGGED_START and SIZE are RTL expressions, ADDRESS_OFFSET is a poly_int64
+and TAG_OFFSET is a uint8_t.
+It should emit RTL to colour "shadow memory" for the relevant range with
+the colour of the tag it was given.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_MEMTAG_GENTAG (rtx @var{base}, rtx @var{untagged})
 Set the BASE argument to UNTAGGED with some random tag.
 This function is used to generate a tagged base for the current stack frame.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index ab18039b09f9b0a93338fa716d5d044555371ddc..659a07d8b9fb4e2b2c5b7d6c9899be9c723c4c09 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2376,10 +2376,20 @@ in the reload pass.
 
 @hook TARGET_MEMTAG_CAN_TAG_ADDRESSES
 
+@hook TARGET_MEMTAG_HAS_MEMORY_TAGGING
+
+@hook TARGET_MEMTAG_TAG_SIZE
+
+@hook TARGET_MEMTAG_GRANULE_SIZE
+
+@hook TARGET_MEMTAG_COPY_TAG
+
 @hook TARGET_MEMTAG_ADDTAG
 
 @hook TARGET_MEMTAG_ADDTAG_FORCE_OPERAND
 
+@hook TARGET_MEMTAG_TAG
+
 @hook TARGET_MEMTAG_GENTAG
 
 @node Stack and Calling
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 57d8ff9a1a010409d966230140df1017bc3584a8..4ab2bf2f466a7ad509d20e8e4bcfb9df72dc1335 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -259,6 +259,7 @@ enum sanitize_code {
   SANITIZE_HWADDRESS = 1UL << 28,
   SANITIZE_USER_HWADDRESS = 1UL << 29,
   SANITIZE_KERNEL_HWADDRESS = 1UL << 30,
+  SANITIZE_MEMTAG = 1UL << 31,
   SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
 		       | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
diff --git a/gcc/gcc.c b/gcc/gcc.c
index cf1bd9de660f32f060b9277f89a562873a48684a..2e926c2c3da22ea17cc69b3c8d6cf18b07f93dbd 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -9463,6 +9463,8 @@ sanitize_spec_function (int argc, const char **argv)
     return (flag_sanitize & SANITIZE_KERNEL_HWADDRESS) ? "" : NULL;
   if (strcmp (argv[0], "thread") == 0)
     return (flag_sanitize & SANITIZE_THREAD) ? "" : NULL;
+  if (strcmp (argv[0], "memtag") == 0)
+    return (flag_sanitize & SANITIZE_MEMTAG) ? "" : NULL;
   if (strcmp (argv[0], "undefined") == 0)
     return ((flag_sanitize
 	     & (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT))
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index c692ae86ec6b5fbe345558d7f412f6ecd666bfa1..64d48813c3d16d9fd1888cf74597cf10d0dd3b83 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -502,9 +502,6 @@ expand_HWASAN_MARK (internal_fn, gcall *gc)
   gcc_checking_assert (TREE_CODE (base) == ADDR_EXPR);
   rtx base_rtx = expand_normal (base);
 
-  rtx tag = is_poison ? const0_rtx : hwasan_extract_tag (base_rtx);
-  rtx address = hwasan_create_untagged_base (base_rtx);
-
   tree len = gimple_call_arg (gc, 2);
   gcc_assert (tree_fits_shwi_p (len));
   unsigned HOST_WIDE_INT size_in_bytes = tree_to_shwi (len);
@@ -513,13 +510,25 @@ expand_HWASAN_MARK (internal_fn, gcall *gc)
   size_in_bytes = (size_in_bytes + tg_mask) & ~tg_mask;
   rtx size = gen_int_mode (size_in_bytes, Pmode);
 
-  rtx func = init_one_libfunc ("__hwasan_tag_memory");
-  emit_library_call (func,
-      LCT_NORMAL,
-      VOIDmode,
-      address, ptr_mode,
-      tag, QImode,
-      size, ptr_mode);
+  if (sanitize_flags_p (SANITIZE_HWADDRESS))
+    {
+      rtx func = init_one_libfunc ("__hwasan_tag_memory");
+      rtx address = hwasan_create_untagged_base (base_rtx);
+      rtx tag = is_poison ? const0_rtx : hwasan_extract_tag (base_rtx);
+      emit_library_call (func,
+			 LCT_NORMAL,
+			 VOIDmode,
+			 address, ptr_mode,
+			 tag, QImode,
+			 size, ptr_mode);
+    }
+  else
+    {
+      gcc_assert (sanitize_flags_p (SANITIZE_MEMTAG));
+      if (is_poison)
+	targetm.memtag.copy_tag (base_rtx, stack_pointer_rtx);
+      targetm.memtag.tag (base_rtx, 0, 0, size);
+    }
 }
 
 /* This should get expanded in the sanopt pass.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index 88a94286e71f61f2dce907018e5185f63a830804..659eeb0a62344c250314892a974d245d72b9a84e 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1200,6 +1200,27 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
 	      "%<-fsanitize=hwaddress%> is incompatible with "
 	      "%<-fsanitize=thread%>");
 
+  /* Memtag and ASan conflict with each other.  */
+  if ((opts->x_flag_sanitize & SANITIZE_ADDRESS)
+      && (opts->x_flag_sanitize & SANITIZE_MEMTAG))
+    error_at (loc,
+	      "%<-fsanitize=memtag%> is incompatible with both "
+	      "%<-fsanitize=address%> and %<-fsanitize=kernel-address%>");
+
+  /* Memtag and HWASan conflict with each other.  */
+  if ((opts->x_flag_sanitize & SANITIZE_HWADDRESS)
+      && (opts->x_flag_sanitize & SANITIZE_MEMTAG))
+    error_at (loc,
+	      "%<-fsanitize=memtag%> is incompatible with both "
+	      "%<-fsanitize=hwaddress%> and %<-fsanitize=kernel-hwaddress%>");
+
+  /* Memtag conflicts with TSan.  */
+  if ((opts->x_flag_sanitize & SANITIZE_MEMTAG)
+      && (opts->x_flag_sanitize & SANITIZE_THREAD))
+    error_at (loc,
+	      "%<-fsanitize=memtag%> is incompatible with "
+	      "%<-fsanitize=thread%>");
+
   /* Check error recovery for -fsanitize-recover option.  */
   for (int i = 0; sanitizer_opts[i].name != NULL; ++i)
     if ((opts->x_flag_sanitize_recover & sanitizer_opts[i].flag)
@@ -1220,7 +1241,8 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
   /* Enable -fsanitize-address-use-after-scope if address sanitizer is
      enabled.  */
   if (((opts->x_flag_sanitize & SANITIZE_USER_ADDRESS)
-       || (opts->x_flag_sanitize & SANITIZE_USER_HWADDRESS))
+       || (opts->x_flag_sanitize & SANITIZE_USER_HWADDRESS)
+       || (opts->x_flag_sanitize & SANITIZE_MEMTAG))
       && !opts_set->x_flag_sanitize_address_use_after_scope)
     opts->x_flag_sanitize_address_use_after_scope = true;
 
@@ -1849,6 +1871,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
 #define SANITIZER_OPT(name, flags, recover) \
     { #name, flags, sizeof #name - 1, recover }
   SANITIZER_OPT (address, (SANITIZE_ADDRESS | SANITIZE_USER_ADDRESS), true),
+  SANITIZER_OPT (memtag, (SANITIZE_MEMTAG), true),
   SANITIZER_OPT (hwaddress, (SANITIZE_HWADDRESS | SANITIZE_USER_HWADDRESS),
 		 true),
   SANITIZER_OPT (kernel-address, (SANITIZE_ADDRESS | SANITIZE_KERNEL_ADDRESS),
diff --git a/gcc/target.def b/gcc/target.def
index badae860335e4a570f189c9f8011da5ab8c15439..2a366b7b58ed459574ab9aa3f008ed0f05bf2666 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6716,6 +6716,30 @@ DEFHOOK
  bool, (), default_memtag_can_tag_addresses)
 
 DEFHOOK
+(has_memory_tagging,
+ "True if backend architecture naturally supports tagging addresses and\
+ checking those tags.  This feature means that -fsanitize=memtag can work.",
+ bool, (), default_memtag_has_memory_tagging)
+
+DEFHOOK
+(tag_size,
+ "Return the size in bits of a tag for this platform.",
+ uint8_t, (), default_memtag_tag_size)
+
+DEFHOOK
+(granule_size,
+ "Return how many bytes in real memory each byte in shadow memory represents.\n\
+I.e. one byte in shadow memory being colour 1 implies the associated\n\
+targetm.memtag.granule_size () bytes in real memory must all be accessed by\n\
+pointers tagged as 1.",
+uint8_t, (), default_memtag_granule_size)
+
+DEFHOOK
+(copy_tag,
+ "Emit insns to copy the tag in FROM to TO.",
+void, (rtx to, rtx from), default_memtag_copy_tag)
+
+DEFHOOK
 (addtag,
  "Emit an RTX representing BASE offset in value by ADDR_OFFSET and in tag by\
  TAG_OFFSET.\n\
@@ -6740,6 +6764,17 @@ expression into a pseudo register, then returning that pseudo register.",
 rtx, (rtx oper, rtx target), NULL)
 
 DEFHOOK
+(tag,
+ "This function should emit an RTX to colour memory.\n\
+It's given arguments TAGGED_START, ADDRESS_OFFSET, TAG_OFFSET, SIZE, where\n\
+TAGGED_START and SIZE are RTL expressions, ADDRESS_OFFSET is a poly_int64\n\
+and TAG_OFFSET is a uint8_t.\n\
+It should emit RTL to colour \"shadow memory\" for the relevant range with\n\
+the colour of the tag it was given.",
+  void, (rtx tagged_start, poly_int64 address_offset, uint8_t tag_offset, rtx size),
+NULL)
+
+DEFHOOK
 (gentag,
  "Set the BASE argument to UNTAGGED with some random tag.\n\
 This function is used to generate a tagged base for the current stack frame.",
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 4418db74f52f669c22702f5a4a093172f48a1b46..9c69589e33121638d349b882b4f26d29ac449d20 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -285,6 +285,10 @@ extern void default_remove_extra_call_preserved_regs (rtx_insn *,
 						      HARD_REG_SET *);
 
 extern bool default_memtag_can_tag_addresses ();
+extern bool default_memtag_has_memory_tagging ();
+extern uint8_t default_memtag_tag_size ();
+extern uint8_t default_memtag_granule_size ();
 extern void default_memtag_gentag (rtx, rtx);
+extern void default_memtag_copy_tag (rtx, rtx);
 extern rtx default_memtag_addtag (rtx, poly_int64, uint8_t);
 #endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 048c7f3cff5d87d0d40b93f6cf8cb41de670711d..b8a74a5f3750dad102311f1e4298a63416f1261b 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -2377,6 +2377,24 @@ default_memtag_can_tag_addresses ()
   return false;
 }
 
+bool
+default_memtag_has_memory_tagging ()
+{
+  return false;
+}
+
+uint8_t
+default_memtag_tag_size ()
+{
+  return 8;
+}
+
+uint8_t
+default_memtag_granule_size ()
+{
+  return 16;
+}
+
 void
 default_memtag_gentag (rtx base, rtx untagged)
 {
@@ -2403,6 +2421,70 @@ default_memtag_gentag (rtx base, rtx untagged)
     }
 }
 
+void
+default_memtag_copy_tag (rtx to, rtx from)
+{
+  /* TODO:
+      I want the sequence here to be as minimal as possible, since this code
+      sequence could be emitted many times.
+
+      My first attempt was the below,
+
+	  rtx temp = hwasan_extract_tag (from);
+	  store_bit_field (to, 8, 56, 0, 0,
+			  QImode, temp, false);
+
+     Unfortunately, despite using far fewer instructions, for AArch64 this can
+     cause a problem in LRA if the `to` RTX eventually resolves to being the
+     stack pointer.
+     This happens because the instruction that gets emitted from
+     `store_bit_field` corresponds to a pattern that can't handle the stack
+     pointer, and LRA can't figure out how to use a temporary register in the
+     `bfi` instruction's place.
+
+     This doesn't cause a problem at the moment since there's currently no way
+     the stack pointer should be given to this function.  The hook is only used
+     when poisoning variables with HWASAN_MARK, and in that function the `to`
+     RTX should always point to a tagged variable on the stack (and since
+     the variable is tagged it can't be the stack pointer, which is
+     untagged).
+
+     Eventually we will be generating random tags as the "start" tag for each
+     frame.  When this happens we can no longer avoid the background colour at
+     compile time since we will not know what offset to avoid.
+     This will mean we no longer avoid a `tag_offset` of 0, and hence
+     `hwasan_with_tag` could emit simple PLUS statements.
+
+     When that happens, the last variable on the stack could very well have
+     a zero tag offset and somewhere else in the compiler could optimise that
+     to simply use the stack pointer.
+
+     That would trigger an ICE due to LRA being unable to reload the
+     `insv_regdi` pattern.
+
+     The code sequence I'm emitting at the moment works just fine in all
+     circumstances, but it would be nice to find a smaller sequence.  */
+  rtx temp = gen_reg_rtx (Pmode);
+  uint64_t tag_mask = 0xFFUL << HWASAN_SHIFT;
+  emit_insn (gen_anddi3 (to, to, GEN_INT (~tag_mask)));
+  /* Can't use GEN_INT (tag_mask) since GEN_INT calls `gen_rtx_CONST_INT` which
+     takes a `HOST_WIDE_INT`.
+     HOST_WIDE_INT can't hold a uint64_t with the top bit set, hence in order
+     to avoid UB we have to emit instructions for the machine to use some
+     uint64_t arithmetic.
+
+     The extra instructions seem to end up producing the same output in most
+     cases (I've not yet seen a case where generating the mask in three
+     `emit_insn` calls rather than one changes the final codegen).  */
+
+  /* emit_insn (gen_anddi3 (temp, from, GEN_INT (tag_mask)));  */
+  emit_move_insn (temp, GEN_INT (0xff));
+  emit_insn (gen_ashldi3 (temp, temp, HWASAN_SHIFT_RTX));
+  emit_insn (gen_anddi3 (temp, from, temp));
+
+  emit_insn (gen_iordi3 (to, to, temp));
+}
+
 rtx
 default_memtag_addtag (rtx base, poly_int64 offset, uint8_t tag_offset)
 {
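The masking sequence that `default_memtag_copy_tag` emits as RTL can be sketched in plain C. This is an editor's illustration of the intended effect, not code from the patch; it assumes `HWASAN_SHIFT` is 56, i.e. the tag lives in the pointer's top byte:

```c
#include <assert.h>
#include <stdint.h>

/* Model of the emitted RTL: clear the tag byte in TO, extract the tag
   byte from FROM, then OR the extracted tag into TO.  A HWASAN_SHIFT
   of 56 (tag occupying the top byte) is an assumption here.  */
#define HWASAN_SHIFT 56

static uint64_t
copy_tag (uint64_t to, uint64_t from)
{
  uint64_t tag_mask = 0xffULL << HWASAN_SHIFT;  /* 0xFF00000000000000  */
  to &= ~tag_mask;                  /* the anddi3 on TO               */
  uint64_t temp = from & tag_mask;  /* the mov/ashldi3/anddi3 on FROM */
  return to | temp;                 /* the final iordi3               */
}
```

With `to = 0x2a00000000001000` and `from = 0x7f00000000002000` this produces `0x7f00000000001000`: the address bits of `to` with the tag byte of `from`.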
diff --git a/gcc/testsuite/gcc.dg/hwasan/poly-int-stack-vars.c b/gcc/testsuite/gcc.dg/hwasan/poly-int-stack-vars.c
new file mode 100644
index 0000000000000000000000000000000000000000..cb0ca7d3a06c5a2de258ba20be974009410a7a44
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/hwasan/poly-int-stack-vars.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+
+/* TODO Need to give this test the correct arguments
+   (i.e. compile with memory tagging enabled).
+
+   This is needed because the introduction of HWASAN_CHECK statements stops the
+   compiler from being able to vectorise the loop with SVE, hence stopping the
+   introduction of `addvl` instructions.
+
+   In other words, this test doesn't really belong in this directory, but I
+   haven't yet created the directory for checking memtag code generation and
+   this is a dummy commit anyway.  */
+/* Non-constant sizes.
+   Code designed to store SVE registers on the stack.
+   This is needed to exercise the poly_int64 handling for HWASAN and MTE
+   instrumentation. */
+int u;
+
+void
+foo_sve (int *p)
+{
+  int i;
+  #pragma omp for simd lastprivate(u) schedule (static, 32)
+  for (i = 0; i < 1024; i++)
+    u = p[i];
+}
+
+void
+bar_sve (int *p)
+{
+  int i;
+  #pragma omp taskloop simd lastprivate(u)
+  for (i = 0; i < 1024; i++)
+    u = p[i];
+}
+
+/* Ensure we are storing SVE vectors on the stack -- otherwise we're not
+   exercising the code path for poly_int64s.  */
+/* { dg-final { scan-assembler "addvl" } } */
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 5e077892a6371a7b534007d5dffd421b218ea694..6dcd830c3a93a93ce27d9b80137af6d9d1288ff7 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1765,6 +1765,16 @@ process_options (void)
       flag_sanitize &= ~SANITIZE_HWADDRESS;
     }
 
+  /* Memtag requires hardware support.  */
+  if ((flag_sanitize & SANITIZE_MEMTAG)
+      && !targetm.memtag.has_memory_tagging ())
+    {
+      warning_at (UNKNOWN_LOCATION, 0,
+		  "%<-fsanitize=memtag%> requires hardware support "
+		  "that is not advertised for this target");
+      flag_sanitize &= ~SANITIZE_MEMTAG;
+    }
+
  /* Do not use IPA optimizations for register allocation if profiler is active
     or patchable function entries are inserted for run-time instrumentation
     or port does not emit prologue and epilogue as RTL.  */
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index b4b5c903e13b1695daeff66a680c64fa7da0829d..73ff987544487e44c4c14419fdc2b27b8b6ddb25 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -1704,7 +1704,9 @@ execute_update_addresses_taken (void)
 		  gimple_ior_addresses_taken (addresses_taken, stmt);
 		  gimple_call_set_arg (stmt, 1, arg);
 		}
-	      else if (is_asan_mark_p (stmt)
+	      else if ((is_asan_mark_p (stmt)
+			&& (!sanitize_flags_p (SANITIZE_MEMTAG)
+			    || !asan_mark_p (stmt, ASAN_MARK_POISON)))
 		       || gimple_call_internal_p (stmt, IFN_GOMP_SIMT_ENTER))
 		;
 	      else
Patch

diff --git a/gcc/asan.h b/gcc/asan.h
index ff6adf2391ee1602a3c15755312a04f82d6369ce..71dbaee708d0e64911f568503655478b8720f494 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -27,10 +27,10 @@  extern void hwasan_finish_file (void);
 extern void hwasan_record_base (rtx);
 extern uint8_t hwasan_current_tag ();
 extern void hwasan_increment_tag ();
+extern rtx hwasan_extract_tag (rtx);
 extern rtx hwasan_with_tag (rtx, poly_int64);
 extern void hwasan_tag_init ();
 extern rtx hwasan_create_untagged_base (rtx);
-extern rtx hwasan_extract_tag (rtx tagged_pointer);
 extern rtx hwasan_base ();
 extern void hwasan_emit_prologue (rtx *, rtx *, poly_int64 *, uint8_t *, size_t);
 extern rtx_insn *hwasan_emit_uncolour_frame (rtx, rtx, rtx_insn *);
@@ -95,8 +95,12 @@  extern hash_set <tree> *asan_used_labels;
 /* NOTE: The values below define an ABI and are hard-coded to these values in
    libhwasan, hence they can't be changed independently here.  */
 /* How many bits are used to store a tag in a pointer.
-   HWASAN uses the entire top byte of a pointer (i.e. 8 bits).  */
-#define HWASAN_TAG_SIZE 8
+   HWASAN uses the entire top byte of a pointer (i.e. 8 bits).
+   For aarch64 MTE we have 4 bits per colour and that is advertised by the
+   backend hook.  */
+#define HWASAN_TAG_SIZE (sanitize_flags_p (SANITIZE_MEMTAG) \
+			 ? targetm.memtag.tag_size () \
+			 : 8)
 /* Tag Granule of HWASAN shadow stack.
    This is the size in real memory that each byte in the shadow memory refers
    to.  I.e. if a variable is X bytes long in memory then it's colour in shadow
@@ -105,7 +109,12 @@  extern hash_set <tree> *asan_used_labels;
    that are neighbours in memory and share a tag granule would need to share
    the same colour (the shared tag granule can only store one colour).  */
 #define HWASAN_TAG_SHIFT_SIZE 4
-#define HWASAN_TAG_GRANULE_SIZE (1ULL << HWASAN_TAG_SHIFT_SIZE)
+#define HWASAN_TAG_GRANULE_SIZE (sanitize_flags_p (SANITIZE_MEMTAG) \
+				 ? targetm.memtag.granule_size () \
+				 : (1ULL << HWASAN_TAG_SHIFT_SIZE))
+
+/* The following HWASAN_* macros are only used for HWASAN (not MEMTAG), which
+   is why there is no predicate.  */
 /* Define the tag for the stack background.
    This defines what colour the stack pointer will be and hence what tag all
    variables that are not given special tags are (e.g. spilled registers,
diff --git a/gcc/asan.c b/gcc/asan.c
index ef7c90e3358c8fa880b8e4002996f27541c26953..5769d1236908e6d8c75018f04f855928665e4126 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1480,7 +1480,8 @@  asan_redzone_buffer::flush_if_full (void)
 bool
 memory_tagging_p ()
 {
-    return sanitize_flags_p (SANITIZE_HWADDRESS);
+    return sanitize_flags_p (SANITIZE_HWADDRESS)
+      || sanitize_flags_p (SANITIZE_MEMTAG);
 }
 
 /* Are we tagging the stack?  */
@@ -3952,7 +3953,9 @@  hwasan_tag_init ()
   asan_used_labels = NULL;
 
   hwasan_base_ptr = NULL_RTX;
-  tag_offset = HWASAN_STACK_BACKGROUND + 1;
+  tag_offset = sanitize_flags_p (SANITIZE_MEMTAG)
+    ? 0
+    : HWASAN_STACK_BACKGROUND + 1;
 }
 
 tree
@@ -4220,20 +4223,31 @@  hwasan_emit_prologue (rtx *bases,
       if (size.is_constant (&tmp))
 	gcc_assert (tmp % HWASAN_TAG_GRANULE_SIZE == 0);
 
-      rtx ret = init_one_libfunc ("__hwasan_tag_memory");
-      rtx base_tag = hwasan_extract_tag (bases[i]);
-      /* In the case of tag overflow we would want modulo wrapping -- which
-	 should be given from the `plus_constant` in QImode.  */
-      rtx tag_colour = plus_constant (QImode, base_tag, tags[i]);
-      emit_library_call (ret,
-			 LCT_NORMAL,
-			 VOIDmode,
-			 plus_constant (ptr_mode, untagged_bases[i], bot),
-			 ptr_mode,
-			 tag_colour,
-			 QImode,
-			 gen_int_mode (size, ptr_mode),
-			 ptr_mode);
+      if (sanitize_flags_p (SANITIZE_HWADDRESS))
+	{
+	  rtx ret = init_one_libfunc ("__hwasan_tag_memory");
+	  rtx base_tag = hwasan_extract_tag (bases[i]);
+	  /* In the case of tag overflow we would want modulo wrapping -- which
+	     should be given from the `plus_constant` in QImode.  */
+	  rtx tag_colour = plus_constant (QImode, base_tag, tags[i]);
+	  emit_library_call (ret,
+			     LCT_NORMAL,
+			     VOIDmode,
+			     plus_constant (ptr_mode, untagged_bases[i], bot),
+			     ptr_mode,
+			     tag_colour,
+			     QImode,
+			     gen_int_mode (size, ptr_mode),
+			     ptr_mode);
+	}
+      else
+	{
+	  gcc_assert (sanitize_flags_p (SANITIZE_MEMTAG));
+	  targetm.memtag.tag (bases[i],
+			      bot,
+			      tags[i],
+			      gen_int_mode (size, ptr_mode));
+	}
     }
 }
 
@@ -4264,11 +4278,20 @@  hwasan_emit_uncolour_frame (rtx dynamic, rtx vars, rtx_insn *before)
   rtx size_rtx = expand_simple_binop (Pmode, MINUS, top_rtx, bot_rtx,
 				  NULL_RTX, /* unsignedp = */0, OPTAB_DIRECT);
 
-  rtx ret = init_one_libfunc ("__hwasan_tag_memory");
-  emit_library_call (ret, LCT_NORMAL, VOIDmode,
-      bot_rtx, ptr_mode,
-      const0_rtx, QImode,
-      size_rtx, ptr_mode);
+  if (sanitize_flags_p (SANITIZE_HWADDRESS))
+    {
+      rtx ret = init_one_libfunc ("__hwasan_tag_memory");
+      emit_library_call (ret, LCT_NORMAL, VOIDmode,
+			 bot_rtx, ptr_mode,
+			 const0_rtx, QImode,
+			 size_rtx, ptr_mode);
+    }
+  else
+    {
+      gcc_assert (sanitize_flags_p (SANITIZE_MEMTAG));
+      targetm.memtag.copy_tag (bot_rtx, stack_pointer_rtx);
+      targetm.memtag.tag (bot_rtx, 0, 0, size_rtx);
+    }
 
   do_pending_stack_adjust ();
   rtx_insn *insns = get_insns ();
@@ -4301,6 +4324,8 @@  static GTY(()) tree hwasan_ctor_statements;
 void
 hwasan_finish_file (void)
 {
+  gcc_assert (sanitize_flags_p (SANITIZE_HWADDRESS));
+
   /* Do not emit constructor initialisation for the kernel.
      (the kernel has its own initialisation already).  */
   if (flag_sanitize & SANITIZE_KERNEL_HWADDRESS)
@@ -4355,6 +4380,7 @@  hwasan_check_func (bool is_store, bool recover_p, HOST_WIDE_INT size_in_bytes,
 bool
 hwasan_expand_check_ifn (gimple_stmt_iterator *iter, bool)
 {
+  gcc_assert (sanitize_flags_p (SANITIZE_HWADDRESS));
   gimple *g = gsi_stmt (*iter);
   location_t loc = gimple_location (g);
   bool recover_p;
@@ -4448,7 +4474,7 @@  hwasan_expand_mark_ifn (gimple_stmt_iterator *)
 bool
 gate_hwasan ()
 {
-  return memory_tagging_p ();
+  return sanitize_flags_p (SANITIZE_HWADDRESS);
 }
 
 namespace {
diff --git a/gcc/builtins.c b/gcc/builtins.c
index f8063c138a340a06d45b01c9bb7f43caf75e78b2..416ee2b631d22ffab0ca428bc7fec9127382ef3e 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5391,6 +5391,17 @@  expand_builtin_frame_address (tree fndecl, tree exp)
 static rtx
 expand_builtin_alloca (tree exp)
 {
+  /* TODO For hardware memory tagging we will need to call the backend to tag
+     this memory since the `hwasan` pass will not be run.
+
+     The `hwasan` pass is mainly to add HWASAN_CHECK internal functions where
+     checks should be made.  With hardware memory tagging the checks are done
+     automatically by the architecture.
+
+     The `hwasan` pass also modifies the behaviour of the alloca builtin
+     function in a target-independent manner, but when memory tagging is
+     handled by the backend it is more convenient to handle the tagging in the
+     alloca hook.  */
   rtx op0;
   rtx result;
   unsigned int align;
@@ -7012,6 +7023,9 @@  expand_builtin_set_thread_pointer (tree exp)
 static void
 expand_stack_restore (tree var)
 {
+  /* TODO If memory tagging is enabled through the hardware we need to uncolour
+     the stack from where we are to where we're going. (i.e. colour in the
+     background stack colour).  */
   rtx_insn *prev;
   rtx sa = expand_normal (var);
 
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 1e4d171e651ae10c5829e52248629c04b03c19f1..53b4658aa74c1e369fb139b0a29cdb6dea41dc3b 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1043,6 +1043,8 @@  public:
 
   /* HWASAN records the poly_int64 so it can handle any stack variable.  */
   auto_vec<poly_int64> hwasan_vec;
+  /* HWASAN needs to record untagged base pointers when hardware memory
+     tagging is not provided by the architecture.  */
   auto_vec<rtx> hwasan_untagged_base_vec;
   auto_vec<rtx> hwasan_base_vec;
 
@@ -1177,7 +1179,8 @@  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	      gcc_assert (stack_vars[i].alignb >= HWASAN_TAG_GRANULE_SIZE);
 	      offset = alloc_stack_frame_space (0, HWASAN_TAG_GRANULE_SIZE);
 	      data->hwasan_vec.safe_push (offset);
-	      data->hwasan_untagged_base_vec.safe_push (virtual_stack_vars_rtx);
+	      if (sanitize_flags_p (SANITIZE_HWADDRESS))
+		data->hwasan_untagged_base_vec.safe_push (virtual_stack_vars_rtx);
 	    }
 	  /* ASAN description strings don't yet have a syntax for expressing
 	     polynomial offsets.  */
@@ -1291,10 +1294,18 @@  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	      /* An object with a large alignment requirement means that the
 		 alignment requirement is greater than the required alignment
 		 for tags.  */
-	      if (!large_untagged_base)
-		large_untagged_base = hwasan_create_untagged_base (large_base);
 	      data->hwasan_vec.safe_push (large_alloc);
-	      data->hwasan_untagged_base_vec.safe_push (large_untagged_base);
+
+	      if (sanitize_flags_p (SANITIZE_HWADDRESS))
+	      {
+		/* We only need to record the untagged bases for HWASAN, since
+		   the runtime library for that doesn't accept tagged pointers.
+		   For hardware implementations of memory tagging there is no
+		   use of recording these untagged versions.  */
+		if (!large_untagged_base)
+		  large_untagged_base = hwasan_create_untagged_base (large_base);
+		data->hwasan_untagged_base_vec.safe_push (large_untagged_base);
+	      }
 	    }
 	  offset = large_alloc;
 	  large_alloc += stack_vars[i].size;
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 7bbeed453cf87382b1776ff52991b5cf6ab9204e..7f23d377308f3b517e4ae08eba3a56e8c6565e8a 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -247,6 +247,7 @@  extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_RCPC8_4	   (aarch64_isa_flags & AARCH64_FL_RCPC8_4)
 #define AARCH64_ISA_V8_5	   (aarch64_isa_flags & AARCH64_FL_V8_5)
 #define AARCH64_ISA_TME		   (aarch64_isa_flags & AARCH64_FL_TME)
+#define AARCH64_ISA_MEMTAG	   (aarch64_isa_flags & AARCH64_FL_MEMTAG)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c556bcd1c37c3c4fdd9a829a28ee4ff56819b89e..a21b5918859305dd6301ac7cb3a4e16271b3cb10 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7459,6 +7459,12 @@  aarch64_classify_address (struct aarch64_address_info *info,
       && (code != POST_INC && code != REG))
     return false;
 
+  /* MTE unspec is not a valid address directly.  It must first be put into a
+     register.  */
+  if (GET_CODE (x) == UNSPEC
+      && XINT (x, 1) == UNSPEC_ADDTAG)
+    return false;
+
   gcc_checking_assert (GET_MODE (x) == VOIDmode
 		       || SCALAR_INT_MODE_P (GET_MODE (x)));
 
@@ -20281,6 +20287,76 @@  aarch64_can_tag_addresses ()
   return true;
 }
 
+/* Implement TARGET_MEMTAG_HAS_MEMORY_TAGGING.  We support automatic memory
+   tagging and tag checking if we have AARCH64_ISA_MEMTAG.  */
+bool
+aarch64_has_memtag_isa ()
+{
+  return AARCH64_ISA_MEMTAG;
+}
+
+/* Implement TARGET_MEMTAG_TAG for AArch64.  This is only used when
+   AARCH64_ISA_MEMTAG is enabled.  TODO Eventually we would just want
+   something to emit a loop of STG or ST2G.  Currently unimplemented.  */
+void
+aarch64_tag_memory (rtx tagged_start, poly_int64 address_offset, uint8_t tag_offset,
+		    rtx size)
+{
+  return;
+}
+
+void
+aarch64_gentag (rtx a, rtx b)
+{
+  if (!AARCH64_ISA_MEMTAG)
+    return default_memtag_gentag (a, b);
+
+  emit_insn (gen_random_tag (a, b));
+}
+
+rtx
+aarch64_addtag (rtx base, poly_int64 addr_offset, uint8_t tag_offset)
+{
+  /* Without MTE support, fall back to the default hook, which can handle
+     problems like an over-large offset.  */
+  if (!AARCH64_ISA_MEMTAG)
+    return default_memtag_addtag (base, addr_offset, tag_offset);
+
+  /* If the tag offset is zero then leave it as a PLUS.
+     This can be optimised easier by the RTL backends.  */
+  if (tag_offset == 0)
+    return plus_constant (Pmode, base, addr_offset);
+  return gen_rtx_UNSPEC (DImode,
+			 gen_rtvec (3,
+				    base,
+				    gen_int_mode (addr_offset, DImode),
+				    GEN_INT (tag_offset)),
+			 UNSPEC_ADDTAG);
+}
+
+rtx
+aarch64_addtag_force_operand (rtx oper, rtx target)
+{
+  if (GET_CODE (oper) == UNSPEC
+      && XINT (oper, 1) == UNSPEC_ADDTAG)
+    {
+      rtx base = XVECEXP (oper, 0, 0);
+      rtx offset = XVECEXP (oper, 0, 1);
+      rtx tag_offset = XVECEXP (oper, 0, 2);
+      if (! aarch64_MTE_value_offset (offset, DImode))
+	{
+	  rtx newreg = gen_reg_rtx (DImode);
+	  emit_insn (gen_adddi3 (newreg, base, offset));
+	  offset = const0_rtx;
+	  base = newreg;
+	}
+
+      rtx temp_reg = (target && REG_P (target)) ? target : gen_reg_rtx (DImode);
+      emit_insn (gen_plain_offset_tagdi (temp_reg, base, offset, tag_offset));
+      return temp_reg;
+    }
+  return NULL_RTX;
+}
+
 /* Implement TARGET_ASM_FILE_END for AArch64.  This adds the AArch64 GNU NOTE
    section at the end if needed.  */
 #define GNU_PROPERTY_AARCH64_FEATURE_1_AND	0xc0000000
@@ -20851,6 +20927,21 @@  aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_MEMTAG_CAN_TAG_ADDRESSES
 #define TARGET_MEMTAG_CAN_TAG_ADDRESSES aarch64_can_tag_addresses
 
+#undef TARGET_MEMTAG_HAS_MEMORY_TAGGING
+#define TARGET_MEMTAG_HAS_MEMORY_TAGGING aarch64_has_memtag_isa
+
+#undef TARGET_MEMTAG_TAG
+#define TARGET_MEMTAG_TAG aarch64_tag_memory
+
+#undef TARGET_MEMTAG_GENTAG
+#define TARGET_MEMTAG_GENTAG aarch64_gentag
+
+#undef TARGET_MEMTAG_ADDTAG
+#define TARGET_MEMTAG_ADDTAG aarch64_addtag
+
+#undef TARGET_MEMTAG_ADDTAG_FORCE_OPERAND
+#define TARGET_MEMTAG_ADDTAG_FORCE_OPERAND aarch64_addtag_force_operand
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index e4f9005c27f6f57efba31004389dbed9fd91a360..880d2b40d09b9b229e03aa9bf56ce5ae77a0d350 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -243,6 +243,8 @@  (define_c_enum "unspec" [
     UNSPEC_SPECULATION_TRACKER
     UNSPEC_COPYSIGN
     UNSPEC_TTEST		; Represent transaction test.
+    UNSPEC_GENTAG
+    UNSPEC_ADDTAG
 ])
 
 (define_c_enum "unspecv" [
@@ -445,6 +447,26 @@  (define_expand "cbranch<mode>4"
   "
 )
 
+(define_insn "random_tag"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")] UNSPEC_GENTAG))]
+  "AARCH64_ISA_MEMTAG"
+  "mov\\t%0, %1 // irg\\t%0, %1"
+)
+
+(define_insn "plain_offset_tagdi"
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
+    (unspec:DI
+	[(match_operand:DI 1 "register_operand" "r,r")
+	 (match_operand:DI 2 "aarch64_MTE_value_offset" "I,J")
+	 (match_operand:DI 3 "aarch64_MTE_tag_offset" "i,i")]
+      UNSPEC_ADDTAG))]
+  "AARCH64_ISA_MEMTAG"
+  "@
+  add\\t%0, %1, %2     // addg\\t%0, %1, %2, %3
+  sub\\t%0, %1, #%n2   // subg\\t%0, %1, #%n2, %3"
+)
+
 (define_expand "cbranchcc4"
   [(set (pc) (if_then_else
 	      (match_operator 0 "aarch64_comparison_operator"
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index d8c377994d6f11a58683b19d7ae9d594e5033561..ede9aa49ef14b8cc453098beac613cc3ed181718 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -144,6 +144,18 @@  (define_predicate "aarch64_pluslong_immediate"
   (and (match_code "const_int")
        (match_test "(INTVAL (op) < 0xffffff && INTVAL (op) > -0xffffff)")))
 
+(define_predicate "aarch64_MTE_add_temp"
+  (ior (match_code "const_int") (match_code "const_poly_int")))
+
+(define_predicate "aarch64_MTE_tag_offset"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 15)")))
+
+(define_predicate "aarch64_MTE_value_offset"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 1008) && !(INTVAL (op) & 0xf)")))
+
 (define_predicate "aarch64_pluslong_strict_immedate"
   (and (match_operand 0 "aarch64_pluslong_immediate")
        (not (match_operand 0 "aarch64_plus_immediate"))))
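The new `aarch64_MTE_value_offset` predicate encodes the immediate range of the eventual ADDG/SUBG instructions: a multiple of the 16-byte granule in [0, 1008] (a 6-bit immediate scaled by 16). The same check, written as ordinary C for illustration (the helper name is invented, not from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of aarch64_MTE_value_offset: the address offset must be a
   multiple of the 16-byte tag granule and fit in a 6-bit immediate
   scaled by 16, giving 0, 16, ..., 1008.  */
static bool
mte_value_offset_ok (int64_t off)
{
  return off >= 0 && off <= 1008 && (off & 0xf) == 0;
}
```

Offsets outside this range are forced into a register first, which is what `aarch64_addtag_force_operand` above does when the predicate fails.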
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 99949ba0b9317a89019ab5a6d9383e89f2d6ce3c..e5f83932b1d0c93b97e58b4e2cdc57f45617bfa3 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2976,6 +2976,25 @@  A target hook which lets a backend compute the set of pressure classes to  be us
 True if backend architecture naturally supports ignoring the top byte of pointers.  This feature means that -fsanitize=hwaddress can work.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_MEMTAG_HAS_MEMORY_TAGGING ()
+True if backend architecture naturally supports tagging addresses and checking those tags.  This feature means that -fsanitize=memtag can work.
+@end deftypefn
+
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_TAG_SIZE ()
+Return the size in bits of a tag for this platform.
+@end deftypefn
+
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_GRANULE_SIZE ()
+Return how many bytes in real memory each byte in shadow memory represents.
+I.e. one byte in shadow memory being colour 1 implies the associated
+targetm.memtag.granule_size () bytes in real memory must all be accessed by
+pointers tagged as 1.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_MEMTAG_COPY_TAG (rtx @var{to}, rtx @var{from})
+Emit insns to copy the tag in FROM to TO.
+@end deftypefn
+
 @deftypefn {Target Hook} rtx TARGET_MEMTAG_ADDTAG (rtx @var{base}, poly_int64 @var{addr_offset}, uint8_t @var{tag_offset})
 Emit an RTX representing BASE offset in value by ADDR_OFFSET and in tag by TAG_OFFSET.
 The resulting RTX must either be a valid memory address or be able to get
@@ -2996,6 +3015,15 @@  This hook is most often implemented by emitting instructions to put the
 expression into a pseudo register, then returning that pseudo register.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_MEMTAG_TAG (rtx @var{tagged_start}, poly_int64 @var{address_offset}, uint8_t @var{tag_offset}, rtx @var{size})
+This function should emit an RTX to colour memory.
+It's given arguments TAGGED_START, ADDRESS_OFFSET, TAG_OFFSET, SIZE, where
+TAGGED_START and SIZE are RTL expressions, ADDRESS_OFFSET is a poly_int64
+and TAG_OFFSET is a uint8_t.
+It should emit RTL to colour the "shadow memory" for the relevant range with
+the colour of the tag it was given.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_MEMTAG_GENTAG (rtx @var{base}, rtx @var{untagged})
 Set the BASE argument to UNTAGGED with some random tag.
 This function is used to generate a tagged base for the current stack frame.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index ab18039b09f9b0a93338fa716d5d044555371ddc..659a07d8b9fb4e2b2c5b7d6c9899be9c723c4c09 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2376,10 +2376,20 @@  in the reload pass.
 
 @hook TARGET_MEMTAG_CAN_TAG_ADDRESSES
 
+@hook TARGET_MEMTAG_HAS_MEMORY_TAGGING
+
+@hook TARGET_MEMTAG_TAG_SIZE
+
+@hook TARGET_MEMTAG_GRANULE_SIZE
+
+@hook TARGET_MEMTAG_COPY_TAG
+
 @hook TARGET_MEMTAG_ADDTAG
 
 @hook TARGET_MEMTAG_ADDTAG_FORCE_OPERAND
 
+@hook TARGET_MEMTAG_TAG
+
 @hook TARGET_MEMTAG_GENTAG
 
 @node Stack and Calling
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 57d8ff9a1a010409d966230140df1017bc3584a8..4ab2bf2f466a7ad509d20e8e4bcfb9df72dc1335 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -259,6 +259,7 @@  enum sanitize_code {
   SANITIZE_HWADDRESS = 1UL << 28,
   SANITIZE_USER_HWADDRESS = 1UL << 29,
   SANITIZE_KERNEL_HWADDRESS = 1UL << 30,
+  SANITIZE_MEMTAG = 1UL << 31,
   SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
 		       | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
diff --git a/gcc/gcc.c b/gcc/gcc.c
index cf1bd9de660f32f060b9277f89a562873a48684a..2e926c2c3da22ea17cc69b3c8d6cf18b07f93dbd 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -9463,6 +9463,8 @@  sanitize_spec_function (int argc, const char **argv)
     return (flag_sanitize & SANITIZE_KERNEL_HWADDRESS) ? "" : NULL;
   if (strcmp (argv[0], "thread") == 0)
     return (flag_sanitize & SANITIZE_THREAD) ? "" : NULL;
+  if (strcmp (argv[0], "memtag") == 0)
+    return (flag_sanitize & SANITIZE_MEMTAG) ? "" : NULL;
   if (strcmp (argv[0], "undefined") == 0)
     return ((flag_sanitize
 	     & (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT))
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index c692ae86ec6b5fbe345558d7f412f6ecd666bfa1..64d48813c3d16d9fd1888cf74597cf10d0dd3b83 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -502,9 +502,6 @@  expand_HWASAN_MARK (internal_fn, gcall *gc)
   gcc_checking_assert (TREE_CODE (base) == ADDR_EXPR);
   rtx base_rtx = expand_normal (base);
 
-  rtx tag = is_poison ? const0_rtx : hwasan_extract_tag (base_rtx);
-  rtx address = hwasan_create_untagged_base (base_rtx);
-
   tree len = gimple_call_arg (gc, 2);
   gcc_assert (tree_fits_shwi_p (len));
   unsigned HOST_WIDE_INT size_in_bytes = tree_to_shwi (len);
@@ -513,13 +510,25 @@  expand_HWASAN_MARK (internal_fn, gcall *gc)
   size_in_bytes = (size_in_bytes + tg_mask) & ~tg_mask;
   rtx size = gen_int_mode (size_in_bytes, Pmode);
 
-  rtx func = init_one_libfunc ("__hwasan_tag_memory");
-  emit_library_call (func,
-      LCT_NORMAL,
-      VOIDmode,
-      address, ptr_mode,
-      tag, QImode,
-      size, ptr_mode);
+  if (sanitize_flags_p (SANITIZE_HWADDRESS))
+    {
+      rtx func = init_one_libfunc ("__hwasan_tag_memory");
+      rtx address = hwasan_create_untagged_base (base_rtx);
+      rtx tag = is_poison ? const0_rtx : hwasan_extract_tag (base_rtx);
+      emit_library_call (func,
+			 LCT_NORMAL,
+			 VOIDmode,
+			 address, ptr_mode,
+			 tag, QImode,
+			 size, ptr_mode);
+    }
+  else
+    {
+      gcc_assert (sanitize_flags_p (SANITIZE_MEMTAG));
+      if (is_poison)
+	targetm.memtag.copy_tag (base_rtx, stack_pointer_rtx);
+      targetm.memtag.tag (base_rtx, 0, 0, size);
+    }
 }
 
 /* This should get expanded in the sanopt pass.  */
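The size rounding used when expanding `HWASAN_MARK` (shared by both the HWASAN library path and the memtag hook path above) rounds a variable's size up to a whole number of tag granules. A minimal sketch, with the helper name invented for illustration; for MTE the granule is 16 bytes per `default_memtag_granule_size`:

```c
#include <assert.h>
#include <stdint.h>

/* Round SIZE up to a multiple of GRANULE (a power of two), as the
   HWASAN_MARK expansion does with
   size_in_bytes = (size_in_bytes + tg_mask) & ~tg_mask.  */
static uint64_t
round_to_granule (uint64_t size, uint64_t granule)
{
  uint64_t tg_mask = granule - 1;
  return (size + tg_mask) & ~tg_mask;
}
```

This rounding is why neighbouring variables that share a granule must share a colour: the whole granule carries a single tag.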
diff --git a/gcc/opts.c b/gcc/opts.c
index 88a94286e71f61f2dce907018e5185f63a830804..659eeb0a62344c250314892a974d245d72b9a84e 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1200,6 +1200,27 @@  finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
 	      "%<-fsanitize=hwaddress%> is incompatible with "
 	      "%<-fsanitize=thread%>");
 
+  /* Memtag and ASan conflict with each other.  */
+  if ((opts->x_flag_sanitize & SANITIZE_ADDRESS)
+      && (opts->x_flag_sanitize & SANITIZE_MEMTAG))
+    error_at (loc,
+	      "%<-fsanitize=memtag%> is incompatible with both "
+	      "%<-fsanitize=address%> and %<-fsanitize=kernel-address%>");
+
+  /* Memtag and HWASan conflict with each other.  */
+  if ((opts->x_flag_sanitize & SANITIZE_HWADDRESS)
+      && (opts->x_flag_sanitize & SANITIZE_MEMTAG))
+    error_at (loc,
+	      "%<-fsanitize=memtag%> is incompatible with both "
+	      "%<-fsanitize=hwaddress%> and %<-fsanitize=kernel-hwaddress%>");
+
+  /* Memtag conflicts with TSan.  */
+  if ((opts->x_flag_sanitize & SANITIZE_MEMTAG)
+      && (opts->x_flag_sanitize & SANITIZE_THREAD))
+    error_at (loc,
+	      "%<-fsanitize=memtag%> is incompatible with "
+	      "%<-fsanitize=thread%>");
+
   /* Check error recovery for -fsanitize-recover option.  */
   for (int i = 0; sanitizer_opts[i].name != NULL; ++i)
     if ((opts->x_flag_sanitize_recover & sanitizer_opts[i].flag)
@@ -1220,7 +1241,8 @@  finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
   /* Enable -fsanitize-address-use-after-scope if address sanitizer is
      enabled.  */
   if (((opts->x_flag_sanitize & SANITIZE_USER_ADDRESS)
-       || (opts->x_flag_sanitize & SANITIZE_USER_HWADDRESS))
+       || (opts->x_flag_sanitize & SANITIZE_USER_HWADDRESS)
+       || (opts->x_flag_sanitize & SANITIZE_MEMTAG))
       && !opts_set->x_flag_sanitize_address_use_after_scope)
     opts->x_flag_sanitize_address_use_after_scope = true;
 
@@ -1849,6 +1871,7 @@  const struct sanitizer_opts_s sanitizer_opts[] =
 #define SANITIZER_OPT(name, flags, recover) \
     { #name, flags, sizeof #name - 1, recover }
   SANITIZER_OPT (address, (SANITIZE_ADDRESS | SANITIZE_USER_ADDRESS), true),
+  SANITIZER_OPT (memtag, (SANITIZE_MEMTAG), true),
   SANITIZER_OPT (hwaddress, (SANITIZE_HWADDRESS | SANITIZE_USER_HWADDRESS),
 		 true),
   SANITIZER_OPT (kernel-address, (SANITIZE_ADDRESS | SANITIZE_KERNEL_ADDRESS),
diff --git a/gcc/target.def b/gcc/target.def
index badae860335e4a570f189c9f8011da5ab8c15439..2a366b7b58ed459574ab9aa3f008ed0f05bf2666 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6716,6 +6716,30 @@  DEFHOOK
  bool, (), default_memtag_can_tag_addresses)
 
 DEFHOOK
+(has_memory_tagging,
+ "True if backend architecture naturally supports tagging addresses and\
+ checking those tags.  This feature means that -fsanitize=memtag can work.",
+ bool, (), default_memtag_has_memory_tagging)
+
+DEFHOOK
+(tag_size,
+ "Return the size in bits of a tag for this platform.",
+ uint8_t, (), default_memtag_tag_size)
+
+DEFHOOK
+(granule_size,
+ "Return how many bytes in real memory each byte in shadow memory represents.\n\
+I.e. one byte in shadow memory being colour 1 implies the associated\n\
+targetm.memtag.granule_size () bytes in real memory must all be accessed by\n\
+pointers tagged as 1.",
+uint8_t, (), default_memtag_granule_size)
+
+DEFHOOK
+(copy_tag,
+ "Emit insns to copy the tag in FROM to TO.",
+void, (rtx to, rtx from), default_memtag_copy_tag)
+
+DEFHOOK
 (addtag,
  "Emit an RTX representing BASE offset in value by ADDR_OFFSET and in tag by\
  TAG_OFFSET.\n\
@@ -6740,6 +6764,17 @@  expression into a pseudo register, then returning that pseudo register.",
 rtx, (rtx oper, rtx target), NULL)
 
 DEFHOOK
+(tag,
+ "This function should emit an RTX to colour memory.\n\
+It's given arguments TAGGED_START, ADDRESS_OFFSET, TAG_OFFSET, SIZE, where\n\
+TAGGED_START and SIZE are RTL expressions, ADDRESS_OFFSET is a poly_int64\n\
+and TAG_OFFSET is a uint8_t.\n\
+It should emit RTL to colour \"shadow memory\" for the relevant range with\n\
+the colour of the tag it was given.",
+  void, (rtx tagged_start, poly_int64 address_offset, uint8_t tag_offset, rtx size),
+NULL)
+
+DEFHOOK
 (gentag,
  "Set the BASE argument to UNTAGGED with some random tag.\n\
 This function is used to generate a tagged base for the current stack frame.",
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 4418db74f52f669c22702f5a4a093172f48a1b46..9c69589e33121638d349b882b4f26d29ac449d20 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -285,6 +285,10 @@  extern void default_remove_extra_call_preserved_regs (rtx_insn *,
 						      HARD_REG_SET *);
 
 extern bool default_memtag_can_tag_addresses ();
+extern bool default_memtag_has_memory_tagging ();
+extern uint8_t default_memtag_tag_size ();
+extern uint8_t default_memtag_granule_size ();
 extern void default_memtag_gentag (rtx, rtx);
+extern void default_memtag_copy_tag (rtx, rtx);
 extern rtx default_memtag_addtag (rtx, poly_int64, uint8_t);
 #endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 048c7f3cff5d87d0d40b93f6cf8cb41de670711d..b8a74a5f3750dad102311f1e4298a63416f1261b 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -2377,6 +2377,24 @@  default_memtag_can_tag_addresses ()
   return false;
 }
 
+bool
+default_memtag_has_memory_tagging ()
+{
+  return false;
+}
+
+uint8_t
+default_memtag_tag_size ()
+{
+  return 8;
+}
+
+uint8_t
+default_memtag_granule_size ()
+{
+  return 16;
+}
+
 void
 default_memtag_gentag (rtx base, rtx untagged)
 {
@@ -2403,6 +2421,70 @@  default_memtag_gentag (rtx base, rtx untagged)
     }
 }
 
+void
+default_memtag_copy_tag (rtx to, rtx from)
+{
+  /* TODO:
+      I want to have a sequence as minimal as possible here, since this code
+      sequence could be emitted many times.
+
+      My first attempt was the below,
+
+	  rtx temp = hwasan_extract_tag (from);
+	  store_bit_field (to, 8, 56, 0, 0,
+			  QImode, temp, false);
+
+     Unfortunately, despite using far fewer instructions, on AArch64 this
+     can cause a problem in LRA if the `to` RTX eventually resolves to the
+     stack pointer.
+     This happens because the instruction that gets emitted from
+     `store_bit_field` corresponds to a pattern that can't handle the stack
+     pointer and LRA can't figure out that it should use a temporary register
+     in the `bfi` instruction's place.
+
+     This doesn't cause a problem at the moment since there's currently no way
+     the stack pointer should be given to this function.  The hook is only used
+     when poisoning variables with HWASAN_MARK, and in that function the `to`
+     RTX should always be pointing to a tagged variable on the stack (since
+     the variable is tagged it cannot be the stack pointer, which is always
+     untagged).
+
+     Eventually we will be generating random tags as the "start" tag for each
+     frame.  When this happens we can no longer avoid the background colour at
+     compile time since we will not know what offset to avoid.
+     This will mean we no longer avoid a `tag_offset` of 0, and hence
+     `hwasan_with_tag` could emit simple PLUS statements.
+
+     When that happens, the last variable on the stack could very well have
+     a zero tag offset, and some other part of the compiler could optimise
+     that to simply use the stack pointer.
+
+     That would trigger an ICE due to LRA being unable to reload the
+     `insv_regdi` pattern.
+
+     The code sequence I'm emitting at the moment works just fine in all
+     circumstances, but it would be nice to find a smaller sequence.  */
+  rtx temp = gen_reg_rtx (Pmode);
+  uint64_t tag_mask = 0xFFULL << HWASAN_SHIFT;
+  emit_insn (gen_anddi3 (to, to, GEN_INT (~tag_mask)));
+  /* Can't use GEN_INT (tag_mask) since GEN_INT calls `gen_rtx_CONST_INT` which
+     takes a `HOST_WIDE_INT`.
+     HOST_WIDE_INT can't hold a uint64_t with the top bit set, hence in order
+     to avoid UB we have to emit instructions for the machine to use some
+     uint64_t arithmetic.
+
+     The extra instructions seem to eventually end up with the same output in
+     most cases (I've not yet seen a case where generating the mask in three
+     `emit_insn` calls instead of one changes the codegen).  */
+
+  /* emit_insn (gen_anddi3 (temp, from, GEN_INT (tag_mask)));  */
+  emit_move_insn (temp, GEN_INT (0xff));
+  emit_insn (gen_ashldi3 (temp, temp, HWASAN_SHIFT_RTX));
+  emit_insn (gen_anddi3 (temp, from, temp));
+
+  emit_insn (gen_iordi3 (to, to, temp));
+}
+
 rtx
 default_memtag_addtag (rtx base, poly_int64 offset, uint8_t tag_offset)
 {
diff --git a/gcc/testsuite/gcc.dg/hwasan/poly-int-stack-vars.c b/gcc/testsuite/gcc.dg/hwasan/poly-int-stack-vars.c
new file mode 100644
index 0000000000000000000000000000000000000000..cb0ca7d3a06c5a2de258ba20be974009410a7a44
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/hwasan/poly-int-stack-vars.c
@@ -0,0 +1,39 @@ 
+/* { dg-do compile } */
+
+/* TODO Need to give this test the correct arguments
+   (i.e. compile with memory tagging enabled).
+
+   This is needed because the introduction of HWASAN_CHECK statements stops the
+   compiler from being able to vectorise the loop with SVE, hence stopping the
+   introduction of `addvl` instructions.
+
+   In other words, this test doesn't really belong in this directory, but I
+   haven't yet created the directory for checking memtag code generation and
+   this is a dummy commit anyway.  */
+/* Non-constant sizes.
+   Code designed to store SVE registers on the stack.
+   This is needed to exercise the poly_int64 handling for HWASAN and MTE
+   instrumentation. */
+int u;
+
+void
+foo_sve (int *p)
+{
+  int i;
+  #pragma omp for simd lastprivate(u) schedule (static, 32)
+  for (i = 0; i < 1024; i++)
+    u = p[i];
+}
+
+void
+bar_sve (int *p)
+{
+  int i;
+  #pragma omp taskloop simd lastprivate(u)
+  for (i = 0; i < 1024; i++)
+    u = p[i];
+}
+
+/* Ensure we are storing SVE vectors on the stack -- otherwise we're not
+   exercising the code path for poly_int64s.  */
+/* { dg-final { scan-assembler "addvl" } } */
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 5e077892a6371a7b534007d5dffd421b218ea694..6dcd830c3a93a93ce27d9b80137af6d9d1288ff7 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1765,6 +1765,16 @@  process_options (void)
       flag_sanitize &= ~SANITIZE_HWADDRESS;
     }
 
+  /* Memtag requires hardware support.  */
+  if (flag_sanitize & SANITIZE_MEMTAG && !targetm.memtag.has_memory_tagging ())
+    {
+      warning_at (UNKNOWN_LOCATION, 0,
+		  "%<-fsanitize=memtag%> requires hardware support "
+		  "that is not advertised for this target");
+      flag_sanitize &= ~SANITIZE_MEMTAG;
+    }
+
+
  /* Do not use IPA optimizations for register allocation if profiler is active
     or patchable function entries are inserted for run-time instrumentation
     or port does not emit prologue and epilogue as RTL.  */
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index b4b5c903e13b1695daeff66a680c64fa7da0829d..73ff987544487e44c4c14419fdc2b27b8b6ddb25 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -1704,7 +1704,9 @@  execute_update_addresses_taken (void)
 		  gimple_ior_addresses_taken (addresses_taken, stmt);
 		  gimple_call_set_arg (stmt, 1, arg);
 		}
-	      else if (is_asan_mark_p (stmt)
+	      else if ((is_asan_mark_p (stmt)
+			&& (!sanitize_flags_p (SANITIZE_MEMTAG)
+			    || !asan_mark_p (stmt, ASAN_MARK_POISON)))
 		       || gimple_call_internal_p (stmt, IFN_GOMP_SIMT_ENTER))
 		;
 	      else