Update: [PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

Message ID AM6PR08MB3157AC63E3F83D45FD3977E7E0E00@AM6PR08MB3157.eurprd08.prod.outlook.com
State New

Commit Message

Matthew Malcomson Nov. 19, 2020, 12:57 p.m. UTC
Hi there,

After offline discussion with Richard I've modified the way in which the
initialisation for the hwasan base pointer is emitted.
Originally it was emitted during `expand_used_vars`, which required
`handle_builtin_alloca` to register a need for it so that
`expand_HWASAN_CHOOSE_TAG` could use an initialised base pointer.

Now we go through the entire expansion of the function body, and then if
`hwasan_frame_base_ptr` was used anywhere we emit the initialisation
just before `parm_birth_insn`.
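
As a rough illustration only (this is not code from the patch), the
"record the need during expansion, emit once afterwards" scheme can be
modelled with a toy instruction list.  Every `toy_*` name below is
hypothetical, and the array of strings merely stands in for the RTL insn
chain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_INSNS 16

/* Toy instruction stream: an ordered array of strings (hypothetical
   stand-in for the RTL insn chain).  */
static const char *toy_insns[TOY_MAX_INSNS];
static size_t toy_insn_count;
/* Index of the insn playing the role of parm_birth_insn.  */
static size_t toy_parm_birth;
/* Set as soon as anything asks for the frame base.  */
static int toy_frame_base_requested;

/* Append one instruction to the stream.  */
static void
toy_emit (const char *insn)
{
  assert (toy_insn_count < TOY_MAX_INSNS);
  toy_insns[toy_insn_count++] = insn;
}

/* May be called from anywhere during expansion.  It only records that the
   base is needed; nothing is emitted here.  */
static const char *
toy_frame_base (void)
{
  toy_frame_base_requested = 1;
  return "base_reg";
}

/* Called once, after the whole body has been expanded.  Inserts the
   initialisation just before the parm-birth position, and only if the
   base was requested.  */
static void
toy_maybe_emit_frame_base_init (void)
{
  if (!toy_frame_base_requested)
    return;
  assert (toy_insn_count < TOY_MAX_INSNS);
  memmove (&toy_insns[toy_parm_birth + 1], &toy_insns[toy_parm_birth],
	   (toy_insn_count - toy_parm_birth) * sizeof toy_insns[0]);
  /* toy_parm_birth now indexes the inserted initialisation.  */
  toy_insns[toy_parm_birth] = "init base_reg";
  toy_insn_count++;
}
```

In this model the code that requests the base never needs to know where
the initialisation will end up, which is the simplification the change
is after.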

This is the updated patch for the stack variable handling.
(Testing underway)

MM

---


Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by aligning the start of each tagged variable to the tag
granule size, and also aligning the end of each object so that any other
data stored on the stack starts in a different granule.

This patch ensures the above by forcing the stack pointer to be aligned
before and after allocating any stack objects. Since we are forcing
alignment we also use `align_local_variable` to ensure this new alignment
is advertised properly through SET_DECL_ALIGN.
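
To make the granule constraint concrete, here is a small standalone
sketch (not code from the patch).  It assumes a 16-byte granule purely
for illustration; in the patch the value comes from a backend hook via
HWASAN_TAG_GRANULE_SIZE.  The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed granule size for illustration only; the patch obtains the real
   value from a backend hook.  */
#define GRANULE_SIZE 16u

/* Round ADDR down to the start of its granule.  */
static uint64_t
granule_align_down (uint64_t addr)
{
  return addr & ~(uint64_t) (GRANULE_SIZE - 1);
}

/* Round ADDR up to the next granule boundary (no change if aligned).  */
static uint64_t
granule_align_up (uint64_t addr)
{
  return (addr + GRANULE_SIZE - 1) & ~(uint64_t) (GRANULE_SIZE - 1);
}

/* Return true if the half-open byte ranges [A_START, A_END) and
   [B_START, B_END) touch a common granule, i.e. would have to share
   one shadow tag.  */
static bool
ranges_share_granule (uint64_t a_start, uint64_t a_end,
		      uint64_t b_start, uint64_t b_end)
{
  assert (a_start < a_end && b_start < b_end);
  uint64_t a_lo = granule_align_down (a_start);
  uint64_t a_hi = granule_align_up (a_end);
  uint64_t b_lo = granule_align_down (b_start);
  uint64_t b_hi = granule_align_up (b_end);
  return a_lo < b_hi && b_lo < a_hi;
}
```

With these assumptions, a 20-byte object at offset 0 followed directly
by another object at offset 20 shares the granule covering bytes 16..31,
whereas padding the first object out to 32 bytes keeps the two apart.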

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
     random tag.
  3) References to stack variables are now formed with RTL describing an
     offset from this base in both tag and value.
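
The three steps above can be sketched in plain C as follows.  This is an
illustration rather than the patch's RTL: it assumes 64-bit pointers
with the tag held in the top byte (a shift of 56), and all function
names are hypothetical stand-ins for what the backend hooks emit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-bit pointers, 8-bit tag stored in the top byte.  */
#define TAG_SHIFT 56

/* Combine an untagged address with TAG.  */
static uint64_t
set_tag (uint64_t untagged, uint8_t tag)
{
  assert ((untagged >> TAG_SHIFT) == 0);
  return untagged | ((uint64_t) tag << TAG_SHIFT);
}

/* Return the tag stored in PTR.  */
static uint8_t
extract_tag (uint64_t ptr)
{
  return (uint8_t) (ptr >> TAG_SHIFT);
}

/* Return PTR with its tag bits cleared.  */
static uint64_t
untagged_pointer (uint64_t ptr)
{
  return ptr & ~((uint64_t) 0xff << TAG_SHIFT);
}

/* Form a pointer to a stack variable as an offset from FRAME_BASE in
   both address and tag.  The tag addition wraps modulo 256 because the
   tag is a single byte.  */
static uint64_t
stack_var_pointer (uint64_t frame_base, int64_t addr_offset,
		   uint8_t tag_offset)
{
  uint8_t tag = (uint8_t) (extract_tag (frame_base) + tag_offset);
  uint64_t addr = untagged_pointer (frame_base) + (uint64_t) addr_offset;
  return set_tag (addr, tag);
}
```

For example, with a frame tag of 0xfe and a tag offset of 3 the variable
ends up with tag 0x01, since the sum wraps around within the tag byte.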

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

Backend hooks define the size of a tag, the layout of the HWASAN shadow
memory, and handle emitting the code that inserts and extracts tags from a
pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable. This stack region is tagged to match the tag added to
each pointer to that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tags.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when they check untagged areas (such as parameters passed on the stack)
against a shadow stack area with a left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.
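
The prologue and epilogue behaviour can be modelled with a toy shadow
array.  This is only a sketch under stated assumptions: a 16-byte
granule, a 256-byte "stack", and a hypothetical `toy_tag_memory` helper
that loosely mirrors the (pointer, tag, size) shape of the library call;
the real shadow layout is defined by libhwasan and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE_SIZE 16u	/* Assumed granule size.  */
#define STACK_BACKGROUND 0u	/* Tag of untagged stack memory.  */
#define TOY_STACK_BYTES 256u	/* Size of the toy stack region.  */

/* One shadow byte per granule of the toy stack.  */
static uint8_t toy_shadow[TOY_STACK_BYTES / GRANULE_SIZE];

/* Set the shadow tag for SIZE bytes starting at stack OFFSET.  Both
   values must be granule aligned, which is what the alignment feature
   above guarantees.  */
static void
toy_tag_memory (size_t offset, uint8_t tag, size_t size)
{
  assert (offset % GRANULE_SIZE == 0 && size % GRANULE_SIZE == 0);
  assert (offset + size <= TOY_STACK_BYTES);
  memset (toy_shadow + offset / GRANULE_SIZE, tag, size / GRANULE_SIZE);
}

/* Return nonzero if an access at OFFSET through a pointer carrying
   POINTER_TAG matches the shadow tag.  */
static int
toy_access_ok (size_t offset, uint8_t pointer_tag)
{
  assert (offset < TOY_STACK_BYTES);
  return toy_shadow[offset / GRANULE_SIZE] == pointer_tag;
}
```

Tagging a 32-byte variable at offset 32 with tag 5 makes only pointers
carrying tag 5 valid for that range.  Resetting the whole region to the
background tag on exit restores the state a later frame expects when it
accesses the same memory through untagged pointers.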

config/ChangeLog:

	* bootstrap-hwasan.mk: Disable random frame tags for stack-tagging
	during bootstrap.

ChangeLog:

	* gcc/asan.c (struct hwasan_stack_var): New.
	(hwasan_sanitize_p): New.
	(hwasan_sanitize_stack_p): New.
	(hwasan_sanitize_allocas_p): New.
	(initialize_sanitizer_builtins): Define new builtins.
	(ATTR_NOTHROW_LIST): New macro.
	(hwasan_current_frame_tag): New.
	(hwasan_frame_base): New.
	(stack_vars_base_reg_p): New.
	(hwasan_maybe_emit_frame_base_init): New.
	(hwasan_record_stack_var): New.
	(hwasan_get_frame_extent): New.
	(hwasan_increment_frame_tag): New.
	(hwasan_record_frame_init): New.
	(hwasan_emit_prologue): New.
	(hwasan_emit_untag_frame): New.
	(hwasan_finish_file): New.
	(hwasan_truncate_to_tag_size): New.
	* gcc/asan.h (hwasan_record_frame_init): New declaration.
	(hwasan_record_stack_var): New declaration.
	(hwasan_emit_prologue): New declaration.
	(hwasan_emit_untag_frame): New declaration.
	(hwasan_get_frame_extent): New declaration.
	(hwasan_maybe_emit_frame_base_init): New declaration.
	(hwasan_frame_base): New declaration.
	(stack_vars_base_reg_p): New declaration.
	(hwasan_current_frame_tag): New declaration.
	(hwasan_increment_frame_tag): New declaration.
	(hwasan_truncate_to_tag_size): New declaration.
	(hwasan_finish_file): New declaration.
	(hwasan_sanitize_p): New declaration.
	(hwasan_sanitize_stack_p): New declaration.
	(hwasan_sanitize_allocas_p): New declaration.
	(HWASAN_TAG_SIZE): New macro.
	(HWASAN_TAG_GRANULE_SIZE): New macro.
	(HWASAN_STACK_BACKGROUND): New macro.
	* gcc/builtin-types.def (BT_FN_VOID_PTR_UINT8_PTRMODE): New.
	* gcc/builtins.def (DEF_SANITIZER_BUILTIN): Enable for HWASAN.
	* gcc/cfgexpand.c (align_local_variable): When using hwasan ensure
	alignment to tag granule.
	(align_frame_offset): New.
	(expand_one_stack_var_at): For hwasan use tag offset.
	(expand_stack_vars): Record stack objects for hwasan.
	(expand_one_stack_var_1): Record stack objects for hwasan.
	(init_vars_expansion): Initialise hwasan state.
	(expand_used_vars): Emit hwasan prologue and generate hwasan epilogue.
	(pass_expand::execute): Emit hwasan base initialization if needed.
	* gcc/doc/tm.texi (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
	TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
	TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
	TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
	* gcc/doc/tm.texi.in (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
	TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
	TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
	TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
	* gcc/explow.c (get_dynamic_stack_base): Take new `base` argument.
	* gcc/explow.h (get_dynamic_stack_base): Take new `base` argument.
	* gcc/sanitizer.def (BUILT_IN_HWASAN_INIT): New.
	(BUILT_IN_HWASAN_TAG_MEM): New.
	* gcc/target.def (target_memtag_tag_size,target_memtag_granule_size,
	target_memtag_insert_random_tag,target_memtag_add_tag,
	target_memtag_set_tag,target_memtag_extract_tag,
	target_memtag_untagged_pointer): New hooks.
	* gcc/targhooks.c (HWASAN_SHIFT): New.
	(HWASAN_SHIFT_RTX): New.
	(default_memtag_tag_size): New default hook.
	(default_memtag_granule_size): New default hook.
	(default_memtag_insert_random_tag): New default hook.
	(default_memtag_add_tag): New default hook.
	(default_memtag_set_tag): New default hook.
	(default_memtag_extract_tag): New default hook.
	(default_memtag_untagged_pointer): New default hook.
	* gcc/targhooks.h (default_memtag_tag_size): New default hook.
	(default_memtag_granule_size): New default hook.
	(default_memtag_insert_random_tag): New default hook.
	(default_memtag_add_tag): New default hook.
	(default_memtag_set_tag): New default hook.
	(default_memtag_extract_tag): New default hook.
	(default_memtag_untagged_pointer): New default hook.
	* gcc/toplev.c (compile_file): Call hwasan_finish_file when finished.


###############     Attachment also inlined for ease of reply    ###############

Comments

Richard Sandiford Nov. 19, 2020, 3:28 p.m. UTC | #1
Matthew Malcomson <matthew.malcomson@arm.com> writes:
> […]
> +/* hwasan_frame_base_init_seq is the sequence of RTL insns that will initialize
> +   the hwasan_frame_base_ptr.  When the hwasan_frame_base_ptr is requested, we
> +   generate this sequence but do not emit it.  If the sequence was created it
> +   is emitted once the function body has been expanded.
> +
> +   This delay is because the frame base pointer may be needed anywhere in the
> +   function body, or needed by the expand_used_vars function.  Emitting once in
> +   a known place is simpler than requiring the emition of the instructions to

s/emition/emission/

> +   be know where it should go depending on the first place the hwasan frame
> +   base is needed.  */
> +static GTY(()) rtx_insn *hwasan_frame_base_init_seq = NULL;
> […]
> +/* For stack tagging:
> +
> +   Return the 'base pointer' for this function.  If that base pointer has not
> +   yet been created then we create a register to hold it and record the insns
> +   to initialize the register in `hwasan_frame_base_init_seq` for later
> +   emission.  */
> +rtx
> +hwasan_frame_base ()
> +{
> +  if (! hwasan_frame_base_ptr)
> +    {
> +      start_sequence ();
> +      hwasan_frame_base_ptr =
> +	force_reg (Pmode,
> +		   targetm.memtag.insert_random_tag (virtual_stack_vars_rtx,
> +						     NULL_RTX));

Nit: should be formatted as:

      hwasan_frame_base_ptr
	= force_reg (Pmode,
		     targetm.memtag.insert_random_tag (virtual_stack_vars_rtx,
						       NULL_RTX));

> […]
> +  size_t length = hwasan_tagged_stack_vars.length ();
> +  hwasan_stack_var *vars = hwasan_tagged_stack_vars.address ();
> +
> +  poly_int64 bot = 0, top = 0;
> +  size_t i = 0;
> +  for (i = 0; i < length; i++)
> +    {
> +      hwasan_stack_var& cur = vars[i];

Simpler as:

  poly_int64 bot = 0, top = 0;
  for (hwasan_stack_var &cur : hwasan_tagged_stack_vars)

(GCC style is to add a space before “&”, as for “*”)

> +      poly_int64 nearest = cur.nearest_offset;
> +      poly_int64 farthest = cur.farthest_offset;
> +
> +      if (known_ge (nearest, farthest))
> +	{
> +	  top = nearest;
> +	  bot = farthest;
> +	}
> +      else
> +	{
> +	  /* Given how these values are calculated, one must be known greater
> +	     than the other.  */
> +	  gcc_assert (known_le (nearest, farthest));
> +	  top = farthest;
> +	  bot = nearest;
> +	}
> +      poly_int64 size = (top - bot);
> +
> +      /* Assert the edge of each variable is aligned to the HWASAN tag granule
> +	 size.  */
> +      gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
> +      gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
> +      gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
> +
> +      rtx ret = init_one_libfunc ("__hwasan_tag_memory");
> +      rtx base_tag = targetm.memtag.extract_tag (cur.tagged_base, NULL_RTX);
> +      rtx tag = plus_constant (QImode, base_tag, cur.tag_offset);
> +      tag = hwasan_truncate_to_tag_size (tag, NULL_RTX);
> +
> +      rtx bottom = convert_memory_address (ptr_mode,
> +					   plus_constant (Pmode,
> +							  cur.untagged_base,
> +							  bot));
> +      emit_library_call (ret, LCT_NORMAL, VOIDmode,
> +			 bottom, ptr_mode,
> +			 tag, QImode,
> +			 gen_int_mode (size, ptr_mode), ptr_mode);
> +    }
> +  /* Clear the stack vars, we've emitted the prologue for them all now.  */
> +  hwasan_tagged_stack_vars.truncate (0);
> +}
> +
> +/* For stack tagging:
> +
> +   Return RTL insns to clear the tags between DYNAMIC and VARS pointers
> +   into the stack.  These instructions should be emitted at the end of
> +   every function.
> +
> +   If `dynamic` is NULL_RTX then no insns are returned.  */
> +rtx_insn *
> +hwasan_emit_untag_frame (rtx dynamic, rtx vars)
> +{
> +  if (! dynamic)
> +    return NULL;
> +
> +  start_sequence ();
> +
> +  dynamic = convert_memory_address (ptr_mode, dynamic);
> +  vars = convert_memory_address (ptr_mode, vars);
> +
> +  rtx top_rtx;
> +  rtx bot_rtx;
> +  if (FRAME_GROWS_DOWNWARD)
> +    {
> +      top_rtx = vars;
> +      bot_rtx = dynamic;
> +    }
> +  else
> +    {
> +      top_rtx = dynamic;
> +      bot_rtx = vars;
> +    }
> +
> +  rtx size_rtx = expand_simple_binop (ptr_mode, MINUS, top_rtx, bot_rtx,
> +				      NULL_RTX, /* unsignedp = */0,
> +				      OPTAB_DIRECT);
> +
> +  rtx ret = init_one_libfunc ("__hwasan_tag_memory");
> +  emit_library_call (ret, LCT_NORMAL, VOIDmode,
> +		     bot_rtx, ptr_mode,
> +		     HWASAN_STACK_BACKGROUND, QImode,
> +		     size_rtx, ptr_mode);

Nit: “ret” seems like a strange name for this variable, since it implies
that it's a return value of the library call.  Maybe “fn” or something
would be better.

> […]
> @@ -1216,6 +1255,24 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>  	    {
>  	      offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
>  	      base_align = crtl->max_used_stack_slot_alignment;
> +
> +	      if (hwasan_sanitize_stack_p ())
> +		{
> +		  /* Align again since the point of this alignment is to handle
> +		     the "end" of the object (i.e. smallest address after the
> +		     stack object).  For FRAME_GROWS_DOWNWARD that requires
> +		     aligning the stack before allocating, but for a frame that
> +		     grows upwards that requires aligning the stack after
> +		     allocation.
> +
> +		     Use `frame_offset` to record the offset value rather than
> +		     offset since the frame_offset describes the extent

Would be easier to parse if the second “offset” was in quotes too.

> +		     allocated for this particular variable while `offset`
> +		     describes the address that this variable starts at.  */
> +		  align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +		  hwasan_record_stack_var (virtual_stack_vars_rtx, base,
> +					   hwasan_orig_offset, frame_offset);
> +		}
>  	    }
>  	}
>        else
> […]
> +/* The default implementation of TARGET_MEMTAG_SET_TAG.  */
> +rtx
> +default_memtag_set_tag (rtx untagged, rtx tag, rtx target)
> +{
> +  gcc_assert (GET_MODE (untagged) == Pmode);
> +  gcc_assert (GET_MODE (tag) == QImode);

Nit: I think the general preference is to have a single gcc_assert that
combines both conditions, for code size reasons.

> +  tag = expand_simple_binop (Pmode, ASHIFT, tag, HWASAN_SHIFT_RTX, tag,
> +			     /* unsignedp = */1, OPTAB_WIDEN);

This should pass NULL_RTX as the target instead of “tag”, since we can't
guarantee that “tag” is overwritable.  (In practice “tag” would probably
never get chosen as the target anyway since it has the wrong mode.)

OK with those changes, thanks.

Richard

> +  rtx ret = expand_simple_binop (Pmode, IOR, untagged, tag, target,
> +				 /* unsignedp = */1, OPTAB_DIRECT);
> +  gcc_assert (ret);
> +  return ret;
> +}
> […]

Matthew Malcomson Nov. 20, 2020, 6:46 p.m. UTC | #2
Hi there,

I was just doing some double-checks and noticed I'd placed the
documentation in the wrong section of tm.texi.  The `MEMTAG` hooks were
documented in the `Register Classes` section, so I've now moved them to
the `Misc` section.

That's the only change, Ok for trunk?

Matthew


------------------------------------------------------------



Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by aligning the start of each tagged variable to the tag
granule size, and also aligning the end of each object so that any other
data stored on the stack starts in a different granule.

This patch ensures the above by forcing the stack pointer to be aligned
before and after allocating any stack objects. Since we are forcing
alignment we also use `align_local_variable` to ensure this new alignment
is advertised properly through SET_DECL_ALIGN.

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
     random tag.
  3) References to stack variables are now formed with RTL describing an
     offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

Backend hooks define the size of a tag, the layout of the HWASAN shadow
memory, and handle emitting the code that inserts and extracts tags from a
pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable. This stack region is tagged to match the tag added to
each pointer to that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tags.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when they check untagged areas (such as parameters passed on the stack)
against a shadow stack area with a left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.

config/ChangeLog:

	* bootstrap-hwasan.mk: Disable random frame tags for stack-tagging
	during bootstrap.

ChangeLog:

	* gcc/asan.c (struct hwasan_stack_var): New.
	(hwasan_sanitize_p): New.
	(hwasan_sanitize_stack_p): New.
	(hwasan_sanitize_allocas_p): New.
	(initialize_sanitizer_builtins): Define new builtins.
	(ATTR_NOTHROW_LIST): New macro.
	(hwasan_current_frame_tag): New.
	(hwasan_frame_base): New.
	(stack_vars_base_reg_p): New.
	(hwasan_maybe_emit_frame_base_init): New.
	(hwasan_record_stack_var): New.
	(hwasan_get_frame_extent): New.
	(hwasan_increment_frame_tag): New.
	(hwasan_record_frame_init): New.
	(hwasan_emit_prologue): New.
	(hwasan_emit_untag_frame): New.
	(hwasan_finish_file): New.
	(hwasan_truncate_to_tag_size): New.
	* gcc/asan.h (hwasan_record_frame_init): New declaration.
	(hwasan_record_stack_var): New declaration.
	(hwasan_emit_prologue): New declaration.
	(hwasan_emit_untag_frame): New declaration.
	(hwasan_get_frame_extent): New declaration.
	(hwasan_maybe_emit_frame_base_init): New declaration.
	(hwasan_frame_base): New declaration.
	(stack_vars_base_reg_p): New declaration.
	(hwasan_current_frame_tag): New declaration.
	(hwasan_increment_frame_tag): New declaration.
	(hwasan_truncate_to_tag_size): New declaration.
	(hwasan_finish_file): New declaration.
	(hwasan_sanitize_p): New declaration.
	(hwasan_sanitize_stack_p): New declaration.
	(hwasan_sanitize_allocas_p): New declaration.
	(HWASAN_TAG_SIZE): New macro.
	(HWASAN_TAG_GRANULE_SIZE): New macro.
	(HWASAN_STACK_BACKGROUND): New macro.
	* gcc/builtin-types.def (BT_FN_VOID_PTR_UINT8_PTRMODE): New.
	* gcc/builtins.def (DEF_SANITIZER_BUILTIN): Enable for HWASAN.
	* gcc/cfgexpand.c (align_local_variable): When using hwasan ensure
	alignment to tag granule.
	(align_frame_offset): New.
	(expand_one_stack_var_at): For hwasan use tag offset.
	(expand_stack_vars): Record stack objects for hwasan.
	(expand_one_stack_var_1): Record stack objects for hwasan.
	(init_vars_expansion): Initialise hwasan state.
	(expand_used_vars): Emit hwasan prologue and generate hwasan epilogue.
	(pass_expand::execute): Emit hwasan base initialization if needed.
	* gcc/doc/tm.texi (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
	TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
	TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
	TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
	* gcc/doc/tm.texi.in (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
	TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
	TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
	TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
	* gcc/explow.c (get_dynamic_stack_base): Take new `base` argument.
	* gcc/explow.h (get_dynamic_stack_base): Take new `base` argument.
	* gcc/sanitizer.def (BUILT_IN_HWASAN_INIT): New.
	(BUILT_IN_HWASAN_TAG_MEM): New.
	* gcc/target.def (target_memtag_tag_size,target_memtag_granule_size,
	target_memtag_insert_random_tag,target_memtag_add_tag,
	target_memtag_set_tag,target_memtag_extract_tag,
	target_memtag_untagged_pointer): New hooks.
	* gcc/targhooks.c (HWASAN_SHIFT): New.
	(HWASAN_SHIFT_RTX): New.
	(default_memtag_tag_size): New default hook.
	(default_memtag_granule_size): New default hook.
	(default_memtag_insert_random_tag): New default hook.
	(default_memtag_add_tag): New default hook.
	(default_memtag_set_tag): New default hook.
	(default_memtag_extract_tag): New default hook.
	(default_memtag_untagged_pointer): New default hook.
	* gcc/targhooks.h (default_memtag_tag_size): New default hook.
	(default_memtag_granule_size): New default hook.
	(default_memtag_insert_random_tag): New default hook.
	(default_memtag_add_tag): New default hook.
	(default_memtag_set_tag): New default hook.
	(default_memtag_extract_tag): New default hook.
	(default_memtag_untagged_pointer): New default hook.
	* gcc/toplev.c (compile_file): Call hwasan_finish_file when finished.


###############     Attachment also inlined for ease of reply    ###############


diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
index 4f60bed3fd6e98b47a3a38aea6eba2a7c320da25..91989f4bb1db6ccff564383777757b896645e541 100644
--- a/config/bootstrap-hwasan.mk
+++ b/config/bootstrap-hwasan.mk
@@ -1,7 +1,11 @@
 # This option enables -fsanitize=hwaddress for stage2 and stage3.
+# We need to disable random frame tags for bootstrap since the autoconf check
+# for which direction the stack is growing has UB that a random frame tag
+# breaks.  Running with a random frame tag gives approx. 50% chance of
+# bootstrap comparison diff in libiberty/alloca.c.
 
-STAGE2_CFLAGS += -fsanitize=hwaddress
-STAGE3_CFLAGS += -fsanitize=hwaddress
+STAGE2_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
+STAGE3_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
 POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
 		      -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
 		      -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
diff --git a/gcc/asan.h b/gcc/asan.h
index 114b457ef91c4479d43774bed58c24213196ce12..8d5271e6b575d74da277420798557f3274e966ce 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -34,6 +34,22 @@ extern bool asan_expand_mark_ifn (gimple_stmt_iterator *);
 extern bool asan_expand_poison_ifn (gimple_stmt_iterator *, bool *,
 				    hash_map<tree, tree> &);
 
+extern void hwasan_record_frame_init ();
+extern void hwasan_record_stack_var (rtx, rtx, poly_int64, poly_int64);
+extern void hwasan_emit_prologue ();
+extern rtx_insn *hwasan_emit_untag_frame (rtx, rtx);
+extern rtx hwasan_get_frame_extent ();
+extern rtx hwasan_frame_base ();
+extern void hwasan_maybe_emit_frame_base_init (void);
+extern bool stack_vars_base_reg_p (rtx);
+extern uint8_t hwasan_current_frame_tag ();
+extern void hwasan_increment_frame_tag ();
+extern rtx hwasan_truncate_to_tag_size (rtx, rtx);
+extern void hwasan_finish_file (void);
+extern bool hwasan_sanitize_p (void);
+extern bool hwasan_sanitize_stack_p (void);
+extern bool hwasan_sanitize_allocas_p (void);
+
 extern gimple_stmt_iterator create_cond_insert_point
      (gimple_stmt_iterator *, bool, bool, bool, basic_block *, basic_block *);
 
@@ -75,6 +91,26 @@ extern hash_set <tree> *asan_used_labels;
 
 #define ASAN_USE_AFTER_SCOPE_ATTRIBUTE	"use after scope memory"
 
+/* NOTE: The values below and the hooks under targetm.memtag define an ABI and
+   are hard-coded to these values in libhwasan, hence they can't be changed
+   independently here.  */
+/* How many bits are used to store a tag in a pointer.
+   The default version uses the entire top byte of a pointer (i.e. 8 bits).  */
+#define HWASAN_TAG_SIZE targetm.memtag.tag_size ()
+/* Tag Granule of HWASAN shadow stack.
+   This is the size in real memory that each byte in the shadow memory refers
+   to.  I.e. if a variable is X bytes long in memory then its tag in shadow
+   memory will span X / HWASAN_TAG_GRANULE_SIZE bytes.
+   Most variables will need to be aligned to this amount since two variables
+   that are neighbors in memory and share a tag granule would need to share the
+   same tag (the shared tag granule can only store one tag).  */
+#define HWASAN_TAG_GRANULE_SIZE targetm.memtag.granule_size ()
+/* Define the tag for the stack background.
+   This defines what tag the stack pointer will be and hence what tag all
+   variables that are not given special tags are (e.g. spilled registers,
+   and parameters passed on the stack).  */
+#define HWASAN_STACK_BACKGROUND gen_int_mode (0, QImode)
+
 /* Various flags for Asan builtins.  */
 enum asan_check_flags
 {
diff --git a/gcc/asan.c b/gcc/asan.c
index 0b471afff64ea6a0ffbe0add71333ac688c472c6..d1ede3b62291eba698948e06208c482b6f197be5 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -257,6 +257,58 @@ hash_set<tree> *asan_handled_variables = NULL;
 
 hash_set <tree> *asan_used_labels = NULL;
 
+/* Global variables for HWASAN stack tagging.  */
+/* hwasan_frame_tag_offset records the offset from the frame base tag that the
+   next object should have.  */
+static uint8_t hwasan_frame_tag_offset = 0;
+/* hwasan_frame_base_ptr is a pointer with the same address as
+   `virtual_stack_vars_rtx` for the current frame, and with the frame base tag
+   stored in it.  N.b. this global RTX does not need to be marked GTY, but is
+   done so anyway.  The need is not there since all uses are in just one pass
+   (cfgexpand) and there are no calls to ggc_collect between the uses.  We mark
+   it GTY(()) anyway to allow the use of the variable later on if needed by
+   future features.  */
+static GTY(()) rtx hwasan_frame_base_ptr = NULL_RTX;
+/* hwasan_frame_base_init_seq is the sequence of RTL insns that will initialize
+   the hwasan_frame_base_ptr.  When the hwasan_frame_base_ptr is requested, we
+   generate this sequence but do not emit it.  If the sequence was created it
+   is emitted once the function body has been expanded.
+
+   This delay is because the frame base pointer may be needed anywhere in the
+   function body, or needed by the expand_used_vars function.  Emitting once in
+   a known place is simpler than requiring the code that emits the
+   instructions to know where they should go, which would depend on the first
+   place the hwasan frame base is needed.  */
+static GTY(()) rtx_insn *hwasan_frame_base_init_seq = NULL;
+
+/* Structure defining the extent of one object on the stack that HWASAN needs
+   to tag in the corresponding shadow stack space.
+
+   The range this object spans on the stack is between `untagged_base +
+   nearest_offset` and `untagged_base + farthest_offset`.
+   `tagged_base` is an rtx containing the same value as `untagged_base` but
+   with a random tag stored in the top byte.  We record both `untagged_base`
+   and `tagged_base` so that `hwasan_emit_prologue` can use both without having
+   to emit RTL into the instruction stream to re-calculate one from the other.
+   (`hwasan_emit_prologue` needs to use both bases since the
+   __hwasan_tag_memory call it emits uses an untagged value, and it calculates
+   the tag to store in shadow memory based on the tag_offset plus the tag in
+   tagged_base).  */
+struct hwasan_stack_var
+{
+  rtx untagged_base;
+  rtx tagged_base;
+  poly_int64 nearest_offset;
+  poly_int64 farthest_offset;
+  uint8_t tag_offset;
+};
+
+/* Variable recording all stack variables that HWASAN needs to tag.
+   Does not need to be marked as GTY(()) since every use is in the cfgexpand
+   pass and ggc_collect is not called in the middle of that pass.  */
+static vec<hwasan_stack_var> hwasan_tagged_stack_vars;
+
+
 /* Sets shadow offset to value in string VAL.  */
 
 bool
@@ -1359,6 +1411,28 @@ asan_redzone_buffer::flush_if_full (void)
     flush_redzone_payload ();
 }
 
+/* Returns whether we are tagging pointers and checking those tags on memory
+   access.  */
+bool
+hwasan_sanitize_p ()
+{
+  return sanitize_flags_p (SANITIZE_HWADDRESS);
+}
+
+/* Are we tagging the stack?  */
+bool
+hwasan_sanitize_stack_p ()
+{
+  return (hwasan_sanitize_p () && param_hwasan_instrument_stack);
+}
+
+/* Are we tagging alloca objects?  */
+bool
+hwasan_sanitize_allocas_p (void)
+{
+  return (hwasan_sanitize_stack_p () && param_hwasan_instrument_allocas);
+}
+
 /* Insert code to protect stack vars.  The prologue sequence should be emitted
    directly, epilogue sequence returned.  BASE is the register holding the
    stack base, against which OFFSETS array offsets are relative to, OFFSETS
@@ -2908,6 +2982,11 @@ initialize_sanitizer_builtins (void)
     = build_function_type_list (void_type_node, uint64_type_node,
 				ptr_type_node, NULL_TREE);
 
+  tree BT_FN_VOID_PTR_UINT8_PTRMODE
+    = build_function_type_list (void_type_node, ptr_type_node,
+				unsigned_char_type_node,
+				pointer_sized_int_node, NULL_TREE);
+
   tree BT_FN_BOOL_VPTR_PTR_IX_INT_INT[5];
   tree BT_FN_IX_CONST_VPTR_INT[5];
   tree BT_FN_IX_VPTR_IX_INT[5];
@@ -2958,6 +3037,8 @@ initialize_sanitizer_builtins (void)
 #define BT_FN_I16_CONST_VPTR_INT BT_FN_IX_CONST_VPTR_INT[4]
 #define BT_FN_I16_VPTR_I16_INT BT_FN_IX_VPTR_IX_INT[4]
 #define BT_FN_VOID_VPTR_I16_INT BT_FN_VOID_VPTR_IX_INT[4]
+#undef ATTR_NOTHROW_LIST
+#define ATTR_NOTHROW_LIST ECF_NOTHROW
 #undef ATTR_NOTHROW_LEAF_LIST
 #define ATTR_NOTHROW_LEAF_LIST ECF_NOTHROW | ECF_LEAF
 #undef ATTR_TMPURE_NOTHROW_LEAF_LIST
@@ -3709,4 +3790,347 @@ make_pass_asan_O0 (gcc::context *ctxt)
   return new pass_asan_O0 (ctxt);
 }
 
+/* For stack tagging:
+
+   Return the offset from the frame base tag that the "next" expanded object
+   should have.  */
+uint8_t
+hwasan_current_frame_tag ()
+{
+  return hwasan_frame_tag_offset;
+}
+
+/* For stack tagging:
+
+   Return the 'base pointer' for this function.  If that base pointer has not
+   yet been created then we create a register to hold it and record the insns
+   to initialize the register in `hwasan_frame_base_init_seq` for later
+   emission.  */
+rtx
+hwasan_frame_base ()
+{
+  if (! hwasan_frame_base_ptr)
+    {
+      start_sequence ();
+      hwasan_frame_base_ptr
+	= force_reg (Pmode,
+		     targetm.memtag.insert_random_tag (virtual_stack_vars_rtx,
+						       NULL_RTX));
+      hwasan_frame_base_init_seq = get_insns ();
+      end_sequence ();
+    }
+
+  return hwasan_frame_base_ptr;
+}
+
+/* For stack tagging:
+
+   Check whether this RTX is a standard pointer addressing the base of the
+   stack variables for this frame.  Returns true if the RTX is either
+   virtual_stack_vars_rtx or hwasan_frame_base_ptr.  */
+bool
+stack_vars_base_reg_p (rtx base)
+{
+  return base == virtual_stack_vars_rtx || base == hwasan_frame_base_ptr;
+}
+
+/* For stack tagging:
+
+   Emit frame base initialisation.
+   If hwasan_frame_base has been used before here then
+   hwasan_frame_base_init_seq contains the sequence of instructions to
+   initialize it.  This must be put just before the hwasan prologue, so we emit
+   the insns before parm_birth_insn (which will point to the first instruction
+   of the hwasan prologue if it exists).
+
+   We update `parm_birth_insn` to point to the start of this initialisation
+   since that represents the end of the initialisation done by
+   expand_function_{start,end} functions and we want to maintain that.  */
+void
+hwasan_maybe_emit_frame_base_init ()
+{
+  if (! hwasan_frame_base_init_seq)
+    return;
+  emit_insn_before (hwasan_frame_base_init_seq, parm_birth_insn);
+  parm_birth_insn = hwasan_frame_base_init_seq;
+}
+
+/* Record a compile-time constant size stack variable that HWASAN will need to
+   tag.  This record of the range of a stack variable will be used by
+   `hwasan_emit_prologue` to emit the RTL at the start of each frame which will
+   set tags in the shadow memory according to the assigned tag for each object.
+
+   The range that the object spans in stack space should be described by the
+   bounds `untagged_base + nearest_offset` and
+   `untagged_base + farthest_offset`.
+   `tagged_base` is the base address which contains the "base frame tag" for
+   this frame, and from which the value to address this object with will be
+   calculated.
+
+   We record the `untagged_base` since the functions in the hwasan library we
+   use to tag memory take pointers without a tag.  */
+void
+hwasan_record_stack_var (rtx untagged_base, rtx tagged_base,
+			 poly_int64 nearest_offset, poly_int64 farthest_offset)
+{
+  hwasan_stack_var cur_var;
+  cur_var.untagged_base = untagged_base;
+  cur_var.tagged_base = tagged_base;
+  cur_var.nearest_offset = nearest_offset;
+  cur_var.farthest_offset = farthest_offset;
+  cur_var.tag_offset = hwasan_current_frame_tag ();
+
+  hwasan_tagged_stack_vars.safe_push (cur_var);
+}
+
+/* Return the RTX representing the farthest extent of the statically allocated
+   stack objects for this frame.  If hwasan_frame_base_ptr has not been
+   initialized then we are not storing any static variables on the stack in
+   this frame.  In this case we return NULL_RTX to represent that.
+
+   Otherwise simply return virtual_stack_vars_rtx + frame_offset.  */
+rtx
+hwasan_get_frame_extent ()
+{
+  return (hwasan_frame_base_ptr
+	  ? plus_constant (Pmode, virtual_stack_vars_rtx, frame_offset)
+	  : NULL_RTX);
+}
+
+/* For stack tagging:
+
+   Increment the frame tag offset modulo the number of tag values.  */
+void
+hwasan_increment_frame_tag ()
+{
+  uint8_t tag_bits = HWASAN_TAG_SIZE;
+  gcc_assert (HWASAN_TAG_SIZE
+	      <= sizeof (hwasan_frame_tag_offset) * CHAR_BIT);
+  hwasan_frame_tag_offset = (hwasan_frame_tag_offset + 1) % (1 << tag_bits);
+  /* The "background tag" of the stack is zero by definition.
+     This is the tag that objects like parameters passed on the stack and
+     spilled registers are given.  It is handy to avoid this tag for objects
+     whose tags we decide ourselves, partly to ensure that buffer overruns
+     can't affect these important variables (e.g. saved link register, saved
+     stack pointer etc) and partly to make debugging easier (everything with a
+     tag of zero is space allocated automatically by the compiler).
+
+     This is not feasible when using random frame tags (the default
+     configuration for hwasan) since the tag for the given frame is randomly
+     chosen at runtime.  In order to avoid any tags matching the stack
+     background we would need to decide tag offsets at runtime instead of
+     compile time (and pay the resulting performance cost).
+
+     When not using random base tags for each frame (i.e. when compiled with
+     `--param hwasan-random-frame-tag=0`) the base tag for each frame is zero.
+     This means the tag that each object gets is equal to the
+     hwasan_frame_tag_offset used in determining it.
+     When this is the case we *can* ensure no object gets the tag of zero by
+     simply ensuring no object has the hwasan_frame_tag_offset of zero.
+
+     There is the extra complication that we only record the
+     hwasan_frame_tag_offset here (which is the offset from the tag stored in
+     the stack pointer).  In the kernel, the tag in the stack pointer is 0xff
+     rather than zero.  This does not cause problems since tags of 0xff are
+     never checked in the kernel.  As mentioned at the beginning of this
+     comment the background tag of the stack is zero by definition, which means
+     that for the kernel we should skip offsets of both 0 and 1 from the stack
+     pointer.  Avoiding the offset of 0 ensures we use a tag which will be
+     checked, avoiding the offset of 1 ensures we use a tag that is not the
+     same as the background.  */
+  if (hwasan_frame_tag_offset == 0 && ! param_hwasan_random_frame_tag)
+    hwasan_frame_tag_offset += 1;
+  if (hwasan_frame_tag_offset == 1 && ! param_hwasan_random_frame_tag
+      && sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS))
+    hwasan_frame_tag_offset += 1;
+}
+
+/* Clear internal state for the next function.
+   This function is called before variables on the stack get expanded, in
+   `init_vars_expansion`.  */
+void
+hwasan_record_frame_init ()
+{
+  delete asan_used_labels;
+  asan_used_labels = NULL;
+
+  /* If this isn't the case then some stack variable was recorded *before*
+     hwasan_record_frame_init is called, yet *after* the hwasan prologue for
+     the previous frame was emitted.  Such stack variables would not have
+     their shadow stack filled in.  */
+  gcc_assert (hwasan_tagged_stack_vars.is_empty ());
+  hwasan_frame_base_ptr = NULL_RTX;
+  hwasan_frame_base_init_seq = NULL;
+
+  /* When not using a random frame tag we can avoid the background stack
+     color which gives the user a little better debug output upon a crash.
+     Meanwhile, when using a random frame tag it will be nice to avoid adding
+     tags for the first object since that is unnecessary extra work.
+     Hence set the initial hwasan_frame_tag_offset to be 0 if using a random
+     frame tag and 1 otherwise.
+
+     As described in hwasan_increment_frame_tag, in the kernel the stack
+     pointer has the tag 0xff.  That means that to avoid 0xff and 0 (the tag
+     which the kernel does not check and the background tag respectively) we
+     start with a tag offset of 2.  */
+  hwasan_frame_tag_offset = param_hwasan_random_frame_tag
+    ? 0
+    : sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS) ? 2 : 1;
+}
+
+/* For stack tagging:
+   (Emits HWASAN equivalent of what is emitted by
+   `asan_emit_stack_protection`).
+
+   Emits the extra prologue code to set the shadow stack as required for HWASAN
+   stack instrumentation.
+
+   Uses the vector of recorded stack variables hwasan_tagged_stack_vars.  When
+   this function has completed hwasan_tagged_stack_vars is empty and all
+   objects it had pointed to are deallocated.  */
+void
+hwasan_emit_prologue ()
+{
+  /* We need untagged base pointers since libhwasan only accepts untagged
+     pointers in __hwasan_tag_memory.  We need the tagged base pointer to
+     obtain the base tag for an offset.  */
+
+  if (hwasan_tagged_stack_vars.is_empty ())
+    return;
+
+  poly_int64 bot = 0, top = 0;
+  for (hwasan_stack_var &cur : hwasan_tagged_stack_vars)
+    {
+      poly_int64 nearest = cur.nearest_offset;
+      poly_int64 farthest = cur.farthest_offset;
+
+      if (known_ge (nearest, farthest))
+	{
+	  top = nearest;
+	  bot = farthest;
+	}
+      else
+	{
+	  /* Given how these values are calculated, one must be known greater
+	     than the other.  */
+	  gcc_assert (known_le (nearest, farthest));
+	  top = farthest;
+	  bot = nearest;
+	}
+      poly_int64 size = (top - bot);
+
+      /* Assert both edges of each variable are aligned to the HWASAN tag
+	 granule size.  */
+      gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
+
+      rtx fn = init_one_libfunc ("__hwasan_tag_memory");
+      rtx base_tag = targetm.memtag.extract_tag (cur.tagged_base, NULL_RTX);
+      rtx tag = plus_constant (QImode, base_tag, cur.tag_offset);
+      tag = hwasan_truncate_to_tag_size (tag, NULL_RTX);
+
+      rtx bottom = convert_memory_address (ptr_mode,
+					   plus_constant (Pmode,
+							  cur.untagged_base,
+							  bot));
+      emit_library_call (fn, LCT_NORMAL, VOIDmode,
+			 bottom, ptr_mode,
+			 tag, QImode,
+			 gen_int_mode (size, ptr_mode), ptr_mode);
+    }
+  /* Clear the stack vars, we've emitted the prologue for them all now.  */
+  hwasan_tagged_stack_vars.truncate (0);
+}
+
+/* For stack tagging:
+
+   Return RTL insns to clear the tags between DYNAMIC and VARS pointers
+   into the stack.  These instructions should be emitted at the end of
+   every function.
+
+   If `dynamic` is NULL_RTX then no insns are returned.  */
+rtx_insn *
+hwasan_emit_untag_frame (rtx dynamic, rtx vars)
+{
+  if (! dynamic)
+    return NULL;
+
+  start_sequence ();
+
+  dynamic = convert_memory_address (ptr_mode, dynamic);
+  vars = convert_memory_address (ptr_mode, vars);
+
+  rtx top_rtx;
+  rtx bot_rtx;
+  if (FRAME_GROWS_DOWNWARD)
+    {
+      top_rtx = vars;
+      bot_rtx = dynamic;
+    }
+  else
+    {
+      top_rtx = dynamic;
+      bot_rtx = vars;
+    }
+
+  rtx size_rtx = expand_simple_binop (ptr_mode, MINUS, top_rtx, bot_rtx,
+				      NULL_RTX, /* unsignedp = */0,
+				      OPTAB_DIRECT);
+
+  rtx fn = init_one_libfunc ("__hwasan_tag_memory");
+  emit_library_call (fn, LCT_NORMAL, VOIDmode,
+		     bot_rtx, ptr_mode,
+		     HWASAN_STACK_BACKGROUND, QImode,
+		     size_rtx, ptr_mode);
+
+  do_pending_stack_adjust ();
+  rtx_insn *insns = get_insns ();
+  end_sequence ();
+  return insns;
+}
+
+/* Needs to be GTY(()), because cgraph_build_static_cdtor may
+   invoke ggc_collect.  */
+static GTY(()) tree hwasan_ctor_statements;
+
+/* Insert module initialization into this TU.  This initialization calls the
+   initialization code for libhwasan.  */
+void
+hwasan_finish_file (void)
+{
+  /* Do not emit constructor initialization for the kernel.
+     (the kernel has its own initialization already).  */
+  if (flag_sanitize & SANITIZE_KERNEL_HWADDRESS)
+    return;
+
+  /* Avoid instrumenting code in the hwasan constructors/destructors.  */
+  flag_sanitize &= ~SANITIZE_HWADDRESS;
+  int priority = MAX_RESERVED_INIT_PRIORITY - 1;
+  tree fn = builtin_decl_implicit (BUILT_IN_HWASAN_INIT);
+  append_to_statement_list (build_call_expr (fn, 0), &hwasan_ctor_statements);
+  cgraph_build_static_cdtor ('I', hwasan_ctor_statements, priority);
+  flag_sanitize |= SANITIZE_HWADDRESS;
+}
+
+/* For stack tagging:
+
+   Truncate `tag` to the number of bits that a tag uses (i.e. to
+   HWASAN_TAG_SIZE).  Store the result in `target` if it's convenient.  */
+rtx
+hwasan_truncate_to_tag_size (rtx tag, rtx target)
+{
+  gcc_assert (GET_MODE (tag) == QImode);
+  if (HWASAN_TAG_SIZE != GET_MODE_PRECISION (QImode))
+    {
+      gcc_assert (GET_MODE_PRECISION (QImode) > HWASAN_TAG_SIZE);
+      rtx mask = gen_int_mode ((HOST_WIDE_INT_1U << HWASAN_TAG_SIZE) - 1,
+			       QImode);
+      tag = expand_simple_binop (QImode, AND, tag, mask, target,
+				 /* unsignedp = */1, OPTAB_WIDEN);
+      gcc_assert (tag);
+    }
+  return tag;
+}
+
 #include "gt-asan.h"
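The tag-offset sequence produced by `hwasan_record_frame_init` and
`hwasan_increment_frame_tag` in the asan.c hunk above can be checked in
isolation.  The following is only a sketch in plain C and is not part of the
patch: the names `initial_tag_offset` and `next_tag_offset` are hypothetical,
`TAG_BITS` assumes the documented default tag size of 8, and the two flags
stand in for `param_hwasan_random_frame_tag` and
`sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS)`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define TAG_BITS 8u /* Assumption: the default TARGET_MEMTAG_TAG_SIZE.  */

/* Hypothetical model of the starting offset chosen per frame:
   0 with random frame tags, otherwise 1 (userspace) or 2 (kernel).  */
static uint8_t
initial_tag_offset (bool random_frame_tag, bool kernel)
{
  if (random_frame_tag)
    return 0;
  return kernel ? 2 : 1;
}

/* Hypothetical model of one increment step.  The offset wraps modulo
   2^TAG_BITS.  Without random frame tags it then skips 0 (the stack
   background tag) and, for the kernel, also skips 1.  */
static uint8_t
next_tag_offset (uint8_t cur, bool random_frame_tag, bool kernel)
{
  assert (TAG_BITS <= sizeof (uint8_t) * CHAR_BIT);
  uint8_t next = (uint8_t) ((cur + 1u) % (1u << TAG_BITS));
  if (next == 0 && !random_frame_tag)
    next = (uint8_t) (next + 1u);
  if (next == 1 && !random_frame_tag && kernel)
    next = (uint8_t) (next + 1u);
  return next;
}
```

Under these assumptions, stepping from 255 wraps to 0 with random frame tags,
to 1 for userspace without them, and to 2 for the kernel without them.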
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 4a82ee421bef42154ccd88e52f7a19f48b340c73..1ad6657da45cc4976532e1b8bc233f67d8da9ccf 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -639,6 +639,8 @@ DEF_FUNCTION_TYPE_3 (BT_FN_PTR_PTR_CONST_SIZE_BOOL,
 		     BT_PTR, BT_PTR, BT_CONST_SIZE, BT_BOOL)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_PTRMODE,
 		     BT_PTR, BT_SIZE, BT_SIZE, BT_PTRMODE)
+DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_UINT8_PTRMODE, BT_VOID, BT_PTR, BT_UINT8,
+		     BT_PTRMODE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 		     BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index b4494c712a1751fbb37378f38cc1411d11a37331..97bb5d0b0aee7fa9ee4c82e2d80eae866fc23829 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -245,6 +245,7 @@ along with GCC; see the file COPYING3.  If not see
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,    \
 	       true, true, true, ATTRS, true, \
 	      (flag_sanitize & (SANITIZE_ADDRESS | SANITIZE_THREAD \
+				| SANITIZE_HWADDRESS \
 				| SANITIZE_UNDEFINED \
 				| SANITIZE_UNDEFINED_NONDEFAULT) \
 	       || flag_sanitize_coverage))
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 1df6f4bc55a39230c98e58af6c2d765652db8324..231c2ee32362fc3967b1cd7b70bd330ce49648d3 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -376,15 +376,18 @@ align_local_variable (tree decl, bool really_expand)
 	align = GET_MODE_ALIGNMENT (mode);
     }
   else
-    {
-      align = LOCAL_DECL_ALIGNMENT (decl);
-      /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
-	 That is done before IPA and could bump alignment based on host
-	 backend even for offloaded code which wants different
-	 LOCAL_DECL_ALIGNMENT.  */
-      if (really_expand)
-	SET_DECL_ALIGN (decl, align);
-    }
+    align = LOCAL_DECL_ALIGNMENT (decl);
+
+  if (hwasan_sanitize_stack_p ())
+    align = MAX (align, (unsigned) HWASAN_TAG_GRANULE_SIZE * BITS_PER_UNIT);
+
+  if (TREE_CODE (decl) != SSA_NAME && really_expand)
+    /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
+       That is done before IPA and could bump alignment based on host
+       backend even for offloaded code which wants different
+       LOCAL_DECL_ALIGNMENT.  */
+    SET_DECL_ALIGN (decl, align);
+
   return align / BITS_PER_UNIT;
 }
 
@@ -428,6 +431,14 @@ alloc_stack_frame_space (poly_int64 size, unsigned HOST_WIDE_INT align)
   return offset;
 }
 
+/* Ensure that the stack is aligned to ALIGN bytes.
+   Return the new frame offset.  */
+static poly_int64
+align_frame_offset (unsigned HOST_WIDE_INT align)
+{
+  return alloc_stack_frame_space (0, align);
+}
+
 /* Accumulate DECL into STACK_VARS.  */
 
 static void
@@ -1004,7 +1015,12 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (known_eq (offset, trunc_int_for_mode (offset, Pmode)));
 
-  x = plus_constant (Pmode, base, offset);
+  if (hwasan_sanitize_stack_p ())
+    x = targetm.memtag.add_tag (base, offset,
+				hwasan_current_frame_tag ());
+  else
+    x = plus_constant (Pmode, base, offset);
+
   x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
 		   ? TYPE_MODE (TREE_TYPE (decl))
 		   : DECL_MODE (decl), x);
@@ -1013,7 +1029,7 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
      If it is we generate stack slots only accidentally so it isn't as
      important, we'll simply set the alignment directly on the MEM.  */
 
-  if (base == virtual_stack_vars_rtx)
+  if (stack_vars_base_reg_p (base))
     offset -= frame_phase;
   align = known_alignment (offset);
   align *= BITS_PER_UNIT;
@@ -1056,13 +1072,13 @@ public:
 /* A subroutine of expand_used_vars.  Give each partition representative
    a unique location within the stack frame.  Update each partition member
    with that location.  */
-
 static void
 expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 {
   size_t si, i, j, n = stack_vars_num;
   poly_uint64 large_size = 0, large_alloc = 0;
   rtx large_base = NULL;
+  rtx large_untagged_base = NULL;
   unsigned large_align = 0;
   bool large_allocation_done = false;
   tree decl;
@@ -1113,7 +1129,7 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
     {
       rtx base;
       unsigned base_align, alignb;
-      poly_int64 offset;
+      poly_int64 offset = 0;
 
       i = stack_vars_sorted[si];
 
@@ -1134,10 +1150,33 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
       if (pred && !pred (i))
 	continue;
 
+      base = (hwasan_sanitize_stack_p ()
+	      ? hwasan_frame_base ()
+	      : virtual_stack_vars_rtx);
       alignb = stack_vars[i].alignb;
       if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
 	{
-	  base = virtual_stack_vars_rtx;
+	  poly_int64 hwasan_orig_offset;
+	  if (hwasan_sanitize_stack_p ())
+	    {
+	      /* There must be no tag granule "shared" between different
+		 objects.  This means that no HWASAN_TAG_GRANULE_SIZE byte
+		 chunk can have more than one object in it.
+
+		 We ensure this by forcing the end of the last bit of data to
+		 be aligned to HWASAN_TAG_GRANULE_SIZE bytes here, and setting
+		 the start of each variable to be aligned to
+		 HWASAN_TAG_GRANULE_SIZE bytes in `align_local_variable`.
+
+		 We can't align just one of the start or end, since there are
+		 untagged things stored on the stack which we do not align to
+		 HWASAN_TAG_GRANULE_SIZE bytes.  If we only aligned the start
+		 or the end of tagged objects then untagged objects could end
+		 up sharing the first granule of a tagged object or sharing the
+		 last granule of a tagged object respectively.  */
+	      hwasan_orig_offset = align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
+	      gcc_assert (stack_vars[i].alignb >= HWASAN_TAG_GRANULE_SIZE);
+	    }
 	  /* ASAN description strings don't yet have a syntax for expressing
 	     polynomial offsets.  */
 	  HOST_WIDE_INT prev_offset;
@@ -1148,7 +1187,7 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	    {
 	      if (data->asan_vec.is_empty ())
 		{
-		  alloc_stack_frame_space (0, ASAN_RED_ZONE_SIZE);
+		  align_frame_offset (ASAN_RED_ZONE_SIZE);
 		  prev_offset = frame_offset.to_constant ();
 		}
 	      prev_offset = align_base (prev_offset,
@@ -1216,6 +1255,24 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	    {
 	      offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
 	      base_align = crtl->max_used_stack_slot_alignment;
+
+	      if (hwasan_sanitize_stack_p ())
+		{
+		  /* Align again since the point of this alignment is to handle
+		     the "end" of the object (i.e. smallest address after the
+		     stack object).  For FRAME_GROWS_DOWNWARD that requires
+		     aligning the stack before allocating, but for a frame that
+		     grows upwards that requires aligning the stack after
+		     allocation.
+
+		     Use `frame_offset` to record the offset value rather than
+		     offset since the `frame_offset` describes the extent
+		     allocated for this particular variable while `offset`
+		     describes the address that this variable starts at.  */
+		  align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
+		  hwasan_record_stack_var (virtual_stack_vars_rtx, base,
+					   hwasan_orig_offset, frame_offset);
+		}
 	    }
 	}
       else
@@ -1236,14 +1293,33 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	      loffset = alloc_stack_frame_space
 		(rtx_to_poly_int64 (large_allocsize),
 		 PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-	      large_base = get_dynamic_stack_base (loffset, large_align);
+	      large_base = get_dynamic_stack_base (loffset, large_align, base);
 	      large_allocation_done = true;
 	    }
-	  gcc_assert (large_base != NULL);
 
+	  gcc_assert (large_base != NULL);
 	  large_alloc = aligned_upper_bound (large_alloc, alignb);
 	  offset = large_alloc;
 	  large_alloc += stack_vars[i].size;
+	  if (hwasan_sanitize_stack_p ())
+	    {
+	      /* An object with a large alignment requirement means that the
+		 alignment requirement is greater than the required alignment
+		 for tags.  */
+	      if (!large_untagged_base)
+		large_untagged_base
+		  = targetm.memtag.untagged_pointer (large_base, NULL_RTX);
+	      /* Ensure the end of the variable is also aligned correctly.  */
+	      poly_int64 align_again
+		= aligned_upper_bound (large_alloc, HWASAN_TAG_GRANULE_SIZE);
+	      /* For large allocations we always allocate a chunk of space
+		 (which is addressed by large_untagged_base/large_base) and
+		 then use positive offsets from that.  Hence the farthest
+		 offset is `align_again` and the nearest offset from the base
+		 is `offset`.  */
+	      hwasan_record_stack_var (large_untagged_base, large_base,
+				       offset, align_again);
+	    }
 
 	  base = large_base;
 	  base_align = large_align;
@@ -1254,9 +1330,10 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
       for (j = i; j != EOC; j = stack_vars[j].next)
 	{
 	  expand_one_stack_var_at (stack_vars[j].decl,
-				   base, base_align,
-				   offset);
+				   base, base_align, offset);
 	}
+      if (hwasan_sanitize_stack_p ())
+	hwasan_increment_frame_tag ();
     }
 
   gcc_assert (known_eq (large_alloc, large_size));
@@ -1347,10 +1424,37 @@ expand_one_stack_var_1 (tree var)
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
 
-  offset = alloc_stack_frame_space (size, byte_align);
+  rtx base;
+  if (hwasan_sanitize_stack_p ())
+    {
+      /* Allocate zero bytes to align the stack.  */
+      poly_int64 hwasan_orig_offset
+	= align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
+      offset = alloc_stack_frame_space (size, byte_align);
+      align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
+      base = hwasan_frame_base ();
+      /* Use `frame_offset` to automatically account for machines where the
+	 frame grows upwards.
+
+	 `offset` will always point to the "start" of the stack object, which
+	 will be the smallest address.  For ! FRAME_GROWS_DOWNWARD this is *not*
+	 the "furthest" offset from the base delimiting the current stack
+	 object.  `frame_offset` will always delimit the extent of the frame.
+	 */
+      hwasan_record_stack_var (virtual_stack_vars_rtx, base,
+			       hwasan_orig_offset, frame_offset);
+    }
+  else
+    {
+      offset = alloc_stack_frame_space (size, byte_align);
+      base = virtual_stack_vars_rtx;
+    }
 
-  expand_one_stack_var_at (var, virtual_stack_vars_rtx,
+  expand_one_stack_var_at (var, base,
 			   crtl->max_used_stack_slot_alignment, offset);
+
+  if (hwasan_sanitize_stack_p ())
+    hwasan_increment_frame_tag ();
 }
 
 /* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
@@ -1950,6 +2054,8 @@ init_vars_expansion (void)
   /* Initialize local stack smashing state.  */
   has_protected_decls = false;
   has_short_buffer = false;
+  if (hwasan_sanitize_stack_p ())
+    hwasan_record_frame_init ();
 }
 
 /* Free up stack variable graph data.  */
@@ -2277,10 +2383,26 @@ expand_used_vars (void)
       expand_stack_vars (NULL, &data);
     }
 
+  if (hwasan_sanitize_stack_p ())
+    hwasan_emit_prologue ();
   if (asan_sanitize_allocas_p () && cfun->calls_alloca)
     var_end_seq = asan_emit_allocas_unpoison (virtual_stack_dynamic_rtx,
 					      virtual_stack_vars_rtx,
 					      var_end_seq);
+  else if (hwasan_sanitize_allocas_p () && cfun->calls_alloca)
+    /* When using out-of-line instrumentation we only want to emit one function
+       call for clearing the tags in a region of shadow stack.  When there are
+       alloca calls in this frame we want to emit a call using the
+       virtual_stack_dynamic_rtx, but when not we use the rtx returned by
+       hwasan_get_frame_extent.  */
+    var_end_seq = hwasan_emit_untag_frame (virtual_stack_dynamic_rtx,
+					   virtual_stack_vars_rtx);
+  else if (hwasan_sanitize_stack_p ())
+    /* If no variables were stored on the stack, `hwasan_get_frame_extent`
+       will return NULL_RTX and hence `hwasan_emit_untag_frame` will return
+       NULL (i.e. an empty sequence).  */
+    var_end_seq = hwasan_emit_untag_frame (hwasan_get_frame_extent (),
+					   virtual_stack_vars_rtx);
 
   fini_vars_expansion ();
 
@@ -6641,6 +6763,9 @@ pass_expand::execute (function *fun)
       emit_insn_after (var_ret_seq, after);
     }
 
+  if (hwasan_sanitize_stack_p ())
+    hwasan_maybe_emit_frame_base_init ();
+
   /* Zap the tree EH table.  */
   set_eh_throw_stmt_table (fun, NULL);
 
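The "align before and after" rule used in the cfgexpand.c changes above can be
illustrated with a small model.  This is a sketch under stated assumptions, not
the patch's code: it models only a downward-growing frame with plain integer
offsets (no `poly_int64`, no `frame_phase`), `GRANULE` assumes the documented
default granule size of 16, and `align_down`, `alloc_untagged` and
`alloc_tagged` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE 16 /* Assumption: default TARGET_MEMTAG_GRANULE_SIZE.  */

/* Range recorded for a tagged object, as offsets from the frame base.  */
typedef struct
{
  int64_t nearest;
  int64_t farthest;
} tagged_range;

/* Round OFF down to a multiple of ALIGN (a power of two).  */
static int64_t
align_down (int64_t off, int64_t align)
{
  assert (align > 0 && (align & (align - 1)) == 0);
  return off & ~(align - 1);
}

/* Untagged data is packed without any granule alignment.  */
static void
alloc_untagged (int64_t *frame_offset, int64_t size)
{
  *frame_offset -= size;
}

/* Tagged object: align the frame before allocating, allocate with at least
   granule alignment, then align again so the next object starts in a
   different granule.  */
static tagged_range
alloc_tagged (int64_t *frame_offset, int64_t size, int64_t align)
{
  tagged_range r;
  if (align < GRANULE)
    align = GRANULE;
  *frame_offset = align_down (*frame_offset, GRANULE);
  r.nearest = *frame_offset;
  *frame_offset = align_down (*frame_offset - size, align);
  *frame_offset = align_down (*frame_offset, GRANULE);
  r.farthest = *frame_offset;
  return r;
}
```

With a 5-byte untagged slot followed by a 20-byte tagged object, the model
records the range from -16 to -48: the tagged object takes two whole granules
and shares neither with the untagged slot.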
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 298fe4b295e2f81d679786f21f499183bc07078f..f06d5e8911241d3fa0f2c7a101a3a2468defd227 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -12230,3 +12230,60 @@ work.
 At preset, this feature does not support address spaces.  It also requires
 @code{Pmode} to be the same as @code{ptr_mode}.
 @end deftypefn
+
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_TAG_SIZE ()
+Return the size of a tag (in bits) for this platform.
+
+The default returns 8.
+@end deftypefn
+
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_GRANULE_SIZE ()
+Return the size in real memory that each byte in shadow memory refers to.
+I.e. if a variable is @var{X} bytes long in memory, then this hook should
+return the value @var{Y} such that the tag in shadow memory spans
+@var{X}/@var{Y} bytes.
+
+Most variables will need to be aligned to this amount since two variables
+that are neighbors in memory and share a tag granule would need to share
+the same tag.
+
+The default returns 16.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_INSERT_RANDOM_TAG (rtx @var{untagged}, rtx @var{target})
+Return an RTX representing the value of @var{untagged} but with a
+(possibly) random tag in it.
+Put that value into @var{target} if it is convenient to do so.
+This function is used to generate a tagged base for the current stack frame.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_ADD_TAG (rtx @var{base}, poly_int64 @var{addr_offset}, uint8_t @var{tag_offset})
+Return an RTX that represents the result of adding @var{addr_offset} to
+the address in pointer @var{base} and @var{tag_offset} to the tag in pointer
+@var{base}.
+The resulting RTX must either be a valid memory address or be able to get
+put into an operand with @code{force_operand}.
+
+Unlike other memtag hooks, this must return an expression and not emit any
+RTL.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_SET_TAG (rtx @var{untagged_base}, rtx @var{tag}, rtx @var{target})
+Return an RTX representing @var{untagged_base} but with the tag @var{tag}.
+Try and store this in @var{target} if convenient.
+@var{untagged_base} is required to have a zero tag when this hook is called.
+The default of this hook is to set the top byte of @var{untagged_base} to
+@var{tag}.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_EXTRACT_TAG (rtx @var{tagged_pointer}, rtx @var{target})
+Return an RTX representing the tag stored in @var{tagged_pointer}.
+Store the result in @var{target} if it is convenient.
+The default represents the top byte of the original pointer.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_UNTAGGED_POINTER (rtx @var{tagged_pointer}, rtx @var{target})
+Return an RTX representing @var{tagged_pointer} with its tag set to zero.
+Store the result in @var{target} if convenient.
+The default clears the top byte of the original pointer.
+@end deftypefn
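The default behaviour documented for the memtag hooks above (tag held in the
top byte of the pointer) can be modelled on plain integers.  This is a sketch
only: the real hooks operate on RTL, and here a 64-bit pointer with the tag in
bits 56 to 63 is an assumption, as is the wrap-around of the tag modulo 256 in
`add_tag`.  The function names mirror the hooks but are otherwise hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 56 /* Assumption: tag lives in the top byte of 64 bits.  */

/* Model of TARGET_MEMTAG_EXTRACT_TAG: the top byte of the pointer.  */
static uint8_t
extract_tag (uint64_t tagged_pointer)
{
  return (uint8_t) (tagged_pointer >> TAG_SHIFT);
}

/* Model of TARGET_MEMTAG_UNTAGGED_POINTER: clear the top byte.  */
static uint64_t
untagged_pointer (uint64_t tagged_pointer)
{
  return tagged_pointer & ~((uint64_t) 0xff << TAG_SHIFT);
}

/* Model of TARGET_MEMTAG_SET_TAG: the input must carry a zero tag.  */
static uint64_t
set_tag (uint64_t untagged_base, uint8_t tag)
{
  assert (extract_tag (untagged_base) == 0);
  return untagged_base | ((uint64_t) tag << TAG_SHIFT);
}

/* Model of TARGET_MEMTAG_ADD_TAG: add ADDR_OFFSET to the address bits and
   TAG_OFFSET to the tag bits, independently of each other.  */
static uint64_t
add_tag (uint64_t base, int64_t addr_offset, uint8_t tag_offset)
{
  uint8_t tag = (uint8_t) (extract_tag (base) + tag_offset);
  uint64_t addr = untagged_pointer (base) + (uint64_t) addr_offset;
  return set_tag (untagged_pointer (addr), tag);
}
```

For example, starting from address 0x1000 with tag 0xfe, `add_tag` with an
address offset of -16 and a tag offset of 3 gives address 0xff0 with tag 0x01
in this model.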
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 8fbd36e2bf31e098f7827ce331fd7059c8a747bc..b08923c8f28455fe77e061625e78ed1bf538e792 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -8186,3 +8186,17 @@ maintainer is familiar with.
 @hook TARGET_RUN_TARGET_SELFTESTS
 
 @hook TARGET_MEMTAG_CAN_TAG_ADDRESSES
+
+@hook TARGET_MEMTAG_TAG_SIZE
+
+@hook TARGET_MEMTAG_GRANULE_SIZE
+
+@hook TARGET_MEMTAG_INSERT_RANDOM_TAG
+
+@hook TARGET_MEMTAG_ADD_TAG
+
+@hook TARGET_MEMTAG_SET_TAG
+
+@hook TARGET_MEMTAG_EXTRACT_TAG
+
+@hook TARGET_MEMTAG_UNTAGGED_POINTER
diff --git a/gcc/explow.h b/gcc/explow.h
index 0df8c62b82a8bf1d8d6baf0b6fb658e66361a407..581831cb19fdf9e8fd969bb30139e1358279a34d 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -106,7 +106,7 @@ extern rtx allocate_dynamic_stack_space (rtx, unsigned, unsigned,
 extern void get_dynamic_stack_size (rtx *, unsigned, unsigned, HOST_WIDE_INT *);
 
 /* Returns the address of the dynamic stack space without allocating it.  */
-extern rtx get_dynamic_stack_base (poly_int64, unsigned);
+extern rtx get_dynamic_stack_base (poly_int64, unsigned, rtx);
 
 /* Return an rtx doing runtime alignment to REQUIRED_ALIGN on TARGET.  */
 extern rtx align_dynamic_address (rtx, unsigned);
diff --git a/gcc/explow.c b/gcc/explow.c
index 0fbc6d25b816457a3d13ed45d16b5dd0513cfacd..41c3f6ace49c0e55c080e10b917842b1b21d49eb 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1583,10 +1583,14 @@ allocate_dynamic_stack_space (rtx size, unsigned size_align,
    OFFSET is the offset of the area into the virtual stack vars area.
 
    REQUIRED_ALIGN is the alignment (in bits) required for the region
-   of memory.  */
+   of memory.
+
+   BASE is the rtx of the base of this virtual stack vars area.
+   The only time this is not `virtual_stack_vars_rtx` is when tagging pointers
+   on the stack.  */
 
 rtx
-get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
+get_dynamic_stack_base (poly_int64 offset, unsigned required_align, rtx base)
 {
   rtx target;
 
@@ -1594,7 +1598,7 @@ get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
     crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
 
   target = gen_reg_rtx (Pmode);
-  emit_move_insn (target, virtual_stack_vars_rtx);
+  emit_move_insn (target, base);
   target = expand_binop (Pmode, add_optab, target,
 			 gen_int_mode (offset, Pmode),
 			 NULL_RTX, 1, OPTAB_LIB_WIDEN);
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index a32715ddb92e69b7ca7be28a8f17a369b891bd76..4f854fb994229fd4ed91d3b5cff7c7acff9a55bc 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -180,6 +180,12 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_COMPARE, "__sanitizer_ptr_cmp",
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_SUBTRACT, "__sanitizer_ptr_sub",
 		      BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
 
+/* Hardware Address Sanitizer.  */
+DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_INIT, "__hwasan_init",
+		      BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_TAG_MEM, "__hwasan_tag_memory",
+		      BT_FN_VOID_PTR_UINT8_PTRMODE, ATTR_NOTHROW_LIST)
+
 /* Thread Sanitizer */
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_INIT, "__tsan_init", 
 		      BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
diff --git a/gcc/target.def b/gcc/target.def
index 25f0ae228210f926077020082f129fb2e599f062..44807438431488a5a7aa8f8125d256869e152b68 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6874,6 +6874,71 @@ At preset, this feature does not support address spaces.  It also requires\n\
 @code{Pmode} to be the same as @code{ptr_mode}.",
  bool, (), default_memtag_can_tag_addresses)
 
+DEFHOOK
+(tag_size,
+ "Return the size of a tag (in bits) for this platform.\n\
+\n\
+The default returns 8.",
+  uint8_t, (), default_memtag_tag_size)
+
+DEFHOOK
+(granule_size,
+ "Return the size in real memory that each byte in shadow memory refers to.\n\
+I.e. if a variable is @var{X} bytes long in memory, then this hook should\n\
+return the value @var{Y} such that the tag in shadow memory spans\n\
+@var{X}/@var{Y} bytes.\n\
+\n\
+Most variables will need to be aligned to this amount since two variables\n\
+that are neighbors in memory and share a tag granule would need to share\n\
+the same tag.\n\
+\n\
+The default returns 16.",
+  uint8_t, (), default_memtag_granule_size)
+
+DEFHOOK
+(insert_random_tag,
+ "Return an RTX representing the value of @var{untagged} but with a\n\
+(possibly) random tag in it.\n\
+Put that value into @var{target} if it is convenient to do so.\n\
+This function is used to generate a tagged base for the current stack frame.",
+  rtx, (rtx untagged, rtx target), default_memtag_insert_random_tag)
+
+DEFHOOK
+(add_tag,
+ "Return an RTX that represents the result of adding @var{addr_offset} to\n\
+the address in pointer @var{base} and @var{tag_offset} to the tag in pointer\n\
+@var{base}.\n\
+The resulting RTX must either be a valid memory address or be able to get\n\
+put into an operand with @code{force_operand}.\n\
+\n\
+Unlike other memtag hooks, this must return an expression and not emit any\n\
+RTL.",
+  rtx, (rtx base, poly_int64 addr_offset, uint8_t tag_offset),
+  default_memtag_add_tag)
+
+DEFHOOK
+(set_tag,
+ "Return an RTX representing @var{untagged_base} but with the tag @var{tag}.\n\
+Try and store this in @var{target} if convenient.\n\
+@var{untagged_base} is required to have a zero tag when this hook is called.\n\
+The default of this hook is to set the top byte of @var{untagged_base} to\n\
+@var{tag}.",
+  rtx, (rtx untagged_base, rtx tag, rtx target), default_memtag_set_tag)
+
+DEFHOOK
+(extract_tag,
+ "Return an RTX representing the tag stored in @var{tagged_pointer}.\n\
+Store the result in @var{target} if it is convenient.\n\
+The default represents the top byte of the original pointer.",
+  rtx, (rtx tagged_pointer, rtx target), default_memtag_extract_tag)
+
+DEFHOOK
+(untagged_pointer,
+ "Return an RTX representing @var{tagged_pointer} with its tag set to zero.\n\
+Store the result in @var{target} if convenient.\n\
+The default clears the top byte of the original pointer.",
+  rtx, (rtx tagged_pointer, rtx target), default_memtag_untagged_pointer)
+
 HOOK_VECTOR_END (memtag)
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_"
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 0065c686978d7120978430013c73b1055aaf95c7..68e8688a32f18481ee61f06879aacff20163105b 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -287,4 +287,12 @@ extern bool speculation_safe_value_not_needed (bool);
 extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
 
 extern bool default_memtag_can_tag_addresses ();
+extern uint8_t default_memtag_tag_size ();
+extern uint8_t default_memtag_granule_size ();
+extern rtx default_memtag_insert_random_tag (rtx, rtx);
+extern rtx default_memtag_add_tag (rtx, poly_int64, uint8_t);
+extern rtx default_memtag_set_tag (rtx, rtx, rtx);
+extern rtx default_memtag_extract_tag (rtx, rtx);
+extern rtx default_memtag_untagged_pointer (rtx, rtx);
+
 #endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 46cb536041d396c32fd08042581d6d5cd5ad0395..e634df3f6c6837e422246a7736c0de4471ce1e77 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -73,6 +73,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "varasm.h"
 #include "flags.h"
 #include "explow.h"
+#include "expmed.h"
 #include "calls.h"
 #include "expr.h"
 #include "output.h"
@@ -86,6 +87,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "langhooks.h"
 #include "sbitmap.h"
 #include "function-abi.h"
+#include "attribs.h"
+#include "asan.h"
+#include "emit-rtl.h"
 
 bool
 default_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
@@ -2415,10 +2419,115 @@ default_speculation_safe_value (machine_mode mode ATTRIBUTE_UNUSED,
   return result;
 }
 
+/* How many bits to shift in order to access the tag bits.
+   The default is to store the tag in the top 8 bits of a 64 bit pointer, hence
+   shifting 56 bits will leave just the tag.  */
+#define HWASAN_SHIFT (GET_MODE_PRECISION (Pmode) - 8)
+#define HWASAN_SHIFT_RTX GEN_INT (HWASAN_SHIFT)
+
 bool
 default_memtag_can_tag_addresses ()
 {
   return false;
 }
 
+uint8_t
+default_memtag_tag_size ()
+{
+  return 8;
+}
+
+uint8_t
+default_memtag_granule_size ()
+{
+  return 16;
+}
+
+/* The default implementation of TARGET_MEMTAG_INSERT_RANDOM_TAG.  */
+rtx
+default_memtag_insert_random_tag (rtx untagged, rtx target)
+{
+  gcc_assert (param_hwasan_instrument_stack);
+  if (param_hwasan_random_frame_tag)
+    {
+      rtx fn = init_one_libfunc ("__hwasan_generate_tag");
+      rtx new_tag = emit_library_call_value (fn, NULL_RTX, LCT_NORMAL, QImode);
+      return targetm.memtag.set_tag (untagged, new_tag, target);
+    }
+  else
+    {
+      /* NOTE: The kernel API does not have __hwasan_generate_tag exposed.
+	 In the future we may add the option to emit random tags with inline
+	 instrumentation instead of function calls.  This would be the same
+	 between the kernel and userland.  */
+      return untagged;
+    }
+}
+
+/* The default implementation of TARGET_MEMTAG_ADD_TAG.  */
+rtx
+default_memtag_add_tag (rtx base, poly_int64 offset, uint8_t tag_offset)
+{
+  /* Need to look into what the most efficient code sequence is.
+     This is a code sequence that would be emitted *many* times, so we
+     want it as small as possible.
+
+     There are two places where tag overflow is a question:
+       - Tagging the shadow stack.
+	  (both tagging and untagging).
+       - Tagging addressable pointers.
+
+     We need to ensure both behaviors are the same (i.e. that the tag that
+     ends up in a pointer after "overflowing" the tag bits with a tag addition
+     is the same that ends up in the shadow space).
+
+     The aim is that the behavior of tag addition should follow modulo
+     wrapping in both instances.
+
+     The libhwasan code doesn't have any path that increments a pointer's tag,
+     which means it has no opinion on what happens when a tag increment
+     overflows (and hence we can choose our own behavior).  */
+
+  offset += ((uint64_t)tag_offset << HWASAN_SHIFT);
+  return plus_constant (Pmode, base, offset);
+}
+
+/* The default implementation of TARGET_MEMTAG_SET_TAG.  */
+rtx
+default_memtag_set_tag (rtx untagged, rtx tag, rtx target)
+{
+  gcc_assert (GET_MODE (untagged) == Pmode && GET_MODE (tag) == QImode);
+  tag = expand_simple_binop (Pmode, ASHIFT, tag, HWASAN_SHIFT_RTX, NULL_RTX,
+			     /* unsignedp = */1, OPTAB_WIDEN);
+  rtx ret = expand_simple_binop (Pmode, IOR, untagged, tag, target,
+				 /* unsignedp = */1, OPTAB_DIRECT);
+  gcc_assert (ret);
+  return ret;
+}
+
+/* The default implementation of TARGET_MEMTAG_EXTRACT_TAG.  */
+rtx
+default_memtag_extract_tag (rtx tagged_pointer, rtx target)
+{
+  rtx tag = expand_simple_binop (Pmode, LSHIFTRT, tagged_pointer,
+				 HWASAN_SHIFT_RTX, target,
+				 /* unsignedp = */0,
+				 OPTAB_DIRECT);
+  rtx ret = gen_lowpart (QImode, tag);
+  gcc_assert (ret);
+  return ret;
+}
+
+/* The default implementation of TARGET_MEMTAG_UNTAGGED_POINTER.  */
+rtx
+default_memtag_untagged_pointer (rtx tagged_pointer, rtx target)
+{
+  rtx tag_mask = gen_int_mode ((HOST_WIDE_INT_1U << HWASAN_SHIFT) - 1, Pmode);
+  rtx untagged_base = expand_simple_binop (Pmode, AND, tagged_pointer,
+					   tag_mask, target, true,
+					   OPTAB_DIRECT);
+  gcc_assert (untagged_base);
+  return untagged_base;
+}
+
 #include "gt-targhooks.h"
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 2a3e7c064a5fbb6913481104975ca85615e49f8e..9938b6afbd4fa22898dbc3c29b92061a71810b08 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -512,6 +512,9 @@ compile_file (void)
       if (flag_sanitize & SANITIZE_THREAD)
 	tsan_finish_file ();
 
+      if (flag_sanitize & SANITIZE_HWADDRESS)
+	hwasan_finish_file ();
+
       omp_finish_file ();
 
       output_shared_constant_pool ();
Richard Sandiford Nov. 23, 2020, 8:10 p.m. UTC | #3
Matthew Malcomson <matthew.malcomson@arm.com> writes:
> Hi there,
>
> I was just doing some double-checks and noticed I'd placed the
> documentation in the wrong section of tm.texi.  The `MEMTAG` hooks were
> documented in the `Register Classes` section, so I've now moved it to
> the `Misc` section.
>
> That's the only change, Ok for trunk?

OK, thanks.

Richard
Richard Sandiford Nov. 23, 2020, 8:16 p.m. UTC | #4
Sorry for the earlier OK, was replying to the wrong message…

Matthew Malcomson <matthew.malcomson@arm.com> writes:
> @@ -1216,6 +1255,24 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>  	    {
>  	      offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
>  	      base_align = crtl->max_used_stack_slot_alignment;
> +
> +	      if (hwasan_sanitize_stack_p ())
> +		{
> +		  /* Align again since the point of this alignment is to handle
> +		     the "end" of the object (i.e. smallest address after the
> +		     stack object).  For FRAME_GROWS_DOWNWARD that requires
> +		     aligning the stack before allocating, but for a frame that
> +		     grows upwards that requires aligning the stack after
> +		     allocation.
> +
> +		     Use `frame_offset` to record the offset value rather than
> +		     offset since the `frame_offset` describes the extent

What I meant here was to quote “offset”, i.e.:

		     Use `frame_offset` to record the offset value rather than
		     `offset` since `frame_offset` describes the extent

Without the quoting, “the offset value rather than offset” was quite
hard to parse.

OK with that change, thanks.

Richard

> +		     allocated for this particular variable while `offset`
> +		     describes the address that this variable starts at.  */
> +		  align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +		  hwasan_record_stack_var (virtual_stack_vars_rtx, base,
> +					   hwasan_orig_offset, frame_offset);
> +		}
>  	    }
>  	}
>        else
Hongtao Liu Nov. 24, 2020, 12:30 p.m. UTC | #5
Hi:
  I'm learning about this patch, and I see one place that might be
slightly improved.

+      poly_int64 size = (top - bot);
+
+      /* Assert the edge of each variable is aligned to the HWASAN tag granule
+        size.  */
+      gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
+

The last gcc_assert looks redundant?
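
To spell out the reasoning: if `top` and `bot` are both multiples of the
granule size, their difference is one too.  A minimal sketch in plain C,
assuming ordinary integers rather than poly_int64; `is_multiple` and
`third_check_can_fail` are hypothetical helpers written for this note, not
GCC functions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for multiple_p, on plain integers only.  */
static bool
is_multiple (int64_t value, int64_t granule)
{
  return value % granule == 0;
}

/* Search [-limit, limit] for a pair (bot, top) that passes the first two
   checks from the patch but fails the third.  Returns true if one exists.  */
static bool
third_check_can_fail (int64_t granule, int64_t limit)
{
  for (int64_t bot = -limit; bot <= limit; bot++)
    for (int64_t top = bot; top <= limit; top++)
      if (is_multiple (top, granule)
          && is_multiple (bot, granule)
          && !is_multiple (top - bot, granule))
        return true;
  return false;
}
```

With a granule of 16 the search finds no such pair, which is what you would
expect since top = 16a and bot = 16b give top - bot = 16(a - b).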

On Sat, Nov 21, 2020 at 2:48 AM Matthew Malcomson via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
>
> Hi there,
>
> I was just doing some double-checks and noticed I'd placed the
> documentation in the wrong section of tm.texi.  The `MEMTAG` hooks were
> documented in the `Register Classes` section, so I've now moved it to
> the `Misc` section.
>
> That's the only change, Ok for trunk?
>
> Matthew
>
>
> ------------------------------------------------------------
>
>
>
> Handling stack variables has three features.
>
> 1) Ensure HWASAN required alignment for stack variables
>
> When tagging shadow memory, we need to ensure that each tag granule is
> only used by one variable at a time.
>
> This is done by ensuring that each tagged variable is aligned to the tag
> granule representation size and also ensuring that the end of each
> object is aligned, so that the start of any other data stored on the
> stack is in a different granule.
>
> This patch ensures the above by forcing the stack pointer to be aligned
> before and after allocating any stack objects. Since we are forcing
> alignment we also use `align_local_variable` to ensure this new alignment
> is advertised properly through SET_DECL_ALIGN.
>
> 2) Put tags into each stack variable pointer
>
> Make sure that every pointer to a stack variable includes a tag of some
> sort on it.
>
> The way tagging works is:
>   1) For every new stack frame, a random tag is generated.
>   2) A base register is formed from the stack pointer value and this
>      random tag.
>   3) References to stack variables are now formed with RTL describing an
>      offset from this base in both tag and value.
>
> The random tag generation is handled by a backend hook.  This hook
> decides whether to introduce a random tag or use the stack background
> based on the parameter hwasan-random-frame-tag.  Using the stack
> background is necessary for testing and bootstrap.  It is necessary
> during bootstrap to avoid breaking the `configure` test program for
> determining stack direction.
>
> Using the stack background means that every stack frame has the initial
> tag of zero and variables are tagged with incrementing tags from 1,
> which also makes debugging a bit easier.
>
> Backend hooks define the size of a tag, the layout of the HWASAN shadow
> memory, and handle emitting the code that inserts and extracts tags from a
> pointer.
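
As a reading aid, the pointer arithmetic behind the default hooks can be
sketched in plain C.  This is only an illustration, assuming 64-bit pointers
with the tag in the top byte (the default described in the patch); the
function names below are hypothetical and operate on integers, whereas the
real hooks build RTL:

```c
#include <stdint.h>

/* Assumed layout: 8-bit tag in the top byte of a 64-bit pointer.  */
#define TAG_SHIFT 56

/* Place TAG in the top byte; UNTAGGED must have a zero tag.  */
static uint64_t
sketch_set_tag (uint64_t untagged, uint8_t tag)
{
  return untagged | ((uint64_t) tag << TAG_SHIFT);
}

/* Read the tag back out of the top byte.  */
static uint8_t
sketch_extract_tag (uint64_t tagged)
{
  return (uint8_t) (tagged >> TAG_SHIFT);
}

/* Clear the top byte, leaving only the address bits.  */
static uint64_t
sketch_untagged_pointer (uint64_t tagged)
{
  return tagged & ((UINT64_C (1) << TAG_SHIFT) - 1);
}

/* Add ADDR_OFFSET to the address and TAG_OFFSET to the tag in one addition.
   Unsigned arithmetic wraps modulo 2^64, so the tag wraps modulo 256.  */
static uint64_t
sketch_add_tag (uint64_t base, int64_t addr_offset, uint8_t tag_offset)
{
  return base + (uint64_t) addr_offset
         + ((uint64_t) tag_offset << TAG_SHIFT);
}
```

For example, adding a tag offset of 2 to a base tagged 0xff gives a tag of
0x01, which matches the modulo wrapping the patch aims for in both pointers
and shadow memory.
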
>
> 3) For each stack variable, tag and untag the shadow stack on function
>    prologue and epilogue.
>
> On entry to each function we tag the relevant shadow stack region for
> each stack variable. This stack region is tagged to match the tag added to
> each pointer to that variable.
>
> This is the first patch where we use the HWASAN shadow space, so we need
> to add in the libhwasan initialisation code that creates this shadow
> memory region into the binary we produce.  This instrumentation is done
> in `compile_file`.
>
> When exiting a function we need to ensure the shadow stack for this
> function has no remaining tags.  Without clearing the shadow stack area
> for this stack frame, later function calls could get false positives
> when those later function calls check untagged areas (such as parameters
> passed on the stack) against a shadow stack area with left-over tags.
>
> Hence we ensure that the entire stack frame is cleared on function exit.
>
> config/ChangeLog:
>
>         * bootstrap-hwasan.mk: Disable random frame tags for stack-tagging
>         during bootstrap.
>
> ChangeLog:
>
>         * gcc/asan.c (struct hwasan_stack_var): New.
>         (hwasan_sanitize_p): New.
>         (hwasan_sanitize_stack_p): New.
>         (hwasan_sanitize_allocas_p): New.
>         (initialize_sanitizer_builtins): Define new builtins.
>         (ATTR_NOTHROW_LIST): New macro.
>         (hwasan_current_frame_tag): New.
>         (hwasan_frame_base): New.
>         (stack_vars_base_reg_p): New.
>         (hwasan_maybe_emit_frame_base_init): New.
>         (hwasan_record_stack_var): New.
>         (hwasan_get_frame_extent): New.
>         (hwasan_increment_frame_tag): New.
>         (hwasan_record_frame_init): New.
>         (hwasan_emit_prologue): New.
>         (hwasan_emit_untag_frame): New.
>         (hwasan_finish_file): New.
>         (hwasan_truncate_to_tag_size): New.
>         * gcc/asan.h (hwasan_record_frame_init): New declaration.
>         (hwasan_record_stack_var): New declaration.
>         (hwasan_emit_prologue): New declaration.
>         (hwasan_emit_untag_frame): New declaration.
>         (hwasan_get_frame_extent): New declaration.
>         (hwasan_maybe_emit_frame_base_init): New declaration.
>         (hwasan_frame_base): New declaration.
>         (stack_vars_base_reg_p): New declaration.
>         (hwasan_current_frame_tag): New declaration.
>         (hwasan_increment_frame_tag): New declaration.
>         (hwasan_truncate_to_tag_size): New declaration.
>         (hwasan_finish_file): New declaration.
>         (hwasan_sanitize_p): New declaration.
>         (hwasan_sanitize_stack_p): New declaration.
>         (hwasan_sanitize_allocas_p): New declaration.
>         (HWASAN_TAG_SIZE): New macro.
>         (HWASAN_TAG_GRANULE_SIZE): New macro.
>         (HWASAN_STACK_BACKGROUND): New macro.
>         * gcc/builtin-types.def (BT_FN_VOID_PTR_UINT8_PTRMODE): New.
>         * gcc/builtins.def (DEF_SANITIZER_BUILTIN): Enable for HWASAN.
>         * gcc/cfgexpand.c (align_local_variable): When using hwasan ensure
>         alignment to tag granule.
>         (align_frame_offset): New.
>         (expand_one_stack_var_at): For hwasan use tag offset.
>         (expand_stack_vars): Record stack objects for hwasan.
>         (expand_one_stack_var_1): Record stack objects for hwasan.
>         (init_vars_expansion): Initialise hwasan state.
>         (expand_used_vars): Emit hwasan prologue and generate hwasan epilogue.
>         (pass_expand::execute): Emit hwasan base initialization if needed.
>         * gcc/doc/tm.texi (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
>         TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
>         TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
>         TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
>         * gcc/doc/tm.texi.in (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
>         TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
>         TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
>         TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
>         * gcc/explow.c (get_dynamic_stack_base): Take new `base` argument.
>         * gcc/explow.h (get_dynamic_stack_base): Take new `base` argument.
>         * gcc/sanitizer.def (BUILT_IN_HWASAN_INIT): New.
>         (BUILT_IN_HWASAN_TAG_MEM): New.
>         * gcc/target.def (target_memtag_tag_size,target_memtag_granule_size,
>         target_memtag_insert_random_tag,target_memtag_add_tag,
>         target_memtag_set_tag,target_memtag_extract_tag,
>         target_memtag_untagged_pointer): New hooks.
>         * gcc/targhooks.c (HWASAN_SHIFT): New.
>         (HWASAN_SHIFT_RTX): New.
>         (default_memtag_tag_size): New default hook.
>         (default_memtag_granule_size): New default hook.
>         (default_memtag_insert_random_tag): New default hook.
>         (default_memtag_add_tag): New default hook.
>         (default_memtag_set_tag): New default hook.
>         (default_memtag_extract_tag): New default hook.
>         (default_memtag_untagged_pointer): New default hook.
>         * gcc/targhooks.h (default_memtag_tag_size): New default hook.
>         (default_memtag_granule_size): New default hook.
>         (default_memtag_insert_random_tag): New default hook.
>         (default_memtag_add_tag): New default hook.
>         (default_memtag_set_tag): New default hook.
>         (default_memtag_extract_tag): New default hook.
>         (default_memtag_untagged_pointer): New default hook.
>         * gcc/toplev.c (compile_file): Call hwasan_finish_file when finished.
>
>
> ###############     Attachment also inlined for ease of reply    ###############
>
>
> diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
> index 4f60bed3fd6e98b47a3a38aea6eba2a7c320da25..91989f4bb1db6ccff564383777757b896645e541 100644
> --- a/config/bootstrap-hwasan.mk
> +++ b/config/bootstrap-hwasan.mk
> @@ -1,7 +1,11 @@
>  # This option enables -fsanitize=hwaddress for stage2 and stage3.
> +# We need to disable random frame tags for bootstrap since the autoconf check
> +# for which direction the stack is growing has UB that a random frame tag
> +# breaks.  Running with a random frame tag gives approx. 50% chance of
> +# bootstrap comparison diff in libiberty/alloca.c.
>
> -STAGE2_CFLAGS += -fsanitize=hwaddress
> -STAGE3_CFLAGS += -fsanitize=hwaddress
> +STAGE2_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
> +STAGE3_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
>  POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
>                       -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
>                       -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
> diff --git a/gcc/asan.h b/gcc/asan.h
> index 114b457ef91c4479d43774bed58c24213196ce12..8d5271e6b575d74da277420798557f3274e966ce 100644
> --- a/gcc/asan.h
> +++ b/gcc/asan.h
> @@ -34,6 +34,22 @@ extern bool asan_expand_mark_ifn (gimple_stmt_iterator *);
>  extern bool asan_expand_poison_ifn (gimple_stmt_iterator *, bool *,
>                                     hash_map<tree, tree> &);
>
> +extern void hwasan_record_frame_init ();
> +extern void hwasan_record_stack_var (rtx, rtx, poly_int64, poly_int64);
> +extern void hwasan_emit_prologue ();
> +extern rtx_insn *hwasan_emit_untag_frame (rtx, rtx);
> +extern rtx hwasan_get_frame_extent ();
> +extern rtx hwasan_frame_base ();
> +extern void hwasan_maybe_emit_frame_base_init (void);
> +extern bool stack_vars_base_reg_p (rtx);
> +extern uint8_t hwasan_current_frame_tag ();
> +extern void hwasan_increment_frame_tag ();
> +extern rtx hwasan_truncate_to_tag_size (rtx, rtx);
> +extern void hwasan_finish_file (void);
> +extern bool hwasan_sanitize_p (void);
> +extern bool hwasan_sanitize_stack_p (void);
> +extern bool hwasan_sanitize_allocas_p (void);
> +
>  extern gimple_stmt_iterator create_cond_insert_point
>       (gimple_stmt_iterator *, bool, bool, bool, basic_block *, basic_block *);
>
> @@ -75,6 +91,26 @@ extern hash_set <tree> *asan_used_labels;
>
>  #define ASAN_USE_AFTER_SCOPE_ATTRIBUTE "use after scope memory"
>
> +/* NOTE: The values below and the hooks under targetm.memtag define an ABI and
> +   are hard-coded to these values in libhwasan, hence they can't be changed
> +   independently here.  */
> +/* How many bits are used to store a tag in a pointer.
> +   The default version uses the entire top byte of a pointer (i.e. 8 bits).  */
> +#define HWASAN_TAG_SIZE targetm.memtag.tag_size ()
> +/* Tag Granule of HWASAN shadow stack.
> +   This is the size in real memory that each byte in the shadow memory refers
> +   to.  I.e. if a variable is X bytes long in memory then its tag in shadow
> +   memory will span X / HWASAN_TAG_GRANULE_SIZE bytes.
> +   Most variables will need to be aligned to this amount since two variables
> +   that are neighbors in memory and share a tag granule would need to share the
> +   same tag (the shared tag granule can only store one tag).  */
> +#define HWASAN_TAG_GRANULE_SIZE targetm.memtag.granule_size ()
> +/* Define the tag for the stack background.
> +   This defines what tag the stack pointer will be and hence what tag all
> +   variables that are not given special tags are (e.g. spilled registers,
> +   and parameters passed on the stack).  */
> +#define HWASAN_STACK_BACKGROUND gen_int_mode (0, QImode)
> +
>  /* Various flags for Asan builtins.  */
>  enum asan_check_flags
>  {
> diff --git a/gcc/asan.c b/gcc/asan.c
> index 0b471afff64ea6a0ffbe0add71333ac688c472c6..d1ede3b62291eba698948e06208c482b6f197be5 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -257,6 +257,58 @@ hash_set<tree> *asan_handled_variables = NULL;
>
>  hash_set <tree> *asan_used_labels = NULL;
>
> +/* Global variables for HWASAN stack tagging.  */
> +/* hwasan_frame_tag_offset records the offset from the frame base tag that the
> +   next object should have.  */
> +static uint8_t hwasan_frame_tag_offset = 0;
> +/* hwasan_frame_base_ptr is a pointer with the same address as
> +   `virtual_stack_vars_rtx` for the current frame, and with the frame base tag
> +   stored in it.  N.b. this global RTX does not need to be marked GTY, but is
> +   done so anyway.  The need is not there since all uses are in just one pass
> +   (cfgexpand) and there are no calls to ggc_collect between the uses.  We mark
> +   it GTY(()) anyway to allow the use of the variable later on if needed by
> +   future features.  */
> +static GTY(()) rtx hwasan_frame_base_ptr = NULL_RTX;
> +/* hwasan_frame_base_init_seq is the sequence of RTL insns that will initialize
> +   the hwasan_frame_base_ptr.  When the hwasan_frame_base_ptr is requested, we
> +   generate this sequence but do not emit it.  If the sequence was created it
> +   is emitted once the function body has been expanded.
> +
> +   This delay is because the frame base pointer may be needed anywhere in the
> +   function body, or needed by the expand_used_vars function.  Emitting once in
> +   a known place is simpler than requiring the code emitting the instructions
> +   to know where they should go depending on the first place the hwasan frame
> +   base is needed.  */
> +static GTY(()) rtx_insn *hwasan_frame_base_init_seq = NULL;
> +
> +/* Structure defining the extent of one object on the stack that HWASAN needs
> +   to tag in the corresponding shadow stack space.
> +
> +   The range this object spans on the stack is between `untagged_base +
> +   nearest_offset` and `untagged_base + farthest_offset`.
> +   `tagged_base` is an rtx containing the same value as `untagged_base` but
> +   with a random tag stored in the top byte.  We record both `untagged_base`
> +   and `tagged_base` so that `hwasan_emit_prologue` can use both without having
> +   to emit RTL into the instruction stream to re-calculate one from the other.
> +   (`hwasan_emit_prologue` needs to use both bases since the
> +   __hwasan_tag_memory call it emits uses an untagged value, and it calculates
> +   the tag to store in shadow memory based on the tag_offset plus the tag in
> +   tagged_base).  */
> +struct hwasan_stack_var
> +{
> +  rtx untagged_base;
> +  rtx tagged_base;
> +  poly_int64 nearest_offset;
> +  poly_int64 farthest_offset;
> +  uint8_t tag_offset;
> +};
> +
> +/* Variable recording all stack variables that HWASAN needs to tag.
> +   Does not need to be marked as GTY(()) since every use is in the cfgexpand
> +   pass and ggc_collect is not called in the middle of that pass.  */
> +static vec<hwasan_stack_var> hwasan_tagged_stack_vars;
> +
> +
>  /* Sets shadow offset to value in string VAL.  */
>
>  bool
> @@ -1359,6 +1411,28 @@ asan_redzone_buffer::flush_if_full (void)
>      flush_redzone_payload ();
>  }
>
> +/* Returns whether we are tagging pointers and checking those tags on memory
> +   access.  */
> +bool
> +hwasan_sanitize_p ()
> +{
> +  return sanitize_flags_p (SANITIZE_HWADDRESS);
> +}
> +
> +/* Are we tagging the stack?  */
> +bool
> +hwasan_sanitize_stack_p ()
> +{
> +  return (hwasan_sanitize_p () && param_hwasan_instrument_stack);
> +}
> +
> +/* Are we tagging alloca objects?  */
> +bool
> +hwasan_sanitize_allocas_p (void)
> +{
> +  return (hwasan_sanitize_stack_p () && param_hwasan_instrument_allocas);
> +}
> +
>  /* Insert code to protect stack vars.  The prologue sequence should be emitted
>     directly, epilogue sequence returned.  BASE is the register holding the
>     stack base, against which OFFSETS array offsets are relative to, OFFSETS
> @@ -2908,6 +2982,11 @@ initialize_sanitizer_builtins (void)
>      = build_function_type_list (void_type_node, uint64_type_node,
>                                 ptr_type_node, NULL_TREE);
>
> +  tree BT_FN_VOID_PTR_UINT8_PTRMODE
> +    = build_function_type_list (void_type_node, ptr_type_node,
> +                               unsigned_char_type_node,
> +                               pointer_sized_int_node, NULL_TREE);
> +
>    tree BT_FN_BOOL_VPTR_PTR_IX_INT_INT[5];
>    tree BT_FN_IX_CONST_VPTR_INT[5];
>    tree BT_FN_IX_VPTR_IX_INT[5];
> @@ -2958,6 +3037,8 @@ initialize_sanitizer_builtins (void)
>  #define BT_FN_I16_CONST_VPTR_INT BT_FN_IX_CONST_VPTR_INT[4]
>  #define BT_FN_I16_VPTR_I16_INT BT_FN_IX_VPTR_IX_INT[4]
>  #define BT_FN_VOID_VPTR_I16_INT BT_FN_VOID_VPTR_IX_INT[4]
> +#undef ATTR_NOTHROW_LIST
> +#define ATTR_NOTHROW_LIST ECF_NOTHROW
>  #undef ATTR_NOTHROW_LEAF_LIST
>  #define ATTR_NOTHROW_LEAF_LIST ECF_NOTHROW | ECF_LEAF
>  #undef ATTR_TMPURE_NOTHROW_LEAF_LIST
> @@ -3709,4 +3790,347 @@ make_pass_asan_O0 (gcc::context *ctxt)
>    return new pass_asan_O0 (ctxt);
>  }
>
> +/* For stack tagging:
> +
> +   Return the offset from the frame base tag that the "next" expanded object
> +   should have.  */
> +uint8_t
> +hwasan_current_frame_tag ()
> +{
> +  return hwasan_frame_tag_offset;
> +}
> +
> +/* For stack tagging:
> +
> +   Return the 'base pointer' for this function.  If that base pointer has not
> +   yet been created then we create a register to hold it and record the insns
> +   to initialize the register in `hwasan_frame_base_init_seq` for later
> +   emission.  */
> +rtx
> +hwasan_frame_base ()
> +{
> +  if (! hwasan_frame_base_ptr)
> +    {
> +      start_sequence ();
> +      hwasan_frame_base_ptr
> +       = force_reg (Pmode,
> +                    targetm.memtag.insert_random_tag (virtual_stack_vars_rtx,
> +                                                      NULL_RTX));
> +      hwasan_frame_base_init_seq = get_insns ();
> +      end_sequence ();
> +    }
> +
> +  return hwasan_frame_base_ptr;
> +}
> +
> +/* For stack tagging:
> +
> +   Check whether this RTX is a standard pointer addressing the base of the
> +   stack variables for this frame.  Returns true if the RTX is either
> +   virtual_stack_vars_rtx or hwasan_frame_base_ptr.  */
> +bool
> +stack_vars_base_reg_p (rtx base)
> +{
> +  return base == virtual_stack_vars_rtx || base == hwasan_frame_base_ptr;
> +}
> +
> +/* For stack tagging:
> +
> +   Emit frame base initialisation.
> +   If hwasan_frame_base has been used before here then
> +   hwasan_frame_base_init_seq contains the sequence of instructions to
> +   initialize it.  This must be put just before the hwasan prologue, so we emit
> +   the insns before parm_birth_insn (which will point to the first instruction
> +   of the hwasan prologue if it exists).
> +
> +   We update `parm_birth_insn` to point to the start of this initialisation
> +   since that represents the end of the initialisation done by
> +   expand_function_{start,end} functions and we want to maintain that.  */
> +void
> +hwasan_maybe_emit_frame_base_init ()
> +{
> +  if (! hwasan_frame_base_init_seq)
> +    return;
> +  emit_insn_before (hwasan_frame_base_init_seq, parm_birth_insn);
> +  parm_birth_insn = hwasan_frame_base_init_seq;
> +}
> +
> +/* Record a compile-time constant size stack variable that HWASAN will need to
> +   tag.  This record of the range of a stack variable will be used by
> +   `hwasan_emit_prologue` to emit the RTL at the start of each frame which will
> +   set tags in the shadow memory according to the assigned tag for each object.
> +
> +   The range that the object spans in stack space should be described by the
> +   bounds `untagged_base + nearest_offset` and
> +   `untagged_base + farthest_offset`.
> +   `tagged_base` is the base address which contains the "base frame tag" for
> +   this frame, and from which the value to address this object with will be
> +   calculated.
> +
> +   We record the `untagged_base` since the functions in the hwasan library we
> +   use to tag memory take pointers without a tag.  */
> +void
> +hwasan_record_stack_var (rtx untagged_base, rtx tagged_base,
> +                        poly_int64 nearest_offset, poly_int64 farthest_offset)
> +{
> +  hwasan_stack_var cur_var;
> +  cur_var.untagged_base = untagged_base;
> +  cur_var.tagged_base = tagged_base;
> +  cur_var.nearest_offset = nearest_offset;
> +  cur_var.farthest_offset = farthest_offset;
> +  cur_var.tag_offset = hwasan_current_frame_tag ();
> +
> +  hwasan_tagged_stack_vars.safe_push (cur_var);
> +}
> +
> +/* Return the RTX representing the farthest extent of the statically allocated
> +   stack objects for this frame.  If hwasan_frame_base_ptr has not been
> +   initialized then we are not storing any static variables on the stack in
> +   this frame.  In this case we return NULL_RTX to represent that.
> +
> +   Otherwise simply return virtual_stack_vars_rtx + frame_offset.  */
> +rtx
> +hwasan_get_frame_extent ()
> +{
> +  return (hwasan_frame_base_ptr
> +         ? plus_constant (Pmode, virtual_stack_vars_rtx, frame_offset)
> +         : NULL_RTX);
> +}
> +
> +/* For stack tagging:
> +
> +   Increment the frame tag offset modulo the size a tag can represent.  */
> +void
> +hwasan_increment_frame_tag ()
> +{
> +  uint8_t tag_bits = HWASAN_TAG_SIZE;
> +  gcc_assert (HWASAN_TAG_SIZE
> +             <= sizeof (hwasan_frame_tag_offset) * CHAR_BIT);
> +  hwasan_frame_tag_offset = (hwasan_frame_tag_offset + 1) % (1 << tag_bits);
> +  /* The "background tag" of the stack is zero by definition.
> +     This is the tag that objects like parameters passed on the stack and
> +     spilled registers are given.  It is handy to avoid this tag for objects
> +     whose tags we decide ourselves, partly to ensure that buffer overruns
> +     can't affect these important variables (e.g. saved link register, saved
> +     stack pointer etc) and partly to make debugging easier (everything with a
> +     tag of zero is space allocated automatically by the compiler).
> +
> +     This is not feasible when using random frame tags (the default
> +     configuration for hwasan) since the tag for the given frame is randomly
> +     chosen at runtime.  In order to avoid any tags matching the stack
> +     background we would need to decide tag offsets at runtime instead of
> +     compile time (and pay the resulting performance cost).
> +
> +     When not using random base tags for each frame (i.e. when compiled with
> +     `--param hwasan-random-frame-tag=0`) the base tag for each frame is zero.
> +     This means the tag that each object gets is equal to the
> +     hwasan_frame_tag_offset used in determining it.
> +     When this is the case we *can* ensure no object gets the tag of zero by
> +     simply ensuring no object has the hwasan_frame_tag_offset of zero.
> +
> +     There is the extra complication that we only record the
> +     hwasan_frame_tag_offset here (which is the offset from the tag stored in
> +     the stack pointer).  In the kernel, the tag in the stack pointer is 0xff
> +     rather than zero.  This does not cause problems since tags of 0xff are
> +     never checked in the kernel.  As mentioned at the beginning of this
> +     comment the background tag of the stack is zero by definition, which means
> +     that for the kernel we should skip offsets of both 0 and 1 from the stack
> +     pointer.  Avoiding the offset of 0 ensures we use a tag which will be
> +     checked; avoiding the offset of 1 ensures we use a tag that is not the
> +     same as the background.  */
> +  if (hwasan_frame_tag_offset == 0 && ! param_hwasan_random_frame_tag)
> +    hwasan_frame_tag_offset += 1;
> +  if (hwasan_frame_tag_offset == 1 && ! param_hwasan_random_frame_tag
> +      && sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS))
> +    hwasan_frame_tag_offset += 1;
> +}
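
To check my reading of the wrap-around and skip rules above, here is a minimal standalone sketch. It is not GCC code: `TagConfig` and `next_tag_offset` are hypothetical names, and the three fields only stand in for HWASAN_TAG_SIZE, param_hwasan_random_frame_tag and the SANITIZE_KERNEL_HWADDRESS check.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for compiler state; not GCC's real declarations.
struct TagConfig {
  unsigned tag_bits;      // plays the role of HWASAN_TAG_SIZE
  bool random_frame_tag;  // plays the role of param_hwasan_random_frame_tag
  bool kernel;            // plays the role of SANITIZE_KERNEL_HWADDRESS
};

// Advance the per-frame tag offset following the logic quoted above:
// increment modulo 2^tag_bits, then skip offsets that are being avoided.
inline uint8_t next_tag_offset (uint8_t offset, const TagConfig &cfg)
{
  assert (cfg.tag_bits >= 1 && cfg.tag_bits <= 8);
  offset = static_cast<uint8_t> ((offset + 1u) % (1u << cfg.tag_bits));
  // With a fixed (zero) base tag, offset 0 would give the background tag.
  if (offset == 0 && !cfg.random_frame_tag)
    offset += 1;
  // In the kernel the base tag is 0xff, so offset 1 maps to the background.
  if (offset == 1 && !cfg.random_frame_tag && cfg.kernel)
    offset += 1;
  return offset;
}
```

With 8-bit tags and random frame tags the offset simply wraps from 255 to 0. Without random frame tags it wraps to 1 in userspace and to 2 for the kernel.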
> +
> +/* Clear internal state for the next function.
> +   This function is called before variables on the stack get expanded, in
> +   `init_vars_expansion`.  */
> +void
> +hwasan_record_frame_init ()
> +{
> +  delete asan_used_labels;
> +  asan_used_labels = NULL;
> +
> +  /* If this isn't the case then some stack variable was recorded *before*
> +     hwasan_record_frame_init is called, yet *after* the hwasan prologue for
> +     the previous frame was emitted.  Such stack variables would not have
> +     their shadow stack filled in.  */
> +  gcc_assert (hwasan_tagged_stack_vars.is_empty ());
> +  hwasan_frame_base_ptr = NULL_RTX;
> +  hwasan_frame_base_init_seq = NULL;
> +
> +  /* When not using a random frame tag we can avoid the background stack
> +     color which gives the user a little better debug output upon a crash.
> +     Meanwhile, when using a random frame tag it will be nice to avoid adding
> +     tags for the first object since that is unnecessary extra work.
> +     Hence set the initial hwasan_frame_tag_offset to be 0 if using a random
> +     frame tag and 1 otherwise.
> +
> +     As described in hwasan_increment_frame_tag, in the kernel the stack
> +     pointer has the tag 0xff.  That means that to avoid 0xff and 0 (the tag
> +     which the kernel does not check and the background tag respectively) we
> +     start with a tag offset of 2.  */
> +  hwasan_frame_tag_offset = param_hwasan_random_frame_tag
> +    ? 0
> +    : sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS) ? 2 : 1;
> +}
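
The choice of starting offset is easier to see with numbers. The sketch below assumes 8-bit tags and uses hypothetical helper names (`initial_tag_offset`, `object_tag`). It only mirrors the ternary above and the "base tag plus offset" rule described in the comments.

```cpp
#include <cstdint>

// Mirrors the ternary above: 0 with random frame tags, otherwise 2 for the
// kernel and 1 for userspace.
inline uint8_t initial_tag_offset (bool random_frame_tag, bool kernel)
{
  if (random_frame_tag)
    return 0;
  return kernel ? 2 : 1;
}

// Tag an object ends up with, assuming 8-bit tags: the sum wraps modulo 256.
inline uint8_t object_tag (uint8_t base_tag, uint8_t offset)
{
  return static_cast<uint8_t> (base_tag + offset);
}
```

For the kernel the base tag is 0xff, so offsets 0, 1 and 2 give tags 0xff (never checked), 0x00 (background) and 0x01. Starting at 2 therefore yields the first usable tag.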
> +
> +/* For stack tagging:
> +   (Emits HWASAN equivalent of what is emitted by
> +   `asan_emit_stack_protection`).
> +
> +   Emits the extra prologue code to set the shadow stack as required for HWASAN
> +   stack instrumentation.
> +
> +   Uses the vector of recorded stack variables hwasan_tagged_stack_vars.  When
> +   this function has completed hwasan_tagged_stack_vars is empty and all
> +   objects it had pointed to are deallocated.  */
> +void
> +hwasan_emit_prologue ()
> +{
> +  /* We need untagged base pointers since libhwasan only accepts untagged
> +    pointers in __hwasan_tag_memory.  We need the tagged base pointer to obtain
> +    the base tag for an offset.  */
> +
> +  if (hwasan_tagged_stack_vars.is_empty ())
> +    return;
> +
> +  poly_int64 bot = 0, top = 0;
> +  for (hwasan_stack_var &cur : hwasan_tagged_stack_vars)
> +    {
> +      poly_int64 nearest = cur.nearest_offset;
> +      poly_int64 farthest = cur.farthest_offset;
> +
> +      if (known_ge (nearest, farthest))
> +       {
> +         top = nearest;
> +         bot = farthest;
> +       }
> +      else
> +       {
> +         /* Given how these values are calculated, one must be known greater
> +            than the other.  */
> +         gcc_assert (known_le (nearest, farthest));
> +         top = farthest;
> +         bot = nearest;
> +       }
> +      poly_int64 size = (top - bot);
> +
> +      /* Assert the edge of each variable is aligned to the HWASAN tag granule
> +        size.  */
> +      gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
> +      gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
> +      gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
> +
> +      rtx fn = init_one_libfunc ("__hwasan_tag_memory");
> +      rtx base_tag = targetm.memtag.extract_tag (cur.tagged_base, NULL_RTX);
> +      rtx tag = plus_constant (QImode, base_tag, cur.tag_offset);
> +      tag = hwasan_truncate_to_tag_size (tag, NULL_RTX);
> +
> +      rtx bottom = convert_memory_address (ptr_mode,
> +                                          plus_constant (Pmode,
> +                                                         cur.untagged_base,
> +                                                         bot));
> +      emit_library_call (fn, LCT_NORMAL, VOIDmode,
> +                        bottom, ptr_mode,
> +                        tag, QImode,
> +                        gen_int_mode (size, ptr_mode), ptr_mode);
> +    }
> +  /* Clear the stack vars, we've emitted the prologue for them all now.  */
> +  hwasan_tagged_stack_vars.truncate (0);
> +}
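
A small model of the bottom/size computation in the loop above, using plain integers instead of poly_int64. The names `TagRegion` and `tag_region` are made up for illustration, and the granule size of 16 is the documented default rather than something queried from a target.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr int64_t kGranule = 16;  // assumed default TARGET_MEMTAG_GRANULE_SIZE

// Region handed to __hwasan_tag_memory, as offsets from the untagged base.
struct TagRegion {
  int64_t bottom;  // lowest offset of the object
  int64_t size;    // number of bytes to tag
};

// Order the two recorded offsets and derive the region, whichever way the
// frame grows.  Both edges must sit on a granule boundary.
inline TagRegion tag_region (int64_t nearest, int64_t farthest)
{
  int64_t bot = std::min (nearest, farthest);
  int64_t top = std::max (nearest, farthest);
  assert (bot % kGranule == 0 && top % kGranule == 0);
  return {bot, top - bot};
}
```

For a downward-growing frame, nearest = -16 and farthest = -48 give bottom = -48 and size = 32. For an upward-growing frame the two inputs swap roles and the result has the same shape.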
> +
> +/* For stack tagging:
> +
> +   Return RTL insns to clear the tags between DYNAMIC and VARS pointers
> +   into the stack.  These instructions should be emitted at the end of
> +   every function.
> +
> +   If `dynamic` is NULL_RTX then no insns are returned.  */
> +rtx_insn *
> +hwasan_emit_untag_frame (rtx dynamic, rtx vars)
> +{
> +  if (! dynamic)
> +    return NULL;
> +
> +  start_sequence ();
> +
> +  dynamic = convert_memory_address (ptr_mode, dynamic);
> +  vars = convert_memory_address (ptr_mode, vars);
> +
> +  rtx top_rtx;
> +  rtx bot_rtx;
> +  if (FRAME_GROWS_DOWNWARD)
> +    {
> +      top_rtx = vars;
> +      bot_rtx = dynamic;
> +    }
> +  else
> +    {
> +      top_rtx = dynamic;
> +      bot_rtx = vars;
> +    }
> +
> +  rtx size_rtx = expand_simple_binop (ptr_mode, MINUS, top_rtx, bot_rtx,
> +                                     NULL_RTX, /* unsignedp = */0,
> +                                     OPTAB_DIRECT);
> +
> +  rtx fn = init_one_libfunc ("__hwasan_tag_memory");
> +  emit_library_call (fn, LCT_NORMAL, VOIDmode,
> +                    bot_rtx, ptr_mode,
> +                    HWASAN_STACK_BACKGROUND, QImode,
> +                    size_rtx, ptr_mode);
> +
> +  do_pending_stack_adjust ();
> +  rtx_insn *insns = get_insns ();
> +  end_sequence ();
> +  return insns;
> +}
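
For reference, this is the range computation as I understand it, modelled on plain addresses. `UntagRange` and `untag_range` are hypothetical names; the real code works on RTL and emits a library call rather than returning a struct.

```cpp
#include <cstdint>

// Arguments that would be passed to __hwasan_tag_memory when clearing tags.
struct UntagRange {
  uint64_t bottom;  // lowest address to retag with the background tag
  uint64_t size;    // number of bytes covered
};

// Pick which of the two pointers is the low end based on frame direction.
inline UntagRange untag_range (uint64_t dynamic, uint64_t vars,
                               bool frame_grows_downward)
{
  uint64_t top = frame_grows_downward ? vars : dynamic;
  uint64_t bot = frame_grows_downward ? dynamic : vars;
  return {bot, top - bot};
}
```

Either way the call covers the bytes between the two pointers, starting from the lower one.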
> +
> +/* Needs to be GTY(()), because cgraph_build_static_cdtor may
> +   invoke ggc_collect.  */
> +static GTY(()) tree hwasan_ctor_statements;
> +
> +/* Insert module initialization into this TU.  This initialization calls the
> +   initialization code for libhwasan.  */
> +void
> +hwasan_finish_file (void)
> +{
> +  /* Do not emit constructor initialization for the kernel.
> +     (the kernel has its own initialization already).  */
> +  if (flag_sanitize & SANITIZE_KERNEL_HWADDRESS)
> +    return;
> +
> +  /* Avoid instrumenting code in the hwasan constructors/destructors.  */
> +  flag_sanitize &= ~SANITIZE_HWADDRESS;
> +  int priority = MAX_RESERVED_INIT_PRIORITY - 1;
> +  tree fn = builtin_decl_implicit (BUILT_IN_HWASAN_INIT);
> +  append_to_statement_list (build_call_expr (fn, 0), &hwasan_ctor_statements);
> +  cgraph_build_static_cdtor ('I', hwasan_ctor_statements, priority);
> +  flag_sanitize |= SANITIZE_HWADDRESS;
> +}
> +
> +/* For stack tagging:
> +
> +   Truncate `tag` to the number of bits that a tag uses (i.e. to
> +   HWASAN_TAG_SIZE).  Store the result in `target` if it's convenient.  */
> +rtx
> +hwasan_truncate_to_tag_size (rtx tag, rtx target)
> +{
> +  gcc_assert (GET_MODE (tag) == QImode);
> +  if (HWASAN_TAG_SIZE != GET_MODE_PRECISION (QImode))
> +    {
> +      gcc_assert (GET_MODE_PRECISION (QImode) > HWASAN_TAG_SIZE);
> +      rtx mask = gen_int_mode ((HOST_WIDE_INT_1U << HWASAN_TAG_SIZE) - 1,
> +                              QImode);
> +      tag = expand_simple_binop (QImode, AND, tag, mask, target,
> +                                /* unsignedp = */1, OPTAB_WIDEN);
> +      gcc_assert (tag);
> +    }
> +  return tag;
> +}
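
The masking step on its own, as a standalone sketch. It operates on a plain byte rather than a QImode rtx, and `truncate_to_tag_size` here is only a stand-in for the function above with the tag width passed explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low tag_bits bits of a byte-sized tag value.
inline uint8_t truncate_to_tag_size (uint8_t tag, unsigned tag_bits)
{
  assert (tag_bits >= 1 && tag_bits <= 8);
  if (tag_bits == 8)
    return tag;  // the tag already fills the byte, nothing to mask
  uint8_t mask = static_cast<uint8_t> ((1u << tag_bits) - 1);
  return static_cast<uint8_t> (tag & mask);
}
```

With 4-bit tags 0xAB becomes 0x0B, while with the default 8-bit tags the value passes through unchanged.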
> +
>  #include "gt-asan.h"
> diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
> index 4a82ee421bef42154ccd88e52f7a19f48b340c73..1ad6657da45cc4976532e1b8bc233f67d8da9ccf 100644
> --- a/gcc/builtin-types.def
> +++ b/gcc/builtin-types.def
> @@ -639,6 +639,8 @@ DEF_FUNCTION_TYPE_3 (BT_FN_PTR_PTR_CONST_SIZE_BOOL,
>                      BT_PTR, BT_PTR, BT_CONST_SIZE, BT_BOOL)
>  DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_PTRMODE,
>                      BT_PTR, BT_SIZE, BT_SIZE, BT_PTRMODE)
> +DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_UINT8_PTRMODE, BT_VOID, BT_PTR, BT_UINT8,
> +                    BT_PTRMODE)
>
>  DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
>                      BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
> diff --git a/gcc/builtins.def b/gcc/builtins.def
> index b4494c712a1751fbb37378f38cc1411d11a37331..97bb5d0b0aee7fa9ee4c82e2d80eae866fc23829 100644
> --- a/gcc/builtins.def
> +++ b/gcc/builtins.def
> @@ -245,6 +245,7 @@ along with GCC; see the file COPYING3.  If not see
>    DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,    \
>                true, true, true, ATTRS, true, \
>               (flag_sanitize & (SANITIZE_ADDRESS | SANITIZE_THREAD \
> +                               | SANITIZE_HWADDRESS \
>                                 | SANITIZE_UNDEFINED \
>                                 | SANITIZE_UNDEFINED_NONDEFAULT) \
>                || flag_sanitize_coverage))
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 1df6f4bc55a39230c98e58af6c2d765652db8324..231c2ee32362fc3967b1cd7b70bd330ce49648d3 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -376,15 +376,18 @@ align_local_variable (tree decl, bool really_expand)
>         align = GET_MODE_ALIGNMENT (mode);
>      }
>    else
> -    {
> -      align = LOCAL_DECL_ALIGNMENT (decl);
> -      /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
> -        That is done before IPA and could bump alignment based on host
> -        backend even for offloaded code which wants different
> -        LOCAL_DECL_ALIGNMENT.  */
> -      if (really_expand)
> -       SET_DECL_ALIGN (decl, align);
> -    }
> +    align = LOCAL_DECL_ALIGNMENT (decl);
> +
> +  if (hwasan_sanitize_stack_p ())
> +    align = MAX (align, (unsigned) HWASAN_TAG_GRANULE_SIZE * BITS_PER_UNIT);
> +
> +  if (TREE_CODE (decl) != SSA_NAME && really_expand)
> +    /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
> +       That is done before IPA and could bump alignment based on host
> +       backend even for offloaded code which wants different
> +       LOCAL_DECL_ALIGNMENT.  */
> +    SET_DECL_ALIGN (decl, align);
> +
>    return align / BITS_PER_UNIT;
>  }
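
Since `align` is in bits here while the granule size is in bytes, a tiny numeric model may help anyone reading along. The constants are assumptions (8 bits per unit, 16-byte granule) and `local_align_bytes` is a hypothetical name.

```cpp
#include <algorithm>

constexpr unsigned kBitsPerUnit = 8;    // assumed value of BITS_PER_UNIT
constexpr unsigned kGranuleBytes = 16;  // assumed tag granule size in bytes

// Alignment in bytes of a local, given its natural alignment in bits.
// With stack tagging enabled it is raised to at least one granule.
inline unsigned local_align_bytes (unsigned decl_align_bits, bool hwasan_stack)
{
  unsigned align = decl_align_bits;
  if (hwasan_stack)
    align = std::max (align, kGranuleBytes * kBitsPerUnit);
  return align / kBitsPerUnit;
}
```

A 32-bit aligned int goes from 4 to 16 bytes of alignment when tagging is on, and a 256-bit aligned object keeps its 32 bytes.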
>
> @@ -428,6 +431,14 @@ alloc_stack_frame_space (poly_int64 size, unsigned HOST_WIDE_INT align)
>    return offset;
>  }
>
> +/* Ensure that the stack is aligned to ALIGN bytes.
> +   Return the new frame offset.  */
> +static poly_int64
> +align_frame_offset (unsigned HOST_WIDE_INT align)
> +{
> +  return alloc_stack_frame_space (0, align);
> +}
> +
>  /* Accumulate DECL into STACK_VARS.  */
>
>  static void
> @@ -1004,7 +1015,12 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>    /* If this fails, we've overflowed the stack frame.  Error nicely?  */
>    gcc_assert (known_eq (offset, trunc_int_for_mode (offset, Pmode)));
>
> -  x = plus_constant (Pmode, base, offset);
> +  if (hwasan_sanitize_stack_p ())
> +    x = targetm.memtag.add_tag (base, offset,
> +                               hwasan_current_frame_tag ());
> +  else
> +    x = plus_constant (Pmode, base, offset);
> +
>    x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
>                    ? TYPE_MODE (TREE_TYPE (decl))
>                    : DECL_MODE (decl), x);
> @@ -1013,7 +1029,7 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>       If it is we generate stack slots only accidentally so it isn't as
>       important, we'll simply set the alignment directly on the MEM.  */
>
> -  if (base == virtual_stack_vars_rtx)
> +  if (stack_vars_base_reg_p (base))
>      offset -= frame_phase;
>    align = known_alignment (offset);
>    align *= BITS_PER_UNIT;
> @@ -1056,13 +1072,13 @@ public:
>  /* A subroutine of expand_used_vars.  Give each partition representative
>     a unique location within the stack frame.  Update each partition member
>     with that location.  */
> -
>  static void
>  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>  {
>    size_t si, i, j, n = stack_vars_num;
>    poly_uint64 large_size = 0, large_alloc = 0;
>    rtx large_base = NULL;
> +  rtx large_untagged_base = NULL;
>    unsigned large_align = 0;
>    bool large_allocation_done = false;
>    tree decl;
> @@ -1113,7 +1129,7 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>      {
>        rtx base;
>        unsigned base_align, alignb;
> -      poly_int64 offset;
> +      poly_int64 offset = 0;
>
>        i = stack_vars_sorted[si];
>
> @@ -1134,10 +1150,33 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>        if (pred && !pred (i))
>         continue;
>
> +      base = (hwasan_sanitize_stack_p ()
> +             ? hwasan_frame_base ()
> +             : virtual_stack_vars_rtx);
>        alignb = stack_vars[i].alignb;
>        if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
>         {
> -         base = virtual_stack_vars_rtx;
> +         poly_int64 hwasan_orig_offset;
> +         if (hwasan_sanitize_stack_p ())
> +           {
> +             /* There must be no tag granule "shared" between different
> +                objects.  This means that no HWASAN_TAG_GRANULE_SIZE byte
> +                chunk can have more than one object in it.
> +
> +                We ensure this by forcing the end of the last bit of data to
> +                be aligned to HWASAN_TAG_GRANULE_SIZE bytes here, and setting
> +                the start of each variable to be aligned to
> +                HWASAN_TAG_GRANULE_SIZE bytes in `align_local_variable`.
> +
> +                We can't align just one of the start or end, since there are
> +                untagged things stored on the stack which we do not align to
> +                HWASAN_TAG_GRANULE_SIZE bytes.  If we only aligned the start
> +                or the end of tagged objects then untagged objects could end
> +                up sharing the first granule of a tagged object or sharing the
> +                last granule of a tagged object respectively.  */
> +             hwasan_orig_offset = align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +             gcc_assert (stack_vars[i].alignb >= HWASAN_TAG_GRANULE_SIZE);
> +           }
>           /* ASAN description strings don't yet have a syntax for expressing
>              polynomial offsets.  */
>           HOST_WIDE_INT prev_offset;
> @@ -1148,7 +1187,7 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>             {
>               if (data->asan_vec.is_empty ())
>                 {
> -                 alloc_stack_frame_space (0, ASAN_RED_ZONE_SIZE);
> +                 align_frame_offset (ASAN_RED_ZONE_SIZE);
>                   prev_offset = frame_offset.to_constant ();
>                 }
>               prev_offset = align_base (prev_offset,
> @@ -1216,6 +1255,24 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>             {
>               offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
>               base_align = crtl->max_used_stack_slot_alignment;
> +
> +             if (hwasan_sanitize_stack_p ())
> +               {
> +                 /* Align again since the point of this alignment is to handle
> +                    the "end" of the object (i.e. smallest address after the
> +                    stack object).  For FRAME_GROWS_DOWNWARD that requires
> +                    aligning the stack before allocating, but for a frame that
> +                    grows upwards that requires aligning the stack after
> +                    allocation.
> +
> +                    Use `frame_offset` to record the offset value rather than
> +                    offset since the `frame_offset` describes the extent
> +                    allocated for this particular variable while `offset`
> +                    describes the address that this variable starts at.  */
> +                 align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +                 hwasan_record_stack_var (virtual_stack_vars_rtx, base,
> +                                          hwasan_orig_offset, frame_offset);
> +               }
>             }
>         }
>        else
> @@ -1236,14 +1293,33 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>               loffset = alloc_stack_frame_space
>                 (rtx_to_poly_int64 (large_allocsize),
>                  PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
> -             large_base = get_dynamic_stack_base (loffset, large_align);
> +             large_base = get_dynamic_stack_base (loffset, large_align, base);
>               large_allocation_done = true;
>             }
> -         gcc_assert (large_base != NULL);
>
> +         gcc_assert (large_base != NULL);
>           large_alloc = aligned_upper_bound (large_alloc, alignb);
>           offset = large_alloc;
>           large_alloc += stack_vars[i].size;
> +         if (hwasan_sanitize_stack_p ())
> +           {
> +             /* An object with a large alignment requirement means that the
> +                alignment requirement is greater than the required alignment
> +                for tags.  */
> +             if (!large_untagged_base)
> +               large_untagged_base
> +                 = targetm.memtag.untagged_pointer (large_base, NULL_RTX);
> +             /* Ensure the end of the variable is also aligned correctly.  */
> +             poly_int64 align_again
> +               = aligned_upper_bound (large_alloc, HWASAN_TAG_GRANULE_SIZE);
> +             /* For large allocations we always allocate a chunk of space
> +                (which is addressed by large_untagged_base/large_base) and
> +                then use positive offsets from that.  Hence the farthest
> +                offset is `align_again` and the nearest offset from the base
> +                is `offset`.  */
> +             hwasan_record_stack_var (large_untagged_base, large_base,
> +                                      offset, align_again);
> +           }
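
A rough model of the offsets recorded for large-alignment objects, as I read this hunk. It uses plain unsigned integers; `align_up`, `LargeSlot` and `place_large` are hypothetical names and the granule size is assumed to be 16.

```cpp
#include <cstdint>

constexpr uint64_t kGranule = 16;  // assumed tag granule size in bytes

// Round v up to a multiple of a (a must be a power of two).
inline uint64_t align_up (uint64_t v, uint64_t a)
{
  return (v + a - 1) & ~(a - 1);
}

struct LargeSlot {
  uint64_t nearest;   // start offset of the object within the large area
  uint64_t farthest;  // granule-aligned end of the tagged region
};

// Place one object in the large-alignment area, updating the running total
// in the same way the quoted code updates large_alloc.
inline LargeSlot place_large (uint64_t &large_alloc, uint64_t size,
                              uint64_t alignb)
{
  large_alloc = align_up (large_alloc, alignb);
  uint64_t offset = large_alloc;
  large_alloc += size;
  return {offset, align_up (large_alloc, kGranule)};
}
```

Note that the running total is not bumped to the granule-aligned end. In this model the next object still starts past it, because its own alignment is at least one granule.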
>
>           base = large_base;
>           base_align = large_align;
> @@ -1254,9 +1330,10 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>        for (j = i; j != EOC; j = stack_vars[j].next)
>         {
>           expand_one_stack_var_at (stack_vars[j].decl,
> -                                  base, base_align,
> -                                  offset);
> +                                  base, base_align, offset);
>         }
> +      if (hwasan_sanitize_stack_p ())
> +       hwasan_increment_frame_tag ();
>      }
>
>    gcc_assert (known_eq (large_alloc, large_size));
> @@ -1347,10 +1424,37 @@ expand_one_stack_var_1 (tree var)
>    /* We handle highly aligned variables in expand_stack_vars.  */
>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
>
> -  offset = alloc_stack_frame_space (size, byte_align);
> +  rtx base;
> +  if (hwasan_sanitize_stack_p ())
> +    {
> +      /* Allocate zero bytes to align the stack.  */
> +      poly_int64 hwasan_orig_offset
> +       = align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +      offset = alloc_stack_frame_space (size, byte_align);
> +      align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +      base = hwasan_frame_base ();
> +      /* Use `frame_offset` to automatically account for machines where the
> +        frame grows upwards.
> +
> +        `offset` will always point to the "start" of the stack object, which
> +        will be the smallest address.  For ! FRAME_GROWS_DOWNWARD this is
> +        *not* the "furthest" offset from the base delimiting the current
> +        stack object.  `frame_offset` will always delimit the extent of the
> +        frame.  */
> +      hwasan_record_stack_var (virtual_stack_vars_rtx, base,
> +                              hwasan_orig_offset, frame_offset);
> +    }
> +  else
> +    {
> +      offset = alloc_stack_frame_space (size, byte_align);
> +      base = virtual_stack_vars_rtx;
> +    }
>
> -  expand_one_stack_var_at (var, virtual_stack_vars_rtx,
> +  expand_one_stack_var_at (var, base,
>                            crtl->max_used_stack_slot_alignment, offset);
> +
> +  if (hwasan_sanitize_stack_p ())
> +    hwasan_increment_frame_tag ();
>  }
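
The align, allocate, align sequence can be modelled in a few lines. This sketch assumes a downward-growing frame, ignores frame_phase, and uses hypothetical names throughout (`Frame`, `alloc`, `alloc_tagged`, `Extent`). It is not a copy of alloc_stack_frame_space.

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t kGranule = 16;  // assumed tag granule size in bytes

struct Frame {
  int64_t offset = 0;  // downward-growing frame: offsets are <= 0
};

// Round v towards negative infinity to a multiple of align (power of two).
inline int64_t align_down (int64_t v, int64_t align)
{
  assert (align > 0 && (align & (align - 1)) == 0);
  return v & -align;
}

// Simplified allocation: move the frame offset down by size, then align it.
inline int64_t alloc (Frame &f, int64_t size, int64_t align)
{
  f.offset = align_down (f.offset - size, align);
  return f.offset;
}

struct Extent {
  int64_t nearest;   // granule-aligned edge closest to the frame base
  int64_t farthest;  // granule-aligned edge furthest from the frame base
  int64_t start;     // address offset where the object itself begins
};

// Align before and after the allocation so no granule is shared.
inline Extent alloc_tagged (Frame &f, int64_t size, int64_t align)
{
  int64_t nearest = alloc (f, 0, kGranule);  // align before the object
  int64_t start = alloc (f, size, align);    // the object itself
  alloc (f, 0, kGranule);                    // align after the object
  return {nearest, f.offset, start};
}
```

If a 4-byte untagged slot is allocated first, a following 20-byte tagged object gets nearest = -16, start = -48 and farthest = -48, so the untagged slot and the tagged object never share a granule.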
>
>  /* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> @@ -1950,6 +2054,8 @@ init_vars_expansion (void)
>    /* Initialize local stack smashing state.  */
>    has_protected_decls = false;
>    has_short_buffer = false;
> +  if (hwasan_sanitize_stack_p ())
> +    hwasan_record_frame_init ();
>  }
>
>  /* Free up stack variable graph data.  */
> @@ -2277,10 +2383,26 @@ expand_used_vars (void)
>        expand_stack_vars (NULL, &data);
>      }
>
> +  if (hwasan_sanitize_stack_p ())
> +    hwasan_emit_prologue ();
>    if (asan_sanitize_allocas_p () && cfun->calls_alloca)
>      var_end_seq = asan_emit_allocas_unpoison (virtual_stack_dynamic_rtx,
>                                               virtual_stack_vars_rtx,
>                                               var_end_seq);
> +  else if (hwasan_sanitize_allocas_p () && cfun->calls_alloca)
> +    /* When using out-of-line instrumentation we only want to emit one function
> +       call for clearing the tags in a region of shadow stack.  When there are
> +       alloca calls in this frame we emit that call using
> +       virtual_stack_dynamic_rtx; otherwise (the branch below) we use the rtx
> +       returned by hwasan_get_frame_extent.  */
> +    var_end_seq = hwasan_emit_untag_frame (virtual_stack_dynamic_rtx,
> +                                          virtual_stack_vars_rtx);
> +  else if (hwasan_sanitize_stack_p ())
> +    /* If no variables were stored on the stack, `hwasan_get_frame_extent`
> +       will return NULL_RTX and hence `hwasan_emit_untag_frame` will return
> +       NULL (i.e. an empty sequence).  */
> +    var_end_seq = hwasan_emit_untag_frame (hwasan_get_frame_extent (),
> +                                          virtual_stack_vars_rtx);
>
>    fini_vars_expansion ();
>
> @@ -6641,6 +6763,9 @@ pass_expand::execute (function *fun)
>        emit_insn_after (var_ret_seq, after);
>      }
>
> +  if (hwasan_sanitize_stack_p ())
> +    hwasan_maybe_emit_frame_base_init ();
> +
>    /* Zap the tree EH table.  */
>    set_eh_throw_stmt_table (fun, NULL);
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 298fe4b295e2f81d679786f21f499183bc07078f..f06d5e8911241d3fa0f2c7a101a3a2468defd227 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -12230,3 +12230,60 @@ work.
>  At preset, this feature does not support address spaces.  It also requires
>  @code{Pmode} to be the same as @code{ptr_mode}.
>  @end deftypefn
> +
> +@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_TAG_SIZE ()
> +Return the size of a tag (in bits) for this platform.
> +
> +The default returns 8.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_GRANULE_SIZE ()
> +Return the size in real memory that each byte in shadow memory refers to.
> +I.e. if a variable is @var{X} bytes long in memory, then this hook should
> +return the value @var{Y} such that the tag in shadow memory spans
> +@var{X}/@var{Y} bytes.
> +
> +Most variables will need to be aligned to this amount since two variables
> +that are neighbors in memory and share a tag granule would need to share
> +the same tag.
> +
> +The default returns 16.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_INSERT_RANDOM_TAG (rtx @var{untagged}, rtx @var{target})
> +Return an RTX representing the value of @var{untagged} but with a
> +(possibly) random tag in it.
> +Put that value into @var{target} if it is convenient to do so.
> +This function is used to generate a tagged base for the current stack frame.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_ADD_TAG (rtx @var{base}, poly_int64 @var{addr_offset}, uint8_t @var{tag_offset})
> +Return an RTX that represents the result of adding @var{addr_offset} to
> +the address in pointer @var{base} and @var{tag_offset} to the tag in pointer
> +@var{base}.
> +The resulting RTX must either be a valid memory address or be able to get
> +put into an operand with @code{force_operand}.
> +
> +Unlike other memtag hooks, this must return an expression and not emit any
> +RTL.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_SET_TAG (rtx @var{untagged_base}, rtx @var{tag}, rtx @var{target})
> +Return an RTX representing @var{untagged_base} but with the tag @var{tag}.
> +Try and store this in @var{target} if convenient.
> +@var{untagged_base} is required to have a zero tag when this hook is called.
> +The default of this hook is to set the top byte of @var{untagged_base} to
> +@var{tag}.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_EXTRACT_TAG (rtx @var{tagged_pointer}, rtx @var{target})
> +Return an RTX representing the tag stored in @var{tagged_pointer}.
> +Store the result in @var{target} if it is convenient.
> +The default represents the top byte of the original pointer.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_UNTAGGED_POINTER (rtx @var{tagged_pointer}, rtx @var{target})
> +Return an RTX representing @var{tagged_pointer} with its tag set to zero.
> +Store the result in @var{target} if convenient.
> +The default clears the top byte of the original pointer.
> +@end deftypefn
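
To make the documented defaults concrete, here is a sketch of the top-byte behaviour on plain 64-bit integers. It assumes 64-bit pointers with an 8-bit tag in bits 56..63. The function names echo the hooks but these are not the hook implementations, and the `add_tag` model in particular is my assumption, since the documentation above does not describe its default.

```cpp
#include <cstdint>

constexpr unsigned kTagShift = 56;  // assumed: 64-bit pointer, tag in top byte

// Put TAG in the top byte; the input is required to have a zero tag.
inline uint64_t set_tag (uint64_t untagged, uint8_t tag)
{
  return untagged | (static_cast<uint64_t> (tag) << kTagShift);
}

// Read the tag back out of the top byte.
inline uint8_t extract_tag (uint64_t pointer)
{
  return static_cast<uint8_t> (pointer >> kTagShift);
}

// Clear the top byte, leaving only the address bits.
inline uint64_t untagged_pointer (uint64_t pointer)
{
  return pointer & ~(static_cast<uint64_t> (0xff) << kTagShift);
}

// Assumed model of add_tag: offset the address and the tag independently,
// with the tag wrapping modulo 256.
inline uint64_t add_tag (uint64_t base, int64_t addr_offset,
                         uint8_t tag_offset)
{
  uint8_t tag = static_cast<uint8_t> (extract_tag (base) + tag_offset);
  uint64_t addr = untagged_pointer (base)
                  + static_cast<uint64_t> (addr_offset);
  return set_tag (untagged_pointer (addr), tag);
}
```

Tagging 0x0000ffff12345000 with 0x2a and then applying add_tag with an address offset of -32 and a tag offset of 3 gives a pointer whose tag is 0x2d and whose untagged address is 0x0000ffff12344fe0.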
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 8fbd36e2bf31e098f7827ce331fd7059c8a747bc..b08923c8f28455fe77e061625e78ed1bf538e792 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -8186,3 +8186,17 @@ maintainer is familiar with.
>  @hook TARGET_RUN_TARGET_SELFTESTS
>
>  @hook TARGET_MEMTAG_CAN_TAG_ADDRESSES
> +
> +@hook TARGET_MEMTAG_TAG_SIZE
> +
> +@hook TARGET_MEMTAG_GRANULE_SIZE
> +
> +@hook TARGET_MEMTAG_INSERT_RANDOM_TAG
> +
> +@hook TARGET_MEMTAG_ADD_TAG
> +
> +@hook TARGET_MEMTAG_SET_TAG
> +
> +@hook TARGET_MEMTAG_EXTRACT_TAG
> +
> +@hook TARGET_MEMTAG_UNTAGGED_POINTER
> diff --git a/gcc/explow.h b/gcc/explow.h
> index 0df8c62b82a8bf1d8d6baf0b6fb658e66361a407..581831cb19fdf9e8fd969bb30139e1358279a34d 100644
> --- a/gcc/explow.h
> +++ b/gcc/explow.h
> @@ -106,7 +106,7 @@ extern rtx allocate_dynamic_stack_space (rtx, unsigned, unsigned,
>  extern void get_dynamic_stack_size (rtx *, unsigned, unsigned, HOST_WIDE_INT *);
>
>  /* Returns the address of the dynamic stack space without allocating it.  */
> -extern rtx get_dynamic_stack_base (poly_int64, unsigned);
> +extern rtx get_dynamic_stack_base (poly_int64, unsigned, rtx);
>
>  /* Return an rtx doing runtime alignment to REQUIRED_ALIGN on TARGET.  */
>  extern rtx align_dynamic_address (rtx, unsigned);
> diff --git a/gcc/explow.c b/gcc/explow.c
> index 0fbc6d25b816457a3d13ed45d16b5dd0513cfacd..41c3f6ace49c0e55c080e10b917842b1b21d49eb 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -1583,10 +1583,14 @@ allocate_dynamic_stack_space (rtx size, unsigned size_align,
>     OFFSET is the offset of the area into the virtual stack vars area.
>
>     REQUIRED_ALIGN is the alignment (in bits) required for the region
> -   of memory.  */
> +   of memory.
> +
> +   BASE is the rtx of the base of this virtual stack vars area.
> +   The only time this is not `virtual_stack_vars_rtx` is when tagging pointers
> +   on the stack.  */
>
>  rtx
> -get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
> +get_dynamic_stack_base (poly_int64 offset, unsigned required_align, rtx base)
>  {
>    rtx target;
>
> @@ -1594,7 +1598,7 @@ get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
>      crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
>
>    target = gen_reg_rtx (Pmode);
> -  emit_move_insn (target, virtual_stack_vars_rtx);
> +  emit_move_insn (target, base);
>    target = expand_binop (Pmode, add_optab, target,
>                          gen_int_mode (offset, Pmode),
>                          NULL_RTX, 1, OPTAB_LIB_WIDEN);
> diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
> index a32715ddb92e69b7ca7be28a8f17a369b891bd76..4f854fb994229fd4ed91d3b5cff7c7acff9a55bc 100644
> --- a/gcc/sanitizer.def
> +++ b/gcc/sanitizer.def
> @@ -180,6 +180,12 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_COMPARE, "__sanitizer_ptr_cmp",
>  DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_SUBTRACT, "__sanitizer_ptr_sub",
>                       BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
>
> +/* Hardware Address Sanitizer.  */
> +DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_INIT, "__hwasan_init",
> +                     BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_TAG_MEM, "__hwasan_tag_memory",
> +                     BT_FN_VOID_PTR_UINT8_PTRMODE, ATTR_NOTHROW_LIST)
> +
>  /* Thread Sanitizer */
>  DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_INIT, "__tsan_init",
>                       BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
> diff --git a/gcc/target.def b/gcc/target.def
> index 25f0ae228210f926077020082f129fb2e599f062..44807438431488a5a7aa8f8125d256869e152b68 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -6874,6 +6874,71 @@ At preset, this feature does not support address spaces.  It also requires\n\
>  @code{Pmode} to be the same as @code{ptr_mode}.",
>   bool, (), default_memtag_can_tag_addresses)
>
> +DEFHOOK
> +(tag_size,
> + "Return the size of a tag (in bits) for this platform.\n\
> +\n\
> +The default returns 8.",
> +  uint8_t, (), default_memtag_tag_size)
> +
> +DEFHOOK
> +(granule_size,
> + "Return the size in real memory that each byte in shadow memory refers to.\n\
> +I.e. if a variable is @var{X} bytes long in memory, then this hook should\n\
> +return the value @var{Y} such that the tag in shadow memory spans\n\
> +@var{X}/@var{Y} bytes.\n\
> +\n\
> +Most variables will need to be aligned to this amount since two variables\n\
> +that are neighbors in memory and share a tag granule would need to share\n\
> +the same tag.\n\
> +\n\
> +The default returns 16.",
> +  uint8_t, (), default_memtag_granule_size)
> +
> +DEFHOOK
> +(insert_random_tag,
> + "Return an RTX representing the value of @var{untagged} but with a\n\
> +(possibly) random tag in it.\n\
> +Put that value into @var{target} if it is convenient to do so.\n\
> +This function is used to generate a tagged base for the current stack frame.",
> +  rtx, (rtx untagged, rtx target), default_memtag_insert_random_tag)
> +
> +DEFHOOK
> +(add_tag,
> + "Return an RTX that represents the result of adding @var{addr_offset} to\n\
> +the address in pointer @var{base} and @var{tag_offset} to the tag in pointer\n\
> +@var{base}.\n\
> +The resulting RTX must either be a valid memory address or be able to get\n\
> +put into an operand with @code{force_operand}.\n\
> +\n\
> +Unlike other memtag hooks, this must return an expression and not emit any\n\
> +RTL.",
> +  rtx, (rtx base, poly_int64 addr_offset, uint8_t tag_offset),
> +  default_memtag_add_tag)
> +
> +DEFHOOK
> +(set_tag,
> + "Return an RTX representing @var{untagged_base} but with the tag @var{tag}.\n\
> +Try and store this in @var{target} if convenient.\n\
> +@var{untagged_base} is required to have a zero tag when this hook is called.\n\
> +The default of this hook is to set the top byte of @var{untagged_base} to\n\
> +@var{tag}.",
> +  rtx, (rtx untagged_base, rtx tag, rtx target), default_memtag_set_tag)
> +
> +DEFHOOK
> +(extract_tag,
> + "Return an RTX representing the tag stored in @var{tagged_pointer}.\n\
> +Store the result in @var{target} if it is convenient.\n\
> +The default represents the top byte of the original pointer.",
> +  rtx, (rtx tagged_pointer, rtx target), default_memtag_extract_tag)
> +
> +DEFHOOK
> +(untagged_pointer,
> + "Return an RTX representing @var{tagged_pointer} with its tag set to zero.\n\
> +Store the result in @var{target} if convenient.\n\
> +The default clears the top byte of the original pointer.",
> +  rtx, (rtx tagged_pointer, rtx target), default_memtag_untagged_pointer)
> +
>  HOOK_VECTOR_END (memtag)
>  #undef HOOK_PREFIX
>  #define HOOK_PREFIX "TARGET_"
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index 0065c686978d7120978430013c73b1055aaf95c7..68e8688a32f18481ee61f06879aacff20163105b 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -287,4 +287,12 @@ extern bool speculation_safe_value_not_needed (bool);
>  extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
>
>  extern bool default_memtag_can_tag_addresses ();
> +extern uint8_t default_memtag_tag_size ();
> +extern uint8_t default_memtag_granule_size ();
> +extern rtx default_memtag_insert_random_tag (rtx, rtx);
> +extern rtx default_memtag_add_tag (rtx, poly_int64, uint8_t);
> +extern rtx default_memtag_set_tag (rtx, rtx, rtx);
> +extern rtx default_memtag_extract_tag (rtx, rtx);
> +extern rtx default_memtag_untagged_pointer (rtx, rtx);
> +
>  #endif /* GCC_TARGHOOKS_H */
> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> index 46cb536041d396c32fd08042581d6d5cd5ad0395..e634df3f6c6837e422246a7736c0de4471ce1e77 100644
> --- a/gcc/targhooks.c
> +++ b/gcc/targhooks.c
> @@ -73,6 +73,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "varasm.h"
>  #include "flags.h"
>  #include "explow.h"
> +#include "expmed.h"
>  #include "calls.h"
>  #include "expr.h"
>  #include "output.h"
> @@ -86,6 +87,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "langhooks.h"
>  #include "sbitmap.h"
>  #include "function-abi.h"
> +#include "attribs.h"
> +#include "asan.h"
> +#include "emit-rtl.h"
>
>  bool
>  default_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
> @@ -2415,10 +2419,115 @@ default_speculation_safe_value (machine_mode mode ATTRIBUTE_UNUSED,
>    return result;
>  }
>
> +/* How many bits to shift in order to access the tag bits.
> +   The default is to store the tag in the top 8 bits of a 64 bit pointer, hence
> +   shifting 56 bits will leave just the tag.  */
> +#define HWASAN_SHIFT (GET_MODE_PRECISION (Pmode) - 8)
> +#define HWASAN_SHIFT_RTX GEN_INT (HWASAN_SHIFT)
> +
>  bool
>  default_memtag_can_tag_addresses ()
>  {
>    return false;
>  }
>
> +uint8_t
> +default_memtag_tag_size ()
> +{
> +  return 8;
> +}
> +
> +uint8_t
> +default_memtag_granule_size ()
> +{
> +  return 16;
> +}
> +
> +/* The default implementation of TARGET_MEMTAG_INSERT_RANDOM_TAG.  */
> +rtx
> +default_memtag_insert_random_tag (rtx untagged, rtx target)
> +{
> +  gcc_assert (param_hwasan_instrument_stack);
> +  if (param_hwasan_random_frame_tag)
> +    {
> +      rtx fn = init_one_libfunc ("__hwasan_generate_tag");
> +      rtx new_tag = emit_library_call_value (fn, NULL_RTX, LCT_NORMAL, QImode);
> +      return targetm.memtag.set_tag (untagged, new_tag, target);
> +    }
> +  else
> +    {
> +      /* NOTE: The kernel API does not have __hwasan_generate_tag exposed.
> +        In the future we may add the option to emit random tags with inline
> +        instrumentation instead of function calls.  This would be the same
> +        between the kernel and userland.  */
> +      return untagged;
> +    }
> +}
> +
> +/* The default implementation of TARGET_MEMTAG_ADD_TAG.  */
> +rtx
> +default_memtag_add_tag (rtx base, poly_int64 offset, uint8_t tag_offset)
> +{
> +  /* Need to look into what the most efficient code sequence is.
> +     This is a code sequence that would be emitted *many* times, so we
> +     want it as small as possible.
> +
> +     There are two places where tag overflow is a question:
> +       - Tagging the shadow stack.
> +         (both tagging and untagging).
> +       - Tagging addressable pointers.
> +
> +     We need to ensure both behaviors are the same (i.e. that the tag that
> +     ends up in a pointer after "overflowing" the tag bits with a tag addition
> +     is the same that ends up in the shadow space).
> +
> +     The aim is that the behavior of tag addition should follow modulo
> +     wrapping in both instances.
> +
> +     The libhwasan code doesn't have any path that increments a pointer's tag,
> +     which means it has no opinion on what happens when a tag increment
> +     overflows (and hence we can choose our own behavior).  */
> +
> +  offset += ((uint64_t)tag_offset << HWASAN_SHIFT);
> +  return plus_constant (Pmode, base, offset);
> +}
> +
> +/* The default implementation of TARGET_MEMTAG_SET_TAG.  */
> +rtx
> +default_memtag_set_tag (rtx untagged, rtx tag, rtx target)
> +{
> +  gcc_assert (GET_MODE (untagged) == Pmode && GET_MODE (tag) == QImode);
> +  tag = expand_simple_binop (Pmode, ASHIFT, tag, HWASAN_SHIFT_RTX, NULL_RTX,
> +                            /* unsignedp = */1, OPTAB_WIDEN);
> +  rtx ret = expand_simple_binop (Pmode, IOR, untagged, tag, target,
> +                                /* unsignedp = */1, OPTAB_DIRECT);
> +  gcc_assert (ret);
> +  return ret;
> +}
> +
> +/* The default implementation of TARGET_MEMTAG_EXTRACT_TAG.  */
> +rtx
> +default_memtag_extract_tag (rtx tagged_pointer, rtx target)
> +{
> +  rtx tag = expand_simple_binop (Pmode, LSHIFTRT, tagged_pointer,
> +                                HWASAN_SHIFT_RTX, target,
> +                                /* unsignedp = */0,
> +                                OPTAB_DIRECT);
> +  rtx ret = gen_lowpart (QImode, tag);
> +  gcc_assert (ret);
> +  return ret;
> +}
> +
> +/* The default implementation of TARGET_MEMTAG_UNTAGGED_POINTER.  */
> +rtx
> +default_memtag_untagged_pointer (rtx tagged_pointer, rtx target)
> +{
> +  rtx tag_mask = gen_int_mode ((HOST_WIDE_INT_1U << HWASAN_SHIFT) - 1, Pmode);
> +  rtx untagged_base = expand_simple_binop (Pmode, AND, tagged_pointer,
> +                                          tag_mask, target, true,
> +                                          OPTAB_DIRECT);
> +  gcc_assert (untagged_base);
> +  return untagged_base;
> +}
> +
>  #include "gt-targhooks.h"
> diff --git a/gcc/toplev.c b/gcc/toplev.c
> index 2a3e7c064a5fbb6913481104975ca85615e49f8e..9938b6afbd4fa22898dbc3c29b92061a71810b08 100644
> --- a/gcc/toplev.c
> +++ b/gcc/toplev.c
> @@ -512,6 +512,9 @@ compile_file (void)
>        if (flag_sanitize & SANITIZE_THREAD)
>         tsan_finish_file ();
>
> +      if (flag_sanitize & SANITIZE_HWADDRESS)
> +       hwasan_finish_file ();
> +
>        omp_finish_file ();
>
>        output_shared_constant_pool ();
>
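
The default memtag hooks quoted above (gcc/targhooks.c) come down to a handful of integer operations on the pointer value. The following is only a plain-C model of that arithmetic, not the RTL the hooks emit. It assumes a 64-bit Pmode with an 8-bit tag in the top byte, and the `model_*` names are hypothetical helpers that do not exist in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-bit pointers, 8-bit tag in the top byte, so the shift
   corresponding to HWASAN_SHIFT is 64 - 8 = 56.  */
#define MODEL_TAG_SHIFT 56

/* Models default_memtag_set_tag: OR the tag into the top byte.  As the hook
   documentation requires, UNTAGGED must have a zero tag.  */
static uint64_t
model_set_tag (uint64_t untagged, uint8_t tag)
{
  assert ((untagged >> MODEL_TAG_SHIFT) == 0);
  return untagged | ((uint64_t) tag << MODEL_TAG_SHIFT);
}

/* Models default_memtag_extract_tag: logical shift right, keep low byte.  */
static uint8_t
model_extract_tag (uint64_t tagged)
{
  return (uint8_t) (tagged >> MODEL_TAG_SHIFT);
}

/* Models default_memtag_untagged_pointer: mask off the top byte.  */
static uint64_t
model_untagged_pointer (uint64_t tagged)
{
  return tagged & ((UINT64_C (1) << MODEL_TAG_SHIFT) - 1);
}

/* Models default_memtag_add_tag: fold the tag offset into the address
   offset and do one addition on the whole pointer.  Unsigned arithmetic
   wraps modulo 2^64, so a tag that overflows the top byte wraps modulo
   256, which is the behavior the comment in the hook asks for.  */
static uint64_t
model_add_tag (uint64_t base, int64_t addr_offset, uint8_t tag_offset)
{
  uint64_t offset = (uint64_t) addr_offset
		    + ((uint64_t) tag_offset << MODEL_TAG_SHIFT);
  return base + offset;
}
```

For instance, tagging the (made-up) address 0x0000ffffd0001000 with 0xfe and then calling `model_add_tag` with an address offset of -32 and a tag offset of 3 gives 0x0100ffffd0000fe0: the address moves down by 32 bytes and the tag wraps from 0xfe to 0x01.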
Matthew Malcomson Nov. 24, 2020, 4:45 p.m. UTC | #6
On 24/11/2020 12:30, Hongtao Liu wrote:
> Hi:
>    I'm learning about this patch, and I see one place that might be
> slightly improved.
> 
> +      poly_int64 size = (top - bot);
> +
> +      /* Assert the edge of each variable is aligned to the HWASAN tag granule
> +        size.  */
> +      gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
> +      gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
> +      gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
> +
> 
> The last gcc_assert looks redundant?

Hi

I think you're right.

Just FYI I'm planning on making that change as an obvious fix after the 
patchset as it is now goes in.

That way I can say I ran all my tests on the patch series that I applied 
without going through all the tests again.

Thanks for the catch!

Matthew
> 
> On Sat, Nov 21, 2020 at 2:48 AM Matthew Malcomson via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>>
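
The redundancy discussed in the reply above follows from simple arithmetic: if top = a * G and bot = b * G for a granule size G, then top - bot = (a - b) * G, so the third assertion can never fire once the first two have passed. A small sketch of that argument, using plain int64_t instead of poly_int64 and hypothetical `model_*` helpers (the granule size of 16 is the default from the granule_size hook):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the default granule size of 16 bytes.  */
#define MODEL_GRANULE 16

/* Scalar stand-in for multiple_p on poly_int64 values.  */
static bool
model_multiple_p (int64_t value, int64_t factor)
{
  return value % factor == 0;
}

/* Walk every pair of granule-aligned offsets within LIMIT granules of zero
   (with bot <= top) and count the pairs whose size is NOT a multiple of the
   granule.  The algebra above says this count is always zero.  */
static int
model_count_counterexamples (int limit)
{
  int found = 0;
  for (int t = -limit; t <= limit; t++)
    for (int b = -limit; b <= t; b++)
      {
	int64_t top = (int64_t) t * MODEL_GRANULE;
	int64_t bot = (int64_t) b * MODEL_GRANULE;
	if (!model_multiple_p (top - bot, MODEL_GRANULE))
	  found++;
      }
  return found;
}
```

This only checks the scalar case. The real code works on poly_int64 values, where the same reasoning would have to hold for each coefficient.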

Patch

diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
index 4f60bed3fd6e98b47a3a38aea6eba2a7c320da25..91989f4bb1db6ccff564383777757b896645e541 100644
--- a/config/bootstrap-hwasan.mk
+++ b/config/bootstrap-hwasan.mk
@@ -1,7 +1,11 @@ 
 # This option enables -fsanitize=hwaddress for stage2 and stage3.
+# We need to disable random frame tags for bootstrap since the autoconf check
+# for which direction the stack is growing has UB that a random frame tag
+# breaks.  Running with a random frame tag gives approx. 50% chance of
+# bootstrap comparison diff in libiberty/alloca.c.
 
-STAGE2_CFLAGS += -fsanitize=hwaddress
-STAGE3_CFLAGS += -fsanitize=hwaddress
+STAGE2_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
+STAGE3_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
 POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
 		      -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
 		      -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
diff --git a/gcc/asan.h b/gcc/asan.h
index 114b457ef91c4479d43774bed58c24213196ce12..8d5271e6b575d74da277420798557f3274e966ce 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -34,6 +34,22 @@  extern bool asan_expand_mark_ifn (gimple_stmt_iterator *);
 extern bool asan_expand_poison_ifn (gimple_stmt_iterator *, bool *,
 				    hash_map<tree, tree> &);
 
+extern void hwasan_record_frame_init ();
+extern void hwasan_record_stack_var (rtx, rtx, poly_int64, poly_int64);
+extern void hwasan_emit_prologue ();
+extern rtx_insn *hwasan_emit_untag_frame (rtx, rtx);
+extern rtx hwasan_get_frame_extent ();
+extern rtx hwasan_frame_base ();
+extern void hwasan_maybe_emit_frame_base_init (void);
+extern bool stack_vars_base_reg_p (rtx);
+extern uint8_t hwasan_current_frame_tag ();
+extern void hwasan_increment_frame_tag ();
+extern rtx hwasan_truncate_to_tag_size (rtx, rtx);
+extern void hwasan_finish_file (void);
+extern bool hwasan_sanitize_p (void);
+extern bool hwasan_sanitize_stack_p (void);
+extern bool hwasan_sanitize_allocas_p (void);
+
 extern gimple_stmt_iterator create_cond_insert_point
      (gimple_stmt_iterator *, bool, bool, bool, basic_block *, basic_block *);
 
@@ -75,6 +91,26 @@  extern hash_set <tree> *asan_used_labels;
 
 #define ASAN_USE_AFTER_SCOPE_ATTRIBUTE	"use after scope memory"
 
+/* NOTE: The values below and the hooks under targetm.memtag define an ABI and
+   are hard-coded to these values in libhwasan, hence they can't be changed
+   independently here.  */
+/* How many bits are used to store a tag in a pointer.
+   The default version uses the entire top byte of a pointer (i.e. 8 bits).  */
+#define HWASAN_TAG_SIZE targetm.memtag.tag_size ()
+/* Tag Granule of HWASAN shadow stack.
+   This is the size in real memory that each byte in the shadow memory refers
+   to.  I.e. if a variable is X bytes long in memory then its tag in shadow
+   memory will span X / HWASAN_TAG_GRANULE_SIZE bytes.
+   Most variables will need to be aligned to this amount since two variables
+   that are neighbors in memory and share a tag granule would need to share the
+   same tag (the shared tag granule can only store one tag).  */
+#define HWASAN_TAG_GRANULE_SIZE targetm.memtag.granule_size ()
+/* Define the tag for the stack background.
+   This defines what tag the stack pointer will be and hence what tag all
+   variables that are not given special tags are (e.g. spilled registers,
+   and parameters passed on the stack).  */
+#define HWASAN_STACK_BACKGROUND gen_int_mode (0, QImode)
+
 /* Various flags for Asan builtins.  */
 enum asan_check_flags
 {
diff --git a/gcc/asan.c b/gcc/asan.c
index 0b471afff64ea6a0ffbe0add71333ac688c472c6..157774f4cb666515b5862f48d13d7211f35ffa12 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -257,6 +257,58 @@  hash_set<tree> *asan_handled_variables = NULL;
 
 hash_set <tree> *asan_used_labels = NULL;
 
+/* Global variables for HWASAN stack tagging.  */
+/* hwasan_frame_tag_offset records the offset from the frame base tag that the
+   next object should have.  */
+static uint8_t hwasan_frame_tag_offset = 0;
+/* hwasan_frame_base_ptr is a pointer with the same address as
+   `virtual_stack_vars_rtx` for the current frame, and with the frame base tag
+   stored in it.  N.b. this global RTX does not need to be marked GTY, but is
+   done so anyway.  The need is not there since all uses are in just one pass
+   (cfgexpand) and there are no calls to ggc_collect between the uses.  We mark
+   it GTY(()) anyway to allow the use of the variable later on if needed by
+   future features.  */
+static GTY(()) rtx hwasan_frame_base_ptr = NULL_RTX;
+/* hwasan_frame_base_init_seq is the sequence of RTL insns that will initialize
+   the hwasan_frame_base_ptr.  When the hwasan_frame_base_ptr is requested, we
+   generate this sequence but do not emit it.  If the sequence was created it
+   is emitted once the function body has been expanded.
+
+   This delay is because the frame base pointer may be needed anywhere in the
+   function body, or needed by the expand_used_vars function.  Emitting once in
+   a known place is simpler than requiring the emission of the instructions to
+   know where it should go depending on the first place the hwasan frame
+   base is needed.  */
+static GTY(()) rtx_insn *hwasan_frame_base_init_seq = NULL;
+
+/* Structure defining the extent of one object on the stack that HWASAN needs
+   to tag in the corresponding shadow stack space.
+
+   The range this object spans on the stack is between `untagged_base +
+   nearest_offset` and `untagged_base + farthest_offset`.
+   `tagged_base` is an rtx containing the same value as `untagged_base` but
+   with a random tag stored in the top byte.  We record both `untagged_base`
+   and `tagged_base` so that `hwasan_emit_prologue` can use both without having
+   to emit RTL into the instruction stream to re-calculate one from the other.
+   (`hwasan_emit_prologue` needs to use both bases since the
+   __hwasan_tag_memory call it emits uses an untagged value, and it calculates
+   the tag to store in shadow memory based on the tag_offset plus the tag in
+   tagged_base).  */
+struct hwasan_stack_var
+{
+  rtx untagged_base;
+  rtx tagged_base;
+  poly_int64 nearest_offset;
+  poly_int64 farthest_offset;
+  uint8_t tag_offset;
+};
+
+/* Variable recording all stack variables that HWASAN needs to tag.
+   Does not need to be marked as GTY(()) since every use is in the cfgexpand
+   pass and ggc_collect is not called in the middle of that pass.  */
+static vec<hwasan_stack_var> hwasan_tagged_stack_vars;
+
+
 /* Sets shadow offset to value in string VAL.  */
 
 bool
@@ -1359,6 +1411,28 @@  asan_redzone_buffer::flush_if_full (void)
     flush_redzone_payload ();
 }
 
+/* Returns whether we are tagging pointers and checking those tags on memory
+   access.  */
+bool
+hwasan_sanitize_p ()
+{
+  return sanitize_flags_p (SANITIZE_HWADDRESS);
+}
+
+/* Are we tagging the stack?  */
+bool
+hwasan_sanitize_stack_p ()
+{
+  return (hwasan_sanitize_p () && param_hwasan_instrument_stack);
+}
+
+/* Are we tagging alloca objects?  */
+bool
+hwasan_sanitize_allocas_p (void)
+{
+  return (hwasan_sanitize_stack_p () && param_hwasan_protect_allocas);
+}
+
 /* Insert code to protect stack vars.  The prologue sequence should be emitted
    directly, epilogue sequence returned.  BASE is the register holding the
    stack base, against which OFFSETS array offsets are relative to, OFFSETS
@@ -2908,6 +2982,11 @@  initialize_sanitizer_builtins (void)
     = build_function_type_list (void_type_node, uint64_type_node,
 				ptr_type_node, NULL_TREE);
 
+  tree BT_FN_VOID_PTR_UINT8_PTRMODE
+    = build_function_type_list (void_type_node, ptr_type_node,
+				unsigned_char_type_node,
+				pointer_sized_int_node, NULL_TREE);
+
   tree BT_FN_BOOL_VPTR_PTR_IX_INT_INT[5];
   tree BT_FN_IX_CONST_VPTR_INT[5];
   tree BT_FN_IX_VPTR_IX_INT[5];
@@ -2958,6 +3037,8 @@  initialize_sanitizer_builtins (void)
 #define BT_FN_I16_CONST_VPTR_INT BT_FN_IX_CONST_VPTR_INT[4]
 #define BT_FN_I16_VPTR_I16_INT BT_FN_IX_VPTR_IX_INT[4]
 #define BT_FN_VOID_VPTR_I16_INT BT_FN_VOID_VPTR_IX_INT[4]
+#undef ATTR_NOTHROW_LIST
+#define ATTR_NOTHROW_LIST ECF_NOTHROW
 #undef ATTR_NOTHROW_LEAF_LIST
 #define ATTR_NOTHROW_LEAF_LIST ECF_NOTHROW | ECF_LEAF
 #undef ATTR_TMPURE_NOTHROW_LEAF_LIST
@@ -3709,4 +3790,352 @@  make_pass_asan_O0 (gcc::context *ctxt)
   return new pass_asan_O0 (ctxt);
 }
 
+/* For stack tagging:
+
+   Return the offset from the frame base tag that the "next" expanded object
+   should have.  */
+uint8_t
+hwasan_current_frame_tag ()
+{
+  return hwasan_frame_tag_offset;
+}
+
+/* For stack tagging:
+
+   Return the 'base pointer' for this function.  If that base pointer has not
+   yet been created then we create a register to hold it and record the insns
+   to initialize the register in `hwasan_frame_base_init_seq` for later
+   emission.  */
+rtx
+hwasan_frame_base ()
+{
+  if (! hwasan_frame_base_ptr)
+    {
+      start_sequence ();
+      hwasan_frame_base_ptr =
+	force_reg (Pmode,
+		   targetm.memtag.insert_random_tag (virtual_stack_vars_rtx,
+						     NULL_RTX));
+      hwasan_frame_base_init_seq = get_insns ();
+      end_sequence ();
+    }
+
+  return hwasan_frame_base_ptr;
+}
+
+/* For stack tagging:
+
+   Check whether this RTX is a standard pointer addressing the base of the
+   stack variables for this frame.  Returns true if the RTX is either
+   virtual_stack_vars_rtx or hwasan_frame_base_ptr.  */
+bool
+stack_vars_base_reg_p (rtx base)
+{
+  return base == virtual_stack_vars_rtx || base == hwasan_frame_base_ptr;
+}
+
+/* For stack tagging:
+
+   Emit frame base initialisation.
+   If hwasan_frame_base has been used before here then
+   hwasan_frame_base_init_seq contains the sequence of instructions to
+   initialize it.  This must be put just before the hwasan prologue, so we emit
+   the insns before parm_birth_insn (which will point to the first instruction
+   of the hwasan prologue if it exists).
+
+   We update `parm_birth_insn` to point to the start of this initialisation
+   since that represents the end of the initialisation done by
+   expand_function_{start,end} functions and we want to maintain that.  */
+void
+hwasan_maybe_emit_frame_base_init ()
+{
+  if (! hwasan_frame_base_init_seq)
+    return;
+  emit_insn_before (hwasan_frame_base_init_seq, parm_birth_insn);
+  parm_birth_insn = hwasan_frame_base_init_seq;
+}
+
+/* Record a compile-time constant size stack variable that HWASAN will need to
+   tag.  This record of the range of a stack variable will be used by
+   `hwasan_emit_prologue` to emit the RTL at the start of each frame which will
+   set tags in the shadow memory according to the assigned tag for each object.
+
+   The range that the object spans in stack space should be described by the
+   bounds `untagged_base + nearest_offset` and
+   `untagged_base + farthest_offset`.
+   `tagged_base` is the base address which contains the "base frame tag" for
+   this frame, and from which the value to address this object with will be
+   calculated.
+
+   We record the `untagged_base` since the functions in the hwasan library we
+   use to tag memory take pointers without a tag.  */
+void
+hwasan_record_stack_var (rtx untagged_base, rtx tagged_base,
+			 poly_int64 nearest_offset, poly_int64 farthest_offset)
+{
+  hwasan_stack_var cur_var;
+  cur_var.untagged_base = untagged_base;
+  cur_var.tagged_base = tagged_base;
+  cur_var.nearest_offset = nearest_offset;
+  cur_var.farthest_offset = farthest_offset;
+  cur_var.tag_offset = hwasan_current_frame_tag ();
+
+  hwasan_tagged_stack_vars.safe_push (cur_var);
+}
+
+/* Return the RTX representing the farthest extent of the statically allocated
+   stack objects for this frame.  If hwasan_frame_base_ptr has not been
+   initialized then we are not storing any static variables on the stack in
+   this frame.  In this case we return NULL_RTX to represent that.
+
+   Otherwise simply return virtual_stack_vars_rtx + frame_offset.  */
+rtx
+hwasan_get_frame_extent ()
+{
+  return (hwasan_frame_base_ptr
+	  ? plus_constant (Pmode, virtual_stack_vars_rtx, frame_offset)
+	  : NULL_RTX);
+}
+
+/* For stack tagging:
+
+   Increment the frame tag offset modulo the size a tag can represent.  */
+void
+hwasan_increment_frame_tag ()
+{
+  uint8_t tag_bits = HWASAN_TAG_SIZE;
+  gcc_assert (HWASAN_TAG_SIZE
+	      <= sizeof (hwasan_frame_tag_offset) * CHAR_BIT);
+  hwasan_frame_tag_offset = (hwasan_frame_tag_offset + 1) % (1 << tag_bits);
+  /* The "background tag" of the stack is zero by definition.
+     This is the tag that objects like parameters passed on the stack and
+     spilled registers are given.  It is handy to avoid this tag for objects
+     whose tags we decide ourselves, partly to ensure that buffer overruns
+     can't affect these important variables (e.g. saved link register, saved
+     stack pointer etc) and partly to make debugging easier (everything with a
+     tag of zero is space allocated automatically by the compiler).
+
+     This is not feasible when using random frame tags (the default
+     configuration for hwasan) since the tag for the given frame is randomly
+     chosen at runtime.  In order to avoid any tags matching the stack
+     background we would need to decide tag offsets at runtime instead of
+     compile time (and pay the resulting performance cost).
+
+     When not using random base tags for each frame (i.e. when compiled with
+     `--param hwasan-random-frame-tag=0`) the base tag for each frame is zero.
+     This means the tag that each object gets is equal to the
+     hwasan_frame_tag_offset used in determining it.
+     When this is the case we *can* ensure no object gets the tag of zero by
+     simply ensuring no object has the hwasan_frame_tag_offset of zero.
+
+     There is the extra complication that we only record the
+     hwasan_frame_tag_offset here (which is the offset from the tag stored in
+     the stack pointer).  In the kernel, the tag in the stack pointer is 0xff
+     rather than zero.  This does not cause problems since tags of 0xff are
+     never checked in the kernel.  As mentioned at the beginning of this
+     comment the background tag of the stack is zero by definition, which means
+     that for the kernel we should skip offsets of both 0 and 1 from the stack
+     pointer.  Avoiding the offset of 0 ensures we use a tag which will be
+     checked, avoiding the offset of 1 ensures we use a tag that is not the
+     same as the background.  */
+  if (hwasan_frame_tag_offset == 0 && ! param_hwasan_random_frame_tag)
+    hwasan_frame_tag_offset += 1;
+  if (hwasan_frame_tag_offset == 1 && ! param_hwasan_random_frame_tag
+      && sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS))
+    hwasan_frame_tag_offset += 1;
+}
+
+/* Clear internal state for the next function.
+   This function is called before variables on the stack get expanded, in
+   `init_vars_expansion`.  */
+void
+hwasan_record_frame_init ()
+{
+  delete asan_used_labels;
+  asan_used_labels = NULL;
+
+  /* If this isn't the case then some stack variable was recorded *before*
+     hwasan_record_frame_init is called, yet *after* the hwasan prologue for
+     the previous frame was emitted.  Such stack variables would not have
+     their shadow stack filled in.  */
+  gcc_assert (hwasan_tagged_stack_vars.is_empty ());
+  hwasan_frame_base_ptr = NULL_RTX;
+  hwasan_frame_base_init_seq = NULL;
+
+  /* When not using a random frame tag we can avoid the background stack
+     color which gives the user a little better debug output upon a crash.
+     Meanwhile, when using a random frame tag it will be nice to avoid adding
+     tags for the first object since that is unnecessary extra work.
+     Hence set the initial hwasan_frame_tag_offset to be 0 if using a random
+     frame tag and 1 otherwise.
+
+     As described in hwasan_increment_frame_tag, in the kernel the stack
+     pointer has the tag 0xff.  That means that to avoid 0xff and 0 (the tag
+     which the kernel does not check and the background tag respectively) we
+     start with a tag offset of 2.  */
+  hwasan_frame_tag_offset = param_hwasan_random_frame_tag
+    ? 0
+    : sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS) ? 2 : 1;
+}
+
+/* For stack tagging:
+   (Emits HWASAN equivalent of what is emitted by
+   `asan_emit_stack_protection`).
+
+   Emits the extra prologue code to set the shadow stack as required for HWASAN
+   stack instrumentation.
+
+   Uses the vector of recorded stack variables hwasan_tagged_stack_vars.  When
+   this function has completed hwasan_tagged_stack_vars is empty and all
+   objects it had pointed to are deallocated.  */
+void
+hwasan_emit_prologue ()
+{
+  /* We need untagged base pointers since libhwasan only accepts untagged
+    pointers in __hwasan_tag_memory.  We need the tagged base pointer to obtain
+    the base tag for an offset.  */
+
+  if (hwasan_tagged_stack_vars.is_empty ())
+    return;
+
+  size_t length = hwasan_tagged_stack_vars.length ();
+  hwasan_stack_var *vars = hwasan_tagged_stack_vars.address ();
+
+  poly_int64 bot = 0, top = 0;
+  size_t i = 0;
+  for (i = 0; i < length; i++)
+    {
+      hwasan_stack_var& cur = vars[i];
+      poly_int64 nearest = cur.nearest_offset;
+      poly_int64 farthest = cur.farthest_offset;
+
+      if (known_ge (nearest, farthest))
+	{
+	  top = nearest;
+	  bot = farthest;
+	}
+      else
+	{
+	  /* Given how these values are calculated, one must be known greater
+	     than the other.  */
+	  gcc_assert (known_le (nearest, farthest));
+	  top = farthest;
+	  bot = nearest;
+	}
+      poly_int64 size = (top - bot);
+
+      /* Assert the edge of each variable is aligned to the HWASAN tag granule
+	 size.  */
+      gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
+
+      rtx ret = init_one_libfunc ("__hwasan_tag_memory");
+      rtx base_tag = targetm.memtag.extract_tag (cur.tagged_base, NULL_RTX);
+      rtx tag = plus_constant (QImode, base_tag, cur.tag_offset);
+      tag = hwasan_truncate_to_tag_size (tag, NULL_RTX);
+
+      rtx bottom = convert_memory_address (ptr_mode,
+					   plus_constant (Pmode,
+							  cur.untagged_base,
+							  bot));
+      emit_library_call (ret, LCT_NORMAL, VOIDmode,
+			 bottom, ptr_mode,
+			 tag, QImode,
+			 gen_int_mode (size, ptr_mode), ptr_mode);
+    }
+  /* Clear the stack vars, we've emitted the prologue for them all now.  */
+  hwasan_tagged_stack_vars.truncate (0);
+}
+
+/* For stack tagging:
+
+   Return RTL insns to clear the tags between DYNAMIC and VARS pointers
+   into the stack.  These instructions should be emitted at the end of
+   every function.
+
+   If `dynamic` is NULL_RTX then no insns are returned.  */
+rtx_insn *
+hwasan_emit_untag_frame (rtx dynamic, rtx vars)
+{
+  if (! dynamic)
+    return NULL;
+
+  start_sequence ();
+
+  dynamic = convert_memory_address (ptr_mode, dynamic);
+  vars = convert_memory_address (ptr_mode, vars);
+
+  rtx top_rtx;
+  rtx bot_rtx;
+  if (FRAME_GROWS_DOWNWARD)
+    {
+      top_rtx = vars;
+      bot_rtx = dynamic;
+    }
+  else
+    {
+      top_rtx = dynamic;
+      bot_rtx = vars;
+    }
+
+  rtx size_rtx = expand_simple_binop (ptr_mode, MINUS, top_rtx, bot_rtx,
+				      NULL_RTX, /* unsignedp = */0,
+				      OPTAB_DIRECT);
+
+  rtx ret = init_one_libfunc ("__hwasan_tag_memory");
+  emit_library_call (ret, LCT_NORMAL, VOIDmode,
+		     bot_rtx, ptr_mode,
+		     HWASAN_STACK_BACKGROUND, QImode,
+		     size_rtx, ptr_mode);
+
+  do_pending_stack_adjust ();
+  rtx_insn *insns = get_insns ();
+  end_sequence ();
+  return insns;
+}
+
+/* Needs to be GTY(()), because cgraph_build_static_cdtor may
+   invoke ggc_collect.  */
+static GTY(()) tree hwasan_ctor_statements;
+
+/* Insert module initialization into this TU.  This initialization calls the
+   initialization code for libhwasan.  */
+void
+hwasan_finish_file (void)
+{
+  /* Do not emit constructor initialization for the kernel.
+     (the kernel has its own initialization already).  */
+  if (flag_sanitize & SANITIZE_KERNEL_HWADDRESS)
+    return;
+
+  /* Avoid instrumenting code in the hwasan constructors/destructors.  */
+  flag_sanitize &= ~SANITIZE_HWADDRESS;
+  int priority = MAX_RESERVED_INIT_PRIORITY - 1;
+  tree fn = builtin_decl_implicit (BUILT_IN_HWASAN_INIT);
+  append_to_statement_list (build_call_expr (fn, 0), &hwasan_ctor_statements);
+  cgraph_build_static_cdtor ('I', hwasan_ctor_statements, priority);
+  flag_sanitize |= SANITIZE_HWADDRESS;
+}
+
+/* For stack tagging:
+
+   Truncate `tag` to the number of bits that a tag uses (i.e. to
+   HWASAN_TAG_SIZE).  Store the result in `target` if it's convenient.  */
+rtx
+hwasan_truncate_to_tag_size (rtx tag, rtx target)
+{
+  gcc_assert (GET_MODE (tag) == QImode);
+  if (HWASAN_TAG_SIZE != GET_MODE_PRECISION (QImode))
+    {
+      gcc_assert (GET_MODE_PRECISION (QImode) > HWASAN_TAG_SIZE);
+      rtx mask = gen_int_mode ((HOST_WIDE_INT_1U << HWASAN_TAG_SIZE) - 1,
+			       QImode);
+      tag = expand_simple_binop (QImode, AND, tag, mask, target,
+				 /* unsignedp = */1, OPTAB_WIDEN);
+      gcc_assert (tag);
+    }
+  return tag;
+}
+
 #include "gt-asan.h"
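
The tag-offset bookkeeping done by hwasan_record_frame_init and hwasan_increment_frame_tag in the gcc/asan.c changes above can be modelled as two small pure functions. This is a sketch only: it assumes the default 8-bit tag size, replaces the global state and the `param_hwasan_random_frame_tag` / SANITIZE_KERNEL_HWADDRESS checks with explicit boolean arguments, and the `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the default tag size of 8 bits.  */
#define MODEL_TAG_BITS 8

/* Initial offset for a new frame: 0 with random frame tags, otherwise 1
   (userland) or 2 (kernel, where the stack pointer carries tag 0xff).  */
static uint8_t
model_initial_tag_offset (bool random_frame_tag, bool kernel)
{
  if (random_frame_tag)
    return 0;
  return kernel ? 2 : 1;
}

/* Next offset after CURRENT: increment modulo 2^MODEL_TAG_BITS, then skip
   the offsets that would collide with the stack background (and, for the
   kernel, with the unchecked 0xff tag) when frame tags are not random.  */
static uint8_t
model_next_tag_offset (uint8_t current, bool random_frame_tag, bool kernel)
{
  uint8_t next = (uint8_t) ((current + 1) % (1 << MODEL_TAG_BITS));
  if (next == 0 && !random_frame_tag)
    next += 1;
  if (next == 1 && !random_frame_tag && kernel)
    next += 1;
  return next;
}
```

With random frame tags the offsets simply cycle through 0..255. Without them, a userland frame wraps from 255 to 1 and a kernel frame wraps from 255 to 2, so no object ends up with the background tag.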
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 4a82ee421bef42154ccd88e52f7a19f48b340c73..1ad6657da45cc4976532e1b8bc233f67d8da9ccf 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -639,6 +639,8 @@  DEF_FUNCTION_TYPE_3 (BT_FN_PTR_PTR_CONST_SIZE_BOOL,
 		     BT_PTR, BT_PTR, BT_CONST_SIZE, BT_BOOL)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_PTRMODE,
 		     BT_PTR, BT_SIZE, BT_SIZE, BT_PTRMODE)
+DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_UINT8_PTRMODE, BT_VOID, BT_PTR, BT_UINT8,
+		     BT_PTRMODE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 		     BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index b4494c712a1751fbb37378f38cc1411d11a37331..97bb5d0b0aee7fa9ee4c82e2d80eae866fc23829 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -245,6 +245,7 @@  along with GCC; see the file COPYING3.  If not see
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,    \
 	       true, true, true, ATTRS, true, \
 	      (flag_sanitize & (SANITIZE_ADDRESS | SANITIZE_THREAD \
+				| SANITIZE_HWADDRESS \
 				| SANITIZE_UNDEFINED \
 				| SANITIZE_UNDEFINED_NONDEFAULT) \
 	       || flag_sanitize_coverage))
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 1df6f4bc55a39230c98e58af6c2d765652db8324..60f79ce799302a0a276199848a6f1b7d8a4aa4eb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -376,15 +376,18 @@  align_local_variable (tree decl, bool really_expand)
 	align = GET_MODE_ALIGNMENT (mode);
     }
   else
-    {
-      align = LOCAL_DECL_ALIGNMENT (decl);
-      /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
-	 That is done before IPA and could bump alignment based on host
-	 backend even for offloaded code which wants different
-	 LOCAL_DECL_ALIGNMENT.  */
-      if (really_expand)
-	SET_DECL_ALIGN (decl, align);
-    }
+    align = LOCAL_DECL_ALIGNMENT (decl);
+
+  if (hwasan_sanitize_stack_p ())
+    align = MAX (align, (unsigned) HWASAN_TAG_GRANULE_SIZE * BITS_PER_UNIT);
+
+  if (TREE_CODE (decl) != SSA_NAME && really_expand)
+    /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
+       That is done before IPA and could bump alignment based on host
+       backend even for offloaded code which wants different
+       LOCAL_DECL_ALIGNMENT.  */
+    SET_DECL_ALIGN (decl, align);
+
   return align / BITS_PER_UNIT;
 }
 
@@ -428,6 +431,14 @@  alloc_stack_frame_space (poly_int64 size, unsigned HOST_WIDE_INT align)
   return offset;
 }
 
+/* Ensure that the stack is aligned to ALIGN bytes.
+   Return the new frame offset.  */
+static poly_int64
+align_frame_offset (unsigned HOST_WIDE_INT align)
+{
+  return alloc_stack_frame_space (0, align);
+}
+
 /* Accumulate DECL into STACK_VARS.  */
 
 static void
@@ -1004,7 +1015,12 @@  expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (known_eq (offset, trunc_int_for_mode (offset, Pmode)));
 
-  x = plus_constant (Pmode, base, offset);
+  if (hwasan_sanitize_stack_p ())
+    x = targetm.memtag.add_tag (base, offset,
+				hwasan_current_frame_tag ());
+  else
+    x = plus_constant (Pmode, base, offset);
+
   x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
 		   ? TYPE_MODE (TREE_TYPE (decl))
 		   : DECL_MODE (decl), x);
@@ -1013,7 +1029,7 @@  expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
      If it is we generate stack slots only accidentally so it isn't as
      important, we'll simply set the alignment directly on the MEM.  */
 
-  if (base == virtual_stack_vars_rtx)
+  if (stack_vars_base_reg_p (base))
     offset -= frame_phase;
   align = known_alignment (offset);
   align *= BITS_PER_UNIT;
@@ -1056,13 +1072,13 @@  public:
 /* A subroutine of expand_used_vars.  Give each partition representative
    a unique location within the stack frame.  Update each partition member
    with that location.  */
-
 static void
 expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 {
   size_t si, i, j, n = stack_vars_num;
   poly_uint64 large_size = 0, large_alloc = 0;
   rtx large_base = NULL;
+  rtx large_untagged_base = NULL;
   unsigned large_align = 0;
   bool large_allocation_done = false;
   tree decl;
@@ -1113,7 +1129,7 @@  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
     {
       rtx base;
       unsigned base_align, alignb;
-      poly_int64 offset;
+      poly_int64 offset = 0;
 
       i = stack_vars_sorted[si];
 
@@ -1134,10 +1150,33 @@  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
       if (pred && !pred (i))
 	continue;
 
+      base = (hwasan_sanitize_stack_p ()
+	      ? hwasan_frame_base ()
+	      : virtual_stack_vars_rtx);
       alignb = stack_vars[i].alignb;
       if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
 	{
-	  base = virtual_stack_vars_rtx;
+	  poly_int64 hwasan_orig_offset;
+	  if (hwasan_sanitize_stack_p ())
+	    {
+	      /* There must be no tag granule "shared" between different
+		 objects.  This means that no HWASAN_TAG_GRANULE_SIZE byte
+		 chunk can have more than one object in it.
+
+		 We ensure this by forcing the end of the last bit of data to
+		 be aligned to HWASAN_TAG_GRANULE_SIZE bytes here, and setting
+		 the start of each variable to be aligned to
+		 HWASAN_TAG_GRANULE_SIZE bytes in `align_local_variable`.
+
+		 We can't align just one of the start or end, since there are
+		 untagged things stored on the stack which we do not align to
+		 HWASAN_TAG_GRANULE_SIZE bytes.  If we only aligned the start
+		 or the end of tagged objects then untagged objects could end
+		 up sharing the first granule of a tagged object or sharing the
+		 last granule of a tagged object respectively.  */
+	      hwasan_orig_offset = align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
+	      gcc_assert (stack_vars[i].alignb >= HWASAN_TAG_GRANULE_SIZE);
+	    }
 	  /* ASAN description strings don't yet have a syntax for expressing
 	     polynomial offsets.  */
 	  HOST_WIDE_INT prev_offset;
@@ -1148,7 +1187,7 @@  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	    {
 	      if (data->asan_vec.is_empty ())
 		{
-		  alloc_stack_frame_space (0, ASAN_RED_ZONE_SIZE);
+		  align_frame_offset (ASAN_RED_ZONE_SIZE);
 		  prev_offset = frame_offset.to_constant ();
 		}
 	      prev_offset = align_base (prev_offset,
@@ -1216,6 +1255,24 @@  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	    {
 	      offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
 	      base_align = crtl->max_used_stack_slot_alignment;
+
+	      if (hwasan_sanitize_stack_p ())
+		{
+		  /* Align again since the point of this alignment is to handle
+		     the "end" of the object (i.e. smallest address after the
+		     stack object).  For FRAME_GROWS_DOWNWARD that requires
+		     aligning the stack before allocating, but for a frame that
+		     grows upwards that requires aligning the stack after
+		     allocation.
+
+		     Use `frame_offset` to record the offset value rather than
+		     offset since the frame_offset describes the extent
+		     allocated for this particular variable while `offset`
+		     describes the address that this variable starts at.  */
+		  align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
+		  hwasan_record_stack_var (virtual_stack_vars_rtx, base,
+					   hwasan_orig_offset, frame_offset);
+		}
 	    }
 	}
       else
@@ -1236,14 +1293,33 @@  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 	      loffset = alloc_stack_frame_space
 		(rtx_to_poly_int64 (large_allocsize),
 		 PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-	      large_base = get_dynamic_stack_base (loffset, large_align);
+	      large_base = get_dynamic_stack_base (loffset, large_align, base);
 	      large_allocation_done = true;
 	    }
-	  gcc_assert (large_base != NULL);
 
+	  gcc_assert (large_base != NULL);
 	  large_alloc = aligned_upper_bound (large_alloc, alignb);
 	  offset = large_alloc;
 	  large_alloc += stack_vars[i].size;
+	  if (hwasan_sanitize_stack_p ())
+	    {
+	      /* An object with a large alignment requirement means that the
+		 alignment requirement is greater than the required alignment
+		 for tags.  */
+	      if (!large_untagged_base)
+		large_untagged_base
+		  = targetm.memtag.untagged_pointer (large_base, NULL_RTX);
+	      /* Ensure the end of the variable is also aligned correctly.  */
+	      poly_int64 align_again
+		= aligned_upper_bound (large_alloc, HWASAN_TAG_GRANULE_SIZE);
+	      /* For large allocations we always allocate a chunk of space
+		 (which is addressed by large_untagged_base/large_base) and
+		 then use positive offsets from that.  Hence the farthest
+		 offset is `align_again` and the nearest offset from the base
+		 is `offset`.  */
+	      hwasan_record_stack_var (large_untagged_base, large_base,
+				       offset, align_again);
+	    }
 
 	  base = large_base;
 	  base_align = large_align;
@@ -1254,9 +1330,10 @@  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
       for (j = i; j != EOC; j = stack_vars[j].next)
 	{
 	  expand_one_stack_var_at (stack_vars[j].decl,
-				   base, base_align,
-				   offset);
+				   base, base_align, offset);
 	}
+      if (hwasan_sanitize_stack_p ())
+	hwasan_increment_frame_tag ();
     }
 
   gcc_assert (known_eq (large_alloc, large_size));
@@ -1347,10 +1424,37 @@  expand_one_stack_var_1 (tree var)
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
 
-  offset = alloc_stack_frame_space (size, byte_align);
+  rtx base;
+  if (hwasan_sanitize_stack_p ())
+    {
+      /* Allocate zero bytes to align the stack.  */
+      poly_int64 hwasan_orig_offset
+	= align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
+      offset = alloc_stack_frame_space (size, byte_align);
+      align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
+      base = hwasan_frame_base ();
+      /* Use `frame_offset` to automatically account for machines where the
+	 frame grows upwards.
+
+	 `offset` will always point to the "start" of the stack object, which
+	 will be the smallest address.  For ! FRAME_GROWS_DOWNWARD this is *not*
+	 the "furthest" offset from the base delimiting the current stack
+	 object.  `frame_offset` will always delimit the extent of the frame.
+	 */
+      hwasan_record_stack_var (virtual_stack_vars_rtx, base,
+			       hwasan_orig_offset, frame_offset);
+    }
+  else
+    {
+      offset = alloc_stack_frame_space (size, byte_align);
+      base = virtual_stack_vars_rtx;
+    }
 
-  expand_one_stack_var_at (var, virtual_stack_vars_rtx,
+  expand_one_stack_var_at (var, base,
 			   crtl->max_used_stack_slot_alignment, offset);
+
+  if (hwasan_sanitize_stack_p ())
+    hwasan_increment_frame_tag ();
 }
 
 /* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
@@ -1950,6 +2054,8 @@  init_vars_expansion (void)
   /* Initialize local stack smashing state.  */
   has_protected_decls = false;
   has_short_buffer = false;
+  if (hwasan_sanitize_stack_p ())
+    hwasan_record_frame_init ();
 }
 
 /* Free up stack variable graph data.  */
@@ -2277,10 +2383,26 @@  expand_used_vars (void)
       expand_stack_vars (NULL, &data);
     }
 
+  if (hwasan_sanitize_stack_p ())
+    hwasan_emit_prologue ();
   if (asan_sanitize_allocas_p () && cfun->calls_alloca)
     var_end_seq = asan_emit_allocas_unpoison (virtual_stack_dynamic_rtx,
 					      virtual_stack_vars_rtx,
 					      var_end_seq);
+  else if (hwasan_sanitize_allocas_p () && cfun->calls_alloca)
+    /* When using out-of-line instrumentation we only want to emit one function
+       call for clearing the tags in a region of shadow stack.  When there are
+       alloca calls in this frame we want to emit a call using the
+       virtual_stack_dynamic_rtx; otherwise we use the hwasan_frame_extent
+       rtx we created in expand_stack_vars.  */
+    var_end_seq = hwasan_emit_untag_frame (virtual_stack_dynamic_rtx,
+					   virtual_stack_vars_rtx);
+  else if (hwasan_sanitize_stack_p ())
+    /* If no variables were stored on the stack, `hwasan_get_frame_extent`
+       will return NULL_RTX and hence `hwasan_emit_untag_frame` will return
+       NULL (i.e. an empty sequence).  */
+    var_end_seq = hwasan_emit_untag_frame (hwasan_get_frame_extent (),
+					   virtual_stack_vars_rtx);
 
   fini_vars_expansion ();
 
@@ -6641,6 +6763,9 @@  pass_expand::execute (function *fun)
       emit_insn_after (var_ret_seq, after);
     }
 
+  if (hwasan_sanitize_stack_p ())
+    hwasan_maybe_emit_frame_base_init ();
+
   /* Zap the tree EH table.  */
   set_eh_throw_stmt_table (fun, NULL);
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 45f902d370e39e85865bdab4c9bb837e8e20220c..f459ff22fd94a57cf5c694bd83674b8f7bf3260c 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2980,6 +2980,63 @@  True if backend architecture naturally supports ignoring some region of
 pointers.  This feature means that @option{-fsanitize=hwaddress} can work.
 @end deftypefn
 
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_TAG_SIZE ()
+Return the size of a tag (in bits) for this platform.
+
+The default returns 8.
+@end deftypefn
+
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_GRANULE_SIZE ()
+Return the size in real memory that each byte in shadow memory refers to.
+I.e. if a variable is @var{X} bytes long in memory, then this hook should
+return the value @var{Y} such that the tag in shadow memory spans
+@var{X}/@var{Y} bytes.
+
+Most variables will need to be aligned to this amount since two variables
+that are neighbors in memory and share a tag granule would need to share
+the same tag.
+
+The default returns 16.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_INSERT_RANDOM_TAG (rtx @var{untagged}, rtx @var{target})
+Return an RTX representing the value of @var{untagged} but with a
+(possibly) random tag in it.
+Put that value into @var{target} if it is convenient to do so.
+This function is used to generate a tagged base for the current stack frame.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_ADD_TAG (rtx @var{base}, poly_int64 @var{addr_offset}, uint8_t @var{tag_offset})
+Return an RTX that represents the result of adding @var{addr_offset} to
+the address in pointer @var{base} and @var{tag_offset} to the tag in pointer
+@var{base}.
+The resulting RTX must either be a valid memory address or be able to get
+put into an operand with @code{force_operand}.
+
+Unlike other memtag hooks, this must return an expression and not emit any
+RTL.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_SET_TAG (rtx @var{untagged_base}, rtx @var{tag}, rtx @var{target})
+Return an RTX representing @var{untagged_base} but with the tag @var{tag}.
+Try and store this in @var{target} if convenient.
+@var{untagged_base} is required to have a zero tag when this hook is called.
+The default of this hook is to set the top byte of @var{untagged_base} to
+@var{tag}.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_EXTRACT_TAG (rtx @var{tagged_pointer}, rtx @var{target})
+Return an RTX representing the tag stored in @var{tagged_pointer}.
+Store the result in @var{target} if it is convenient.
+The default represents the top byte of the original pointer.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_UNTAGGED_POINTER (rtx @var{tagged_pointer}, rtx @var{target})
+Return an RTX representing @var{tagged_pointer} with its tag set to zero.
+Store the result in @var{target} if convenient.
+The default clears the top byte of the original pointer.
+@end deftypefn
+
 @node Stack and Calling
 @section Stack Layout and Calling Conventions
 @cindex calling conventions
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 2a298f05da40ef37373e8d8138c85700de9c3532..b675d01e346d52e4df241a75902b21984924b0f1 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2377,6 +2377,20 @@  in the reload pass.
 
 @hook TARGET_MEMTAG_CAN_TAG_ADDRESSES
 
+@hook TARGET_MEMTAG_TAG_SIZE
+
+@hook TARGET_MEMTAG_GRANULE_SIZE
+
+@hook TARGET_MEMTAG_INSERT_RANDOM_TAG
+
+@hook TARGET_MEMTAG_ADD_TAG
+
+@hook TARGET_MEMTAG_SET_TAG
+
+@hook TARGET_MEMTAG_EXTRACT_TAG
+
+@hook TARGET_MEMTAG_UNTAGGED_POINTER
+
 @node Stack and Calling
 @section Stack Layout and Calling Conventions
 @cindex calling conventions
diff --git a/gcc/explow.h b/gcc/explow.h
index 0df8c62b82a8bf1d8d6baf0b6fb658e66361a407..581831cb19fdf9e8fd969bb30139e1358279a34d 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -106,7 +106,7 @@  extern rtx allocate_dynamic_stack_space (rtx, unsigned, unsigned,
 extern void get_dynamic_stack_size (rtx *, unsigned, unsigned, HOST_WIDE_INT *);
 
 /* Returns the address of the dynamic stack space without allocating it.  */
-extern rtx get_dynamic_stack_base (poly_int64, unsigned);
+extern rtx get_dynamic_stack_base (poly_int64, unsigned, rtx);
 
 /* Return an rtx doing runtime alignment to REQUIRED_ALIGN on TARGET.  */
 extern rtx align_dynamic_address (rtx, unsigned);
diff --git a/gcc/explow.c b/gcc/explow.c
index 0fbc6d25b816457a3d13ed45d16b5dd0513cfacd..41c3f6ace49c0e55c080e10b917842b1b21d49eb 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1583,10 +1583,14 @@  allocate_dynamic_stack_space (rtx size, unsigned size_align,
    OFFSET is the offset of the area into the virtual stack vars area.
 
    REQUIRED_ALIGN is the alignment (in bits) required for the region
-   of memory.  */
+   of memory.
+
+   BASE is the rtx of the base of this virtual stack vars area.
+   The only time this is not `virtual_stack_vars_rtx` is when tagging pointers
+   on the stack.  */
 
 rtx
-get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
+get_dynamic_stack_base (poly_int64 offset, unsigned required_align, rtx base)
 {
   rtx target;
 
@@ -1594,7 +1598,7 @@  get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
     crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
 
   target = gen_reg_rtx (Pmode);
-  emit_move_insn (target, virtual_stack_vars_rtx);
+  emit_move_insn (target, base);
   target = expand_binop (Pmode, add_optab, target,
 			 gen_int_mode (offset, Pmode),
 			 NULL_RTX, 1, OPTAB_LIB_WIDEN);
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index a32715ddb92e69b7ca7be28a8f17a369b891bd76..4f854fb994229fd4ed91d3b5cff7c7acff9a55bc 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -180,6 +180,12 @@  DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_COMPARE, "__sanitizer_ptr_cmp",
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_SUBTRACT, "__sanitizer_ptr_sub",
 		      BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
 
+/* Hardware Address Sanitizer.  */
+DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_INIT, "__hwasan_init",
+		      BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_TAG_MEM, "__hwasan_tag_memory",
+		      BT_FN_VOID_PTR_UINT8_PTRMODE, ATTR_NOTHROW_LIST)
+
 /* Thread Sanitizer */
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_INIT, "__tsan_init", 
 		      BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
diff --git a/gcc/target.def b/gcc/target.def
index 0945aa71ebbc5776ae036778f8588cf5bd0436f2..489cd6f215ef3ad9031749e7428e3664eb0abc09 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6870,6 +6870,71 @@  DEFHOOK
 pointers.  This feature means that @option{-fsanitize=hwaddress} can work.",
  bool, (), default_memtag_can_tag_addresses)
 
+DEFHOOK
+(tag_size,
+ "Return the size of a tag (in bits) for this platform.\n\
+\n\
+The default returns 8.",
+  uint8_t, (), default_memtag_tag_size)
+
+DEFHOOK
+(granule_size,
+ "Return the size in real memory that each byte in shadow memory refers to.\n\
+I.e. if a variable is @var{X} bytes long in memory, then this hook should\n\
+return the value @var{Y} such that the tag in shadow memory spans\n\
+@var{X}/@var{Y} bytes.\n\
+\n\
+Most variables will need to be aligned to this amount since two variables\n\
+that are neighbors in memory and share a tag granule would need to share\n\
+the same tag.\n\
+\n\
+The default returns 16.",
+  uint8_t, (), default_memtag_granule_size)
+
+DEFHOOK
+(insert_random_tag,
+ "Return an RTX representing the value of @var{untagged} but with a\n\
+(possibly) random tag in it.\n\
+Put that value into @var{target} if it is convenient to do so.\n\
+This function is used to generate a tagged base for the current stack frame.",
+  rtx, (rtx untagged, rtx target), default_memtag_insert_random_tag)
+
+DEFHOOK
+(add_tag,
+ "Return an RTX that represents the result of adding @var{addr_offset} to\n\
+the address in pointer @var{base} and @var{tag_offset} to the tag in pointer\n\
+@var{base}.\n\
+The resulting RTX must either be a valid memory address or be able to get\n\
+put into an operand with @code{force_operand}.\n\
+\n\
+Unlike other memtag hooks, this must return an expression and not emit any\n\
+RTL.",
+  rtx, (rtx base, poly_int64 addr_offset, uint8_t tag_offset),
+  default_memtag_add_tag)
+
+DEFHOOK
+(set_tag,
+ "Return an RTX representing @var{untagged_base} but with the tag @var{tag}.\n\
+Try and store this in @var{target} if convenient.\n\
+@var{untagged_base} is required to have a zero tag when this hook is called.\n\
+The default of this hook is to set the top byte of @var{untagged_base} to\n\
+@var{tag}.",
+  rtx, (rtx untagged_base, rtx tag, rtx target), default_memtag_set_tag)
+
+DEFHOOK
+(extract_tag,
+ "Return an RTX representing the tag stored in @var{tagged_pointer}.\n\
+Store the result in @var{target} if it is convenient.\n\
+The default represents the top byte of the original pointer.",
+  rtx, (rtx tagged_pointer, rtx target), default_memtag_extract_tag)
+
+DEFHOOK
+(untagged_pointer,
+ "Return an RTX representing @var{tagged_pointer} with its tag set to zero.\n\
+Store the result in @var{target} if convenient.\n\
+The default clears the top byte of the original pointer.",
+  rtx, (rtx tagged_pointer, rtx target), default_memtag_untagged_pointer)
+
 HOOK_VECTOR_END (memtag)
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_"
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 0065c686978d7120978430013c73b1055aaf95c7..68e8688a32f18481ee61f06879aacff20163105b 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -287,4 +287,12 @@  extern bool speculation_safe_value_not_needed (bool);
 extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
 
 extern bool default_memtag_can_tag_addresses ();
+extern uint8_t default_memtag_tag_size ();
+extern uint8_t default_memtag_granule_size ();
+extern rtx default_memtag_insert_random_tag (rtx, rtx);
+extern rtx default_memtag_add_tag (rtx, poly_int64, uint8_t);
+extern rtx default_memtag_set_tag (rtx, rtx, rtx);
+extern rtx default_memtag_extract_tag (rtx, rtx);
+extern rtx default_memtag_untagged_pointer (rtx, rtx);
+
 #endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 46cb536041d396c32fd08042581d6d5cd5ad0395..e66b1d0074b1921b7613f8e5444c7322c0479506 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -73,6 +73,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "varasm.h"
 #include "flags.h"
 #include "explow.h"
+#include "expmed.h"
 #include "calls.h"
 #include "expr.h"
 #include "output.h"
@@ -86,6 +87,9 @@  along with GCC; see the file COPYING3.  If not see
 #include "langhooks.h"
 #include "sbitmap.h"
 #include "function-abi.h"
+#include "attribs.h"
+#include "asan.h"
+#include "emit-rtl.h"
 
 bool
 default_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
@@ -2415,10 +2419,116 @@  default_speculation_safe_value (machine_mode mode ATTRIBUTE_UNUSED,
   return result;
 }
 
+/* How many bits to shift in order to access the tag bits.
+   The default is to store the tag in the top 8 bits of a 64 bit pointer, hence
+   shifting 56 bits will leave just the tag.  */
+#define HWASAN_SHIFT (GET_MODE_PRECISION (Pmode) - 8)
+#define HWASAN_SHIFT_RTX GEN_INT (HWASAN_SHIFT)
+
 bool
 default_memtag_can_tag_addresses ()
 {
   return false;
 }
 
+uint8_t
+default_memtag_tag_size ()
+{
+  return 8;
+}
+
+uint8_t
+default_memtag_granule_size ()
+{
+  return 16;
+}
+
+/* The default implementation of TARGET_MEMTAG_INSERT_RANDOM_TAG.  */
+rtx
+default_memtag_insert_random_tag (rtx untagged, rtx target)
+{
+  gcc_assert (param_hwasan_instrument_stack);
+  if (param_hwasan_random_frame_tag)
+    {
+      rtx fn = init_one_libfunc ("__hwasan_generate_tag");
+      rtx new_tag = emit_library_call_value (fn, NULL_RTX, LCT_NORMAL, QImode);
+      return targetm.memtag.set_tag (untagged, new_tag, target);
+    }
+  else
+    {
+      /* NOTE: The kernel API does not have __hwasan_generate_tag exposed.
+	 In the future we may add the option to emit random tags with inline
+	 instrumentation instead of function calls.  This would be the same
+	 between the kernel and userland.  */
+      return untagged;
+    }
+}
+
+/* The default implementation of TARGET_MEMTAG_ADD_TAG.  */
+rtx
+default_memtag_add_tag (rtx base, poly_int64 offset, uint8_t tag_offset)
+{
+  /* Need to look into what the most efficient code sequence is.
+     This is a code sequence that would be emitted *many* times, so we
+     want it as small as possible.
+
+     There are two places where tag overflow is a question:
+       - Tagging the shadow stack.
+	  (both tagging and untagging).
+       - Tagging addressable pointers.
+
+     We need to ensure both behaviors are the same (i.e. that the tag that
+     ends up in a pointer after "overflowing" the tag bits with a tag addition
+     is the same that ends up in the shadow space).
+
+     The aim is that the behavior of tag addition should follow modulo
+     wrapping in both instances.
+
+     The libhwasan code doesn't have any path that increments a pointer's tag,
+     which means it has no opinion on what happens when a tag increment
+     overflows (and hence we can choose our own behavior).  */
+
+  offset += ((uint64_t)tag_offset << HWASAN_SHIFT);
+  return plus_constant (Pmode, base, offset);
+}
+
+/* The default implementation of TARGET_MEMTAG_SET_TAG.  */
+rtx
+default_memtag_set_tag (rtx untagged, rtx tag, rtx target)
+{
+  gcc_assert (GET_MODE (untagged) == Pmode);
+  gcc_assert (GET_MODE (tag) == QImode);
+  tag = expand_simple_binop (Pmode, ASHIFT, tag, HWASAN_SHIFT_RTX, tag,
+			     /* unsignedp = */1, OPTAB_WIDEN);
+  rtx ret = expand_simple_binop (Pmode, IOR, untagged, tag, target,
+				 /* unsignedp = */1, OPTAB_DIRECT);
+  gcc_assert (ret);
+  return ret;
+}
+
+/* The default implementation of TARGET_MEMTAG_EXTRACT_TAG.  */
+rtx
+default_memtag_extract_tag (rtx tagged_pointer, rtx target)
+{
+  rtx tag = expand_simple_binop (Pmode, LSHIFTRT, tagged_pointer,
+				 HWASAN_SHIFT_RTX, target,
+				 /* unsignedp = */0,
+				 OPTAB_DIRECT);
+  rtx ret = gen_lowpart (QImode, tag);
+  gcc_assert (ret);
+  return ret;
+}
+
+/* The default implementation of TARGET_MEMTAG_UNTAGGED_POINTER.  */
+rtx
+default_memtag_untagged_pointer (rtx tagged_pointer, rtx target)
+{
+  rtx tag_mask = gen_int_mode ((HOST_WIDE_INT_1U << HWASAN_SHIFT) - 1, Pmode);
+  rtx untagged_base = expand_simple_binop (Pmode, AND, tagged_pointer,
+					   tag_mask, target, true,
+					   OPTAB_DIRECT);
+  gcc_assert (untagged_base);
+  return untagged_base;
+}
+
 #include "gt-targhooks.h"
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 2a3e7c064a5fbb6913481104975ca85615e49f8e..9938b6afbd4fa22898dbc3c29b92061a71810b08 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -512,6 +512,9 @@  compile_file (void)
       if (flag_sanitize & SANITIZE_THREAD)
 	tsan_finish_file ();
 
+      if (flag_sanitize & SANITIZE_HWADDRESS)
+	hwasan_finish_file ();
+
       omp_finish_file ();
 
       output_shared_constant_pool ();