[1/4] x86: Add -mindirect-branch=

Message ID 20180112131549.18143-2-hjl.tools@gmail.com
State New
Series
  • x86: CVE-2017-5715, aka Spectre

Commit Message

H.J. Lu Jan. 12, 2018, 1:15 p.m.
Add -mindirect-branch= option to convert indirect call and jump to call
and return thunks.  The default is 'keep', which keeps indirect call and
jump unmodified.  'thunk' converts indirect call and jump to call and
return thunk.  'thunk-inline' converts indirect call and jump to inlined
call and return thunk.  'thunk-extern' converts indirect call and jump to
external call and return thunk provided in a separate object file.  You
can control this behavior for a specific function by using the function
attribute indirect_branch.
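
As an illustration of the per-function control described above, here is a
minimal sketch.  The macro HARDEN_INDIRECT_BRANCHES and the function names
are hypothetical and exist only for this example; the attribute is applied
only when that macro is defined, so the file still builds with compilers,
targets, or option combinations that do not accept indirect_branch.

```c
/* Sketch only: HARDEN_INDIRECT_BRANCHES, INDIRECT_BRANCH, add_one,
   dispatch_hardened and dispatch_plain are hypothetical names.  Define
   HARDEN_INDIRECT_BRANCHES when building with an x86 GCC that implements
   the indirect_branch function attribute.  */
#ifdef HARDEN_INDIRECT_BRANCHES
# define INDIRECT_BRANCH(choice) __attribute__ ((indirect_branch (choice)))
#else
# define INDIRECT_BRANCH(choice)
#endif

typedef int (*handler_fn) (int);

int
add_one (int x)
{
  return x + 1;
}

/* Indirect calls in this function are converted to a call and return
   thunk, independent of the -mindirect-branch= setting on the command
   line.  */
INDIRECT_BRANCH ("thunk")
int
dispatch_hardened (handler_fn fn, int arg)
{
  return fn (arg);
}

/* Indirect calls in this function are kept unmodified.  */
INDIRECT_BRANCH ("keep")
int
dispatch_plain (handler_fn fn, int arg)
{
  return fn (arg);
}
```

Both functions behave identically at the source level; only the emitted
branch sequence differs.  Compiling the whole file with
-mindirect-branch=thunk instead would apply the conversion to every
function that does not carry its own attribute.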

Two kinds of thunks are generated.  A memory thunk, where the function
address is at the top of the stack:

__x86_indirect_thunk:
	call L2
L1:
	pause
	jmp L1
L2:
	lea 8(%rsp), %rsp|lea 4(%esp), %esp
	ret

Indirect jmp via memory, "jmp mem", is converted to

	push memory
	jmp __x86_indirect_thunk

Indirect call via memory, "call mem", is converted to

	jmp L2
L1:
	push [mem]
	jmp __x86_indirect_thunk
L2:
	call L1

A register thunk, where the function address is in a register, reg:

__x86_indirect_thunk_reg:
	call	L2
L1:
	pause
	jmp	L1
L2:
	movq	%reg, (%rsp)|movl    %reg, (%esp)
	ret

where reg is one of (r|e)ax, (r|e)dx, (r|e)cx, (r|e)bx, (r|e)si, (r|e)di,
(r|e)bp, r8, r9, r10, r11, r12, r13, r14 and r15.

Indirect jmp via register, "jmp reg", is converted to

	jmp __x86_indirect_thunk_reg

Indirect call via register, "call reg", is converted to

	call __x86_indirect_thunk_reg

gcc/

	* config/i386/i386-opts.h (indirect_branch): New.
	* config/i386/i386-protos.h (ix86_output_indirect_jmp): Likewise.
	* config/i386/i386.c (ix86_using_red_zone): Disallow red-zone
	with local indirect jump when converting indirect call and jump.
	(ix86_set_indirect_branch_type): New.
	(ix86_set_current_function): Call ix86_set_indirect_branch_type.
	(indirectlabelno): New.
	(indirect_thunk_needed): Likewise.
	(indirect_thunk_bnd_needed): Likewise.
	(indirect_thunks_used): Likewise.
	(indirect_thunks_bnd_used): Likewise.
	(INDIRECT_LABEL): Likewise.
	(indirect_thunk_name): Likewise.
	(output_indirect_thunk): Likewise.
	(output_indirect_thunk_function): Likewise.
	(ix86_output_indirect_branch): Likewise.
	(ix86_output_indirect_jmp): Likewise.
	(ix86_code_end): Call output_indirect_thunk_function if needed.
	(ix86_output_call_insn): Call ix86_output_indirect_branch if
	needed.
	(ix86_handle_fndecl_attribute): Handle indirect_branch.
	(ix86_attribute_table): Add indirect_branch.
	* config/i386/i386.h (machine_function): Add indirect_branch_type
	and has_local_indirect_jump.
	* config/i386/i386.md (indirect_jump): Set has_local_indirect_jump
	to true.
	(tablejump): Likewise.
	(*indirect_jump): Use ix86_output_indirect_jmp.
	(*tablejump_1): Likewise.
	(simple_return_indirect_internal): Likewise.
	* config/i386/i386.opt (mindirect-branch=): New option.
	(indirect_branch): New.
	(keep): Likewise.
	(thunk): Likewise.
	(thunk-inline): Likewise.
	(thunk-extern): Likewise.
	* doc/extend.texi: Document indirect_branch function attribute.
	* doc/invoke.texi: Document -mindirect-branch= option.

gcc/testsuite/

	* gcc.target/i386/indirect-thunk-1.c: New test.
	* gcc.target/i386/indirect-thunk-2.c: Likewise.
	* gcc.target/i386/indirect-thunk-3.c: Likewise.
	* gcc.target/i386/indirect-thunk-4.c: Likewise.
	* gcc.target/i386/indirect-thunk-5.c: Likewise.
	* gcc.target/i386/indirect-thunk-6.c: Likewise.
	* gcc.target/i386/indirect-thunk-7.c: Likewise.
	* gcc.target/i386/indirect-thunk-attr-1.c: Likewise.
	* gcc.target/i386/indirect-thunk-attr-2.c: Likewise.
	* gcc.target/i386/indirect-thunk-attr-3.c: Likewise.
	* gcc.target/i386/indirect-thunk-attr-4.c: Likewise.
	* gcc.target/i386/indirect-thunk-attr-5.c: Likewise.
	* gcc.target/i386/indirect-thunk-attr-6.c: Likewise.
	* gcc.target/i386/indirect-thunk-attr-7.c: Likewise.
	* gcc.target/i386/indirect-thunk-attr-8.c: Likewise.
	* gcc.target/i386/indirect-thunk-bnd-1.c: Likewise.
	* gcc.target/i386/indirect-thunk-bnd-2.c: Likewise.
	* gcc.target/i386/indirect-thunk-bnd-3.c: Likewise.
	* gcc.target/i386/indirect-thunk-bnd-4.c: Likewise.
	* gcc.target/i386/indirect-thunk-extern-1.c: Likewise.
	* gcc.target/i386/indirect-thunk-extern-2.c: Likewise.
	* gcc.target/i386/indirect-thunk-extern-3.c: Likewise.
	* gcc.target/i386/indirect-thunk-extern-4.c: Likewise.
	* gcc.target/i386/indirect-thunk-extern-5.c: Likewise.
	* gcc.target/i386/indirect-thunk-extern-6.c: Likewise.
	* gcc.target/i386/indirect-thunk-extern-7.c: Likewise.
	* gcc.target/i386/indirect-thunk-inline-1.c: Likewise.
	* gcc.target/i386/indirect-thunk-inline-2.c: Likewise.
	* gcc.target/i386/indirect-thunk-inline-3.c: Likewise.
	* gcc.target/i386/indirect-thunk-inline-4.c: Likewise.
	* gcc.target/i386/indirect-thunk-inline-5.c: Likewise.
	* gcc.target/i386/indirect-thunk-inline-6.c: Likewise.
	* gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
---
 gcc/config/i386/i386-opts.h                        |   8 +
 gcc/config/i386/i386-protos.h                      |   1 +
 gcc/config/i386/i386.c                             | 512 ++++++++++++++++++++-
 gcc/config/i386/i386.h                             |   7 +
 gcc/config/i386/i386.md                            |   8 +-
 gcc/config/i386/i386.opt                           |  20 +
 gcc/doc/extend.texi                                |  10 +
 gcc/doc/invoke.texi                                |  14 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-1.c   |  19 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-2.c   |  19 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-3.c   |  20 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-4.c   |  20 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-5.c   |  16 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-6.c   |  17 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-7.c   |  43 ++
 .../gcc.target/i386/indirect-thunk-attr-1.c        |  22 +
 .../gcc.target/i386/indirect-thunk-attr-2.c        |  20 +
 .../gcc.target/i386/indirect-thunk-attr-3.c        |  21 +
 .../gcc.target/i386/indirect-thunk-attr-4.c        |  20 +
 .../gcc.target/i386/indirect-thunk-attr-5.c        |  22 +
 .../gcc.target/i386/indirect-thunk-attr-6.c        |  21 +
 .../gcc.target/i386/indirect-thunk-attr-7.c        |  44 ++
 .../gcc.target/i386/indirect-thunk-attr-8.c        |  41 ++
 .../gcc.target/i386/indirect-thunk-bnd-1.c         |  19 +
 .../gcc.target/i386/indirect-thunk-bnd-2.c         |  20 +
 .../gcc.target/i386/indirect-thunk-bnd-3.c         |  18 +
 .../gcc.target/i386/indirect-thunk-bnd-4.c         |  19 +
 .../gcc.target/i386/indirect-thunk-extern-1.c      |  19 +
 .../gcc.target/i386/indirect-thunk-extern-2.c      |  19 +
 .../gcc.target/i386/indirect-thunk-extern-3.c      |  20 +
 .../gcc.target/i386/indirect-thunk-extern-4.c      |  20 +
 .../gcc.target/i386/indirect-thunk-extern-5.c      |  16 +
 .../gcc.target/i386/indirect-thunk-extern-6.c      |  17 +
 .../gcc.target/i386/indirect-thunk-extern-7.c      |  43 ++
 .../gcc.target/i386/indirect-thunk-inline-1.c      |  18 +
 .../gcc.target/i386/indirect-thunk-inline-2.c      |  18 +
 .../gcc.target/i386/indirect-thunk-inline-3.c      |  19 +
 .../gcc.target/i386/indirect-thunk-inline-4.c      |  19 +
 .../gcc.target/i386/indirect-thunk-inline-5.c      |  15 +
 .../gcc.target/i386/indirect-thunk-inline-6.c      |  16 +
 .../gcc.target/i386/indirect-thunk-inline-7.c      |  42 ++
 41 files changed, 1306 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c

Comments

Jan Hubicka Jan. 12, 2018, 5:47 p.m. | #1
> diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
> index f245c1573cf..f14cbeee7a1 100644
> --- a/gcc/config/i386/i386-opts.h
> +++ b/gcc/config/i386/i386-opts.h
> @@ -106,4 +106,12 @@ enum prefer_vector_width {
>      PVW_AVX512
>  };
>  
> +enum indirect_branch {
> +  indirect_branch_unset = 0,
> +  indirect_branch_keep,
> +  indirect_branch_thunk,
> +  indirect_branch_thunk_inline,
> +  indirect_branch_thunk_extern
> +};
I think it would make sense to simply place the body of your introduction email
here as a comment explaining what enum indirect_branch does.
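
A sketch of what such a commented enum might look like, using only the
descriptions from the commit message.  The wording of the comments is
illustrative, and the description of indirect_branch_unset is an
assumption, since the introduction email does not describe that value.

```c
/* Choices for -mindirect-branch= and the indirect_branch function
   attribute.  Comment text is a sketch based on the commit message.  */
enum indirect_branch {
  /* Assumption: no choice has been recorded yet for this function.  */
  indirect_branch_unset = 0,
  /* Keep indirect call and jump unmodified.  */
  indirect_branch_keep,
  /* Convert indirect call and jump to call and return thunk.  */
  indirect_branch_thunk,
  /* Convert indirect call and jump to inlined call and return thunk.  */
  indirect_branch_thunk_inline,
  /* Convert indirect call and jump to external call and return thunk
     provided in a separate object file.  */
  indirect_branch_thunk_extern
};
```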
>  
> -/* Return true if a red-zone is in use.  */
> +/* Return true if a red-zone is in use.  We can't use red-zone when
> +   there are local indirect jumps, like "indirect_jump" or "tablejump",
> +   which jumps to another place in the function, since "call" in the
> +   indirect thunk pushes the return address onto stack, destroying
> +   red-zone.  */

Technically we can use red-zone if we reserve space for the call address, right?
Perhaps mention it as a TODO.  That should not even be too hard to implement,
but it is probably not that important for code quality in practice.
>  
>  bool
>  ix86_using_red_zone (void)
>  {
> -  return TARGET_RED_ZONE && !TARGET_64BIT_MS_ABI;
> +  return (TARGET_RED_ZONE
> +	  && !TARGET_64BIT_MS_ABI
> +	  && (!cfun->machine->has_local_indirect_jump
> +	      || cfun->machine->indirect_branch_type == indirect_branch_keep));
>  }
>  
>  /* Return a string that documents the current -m options.  The caller is
> @@ -5797,6 +5804,37 @@ ix86_set_func_type (tree fndecl)
>      }
>  }
>  
> @@ -10639,6 +10681,191 @@ ix86_setup_frame_addresses (void)
>  # endif
>  #endif
>  
> +static int indirectlabelno;
> +static bool indirect_thunk_needed = false;
> +static bool indirect_thunk_bnd_needed = false;
> +
> +static int indirect_thunks_used;
> +static int indirect_thunks_bnd_used;

Add comments for variables.
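
For illustration, a sketch of how the variables could be commented, with
the meanings inferred from how ix86_output_indirect_branch uses them later
in this patch.  The register numbers LAST_INT_REG and FIRST_REX_INT_REG
below are placeholder values chosen for the example, not necessarily the
ones in GCC's i386 headers, and the three helper functions are hypothetical.

```c
#include <stdbool.h>

/* Placeholder register numbers for this sketch only.  */
enum { LAST_INT_REG = 7, FIRST_REX_INT_REG = 36 };

/* Counter used to generate unique internal labels for thunk bodies.  */
static int indirectlabelno;

/* True if the memory thunk (branch target on top of the stack) must be
   emitted at the end of the translation unit.  */
static bool indirect_thunk_needed = false;

/* Likewise, for the BND-prefixed memory thunk.  */
static bool indirect_thunk_bnd_needed = false;

/* Bit mask with one bit per integer register whose register thunk is
   used, without and with the BND prefix.  */
static int indirect_thunks_used;
static int indirect_thunks_bnd_used;

/* Hypothetical helper: return the next unused label number.  */
int
next_indirect_label (void)
{
  return indirectlabelno++;
}

/* Hypothetical helper: record that a memory thunk is needed.  */
void
mark_memory_thunk_needed (bool need_bnd_p)
{
  if (need_bnd_p)
    indirect_thunk_bnd_needed = true;
  else
    indirect_thunk_needed = true;
}

/* Map a register number to its bit in the masks, as the patch does:
   REX registers are shifted down to follow the legacy registers.  */
int
thunk_bit_index (int regno)
{
  int i = regno;
  if (i >= FIRST_REX_INT_REG)
    i -= (FIRST_REX_INT_REG - LAST_INT_REG - 1);
  return i;
}

/* Hypothetical helper: record that the register thunk for REGNO is used.  */
void
mark_register_thunk_used (int regno, bool need_bnd_p)
{
  if (need_bnd_p)
    indirect_thunks_bnd_used |= 1 << thunk_bit_index (regno);
  else
    indirect_thunks_used |= 1 << thunk_bit_index (regno);
}
```

With these placeholder numbers, legacy registers occupy bits 0 to 7 and
the REX registers follow from bit 8, so each mask fits in an int.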
> +
> +#ifndef INDIRECT_LABEL
> +# define INDIRECT_LABEL "LIND"
> +#endif
> +
> +/* Fills in the label name that should be used for the indirect thunk.  */
> +
> +static void
> +indirect_thunk_name (char name[32], int regno, bool need_bnd_p)
> +{
> +  if (USE_HIDDEN_LINKONCE)
> +    {
> +      const char *bnd = need_bnd_p ? "_bnd" : "";
> +      if (regno >= 0)
> +	{
> +	  const char *reg_prefix;
> +	  if (LEGACY_INT_REGNO_P (regno))
> +	    reg_prefix = TARGET_64BIT ? "r" : "e";
> +	  else
> +	    reg_prefix = "";
> +	  sprintf (name, "__x86_indirect_thunk%s_%s%s",
> +		   bnd, reg_prefix, reg_names[regno]);
> +	}
> +      else
> +	sprintf (name, "__x86_indirect_thunk%s", bnd);
> +    }
> +  else
> +    {
> +      if (regno >= 0)
> +	{
> +	  if (need_bnd_p)
> +	    ASM_GENERATE_INTERNAL_LABEL (name, "LITBR", regno);
> +	  else
> +	    ASM_GENERATE_INTERNAL_LABEL (name, "LITR", regno);
> +	}
> +      else
> +	{
> +	  if (need_bnd_p)
> +	    ASM_GENERATE_INTERNAL_LABEL (name, "LITB", 0);
> +	  else
> +	    ASM_GENERATE_INTERNAL_LABEL (name, "LIT", 0);
> +	}
> +    }
> +}
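
To make the naming scheme of the USE_HIDDEN_LINKONCE branch above easier
to follow, here is a standalone sketch.  The function thunk_symbol_name
and its parameters are hypothetical: it takes the bare register name as a
string (assumed to be what reg_names[] holds, e.g. "ax" or "r8") instead
of a GCC register number, and uses snprintf rather than sprintf.

```c
#include <stdbool.h>
#include <stdio.h>

/* Build the thunk symbol name in NAME (SIZE bytes).  REG is the bare
   register name, or NULL for the memory thunk.  LEGACY_REG says whether
   REG needs the "r"/"e" width prefix.  Returns the snprintf result.  */
int
thunk_symbol_name (char *name, size_t size, const char *reg,
                   bool legacy_reg, bool is_64bit, bool need_bnd_p)
{
  const char *bnd = need_bnd_p ? "_bnd" : "";

  if (reg == NULL)
    return snprintf (name, size, "__x86_indirect_thunk%s", bnd);

  const char *reg_prefix = "";
  if (legacy_reg)
    reg_prefix = is_64bit ? "r" : "e";

  return snprintf (name, size, "__x86_indirect_thunk%s_%s%s",
                   bnd, reg_prefix, reg);
}
```

For example, "ax" on a 64-bit target yields __x86_indirect_thunk_rax, and
"r8" with the BND prefix yields __x86_indirect_thunk_bnd_r8.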
> +
> +/* Output a call and return thunk for indirect branch.  If BND_P is
> +   true, the BND prefix is needed.   If REGNO != -1,  the function
> +   address is in REGNO.  Otherwise, the function address is on the
> +   top of stack.  */

Add a comment here that gives an example of the full thunk body, so it is
easier to see what is going on.
> +
> +static void
> +output_indirect_thunk (bool need_bnd_p, int regno)
> +{
> +  char indirectlabel1[32];
> +  char indirectlabel2[32];
> +
> +  ASM_GENERATE_INTERNAL_LABEL (indirectlabel1, INDIRECT_LABEL,
> +			       indirectlabelno++);
> +  ASM_GENERATE_INTERNAL_LABEL (indirectlabel2, INDIRECT_LABEL,
> +			       indirectlabelno++);
> +
> +  /* Call */
> +  if (need_bnd_p)
> +    fputs ("\tbnd call\t", asm_out_file);
> +  else
> +    fputs ("\tcall\t", asm_out_file);
> +  assemble_name_raw (asm_out_file, indirectlabel2);
> +  fputc ('\n', asm_out_file);
> +
> +  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
> +
> +  /* Pause .  */
> +  fprintf (asm_out_file, "\tpause\n");
> +
> +  /* Jump.  */
> +  fputs ("\tjmp\t", asm_out_file);
> +  assemble_name_raw (asm_out_file, indirectlabel1);
> +  fputc ('\n', asm_out_file);
> +
> +  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
> +
> +  if (regno >= 0)
> +    {
> +      /* MOV.  */
> +      rtx xops[2];
> +      xops[0] = gen_rtx_MEM (word_mode, stack_pointer_rtx);
> +      xops[1] = gen_rtx_REG (word_mode, regno);
> +      output_asm_insn ("mov\t{%1, %0|%0, %1}", xops);
> +    }
> +  else
> +    {
> +      /* LEA.  */
> +      rtx xops[2];
> +      xops[0] = stack_pointer_rtx;
> +      xops[1] = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
> +      output_asm_insn ("lea\t{%E1, %0|%0, %E1}", xops);
> +    }
> +
> +  if (need_bnd_p)
> +    fputs ("\tbnd ret\n", asm_out_file);
> +  else
> +    fputs ("\tret\n", asm_out_file);
> +}
> +
> +/* Output a function with a call and return thunk for indirect branch.
> +   If BND_P is true, the BND prefix is needed.   If REGNO != -1,  the
> +   function address is in REGNO.  Otherwise, the function address is
> +   on the top of stack.  */
> +
> +static void
> +output_indirect_thunk_function (bool need_bnd_p, int regno)
> +{
> +  char name[32];
> +  tree decl;
> +
> +  /* Create __x86_indirect_thunk/__x86_indirect_thunk_bnd.  */
> +  indirect_thunk_name (name, regno, need_bnd_p);
> +  decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
> +		     get_identifier (name),
> +		     build_function_type_list (void_type_node, NULL_TREE));
> +  DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
> +				   NULL_TREE, void_type_node);
> +  TREE_PUBLIC (decl) = 1;
> +  TREE_STATIC (decl) = 1;
> +  DECL_IGNORED_P (decl) = 1;

DECL_ARTIFICIAL as well?

> +
> +#if TARGET_MACHO
> +  if (TARGET_MACHO)
> +    {
> +      switch_to_section (darwin_sections[picbase_thunk_section]);
> +      fputs ("\t.weak_definition\t", asm_out_file);
> +      assemble_name (asm_out_file, name);
> +      fputs ("\n\t.private_extern\t", asm_out_file);
> +      assemble_name (asm_out_file, name);
> +      putc ('\n', asm_out_file);
> +      ASM_OUTPUT_LABEL (asm_out_file, name);
> +      DECL_WEAK (decl) = 1;
> +    }
> +  else
> +#endif
> +    if (USE_HIDDEN_LINKONCE)
> +      {
> +	cgraph_node::create (decl)->set_comdat_group (DECL_ASSEMBLER_NAME (decl));
> +
> +	targetm.asm_out.unique_section (decl, 0);
> +	switch_to_section (get_named_section (decl, NULL, 0));
> +
> +	targetm.asm_out.globalize_label (asm_out_file, name);
> +	fputs ("\t.hidden\t", asm_out_file);
> +	assemble_name (asm_out_file, name);
> +	putc ('\n', asm_out_file);
> +	ASM_DECLARE_FUNCTION_NAME (asm_out_file, name, decl);
> +      }
> +    else
> +      {
> +	switch_to_section (text_section);
> +	ASM_OUTPUT_LABEL (asm_out_file, name);
> +      }
> +

Why do you need to output asm visibility directives by hand when you create
a symbol for the function anyway?
I would expect that you can just use code similar to cgraph_node::expand_thunk
when it calls output_mi_thunk, and get this done in a way that is independent
of the target assembler.
>  
> +/* Output indirect branch via a call and return thunk.  CALL_OP is
> +   the branch target.  XASM is the assembly template for CALL_OP.
> +   Branch is a tail call if SIBCALL_P is true.  */

Again, please add an example of the code sequences output here to make the code easier to follow.
> +
> +static void
> +ix86_output_indirect_branch (rtx call_op, const char *xasm,
> +			     bool sibcall_p)
> +{
> +  char thunk_name_buf[32];
> +  char *thunk_name;
> +  char push_buf[64];
> +  bool need_bnd_p = ix86_bnd_prefixed_insn_p (current_output_insn);
> +  int regno;
> +
> +  if (REG_P (call_op))
> +    regno = REGNO (call_op);
> +  else
> +    regno = -1;
> +
> +  if (cfun->machine->indirect_branch_type
> +      != indirect_branch_thunk_inline)
> +    {
> +      if (cfun->machine->indirect_branch_type == indirect_branch_thunk)
> +	{
> +	  if (regno >= 0)
> +	    {
> +	      int i = regno;
> +	      if (i >= FIRST_REX_INT_REG)
> +		i -= (FIRST_REX_INT_REG - LAST_INT_REG - 1);
> +	      if (need_bnd_p)
> +		indirect_thunks_bnd_used |= 1 << i;
> +	      else
> +		indirect_thunks_used |= 1 << i;
> +	    }
> +	  else
> +	    {
> +	      if (need_bnd_p)
> +		indirect_thunk_bnd_needed = true;
> +	      else
> +		indirect_thunk_needed = true;
> +	    }
> +	}
> +      indirect_thunk_name (thunk_name_buf, regno, need_bnd_p);
> +      thunk_name = thunk_name_buf;
> +    }
> +  else
> +    thunk_name = NULL;
> +
> +  snprintf (push_buf, sizeof (push_buf), "push{%c}\t%s",
> +	    TARGET_64BIT ? 'q' : 'l', xasm);
> +
> +  if (sibcall_p)
> +    {
> +      if (regno < 0)
> +	output_asm_insn (push_buf, &call_op);
> +      if (thunk_name != NULL)
> +	{
> +	  if (need_bnd_p)
> +	    fprintf (asm_out_file, "\tbnd jmp\t%s\n", thunk_name);
> +	  else
> +	    fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
> +	}
> +      else
> +	output_indirect_thunk (need_bnd_p, regno);
> +    }
> +  else
> +    {
> +      if (regno >= 0 && thunk_name != NULL)
> +	{
> +	  if (need_bnd_p)
> +	    fprintf (asm_out_file, "\tbnd call\t%s\n", thunk_name);
> +	  else
> +	    fprintf (asm_out_file, "\tcall\t%s\n", thunk_name);
> +	  return;
> +	}
> +
> +      char indirectlabel1[32];
> +      char indirectlabel2[32];
> +
> +      ASM_GENERATE_INTERNAL_LABEL (indirectlabel1,
> +				   INDIRECT_LABEL,
> +				   indirectlabelno++);
> +      ASM_GENERATE_INTERNAL_LABEL (indirectlabel2,
> +				   INDIRECT_LABEL,
> +				   indirectlabelno++);
> +
> +      /* Jump.  */
> +      if (need_bnd_p)
> +	fputs ("\tbnd jmp\t", asm_out_file);
> +      else
> +	fputs ("\tjmp\t", asm_out_file);
> +      assemble_name_raw (asm_out_file, indirectlabel2);
> +      fputc ('\n', asm_out_file);
> +
> +      ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
> +
> +      if (MEM_P (call_op))
> +	{
> +	  struct ix86_address parts;
> +	  rtx addr = XEXP (call_op, 0);
> +	  if (ix86_decompose_address (addr, &parts)
> +	      && parts.base == stack_pointer_rtx)
> +	    {
> +	      /* Since call will adjust stack by -UNITS_PER_WORD,
> +		 we must convert "disp(stack, index, scale)" to
> +		 "disp+UNITS_PER_WORD(stack, index, scale)".  */
> +	      if (parts.index)
> +		{
> +		  addr = gen_rtx_MULT (Pmode, parts.index,
> +				       GEN_INT (parts.scale));
> +		  addr = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
> +				       addr);
> +		}
> +	      else
> +		addr = stack_pointer_rtx;
> +
> +	      rtx disp;
> +	      if (parts.disp != NULL_RTX)
> +		disp = plus_constant (Pmode, parts.disp,
> +				      UNITS_PER_WORD);
> +	      else
> +		disp = GEN_INT (UNITS_PER_WORD);
> +
> +	      addr = gen_rtx_PLUS (Pmode, addr, disp);
> +	      call_op = gen_rtx_MEM (GET_MODE (call_op), addr);
> +	    }
> +	}
> +
> +      if (regno < 0)
> +	output_asm_insn (push_buf, &call_op);
> +
> +      if (thunk_name != NULL)
> +	{
> +	  if (need_bnd_p)
> +	    fprintf (asm_out_file, "\tbnd jmp\t%s\n", thunk_name);
> +	  else
> +	    fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
> +	}
> +      else
> +	output_indirect_thunk (need_bnd_p, regno);
> +
> +      ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
> +
> +      /* Call.  */
> +      if (need_bnd_p)
> +	fputs ("\tbnd call\t", asm_out_file);
> +      else
> +	fputs ("\tcall\t", asm_out_file);
> +      assemble_name_raw (asm_out_file, indirectlabel1);
> +      fputc ('\n', asm_out_file);
> +    }
> +}

It is really not very pretty that the whole sequence is injected into the insn
stream as a single blob.  How opaque is it?  Does it need to be patched at
dynamic link time?

I suppose it may make sense to split the insn and at least explicitly represent
the fact that we (sometimes) push the target to the stack.
Why does the memory variant exist in the first place?
>  
>  (define_insn "*indirect_jump"
>    [(set (pc) (match_operand:W 0 "indirect_branch_operand" "rBw"))]
>    ""
> -  "%!jmp\t%A0"
> +  "* return ix86_output_indirect_jmp (operands[0], false);"
>    [(set_attr "type" "ibr")
>     (set_attr "length_immediate" "0")
>     (set_attr "maybe_prefix_bnd" "1")])

I think you also want to update the type to "many" when we do more than just an indirect branch.
We do not care much about this, but it feels wrong to have attributes that do not match reality.
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index f3e4a63ab46..ddb6035be96 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -5754,6 +5754,16 @@ Specify which floating-point unit to use.  You must specify the
>  @code{target("fpmath=sse+387")} because the comma would separate
>  different options.
>  
> +@item indirect_branch("@var{choice}")
> +@cindex @code{indirect_branch} function attribute, x86
> +On x86 targets, the @code{indirect_branch} attribute causes the compiler
> +to convert indirect call and jump with @var{choice}.  @samp{keep}
> +keeps indirect call and jump unmodified.  @samp{thunk} converts indirect
> +call and jump to call and return thunk.  @samp{thunk-inline} converts
> +indirect call and jump to inlined call and return thunk.
> +@samp{thunk-extern} converts indirect call and jump to external call
> +and return thunk provided in a separate object file.

Please expand the documentation so that a user who is not aware of the
issue will understand that these are security features that come at a cost.

> +@opindex -mindirect-branch
> +Convert indirect call and jump with @var{choice}.  The default is
> +@samp{keep}, which keeps indirect call and jump unmodified.
> +@samp{thunk} converts indirect call and jump to call and return thunk.
> +@samp{thunk-inline} converts indirect call and jump to inlined call
> +and return thunk.  @samp{thunk-extern} converts indirect call and jump
> +to external call and return thunk provided in a separate object file.
> +You can control this behavior for a specific function by using the
> +function attribute @code{indirect_branch}.  @xref{Function Attributes}.

Similarly here.

The rest of the patch seems OK.  We may want to represent more of the
indirect jump/call sequence in RTL, but at this point keeping things
simple and localized is probably not a bad idea.  This can be done incrementally.

Please make an updated patch; I would like to give others a chance to comment today.

Honza

H.J. Lu Jan. 13, 2018, 3:49 p.m. | #2
On Fri, Jan 12, 2018 at 9:47 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
>> index f245c1573cf..f14cbeee7a1 100644
>> --- a/gcc/config/i386/i386-opts.h
>> +++ b/gcc/config/i386/i386-opts.h
>> @@ -106,4 +106,12 @@ enum prefer_vector_width {
>>      PVW_AVX512
>>  };
>>
>> +enum indirect_branch {
>> +  indirect_branch_unset = 0,
>> +  indirect_branch_keep,
>> +  indirect_branch_thunk,
>> +  indirect_branch_thunk_inline,
>> +  indirect_branch_thunk_extern
>> +};
> I think it would make sense to simply place the body of your introduction email
> here as a comment explaining what enum indirect_branch does.

Will do.

>> -/* Return true if a red-zone is in use.  */
>> +/* Return true if a red-zone is in use.  We can't use red-zone when
>> +   there are local indirect jumps, like "indirect_jump" or "tablejump",
>> +   which jumps to another place in the function, since "call" in the
>> +   indirect thunk pushes the return address onto stack, destroying
>> +   red-zone.  */
>
> Technically we can use red-zone if we reserve space for the call address, right?
> Perhaps mention it as a TODO.  That should not even be too hard to implement,
> but it is probably not that important for code quality in practice.

Will do.

>>  bool
>>  ix86_using_red_zone (void)
>>  {
>> -  return TARGET_RED_ZONE && !TARGET_64BIT_MS_ABI;
>> +  return (TARGET_RED_ZONE
>> +       && !TARGET_64BIT_MS_ABI
>> +       && (!cfun->machine->has_local_indirect_jump
>> +           || cfun->machine->indirect_branch_type == indirect_branch_keep));
>>  }
>>
>>  /* Return a string that documents the current -m options.  The caller is
>> @@ -5797,6 +5804,37 @@ ix86_set_func_type (tree fndecl)
>>      }
>>  }
>>
>> @@ -10639,6 +10681,191 @@ ix86_setup_frame_addresses (void)
>>  # endif
>>  #endif
>>
>> +static int indirectlabelno;
>> +static bool indirect_thunk_needed = false;
>> +static bool indirect_thunk_bnd_needed = false;
>> +
>> +static int indirect_thunks_used;
>> +static int indirect_thunks_bnd_used;
>
> Add comments for variables.

Will do.

>> +
>> +#ifndef INDIRECT_LABEL
>> +# define INDIRECT_LABEL "LIND"
>> +#endif
>> +
>> +/* Fills in the label name that should be used for the indirect thunk.  */
>> +
>> +static void
>> +indirect_thunk_name (char name[32], int regno, bool need_bnd_p)
>> +{
>> +  if (USE_HIDDEN_LINKONCE)
>> +    {
>> +      const char *bnd = need_bnd_p ? "_bnd" : "";
>> +      if (regno >= 0)
>> +     {
>> +       const char *reg_prefix;
>> +       if (LEGACY_INT_REGNO_P (regno))
>> +         reg_prefix = TARGET_64BIT ? "r" : "e";
>> +       else
>> +         reg_prefix = "";
>> +       sprintf (name, "__x86_indirect_thunk%s_%s%s",
>> +                bnd, reg_prefix, reg_names[regno]);
>> +     }
>> +      else
>> +     sprintf (name, "__x86_indirect_thunk%s", bnd);
>> +    }
>> +  else
>> +    {
>> +      if (regno >= 0)
>> +     {
>> +       if (need_bnd_p)
>> +         ASM_GENERATE_INTERNAL_LABEL (name, "LITBR", regno);
>> +       else
>> +         ASM_GENERATE_INTERNAL_LABEL (name, "LITR", regno);
>> +     }
>> +      else
>> +     {
>> +       if (need_bnd_p)
>> +         ASM_GENERATE_INTERNAL_LABEL (name, "LITB", 0);
>> +       else
>> +         ASM_GENERATE_INTERNAL_LABEL (name, "LIT", 0);
>> +     }
>> +    }
>> +}
>> +
>> +/* Output a call and return thunk for indirect branch.  If BND_P is
>> +   true, the BND prefix is needed.   If REGNO != -1,  the function
>> +   address is in REGNO.  Otherwise, the function address is on the
>> +   top of stack.  */
>
> Add a comment here that gives an example of the full thunk body, so it is
> easier to see what is going on.

Sure.

>> +
>> +static void
>> +output_indirect_thunk (bool need_bnd_p, int regno)
>> +{
>> +  char indirectlabel1[32];
>> +  char indirectlabel2[32];
>> +
>> +  ASM_GENERATE_INTERNAL_LABEL (indirectlabel1, INDIRECT_LABEL,
>> +                            indirectlabelno++);
>> +  ASM_GENERATE_INTERNAL_LABEL (indirectlabel2, INDIRECT_LABEL,
>> +                            indirectlabelno++);
>> +
>> +  /* Call.  */
>> +  if (need_bnd_p)
>> +    fputs ("\tbnd call\t", asm_out_file);
>> +  else
>> +    fputs ("\tcall\t", asm_out_file);
>> +  assemble_name_raw (asm_out_file, indirectlabel2);
>> +  fputc ('\n', asm_out_file);
>> +
>> +  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
>> +
>> +  /* Pause.  */
>> +  fprintf (asm_out_file, "\tpause\n");
>> +
>> +  /* Jump.  */
>> +  fputs ("\tjmp\t", asm_out_file);
>> +  assemble_name_raw (asm_out_file, indirectlabel1);
>> +  fputc ('\n', asm_out_file);
>> +
>> +  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
>> +
>> +  if (regno >= 0)
>> +    {
>> +      /* MOV.  */
>> +      rtx xops[2];
>> +      xops[0] = gen_rtx_MEM (word_mode, stack_pointer_rtx);
>> +      xops[1] = gen_rtx_REG (word_mode, regno);
>> +      output_asm_insn ("mov\t{%1, %0|%0, %1}", xops);
>> +    }
>> +  else
>> +    {
>> +      /* LEA.  */
>> +      rtx xops[2];
>> +      xops[0] = stack_pointer_rtx;
>> +      xops[1] = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
>> +      output_asm_insn ("lea\t{%E1, %0|%0, %E1}", xops);
>> +    }
>> +
>> +  if (need_bnd_p)
>> +    fputs ("\tbnd ret\n", asm_out_file);
>> +  else
>> +    fputs ("\tret\n", asm_out_file);
>> +}
>> +
>> +/* Output a function with a call and return thunk for indirect branch.
>> +   If NEED_BND_P is true, the BND prefix is needed.   If REGNO != -1,  the
>> +   function address is in REGNO.  Otherwise, the function address is
>> +   on the top of stack.  */
>> +
>> +static void
>> +output_indirect_thunk_function (bool need_bnd_p, int regno)
>> +{
>> +  char name[32];
>> +  tree decl;
>> +
>> +  /* Create __x86_indirect_thunk/__x86_indirect_thunk_bnd.  */
>> +  indirect_thunk_name (name, regno, need_bnd_p);
>> +  decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
>> +                  get_identifier (name),
>> +                  build_function_type_list (void_type_node, NULL_TREE));
>> +  DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
>> +                                NULL_TREE, void_type_node);
>> +  TREE_PUBLIC (decl) = 1;
>> +  TREE_STATIC (decl) = 1;
>> +  DECL_IGNORED_P (decl) = 1;
>
> DECL_ARTIFICIAL as well?

This is done exactly the same way as PIC thunk.  I don't think we
should change it here.

>
>> +
>> +#if TARGET_MACHO
>> +  if (TARGET_MACHO)
>> +    {
>> +      switch_to_section (darwin_sections[picbase_thunk_section]);
>> +      fputs ("\t.weak_definition\t", asm_out_file);
>> +      assemble_name (asm_out_file, name);
>> +      fputs ("\n\t.private_extern\t", asm_out_file);
>> +      assemble_name (asm_out_file, name);
>> +      putc ('\n', asm_out_file);
>> +      ASM_OUTPUT_LABEL (asm_out_file, name);
>> +      DECL_WEAK (decl) = 1;
>> +    }
>> +  else
>> +#endif
>> +    if (USE_HIDDEN_LINKONCE)
>> +      {
>> +     cgraph_node::create (decl)->set_comdat_group (DECL_ASSEMBLER_NAME (decl));
>> +
>> +     targetm.asm_out.unique_section (decl, 0);
>> +     switch_to_section (get_named_section (decl, NULL, 0));
>> +
>> +     targetm.asm_out.globalize_label (asm_out_file, name);
>> +     fputs ("\t.hidden\t", asm_out_file);
>> +     assemble_name (asm_out_file, name);
>> +     putc ('\n', asm_out_file);
>> +     ASM_DECLARE_FUNCTION_NAME (asm_out_file, name, decl);
>> +      }
>> +    else
>> +      {
>> +     switch_to_section (text_section);
>> +     ASM_OUTPUT_LABEL (asm_out_file, name);
>> +      }
>> +
>
> Why do you need to output asm visibility directives by hand when you create
> symbol for the function anyway?

This is done exactly the same way as PIC thunk.  I don't think we
should change it here.

> I would expect that you can just use similar code as cgraph_node::expand_thunk
> when calls output_mi_thunk and get this done in a way that is independent of
> target assembler.

I took a look.  I don't see an easy way to do it.  I'd like to keep it exactly the
same as the PIC thunk.  We can change both thunks together later if needed.

>>
>> +/* Output indirect branch via a call and return thunk.  CALL_OP is
>> +   the branch target.  XASM is the assembly template for CALL_OP.
>> +   Branch is a tail call if SIBCALL_P is true.  */
>
> Again please add example of code sequences output here to make code easier to follow.

Will do.
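
A sketch of the sequences in question, following the conversions listed in the commit message, is shown below.  It models only -mindirect-branch=thunk for 64-bit code without the BND prefix; sketch_indirect_branch, the ".LIND" label spellings and the register suffix handling are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Format the replacement for an indirect branch into BUF.  REG is the
   register name holding the target, or NULL when the target is the
   memory operand MEM.  SIBCALL_P selects a jump (tail call) instead
   of a call.  */
static int
sketch_indirect_branch (char *buf, size_t size, const char *reg,
			const char *mem, bool sibcall_p)
{
  /* "jmp reg" and "call reg" become a direct branch to the
     register thunk.  */
  if (reg != NULL)
    return snprintf (buf, size, "\t%s\t__x86_indirect_thunk_%s\n",
		     sibcall_p ? "jmp" : "call", reg);

  /* "jmp mem" pushes the target and jumps to the memory thunk.  */
  if (sibcall_p)
    return snprintf (buf, size,
		     "\tpushq\t%s\n"
		     "\tjmp\t__x86_indirect_thunk\n", mem);

  /* "call mem" first calls a local label so that a return address is
     on the stack, then pushes the target on top of it.  */
  return snprintf (buf, size,
		   "\tjmp\t.LIND1\n"
		   ".LIND0:\n"
		   "\tpushq\t%s\n"
		   "\tjmp\t__x86_indirect_thunk\n"
		   ".LIND1:\n"
		   "\tcall\t.LIND0\n", mem);
}
```

Only the "call mem" case needs the extra local labels, because the thunk's final "ret" must find a genuine return address underneath the pushed target.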

>> +
>> +static void
>> +ix86_output_indirect_branch (rtx call_op, const char *xasm,
>> +                          bool sibcall_p)
>> +{
>> +  char thunk_name_buf[32];
>> +  char *thunk_name;
>> +  char push_buf[64];
>> +  bool need_bnd_p = ix86_bnd_prefixed_insn_p (current_output_insn);
>> +  int regno;
>> +
>> +  if (REG_P (call_op))
>> +    regno = REGNO (call_op);
>> +  else
>> +    regno = -1;
>> +
>> +  if (cfun->machine->indirect_branch_type
>> +      != indirect_branch_thunk_inline)
>> +    {
>> +      if (cfun->machine->indirect_branch_type == indirect_branch_thunk)
>> +     {
>> +       if (regno >= 0)
>> +         {
>> +           int i = regno;
>> +           if (i >= FIRST_REX_INT_REG)
>> +             i -= (FIRST_REX_INT_REG - LAST_INT_REG - 1);
>> +           if (need_bnd_p)
>> +             indirect_thunks_bnd_used |= 1 << i;
>> +           else
>> +             indirect_thunks_used |= 1 << i;
>> +         }
>> +       else
>> +         {
>> +           if (need_bnd_p)
>> +             indirect_thunk_bnd_needed = true;
>> +           else
>> +             indirect_thunk_needed = true;
>> +         }
>> +     }
>> +      indirect_thunk_name (thunk_name_buf, regno, need_bnd_p);
>> +      thunk_name = thunk_name_buf;
>> +    }
>> +  else
>> +    thunk_name = NULL;
>> +
>> +  snprintf (push_buf, sizeof (push_buf), "push{%c}\t%s",
>> +         TARGET_64BIT ? 'q' : 'l', xasm);
>> +
>> +  if (sibcall_p)
>> +    {
>> +      if (regno < 0)
>> +     output_asm_insn (push_buf, &call_op);
>> +      if (thunk_name != NULL)
>> +     {
>> +       if (need_bnd_p)
>> +         fprintf (asm_out_file, "\tbnd jmp\t%s\n", thunk_name);
>> +       else
>> +         fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
>> +     }
>> +      else
>> +     output_indirect_thunk (need_bnd_p, regno);
>> +    }
>> +  else
>> +    {
>> +      if (regno >= 0 && thunk_name != NULL)
>> +     {
>> +       if (need_bnd_p)
>> +         fprintf (asm_out_file, "\tbnd call\t%s\n", thunk_name);
>> +       else
>> +         fprintf (asm_out_file, "\tcall\t%s\n", thunk_name);
>> +       return;
>> +     }
>> +
>> +      char indirectlabel1[32];
>> +      char indirectlabel2[32];
>> +
>> +      ASM_GENERATE_INTERNAL_LABEL (indirectlabel1,
>> +                                INDIRECT_LABEL,
>> +                                indirectlabelno++);
>> +      ASM_GENERATE_INTERNAL_LABEL (indirectlabel2,
>> +                                INDIRECT_LABEL,
>> +                                indirectlabelno++);
>> +
>> +      /* Jump.  */
>> +      if (need_bnd_p)
>> +     fputs ("\tbnd jmp\t", asm_out_file);
>> +      else
>> +     fputs ("\tjmp\t", asm_out_file);
>> +      assemble_name_raw (asm_out_file, indirectlabel2);
>> +      fputc ('\n', asm_out_file);
>> +
>> +      ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
>> +
>> +      if (MEM_P (call_op))
>> +     {
>> +       struct ix86_address parts;
>> +       rtx addr = XEXP (call_op, 0);
>> +       if (ix86_decompose_address (addr, &parts)
>> +           && parts.base == stack_pointer_rtx)
>> +         {
>> +           /* Since call will adjust stack by -UNITS_PER_WORD,
>> +              we must convert "disp(stack, index, scale)" to
>> +              "disp+UNITS_PER_WORD(stack, index, scale)".  */
>> +           if (parts.index)
>> +             {
>> +               addr = gen_rtx_MULT (Pmode, parts.index,
>> +                                    GEN_INT (parts.scale));
>> +               addr = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
>> +                                    addr);
>> +             }
>> +           else
>> +             addr = stack_pointer_rtx;
>> +
>> +           rtx disp;
>> +           if (parts.disp != NULL_RTX)
>> +             disp = plus_constant (Pmode, parts.disp,
>> +                                   UNITS_PER_WORD);
>> +           else
>> +             disp = GEN_INT (UNITS_PER_WORD);
>> +
>> +           addr = gen_rtx_PLUS (Pmode, addr, disp);
>> +           call_op = gen_rtx_MEM (GET_MODE (call_op), addr);
>> +         }
>> +     }
>> +
>> +      if (regno < 0)
>> +     output_asm_insn (push_buf, &call_op);
>> +
>> +      if (thunk_name != NULL)
>> +     {
>> +       if (need_bnd_p)
>> +         fprintf (asm_out_file, "\tbnd jmp\t%s\n", thunk_name);
>> +       else
>> +         fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
>> +     }
>> +      else
>> +     output_indirect_thunk (need_bnd_p, regno);
>> +
>> +      ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
>> +
>> +      /* Call.  */
>> +      if (need_bnd_p)
>> +     fputs ("\tbnd call\t", asm_out_file);
>> +      else
>> +     fputs ("\tcall\t", asm_out_file);
>> +      assemble_name_raw (asm_out_file, indirectlabel1);
>> +      fputc ('\n', asm_out_file);
>> +    }
>> +}
>
> It is really not very pretty that the whole sequence is injected into insn stream
> as a single blob.  How opaque it is? Does it need to be patched at dynamic link time?

It must be very opaque to optimizers.   No, we don't patch at load-time.

> I suppose it may make sense to split the insn and at least explicitly represent
> the fact that we (sometimes) push the target to stack.

Did you mean "split the function"?  I will break it into
ix86_output_indirect_branch_via_reg and
ix86_output_indirect_branch_via_push.

> Why the memory variant exists at first place?

When the function address is in memory, we can push it onto the stack
to save a register.  It is also needed to cover the "call foo" to
"call [foo@GOT]" conversion.
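
One subtlety of the memory variant is that, for a non-tail call, the push executes after the local "call" has already lowered the stack pointer by one word, so a stack-relative operand has to be rebased.  A minimal sketch of that adjustment, assuming 64-bit code and using hypothetical names:

```c
#include <stdbool.h>

/* Word size assumed for this sketch (64-bit code).  */
enum { SKETCH_UNITS_PER_WORD = 8 };

/* Return the displacement to use in the push when the original operand
   was DISP(base).  If the base is the stack pointer, the return
   address pushed by the local call sits below the original slot, so
   the displacement grows by one word.  Other bases are unaffected.  */
static long
sketch_adjust_disp (bool base_is_stack_pointer, long disp)
{
  return base_is_stack_pointer ? disp + SKETCH_UNITS_PER_WORD : disp;
}
```

For example, "call *16(%rsp)" would push from 24(%rsp), while "call *16(%rdi)" keeps its displacement.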

>>  (define_insn "*indirect_jump"
>>    [(set (pc) (match_operand:W 0 "indirect_branch_operand" "rBw"))]
>>    ""
>> -  "%!jmp\t%A0"
>> +  "* return ix86_output_indirect_jmp (operands[0], false);"
>>    [(set_attr "type" "ibr")
>>     (set_attr "length_immediate" "0")
>>     (set_attr "maybe_prefix_bnd" "1")])
>
> I think you also want to update type to "many" when we do more than just indirect branch.

Did you mean "multi"?  I will change to "multi".

> We do not care much about this, but it feels wrong to have attributes off reality.
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index f3e4a63ab46..ddb6035be96 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -5754,6 +5754,16 @@ Specify which floating-point unit to use.  You must specify the
>>  @code{target("fpmath=sse+387")} because the comma would separate
>>  different options.
>>
>> +@item indirect_branch("@var{choice}")
>> +@cindex @code{indirect_branch} function attribute, x86
>> +On x86 targets, the @code{indirect_branch} attribute causes the compiler
>> +to convert indirect call and jump with @var{choice}.  @samp{keep}
>> +keeps indirect call and jump unmodified.  @samp{thunk} converts indirect
>> +call and jump to call and return thunk.  @samp{thunk-inline} converts
>> +indirect call and jump to inlined call and return thunk.
>> +@samp{thunk-extern} converts indirect call and jump to external call
>> +and return thunk provided in a separate object file.
>
> Please expand the documentation in a way that random user who is not aware of the
> issue will understand that those are security features that come at a cost.
>
>> +@opindex -mindirect-branch
>> +Convert indirect call and jump with @var{choice}.  The default is
>> +@samp{keep}, which keeps indirect call and jump unmodified.
>> +@samp{thunk} converts indirect call and jump to call and return thunk.
>> +@samp{thunk-inline} converts indirect call and jump to inlined call
>> +and return thunk.  @samp{thunk-extern} converts indirect call and jump
>> +to external call and return thunk provided in a separate object file.
>> +You can control this behavior for a specific function by using the
>> +function attribute @code{indirect_branch}.  @xref{Function Attributes}.
>
> Similarly here.

I will update the documentation with user guide info after the Intel
white paper is published.
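
Until then, a small usage sketch may help readers of the thread.  It assumes a file built with -mindirect-branch=thunk in which one dispatcher opts out through the attribute; the SKETCH_KEEP_INDIRECT macro and the function names are invented for this example, and the attribute is applied only when the compiler reports support for it.

```c
/* Apply indirect_branch("keep") only if the compiler knows the
   attribute; otherwise the macro expands to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(indirect_branch)
#  define SKETCH_KEEP_INDIRECT __attribute__((indirect_branch("keep")))
# endif
#endif
#ifndef SKETCH_KEEP_INDIRECT
# define SKETCH_KEEP_INDIRECT
#endif

typedef int (*sketch_op) (int);

static int
sketch_twice (int x)
{
  return 2 * x;
}

static int
sketch_negate (int x)
{
  return -x;
}

/* The indirect call in this function is left unmodified even when the
   rest of the file is compiled with -mindirect-branch=thunk.  */
SKETCH_KEEP_INDIRECT static int
sketch_dispatch (sketch_op op, int x)
{
  return op (x);
}
```

Whether opting a function out is acceptable depends on the threat model; the attribute changes only the generated branch sequence, not the function's behavior.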

> Rest of the patch seems OK. We may want incrementally represent more of the
> indirect jump/call sequence in RTL, but at this point probably keeping things
> simple and localized is not a bad idea. This can be done incrementally.
>
> Please make updated patch and I would like to give others chance to comment today.
>
> Honza
Jan Hubicka Jan. 13, 2018, 3:56 p.m. | #3
> >> +/* Output a function with a call and return thunk for indirect branch.
> >> +   If NEED_BND_P is true, the BND prefix is needed.   If REGNO != -1,  the
> >> +   function address is in REGNO.  Otherwise, the function address is
> >> +   on the top of stack.  */
> >> +
> >> +static void
> >> +output_indirect_thunk_function (bool need_bnd_p, int regno)
> >> +{
> >> +  char name[32];
> >> +  tree decl;
> >> +
> >> +  /* Create __x86_indirect_thunk/__x86_indirect_thunk_bnd.  */
> >> +  indirect_thunk_name (name, regno, need_bnd_p);
> >> +  decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
> >> +                  get_identifier (name),
> >> +                  build_function_type_list (void_type_node, NULL_TREE));
> >> +  DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
> >> +                                NULL_TREE, void_type_node);
> >> +  TREE_PUBLIC (decl) = 1;
> >> +  TREE_STATIC (decl) = 1;
> >> +  DECL_IGNORED_P (decl) = 1;
> >
> > DECL_ARTIFICIAL as well?
> 
> This is done exactly the same way as PIC thunk.  I don't think we
> should change it here.
> >
> > Why do you need to output asm visibility directives by hand when you create
> > symbol for the function anyway?
> 
> This is done exactly the same way as PIC thunk.  I don't think we
> should change it here.

I see, it is pretty ancient code.  Perhaps you can at least commonize
the ugliness so we don't duplicate the ifdefs for MACHO?

> 
> > I would expect that you can just use similar code as cgraph_node::expand_thunk
> > when calls output_mi_thunk and get this done in a way that is independent of
> > target assembler.
> 
> I took a look.  I don't see an easy way to do it.  I'd like to keep it exactly the
> same as the PIC thunk.  We can change both thunks together later if needed.

OK, let's keep it done the same way as the PIC thunk.
Next stage1 we could clean up both.
> > I suppose it may make sense to split the insn and at least explicitly represent
> > the fact that we (sometimes) push the target to stack.
> 
> Did you mean "split the function"?  I will break it into
> ix86_output_indirect_branch_via_reg and
> ix86_output_indirect_branch_via_push.

I mean define_split to insert push instruction into the instruction
stream rather than printing it as part of the indirect jump insn.
> 
> > Why the memory variant exists at first place?
> 
> When the function address is in memory, we can push it onto stack
> to save a register.   Also it is needed to cover "call foo" to "call [foo@GOT]"
> conversion.
> >
> > I think you also want to update type to "many" when we do more than just indirect branch.
> 
> Did you mean "multi"?  I will change to "multi".

Yes.

> 
> > We do not care much about this, but it feels wrong to have attributes off reality.
> >> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> >> index f3e4a63ab46..ddb6035be96 100644
> >> --- a/gcc/doc/extend.texi
> >> +++ b/gcc/doc/extend.texi
> >> @@ -5754,6 +5754,16 @@ Specify which floating-point unit to use.  You must specify the
> >>  @code{target("fpmath=sse+387")} because the comma would separate
> >>  different options.
> >>
> >> +@item indirect_branch("@var{choice}")
> >> +@cindex @code{indirect_branch} function attribute, x86
> >> +On x86 targets, the @code{indirect_branch} attribute causes the compiler
> >> +to convert indirect call and jump with @var{choice}.  @samp{keep}
> >> +keeps indirect call and jump unmodified.  @samp{thunk} converts indirect
> >> +call and jump to call and return thunk.  @samp{thunk-inline} converts
> >> +indirect call and jump to inlined call and return thunk.
> >> +@samp{thunk-extern} converts indirect call and jump to external call
> >> +and return thunk provided in a separate object file.
> >
> > Please expand the documentation in a way that random user who is not aware of the
> > issue will understand that those are security features that come at a cost.
> >
> >> +@opindex -mindirect-branch
> >> +Convert indirect call and jump with @var{choice}.  The default is
> >> +@samp{keep}, which keeps indirect call and jump unmodified.
> >> +@samp{thunk} converts indirect call and jump to call and return thunk.
> >> +@samp{thunk-inline} converts indirect call and jump to inlined call
> >> +and return thunk.  @samp{thunk-extern} converts indirect call and jump
> >> +to external call and return thunk provided in a separate object file.
> >> +You can control this behavior for a specific function by using the
> >> +function attribute @code{indirect_branch}.  @xref{Function Attributes}.
> >
> > Similarly here.
> 
> I will update documentation with user guide info  after Intel white
> paper is published.

OK,
thanks!
Honza
> 
> > Rest of the patch seems OK. We may want incrementally represent more of the
> > indirect jump/call sequence in RTL, but at this point probably keeping things
> > simple and localized is not a bad idea. This can be done incrementally.
> >
> > Please make updated patch and I would like to give others chance to comment today.
> >
> > Honza
> 
> 
> 
> -- 
> H.J.
H.J. Lu Jan. 13, 2018, 4:06 p.m. | #4
On Sat, Jan 13, 2018 at 7:56 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> +/* Output a function with a call and return thunk for indirect branch.
>> >> +   If NEED_BND_P is true, the BND prefix is needed.   If REGNO != -1,  the
>> >> +   function address is in REGNO.  Otherwise, the function address is
>> >> +   on the top of stack.  */
>> >> +
>> >> +static void
>> >> +output_indirect_thunk_function (bool need_bnd_p, int regno)
>> >> +{
>> >> +  char name[32];
>> >> +  tree decl;
>> >> +
>> >> +  /* Create __x86_indirect_thunk/__x86_indirect_thunk_bnd.  */
>> >> +  indirect_thunk_name (name, regno, need_bnd_p);
>> >> +  decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
>> >> +                  get_identifier (name),
>> >> +                  build_function_type_list (void_type_node, NULL_TREE));
>> >> +  DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
>> >> +                                NULL_TREE, void_type_node);
>> >> +  TREE_PUBLIC (decl) = 1;
>> >> +  TREE_STATIC (decl) = 1;
>> >> +  DECL_IGNORED_P (decl) = 1;
>> >
>> > DECL_ARTIFICIAL as well?
>>
>> This is done exactly the same way as PIC thunk.  I don't think we
>> should change it here.
>> >
>> > Why do you need to output asm visibility directives by hand when you create
>> > symbol for the function anyway?
>>
>> This is done exactly the same way as PIC thunk.  I don't think we
>> should change it here.
>
> I see, it is pretty ancient code.  Perhaps you can at least commonize
> the ugliness so we don't duplicate the ifdefs for MACHO?

I don't think we should do such surgery at such a late stage.  I prefer to
keep it for later cleanup.

>>
>> > I would expect that you can just use similar code as cgraph_node::expand_thunk
>> > when calls output_mi_thunk and get this done in a way that is independent of
>> > target assembler.
>>
>> I took a look.  I don't see an easy way to do it.  I'd like to keep it exactly the
>> same as the PIC thunk.  We can change both thunks together later if needed.
>
> OK, let's keep it done the same way as the PIC thunk.
> Next stage1 we could clean up both.
>> > I suppose it may make sense to split the insn and at least explicitly represent
>> > the fact that we (sometimes) push the target to stack.
>>
>> Did you mean "split the function"?  I will break it into
>> ix86_output_indirect_branch_via_reg and
>> ix86_output_indirect_branch_via_push.
>
> I mean define_split to insert push instruction into the instruction
> stream rather than printing it as part of the indirect jump insn.

We don't want anything added between these instructions.
Splitting it may lead to trouble.

>>
>> > Why the memory variant exists at first place?
>>
>> When the function address is in memory, we can push it onto stack
>> to save a register.   Also it is needed to cover "call foo" to "call [foo@GOT]"
>> conversion.
>> >
>> > I think you also want to update type to "many" when we do more than just indirect branch.
>>
>> Did you mean "multi"?  I will change to "multi".
>
> Yes.
>
>>
>> > We do not care much about this, but it feels wrong to have attributes off reality.
>> >> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> >> index f3e4a63ab46..ddb6035be96 100644
>> >> --- a/gcc/doc/extend.texi
>> >> +++ b/gcc/doc/extend.texi
>> >> @@ -5754,6 +5754,16 @@ Specify which floating-point unit to use.  You must specify the
>> >>  @code{target("fpmath=sse+387")} because the comma would separate
>> >>  different options.
>> >>
>> >> +@item indirect_branch("@var{choice}")
>> >> +@cindex @code{indirect_branch} function attribute, x86
>> >> +On x86 targets, the @code{indirect_branch} attribute causes the compiler
>> >> +to convert indirect call and jump with @var{choice}.  @samp{keep}
>> >> +keeps indirect call and jump unmodified.  @samp{thunk} converts indirect
>> >> +call and jump to call and return thunk.  @samp{thunk-inline} converts
>> >> +indirect call and jump to inlined call and return thunk.
>> >> +@samp{thunk-extern} converts indirect call and jump to external call
>> >> +and return thunk provided in a separate object file.
>> >
>> > Please expand the documentation in a way that random user who is not aware of the
>> > issue will understand that those are security features that come at a cost.
>> >
>> >> +@opindex -mindirect-branch
>> >> +Convert indirect call and jump with @var{choice}.  The default is
>> >> +@samp{keep}, which keeps indirect call and jump unmodified.
>> >> +@samp{thunk} converts indirect call and jump to call and return thunk.
>> >> +@samp{thunk-inline} converts indirect call and jump to inlined call
>> >> +and return thunk.  @samp{thunk-extern} converts indirect call and jump
>> >> +to external call and return thunk provided in a separate object file.
>> >> +You can control this behavior for a specific function by using the
>> >> +function attribute @code{indirect_branch}.  @xref{Function Attributes}.
>> >
>> > Similarly here.
>>
>> I will update documentation with user guide info  after Intel white
>> paper is published.
>
> OK,
> thanks!
> Honza
>>
>> > Rest of the patch seems OK. We may want incrementally represent more of the
>> > indirect jump/call sequence in RTL, but at this point probably keeping things
>> > simple and localized is not a bad idea. This can be done incrementally.
>> >
>> > Please make updated patch and I would like to give others chance to comment today.
>> >
>> > Honza
>>
>>
>>
>> --
>> H.J.
Jan Hubicka Jan. 13, 2018, 5:30 p.m. | #5
> On Sat, Jan 13, 2018 at 7:56 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >> +/* Output a function with a call and return thunk for indirect branch.
> >> >> +   If NEED_BND_P is true, the BND prefix is needed.   If REGNO != -1,  the
> >> >> +   function address is in REGNO.  Otherwise, the function address is
> >> >> +   on the top of stack.  */
> >> >> +
> >> >> +static void
> >> >> +output_indirect_thunk_function (bool need_bnd_p, int regno)
> >> >> +{
> >> >> +  char name[32];
> >> >> +  tree decl;
> >> >> +
> >> >> +  /* Create __x86_indirect_thunk/__x86_indirect_thunk_bnd.  */
> >> >> +  indirect_thunk_name (name, regno, need_bnd_p);
> >> >> +  decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
> >> >> +                  get_identifier (name),
> >> >> +                  build_function_type_list (void_type_node, NULL_TREE));
> >> >> +  DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
> >> >> +                                NULL_TREE, void_type_node);
> >> >> +  TREE_PUBLIC (decl) = 1;
> >> >> +  TREE_STATIC (decl) = 1;
> >> >> +  DECL_IGNORED_P (decl) = 1;
> >> >
> >> > DECL_ARTIFICIAL as well?
> >>
> >> This is done exactly the same way as PIC thunk.  I don't think we
> >> should change it here.
> >> >
> >> > Why do you need to output asm visibility directives by hand when you create
> >> > symbol for the function anyway?
> >>
> >> This is done exactly the same way as PIC thunk.  I don't think we
> >> should change it here.
> >
> > I see, it is pretty ancient code.  Perhaps you can at least commonize
> > the ugliness so we don't duplicate the ifdefs for MACHO?
> 
> I don't think we should do such surgery at such a late stage.  I prefer to
> keep it for later cleanup.

OK, let's keep it as it is and clean up next stage1.

Honza
> 
> >>
> >> > I would expect that you can just use similar code as cgraph_node::expand_thunk
> >> > when calls output_mi_thunk and get this done in a way that is independent of
> >> > target assembler.
> >>
> >> I took a look.  I don't see an easy way to do it.  I'd like to keep it exactly the
> >> same as the PIC thunk.  We can change both thunks together later if needed.
> >
> > OK, let's keep it done the same way as the PIC thunk.
> > Next stage1 we could clean up both.
> >> > I suppose it may make sense to split the insn and at least explicitly represent
> >> > the fact that we (sometimes) push the target to stack.
> >>
> >> Did you mean "split the function"?  I will break it into
> >> ix86_output_indirect_branch_via_reg and
> >> ix86_output_indirect_branch_via_push.
> >
> > I mean define_split to insert push instruction into the instruction
> > stream rather than printing it as part of the indirect jump insn.
> 
> We don't want anything added between these instructions.
> Splitting it may lead to trouble.
> 
> >>
> >> > Why the memory variant exists at first place?
> >>
> >> When the function address is in memory, we can push it onto stack
> >> to save a register.   Also it is needed to cover "call foo" to "call [foo@GOT]"
> >> conversion.
> >> >
> >> > I think you also want to update type to "many" when we do more than just indirect branch.
> >>
> >> Did you mean "multi"?  I will change to "multi".
> >
> > Yes.
> >
> >>
> >> > We do not care much about this, but it feels wrong to have attributes off reality.
> >> >> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> >> >> index f3e4a63ab46..ddb6035be96 100644
> >> >> --- a/gcc/doc/extend.texi
> >> >> +++ b/gcc/doc/extend.texi
> >> >> @@ -5754,6 +5754,16 @@ Specify which floating-point unit to use.  You must specify the
> >> >>  @code{target("fpmath=sse+387")} because the comma would separate
> >> >>  different options.
> >> >>
> >> >> +@item indirect_branch("@var{choice}")
> >> >> +@cindex @code{indirect_branch} function attribute, x86
> >> >> +On x86 targets, the @code{indirect_branch} attribute causes the compiler
> >> >> +to convert indirect call and jump with @var{choice}.  @samp{keep}
> >> >> +keeps indirect call and jump unmodified.  @samp{thunk} converts indirect
> >> >> +call and jump to call and return thunk.  @samp{thunk-inline} converts
> >> >> +indirect call and jump to inlined call and return thunk.
> >> >> +@samp{thunk-extern} converts indirect call and jump to external call
> >> >> +and return thunk provided in a separate object file.
> >> >
> >> > Please expand the documentation in a way that random user who is not aware of the
> >> > issue will understand that those are security features that come at a cost.
> >> >
> >> >> +@opindex -mindirect-branch
> >> >> +Convert indirect call and jump with @var{choice}.  The default is
> >> >> +@samp{keep}, which keeps indirect call and jump unmodified.
> >> >> +@samp{thunk} converts indirect call and jump to call and return thunk.
> >> >> +@samp{thunk-inline} converts indirect call and jump to inlined call
> >> >> +and return thunk.  @samp{thunk-extern} converts indirect call and jump
> >> >> +to external call and return thunk provided in a separate object file.
> >> >> +You can control this behavior for a specific function by using the
> >> >> +function attribute @code{indirect_branch}.  @xref{Function Attributes}.
> >> >
> >> > Similarly here.
> >>
> >> I will update documentation with user guide info  after Intel white
> >> paper is published.
> >
> > OK,
> > thanks!
> > Honza
> >>
> >> > Rest of the patch seems OK. We may want incrementally represent more of the
> >> > indirect jump/call sequence in RTL, but at this point probably keeping things
> >> > simple and localized is not a bad idea. This can be done incrementally.
> >> >
> >> > Please make updated patch and I would like to give others chance to comment today.
> >> >
> >> > Honza
> >>
> >>
> >>
> >> --
> >> H.J.
> 
> 
> 
> -- 
> H.J.
H.J. Lu Jan. 14, 2018, 3:40 a.m. | #6
On Sat, Jan 13, 2018 at 9:30 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On Sat, Jan 13, 2018 at 7:56 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> >> +/* Output a function with a call and return thunk for indirect branch.
>> >> >> +   If NEED_BND_P is true, the BND prefix is needed.   If REGNO != -1,  the
>> >> >> +   function address is in REGNO.  Otherwise, the function address is
>> >> >> +   on the top of stack.  */
>> >> >> +
>> >> >> +static void
>> >> >> +output_indirect_thunk_function (bool need_bnd_p, int regno)
>> >> >> +{
>> >> >> +  char name[32];
>> >> >> +  tree decl;
>> >> >> +
>> >> >> +  /* Create __x86_indirect_thunk/__x86_indirect_thunk_bnd.  */
>> >> >> +  indirect_thunk_name (name, regno, need_bnd_p);
>> >> >> +  decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
>> >> >> +                  get_identifier (name),
>> >> >> +                  build_function_type_list (void_type_node, NULL_TREE));
>> >> >> +  DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
>> >> >> +                                NULL_TREE, void_type_node);
>> >> >> +  TREE_PUBLIC (decl) = 1;
>> >> >> +  TREE_STATIC (decl) = 1;
>> >> >> +  DECL_IGNORED_P (decl) = 1;
>> >> >
>> >> > DECL_ARTIFICIAL as well?
>> >>
>> >> This is done exactly the same way as PIC thunk.  I don't think we
>> >> should change it here.
>> >> >
>> >> > Why do you need to output asm visibility directives by hand when you create
>> >> > symbol for the function anyway?
>> >>
>> >> This is done exactly the same way as PIC thunk.  I don't think we
>> >> should change it here.
>> >
>> > I see, it is pretty ancient code.  Perhaps you can at least commonize
>> > the ugliness so we don't duplicate the ifdefs for MACHO?
>>
>> I don't think we should do such surgery at such a late stage.  I prefer to
>> keep it for later cleanup.
>
> OK, let's keep it as it is and clean up next stage1.
>
>

The new set of patches is at

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01200.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01197.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01198.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01201.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01199.html

Patch

diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
index f245c1573cf..f14cbeee7a1 100644
--- a/gcc/config/i386/i386-opts.h
+++ b/gcc/config/i386/i386-opts.h
@@ -106,4 +106,12 @@  enum prefer_vector_width {
     PVW_AVX512
 };
 
+enum indirect_branch {
+  indirect_branch_unset = 0,
+  indirect_branch_keep,
+  indirect_branch_thunk,
+  indirect_branch_thunk_inline,
+  indirect_branch_thunk_extern
+};
+
 #endif
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 0e49652898c..bf11cc426f9 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -305,6 +305,7 @@  extern enum attr_cpu ix86_schedule;
 #endif
 
 extern const char * ix86_output_call_insn (rtx_insn *insn, rtx call_op);
+extern const char * ix86_output_indirect_jmp (rtx call_op, bool ret_p);
 extern bool ix86_operands_ok_for_move_multiple (rtx *operands, bool load,
 						machine_mode mode);
 extern int ix86_min_insn_size (rtx_insn *);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 5ee3be386df..cead7a61a91 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2724,12 +2724,19 @@  make_pass_insert_endbranch (gcc::context *ctxt)
   return new pass_insert_endbranch (ctxt);
 }
 
-/* Return true if a red-zone is in use.  */
+/* Return true if a red-zone is in use.  We can't use red-zone when
+   there are local indirect jumps, like "indirect_jump" or "tablejump",
+   which jump to another place in the function, since "call" in the
+   indirect thunk pushes the return address onto stack, destroying
+   red-zone.  */
 
 bool
 ix86_using_red_zone (void)
 {
-  return TARGET_RED_ZONE && !TARGET_64BIT_MS_ABI;
+  return (TARGET_RED_ZONE
+	  && !TARGET_64BIT_MS_ABI
+	  && (!cfun->machine->has_local_indirect_jump
+	      || cfun->machine->indirect_branch_type == indirect_branch_keep));
 }
 
 /* Return a string that documents the current -m options.  The caller is
@@ -5797,6 +5804,37 @@  ix86_set_func_type (tree fndecl)
     }
 }
 
+/* Set the indirect_branch_type field from the function FNDECL.  */
+
+static void
+ix86_set_indirect_branch_type (tree fndecl)
+{
+  if (cfun->machine->indirect_branch_type == indirect_branch_unset)
+    {
+      tree attr = lookup_attribute ("indirect_branch",
+				    DECL_ATTRIBUTES (fndecl));
+      if (attr != NULL)
+	{
+	  tree args = TREE_VALUE (attr);
+	  if (args == NULL)
+	    gcc_unreachable ();
+	  tree cst = TREE_VALUE (args);
+	  if (strcmp (TREE_STRING_POINTER (cst), "keep") == 0)
+	    cfun->machine->indirect_branch_type = indirect_branch_keep;
+	  else if (strcmp (TREE_STRING_POINTER (cst), "thunk") == 0)
+	    cfun->machine->indirect_branch_type = indirect_branch_thunk;
+	  else if (strcmp (TREE_STRING_POINTER (cst), "thunk-inline") == 0)
+	    cfun->machine->indirect_branch_type = indirect_branch_thunk_inline;
+	  else if (strcmp (TREE_STRING_POINTER (cst), "thunk-extern") == 0)
+	    cfun->machine->indirect_branch_type = indirect_branch_thunk_extern;
+	  else
+	    gcc_unreachable ();
+	}
+      else
+	cfun->machine->indirect_branch_type = ix86_indirect_branch;
+    }
+}
+
 /* Establish appropriate back-end context for processing the function
    FNDECL.  The argument might be NULL to indicate processing at top
    level, outside of any function scope.  */
@@ -5812,7 +5850,10 @@  ix86_set_current_function (tree fndecl)
 	 one is extern inline and one isn't.  Call ix86_set_func_type
 	 to set the func_type field.  */
       if (fndecl != NULL_TREE)
-	ix86_set_func_type (fndecl);
+	{
+	  ix86_set_func_type (fndecl);
+	  ix86_set_indirect_branch_type (fndecl);
+	}
       return;
     }
 
@@ -5832,6 +5873,7 @@  ix86_set_current_function (tree fndecl)
     }
 
   ix86_set_func_type (fndecl);
+  ix86_set_indirect_branch_type (fndecl);
 
   tree new_tree = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
   if (new_tree == NULL_TREE)
@@ -10639,6 +10681,191 @@  ix86_setup_frame_addresses (void)
 # endif
 #endif
 
+static int indirectlabelno;
+static bool indirect_thunk_needed = false;
+static bool indirect_thunk_bnd_needed = false;
+
+static int indirect_thunks_used;
+static int indirect_thunks_bnd_used;
+
+#ifndef INDIRECT_LABEL
+# define INDIRECT_LABEL "LIND"
+#endif
+
+/* Fills in the label name that should be used for the indirect thunk.  */
+
+static void
+indirect_thunk_name (char name[32], int regno, bool need_bnd_p)
+{
+  if (USE_HIDDEN_LINKONCE)
+    {
+      const char *bnd = need_bnd_p ? "_bnd" : "";
+      if (regno >= 0)
+	{
+	  const char *reg_prefix;
+	  if (LEGACY_INT_REGNO_P (regno))
+	    reg_prefix = TARGET_64BIT ? "r" : "e";
+	  else
+	    reg_prefix = "";
+	  sprintf (name, "__x86_indirect_thunk%s_%s%s",
+		   bnd, reg_prefix, reg_names[regno]);
+	}
+      else
+	sprintf (name, "__x86_indirect_thunk%s", bnd);
+    }
+  else
+    {
+      if (regno >= 0)
+	{
+	  if (need_bnd_p)
+	    ASM_GENERATE_INTERNAL_LABEL (name, "LITBR", regno);
+	  else
+	    ASM_GENERATE_INTERNAL_LABEL (name, "LITR", regno);
+	}
+      else
+	{
+	  if (need_bnd_p)
+	    ASM_GENERATE_INTERNAL_LABEL (name, "LITB", 0);
+	  else
+	    ASM_GENERATE_INTERNAL_LABEL (name, "LIT", 0);
+	}
+    }
+}
+
+/* Output a call and return thunk for indirect branch.  If NEED_BND_P
+   is true, the BND prefix is needed.  If REGNO != -1, the function
+   address is in REGNO.  Otherwise, the function address is on the
+   top of stack.  */
+
+static void
+output_indirect_thunk (bool need_bnd_p, int regno)
+{
+  char indirectlabel1[32];
+  char indirectlabel2[32];
+
+  ASM_GENERATE_INTERNAL_LABEL (indirectlabel1, INDIRECT_LABEL,
+			       indirectlabelno++);
+  ASM_GENERATE_INTERNAL_LABEL (indirectlabel2, INDIRECT_LABEL,
+			       indirectlabelno++);
+
+  /* Call */
+  if (need_bnd_p)
+    fputs ("\tbnd call\t", asm_out_file);
+  else
+    fputs ("\tcall\t", asm_out_file);
+  assemble_name_raw (asm_out_file, indirectlabel2);
+  fputc ('\n', asm_out_file);
+
+  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
+
+  /* Pause.  */
+  fprintf (asm_out_file, "\tpause\n");
+
+  /* Jump.  */
+  fputs ("\tjmp\t", asm_out_file);
+  assemble_name_raw (asm_out_file, indirectlabel1);
+  fputc ('\n', asm_out_file);
+
+  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
+
+  if (regno >= 0)
+    {
+      /* MOV.  */
+      rtx xops[2];
+      xops[0] = gen_rtx_MEM (word_mode, stack_pointer_rtx);
+      xops[1] = gen_rtx_REG (word_mode, regno);
+      output_asm_insn ("mov\t{%1, %0|%0, %1}", xops);
+    }
+  else
+    {
+      /* LEA.  */
+      rtx xops[2];
+      xops[0] = stack_pointer_rtx;
+      xops[1] = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
+      output_asm_insn ("lea\t{%E1, %0|%0, %E1}", xops);
+    }
+
+  if (need_bnd_p)
+    fputs ("\tbnd ret\n", asm_out_file);
+  else
+    fputs ("\tret\n", asm_out_file);
+}
+
+/* Output a function with a call and return thunk for indirect branch.
+   If NEED_BND_P is true, the BND prefix is needed.  If REGNO != -1,
+   the function address is in REGNO.  Otherwise, the function address
+   is on the top of stack.  */
+
+static void
+output_indirect_thunk_function (bool need_bnd_p, int regno)
+{
+  char name[32];
+  tree decl;
+
+  /* Create __x86_indirect_thunk/__x86_indirect_thunk_bnd.  */
+  indirect_thunk_name (name, regno, need_bnd_p);
+  decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
+		     get_identifier (name),
+		     build_function_type_list (void_type_node, NULL_TREE));
+  DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
+				   NULL_TREE, void_type_node);
+  TREE_PUBLIC (decl) = 1;
+  TREE_STATIC (decl) = 1;
+  DECL_IGNORED_P (decl) = 1;
+
+#if TARGET_MACHO
+  if (TARGET_MACHO)
+    {
+      switch_to_section (darwin_sections[picbase_thunk_section]);
+      fputs ("\t.weak_definition\t", asm_out_file);
+      assemble_name (asm_out_file, name);
+      fputs ("\n\t.private_extern\t", asm_out_file);
+      assemble_name (asm_out_file, name);
+      putc ('\n', asm_out_file);
+      ASM_OUTPUT_LABEL (asm_out_file, name);
+      DECL_WEAK (decl) = 1;
+    }
+  else
+#endif
+    if (USE_HIDDEN_LINKONCE)
+      {
+	cgraph_node::create (decl)->set_comdat_group (DECL_ASSEMBLER_NAME (decl));
+
+	targetm.asm_out.unique_section (decl, 0);
+	switch_to_section (get_named_section (decl, NULL, 0));
+
+	targetm.asm_out.globalize_label (asm_out_file, name);
+	fputs ("\t.hidden\t", asm_out_file);
+	assemble_name (asm_out_file, name);
+	putc ('\n', asm_out_file);
+	ASM_DECLARE_FUNCTION_NAME (asm_out_file, name, decl);
+      }
+    else
+      {
+	switch_to_section (text_section);
+	ASM_OUTPUT_LABEL (asm_out_file, name);
+      }
+
+  DECL_INITIAL (decl) = make_node (BLOCK);
+  current_function_decl = decl;
+  allocate_struct_function (decl, false);
+  init_function_start (decl);
+  /* We're about to hide the function body from callees of final_* by
+     emitting it directly; tell them we're a thunk, if they care.  */
+  cfun->is_thunk = true;
+  first_function_block_is_cold = false;
+  /* Make sure unwind info is emitted for the thunk if needed.  */
+  final_start_function (emit_barrier (), asm_out_file, 1);
+
+  output_indirect_thunk (need_bnd_p, regno);
+
+  final_end_function ();
+  init_insn_lengths ();
+  free_after_compilation (cfun);
+  set_cfun (NULL);
+  current_function_decl = NULL;
+}
+
 static int pic_labels_used;
 
 /* Fills in the label name that should be used for a pc thunk for
@@ -10665,11 +10892,32 @@  ix86_code_end (void)
   rtx xops[2];
   int regno;
 
+  if (indirect_thunk_needed)
+    output_indirect_thunk_function (false, -1);
+  if (indirect_thunk_bnd_needed)
+    output_indirect_thunk_function (true, -1);
+
+  for (regno = FIRST_REX_INT_REG; regno <= LAST_REX_INT_REG; regno++)
+    {
+      int i = regno - FIRST_REX_INT_REG + LAST_INT_REG + 1;
+      if ((indirect_thunks_used & (1 << i)))
+	output_indirect_thunk_function (false, regno);
+
+      if ((indirect_thunks_bnd_used & (1 << i)))
+	output_indirect_thunk_function (true, regno);
+    }
+
   for (regno = FIRST_INT_REG; regno <= LAST_INT_REG; regno++)
     {
       char name[32];
       tree decl;
 
+      if ((indirect_thunks_used & (1 << regno)))
+	output_indirect_thunk_function (false, regno);
+
+      if ((indirect_thunks_bnd_used & (1 << regno)))
+	output_indirect_thunk_function (true, regno);
+
       if (!(pic_labels_used & (1 << regno)))
 	continue;
 
@@ -28150,12 +28398,189 @@  ix86_nopic_noplt_attribute_p (rtx call_op)
   return false;
 }
 
+/* Output indirect branch via a call and return thunk.  CALL_OP is
+   the branch target.  XASM is the assembly template for CALL_OP.
+   Branch is a tail call if SIBCALL_P is true.  */
+
+static void
+ix86_output_indirect_branch (rtx call_op, const char *xasm,
+			     bool sibcall_p)
+{
+  char thunk_name_buf[32];
+  char *thunk_name;
+  char push_buf[64];
+  bool need_bnd_p = ix86_bnd_prefixed_insn_p (current_output_insn);
+  int regno;
+
+  if (REG_P (call_op))
+    regno = REGNO (call_op);
+  else
+    regno = -1;
+
+  if (cfun->machine->indirect_branch_type
+      != indirect_branch_thunk_inline)
+    {
+      if (cfun->machine->indirect_branch_type == indirect_branch_thunk)
+	{
+	  if (regno >= 0)
+	    {
+	      int i = regno;
+	      if (i >= FIRST_REX_INT_REG)
+		i -= (FIRST_REX_INT_REG - LAST_INT_REG - 1);
+	      if (need_bnd_p)
+		indirect_thunks_bnd_used |= 1 << i;
+	      else
+		indirect_thunks_used |= 1 << i;
+	    }
+	  else
+	    {
+	      if (need_bnd_p)
+		indirect_thunk_bnd_needed = true;
+	      else
+		indirect_thunk_needed = true;
+	    }
+	}
+      indirect_thunk_name (thunk_name_buf, regno, need_bnd_p);
+      thunk_name = thunk_name_buf;
+    }
+  else
+    thunk_name = NULL;
+
+  snprintf (push_buf, sizeof (push_buf), "push{%c}\t%s",
+	    TARGET_64BIT ? 'q' : 'l', xasm);
+
+  if (sibcall_p)
+    {
+      if (regno < 0)
+	output_asm_insn (push_buf, &call_op);
+      if (thunk_name != NULL)
+	{
+	  if (need_bnd_p)
+	    fprintf (asm_out_file, "\tbnd jmp\t%s\n", thunk_name);
+	  else
+	    fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
+	}
+      else
+	output_indirect_thunk (need_bnd_p, regno);
+    }
+  else
+    {
+      if (regno >= 0 && thunk_name != NULL)
+	{
+	  if (need_bnd_p)
+	    fprintf (asm_out_file, "\tbnd call\t%s\n", thunk_name);
+	  else
+	    fprintf (asm_out_file, "\tcall\t%s\n", thunk_name);
+	  return;
+	}
+
+      char indirectlabel1[32];
+      char indirectlabel2[32];
+
+      ASM_GENERATE_INTERNAL_LABEL (indirectlabel1,
+				   INDIRECT_LABEL,
+				   indirectlabelno++);
+      ASM_GENERATE_INTERNAL_LABEL (indirectlabel2,
+				   INDIRECT_LABEL,
+				   indirectlabelno++);
+
+      /* Jump.  */
+      if (need_bnd_p)
+	fputs ("\tbnd jmp\t", asm_out_file);
+      else
+	fputs ("\tjmp\t", asm_out_file);
+      assemble_name_raw (asm_out_file, indirectlabel2);
+      fputc ('\n', asm_out_file);
+
+      ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
+
+      if (MEM_P (call_op))
+	{
+	  struct ix86_address parts;
+	  rtx addr = XEXP (call_op, 0);
+	  if (ix86_decompose_address (addr, &parts)
+	      && parts.base == stack_pointer_rtx)
+	    {
+	      /* Since call will adjust stack by -UNITS_PER_WORD,
+		 we must convert "disp(stack, index, scale)" to
+		 "disp+UNITS_PER_WORD(stack, index, scale)".  */
+	      if (parts.index)
+		{
+		  addr = gen_rtx_MULT (Pmode, parts.index,
+				       GEN_INT (parts.scale));
+		  addr = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
+				       addr);
+		}
+	      else
+		addr = stack_pointer_rtx;
+
+	      rtx disp;
+	      if (parts.disp != NULL_RTX)
+		disp = plus_constant (Pmode, parts.disp,
+				      UNITS_PER_WORD);
+	      else
+		disp = GEN_INT (UNITS_PER_WORD);
+
+	      addr = gen_rtx_PLUS (Pmode, addr, disp);
+	      call_op = gen_rtx_MEM (GET_MODE (call_op), addr);
+	    }
+	}
+
+      if (regno < 0)
+	output_asm_insn (push_buf, &call_op);
+
+      if (thunk_name != NULL)
+	{
+	  if (need_bnd_p)
+	    fprintf (asm_out_file, "\tbnd jmp\t%s\n", thunk_name);
+	  else
+	    fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
+	}
+      else
+	output_indirect_thunk (need_bnd_p, regno);
+
+      ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
+
+      /* Call.  */
+      if (need_bnd_p)
+	fputs ("\tbnd call\t", asm_out_file);
+      else
+	fputs ("\tcall\t", asm_out_file);
+      assemble_name_raw (asm_out_file, indirectlabel1);
+      fputc ('\n', asm_out_file);
+    }
+}
+
+/* Output indirect jump.  CALL_OP is the jump target.  Jump is a
+   function return if RET_P is true.  */
+
+const char *
+ix86_output_indirect_jmp (rtx call_op, bool ret_p)
+{
+  if (cfun->machine->indirect_branch_type != indirect_branch_keep)
+    {
+      /* We can't have red-zone if this isn't a function return since
+	 "call" in the indirect thunk pushes the return address onto
+	 stack, destroying red-zone.  */
+      if (!ret_p && ix86_red_zone_size != 0)
+	gcc_unreachable ();
+
+      ix86_output_indirect_branch (call_op, "%0", true);
+      return "";
+    }
+  else
+    return "%!jmp\t%A0";
+}
+
 /* Output the assembly for a call instruction.  */
 
 const char *
 ix86_output_call_insn (rtx_insn *insn, rtx call_op)
 {
   bool direct_p = constant_call_address_operand (call_op, VOIDmode);
+  bool output_indirect_p
+    = (!TARGET_SEH
+       && cfun->machine->indirect_branch_type != indirect_branch_keep);
   bool seh_nop_p = false;
   const char *xasm;
 
@@ -28165,10 +28590,21 @@  ix86_output_call_insn (rtx_insn *insn, rtx call_op)
 	{
 	  if (ix86_nopic_noplt_attribute_p (call_op))
 	    {
+	      direct_p = false;
 	      if (TARGET_64BIT)
-		xasm = "%!jmp\t{*%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
+		{
+		  if (output_indirect_p)
+		    xasm = "{%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
+		  else
+		    xasm = "%!jmp\t{*%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
+		}
 	      else
-		xasm = "%!jmp\t{*%p0@GOT|[DWORD PTR %p0@GOT]}";
+		{
+		  if (output_indirect_p)
+		    xasm = "{%p0@GOT|[DWORD PTR %p0@GOT]}";
+		  else
+		    xasm = "%!jmp\t{*%p0@GOT|[DWORD PTR %p0@GOT]}";
+		}
 	    }
 	  else
 	    xasm = "%!jmp\t%P0";
@@ -28178,9 +28614,17 @@  ix86_output_call_insn (rtx_insn *insn, rtx call_op)
       else if (TARGET_SEH)
 	xasm = "%!rex.W jmp\t%A0";
       else
-	xasm = "%!jmp\t%A0";
+	{
+	  if (output_indirect_p)
+	    xasm = "%0";
+	  else
+	    xasm = "%!jmp\t%A0";
+	}
 
-      output_asm_insn (xasm, &call_op);
+      if (output_indirect_p && !direct_p)
+	ix86_output_indirect_branch (call_op, xasm, true);
+      else
+	output_asm_insn (xasm, &call_op);
       return "";
     }
 
@@ -28218,18 +28662,37 @@  ix86_output_call_insn (rtx_insn *insn, rtx call_op)
     {
       if (ix86_nopic_noplt_attribute_p (call_op))
 	{
+	  direct_p = false;
 	  if (TARGET_64BIT)
-	    xasm = "%!call\t{*%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
+	    {
+	      if (output_indirect_p)
+		xasm = "{%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
+	      else
+		xasm = "%!call\t{*%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
+	    }
 	  else
-	    xasm = "%!call\t{*%p0@GOT|[DWORD PTR %p0@GOT]}";
+	    {
+	      if (output_indirect_p)
+		xasm = "{%p0@GOT|[DWORD PTR %p0@GOT]}";
+	      else
+		xasm = "%!call\t{*%p0@GOT|[DWORD PTR %p0@GOT]}";
+	    }
 	}
       else
 	xasm = "%!call\t%P0";
     }
   else
-    xasm = "%!call\t%A0";
+    {
+      if (output_indirect_p)
+	xasm = "%0";
+      else
+	xasm = "%!call\t%A0";
+    }
 
-  output_asm_insn (xasm, &call_op);
+  if (output_indirect_p && !direct_p)
+    ix86_output_indirect_branch (call_op, xasm, false);
+  else
+    output_asm_insn (xasm, &call_op);
 
   if (seh_nop_p)
     return "nop";
@@ -40387,7 +40850,7 @@  ix86_handle_struct_attribute (tree *node, tree name, tree, int,
 }
 
 static tree
-ix86_handle_fndecl_attribute (tree *node, tree name, tree, int,
+ix86_handle_fndecl_attribute (tree *node, tree name, tree args, int,
 			      bool *no_add_attrs)
 {
   if (TREE_CODE (*node) != FUNCTION_DECL)
@@ -40396,6 +40859,29 @@  ix86_handle_fndecl_attribute (tree *node, tree name, tree, int,
                name);
       *no_add_attrs = true;
     }
+
+  if (is_attribute_p ("indirect_branch", name))
+    {
+      tree cst = TREE_VALUE (args);
+      if (TREE_CODE (cst) != STRING_CST)
+	{
+	  warning (OPT_Wattributes,
+		   "%qE attribute requires a string constant argument",
+		   name);
+	  *no_add_attrs = true;
+	}
+      else if (strcmp (TREE_STRING_POINTER (cst), "keep") != 0
+	       && strcmp (TREE_STRING_POINTER (cst), "thunk") != 0
+	       && strcmp (TREE_STRING_POINTER (cst), "thunk-inline") != 0
+	       && strcmp (TREE_STRING_POINTER (cst), "thunk-extern") != 0)
+	{
+	  warning (OPT_Wattributes,
+		   "argument to %qE attribute is not "
+		   "(keep|thunk|thunk-inline|thunk-extern)", name);
+	  *no_add_attrs = true;
+	}
+    }
+
   return NULL_TREE;
 }
 
@@ -44842,6 +45328,8 @@  static const struct attribute_spec ix86_attribute_table[] =
     ix86_handle_no_caller_saved_registers_attribute, NULL },
   { "naked", 0, 0, true, false, false, false,
     ix86_handle_fndecl_attribute, NULL },
+  { "indirect_branch", 1, 1, true, false, false, false,
+    ix86_handle_fndecl_attribute, NULL },
 
   /* End element.  */
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 933f261ea66..3b939086112 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2570,6 +2570,13 @@  struct GTY(()) machine_function {
   /* Function type.  */
   ENUM_BITFIELD(function_type) func_type : 2;
 
+  /* How to generate indirect branch.  */
+  ENUM_BITFIELD(indirect_branch) indirect_branch_type : 3;
+
+  /* If true, the current function has local indirect jumps, like
+     "indirect_jump" or "tablejump".  */
+  BOOL_BITFIELD has_local_indirect_jump : 1;
+
   /* If true, the current function is a function specified with
      the "interrupt" or "no_caller_saved_registers" attribute.  */
   BOOL_BITFIELD no_caller_saved_registers : 1;
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 3f587806407..a7573c468ae 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12313,12 +12313,13 @@ 
 {
   if (TARGET_X32)
     operands[0] = convert_memory_address (word_mode, operands[0]);
+  cfun->machine->has_local_indirect_jump = true;
 })
 
 (define_insn "*indirect_jump"
   [(set (pc) (match_operand:W 0 "indirect_branch_operand" "rBw"))]
   ""
-  "%!jmp\t%A0"
+  "* return ix86_output_indirect_jmp (operands[0], false);"
   [(set_attr "type" "ibr")
    (set_attr "length_immediate" "0")
    (set_attr "maybe_prefix_bnd" "1")])
@@ -12362,13 +12363,14 @@ 
 
   if (TARGET_X32)
     operands[0] = convert_memory_address (word_mode, operands[0]);
+  cfun->machine->has_local_indirect_jump = true;
 })
 
 (define_insn "*tablejump_1"
   [(set (pc) (match_operand:W 0 "indirect_branch_operand" "rBw"))
    (use (label_ref (match_operand 1)))]
   ""
-  "%!jmp\t%A0"
+  "* return ix86_output_indirect_jmp (operands[0], false);"
   [(set_attr "type" "ibr")
    (set_attr "length_immediate" "0")
    (set_attr "maybe_prefix_bnd" "1")])
@@ -13097,7 +13099,7 @@ 
   [(simple_return)
    (use (match_operand 0 "register_operand" "r"))]
   "reload_completed"
-  "%!jmp\t%A0"
+  "* return ix86_output_indirect_jmp (operands[0], true);"
   [(set_attr "type" "ibr")
    (set_attr "length_immediate" "0")
    (set_attr "maybe_prefix_bnd" "1")])
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 09aaa97c2fc..59e5cc8e7e4 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1021,3 +1021,23 @@  indirect jump.
 mforce-indirect-call
 Target Report Var(flag_force_indirect_call) Init(0)
 Make all function calls indirect.
+
+mindirect-branch=
+Target Report RejectNegative Joined Enum(indirect_branch) Var(ix86_indirect_branch) Init(indirect_branch_keep)
+Convert indirect call and jump to call and return thunks.
+
+Enum
+Name(indirect_branch) Type(enum indirect_branch)
+Known indirect branch choices (for use with the -mindirect-branch= option):
+
+EnumValue
+Enum(indirect_branch) String(keep) Value(indirect_branch_keep)
+
+EnumValue
+Enum(indirect_branch) String(thunk) Value(indirect_branch_thunk)
+
+EnumValue
+Enum(indirect_branch) String(thunk-inline) Value(indirect_branch_thunk_inline)
+
+EnumValue
+Enum(indirect_branch) String(thunk-extern) Value(indirect_branch_thunk_extern)
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index f3e4a63ab46..ddb6035be96 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5754,6 +5754,16 @@  Specify which floating-point unit to use.  You must specify the
 @code{target("fpmath=sse+387")} because the comma would separate
 different options.
 
+@item indirect_branch("@var{choice}")
+@cindex @code{indirect_branch} function attribute, x86
+On x86 targets, the @code{indirect_branch} attribute causes the compiler
+to convert indirect call and jump with @var{choice}.  @samp{keep}
+keeps indirect call and jump unmodified.  @samp{thunk} converts indirect
+call and jump to call and return thunk.  @samp{thunk-inline} converts
+indirect call and jump to inlined call and return thunk.
+@samp{thunk-extern} converts indirect call and jump to external call
+and return thunk provided in a separate object file.
+
 @item nocf_check
 @cindex @code{nocf_check} function attribute
 The @code{nocf_check} attribute on a function is used to inform the
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a9449a86064..0d685c3576b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1229,7 +1229,8 @@  See RS/6000 and PowerPC Options.
 -mstack-protector-guard-reg=@var{reg} @gol
 -mstack-protector-guard-offset=@var{offset} @gol
 -mstack-protector-guard-symbol=@var{symbol} -mmitigate-rop @gol
--mgeneral-regs-only  -mcall-ms2sysv-xlogues}
+-mgeneral-regs-only -mcall-ms2sysv-xlogues @gol
+-mindirect-branch=@var{choice}}
 
 @emph{x86 Windows Options}
 @gccoptlist{-mconsole  -mcygwin  -mno-cygwin  -mdll @gol
@@ -26838,6 +26839,17 @@  Generate code that uses only the general-purpose registers.  This
 prevents the compiler from using floating-point, vector, mask and bound
 registers.
 
+@item -mindirect-branch=@var{choice}
+@opindex -mindirect-branch
+Convert indirect call and jump with @var{choice}.  The default is
+@samp{keep}, which keeps indirect call and jump unmodified.
+@samp{thunk} converts indirect call and jump to call and return thunk.
+@samp{thunk-inline} converts indirect call and jump to inlined call
+and return thunk.  @samp{thunk-extern} converts indirect call and jump
+to external call and return thunk provided in a separate object file.
+You can control this behavior for a specific function by using the
+function attribute @code{indirect_branch}.  @xref{Function Attributes}.
+
 @end table
 
 These @samp{-m} switches are supported in addition to the above
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-1.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-1.c
new file mode 100644
index 00000000000..d1d2ee78797
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-1.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch;
+
+void
+male_indirect_jump (long offset)
+{
+  dispatch(offset);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-2.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-2.c
new file mode 100644
index 00000000000..08646c6b823
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-2.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch[256];
+
+void
+male_indirect_jump (long offset)
+{
+  dispatch[offset](offset);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c
new file mode 100644
index 00000000000..af244de2238
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch;
+
+int
+male_indirect_jump (long offset)
+{
+  dispatch(offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "call\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c
new file mode 100644
index 00000000000..b8aedd5a4e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch[256];
+
+int
+male_indirect_jump (long offset)
+{
+  dispatch[offset](offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "call\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-5.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-5.c
new file mode 100644
index 00000000000..6ffb9235f94
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-5.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fpic -fno-plt -mindirect-branch=thunk" } */
+
+extern void bar (void);
+
+void
+foo (void)
+{
+  bar ();
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*bar@GOT" } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c
new file mode 100644
index 00000000000..e6d9d148cd2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fpic -fno-plt -mindirect-branch=thunk" } */
+
+extern void bar (void);
+
+int
+foo (void)
+{
+  bar ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*bar@GOT" } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
new file mode 100644
index 00000000000..d892d8f5992
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
@@ -0,0 +1,43 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fno-pic" } */
+
+void func0 (void);
+void func1 (void);
+void func2 (void);
+void func3 (void);
+void func4 (void);
+void func4 (void);
+void func5 (void);
+
+void
+bar (int i)
+{
+  switch (i)
+    {
+    default:
+      func0 ();
+      break;
+    case 1:
+      func1 ();
+      break;
+    case 2:
+      func2 ();
+      break;
+    case 3:
+      func3 ();
+      break;
+    case 4:
+      func4 ();
+      break;
+    case 5:
+      func5 ();
+      break;
+    }
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*\.L\[0-9\]+\\(,%" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-1.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-1.c
new file mode 100644
index 00000000000..24188d0b62d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-1.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch;
+
+extern void male_indirect_jump (long)
+  __attribute__ ((indirect_branch("thunk")));
+
+void
+male_indirect_jump (long offset)
+{
+  dispatch(offset);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-2.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-2.c
new file mode 100644
index 00000000000..03184b90cda
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-2.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch[256];
+
+__attribute__ ((indirect_branch("thunk")))
+void
+male_indirect_jump (long offset)
+{
+  dispatch[offset](offset);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-3.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-3.c
new file mode 100644
index 00000000000..af167840b81
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-3.c
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch;
+extern int male_indirect_jump (long)
+  __attribute__ ((indirect_branch("thunk-inline")));
+
+int
+male_indirect_jump (long offset)
+{
+  dispatch(offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not "pushq\[ \t\]%rax" { target x32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-4.c
new file mode 100644
index 00000000000..146124894a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-4.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch[256];
+
+__attribute__ ((indirect_branch("thunk-inline")))
+int
+male_indirect_jump (long offset)
+{
+  dispatch[offset](offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not "pushq\[ \t\]%rax" { target x32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-5.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-5.c
new file mode 100644
index 00000000000..568327cd8e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-5.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch;
+extern int male_indirect_jump (long)
+  __attribute__ ((indirect_branch("thunk-extern")));
+
+int
+male_indirect_jump (long offset)
+{
+  dispatch(offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 1 { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 1 { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "call\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-6.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-6.c
new file mode 100644
index 00000000000..bd8a99e7828
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-6.c
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch[256];
+
+__attribute__ ((indirect_branch("thunk-extern")))
+int
+male_indirect_jump (long offset)
+{
+  dispatch[offset](offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 1 { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 1 { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "call\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-7.c
new file mode 100644
index 00000000000..356015c9799
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-7.c
@@ -0,0 +1,44 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-pic" } */
+
+void func0 (void);
+void func1 (void);
+void func2 (void);
+void func3 (void);
+void func4 (void);
+void func4 (void);
+void func5 (void);
+
+__attribute__ ((indirect_branch("thunk-extern")))
+void
+bar (int i)
+{
+  switch (i)
+    {
+    default:
+      func0 ();
+      break;
+    case 1:
+      func1 ();
+      break;
+    case 2:
+      func2 ();
+      break;
+    case 3:
+      func3 ();
+      break;
+    case 4:
+      func4 ();
+      break;
+    case 5:
+      func5 ();
+      break;
+    }
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*\.L\[0-9\]+\\(,%" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "call\[ \t\]*\.LIND" } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-8.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-8.c
new file mode 100644
index 00000000000..6960fa0bbfb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-attr-8.c
@@ -0,0 +1,41 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fno-pic" } */
+
+void func0 (void);
+void func1 (void);
+void func2 (void);
+void func3 (void);
+void func4 (void);
+void func4 (void);
+void func5 (void);
+
+__attribute__ ((indirect_branch("keep")))
+void
+bar (int i)
+{
+  switch (i)
+    {
+    default:
+      func0 ();
+      break;
+    case 1:
+      func1 ();
+      break;
+    case 2:
+      func2 ();
+      break;
+    case 3:
+      func3 ();
+      break;
+    case 4:
+      func4 ();
+      break;
+    case 5:
+      func5 ();
+      break;
+    }
+}
+
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "call\[ \t\]*\.LIND" } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-1.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-1.c
new file mode 100644
index 00000000000..febf32d76ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-1.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile { target { ! x32 } } } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fcheck-pointer-bounds -mmpx -fno-pic" } */
+
+void (*dispatch) (char *);
+char buf[10];
+
+void
+foo (void)
+{
+  dispatch (buf);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "pushq\[ \t\]%rax" { target x32 } } } */
+/* { dg-final { scan-assembler "bnd jmp\[ \t\]*__x86_indirect_thunk_bnd" } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "bnd call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "bnd ret" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-2.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-2.c
new file mode 100644
index 00000000000..319ba30b78b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-2.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile { target { ! x32 } } } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fcheck-pointer-bounds -mmpx -fno-pic" } */
+
+void (*dispatch) (char *);
+char buf[10];
+
+int
+foo (void)
+{
+  dispatch (buf);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "pushq\[ \t\]%rax" { target x32 } } } */
+/* { dg-final { scan-assembler "bnd jmp\[ \t\]*__x86_indirect_thunk_bnd" } } */
+/* { dg-final { scan-assembler "bnd jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "bnd call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "bnd ret" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-3.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-3.c
new file mode 100644
index 00000000000..9168b3146f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-3.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile { target { *-*-linux* && { ! x32 } } } } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fcheck-pointer-bounds -mmpx -fpic -fno-plt" } */
+
+void bar (char *);
+char buf[10];
+
+void
+foo (void)
+{
+  bar (buf);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*bar@GOT" } } */
+/* { dg-final { scan-assembler "bnd jmp\[ \t\]*__x86_indirect_thunk_bnd" } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "bnd call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "bnd ret" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c
new file mode 100644
index 00000000000..d3b36d44c7c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile { target { *-*-linux* && { ! x32 } } } } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fcheck-pointer-bounds -mmpx -fpic -fno-plt" } */
+
+void bar (char *);
+char buf[10];
+
+int
+foo (void)
+{
+  bar (buf);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*bar@GOT" } } */
+/* { dg-final { scan-assembler "bnd jmp\[ \t\]*__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler "bnd jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-times "bnd call\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler "bnd ret" } } */
+/* { dg-final { scan-assembler {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-1.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-1.c
new file mode 100644
index 00000000000..9e50b282f77
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-1.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-extern -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch;
+
+void
+male_indirect_jump (long offset)
+{
+  dispatch(offset);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "call\[ \t\]*\.LIND" } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-2.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-2.c
new file mode 100644
index 00000000000..f897d1c0497
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-2.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-extern -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch[256];
+
+void
+male_indirect_jump (long offset)
+{
+  dispatch[offset](offset);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "call\[ \t\]*\.LIND" } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-3.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-3.c
new file mode 100644
index 00000000000..25905cd0016
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-3.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-extern -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch;
+
+int
+male_indirect_jump (long offset)
+{
+  dispatch(offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 1 { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 1 { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "call\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-4.c
new file mode 100644
index 00000000000..a7fa12183af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-4.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-extern -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch[256];
+
+int
+male_indirect_jump (long offset)
+{
+  dispatch[offset](offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 1 { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 1 { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "call\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-5.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-5.c
new file mode 100644
index 00000000000..48a49760be6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-5.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fpic -fno-plt -mindirect-branch=thunk-extern" } */
+
+extern void bar (void);
+
+void
+foo (void)
+{
+  bar ();
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*bar@GOT" } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "call\[ \t\]*\.LIND" } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-6.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-6.c
new file mode 100644
index 00000000000..a1c662f7d23
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-6.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fpic -fno-plt -mindirect-branch=thunk-extern" } */
+
+extern void bar (void);
+
+int
+foo (void)
+{
+  bar ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*bar@GOT" } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 1 } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 1 } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-7.c
new file mode 100644
index 00000000000..40a665ea640
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-extern-7.c
@@ -0,0 +1,43 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-extern -fno-pic" } */
+
+void func0 (void);
+void func1 (void);
+void func2 (void);
+void func3 (void);
+void func4 (void);
+void func4 (void);
+void func5 (void);
+
+void
+bar (int i)
+{
+  switch (i)
+    {
+    default:
+      func0 ();
+      break;
+    case 1:
+      func1 ();
+      break;
+    case 2:
+      func2 ();
+      break;
+    case 3:
+      func3 ();
+      break;
+    case 4:
+      func4 ();
+      break;
+    case 5:
+      func5 ();
+      break;
+    }
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*\.L\[0-9\]+\\(,%" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */
+/* { dg-final { scan-assembler-not {\tpause} } } */
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "call\[ \t\]*\.LIND" } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-1.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-1.c
new file mode 100644
index 00000000000..3ace8d1b031
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-1.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-inline -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch;
+
+void
+male_indirect_jump (long offset)
+{
+  dispatch(offset);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not "pushq\[ \t\]%rax" { target x32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-2.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-2.c
new file mode 100644
index 00000000000..6c97b96f1f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-2.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-inline -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch[256];
+
+void
+male_indirect_jump (long offset)
+{
+  dispatch[offset](offset);
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not "pushq\[ \t\]%rax" { target x32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-3.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-3.c
new file mode 100644
index 00000000000..8f6759cbf06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-3.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-inline -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch;
+
+int
+male_indirect_jump (long offset)
+{
+  dispatch(offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not "pushq\[ \t\]%rax" { target x32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-4.c
new file mode 100644
index 00000000000..b07d08cab0f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-4.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-inline -fno-pic" } */
+
+typedef void (*dispatch_t)(long offset);
+
+dispatch_t dispatch[256];
+
+int
+male_indirect_jump (long offset)
+{
+  dispatch[offset](offset);
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */
+/* { dg-final { scan-assembler-not "pushq\[ \t\]%rax" { target x32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-5.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-5.c
new file mode 100644
index 00000000000..10794886b1b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-5.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fpic -fno-plt -mindirect-branch=thunk-inline" } */
+
+extern void bar (void);
+
+void
+foo (void)
+{
+  bar ();
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*bar@GOT" } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-6.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-6.c
new file mode 100644
index 00000000000..a26ec4b06ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-6.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fpic -fno-plt -mindirect-branch=thunk-inline" } */
+
+extern void bar (void);
+
+int
+foo (void)
+{
+  bar ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*bar@GOT" } } */
+/* { dg-final { scan-assembler-times "jmp\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-times "call\[ \t\]*\.LIND" 2 } } */
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
new file mode 100644
index 00000000000..77253af17c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
@@ -0,0 +1,42 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk-inline -fno-pic" } */
+
+void func0 (void);
+void func1 (void);
+void func2 (void);
+void func3 (void);
+void func4 (void);
+void func4 (void);
+void func5 (void);
+
+void
+bar (int i)
+{
+  switch (i)
+    {
+    default:
+      func0 ();
+      break;
+    case 1:
+      func1 ();
+      break;
+    case 2:
+      func2 ();
+      break;
+    case 3:
+      func3 ();
+      break;
+    case 4:
+      func4 ();
+      break;
+    case 5:
+      func5 ();
+      break;
+    }
+}
+
+/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*\.L\[0-9\]+\\(,%" { target { ! x32 } } } } */
+/* { dg-final { scan-assembler-not "pushq\[ \t\]%rax" { target x32 } } } */
+/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */
+/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */