rs6000: Call flow implementation for PC-relative addressing

Message ID bec8578a-8cb6-8c3e-1938-df90d192665d@linux.ibm.com
State New

Commit Message

Bill Schmidt May 24, 2019, 2:11 a.m. UTC
Hi,

This patch contains the changes to implement call flow for PC-relative addressing.  
It's an amalgam of several internal patches that Alan and I worked on, and as a 
result it's hard to tease apart individual pieces much further.  So I apologize 
that this patch is a little larger than the others.  Also, I've CC'd Alan so he 
can help answer questions about the patch, particularly the PLT bits I'm not very
familiar with.

Following are descriptions of the individual patches that are combined here.

(1) When a function uses PC-relative code generation, all direct calls (other than 
sibcalls) that the function makes to local or external callees should appear as
"bl sym@notoc" and should not be followed by a nop instruction.  @notoc indicates
that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
that the caller does not guarantee that r2 contains a valid TOC pointer.  Thus
the linker should not try to replace a subsequent "nop" with a TOC restore
instruction.

I've added a test case for the four cases handled here:  local calls with/without
a return value, and external calls with/without a return value.
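
As a rough illustration of (1), the choice between the two call forms can be
sketched in standalone C.  This is not the GCC code: format_direct_call and
its parameters are hypothetical stand-ins for the logic in
rs6000_call_template_1, and the non-pcrel branch models only the AIX/ELFv2
case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of GCC.  Writes the assembly for a
   direct call to SYM into BUF and returns the snprintf result.  */
static int
format_direct_call (char *buf, size_t len, const char *sym,
                    bool pcrel, bool sibcall)
{
  assert (buf != NULL && len > 0);
  assert (sym != NULL && strlen (sym) > 0);

  const char *branch = sibcall ? "b" : "bl";

  /* PC-relative callers make no promise about r2, so the call is
     marked @notoc and no nop is left for a TOC restore.  */
  if (pcrel)
    return snprintf (buf, len, "%s %s@notoc", branch, sym);

  /* TOC-based callers keep the nop after a non-sibling call so the
     linker can turn it into a TOC restore.  */
  return snprintf (buf, len, "%s %s%s", branch, sym,
                   sibcall ? "" : "\n\tnop");
}
```

With pcrel set and sibcall clear, a call to "foo" comes out as
"bl foo@notoc"; with both clear it is "bl foo" followed by a nop.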

(2) If a caller preserves the TOC pointer and the callee does not, or vice versa,
then a sibcall will cause an inconsistency.  Don't allow that.

(3) The linker needs an @notoc annotation on sibcall targets when the caller
does not provide or preserve a TOC pointer.  This patch provides for that.

In creating the new sibcall patterns, I did not duplicate the "c" alternatives
that allow for bctr or blr sibcalls.  I don't think there's a way to generate
those currently.  The bctr would be legitimate for PC-relative sibcalls if you
can prove that the target function is in the same binary, but we don't appear
to detect that possibility today.
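
The rule in (2) can be sketched as a small predicate.  This is a simplified
model of the AIX/ELFv2 case only, not the actual rs6000_decl_ok_for_sibcall:
struct fn_info and sibcall_ok are hypothetical names, binds_local folds
together the DECL_EXTERNAL, DECL_WEAK and binds_local_p checks, and pcrel
stands in for rs6000_pcrel_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what matters about a function here.  */
struct fn_info
{
  bool binds_local;  /* Known to resolve within this binary.  */
  bool pcrel;        /* PC-relative code; r2 is not kept as a TOC pointer.  */
};

/* Return true if CALLER may make a sibling call to CALLEE.  CALLEE is
   NULL when nothing is known about the target.  */
static bool
sibcall_ok (const struct fn_info *caller, const struct fn_info *callee)
{
  assert (caller != NULL);

  /* A non-local callee may have a different TOC pointer, and a sibcall
     leaves no place to restore ours on return.  */
  if (callee == NULL || !callee->binds_local)
    return false;

  /* Caller and callee must agree on whether the TOC pointer is
     preserved, otherwise r2 can come back clobbered.  */
  return caller->pcrel == callee->pcrel;
}
```

In terms of the new pcrel-sibcall-1.c test, this is why a pcrel caller may
branch straight to a local pcrel callee, while a TOC-based caller has to use
a normal call to reach that same callee.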

(4) This patch deletes all the extra insns added to handle pcrel calls,
instead opting to use existing insns but making their output
conditional on rs6000_pcrel_p (cfun).  There isn't a need to
differentiate between pcrel and non-pcrel calls at the point rtl is
created; rs6000_pcrel_p is valid right up to the final pass, as
evidenced by use of rs6000_pcrel_p to emit .localentry.

There is one case, however, where we do need new insns: the existing
nonlocal indirect call insns mention r2 in their rtl.  That isn't
correct for pcrel indirect calls, and may cause problems when/if r2
is allocated like any other volatile GPR in pcrel code.

The patch also fixes pcrel inline PLT calls (which are used for
-fno-plt and -mlongcall) to use a pcrel load from the PLT rather than
attempting (and failing) to use TOC-relative loads.  This requires
some changes in the way relocs are emitted.  For prefixed insns we
can't write
   .reloc .,R_PPC64_PLT_PCREL34_NOTOC,foo
   pld 12,0(0),1
since the pld may require a padding nop, in which case "." would be
the address of the nop rather than of the pld.  Instead it's necessary
to put the .reloc after the instruction or use a label on the insn,
like this (which is what the patch does):
   pld 12,0(0),1
   .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
or this:
   .reloc 0f,R_PPC64_PLT_PCREL34_NOTOC,foo
0: pld 12,0(0),1
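
To make the reloc placement concrete, here is a minimal sketch of emitting
the load first and the .reloc second.  It is modeled loosely on the new
case 4 of rs6000_pltseq_template but is not that code:
format_plt_pcrel_load and PREFIXED_INSN_SIZE are hypothetical names, and the
sketch covers only the 64-bit pld form without TLS markers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Size in bytes of a prefixed instruction such as pld.  */
#define PREFIXED_INSN_SIZE 8

/* Hypothetical helper, not part of GCC.  Writes a pcrel PLT load of SYM
   into register REG, followed by its reloc, into BUF.  */
static int
format_plt_pcrel_load (char *buf, size_t len, int reg, const char *sym)
{
  assert (buf != NULL && len > 0);
  assert (sym != NULL && strlen (sym) > 0);
  assert (reg >= 0 && reg <= 31);

  /* Once the pld has been emitted, "." is the address just past it, so
     ".-8" names the start of the pld whether or not the assembler put
     a padding nop in front of it.  */
  return snprintf (buf, len,
                   "pld %d,0(0),1\n\t"
                   ".reloc .-%d,R_PPC64_PLT_PCREL34_NOTOC,%s",
                   reg, PREFIXED_INSN_SIZE, sym);
}
```

Calling it with register 12 and symbol "foo" reproduces the two-line
sequence shown above.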

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
Is this okay for trunk?

Thanks!
Bill


[gcc]

2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>
	    Alan Modra  <amodra@gmail.com>

	* config/rs6000/rs6000.c (rs6000_call_template_1): Handle pcrel
	calls here...
	(rs6000_indirect_call_template_1): ...and here.
	(rs6000_pltseq_template): Handle plt_pcrel34.  Rework
	tocsave, plt16_ha, plt16_lo, mtctr indirect calls.
	(rs6000_decl_ok_for_sibcall): New function.
	(rs6000_function_ok_for_sibcall): Refactor.
	(rs6000_longcall_ref): Use UNSPEC_PLT_PCREL when pcrel.
	(rs6000_call_aix): Don't emit toc restore rtl for indirect calls
	when pcrel.  Reorganize.
	(rs6000_sibcall_aix): Don't add r2 to function usage when pcrel.
	* config/rs6000/rs6000.md (UNSPEC_PLT_PCREL): New unspec.
	(*pltseq_plt_pcrel): New insn.
	(*call_local_aix): Handle @notoc calls.
	(*call_value_local_aix): Likewise.
	(*call_nonlocal_aix): Adjust lengths for pcrel calls.
	(*call_value_nonlocal_aix): Likewise.
	(*call_indirect_pcrel): New insn.
	(*call_value_indirect_pcrel): Likewise.


[gcc/testsuite]

2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>

	* gcc.target/powerpc/notoc-direct-1.c: New.
	* gcc.target/powerpc/pcrel-sibcall-1.c: New.

Comments

Bill Schmidt May 24, 2019, 2:17 a.m. UTC | #1
Hm, I got ahead of myself on this one.  I haven't done the regstrap yet,
so please hold off reviewing for now.

Sorry for the noise.  I shouldn't post when I'm tired...

Thanks,
Bill

On 5/23/19 9:11 PM, Bill Schmidt wrote:
> Hi,
>
> This patch contains the changes to implement call flow for PC-relative addressing.  
> It's an amalgam of several internal patches that Alan and I worked on, and as a 
> result it's hard to tease apart individual pieces much further.  So I apologize 
> that this patch is a little larger than the others.  Also, I've CC'd Alan so he 
> can help answer questions about the patch, particularly the PLT bits I'm not very
> familiar with.
>
> Following are descriptions of the individual patches that are combined here.
>
> (1) When a function uses PC-relative code generation, all direct calls (other than 
> sibcalls) that the function makes to local or external callees should appear as
> "bl sym@notoc" and should not be followed by a nop instruction.  @notoc indicates
> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
> that the caller does not guarantee that r2 contains a valid TOC pointer.  Thus
> the linker should not try to replace a subsequent "nop" with a TOC restore
> instruction.
>
> I've added a test case for the four cases handled here:  local calls with/without
> a return value, and external cases with/without a return value.
>
> (2) If a caller preserves the TOC pointer and the callee does not, or vice versa,
> then a sibcall will cause an inconsistency.  Don't allow that.
>
> (3) The linker needs a @notoc directive on sibcall targets when the caller does not
> provide or preserve a TOC pointer.  This patch provides for that.
>
> In creating the new sibcall patterns, I did not duplicate the "c" alternatives
> that allow for bctr or blr sibcalls.  I don't think there's a way to generate
> those currently.  The bctr would be legitimate for PC-relative sibcalls if you
> can prove that the target function is in the same binary, but we don't appear
> to detect that possibility today.
>
>     (4) This patch deletes all the extra insns added to handle pcrel calls,
>     instead opting to use existing insns but making their output
>     conditional on rs6000_pcrel_p(cfun).  There isn't a need to
>     differentiate between pcrel and non-pcrel calls at the point rtl is
>     created; rs6000_pcrel_p is valid right up to the final pass, as
>     evidenced by use of rs6000_pcrel_p to emit .localentry.
>     
>     There is one case however where we do need new insns: The existing
>     nonlocal indirect call insns mention r2 in their rtl.  That isn't
>     correct for pcrel indirect calls, and may cause problems when/if r2
>     is allocated as any other volatile gpr in pcrel code.
>     
>     The patch also fixes pcrel inline PLT calls (which are used for
>     -fno-plt and -mlongcall) to use a pcrel load from the PLT rather than
>     attempting (and failing) to use TOC-relative loads.  This requires
>     some changes in the way relocs are emitted.  For prefix insns we can't
>     write
>        .reloc .,R_PPC64_PLT_PCREL34_NOTOC,foo
>        pld 12,0(0),1
>     since the pld may require a padding nop.  Instead it's necessary to
>     put the .reloc after the instruction or use a label on the insn.  Like
>     this (which is what the patch does):
>        pld 12,0(0),1
>        .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
>     or this:
>        .reloc 0f,R_PPC64_PLT_PCREL34_NOTOC,foo
>     0: pld 12,0(0),1
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
> Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> [gcc]
>
> 2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>
> 	    Alan Modra  <amodra@gmail.com>
>
> 	* config/rs6000/rs6000.c (rs6000_call_template_1): Handle pcrel
> 	calls here...
> 	(rs6000_indirect_call_template_1): ...and here.
> 	(rs6000_pltseq_template): Handle plt_pcrel34.  Rework
> 	tocsave, plt16_ha, plt16_lo, mtctr indirect calls.
> 	(rs6000_decl_ok_for_sibcall): New function.
> 	(rs6000_function_ok_for_sibcall): Refactor.
> 	(rs6000_longcall_ref): Use UNSPEC_PLT_PCREL when pcrel.
> 	(rs6000_call_aix): Don't emit toc restore rtl for indirect calls
> 	when pcrel.  Reorganize.
> 	(rs6000_sibcall_aix): Don't add r2 to function usage when pcrel.
> 	* config/rs6000/rs6000.md (UNSPEC_PLT_PCREL): New unspec.
> 	(*pltseq_plt_pcrel): New insn.
> 	(*call_local_aix): Handle @notoc calls.
> 	(*call_value_local_aix): Likewise.
> 	(*call_nonlocal_aix): Adjust lengths for pcrel calls.
> 	(*call_value_nonlocal_aix): Likewise.
> 	(*call_indirect_pcrel): New insn.
> 	(*call_value_indirect_pcrel): Likewise.
>
>
> [gcc/testsuite]
>
> 2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>
>
> 	* gcc.target/powerpc/notoc-direct-1.c: New.
> 	* gcc.target/powerpc/pcrel-sibcall-1.c: New.
>
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 3d5cf9e4ece..9229bad6acc 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -21268,7 +21268,9 @@ rs6000_call_template_1 (rtx *operands, unsigned int funop, bool sibcall)
>  	    ? "+32768" : ""));
>
>    static char str[32];  /* 2 spare */
> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> +  if (rs6000_pcrel_p (cfun))
> +    sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
> +  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>      sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
>  	     sibcall ? "" : "\n\tnop");
>    else if (DEFAULT_ABI == ABI_V4)
> @@ -21333,6 +21335,16 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
>    /* Currently, funop is either 0 or 1.  The maximum string is always
>       a !speculate 64-bit __tls_get_addr call.
>
> +     ABI_ELFv2, pcrel:
> +     . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
> +     . 35	.reloc .,R_PPC64_PLTSEQ_NOTOC,%z1\n\t
> +     .  9	crset 2\n\t
> +     . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
> +     . 36	.reloc .,R_PPC64_PLTCALL_NOTOC,%z1\n\t
> +     .  8	beq%T1l-
> +     .---
> +     .142
> +
>       ABI_AIX:
>       .  9	ld 2,%3\n\t
>       . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
> @@ -21398,23 +21410,31 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
>  	    gcc_unreachable ();
>  	}
>
> +      const char *notoc = rs6000_pcrel_p (cfun) ? "_NOTOC" : "";
>        const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
>  			    && flag_pic == 2 ? "+32768" : "");
>        if (!speculate)
>  	{
>  	  s += sprintf (s,
> -			"%s.reloc .,R_PPC%s_PLTSEQ,%%z%u%s\n\t",
> -			tls, rel64, funop, addend);
> +			"%s.reloc .,R_PPC%s_PLTSEQ%s,%%z%u%s\n\t",
> +			tls, rel64, notoc, funop, addend);
>  	  s += sprintf (s, "crset 2\n\t");
>  	}
>        s += sprintf (s,
> -		    "%s.reloc .,R_PPC%s_PLTCALL,%%z%u%s\n\t",
> -		    tls, rel64, funop, addend);
> +		    "%s.reloc .,R_PPC%s_PLTCALL%s,%%z%u%s\n\t",
> +		    tls, rel64, notoc, funop, addend);
>      }
>    else if (!speculate)
>      s += sprintf (s, "crset 2\n\t");
>
> -  if (DEFAULT_ABI == ABI_AIX)
> +  if (rs6000_pcrel_p (cfun))
> +    {
> +      if (speculate)
> +	sprintf (s, "b%%T%ul", funop);
> +      else
> +	sprintf (s, "beq%%T%ul-", funop);
> +    }
> +  else if (DEFAULT_ABI == ABI_AIX)
>      {
>        if (speculate)
>  	sprintf (s,
> @@ -21468,63 +21488,73 @@ rs6000_indirect_sibcall_template (rtx *operands, unsigned int funop)
>
>  #if HAVE_AS_PLTSEQ
>  /* Output indirect call insns.
> -   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr.  */
> +   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr,
> +   4 for plt_pcrel34.  */
>  const char *
>  rs6000_pltseq_template (rtx *operands, int which)
>  {
>    const char *rel64 = TARGET_64BIT ? "64" : "";
> -  char tls[28];
> +  char tls[30];
>    tls[0] = 0;
>    if (TARGET_TLS_MARKERS && GET_CODE (operands[3]) == UNSPEC)
>      {
> +      char off = which == 4 ? '8' : '4';
>        if (XINT (operands[3], 1) == UNSPEC_TLSGD)
> -	sprintf (tls, ".reloc .,R_PPC%s_TLSGD,%%3\n\t",
> -		 rel64);
> +	sprintf (tls, ".reloc .-%c,R_PPC%s_TLSGD,%%3\n\t",
> +		 off, rel64);
>        else if (XINT (operands[3], 1) == UNSPEC_TLSLD)
> -	sprintf (tls, ".reloc .,R_PPC%s_TLSLD,%%&\n\t",
> -		 rel64);
> +	sprintf (tls, ".reloc .-%c,R_PPC%s_TLSLD,%%&\n\t",
> +		 off, rel64);
>        else
>  	gcc_unreachable ();
>      }
>
>    gcc_assert (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_V4);
> -  static char str[96];  /* 15 spare */
> -  const char *off = WORDS_BIG_ENDIAN ? "+2" : "";
> +  static char str[96];  /* 10 spare */
> +  char off = WORDS_BIG_ENDIAN ? '2' : '4';
>    const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
>  			&& flag_pic == 2 ? "+32768" : "");
>    switch (which)
>      {
>      case 0:
>        sprintf (str,
> -	       "%s.reloc .,R_PPC%s_PLTSEQ,%%z2\n\t"
> -	       "st%s",
> -	       tls, rel64, TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)");
> +	       "st%s\n\t"
> +	       "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2",
> +	       TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)",
> +	       tls, rel64);
>        break;
>      case 1:
>        if (DEFAULT_ABI == ABI_V4 && !flag_pic)
>  	sprintf (str,
> -		 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2\n\t"
> -		 "lis %%0,0",
> +		 "lis %%0,0\n\t"
> +		 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2",
>  		 tls, off, rel64);
>        else
>  	sprintf (str,
> -		 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2%s\n\t"
> -		 "addis %%0,%%1,0",
> +		 "addis %%0,%%1,0\n\t"
> +		 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2%s",
>  		 tls, off, rel64, addend);
>        break;
>      case 2:
>        sprintf (str,
> -	       "%s.reloc .%s,R_PPC%s_PLT16_LO%s,%%z2%s\n\t"
> -	       "l%s %%0,0(%%1)",
> -	       tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend,
> -	       TARGET_64BIT ? "d" : "wz");
> +	       "l%s %%0,0(%%1)\n\t"
> +	       "%s.reloc .-%c,R_PPC%s_PLT16_LO%s,%%z2%s",
> +	       TARGET_64BIT ? "d" : "wz",
> +	       tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend);
>        break;
>      case 3:
>        sprintf (str,
> -	       "%s.reloc .,R_PPC%s_PLTSEQ,%%z2%s\n\t"
> -	       "mtctr %%1",
> +	       "mtctr %%1\n\t"
> +	       "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2%s",
>  	       tls, rel64, addend);
>        break;
> +    case 4:
> +      sprintf (str,
> +	       "pl%s %%0,0(0),1\n\t"
> +	       "%s.reloc .-8,R_PPC%s_PLT_PCREL34_NOTOC,%%z2",
> +	       TARGET_64BIT ? "d" : "wz",
> +	       tls, rel64);
> +      break;
>      default:
>        gcc_unreachable ();
>      }
> @@ -24703,6 +24733,53 @@ rs6000_return_addr (int count, rtx frame)
>    return get_hard_reg_initial_val (Pmode, LR_REGNO);
>  }
>
> +/* Helper function for rs6000_function_ok_for_sibcall.  */
> +
> +static bool
> +rs6000_decl_ok_for_sibcall (tree decl)
> +{
> +  /* Sibcalls are always fine for the Darwin ABI.  */
> +  if (DEFAULT_ABI == ABI_DARWIN)
> +    return true;
> +
> +  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> +    {
> +      /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
> +	 functions, because the callee may have a different TOC pointer to
> +	 the caller and there's no way to ensure we restore the TOC when
> +	 we return.  */
> +      if (!decl || DECL_EXTERNAL (decl) || DECL_WEAK (decl)
> +	  || !(*targetm.binds_local_p) (decl))
> +	return false;
> +
> +      /* Similarly, if the caller preserves the TOC pointer and the callee
> +	 doesn't (or vice versa), proper TOC setup or restoration will be
> +	 missed.  For example, suppose A, B, and C are in the same binary
> +	 and A -> B -> C.  A and B preserve the TOC pointer but C does not,
> +	 and B -> C is eligible as a sibcall.  A will call B through its
> +	 local entry point, so A will not restore its TOC itself.  B calls
> +	 C with a sibcall, so it will not restore the TOC.  C does not
> +	 preserve the TOC, so it may clobber r2 with impunity.  Returning
> +	 from C will result in a corrupted TOC for A.  */
> +      else if (rs6000_fndecl_pcrel_p (decl) != rs6000_pcrel_p (cfun))
> +	return false;
> +
> +      else
> +	return true;
> +    }
> +
> +  /* With the secure-plt SYSV ABI we can't make non-local calls when
> +     -fpic/PIC because the plt call stubs use r30.  */
> +  if (DEFAULT_ABI == ABI_V4
> +      && (!TARGET_SECURE_PLT
> +	  || !flag_pic
> +	  || (decl
> +	      && (*targetm.binds_local_p) (decl))))
> +    return true;
> +
> +  return false;
> +}
> +
>  /* Say whether a function is a candidate for sibcall handling or not.  */
>
>  static bool
> @@ -24748,22 +24825,7 @@ rs6000_function_ok_for_sibcall (tree decl, tree exp)
>  	return false;
>      }
>
> -  /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
> -     functions, because the callee may have a different TOC pointer to
> -     the caller and there's no way to ensure we restore the TOC when
> -     we return.  With the secure-plt SYSV ABI we can't make non-local
> -     calls when -fpic/PIC because the plt call stubs use r30.  */
> -  if (DEFAULT_ABI == ABI_DARWIN
> -      || ((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> -	  && decl
> -	  && !DECL_EXTERNAL (decl)
> -	  && !DECL_WEAK (decl)
> -	  && (*targetm.binds_local_p) (decl))
> -      || (DEFAULT_ABI == ABI_V4
> -	  && (!TARGET_SECURE_PLT
> -	      || !flag_pic
> -	      || (decl
> -		  && (*targetm.binds_local_p) (decl)))))
> +  if (rs6000_decl_ok_for_sibcall (decl))
>      {
>        tree attr_list = TYPE_ATTRIBUTES (fntype);
>
> @@ -32592,12 +32654,18 @@ rs6000_longcall_ref (rtx call_ref, rtx arg)
>    if (TARGET_PLTSEQ)
>      {
>        rtx base = const0_rtx;
> -      int regno;
> -      if (DEFAULT_ABI == ABI_ELFv2)
> +      int regno = 12;
> +      if (rs6000_pcrel_p (cfun))
>  	{
> -	  base = gen_rtx_REG (Pmode, TOC_REGISTER);
> -	  regno = 12;
> +	  rtx reg = gen_rtx_REG (Pmode, regno);
> +	  rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
> +				  UNSPEC_PLT_PCREL);
> +	  emit_insn (gen_rtx_SET (reg, u));
> +	  return reg;
>  	}
> +
> +      if (DEFAULT_ABI == ABI_ELFv2)
> +	base = gen_rtx_REG (Pmode, TOC_REGISTER);
>        else
>  	{
>  	  if (flag_pic)
> @@ -37706,37 +37774,38 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>    if (!SYMBOL_REF_P (func)
>        || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (func)))
>      {
> -      /* Save the TOC into its reserved slot before the call,
> -	 and prepare to restore it after the call.  */
> -      rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
> -      rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
> -					     gen_rtvec (1, stack_toc_offset),
> -					     UNSPEC_TOCSLOT);
> -      toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
> -
> -      /* Can we optimize saving the TOC in the prologue or
> -	 do we need to do it at every call?  */
> -      if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
> -	cfun->machine->save_toc_in_prologue = true;
> -      else
> +      if (!rs6000_pcrel_p (cfun))
>  	{
> -	  rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
> -	  rtx stack_toc_mem = gen_frame_mem (Pmode,
> -					     gen_rtx_PLUS (Pmode, stack_ptr,
> -							   stack_toc_offset));
> -	  MEM_VOLATILE_P (stack_toc_mem) = 1;
> -	  if (is_pltseq_longcall)
> +	  /* Save the TOC into its reserved slot before the call,
> +	     and prepare to restore it after the call.  */
> +	  rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
> +	  rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
> +						 gen_rtvec (1, stack_toc_offset),
> +						 UNSPEC_TOCSLOT);
> +	  toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
> +
> +	  /* Can we optimize saving the TOC in the prologue or
> +	     do we need to do it at every call?  */
> +	  if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
> +	    cfun->machine->save_toc_in_prologue = true;
> +	  else
>  	    {
> -	      /* Use USPEC_PLTSEQ here to emit every instruction in an
> -		 inline PLT call sequence with a reloc, enabling the
> -		 linker to edit the sequence back to a direct call
> -		 when that makes sense.  */
> -	      rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
> -	      rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
> -	      emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
> +	      rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
> +	      rtx stack_toc_mem = gen_frame_mem (Pmode,
> +						 gen_rtx_PLUS (Pmode, stack_ptr,
> +							       stack_toc_offset));
> +	      MEM_VOLATILE_P (stack_toc_mem) = 1;
> +	      if (HAVE_AS_PLTSEQ
> +		  && DEFAULT_ABI == ABI_ELFv2
> +		  && GET_CODE (func_desc) == SYMBOL_REF)
> +		{
> +		  rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
> +		  rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
> +		  emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
> +		}
> +	      else
> +		emit_move_insn (stack_toc_mem, toc_reg);
>  	    }
> -	  else
> -	    emit_move_insn (stack_toc_mem, toc_reg);
>  	}
>
>        if (DEFAULT_ABI == ABI_ELFv2)
> @@ -37813,10 +37882,12 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>      }
>    else
>      {
> -      /* Direct calls use the TOC: for local calls, the callee will
> -	 assume the TOC register is set; for non-local calls, the
> -	 PLT stub needs the TOC register.  */
> -      abi_reg = toc_reg;
> +      /* No TOC register needed for calls from PC-relative callers.  */
> +      if (!rs6000_pcrel_p (cfun))
> +	/* Direct calls use the TOC: for local calls, the callee will
> +	   assume the TOC register is set; for non-local calls, the
> +	   PLT stub needs the TOC register.  */
> +	abi_reg = toc_reg;
>        func_addr = func;
>      }
>
> @@ -37866,7 +37937,9 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>    insn = emit_call_insn (insn);
>
>    /* Note use of the TOC register.  */
> -  use_reg (&CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (Pmode, TOC_REGNUM));
> +  if (!rs6000_pcrel_p (cfun))
> +    use_reg (&CALL_INSN_FUNCTION_USAGE (insn),
> +	     gen_rtx_REG (Pmode, TOC_REGNUM));
>  }
>
>  /* Expand code to perform a call under the SYSV4 ABI.  */
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 71613e21384..e1d9045c5bb 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -147,6 +147,7 @@
>     UNSPEC_PLTSEQ
>     UNSPEC_PLT16_HA
>     UNSPEC_PLT16_LO
> +   UNSPEC_PLT_PCREL
>    ])
>
>  ;;
> @@ -10267,6 +10268,20 @@
>  {
>    return rs6000_pltseq_template (operands, 3);
>  })
> +
> +(define_insn "*pltseq_plt_pcrel<mode>"
> +  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
> +	(unspec:P [(match_operand:P 1 "" "")
> +		   (match_operand:P 2 "symbol_ref_operand" "s")
> +		   (match_operand:P 3 "" "")]
> +		  UNSPEC_PLT_PCREL))]
> +  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
> +   && rs6000_pcrel_p (cfun)"
> +{
> +  return rs6000_pltseq_template (operands, 4);
> +}
> +  [(set_attr "type" "load")
> +   (set_attr "length" "12")])
>  
>  ;; Call and call_value insns
>  ;; For the purposes of expanding calls, Darwin is very similar to SYSV.
> @@ -10582,7 +10597,11 @@
>  	 (match_operand 1))
>     (clobber (reg:P LR_REGNO))]
>    "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2"
> -  "bl %z0"
> +{
> +  if (rs6000_pcrel_p (cfun))
> +    return "bl %z0@notoc";
> +  return "bl %z0";
> +}
>    [(set_attr "type" "branch")])
>
>  (define_insn "*call_value_local_aix<mode>"
> @@ -10592,7 +10611,11 @@
>     (clobber (reg:P LR_REGNO))]
>    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>     && !IS_NOMARK_TLSGETADDR (operands[2])"
> -  "bl %z1"
> +{
> +  if (rs6000_pcrel_p (cfun))
> +    return "bl %z1@notoc";
> +  return "bl %z1";
> +}
>    [(set_attr "type" "branch")])
>
>  ;; Call to AIX abi function which may be in another module.
> @@ -10607,7 +10630,10 @@
>    return rs6000_call_template (operands, 0);
>  }
>    [(set_attr "type" "branch")
> -   (set_attr "length" "8")])
> +   (set (attr "length")
> +	(if_then_else (match_test "rs6000_pcrel_p (cfun)")
> +	  (const_int 4)
> +	  (const_int 8)))])
>
>  (define_insn "*call_value_nonlocal_aix<mode>"
>    [(set (match_operand 0 "" "")
> @@ -10623,11 +10649,14 @@
>  }
>    [(set_attr "type" "branch")
>     (set (attr "length")
> -	(if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
> -	  (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
> -	    (const_int 16)
> -	    (const_int 12))
> -	  (const_int 8)))])
> +	(plus (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
> +		(if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
> +		  (const_int 8)
> +		  (const_int 4))
> +		(const_int 0))
> +	      (if_then_else (match_test "rs6000_pcrel_p (cfun)")
> +		(const_int 4)
> +		(const_int 8))))])
>
>  ;; Call to indirect functions with the AIX abi using a 3 word descriptor.
>  ;; Operand0 is the addresss of the function to call
> @@ -10700,6 +10729,21 @@
>  		      (const_string "12")
>  		      (const_string "8")))])
>
> +(define_insn "*call_indirect_pcrel<mode>"
> +  [(call (mem:SI (match_operand:P 0 "indirect_call_operand" "c,*l,X"))
> +	 (match_operand 1))
> +   (clobber (reg:P LR_REGNO))]
> +  "rs6000_pcrel_p (cfun)"
> +{
> +  return rs6000_indirect_call_template (operands, 0);
> +}
> +  [(set_attr "type" "jmpreg")
> +   (set (attr "length")
> +	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
> +			   (match_test "which_alternative != 1"))
> +		      (const_string "8")
> +		      (const_string "4")))])
> +
>  (define_insn "*call_value_indirect_elfv2<mode>"
>    [(set (match_operand 0 "" "")
>  	(call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
> @@ -10728,6 +10772,31 @@
>  	    (const_string "12")
>  	    (const_string "8"))))])
>
> +(define_insn "*call_value_indirect_pcrel<mode>"
> +  [(set (match_operand 0 "" "")
> +	(call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
> +	      (match_operand:P 2 "unspec_tls" "")))
> +   (clobber (reg:P LR_REGNO))]
> +  "rs6000_pcrel_p (cfun)"
> +{
> +  if (IS_NOMARK_TLSGETADDR (operands[2]))
> +    rs6000_output_tlsargs (operands);
> +
> +  return rs6000_indirect_call_template (operands, 1);
> +}
> +  [(set_attr "type" "jmpreg")
> +   (set (attr "length")
> +	(plus
> +	  (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
> +	    (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
> +	      (const_int 8)
> +	      (const_int 4))
> +	    (const_int 0))
> +	  (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
> +			     (match_test "which_alternative != 1"))
> +	    (const_string "8")
> +	    (const_string "4"))))])
> +
>  ;; Call subroutine returning any type.
>  (define_expand "untyped_call"
>    [(parallel [(call (match_operand 0 "")
> diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
> new file mode 100644
> index 00000000000..c7d322c1c96
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
> @@ -0,0 +1,41 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=future -O2" } */
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +
> +/* Test that calls generated from PC-relative code are
> +   annotated with @notoc.  */
> +
> +extern int yy0 (int);
> +extern void yy1 (int);
> +
> +int zz0 (void) __attribute__((noinline));
> +void zz1 (int) __attribute__((noinline));
> +
> +int xx (void)
> +{
> +  yy1 (7);
> +  return yy0 (5);
> +}
> +
> +int zz0 (void)
> +{
> +  asm ("");
> +  return 16;
> +}
> +
> +void zz1 (int a __attribute__((__unused__)))
> +{
> +  asm ("");
> +}
> +
> +int ww (void)
> +{
> +  zz1 (zz0 ());
> +  return 4;
> +}
> +
> +/* { dg-final { scan-assembler {yy1@notoc} } } */
> +/* { dg-final { scan-assembler {yy0@notoc} } } */
> +/* { dg-final { scan-assembler {zz1@notoc} } } */
> +/* { dg-final { scan-assembler {zz0@notoc} } } */
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
> new file mode 100644
> index 00000000000..7c767e2ba32
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
> @@ -0,0 +1,46 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=future -O2" } */
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +
> +/* Test that potential sibcalls are not generated when the caller preserves
> +   the TOC and the callee doesn't, or vice versa.  */
> +
> +int x (void) __attribute__((noinline));
> +int y (void) __attribute__((noinline));
> +int xx (void) __attribute__((noinline));
> +  
> +int x (void)
> +{
> +  return 1;
> +}
> +
> +int y (void)
> +{
> +  return 2;
> +}
> +
> +int sib_call (void)
> +{
> +  return x ();
> +}
> +
> +#pragma GCC target ("cpu=power9")
> +int normal_call (void)
> +{
> +  return y ();
> +}
> +
> +int xx (void)
> +{
> +  return 1;
> +}
> +
> +#pragma GCC target ("cpu=future")
> +int notoc_call (void)
> +{
> +  return xx ();
> +}
> +
> +/* { dg-final { scan-assembler {\mb x@notoc\M} } } */
> +/* { dg-final { scan-assembler {\mbl y\M} } } */
> +/* { dg-final { scan-assembler {\mbl xx@notoc\M} } } */
>
Bill Schmidt May 24, 2019, 2:06 p.m. UTC | #2
New test case ICEs, so consider this withdrawn.  Sorry again about this.

Bill

On 5/23/19 9:17 PM, Bill Schmidt wrote:
> Hm, I got ahead of myself on this one.  I haven't done the regstrap yet,
> so please hold off reviewing for now.
>
> Sorry for the noise.  I shouldn't post when I'm tired...
>
> Thanks,
> Bill
>
> On 5/23/19 9:11 PM, Bill Schmidt wrote:
>> Hi,
>>
>> This patch contains the changes to implement call flow for PC-relative addressing.  
>> It's an amalgam of several internal patches that Alan and I worked on, and as a 
>> result it's hard to tease apart individual pieces much further.  So I apologize 
>> that this patch is a little larger than the others.  Also, I've CC'd Alan so he 
>> can help answer questions about the patch, particularly the PLT bits I'm not very
>> familiar with.
>>
>> Following are descriptions of the individual patches that are combined here.
>>
>> (1) When a function uses PC-relative code generation, all direct calls (other than 
>> sibcalls) that the function makes to local or external callees should appear as
>> "bl sym@notoc" and should not be followed by a nop instruction.  @notoc indicates
>> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
>> that the caller does not guarantee that r2 contains a valid TOC pointer.  Thus
>> the linker should not try to replace a subsequent "nop" with a TOC restore
>> instruction.
>>
>> I've added a test case for the four cases handled here:  local calls with/without
>> a return value, and external cases with/without a return value.
>>
>> (2) If a caller preserves the TOC pointer and the callee does not, or vice versa,
>> then a sibcall will cause an inconsistency.  Don't allow that.
>>
>> (3) The linker needs a @notoc directive on sibcall targets when the caller does not
>> provide or preserve a TOC pointer.  This patch provides for that.
>>
>> In creating the new sibcall patterns, I did not duplicate the "c" alternatives
>> that allow for bctr or blr sibcalls.  I don't think there's a way to generate
>> those currently.  The bctr would be legitimate for PC-relative sibcalls if you
>> can prove that the target function is in the same binary, but we don't appear
>> to detect that possibility today.
>>
>>     (4) This patch deletes all the extra insns added to handle pcrel calls,
>>     instead opting to use existing insns but making their output
>>     conditional on rs6000_pcrel_p(cfun).  There isn't a need to
>>     differentiate between pcrel and non-pcrel calls at the point rtl is
>>     created; rs6000_pcrel_p is valid right up to the final pass, as
>>     evidenced by use of rs6000_pcrel_p to emit .localentry.
>>     
>>     There is one case however where we do need new insns: The existing
>>     nonlocal indirect call insns mention r2 in their rtl.  That isn't
>>     correct for pcrel indirect calls, and may cause problems when/if r2
>>     is allocated as any other volatile gpr in pcrel code.
>>     
>>     The patch also fixes pcrel inline PLT calls (which are used for
>>     -fno-plt and -mlongcall) to use a pcrel load from the PLT rather than
>>     attempting (and failing) to use TOC-relative loads.  This requires
>>     some changes in the way relocs are emitted.  For prefix insns we can't
>>     write
>>        .reloc .,R_PPC64_PLT_PCREL34_NOTOC,foo
>>        pld 12,0(0),1
>>     since the pld may require a padding nop.  Instead it's necessary to
>>     put the .reloc after the instruction or use a label on the insn.  Like
>>     this (which is what the patch does):
>>        pld 12,0(0),1
>>        .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
>>     or this:
>>        .reloc 0f,R_PPC64_PLT_PCREL34_NOTOC,foo
>>     0: pld 12,0(0),1
>>
>> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
>> Is this okay for trunk?
>>
>> Thanks!
>> Bill
>>
>>
>> [gcc]
>>
>> 2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>
>> 	    Alan Modra  <amodra@gmail.com>
>>
>> 	* config/rs6000/rs6000.c (rs6000_call_template_1): Handle pcrel
>> 	calls here...
>> 	(rs6000_indirect_call_template_1): ...and here.
>> 	(rs6000_pltseq_template): Handle plt_pcrel34.  Rework
>> 	tocsave, plt16_ha, plt16_lo, mtctr indirect calls.
>> 	(rs6000_decl_ok_for_sibcall): New function.
>> 	(rs6000_function_ok_for_sibcall): Refactor.
>> 	(rs6000_longcall_ref): Use UNSPEC_PLT_PCREL when pcrel.
>> 	(rs6000_call_aix): Don't emit toc restore rtl for indirect calls
>> 	when pcrel.  Reorganize.
>> 	(rs6000_sibcall_aix): Don't add r2 to function usage when pcrel.
>> 	* config/rs6000/rs6000.md (UNSPEC_PLT_PCREL): New unspec.
>> 	(*pltseq_plt_pcrel): New insn.
>> 	(*call_local_aix): Handle @notoc calls.
>> 	(*call_value_local_aix): Likewise.
>> 	(*call_nonlocal_aix): Adjust lengths for pcrel calls.
>> 	(*call_value_nonlocal_aix): Likewise.
>> 	(*call_indirect_pcrel): New insn.
>> 	(*call_value_indirect_pcrel): Likewise.
>>
>>
>> [gcc/testsuite]
>>
>> 2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>
>>
>> 	* gcc.target/powerpc/notoc-direct-1.c: New.
>> 	* gcc.target/powerpc/pcrel-sibcall-1.c: New.
>>
>>
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 3d5cf9e4ece..9229bad6acc 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -21268,7 +21268,9 @@ rs6000_call_template_1 (rtx *operands, unsigned int funop, bool sibcall)
>>  	    ? "+32768" : ""));
>>
>>    static char str[32];  /* 2 spare */
>> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>> +  if (rs6000_pcrel_p (cfun))
>> +    sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
>> +  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>>      sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
>>  	     sibcall ? "" : "\n\tnop");
>>    else if (DEFAULT_ABI == ABI_V4)
>> @@ -21333,6 +21335,16 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
>>    /* Currently, funop is either 0 or 1.  The maximum string is always
>>       a !speculate 64-bit __tls_get_addr call.
>>
>> +     ABI_ELFv2, pcrel:
>> +     . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
>> +     . 35	.reloc .,R_PPC64_PLTSEQ_NOTOC,%z1\n\t
>> +     .  9	crset 2\n\t
>> +     . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
>> +     . 36	.reloc .,R_PPC64_PLTCALL_NOTOC,%z1\n\t
>> +     .  8	beq%T1l-
>> +     .---
>> +     .142
>> +
>>       ABI_AIX:
>>       .  9	ld 2,%3\n\t
>>       . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
>> @@ -21398,23 +21410,31 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
>>  	    gcc_unreachable ();
>>  	}
>>
>> +      const char *notoc = rs6000_pcrel_p (cfun) ? "_NOTOC" : "";
>>        const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
>>  			    && flag_pic == 2 ? "+32768" : "");
>>        if (!speculate)
>>  	{
>>  	  s += sprintf (s,
>> -			"%s.reloc .,R_PPC%s_PLTSEQ,%%z%u%s\n\t",
>> -			tls, rel64, funop, addend);
>> +			"%s.reloc .,R_PPC%s_PLTSEQ%s,%%z%u%s\n\t",
>> +			tls, rel64, notoc, funop, addend);
>>  	  s += sprintf (s, "crset 2\n\t");
>>  	}
>>        s += sprintf (s,
>> -		    "%s.reloc .,R_PPC%s_PLTCALL,%%z%u%s\n\t",
>> -		    tls, rel64, funop, addend);
>> +		    "%s.reloc .,R_PPC%s_PLTCALL%s,%%z%u%s\n\t",
>> +		    tls, rel64, notoc, funop, addend);
>>      }
>>    else if (!speculate)
>>      s += sprintf (s, "crset 2\n\t");
>>
>> -  if (DEFAULT_ABI == ABI_AIX)
>> +  if (rs6000_pcrel_p (cfun))
>> +    {
>> +      if (speculate)
>> +	sprintf (s, "b%%T%ul", funop);
>> +      else
>> +	sprintf (s, "beq%%T%ul-", funop);
>> +    }
>> +  else if (DEFAULT_ABI == ABI_AIX)
>>      {
>>        if (speculate)
>>  	sprintf (s,
>> @@ -21468,63 +21488,73 @@ rs6000_indirect_sibcall_template (rtx *operands, unsigned int funop)
>>
>>  #if HAVE_AS_PLTSEQ
>>  /* Output indirect call insns.
>> -   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr.  */
>> +   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr,
>> +   4 for plt_pcrel34.  */
>>  const char *
>>  rs6000_pltseq_template (rtx *operands, int which)
>>  {
>>    const char *rel64 = TARGET_64BIT ? "64" : "";
>> -  char tls[28];
>> +  char tls[30];
>>    tls[0] = 0;
>>    if (TARGET_TLS_MARKERS && GET_CODE (operands[3]) == UNSPEC)
>>      {
>> +      char off = which == 4 ? '8' : '4';
>>        if (XINT (operands[3], 1) == UNSPEC_TLSGD)
>> -	sprintf (tls, ".reloc .,R_PPC%s_TLSGD,%%3\n\t",
>> -		 rel64);
>> +	sprintf (tls, ".reloc .-%c,R_PPC%s_TLSGD,%%3\n\t",
>> +		 off, rel64);
>>        else if (XINT (operands[3], 1) == UNSPEC_TLSLD)
>> -	sprintf (tls, ".reloc .,R_PPC%s_TLSLD,%%&\n\t",
>> -		 rel64);
>> +	sprintf (tls, ".reloc .-%c,R_PPC%s_TLSLD,%%&\n\t",
>> +		 off, rel64);
>>        else
>>  	gcc_unreachable ();
>>      }
>>
>>    gcc_assert (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_V4);
>> -  static char str[96];  /* 15 spare */
>> -  const char *off = WORDS_BIG_ENDIAN ? "+2" : "";
>> +  static char str[96];  /* 10 spare */
>> +  char off = WORDS_BIG_ENDIAN ? '2' : '4';
>>    const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
>>  			&& flag_pic == 2 ? "+32768" : "");
>>    switch (which)
>>      {
>>      case 0:
>>        sprintf (str,
>> -	       "%s.reloc .,R_PPC%s_PLTSEQ,%%z2\n\t"
>> -	       "st%s",
>> -	       tls, rel64, TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)");
>> +	       "st%s\n\t"
>> +	       "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2",
>> +	       TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)",
>> +	       tls, rel64);
>>        break;
>>      case 1:
>>        if (DEFAULT_ABI == ABI_V4 && !flag_pic)
>>  	sprintf (str,
>> -		 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2\n\t"
>> -		 "lis %%0,0",
>> +		 "lis %%0,0\n\t"
>> +		 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2",
>>  		 tls, off, rel64);
>>        else
>>  	sprintf (str,
>> -		 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2%s\n\t"
>> -		 "addis %%0,%%1,0",
>> +		 "addis %%0,%%1,0\n\t"
>> +		 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2%s",
>>  		 tls, off, rel64, addend);
>>        break;
>>      case 2:
>>        sprintf (str,
>> -	       "%s.reloc .%s,R_PPC%s_PLT16_LO%s,%%z2%s\n\t"
>> -	       "l%s %%0,0(%%1)",
>> -	       tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend,
>> -	       TARGET_64BIT ? "d" : "wz");
>> +	       "l%s %%0,0(%%1)\n\t"
>> +	       "%s.reloc .-%c,R_PPC%s_PLT16_LO%s,%%z2%s",
>> +	       TARGET_64BIT ? "d" : "wz",
>> +	       tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend);
>>        break;
>>      case 3:
>>        sprintf (str,
>> -	       "%s.reloc .,R_PPC%s_PLTSEQ,%%z2%s\n\t"
>> -	       "mtctr %%1",
>> +	       "mtctr %%1\n\t"
>> +	       "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2%s",
>>  	       tls, rel64, addend);
>>        break;
>> +    case 4:
>> +      sprintf (str,
>> +	       "pl%s %%0,0(0),1\n\t"
>> +	       "%s.reloc .-8,R_PPC%s_PLT_PCREL34_NOTOC,%%z2",
>> +	       TARGET_64BIT ? "d" : "wz",
>> +	       tls, rel64);
>> +      break;
>>      default:
>>        gcc_unreachable ();
>>      }
>> @@ -24703,6 +24733,53 @@ rs6000_return_addr (int count, rtx frame)
>>    return get_hard_reg_initial_val (Pmode, LR_REGNO);
>>  }
>>
>> +/* Helper function for rs6000_function_ok_for_sibcall.  */
>> +
>> +static bool
>> +rs6000_decl_ok_for_sibcall (tree decl)
>> +{
>> +  /* Sibcalls are always fine for the Darwin ABI.  */
>> +  if (DEFAULT_ABI == ABI_DARWIN)
>> +    return true;
>> +
>> +  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>> +    {
>> +      /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
>> +	 functions, because the callee may have a different TOC pointer to
>> +	 the caller and there's no way to ensure we restore the TOC when
>> +	 we return.  */
>> +      if (!decl || DECL_EXTERNAL (decl) || DECL_WEAK (decl)
>> +	  || !(*targetm.binds_local_p) (decl))
>> +	return false;
>> +
>> +      /* Similarly, if the caller preserves the TOC pointer and the callee
>> +	 doesn't (or vice versa), proper TOC setup or restoration will be
>> +	 missed.  For example, suppose A, B, and C are in the same binary
>> +	 and A -> B -> C.  A and B preserve the TOC pointer but C does not,
>> +	 and B -> C is eligible as a sibcall.  A will call B through its
>> +	 local entry point, so A will not restore its TOC itself.  B calls
>> +	 C with a sibcall, so it will not restore the TOC.  C does not
>> +	 preserve the TOC, so it may clobber r2 with impunity.  Returning
>> +	 from C will result in a corrupted TOC for A.  */
>> +      else if (rs6000_fndecl_pcrel_p (decl) != rs6000_pcrel_p (cfun))
>> +	return false;
>> +
>> +      else
>> +	return true;
>> +    }
>> +
>> +  /*  With the secure-plt SYSV ABI we can't make non-local calls when
>> +      -fpic/PIC because the plt call stubs use r30.  */
>> +  if (DEFAULT_ABI == ABI_V4
>> +      && (!TARGET_SECURE_PLT
>> +	  || !flag_pic
>> +	  || (decl
>> +	      && (*targetm.binds_local_p) (decl))))
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>>  /* Say whether a function is a candidate for sibcall handling or not.  */
>>
>>  static bool
>> @@ -24748,22 +24825,7 @@ rs6000_function_ok_for_sibcall (tree decl, tree exp)
>>  	return false;
>>      }
>>
>> -  /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
>> -     functions, because the callee may have a different TOC pointer to
>> -     the caller and there's no way to ensure we restore the TOC when
>> -     we return.  With the secure-plt SYSV ABI we can't make non-local
>> -     calls when -fpic/PIC because the plt call stubs use r30.  */
>> -  if (DEFAULT_ABI == ABI_DARWIN
>> -      || ((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>> -	  && decl
>> -	  && !DECL_EXTERNAL (decl)
>> -	  && !DECL_WEAK (decl)
>> -	  && (*targetm.binds_local_p) (decl))
>> -      || (DEFAULT_ABI == ABI_V4
>> -	  && (!TARGET_SECURE_PLT
>> -	      || !flag_pic
>> -	      || (decl
>> -		  && (*targetm.binds_local_p) (decl)))))
>> +  if (rs6000_decl_ok_for_sibcall (decl))
>>      {
>>        tree attr_list = TYPE_ATTRIBUTES (fntype);
>>
>> @@ -32592,12 +32654,18 @@ rs6000_longcall_ref (rtx call_ref, rtx arg)
>>    if (TARGET_PLTSEQ)
>>      {
>>        rtx base = const0_rtx;
>> -      int regno;
>> -      if (DEFAULT_ABI == ABI_ELFv2)
>> +      int regno = 12;
>> +      if (rs6000_pcrel_p (cfun))
>>  	{
>> -	  base = gen_rtx_REG (Pmode, TOC_REGISTER);
>> -	  regno = 12;
>> +	  rtx reg = gen_rtx_REG (Pmode, regno);
>> +	  rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
>> +				  UNSPEC_PLT_PCREL);
>> +	  emit_insn (gen_rtx_SET (reg, u));
>> +	  return reg;
>>  	}
>> +
>> +      if (DEFAULT_ABI == ABI_ELFv2)
>> +	base = gen_rtx_REG (Pmode, TOC_REGISTER);
>>        else
>>  	{
>>  	  if (flag_pic)
>> @@ -37706,37 +37774,38 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>>    if (!SYMBOL_REF_P (func)
>>        || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (func)))
>>      {
>> -      /* Save the TOC into its reserved slot before the call,
>> -	 and prepare to restore it after the call.  */
>> -      rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
>> -      rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
>> -					     gen_rtvec (1, stack_toc_offset),
>> -					     UNSPEC_TOCSLOT);
>> -      toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
>> -
>> -      /* Can we optimize saving the TOC in the prologue or
>> -	 do we need to do it at every call?  */
>> -      if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
>> -	cfun->machine->save_toc_in_prologue = true;
>> -      else
>> +      if (!rs6000_pcrel_p (cfun))
>>  	{
>> -	  rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
>> -	  rtx stack_toc_mem = gen_frame_mem (Pmode,
>> -					     gen_rtx_PLUS (Pmode, stack_ptr,
>> -							   stack_toc_offset));
>> -	  MEM_VOLATILE_P (stack_toc_mem) = 1;
>> -	  if (is_pltseq_longcall)
>> +	  /* Save the TOC into its reserved slot before the call,
>> +	     and prepare to restore it after the call.  */
>> +	  rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
>> +	  rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
>> +						 gen_rtvec (1, stack_toc_offset),
>> +						 UNSPEC_TOCSLOT);
>> +	  toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
>> +
>> +	  /* Can we optimize saving the TOC in the prologue or
>> +	     do we need to do it at every call?  */
>> +	  if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
>> +	    cfun->machine->save_toc_in_prologue = true;
>> +	  else
>>  	    {
>> -	      /* Use USPEC_PLTSEQ here to emit every instruction in an
>> -		 inline PLT call sequence with a reloc, enabling the
>> -		 linker to edit the sequence back to a direct call
>> -		 when that makes sense.  */
>> -	      rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
>> -	      rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
>> -	      emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
>> +	      rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
>> +	      rtx stack_toc_mem = gen_frame_mem (Pmode,
>> +						 gen_rtx_PLUS (Pmode, stack_ptr,
>> +							       stack_toc_offset));
>> +	      MEM_VOLATILE_P (stack_toc_mem) = 1;
>> +	      if (HAVE_AS_PLTSEQ
>> +		  && DEFAULT_ABI == ABI_ELFv2
>> +		  && GET_CODE (func_desc) == SYMBOL_REF)
>> +		{
>> +		  rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
>> +		  rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
>> +		  emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
>> +		}
>> +	      else
>> +		emit_move_insn (stack_toc_mem, toc_reg);
>>  	    }
>> -	  else
>> -	    emit_move_insn (stack_toc_mem, toc_reg);
>>  	}
>>
>>        if (DEFAULT_ABI == ABI_ELFv2)
>> @@ -37813,10 +37882,12 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>>      }
>>    else
>>      {
>> -      /* Direct calls use the TOC: for local calls, the callee will
>> -	 assume the TOC register is set; for non-local calls, the
>> -	 PLT stub needs the TOC register.  */
>> -      abi_reg = toc_reg;
>> +      /* No TOC register needed for calls from PC-relative callers.  */
>> +      if (!rs6000_pcrel_p (cfun))
>> +	/* Direct calls use the TOC: for local calls, the callee will
>> +	   assume the TOC register is set; for non-local calls, the
>> +	   PLT stub needs the TOC register.  */
>> +	abi_reg = toc_reg;
>>        func_addr = func;
>>      }
>>
>> @@ -37866,7 +37937,9 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>>    insn = emit_call_insn (insn);
>>
>>    /* Note use of the TOC register.  */
>> -  use_reg (&CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (Pmode, TOC_REGNUM));
>> +  if (!rs6000_pcrel_p (cfun))
>> +    use_reg (&CALL_INSN_FUNCTION_USAGE (insn),
>> +	     gen_rtx_REG (Pmode, TOC_REGNUM));
>>  }
>>
>>  /* Expand code to perform a call under the SYSV4 ABI.  */
>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>> index 71613e21384..e1d9045c5bb 100644
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -147,6 +147,7 @@
>>     UNSPEC_PLTSEQ
>>     UNSPEC_PLT16_HA
>>     UNSPEC_PLT16_LO
>> +   UNSPEC_PLT_PCREL
>>    ])
>>
>>  ;;
>> @@ -10267,6 +10268,20 @@
>>  {
>>    return rs6000_pltseq_template (operands, 3);
>>  })
>> +
>> +(define_insn "*pltseq_plt_pcrel<mode>"
>> +  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
>> +	(unspec:P [(match_operand:P 1 "" "")
>> +		   (match_operand:P 2 "symbol_ref_operand" "s")
>> +		   (match_operand:P 3 "" "")]
>> +		  UNSPEC_PLT_PCREL))]
>> +  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
>> +   && rs6000_pcrel_p (cfun)"
>> +{
>> +  return rs6000_pltseq_template (operands, 4);
>> +}
>> +  [(set_attr "type" "load")
>> +   (set_attr "length" "12")])
>>  
>>  ;; Call and call_value insns
>>  ;; For the purposes of expanding calls, Darwin is very similar to SYSV.
>> @@ -10582,7 +10597,11 @@
>>  	 (match_operand 1))
>>     (clobber (reg:P LR_REGNO))]
>>    "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2"
>> -  "bl %z0"
>> +{
>> +  if (rs6000_pcrel_p (cfun))
>> +    return "bl %z0@notoc";
>> +  return "bl %z0";
>> +}
>>    [(set_attr "type" "branch")])
>>
>>  (define_insn "*call_value_local_aix<mode>"
>> @@ -10592,7 +10611,11 @@
>>     (clobber (reg:P LR_REGNO))]
>>    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>>     && !IS_NOMARK_TLSGETADDR (operands[2])"
>> -  "bl %z1"
>> +{
>> +  if (rs6000_pcrel_p (cfun))
>> +    return "bl %z1@notoc";
>> +  return "bl %z1";
>> +}
>>    [(set_attr "type" "branch")])
>>
>>  ;; Call to AIX abi function which may be in another module.
>> @@ -10607,7 +10630,10 @@
>>    return rs6000_call_template (operands, 0);
>>  }
>>    [(set_attr "type" "branch")
>> -   (set_attr "length" "8")])
>> +   (set (attr "length")
>> +	(if_then_else (match_test "rs6000_pcrel_p (cfun)")
>> +	  (const_int 4)
>> +	  (const_int 8)))])
>>
>>  (define_insn "*call_value_nonlocal_aix<mode>"
>>    [(set (match_operand 0 "" "")
>> @@ -10623,11 +10649,14 @@
>>  }
>>    [(set_attr "type" "branch")
>>     (set (attr "length")
>> -	(if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
>> -	  (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
>> -	    (const_int 16)
>> -	    (const_int 12))
>> -	  (const_int 8)))])
>> +	(plus (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
>> +		(if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
>> +		  (const_int 8)
>> +		  (const_int 4))
>> +		(const_int 0))
>> +	      (if_then_else (match_test "rs6000_pcrel_p (cfun)")
>> +		(const_int 4)
>> +		(const_int 8))))])
>>
>>  ;; Call to indirect functions with the AIX abi using a 3 word descriptor.
>>  ;; Operand0 is the addresss of the function to call
>> @@ -10700,6 +10729,21 @@
>>  		      (const_string "12")
>>  		      (const_string "8")))])
>>
>> +(define_insn "*call_indirect_pcrel<mode>"
>> +  [(call (mem:SI (match_operand:P 0 "indirect_call_operand" "c,*l,X"))
>> +	 (match_operand 1))
>> +   (clobber (reg:P LR_REGNO))]
>> +  "rs6000_pcrel_p (cfun)"
>> +{
>> +  return rs6000_indirect_call_template (operands, 0);
>> +}
>> +  [(set_attr "type" "jmpreg")
>> +   (set (attr "length")
>> +	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
>> +			   (match_test "which_alternative != 1"))
>> +		      (const_string "8")
>> +		      (const_string "4")))])
>> +
>>  (define_insn "*call_value_indirect_elfv2<mode>"
>>    [(set (match_operand 0 "" "")
>>  	(call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
>> @@ -10728,6 +10772,31 @@
>>  	    (const_string "12")
>>  	    (const_string "8"))))])
>>
>> +(define_insn "*call_value_indirect_pcrel<mode>"
>> +  [(set (match_operand 0 "" "")
>> +	(call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
>> +	      (match_operand:P 2 "unspec_tls" "")))
>> +   (clobber (reg:P LR_REGNO))]
>> +  "rs6000_pcrel_p (cfun)"
>> +{
>> +  if (IS_NOMARK_TLSGETADDR (operands[2]))
>> +    rs6000_output_tlsargs (operands);
>> +
>> +  return rs6000_indirect_call_template (operands, 1);
>> +}
>> +  [(set_attr "type" "jmpreg")
>> +   (set (attr "length")
>> +	(plus
>> +	  (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
>> +	    (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
>> +	      (const_int 8)
>> +	      (const_int 4))
>> +	    (const_int 0))
>> +	  (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
>> +			     (match_test "which_alternative != 1"))
>> +	    (const_string "8")
>> +	    (const_string "4"))))])
>> +
>>  ;; Call subroutine returning any type.
>>  (define_expand "untyped_call"
>>    [(parallel [(call (match_operand 0 "")
>> diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
>> new file mode 100644
>> index 00000000000..c7d322c1c96
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
>> @@ -0,0 +1,41 @@
>> ++/* { dg-do compile } */
>> ++/* { dg-options "-mdejagnu-cpu=future -O2" } */
>> ++/* { dg-require-effective-target powerpc_elfv2 } */
>> +
>> +/* Test that calls generated from PC-relative code are
>> +   annotated with @notoc.  */
>> +
>> +extern int yy0 (int);
>> +extern void yy1 (int);
>> +
>> +int zz0 (void) __attribute__((noinline));
>> +void zz1 (int) __attribute__((noinline));
>> +
>> +int xx (void)
>> +{
>> +  yy1 (7);
>> +  return yy0 (5);
>> +}
>> +
>> +int zz0 ()
>> +{
>> +  asm ("");
>> +  return 16;
>> +};
>> +
>> +void zz1 (int a __attribute__((__unused__)))
>> +{
>> +  asm ("");
>> +};
>> +
>> +int ww (void)
>> +{
>> +  zz1 (zz0 ());
>> +  return 4;
>> +}
>> +
>> +/* { dg-final { scan-assembler {yy1@notoc} } } */
>> +/* { dg-final { scan-assembler {yy0@notoc} } } */
>> +/* { dg-final { scan-assembler {zz1@notoc} } } */
>> +/* { dg-final { scan-assembler {zz0@notoc} } } */
>> +
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
>> new file mode 100644
>> index 00000000000..7c767e2ba32
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
>> @@ -0,0 +1,46 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-mdejagnu-cpu=future -O2" } */
>> +/* { dg-require-effective-target powerpc_elfv2 } */
>> +
>> +/* Test that potential sibcalls are not generated when the caller preserves
>> +   the TOC and the callee doesn't, or vice versa.  */
>> +
>> +int x (void) __attribute__((noinline));
>> +int y (void) __attribute__((noinline));
>> +int xx (void) __attribute__((noinline));
>> +  
>> +int x (void)
>> +{
>> +  return 1;
>> +}
>> +
>> +int y (void)
>> +{
>> +  return 2;
>> +}
>> +
>> +int sib_call (void)
>> +{
>> +  return x ();
>> +}
>> +
>> +#pragma GCC target ("cpu=power9")
>> +int normal_call (void)
>> +{
>> +  return y ();
>> +}
>> +
>> +int xx (void)
>> +{
>> +  return 1;
>> +}
>> +
>> +#pragma GCC target ("cpu=future")
>> +int notoc_call (void)
>> +{
>> +  return xx ();
>> +}
>> +
>> +/* { dg-final { scan-assembler {\mb x@notoc\M} } } */
>> +/* { dg-final { scan-assembler {\mbl y\M} } } */
>> +/* { dg-final { scan-assembler {\mbl xx@notoc\M} } } */
>>
Bill Schmidt May 29, 2019, 1:50 a.m. UTC | #3
Hi,

Please go ahead and review this.  In the test case
gcc.target/powerpc/notoc-direct-1.c, I accidentally left stray '+'
characters in column 1 of the first three lines, which caused the test
case to fail.
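
For reference, this is roughly how the first three lines of
notoc-direct-1.c should read once the stray '+' characters are removed.
It is only a sketch of the corrected header, and it matches the header
already used by pcrel-sibcall-1.c in the same patch:

```c
/* { dg-do compile } */
/* { dg-options "-mdejagnu-cpu=future -O2" } */
/* { dg-require-effective-target powerpc_elfv2 } */
```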

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with that
fixed.  Is this okay for trunk?

Thanks,
Bill

On 5/24/19 9:06 AM, Bill Schmidt wrote:
> New test case ICEs, so consider this withdrawn.  Sorry again about this.
>
> Bill
>
> On 5/23/19 9:17 PM, Bill Schmidt wrote:
>> Hm, I got ahead of myself on this one.  I haven't done the regstrap yet,
>> so please hold off reviewing for now.
>>
>> Sorry for the noise.  I shouldn't post when I'm tired...
>>
>> Thanks,
>> Bill
>>
>> On 5/23/19 9:11 PM, Bill Schmidt wrote:
>>> Hi,
>>>
>>> This patch contains the changes to implement call flow for PC-relative addressing.  
>>> It's an amalgam of several internal patches that Alan and I worked on, and as a 
>>> result it's hard to tease apart individual pieces much further.  So I apologize 
>>> that this patch is a little larger than the others.  Also, I've CC'd Alan so he 
>>> can help answer questions about the patch, particularly the PLT bits I'm not very
>>> familiar with.
>>>
>>> Following are descriptions of the individual patches that are combined here.
>>>
>>> (1) When a function uses PC-relative code generation, all direct calls (other than 
>>> sibcalls) that the function makes to local or external callees should appear as
>>> "bl sym@notoc" and should not be followed by a nop instruction.  @notoc indicates
>>> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
>>> that the caller does not guarantee that r2 contains a valid TOC pointer.  Thus
>>> the linker should not try to replace a subsequent "nop" with a TOC restore
>>> instruction.
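
To make item (1) concrete, here is a minimal standalone sketch of the
choice that rs6000_call_template_1 makes for the AIX/ELFv2 case.  The
helper name call_template and its parameters are hypothetical, the TLS
argument suffix is left out, and this is a model of the string that gets
built rather than the code in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the string built in rs6000_call_template_1.
   SYM plays the role of the %z operand.  */
static const char *
call_template (const char *sym, bool pcrel, bool sibcall)
{
  static char str[64];

  /* Longest output is "bl " + SYM + "@notoc" plus the terminating NUL.  */
  assert (strlen (sym) + 10 <= sizeof str);

  if (pcrel)
    /* PC-relative caller: mark the call @notoc, no trailing nop.  */
    snprintf (str, sizeof str, "b%s %s@notoc", sibcall ? "" : "l", sym);
  else
    /* TOC-based caller: non-sibcalls keep the nop so the linker can
       turn it into a TOC restore.  */
    snprintf (str, sizeof str, "b%s %s%s", sibcall ? "" : "l", sym,
	      sibcall ? "" : "\n\tnop");
  return str;
}
```

Under these assumptions, call_template ("foo", true, false) yields
"bl foo@notoc", while the non-pcrel variant yields "bl foo" followed by
a nop.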
>>>
>>> I've added a test case for the four cases handled here:  local calls with/without
>>> a return value, and external calls with/without a return value.
>>>
>>> (2) If a caller preserves the TOC pointer and the callee does not, or vice versa,
>>> then a sibcall will cause an inconsistency.  Don't allow that.
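
The rule in item (2) can be modelled in a few lines.  This is a sketch
of the AIX/ELFv2 branch of rs6000_decl_ok_for_sibcall only; the struct
callee_info and the function name sibcall_ok_elfv2 are hypothetical
stand-ins for the tree decl and the target hooks the real code queries:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what the compiler knows about a callee.  */
struct callee_info
{
  bool external;     /* models DECL_EXTERNAL */
  bool weak;         /* models DECL_WEAK */
  bool binds_local;  /* models targetm.binds_local_p */
  bool pcrel;        /* models rs6000_fndecl_pcrel_p */
};

/* Model of the AIX/ELFv2 sibcall test.  CALLEE is NULL when there is
   no decl for the target (for example, an indirect call).  */
static bool
sibcall_ok_elfv2 (const struct callee_info *callee, bool caller_pcrel)
{
  /* Non-local callees may use a different TOC pointer.  */
  if (callee == NULL || callee->external || callee->weak
      || !callee->binds_local)
    return false;

  /* Caller and callee must agree on whether r2 is preserved.  */
  return callee->pcrel == caller_pcrel;
}
```

This lines up with pcrel-sibcall-1.c: a pcrel caller with a local pcrel
callee gets a sibcall, while the two mixed pcrel/non-pcrel cases fall
back to an ordinary call.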
>>>
>>> (3) The linker needs a @notoc directive on sibcall targets when the caller does not
>>> provide or preserve a TOC pointer.  This patch provides for that.
>>>
>>> In creating the new sibcall patterns, I did not duplicate the "c" alternatives
>>> that allow for bctr or blr sibcalls.  I don't think there's a way to generate
>>> those currently.  The bctr would be legitimate for PC-relative sibcalls if you
>>> can prove that the target function is in the same binary, but we don't appear
>>> to detect that possibility today.
>>>
>>>     (4) This patch deletes all the extra insns added to handle pcrel calls,
>>>     instead opting to use existing insns but making their output
>>>     conditional on rs6000_pcrel_p(cfun).  There isn't a need to
>>>     differentiate between pcrel and non-pcrel calls at the point rtl is
>>>     created; rs6000_pcrel_p is valid right up to the final pass, as
>>>     evidenced by use of rs6000_pcrel_p to emit .localentry.
>>>     
>>>     There is one case however where we do need new insns: The existing
>>>     nonlocal indirect call insns mention r2 in their rtl.  That isn't
>>>     correct for pcrel indirect calls, and may cause problems when/if r2
>>>     is allocated as any other volatile gpr in pcrel code.
>>>     
>>>     The patch also fixes pcrel inline PLT calls (which are used for
>>>     -fno-plt and -mlongcall) to use a pcrel load from the PLT rather than
>>>     attempting (and failing) to use TOC-relative loads.  This requires
>>>     some changes in the way relocs are emitted.  For prefix insns we can't
>>>     write
>>>        .reloc .,R_PPC64_PLT_PCREL34_NOTOC,foo
>>>        pld 12,0(0),1
>>>     since the pld may require a padding nop.  Instead it's necessary to
>>>     put the .reloc after the instruction or use a label on the insn.  Like
>>>     this (which is what the patch does):
>>>        pld 12,0(0),1
>>>        .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
>>>     or this:
>>>        .reloc 0f,R_PPC64_PLT_PCREL34_NOTOC,foo
>>>     0: pld 12,0(0),1
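
As an illustration of the first form above, here is a small sketch of
how the pld and its trailing .reloc could be formatted together.  It is
loosely modelled on case 4 of rs6000_pltseq_template for a 64-bit target
with no TLS marker; the helper name plt_pcrel34_load and its parameters
are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a pcrel PLT load into register REG for
   symbol SYM, with the reloc emitted after the instruction.  */
static const char *
plt_pcrel34_load (int reg, const char *sym)
{
  static char str[96];

  assert (reg >= 0 && reg < 32);
  assert (strlen (sym) < 32);	/* keeps the result within STR */

  /* The .reloc follows the 8-byte prefixed pld, so ".-8" names the
     start of the pld even if the assembler put a padding nop in
     front of it.  */
  snprintf (str, sizeof str,
	    "pld %d,0(0),1\n\t"
	    ".reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,%s",
	    reg, sym);
  return str;
}
```

Calling plt_pcrel34_load (12, "foo") produces the two-line sequence
shown in the description.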
>>>
>>> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
>>> Is this okay for trunk?
>>>
>>> Thanks!
>>> Bill
>>>
>>>
>>> [gcc]
>>>
>>> 2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>
>>> 	    Alan Modra  <amodra@gmail.com>
>>>
>>> 	* config/rs6000/rs6000.c (rs6000_call_template_1): Handle pcrel
>>> 	calls here...
>>> 	(rs6000_indirect_call_template_1): ...and here.
>>> 	(rs6000_pltseq_template): Handle plt_pcrel34.  Rework
>>> 	tocsave, plt16_ha, plt16_lo, mtctr indirect calls.
>>> 	(rs6000_decl_ok_for_sibcall): New function.
>>> 	(rs6000_function_ok_for_sibcall): Refactor.
>>> 	(rs6000_longcall_ref): Use UNSPEC_PLT_PCREL when pcrel.
>>> 	(rs6000_call_aix): Don't emit toc restore rtl for indirect calls
>>> 	when pcrel.  Reorganize.
>>> 	(rs6000_sibcall_aix): Don't add r2 to function usage when pcrel.
>>> 	* config/rs6000/rs6000.md (UNSPEC_PLT_PCREL): New unspec.
>>> 	(*pltseq_plt_pcrel): New insn.
>>> 	(*call_local_aix): Handle @notoc calls.
>>> 	(*call_value_local_aix): Likewise.
>>> 	(*call_nonlocal_aix): Adjust lengths for pcrel calls.
>>> 	(*call_value_nonlocal_aix): Likewise.
>>> 	(*call_indirect_pcrel): New insn.
>>> 	(*call_value_indirect_pcrel): Likewise.
>>>
>>>
>>> [gcc/testsuite]
>>>
>>> 2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>
>>>
>>> 	* gcc.target/powerpc/notoc-direct-1.c: New.
>>> 	* gcc.target/powerpc/pcrel-sibcall-1.c: New.
>>>
>>>
>>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>>> index 3d5cf9e4ece..9229bad6acc 100644
>>> --- a/gcc/config/rs6000/rs6000.c
>>> +++ b/gcc/config/rs6000/rs6000.c
>>> @@ -21268,7 +21268,9 @@ rs6000_call_template_1 (rtx *operands, unsigned int funop, bool sibcall)
>>>  	    ? "+32768" : ""));
>>>
>>>    static char str[32];  /* 2 spare */
>>> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>>> +  if (rs6000_pcrel_p (cfun))
>>> +    sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
>>> +  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>>>      sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
>>>  	     sibcall ? "" : "\n\tnop");
>>>    else if (DEFAULT_ABI == ABI_V4)
>>> @@ -21333,6 +21335,16 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
>>>    /* Currently, funop is either 0 or 1.  The maximum string is always
>>>       a !speculate 64-bit __tls_get_addr call.
>>>
>>> +     ABI_ELFv2, pcrel:
>>> +     . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
>>> +     . 35	.reloc .,R_PPC64_PLTSEQ_NOTOC,%z1\n\t
>>> +     .  9	crset 2\n\t
>>> +     . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
>>> +     . 36	.reloc .,R_PPC64_PLTCALL_NOTOC,%z1\n\t
>>> +     .  8	beq%T1l-
>>> +     .---
>>> +     .142
>>> +
>>>       ABI_AIX:
>>>       .  9	ld 2,%3\n\t
>>>       . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
>>> @@ -21398,23 +21410,31 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
>>>  	    gcc_unreachable ();
>>>  	}
>>>
>>> +      const char *notoc = rs6000_pcrel_p (cfun) ? "_NOTOC" : "";
>>>        const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
>>>  			    && flag_pic == 2 ? "+32768" : "");
>>>        if (!speculate)
>>>  	{
>>>  	  s += sprintf (s,
>>> -			"%s.reloc .,R_PPC%s_PLTSEQ,%%z%u%s\n\t",
>>> -			tls, rel64, funop, addend);
>>> +			"%s.reloc .,R_PPC%s_PLTSEQ%s,%%z%u%s\n\t",
>>> +			tls, rel64, notoc, funop, addend);
>>>  	  s += sprintf (s, "crset 2\n\t");
>>>  	}
>>>        s += sprintf (s,
>>> -		    "%s.reloc .,R_PPC%s_PLTCALL,%%z%u%s\n\t",
>>> -		    tls, rel64, funop, addend);
>>> +		    "%s.reloc .,R_PPC%s_PLTCALL%s,%%z%u%s\n\t",
>>> +		    tls, rel64, notoc, funop, addend);
>>>      }
>>>    else if (!speculate)
>>>      s += sprintf (s, "crset 2\n\t");
>>>
>>> -  if (DEFAULT_ABI == ABI_AIX)
>>> +  if (rs6000_pcrel_p (cfun))
>>> +    {
>>> +      if (speculate)
>>> +	sprintf (s, "b%%T%ul", funop);
>>> +      else
>>> +	sprintf (s, "beq%%T%ul-", funop);
>>> +    }
>>> +  else if (DEFAULT_ABI == ABI_AIX)
>>>      {
>>>        if (speculate)
>>>  	sprintf (s,
>>> @@ -21468,63 +21488,73 @@ rs6000_indirect_sibcall_template (rtx *operands, unsigned int funop)
>>>
>>>  #if HAVE_AS_PLTSEQ
>>>  /* Output indirect call insns.
>>> -   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr.  */
>>> +   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr,
>>> +   4 for plt_pcrel34.  */
>>>  const char *
>>>  rs6000_pltseq_template (rtx *operands, int which)
>>>  {
>>>    const char *rel64 = TARGET_64BIT ? "64" : "";
>>> -  char tls[28];
>>> +  char tls[30];
>>>    tls[0] = 0;
>>>    if (TARGET_TLS_MARKERS && GET_CODE (operands[3]) == UNSPEC)
>>>      {
>>> +      char off = which == 4 ? '8' : '4';
>>>        if (XINT (operands[3], 1) == UNSPEC_TLSGD)
>>> -	sprintf (tls, ".reloc .,R_PPC%s_TLSGD,%%3\n\t",
>>> -		 rel64);
>>> +	sprintf (tls, ".reloc .-%c,R_PPC%s_TLSGD,%%3\n\t",
>>> +		 off, rel64);
>>>        else if (XINT (operands[3], 1) == UNSPEC_TLSLD)
>>> -	sprintf (tls, ".reloc .,R_PPC%s_TLSLD,%%&\n\t",
>>> -		 rel64);
>>> +	sprintf (tls, ".reloc .-%c,R_PPC%s_TLSLD,%%&\n\t",
>>> +		 off, rel64);
>>>        else
>>>  	gcc_unreachable ();
>>>      }
>>>
>>>    gcc_assert (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_V4);
>>> -  static char str[96];  /* 15 spare */
>>> -  const char *off = WORDS_BIG_ENDIAN ? "+2" : "";
>>> +  static char str[96];  /* 10 spare */
>>> +  char off = WORDS_BIG_ENDIAN ? '2' : '4';
>>>    const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
>>>  			&& flag_pic == 2 ? "+32768" : "");
>>>    switch (which)
>>>      {
>>>      case 0:
>>>        sprintf (str,
>>> -	       "%s.reloc .,R_PPC%s_PLTSEQ,%%z2\n\t"
>>> -	       "st%s",
>>> -	       tls, rel64, TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)");
>>> +	       "st%s\n\t"
>>> +	       "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2",
>>> +	       TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)",
>>> +	       tls, rel64);
>>>        break;
>>>      case 1:
>>>        if (DEFAULT_ABI == ABI_V4 && !flag_pic)
>>>  	sprintf (str,
>>> -		 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2\n\t"
>>> -		 "lis %%0,0",
>>> +		 "lis %%0,0\n\t"
>>> +		 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2",
>>>  		 tls, off, rel64);
>>>        else
>>>  	sprintf (str,
>>> -		 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2%s\n\t"
>>> -		 "addis %%0,%%1,0",
>>> +		 "addis %%0,%%1,0\n\t"
>>> +		 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2%s",
>>>  		 tls, off, rel64, addend);
>>>        break;
>>>      case 2:
>>>        sprintf (str,
>>> -	       "%s.reloc .%s,R_PPC%s_PLT16_LO%s,%%z2%s\n\t"
>>> -	       "l%s %%0,0(%%1)",
>>> -	       tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend,
>>> -	       TARGET_64BIT ? "d" : "wz");
>>> +	       "l%s %%0,0(%%1)\n\t"
>>> +	       "%s.reloc .-%c,R_PPC%s_PLT16_LO%s,%%z2%s",
>>> +	       TARGET_64BIT ? "d" : "wz",
>>> +	       tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend);
>>>        break;
>>>      case 3:
>>>        sprintf (str,
>>> -	       "%s.reloc .,R_PPC%s_PLTSEQ,%%z2%s\n\t"
>>> -	       "mtctr %%1",
>>> +	       "mtctr %%1\n\t"
>>> +	       "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2%s",
>>>  	       tls, rel64, addend);
>>>        break;
>>> +    case 4:
>>> +      sprintf (str,
>>> +	       "pl%s %%0,0(0),1\n\t"
>>> +	       "%s.reloc .-8,R_PPC%s_PLT_PCREL34_NOTOC,%%z2",
>>> +	       TARGET_64BIT ? "d" : "wz",
>>> +	       tls, rel64);
>>> +      break;
>>>      default:
>>>        gcc_unreachable ();
>>>      }
>>> @@ -24703,6 +24733,53 @@ rs6000_return_addr (int count, rtx frame)
>>>    return get_hard_reg_initial_val (Pmode, LR_REGNO);
>>>  }
>>>
>>> +/* Helper function for rs6000_function_ok_for_sibcall.  */
>>> +
>>> +static bool
>>> +rs6000_decl_ok_for_sibcall (tree decl)
>>> +{
>>> +  /* Sibcalls are always fine for the Darwin ABI.  */
>>> +  if (DEFAULT_ABI == ABI_DARWIN)
>>> +    return true;
>>> +
>>> +  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>>> +    {
>>> +      /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
>>> +	 functions, because the callee may have a different TOC pointer to
>>> +	 the caller and there's no way to ensure we restore the TOC when
>>> +	 we return.  */
>>> +      if (!decl || DECL_EXTERNAL (decl) || DECL_WEAK (decl)
>>> +	  || !(*targetm.binds_local_p) (decl))
>>> +	return false;
>>> +
>>> +      /* Similarly, if the caller preserves the TOC pointer and the callee
>>> +	 doesn't (or vice versa), proper TOC setup or restoration will be
>>> +	 missed.  For example, suppose A, B, and C are in the same binary
>>> +	 and A -> B -> C.  A and B preserve the TOC pointer but C does not,
>>> +	 and B -> C is eligible as a sibcall.  A will call B through its
>>> +	 local entry point, so A will not restore its TOC itself.  B calls
>>> +	 C with a sibcall, so it will not restore the TOC.  C does not
>>> +	 preserve the TOC, so it may clobber r2 with impunity.  Returning
>>> +	 from C will result in a corrupted TOC for A.  */
>>> +      else if (rs6000_fndecl_pcrel_p (decl) != rs6000_pcrel_p (cfun))
>>> +	return false;
>>> +
>>> +      else
>>> +	return true;
>>> +    }
>>> +
>>> +  /*  With the secure-plt SYSV ABI we can't make non-local calls when
>>> +      -fpic/PIC because the plt call stubs use r30.  */
>>> +  if (DEFAULT_ABI == ABI_V4
>>> +      && (!TARGET_SECURE_PLT
>>> +	  || !flag_pic
>>> +	  || (decl
>>> +	      && (*targetm.binds_local_p) (decl))))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>>> +
>>>  /* Say whether a function is a candidate for sibcall handling or not.  */
>>>
>>>  static bool
>>> @@ -24748,22 +24825,7 @@ rs6000_function_ok_for_sibcall (tree decl, tree exp)
>>>  	return false;
>>>      }
>>>
>>> -  /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
>>> -     functions, because the callee may have a different TOC pointer to
>>> -     the caller and there's no way to ensure we restore the TOC when
>>> -     we return.  With the secure-plt SYSV ABI we can't make non-local
>>> -     calls when -fpic/PIC because the plt call stubs use r30.  */
>>> -  if (DEFAULT_ABI == ABI_DARWIN
>>> -      || ((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>>> -	  && decl
>>> -	  && !DECL_EXTERNAL (decl)
>>> -	  && !DECL_WEAK (decl)
>>> -	  && (*targetm.binds_local_p) (decl))
>>> -      || (DEFAULT_ABI == ABI_V4
>>> -	  && (!TARGET_SECURE_PLT
>>> -	      || !flag_pic
>>> -	      || (decl
>>> -		  && (*targetm.binds_local_p) (decl)))))
>>> +  if (rs6000_decl_ok_for_sibcall (decl))
>>>      {
>>>        tree attr_list = TYPE_ATTRIBUTES (fntype);
>>>
>>> @@ -32592,12 +32654,18 @@ rs6000_longcall_ref (rtx call_ref, rtx arg)
>>>    if (TARGET_PLTSEQ)
>>>      {
>>>        rtx base = const0_rtx;
>>> -      int regno;
>>> -      if (DEFAULT_ABI == ABI_ELFv2)
>>> +      int regno = 12;
>>> +      if (rs6000_pcrel_p (cfun))
>>>  	{
>>> -	  base = gen_rtx_REG (Pmode, TOC_REGISTER);
>>> -	  regno = 12;
>>> +	  rtx reg = gen_rtx_REG (Pmode, regno);
>>> +	  rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
>>> +				  UNSPEC_PLT_PCREL);
>>> +	  emit_insn (gen_rtx_SET (reg, u));
>>> +	  return reg;
>>>  	}
>>> +
>>> +      if (DEFAULT_ABI == ABI_ELFv2)
>>> +	base = gen_rtx_REG (Pmode, TOC_REGISTER);
>>>        else
>>>  	{
>>>  	  if (flag_pic)
>>> @@ -37706,37 +37774,38 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>>>    if (!SYMBOL_REF_P (func)
>>>        || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (func)))
>>>      {
>>> -      /* Save the TOC into its reserved slot before the call,
>>> -	 and prepare to restore it after the call.  */
>>> -      rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
>>> -      rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
>>> -					     gen_rtvec (1, stack_toc_offset),
>>> -					     UNSPEC_TOCSLOT);
>>> -      toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
>>> -
>>> -      /* Can we optimize saving the TOC in the prologue or
>>> -	 do we need to do it at every call?  */
>>> -      if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
>>> -	cfun->machine->save_toc_in_prologue = true;
>>> -      else
>>> +      if (!rs6000_pcrel_p (cfun))
>>>  	{
>>> -	  rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
>>> -	  rtx stack_toc_mem = gen_frame_mem (Pmode,
>>> -					     gen_rtx_PLUS (Pmode, stack_ptr,
>>> -							   stack_toc_offset));
>>> -	  MEM_VOLATILE_P (stack_toc_mem) = 1;
>>> -	  if (is_pltseq_longcall)
>>> +	  /* Save the TOC into its reserved slot before the call,
>>> +	     and prepare to restore it after the call.  */
>>> +	  rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
>>> +	  rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
>>> +						 gen_rtvec (1, stack_toc_offset),
>>> +						 UNSPEC_TOCSLOT);
>>> +	  toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
>>> +
>>> +	  /* Can we optimize saving the TOC in the prologue or
>>> +	     do we need to do it at every call?  */
>>> +	  if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
>>> +	    cfun->machine->save_toc_in_prologue = true;
>>> +	  else
>>>  	    {
>>> -	      /* Use USPEC_PLTSEQ here to emit every instruction in an
>>> -		 inline PLT call sequence with a reloc, enabling the
>>> -		 linker to edit the sequence back to a direct call
>>> -		 when that makes sense.  */
>>> -	      rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
>>> -	      rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
>>> -	      emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
>>> +	      rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
>>> +	      rtx stack_toc_mem = gen_frame_mem (Pmode,
>>> +						 gen_rtx_PLUS (Pmode, stack_ptr,
>>> +							       stack_toc_offset));
>>> +	      MEM_VOLATILE_P (stack_toc_mem) = 1;
>>> +	      if (HAVE_AS_PLTSEQ
>>> +		  && DEFAULT_ABI == ABI_ELFv2
>>> +		  && GET_CODE (func_desc) == SYMBOL_REF)
>>> +		{
>>> +		  rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
>>> +		  rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
>>> +		  emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
>>> +		}
>>> +	      else
>>> +		emit_move_insn (stack_toc_mem, toc_reg);
>>>  	    }
>>> -	  else
>>> -	    emit_move_insn (stack_toc_mem, toc_reg);
>>>  	}
>>>
>>>        if (DEFAULT_ABI == ABI_ELFv2)
>>> @@ -37813,10 +37882,12 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>>>      }
>>>    else
>>>      {
>>> -      /* Direct calls use the TOC: for local calls, the callee will
>>> -	 assume the TOC register is set; for non-local calls, the
>>> -	 PLT stub needs the TOC register.  */
>>> -      abi_reg = toc_reg;
>>> +      /* No TOC register needed for calls from PC-relative callers.  */
>>> +      if (!rs6000_pcrel_p (cfun))
>>> +	/* Direct calls use the TOC: for local calls, the callee will
>>> +	   assume the TOC register is set; for non-local calls, the
>>> +	   PLT stub needs the TOC register.  */
>>> +	abi_reg = toc_reg;
>>>        func_addr = func;
>>>      }
>>>
>>> @@ -37866,7 +37937,9 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>>>    insn = emit_call_insn (insn);
>>>
>>>    /* Note use of the TOC register.  */
>>> -  use_reg (&CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (Pmode, TOC_REGNUM));
>>> +  if (!rs6000_pcrel_p (cfun))
>>> +    use_reg (&CALL_INSN_FUNCTION_USAGE (insn),
>>> +	     gen_rtx_REG (Pmode, TOC_REGNUM));
>>>  }
>>>
>>>  /* Expand code to perform a call under the SYSV4 ABI.  */
>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>>> index 71613e21384..e1d9045c5bb 100644
>>> --- a/gcc/config/rs6000/rs6000.md
>>> +++ b/gcc/config/rs6000/rs6000.md
>>> @@ -147,6 +147,7 @@
>>>     UNSPEC_PLTSEQ
>>>     UNSPEC_PLT16_HA
>>>     UNSPEC_PLT16_LO
>>> +   UNSPEC_PLT_PCREL
>>>    ])
>>>
>>>  ;;
>>> @@ -10267,6 +10268,20 @@
>>>  {
>>>    return rs6000_pltseq_template (operands, 3);
>>>  })
>>> +
>>> +(define_insn "*pltseq_plt_pcrel<mode>"
>>> +  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
>>> +	(unspec:P [(match_operand:P 1 "" "")
>>> +		   (match_operand:P 2 "symbol_ref_operand" "s")
>>> +		   (match_operand:P 3 "" "")]
>>> +		  UNSPEC_PLT_PCREL))]
>>> +  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
>>> +   && rs6000_pcrel_p (cfun)"
>>> +{
>>> +  return rs6000_pltseq_template (operands, 4);
>>> +}
>>> +  [(set_attr "type" "load")
>>> +   (set_attr "length" "12")])
>>>  
>>>  ;; Call and call_value insns
>>>  ;; For the purposes of expanding calls, Darwin is very similar to SYSV.
>>> @@ -10582,7 +10597,11 @@
>>>  	 (match_operand 1))
>>>     (clobber (reg:P LR_REGNO))]
>>>    "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2"
>>> -  "bl %z0"
>>> +{
>>> +  if (rs6000_pcrel_p (cfun))
>>> +    return "bl %z0@notoc";
>>> +  return "bl %z0";
>>> +}
>>>    [(set_attr "type" "branch")])
>>>
>>>  (define_insn "*call_value_local_aix<mode>"
>>> @@ -10592,7 +10611,11 @@
>>>     (clobber (reg:P LR_REGNO))]
>>>    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>>>     && !IS_NOMARK_TLSGETADDR (operands[2])"
>>> -  "bl %z1"
>>> +{
>>> +  if (rs6000_pcrel_p (cfun))
>>> +    return "bl %z1@notoc";
>>> +  return "bl %z1";
>>> +}
>>>    [(set_attr "type" "branch")])
>>>
>>>  ;; Call to AIX abi function which may be in another module.
>>> @@ -10607,7 +10630,10 @@
>>>    return rs6000_call_template (operands, 0);
>>>  }
>>>    [(set_attr "type" "branch")
>>> -   (set_attr "length" "8")])
>>> +   (set (attr "length")
>>> +	(if_then_else (match_test "rs6000_pcrel_p (cfun)")
>>> +	  (const_int 4)
>>> +	  (const_int 8)))])
>>>
>>>  (define_insn "*call_value_nonlocal_aix<mode>"
>>>    [(set (match_operand 0 "" "")
>>> @@ -10623,11 +10649,14 @@
>>>  }
>>>    [(set_attr "type" "branch")
>>>     (set (attr "length")
>>> -	(if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
>>> -	  (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
>>> -	    (const_int 16)
>>> -	    (const_int 12))
>>> -	  (const_int 8)))])
>>> +	(plus (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
>>> +		(if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
>>> +		  (const_int 8)
>>> +		  (const_int 4))
>>> +		(const_int 0))
>>> +	      (if_then_else (match_test "rs6000_pcrel_p (cfun)")
>>> +		(const_int 4)
>>> +		(const_int 8))))])
>>>
>>>  ;; Call to indirect functions with the AIX abi using a 3 word descriptor.
>>>  ;; Operand0 is the addresss of the function to call
>>> @@ -10700,6 +10729,21 @@
>>>  		      (const_string "12")
>>>  		      (const_string "8")))])
>>>
>>> +(define_insn "*call_indirect_pcrel<mode>"
>>> +  [(call (mem:SI (match_operand:P 0 "indirect_call_operand" "c,*l,X"))
>>> +	 (match_operand 1))
>>> +   (clobber (reg:P LR_REGNO))]
>>> +  "rs6000_pcrel_p (cfun)"
>>> +{
>>> +  return rs6000_indirect_call_template (operands, 0);
>>> +}
>>> +  [(set_attr "type" "jmpreg")
>>> +   (set (attr "length")
>>> +	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
>>> +			   (match_test "which_alternative != 1"))
>>> +		      (const_string "8")
>>> +		      (const_string "4")))])
>>> +
>>>  (define_insn "*call_value_indirect_elfv2<mode>"
>>>    [(set (match_operand 0 "" "")
>>>  	(call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
>>> @@ -10728,6 +10772,31 @@
>>>  	    (const_string "12")
>>>  	    (const_string "8"))))])
>>>
>>> +(define_insn "*call_value_indirect_pcrel<mode>"
>>> +  [(set (match_operand 0 "" "")
>>> +	(call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
>>> +	      (match_operand:P 2 "unspec_tls" "")))
>>> +   (clobber (reg:P LR_REGNO))]
>>> +  "rs6000_pcrel_p (cfun)"
>>> +{
>>> +  if (IS_NOMARK_TLSGETADDR (operands[2]))
>>> +    rs6000_output_tlsargs (operands);
>>> +
>>> +  return rs6000_indirect_call_template (operands, 1);
>>> +}
>>> +  [(set_attr "type" "jmpreg")
>>> +   (set (attr "length")
>>> +	(plus
>>> +	  (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
>>> +	    (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
>>> +	      (const_int 8)
>>> +	      (const_int 4))
>>> +	    (const_int 0))
>>> +	  (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
>>> +			     (match_test "which_alternative != 1"))
>>> +	    (const_string "8")
>>> +	    (const_string "4"))))])
>>> +
>>>  ;; Call subroutine returning any type.
>>>  (define_expand "untyped_call"
>>>    [(parallel [(call (match_operand 0 "")
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
>>> new file mode 100644
>>> index 00000000000..c7d322c1c96
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
>>> @@ -0,0 +1,41 @@
>>> ++/* { dg-do compile } */
>>> ++/* { dg-options "-mdejagnu-cpu=future -O2" } */
>>> ++/* { dg-require-effective-target powerpc_elfv2 } */
>>> +
>>> +/* Test that calls generated from PC-relative code are
>>> +   annotated with @notoc.  */
>>> +
>>> +extern int yy0 (int);
>>> +extern void yy1 (int);
>>> +
>>> +int zz0 (void) __attribute__((noinline));
>>> +void zz1 (int) __attribute__((noinline));
>>> +
>>> +int xx (void)
>>> +{
>>> +  yy1 (7);
>>> +  return yy0 (5);
>>> +}
>>> +
>>> +int zz0 ()
>>> +{
>>> +  asm ("");
>>> +  return 16;
>>> +};
>>> +
>>> +void zz1 (int a __attribute__((__unused__)))
>>> +{
>>> +  asm ("");
>>> +};
>>> +
>>> +int ww (void)
>>> +{
>>> +  zz1 (zz0 ());
>>> +  return 4;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler {yy1@notoc} } } */
>>> +/* { dg-final { scan-assembler {yy0@notoc} } } */
>>> +/* { dg-final { scan-assembler {zz1@notoc} } } */
>>> +/* { dg-final { scan-assembler {zz0@notoc} } } */
>>> +
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
>>> new file mode 100644
>>> index 00000000000..7c767e2ba32
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
>>> @@ -0,0 +1,46 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-mdejagnu-cpu=future -O2" } */
>>> +/* { dg-require-effective-target powerpc_elfv2 } */
>>> +
>>> +/* Test that potential sibcalls are not generated when the caller preserves
>>> +   the TOC and the callee doesn't, or vice versa.  */
>>> +
>>> +int x (void) __attribute__((noinline));
>>> +int y (void) __attribute__((noinline));
>>> +int xx (void) __attribute__((noinline));
>>> +  
>>> +int x (void)
>>> +{
>>> +  return 1;
>>> +}
>>> +
>>> +int y (void)
>>> +{
>>> +  return 2;
>>> +}
>>> +
>>> +int sib_call (void)
>>> +{
>>> +  return x ();
>>> +}
>>> +
>>> +#pragma GCC target ("cpu=power9")
>>> +int normal_call (void)
>>> +{
>>> +  return y ();
>>> +}
>>> +
>>> +int xx (void)
>>> +{
>>> +  return 1;
>>> +}
>>> +
>>> +#pragma GCC target ("cpu=future")
>>> +int notoc_call (void)
>>> +{
>>> +  return xx ();
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler {\mb x@notoc\M} } } */
>>> +/* { dg-final { scan-assembler {\mbl y\M} } } */
>>> +/* { dg-final { scan-assembler {\mbl xx@notoc\M} } } */
>>>
Segher Boessenkool May 29, 2019, 12:40 p.m. UTC | #4
Hi Bill,

On Thu, May 23, 2019 at 09:11:44PM -0500, Bill Schmidt wrote:
> (1) When a function uses PC-relative code generation, all direct calls (other than 
> sibcalls) that the function makes to local or external callees should appear as
> "bl sym@notoc" and should not be followed by a nop instruction.  @notoc indicates
> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
> that the caller does not guarantee that r2 contains a valid TOC pointer.  Thus
> the linker should not try to replace a subsequent "nop" with a TOC restore
> instruction.

All necessary linker (and binutils and GAS) support is upstream already, right?

> In creating the new sibcall patterns, I did not duplicate the "c" alternatives
> that allow for bctr or blr sibcalls.  I don't think there's a way to generate
> those currently.  The bctr would be legitimate for PC-relative sibcalls if you
> can prove that the target function is in the same binary, but we don't appear
> to detect that possibility today.

But you could see that the target is in the same translation unit, for example?
That should be a simple test to make, too.

>        pld 12,0(0),1
>        .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo

Are we guaranteed the assembler always writes a pld like this as 8 bytes?

> 	* gcc.target/powerpc/notoc-direct-1.c: New.
> 	* gcc.target/powerpc/pcrel-sibcall-1.c: New.

A few more testcases would be useful.  Well, we'll gain a lot of 'em soon
enough, I suppose.

>    static char str[32];  /* 2 spare */
> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> +  if (rs6000_pcrel_p (cfun))
> +    sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
> +  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>      sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
>  	     sibcall ? "" : "\n\tnop");

Two spare, and you add one char (@notoc vs. ..nop), so at a minimum you
need to correct the comment?
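
(Sketch, not part of the patch.)  One way to check the "spare" arithmetic
is to let snprintf measure the two format strings.  The helper name
call_template_size is hypothetical, and z and arg are passed in as plain
strings standing in for the values the real function builds.

```c
#include <stdio.h>

/* Hypothetical helper, not part of GCC: returns the number of bytes,
   including the trailing NUL, that the call template would occupy.
   The two format strings are copied from the quoted hunk.  */
static int
call_template_size (int pcrel, int sibcall, const char *z, const char *arg)
{
  int n;

  if (pcrel)
    /* PC-relative form: "@notoc" suffix, no trailing nop.  */
    n = snprintf (NULL, 0, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
  else
    /* TOC form: a nop follows every non-sibcall branch.  */
    n = snprintf (NULL, 0, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
		  sibcall ? "" : "\n\tnop");

  return n + 1;
}
```

For a non-sibcall with the same z and arg, the PC-relative result is one
byte larger than the TOC result, since "@notoc" is six characters and
"\n\tnop" is five.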

> +  if (DEFAULT_ABI == ABI_V4
> +      && (!TARGET_SECURE_PLT
> +	  || !flag_pic
> +	  || (decl
> +	      && (*targetm.binds_local_p) (decl))))
> +    return true;
> +
> +  return false;

Please invert this (put the "return false" condition in the if, like the
preceding comment says).
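
(Sketch, not part of the patch.)  The inverted test could look roughly
like the second function below.  It assumes DEFAULT_ABI == ABI_V4 is the
only ABI that reaches this point, so only the inner condition is modelled.
The struct and function names are hypothetical, and the GCC predicates are
replaced by plain booleans.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for TARGET_SECURE_PLT, flag_pic, decl != NULL
   and (*targetm.binds_local_p) (decl).  */
struct v4_state
{
  bool secure_plt;
  bool pic;
  bool have_decl;
  bool binds_local;
};

/* The condition as posted: say when a sibcall is allowed.  */
static bool
v4_ok_original (struct v4_state s)
{
  if (!s.secure_plt || !s.pic || (s.have_decl && s.binds_local))
    return true;
  return false;
}

/* The inverted condition: say when a sibcall is refused, which matches
   the wording of the comment about secure-plt and -fpic/PIC.  */
static bool
v4_ok_inverted (struct v4_state s)
{
  if (s.secure_plt && s.pic && (!s.have_decl || !s.binds_local))
    return false;
  return true;
}

/* Compare the two forms over all sixteen input combinations.  */
static bool
forms_agree (void)
{
  for (int bits = 0; bits < 16; bits++)
    {
      struct v4_state s;
      s.secure_plt = (bits & 1) != 0;
      s.pic = (bits & 2) != 0;
      s.have_decl = (bits & 4) != 0;
      s.binds_local = (bits & 8) != 0;
      if (v4_ok_original (s) != v4_ok_inverted (s))
	return false;
    }
  return true;
}
```

If some other ABI value could reach this code, the two forms would differ
for it (the posted code returns false, the inverted one true), so that
assumption would need checking first.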

>    if (TARGET_PLTSEQ)
>      {
>        rtx base = const0_rtx;
> -      int regno;
> -      if (DEFAULT_ABI == ABI_ELFv2)
> +      int regno = 12;
> +      if (rs6000_pcrel_p (cfun))
>  	{
> -	  base = gen_rtx_REG (Pmode, TOC_REGISTER);
> -	  regno = 12;
> +	  rtx reg = gen_rtx_REG (Pmode, regno);
> +	  rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
> +				  UNSPEC_PLT_PCREL);
> +	  emit_insn (gen_rtx_SET (reg, u));
> +	  return reg;
>  	}

You don't need a regno variable here, so don't use it, only set it later
where it _is_ used?

> +(define_insn "*pltseq_plt_pcrel<mode>"
> +  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
> +	(unspec:P [(match_operand:P 1 "" "")
> +		   (match_operand:P 2 "symbol_ref_operand" "s")
> +		   (match_operand:P 3 "" "")]
> +		  UNSPEC_PLT_PCREL))]
> +  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
> +   && rs6000_pcrel_p (cfun)"
> +{
> +  return rs6000_pltseq_template (operands, 4);

Maybe those "4" magic constants should be an enum?
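
(Sketch, not part of the patch.)  An enum for the WHICH argument might look
like this.  All enumerator names and the pltseq_name helper are
hypothetical; the values follow the comment above rs6000_pltseq_template.

```c
/* Hypothetical names; the patch itself passes bare integers 0..4.  */
enum pltseq_which
{
  PLTSEQ_TOCSAVE = 0,		/* TOC save store  */
  PLTSEQ_PLT16_HA = 1,		/* addis/lis with R_PPC*_PLT16_HA  */
  PLTSEQ_PLT16_LO = 2,		/* load with R_PPC*_PLT16_LO  */
  PLTSEQ_MTCTR = 3,		/* mtctr  */
  PLTSEQ_PLT_PCREL34 = 4	/* prefixed load, PC-relative  */
};

/* Map each enumerator back to the label used in the source comment.  */
static const char *
pltseq_name (enum pltseq_which which)
{
  switch (which)
    {
    case PLTSEQ_TOCSAVE:
      return "tocsave";
    case PLTSEQ_PLT16_HA:
      return "plt16_ha";
    case PLTSEQ_PLT16_LO:
      return "plt16_lo";
    case PLTSEQ_MTCTR:
      return "mtctr";
    case PLTSEQ_PLT_PCREL34:
      return "plt_pcrel34";
    }
  return "unknown";
}
```

A call site would then read rs6000_pltseq_template (operands,
PLTSEQ_PLT_PCREL34) instead of passing 4, assuming the function's
parameter type were changed to match.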

> +int zz0 ()
> +{
> +  asm ("");
> +  return 16;
> +};

You might want to put in a comment what this asm is for.
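
(Sketch, not part of the patch.)  Presumably the asm is the usual idiom
from the GCC documentation for the noinline attribute: an empty asm acts as
a side effect so that calls to an otherwise trivial function are not
optimized away.  A commented version could read as follows; the function
name sixteen is hypothetical, and __asm__ is used instead of asm only so
the fragment also compiles in strict ISO modes.

```c
/* noinline prevents inlining, but a call to a function with no side
   effects may still be removed by other optimizations.  The empty asm
   serves as a special side effect that keeps the call in place, so the
   test can still scan for the branch to this function.  */
static int __attribute__ ((noinline))
sixteen (void)
{
  __asm__ ("");
  return 16;
}
```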


Please consider those things.  Okay for trunk with that.  Thanks!


Segher
Alan Modra May 29, 2019, 3:14 p.m. UTC | #5
On Wed, May 29, 2019 at 07:40:46AM -0500, Segher Boessenkool wrote:
> All necessary linker (and binutils and GAS) support is upstream already, right?

I believe so, except gold support is lacking right now.

> >        pld 12,0(0),1
> >        .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
> 
> Are we guaranteed the assembler always writes a pld like this as 8 bytes?

Strictly speaking the assembler might nop pad *before* the pld making
a total of 12 bytes, and that's the reason to put the .reloc *after*
the prefix instruction.
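
(Sketch, not part of the patch.)  The address arithmetic behind this can
be modelled in a few lines.  Everything here is a hypothetical toy model:
pad_nop stands for whatever decision the assembler makes about inserting a
4-byte nop, and the prefixed instruction is taken to be 8 bytes.

```c
/* Where the prefixed instruction ended up, and the location counter
   after it.  */
struct layout
{
  unsigned long insn_start;
  unsigned long end;
};

/* Emit an 8-byte prefixed instruction at DOT, optionally preceded by a
   4-byte nop that the assembler inserted on its own.  */
static struct layout
emit_prefixed (unsigned long dot, int pad_nop)
{
  struct layout l;

  if (pad_nop)
    dot += 4;
  l.insn_start = dot;
  l.end = dot + 8;
  return l;
}

/* ".reloc .,..." written before the instruction: DOT is sampled before
   any padding, so it may name the nop rather than the instruction.  */
static unsigned long
reloc_before (unsigned long dot)
{
  return dot;
}

/* ".reloc .-8,..." written after the instruction: counting back from
   the end always lands on the instruction itself.  */
static unsigned long
reloc_after (struct layout l)
{
  return l.end - 8;
}
```

With padding, reloc_before gives the address of the nop while reloc_after
still gives insn_start; without padding the two agree.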
Bill Schmidt May 29, 2019, 3:48 p.m. UTC | #6
On 5/29/19 7:40 AM, Segher Boessenkool wrote:
> Hi Bill,
>
> On Thu, May 23, 2019 at 09:11:44PM -0500, Bill Schmidt wrote:
>> (1) When a function uses PC-relative code generation, all direct calls (other than 
>> sibcalls) that the function makes to local or external callees should appear as
>> "bl sym@notoc" and should not be followed by a nop instruction.  @notoc indicates
>> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
>> that the caller does not guarantee that r2 contains a valid TOC pointer.  Thus
>> the linker should not try to replace a subsequent "nop" with a TOC restore
>> instruction.
> All necessary linker (and binutils and GAS) support is upstream already, right?
>
>> In creating the new sibcall patterns, I did not duplicate the "c" alternatives
>> that allow for bctr or blr sibcalls.  I don't think there's a way to generate
>> those currently.  The bctr would be legitimate for PC-relative sibcalls if you
>> can prove that the target function is in the same binary, but we don't appear
>> to detect that possibility today.
> But you could see that the target is in the same translation unit, for example?
> That should be a simple test to make, too.
>
>>        pld 12,0(0),1
>>        .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
> Are we guaranteed the assembler always writes a pld like this as 8 bytes?
>
>> 	* gcc.target/powerpc/notoc-direct-1.c: New.
>> 	* gcc.target/powerpc/pcrel-sibcall-1.c: New.
> A few more testcases would be useful.  Well, we'll gain a lot of 'em soon
> enough, I suppose.
>
>>    static char str[32];  /* 2 spare */
>> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>> +  if (rs6000_pcrel_p (cfun))
>> +    sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
>> +  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>>      sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
>>  	     sibcall ? "" : "\n\tnop");
> Two spare, and you add one char (@notoc vs. ..nop), so at a minimum you
> need to correct the comment?
>
>> +  if (DEFAULT_ABI == ABI_V4
>> +      && (!TARGET_SECURE_PLT
>> +	  || !flag_pic
>> +	  || (decl
>> +	      && (*targetm.binds_local_p) (decl))))
>> +    return true;
>> +
>> +  return false;
> Please invert this (put the "return false" condition in the if, like the
> preceding comment says).
>
>>    if (TARGET_PLTSEQ)
>>      {
>>        rtx base = const0_rtx;
>> -      int regno;
>> -      if (DEFAULT_ABI == ABI_ELFv2)
>> +      int regno = 12;
>> +      if (rs6000_pcrel_p (cfun))
>>  	{
>> -	  base = gen_rtx_REG (Pmode, TOC_REGISTER);
>> -	  regno = 12;
>> +	  rtx reg = gen_rtx_REG (Pmode, regno);
>> +	  rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
>> +				  UNSPEC_PLT_PCREL);
>> +	  emit_insn (gen_rtx_SET (reg, u));
>> +	  return reg;
>>  	}
> You don't need a regno variable here, so don't use it, only set it later
> where it _is_ used?
>
>> +(define_insn "*pltseq_plt_pcrel<mode>"
>> +  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
>> +	(unspec:P [(match_operand:P 1 "" "")
>> +		   (match_operand:P 2 "symbol_ref_operand" "s")
>> +		   (match_operand:P 3 "" "")]
>> +		  UNSPEC_PLT_PCREL))]
>> +  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
>> +   && rs6000_pcrel_p (cfun)"
>> +{
>> +  return rs6000_pltseq_template (operands, 4);
> Maybe those "4" magic constants should be an enum?
>
>> +int zz0 ()
>> +{
>> +  asm ("");
>> +  return 16;
>> +};
> You might want to put in a comment what this asm is for.
>
>
> Please consider those things.  Okay for trunk with that.  Thanks!

Thanks!  Will make appropriate changes and commit.  Much obliged for the
review!

Bill
>
>
> Segher
>
Segher Boessenkool May 29, 2019, 4:22 p.m. UTC | #7
On Thu, May 30, 2019 at 12:44:35AM +0930, Alan Modra wrote:
> On Wed, May 29, 2019 at 07:40:46AM -0500, Segher Boessenkool wrote:
> > All necessary linker (and binutils and GAS) support is upstream already, right?
> 
> I believe so, except gold support is lacking right now.

Excellent :-)

> > >        pld 12,0(0),1
> > >        .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
> > 
> > Are we guaranteed the assembler always writes a pld like this as 8 bytes?
> 
> Strictly speaking the assembler might nop pad *before* the pld making
> a total of 12 bytes, and that's the reason to put the .reloc *after*
> the prefix instruction.

Ah, okay.  That probably warrants a comment...

Thanks,


Segher
Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3d5cf9e4ece..9229bad6acc 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21268,7 +21268,9 @@  rs6000_call_template_1 (rtx *operands, unsigned int funop, bool sibcall)
 	    ? "+32768" : ""));
 
   static char str[32];  /* 2 spare */
-  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
+  if (rs6000_pcrel_p (cfun))
+    sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
+  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
     sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
 	     sibcall ? "" : "\n\tnop");
   else if (DEFAULT_ABI == ABI_V4)
@@ -21333,6 +21335,16 @@  rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
   /* Currently, funop is either 0 or 1.  The maximum string is always
      a !speculate 64-bit __tls_get_addr call.
 
+     ABI_ELFv2, pcrel:
+     . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
+     . 35	.reloc .,R_PPC64_PLTSEQ_NOTOC,%z1\n\t
+     .  9	crset 2\n\t
+     . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
+     . 36	.reloc .,R_PPC64_PLTCALL_NOTOC,%z1\n\t
+     .  8	beq%T1l-
+     .---
+     .142
+
      ABI_AIX:
      .  9	ld 2,%3\n\t
      . 27	.reloc .,R_PPC64_TLSGD,%2\n\t
@@ -21398,23 +21410,31 @@  rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
 	    gcc_unreachable ();
 	}
 
+      const char *notoc = rs6000_pcrel_p (cfun) ? "_NOTOC" : "";
       const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
 			    && flag_pic == 2 ? "+32768" : "");
       if (!speculate)
 	{
 	  s += sprintf (s,
-			"%s.reloc .,R_PPC%s_PLTSEQ,%%z%u%s\n\t",
-			tls, rel64, funop, addend);
+			"%s.reloc .,R_PPC%s_PLTSEQ%s,%%z%u%s\n\t",
+			tls, rel64, notoc, funop, addend);
 	  s += sprintf (s, "crset 2\n\t");
 	}
       s += sprintf (s,
-		    "%s.reloc .,R_PPC%s_PLTCALL,%%z%u%s\n\t",
-		    tls, rel64, funop, addend);
+		    "%s.reloc .,R_PPC%s_PLTCALL%s,%%z%u%s\n\t",
+		    tls, rel64, notoc, funop, addend);
     }
   else if (!speculate)
     s += sprintf (s, "crset 2\n\t");
 
-  if (DEFAULT_ABI == ABI_AIX)
+  if (rs6000_pcrel_p (cfun))
+    {
+      if (speculate)
+	sprintf (s, "b%%T%ul", funop);
+      else
+	sprintf (s, "beq%%T%ul-", funop);
+    }
+  else if (DEFAULT_ABI == ABI_AIX)
     {
       if (speculate)
 	sprintf (s,
@@ -21468,63 +21488,73 @@  rs6000_indirect_sibcall_template (rtx *operands, unsigned int funop)
 
 #if HAVE_AS_PLTSEQ
 /* Output indirect call insns.
-   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr.  */
+   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr,
+   4 for plt_pcrel34.  */
 const char *
 rs6000_pltseq_template (rtx *operands, int which)
 {
   const char *rel64 = TARGET_64BIT ? "64" : "";
-  char tls[28];
+  char tls[30];
   tls[0] = 0;
   if (TARGET_TLS_MARKERS && GET_CODE (operands[3]) == UNSPEC)
     {
+      char off = which == 4 ? '8' : '4';
       if (XINT (operands[3], 1) == UNSPEC_TLSGD)
-	sprintf (tls, ".reloc .,R_PPC%s_TLSGD,%%3\n\t",
-		 rel64);
+	sprintf (tls, ".reloc .-%c,R_PPC%s_TLSGD,%%3\n\t",
+		 off, rel64);
       else if (XINT (operands[3], 1) == UNSPEC_TLSLD)
-	sprintf (tls, ".reloc .,R_PPC%s_TLSLD,%%&\n\t",
-		 rel64);
+	sprintf (tls, ".reloc .-%c,R_PPC%s_TLSLD,%%&\n\t",
+		 off, rel64);
       else
 	gcc_unreachable ();
     }
 
   gcc_assert (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_V4);
-  static char str[96];  /* 15 spare */
-  const char *off = WORDS_BIG_ENDIAN ? "+2" : "";
+  static char str[96];  /* 10 spare */
+  char off = WORDS_BIG_ENDIAN ? '2' : '4';
   const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
 			&& flag_pic == 2 ? "+32768" : "");
   switch (which)
     {
     case 0:
       sprintf (str,
-	       "%s.reloc .,R_PPC%s_PLTSEQ,%%z2\n\t"
-	       "st%s",
-	       tls, rel64, TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)");
+	       "st%s\n\t"
+	       "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2",
+	       TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)",
+	       tls, rel64);
       break;
     case 1:
       if (DEFAULT_ABI == ABI_V4 && !flag_pic)
 	sprintf (str,
-		 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2\n\t"
-		 "lis %%0,0",
+		 "lis %%0,0\n\t"
+		 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2",
 		 tls, off, rel64);
       else
 	sprintf (str,
-		 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2%s\n\t"
-		 "addis %%0,%%1,0",
+		 "addis %%0,%%1,0\n\t"
+		 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2%s",
 		 tls, off, rel64, addend);
       break;
     case 2:
       sprintf (str,
-	       "%s.reloc .%s,R_PPC%s_PLT16_LO%s,%%z2%s\n\t"
-	       "l%s %%0,0(%%1)",
-	       tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend,
-	       TARGET_64BIT ? "d" : "wz");
+	       "l%s %%0,0(%%1)\n\t"
+	       "%s.reloc .-%c,R_PPC%s_PLT16_LO%s,%%z2%s",
+	       TARGET_64BIT ? "d" : "wz",
+	       tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend);
       break;
     case 3:
       sprintf (str,
-	       "%s.reloc .,R_PPC%s_PLTSEQ,%%z2%s\n\t"
-	       "mtctr %%1",
+	       "mtctr %%1\n\t"
+	       "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2%s",
 	       tls, rel64, addend);
       break;
+    case 4:
+      sprintf (str,
+	       "pl%s %%0,0(0),1\n\t"
+	       "%s.reloc .-8,R_PPC%s_PLT_PCREL34_NOTOC,%%z2",
+	       TARGET_64BIT ? "d" : "wz",
+	       tls, rel64);
+      break;
     default:
       gcc_unreachable ();
     }
@@ -24703,6 +24733,53 @@  rs6000_return_addr (int count, rtx frame)
   return get_hard_reg_initial_val (Pmode, LR_REGNO);
 }
 
+/* Helper function for rs6000_function_ok_for_sibcall.  */
+
+static bool
+rs6000_decl_ok_for_sibcall (tree decl)
+{
+  /* Sibcalls are always fine for the Darwin ABI.  */
+  if (DEFAULT_ABI == ABI_DARWIN)
+    return true;
+
+  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
+    {
+      /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
+	 functions, because the callee may have a different TOC pointer to
+	 the caller and there's no way to ensure we restore the TOC when
+	 we return.  */
+      if (!decl || DECL_EXTERNAL (decl) || DECL_WEAK (decl)
+	  || !(*targetm.binds_local_p) (decl))
+	return false;
+
+      /* Similarly, if the caller preserves the TOC pointer and the callee
+	 doesn't (or vice versa), proper TOC setup or restoration will be
+	 missed.  For example, suppose A, B, and C are in the same binary
+	 and A -> B -> C.  A and B preserve the TOC pointer but C does not,
+	 and B -> C is eligible as a sibcall.  A will call B through its
+	 local entry point, so A will not restore its TOC itself.  B calls
+	 C with a sibcall, so it will not restore the TOC.  C does not
+	 preserve the TOC, so it may clobber r2 with impunity.  Returning
+	 from C will result in a corrupted TOC for A.  */
+      else if (rs6000_fndecl_pcrel_p (decl) != rs6000_pcrel_p (cfun))
+	return false;
+
+      else
+	return true;
+    }
+
+  /* With the secure-plt SYSV ABI we can't make non-local calls when
+     -fpic/PIC because the plt call stubs use r30.  */
+  if (DEFAULT_ABI == ABI_V4
+      && (!TARGET_SECURE_PLT
+	  || !flag_pic
+	  || (decl
+	      && (*targetm.binds_local_p) (decl))))
+    return true;
+
+  return false;
+}
+
 /* Say whether a function is a candidate for sibcall handling or not.  */
 
 static bool
@@ -24748,22 +24825,7 @@  rs6000_function_ok_for_sibcall (tree decl, tree exp)
 	return false;
     }
 
-  /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
-     functions, because the callee may have a different TOC pointer to
-     the caller and there's no way to ensure we restore the TOC when
-     we return.  With the secure-plt SYSV ABI we can't make non-local
-     calls when -fpic/PIC because the plt call stubs use r30.  */
-  if (DEFAULT_ABI == ABI_DARWIN
-      || ((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
-	  && decl
-	  && !DECL_EXTERNAL (decl)
-	  && !DECL_WEAK (decl)
-	  && (*targetm.binds_local_p) (decl))
-      || (DEFAULT_ABI == ABI_V4
-	  && (!TARGET_SECURE_PLT
-	      || !flag_pic
-	      || (decl
-		  && (*targetm.binds_local_p) (decl)))))
+  if (rs6000_decl_ok_for_sibcall (decl))
     {
       tree attr_list = TYPE_ATTRIBUTES (fntype);
 
@@ -32592,12 +32654,18 @@  rs6000_longcall_ref (rtx call_ref, rtx arg)
   if (TARGET_PLTSEQ)
     {
       rtx base = const0_rtx;
-      int regno;
-      if (DEFAULT_ABI == ABI_ELFv2)
+      int regno = 12;
+      if (rs6000_pcrel_p (cfun))
 	{
-	  base = gen_rtx_REG (Pmode, TOC_REGISTER);
-	  regno = 12;
+	  rtx reg = gen_rtx_REG (Pmode, regno);
+	  rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
+				  UNSPEC_PLT_PCREL);
+	  emit_insn (gen_rtx_SET (reg, u));
+	  return reg;
 	}
+
+      if (DEFAULT_ABI == ABI_ELFv2)
+	base = gen_rtx_REG (Pmode, TOC_REGISTER);
       else
 	{
 	  if (flag_pic)
@@ -37706,37 +37774,38 @@  rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
   if (!SYMBOL_REF_P (func)
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (func)))
     {
-      /* Save the TOC into its reserved slot before the call,
-	 and prepare to restore it after the call.  */
-      rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
-      rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
-					     gen_rtvec (1, stack_toc_offset),
-					     UNSPEC_TOCSLOT);
-      toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
-
-      /* Can we optimize saving the TOC in the prologue or
-	 do we need to do it at every call?  */
-      if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
-	cfun->machine->save_toc_in_prologue = true;
-      else
+      if (!rs6000_pcrel_p (cfun))
 	{
-	  rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
-	  rtx stack_toc_mem = gen_frame_mem (Pmode,
-					     gen_rtx_PLUS (Pmode, stack_ptr,
-							   stack_toc_offset));
-	  MEM_VOLATILE_P (stack_toc_mem) = 1;
-	  if (is_pltseq_longcall)
+	  /* Save the TOC into its reserved slot before the call,
+	     and prepare to restore it after the call.  */
+	  rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
+	  rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
+						 gen_rtvec (1, stack_toc_offset),
+						 UNSPEC_TOCSLOT);
+	  toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
+
+	  /* Can we optimize saving the TOC in the prologue or
+	     do we need to do it at every call?  */
+	  if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
+	    cfun->machine->save_toc_in_prologue = true;
+	  else
 	    {
-	      /* Use USPEC_PLTSEQ here to emit every instruction in an
-		 inline PLT call sequence with a reloc, enabling the
-		 linker to edit the sequence back to a direct call
-		 when that makes sense.  */
-	      rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
-	      rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
-	      emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
+	      rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
+	      rtx stack_toc_mem = gen_frame_mem (Pmode,
+						 gen_rtx_PLUS (Pmode, stack_ptr,
+							       stack_toc_offset));
+	      MEM_VOLATILE_P (stack_toc_mem) = 1;
+	      if (HAVE_AS_PLTSEQ
+		  && DEFAULT_ABI == ABI_ELFv2
+		  && GET_CODE (func_desc) == SYMBOL_REF)
+		{
+		  rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
+		  rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
+		  emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
+		}
+	      else
+		emit_move_insn (stack_toc_mem, toc_reg);
 	    }
-	  else
-	    emit_move_insn (stack_toc_mem, toc_reg);
 	}
 
       if (DEFAULT_ABI == ABI_ELFv2)
@@ -37813,10 +37882,12 @@  rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
     }
   else
     {
-      /* Direct calls use the TOC: for local calls, the callee will
-	 assume the TOC register is set; for non-local calls, the
-	 PLT stub needs the TOC register.  */
-      abi_reg = toc_reg;
+      /* No TOC register needed for calls from PC-relative callers.  */
+      if (!rs6000_pcrel_p (cfun))
+	/* Direct calls use the TOC: for local calls, the callee will
+	   assume the TOC register is set; for non-local calls, the
+	   PLT stub needs the TOC register.  */
+	abi_reg = toc_reg;
       func_addr = func;
     }
 
@@ -37866,7 +37937,9 @@  rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
   insn = emit_call_insn (insn);
 
   /* Note use of the TOC register.  */
-  use_reg (&CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (Pmode, TOC_REGNUM));
+  if (!rs6000_pcrel_p (cfun))
+    use_reg (&CALL_INSN_FUNCTION_USAGE (insn),
+	     gen_rtx_REG (Pmode, TOC_REGNUM));
 }
 
 /* Expand code to perform a call under the SYSV4 ABI.  */
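
Note on the rs6000_pltseq_template changes above: each template now emits
the instruction first and the .reloc second, so the relocation has to point
back at the instruction with ".-N".  N is 4 for an ordinary instruction and 8
for the prefixed pld/plwz.  For the PLT16 relocs N is 2 on big-endian and 4 on
little-endian, matching the old "+2" versus "" forward offsets.  The following
is only a minimal stand-alone sketch of that layout; model_pltseq and its
parameters are hypothetical names and are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model, not GCC code: format one PLT-sequence pair in the
   new order (instruction first, .reloc second).  INSN_BYTES is the
   distance back from "." to the relocated field: 4 for an ordinary
   instruction, 8 for a prefixed one such as pld.  */
static const char *
model_pltseq (const char *insn, int insn_bytes,
	      const char *reloc, const char *sym)
{
  static char str[96];
  int n = snprintf (str, sizeof str, "%s\n\t.reloc .-%d,%s,%s",
		    insn, insn_bytes, reloc, sym);
  /* Like the real template, this relies on a fixed-size buffer, so
     check that the text fit and that the length is consistent.  */
  assert (n > 0 && (size_t) n < sizeof str);
  assert (strlen (str) == (size_t) n);
  return str;
}
```

Under these assumptions, model_pltseq ("mtctr 12", 4, "R_PPC64_PLTSEQ", "foo")
yields the mtctr line followed by ".reloc .-4,R_PPC64_PLTSEQ,foo", which is
the shape case 3 produces when there is no TLS marker and no addend.
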
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 71613e21384..e1d9045c5bb 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -147,6 +147,7 @@ 
    UNSPEC_PLTSEQ
    UNSPEC_PLT16_HA
    UNSPEC_PLT16_LO
+   UNSPEC_PLT_PCREL
   ])
 
 ;;
@@ -10267,6 +10268,20 @@ 
 {
   return rs6000_pltseq_template (operands, 3);
 })
+
+(define_insn "*pltseq_plt_pcrel<mode>"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(unspec:P [(match_operand:P 1 "" "")
+		   (match_operand:P 2 "symbol_ref_operand" "s")
+		   (match_operand:P 3 "" "")]
+		  UNSPEC_PLT_PCREL))]
+  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
+   && rs6000_pcrel_p (cfun)"
+{
+  return rs6000_pltseq_template (operands, 4);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "12")])
 
 ;; Call and call_value insns
 ;; For the purposes of expanding calls, Darwin is very similar to SYSV.
@@ -10582,7 +10597,11 @@ 
 	 (match_operand 1))
    (clobber (reg:P LR_REGNO))]
   "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2"
-  "bl %z0"
+{
+  if (rs6000_pcrel_p (cfun))
+    return "bl %z0@notoc";
+  return "bl %z0";
+}
   [(set_attr "type" "branch")])
 
 (define_insn "*call_value_local_aix<mode>"
@@ -10592,7 +10611,11 @@ 
    (clobber (reg:P LR_REGNO))]
   "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
    && !IS_NOMARK_TLSGETADDR (operands[2])"
-  "bl %z1"
+{
+  if (rs6000_pcrel_p (cfun))
+    return "bl %z1@notoc";
+  return "bl %z1";
+}
   [(set_attr "type" "branch")])
 
 ;; Call to AIX abi function which may be in another module.
@@ -10607,7 +10630,10 @@ 
   return rs6000_call_template (operands, 0);
 }
   [(set_attr "type" "branch")
-   (set_attr "length" "8")])
+   (set (attr "length")
+	(if_then_else (match_test "rs6000_pcrel_p (cfun)")
+	  (const_int 4)
+	  (const_int 8)))])
 
 (define_insn "*call_value_nonlocal_aix<mode>"
   [(set (match_operand 0 "" "")
@@ -10623,11 +10649,14 @@ 
 }
   [(set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
-	  (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
-	    (const_int 16)
-	    (const_int 12))
-	  (const_int 8)))])
+	(plus (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
+		(if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
+		  (const_int 8)
+		  (const_int 4))
+		(const_int 0))
+	      (if_then_else (match_test "rs6000_pcrel_p (cfun)")
+		(const_int 4)
+		(const_int 8))))])
 
 ;; Call to indirect functions with the AIX abi using a 3 word descriptor.
 ;; Operand0 is the addresss of the function to call
@@ -10700,6 +10729,21 @@ 
 		      (const_string "12")
 		      (const_string "8")))])
 
+(define_insn "*call_indirect_pcrel<mode>"
+  [(call (mem:SI (match_operand:P 0 "indirect_call_operand" "c,*l,X"))
+	 (match_operand 1))
+   (clobber (reg:P LR_REGNO))]
+  "rs6000_pcrel_p (cfun)"
+{
+  return rs6000_indirect_call_template (operands, 0);
+}
+  [(set_attr "type" "jmpreg")
+   (set (attr "length")
+	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
+			   (match_test "which_alternative != 1"))
+		      (const_string "8")
+		      (const_string "4")))])
+
 (define_insn "*call_value_indirect_elfv2<mode>"
   [(set (match_operand 0 "" "")
 	(call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
@@ -10728,6 +10772,31 @@ 
 	    (const_string "12")
 	    (const_string "8"))))])
 
+(define_insn "*call_value_indirect_pcrel<mode>"
+  [(set (match_operand 0 "" "")
+	(call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
+	      (match_operand:P 2 "unspec_tls" "")))
+   (clobber (reg:P LR_REGNO))]
+  "rs6000_pcrel_p (cfun)"
+{
+  if (IS_NOMARK_TLSGETADDR (operands[2]))
+    rs6000_output_tlsargs (operands);
+
+  return rs6000_indirect_call_template (operands, 1);
+}
+  [(set_attr "type" "jmpreg")
+   (set (attr "length")
+	(plus
+	  (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
+	    (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
+	      (const_int 8)
+	      (const_int 4))
+	    (const_int 0))
+	  (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
+			     (match_test "which_alternative != 1"))
+	    (const_string "8")
+	    (const_string "4"))))])
+
 ;; Call subroutine returning any type.
 (define_expand "untyped_call"
   [(parallel [(call (match_operand 0 "")
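
Note on the length attributes above: for *call_value_nonlocal_aix the length
is now the sum of two independent parts, the TLS argument setup (0, 4 or 8
bytes) and the call itself (4 bytes for "bl sym@notoc", 8 bytes when a nop
follows the bl).  A minimal sketch of that arithmetic, assuming a hypothetical
helper name and boolean parameters that stand in for the match_test
conditions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the "length" attribute of
   *call_value_nonlocal_aix after this patch; not GCC code.  */
static int
model_call_value_nonlocal_length (bool nomark_tlsgetaddr,
				  bool cmodel_small, bool pcrel)
{
  /* Extra bytes for the TLS argument setup emitted with the call.  */
  int tls_bytes = 0;
  if (nomark_tlsgetaddr)
    tls_bytes = cmodel_small ? 4 : 8;

  /* A PC-relative caller emits only the bl; otherwise bl plus nop.  */
  int call_bytes = pcrel ? 4 : 8;

  int len = tls_bytes + call_bytes;
  /* The attribute can only take values in this range.  */
  assert (len >= 4 && len <= 16);
  return len;
}
```

For non-PC-relative callers this reproduces the previous values of 16, 12
and 8, so only the PC-relative case changes.
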
diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
new file mode 100644
index 00000000000..c7d322c1c96
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
@@ -0,0 +1,41 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+/* { dg-require-effective-target powerpc_elfv2 } */
+
+/* Test that calls generated from PC-relative code are
+   annotated with @notoc.  */
+
+extern int yy0 (int);
+extern void yy1 (int);
+
+int zz0 (void) __attribute__((noinline));
+void zz1 (int) __attribute__((noinline));
+
+int xx (void)
+{
+  yy1 (7);
+  return yy0 (5);
+}
+
+int zz0 ()
+{
+  asm ("");
+  return 16;
+}
+
+void zz1 (int a __attribute__((__unused__)))
+{
+  asm ("");
+}
+
+int ww (void)
+{
+  zz1 (zz0 ());
+  return 4;
+}
+
+/* { dg-final { scan-assembler {yy1@notoc} } } */
+/* { dg-final { scan-assembler {yy0@notoc} } } */
+/* { dg-final { scan-assembler {zz1@notoc} } } */
+/* { dg-final { scan-assembler {zz0@notoc} } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
new file mode 100644
index 00000000000..7c767e2ba32
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
@@ -0,0 +1,46 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+/* { dg-require-effective-target powerpc_elfv2 } */
+
+/* Test that potential sibcalls are not generated when the caller preserves
+   the TOC and the callee doesn't, or vice versa.  */
+
+int x (void) __attribute__((noinline));
+int y (void) __attribute__((noinline));
+int xx (void) __attribute__((noinline));
+
+int x (void)
+{
+  return 1;
+}
+
+int y (void)
+{
+  return 2;
+}
+
+int sib_call (void)
+{
+  return x ();
+}
+
+#pragma GCC target ("cpu=power9")
+int normal_call (void)
+{
+  return y ();
+}
+
+int xx (void)
+{
+  return 1;
+}
+
+#pragma GCC target ("cpu=future")
+int notoc_call (void)
+{
+  return xx ();
+}
+
+/* { dg-final { scan-assembler {\mb x@notoc\M} } } */
+/* { dg-final { scan-assembler {\mbl y\M} } } */
+/* { dg-final { scan-assembler {\mbl xx@notoc\M} } } */
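
Note on pcrel-sibcall-1.c: the three scans correspond to the AIX/ELFv2 branch
of rs6000_decl_ok_for_sibcall.  A sibcall is allowed only when the callee is
known, local, and agrees with the caller on PC-relative code generation.  The
following is a minimal stand-alone sketch of that rule; struct model_callee
and model_elfv2_sibcall_ok are hypothetical stand-ins for the tree decl and
the real predicate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of a callee, standing in for the tree DECL.  */
struct model_callee
{
  bool known;		/* false models a null decl  */
  bool external;	/* models DECL_EXTERNAL  */
  bool weak;		/* models DECL_WEAK  */
  bool binds_local;	/* models targetm.binds_local_p  */
  bool pcrel;		/* models rs6000_fndecl_pcrel_p  */
};

/* Model of the AIX/ELFv2 branch only; not GCC code.  */
static bool
model_elfv2_sibcall_ok (const struct model_callee *callee, bool caller_pcrel)
{
  assert (callee);

  /* Non-local callees may use a different TOC pointer.  */
  if (!callee->known || callee->external || callee->weak
      || !callee->binds_local)
    return false;

  /* Caller and callee must agree on whether r2 is preserved.  */
  return callee->pcrel == caller_pcrel;
}
```

In the test, sib_call and x are both compiled for cpu=future, so the call
becomes "b x@notoc".  normal_call (power9) calling y (future) and notoc_call
(future) calling xx (power9) disagree, so both stay ordinary bl calls.
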