Message ID | bec8578a-8cb6-8c3e-1938-df90d192665d@linux.ibm.com |
---|---|
State | New |
Series | rs6000: Call flow implementation for PC-relative addressing |
Hm, I got ahead of myself on this one. I haven't done the regstrap yet,
so please hold off reviewing for now.

Sorry for the noise. I shouldn't post when I'm tired...

Thanks,
Bill

On 5/23/19 9:11 PM, Bill Schmidt wrote:
> Hi,
>
> This patch contains the changes to implement call flow for PC-relative addressing.
> It's an amalgam of several internal patches that Alan and I worked on, and as a
> result it's hard to tease apart individual pieces much further. So I apologize
> that this patch is a little larger than the others. Also, I've CC'd Alan so he
> can help answer questions about the patch, particularly the PLT bits I'm not very
> familiar with.
>
> Following are descriptions of the individual patches that are combined here.
>
> (1) When a function uses PC-relative code generation, all direct calls (other than
> sibcalls) that the function makes to local or external callees should appear as
> "bl sym@notoc" and should not be followed by a nop instruction. @notoc indicates
> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
> that the caller does not guarantee that r2 contains a valid TOC pointer. Thus
> the linker should not try to replace a subsequent "nop" with a TOC restore
> instruction.
>
> I've added a test case for the four cases handled here: local calls with/without
> a return value, and external calls with/without a return value.
>
> (2) If a caller preserves the TOC pointer and the callee does not, or vice versa,
> then a sibcall will cause an inconsistency. Don't allow that.
>
> (3) The linker needs a @notoc directive on sibcall targets when the caller does not
> provide or preserve a TOC pointer. This patch provides for that.
>
> In creating the new sibcall patterns, I did not duplicate the "c" alternatives
> that allow for bctr or blr sibcalls. I don't think there's a way to generate
> those currently. The bctr would be legitimate for PC-relative sibcalls if you
> can prove that the target function is in the same binary, but we don't appear
> to detect that possibility today.
>
> (4) This patch deletes all the extra insns added to handle pcrel calls,
> instead opting to use existing insns but making their output
> conditional on rs6000_pcrel_p(cfun). There isn't a need to
> differentiate between pcrel and non-pcrel calls at the point rtl is
> created; rs6000_pcrel_p is valid right up to the final pass, as
> evidenced by use of rs6000_pcrel_p to emit .localentry.
>
> There is one case, however, where we do need new insns: The existing
> nonlocal indirect call insns mention r2 in their rtl. That isn't
> correct for pcrel indirect calls, and may cause problems when/if r2
> is allocated as any other volatile gpr in pcrel code.
>
> The patch also fixes pcrel inline PLT calls (which are used for
> -fno-plt and -mlongcall) to use a pcrel load from the PLT rather than
> attempting (and failing) to use TOC-relative loads. This requires
> some changes in the way relocs are emitted. For prefix insns we can't
> write
>     .reloc .,R_PPC64_PLT_PCREL34_NOTOC,foo
>     pld 12,0(0),1
> since the pld may require a padding nop. Instead it's necessary to
> put the .reloc after the instruction or use a label on the insn. Like
> this (which is what the patch does):
>     pld 12,0(0),1
>     .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
> or this:
>     .reloc 0f,R_PPC64_PLT_PCREL34_NOTOC,foo
>  0: pld 12,0(0),1
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
> Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> [gcc]
>
> 2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>
>             Alan Modra  <amodra@gmail.com>
>
>         * config/rs6000/rs6000.c (rs6000_call_template_1): Handle pcrel
>         calls here...
>         (rs6000_indirect_call_template_1): ...and here.
>         (rs6000_pltseq_template): Handle plt_pcrel34. Rework
>         tocsave, plt16_ha, plt16_lo, mtctr indirect calls.
>         (rs6000_decl_ok_for_sibcall): New function.
>         (rs6000_function_ok_for_sibcall): Refactor.
>         (rs6000_longcall_ref): Use UNSPEC_PLT_PCREL when pcrel.
>         (rs6000_call_aix): Don't emit toc restore rtl for indirect calls
>         when pcrel. Reorganize.
>         (rs6000_sibcall_aix): Don't add r2 to function usage when pcrel.
>         * rs6000.md (UNSPEC_PLT_PCREL): New unspec.
>         (*pltseq_plt_pcrel): New insn.
>         (*call_local_aix): Handle @notoc calls.
>         (*call_value_local_aix): Likewise.
>         (*call_nonlocal_aix): Adjust lengths for pcrel calls.
>         (*call_value_nonlocal_aix): Likewise.
>         (*call_indirect_pcrel): New insn.
>         (*call_value_indirect_pcrel): Likewise.
>
>
> [gcc/testsuite]
>
> 2019-05-23  Bill Schmidt  <wschmidt@linux.ibm.com>
>
>         * gcc.target/powerpc/notoc-direct-1.c: New.
>         * gcc.target/powerpc/pcrel-sibcall-1.c: New.
>
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 3d5cf9e4ece..9229bad6acc 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -21268,7 +21268,9 @@ rs6000_call_template_1 (rtx *operands, unsigned int funop, bool sibcall)
>                      ? "+32768" : ""));
>
>    static char str[32]; /* 2 spare */
> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> +  if (rs6000_pcrel_p (cfun))
> +    sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
> +  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>      sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
>               sibcall ? "" : "\n\tnop");
>    else if (DEFAULT_ABI == ABI_V4)
> @@ -21333,6 +21335,16 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
>    /* Currently, funop is either 0 or 1. The maximum string is always
>       a !speculate 64-bit __tls_get_addr call.
>
> +     ABI_ELFv2, pcrel:
> +     . 27 .reloc .,R_PPC64_TLSGD,%2\n\t
> +     . 35 .reloc .,R_PPC64_PLTSEQ_NOTOC,%z1\n\t
> +     .  9 crset 2\n\t
> +     . 27 .reloc .,R_PPC64_TLSGD,%2\n\t
> +     . 36 .reloc .,R_PPC64_PLTCALL_NOTOC,%z1\n\t
> +     .  8 beq%T1l-
> +     .---
> +     .142
> +
>       ABI_AIX:
>       .  9 ld 2,%3\n\t
>       . 27 .reloc .,R_PPC64_TLSGD,%2\n\t
> @@ -21398,23 +21410,31 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
>              gcc_unreachable ();
>          }
>
> +      const char *notoc = rs6000_pcrel_p (cfun) ? "_NOTOC" : "";
>        const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
>                              && flag_pic == 2 ? "+32768" : "");
>        if (!speculate)
>          {
>            s += sprintf (s,
> -                        "%s.reloc .,R_PPC%s_PLTSEQ,%%z%u%s\n\t",
> -                        tls, rel64, funop, addend);
> +                        "%s.reloc .,R_PPC%s_PLTSEQ%s,%%z%u%s\n\t",
> +                        tls, rel64, notoc, funop, addend);
>            s += sprintf (s, "crset 2\n\t");
>          }
>        s += sprintf (s,
> -                    "%s.reloc .,R_PPC%s_PLTCALL,%%z%u%s\n\t",
> -                    tls, rel64, funop, addend);
> +                    "%s.reloc .,R_PPC%s_PLTCALL%s,%%z%u%s\n\t",
> +                    tls, rel64, notoc, funop, addend);
>      }
>    else if (!speculate)
>      s += sprintf (s, "crset 2\n\t");
>
> -  if (DEFAULT_ABI == ABI_AIX)
> +  if (rs6000_pcrel_p (cfun))
> +    {
> +      if (speculate)
> +        sprintf (s, "b%%T%ul", funop);
> +      else
> +        sprintf (s, "beq%%T%ul-", funop);
> +    }
> +  else if (DEFAULT_ABI == ABI_AIX)
>      {
>        if (speculate)
>          sprintf (s,
> @@ -21468,63 +21488,73 @@ rs6000_indirect_sibcall_template (rtx *operands, unsigned int funop)
>
>  #if HAVE_AS_PLTSEQ
>  /* Output indirect call insns.
> -   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr. */
> +   WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr,
> +   4 for plt_pcrel34. */
>  const char *
>  rs6000_pltseq_template (rtx *operands, int which)
>  {
>    const char *rel64 = TARGET_64BIT ? "64" : "";
> -  char tls[28];
> +  char tls[30];
>    tls[0] = 0;
>    if (TARGET_TLS_MARKERS && GET_CODE (operands[3]) == UNSPEC)
>      {
> +      char off = which == 4 ? '8' : '4';
>        if (XINT (operands[3], 1) == UNSPEC_TLSGD)
> -        sprintf (tls, ".reloc .,R_PPC%s_TLSGD,%%3\n\t",
> -                 rel64);
> +        sprintf (tls, ".reloc .-%c,R_PPC%s_TLSGD,%%3\n\t",
> +                 off, rel64);
>        else if (XINT (operands[3], 1) == UNSPEC_TLSLD)
> -        sprintf (tls, ".reloc .,R_PPC%s_TLSLD,%%&\n\t",
> -                 rel64);
> +        sprintf (tls, ".reloc .-%c,R_PPC%s_TLSLD,%%&\n\t",
> +                 off, rel64);
>        else
>          gcc_unreachable ();
>      }
>
>    gcc_assert (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_V4);
> -  static char str[96]; /* 15 spare */
> -  const char *off = WORDS_BIG_ENDIAN ? "+2" : "";
> +  static char str[96]; /* 10 spare */
> +  char off = WORDS_BIG_ENDIAN ? '2' : '4';
>    const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
>                          && flag_pic == 2 ? "+32768" : "");
>    switch (which)
>      {
>      case 0:
>        sprintf (str,
> -               "%s.reloc .,R_PPC%s_PLTSEQ,%%z2\n\t"
> -               "st%s",
> -               tls, rel64, TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)");
> +               "st%s\n\t"
> +               "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2",
> +               TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)",
> +               tls, rel64);
>        break;
>      case 1:
>        if (DEFAULT_ABI == ABI_V4 && !flag_pic)
>          sprintf (str,
> -                 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2\n\t"
> -                 "lis %%0,0",
> +                 "lis %%0,0\n\t"
> +                 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2",
>                   tls, off, rel64);
>        else
>          sprintf (str,
> -                 "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2%s\n\t"
> -                 "addis %%0,%%1,0",
> +                 "addis %%0,%%1,0\n\t"
> +                 "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2%s",
>                   tls, off, rel64, addend);
>        break;
>      case 2:
>        sprintf (str,
> -               "%s.reloc .%s,R_PPC%s_PLT16_LO%s,%%z2%s\n\t"
> -               "l%s %%0,0(%%1)",
> -               tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend,
> -               TARGET_64BIT ? "d" : "wz");
> +               "l%s %%0,0(%%1)\n\t"
> +               "%s.reloc .-%c,R_PPC%s_PLT16_LO%s,%%z2%s",
> +               TARGET_64BIT ? "d" : "wz",
> +               tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend);
>        break;
>      case 3:
>        sprintf (str,
> -               "%s.reloc .,R_PPC%s_PLTSEQ,%%z2%s\n\t"
> -               "mtctr %%1",
> +               "mtctr %%1\n\t"
> +               "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2%s",
>                 tls, rel64, addend);
>        break;
> +    case 4:
> +      sprintf (str,
> +               "pl%s %%0,0(0),1\n\t"
> +               "%s.reloc .-8,R_PPC%s_PLT_PCREL34_NOTOC,%%z2",
> +               TARGET_64BIT ? "d" : "wz",
> +               tls, rel64);
> +      break;
>      default:
>        gcc_unreachable ();
>      }
> @@ -24703,6 +24733,53 @@ rs6000_return_addr (int count, rtx frame)
>    return get_hard_reg_initial_val (Pmode, LR_REGNO);
>  }
>
> +/* Helper function for rs6000_function_ok_for_sibcall. */
> +
> +static bool
> +rs6000_decl_ok_for_sibcall (tree decl)
> +{
> +  /* Sibcalls are always fine for the Darwin ABI. */
> +  if (DEFAULT_ABI == ABI_DARWIN)
> +    return true;
> +
> +  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> +    {
> +      /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
> +         functions, because the callee may have a different TOC pointer to
> +         the caller and there's no way to ensure we restore the TOC when
> +         we return. */
> +      if (!decl || DECL_EXTERNAL (decl) || DECL_WEAK (decl)
> +          || !(*targetm.binds_local_p) (decl))
> +        return false;
> +
> +      /* Similarly, if the caller preserves the TOC pointer and the callee
> +         doesn't (or vice versa), proper TOC setup or restoration will be
> +         missed. For example, suppose A, B, and C are in the same binary
> +         and A -> B -> C. A and B preserve the TOC pointer but C does not,
> +         and B -> C is eligible as a sibcall. A will call B through its
> +         local entry point, so A will not restore its TOC itself. B calls
> +         C with a sibcall, so it will not restore the TOC. C does not
> +         preserve the TOC, so it may clobber r2 with impunity. Returning
> +         from C will result in a corrupted TOC for A. */
> +      else if (rs6000_fndecl_pcrel_p (decl) != rs6000_pcrel_p (cfun))
> +        return false;
> +
> +      else
> +        return true;
> +    }
> +
> +  /* With the secure-plt SYSV ABI we can't make non-local calls when
> +     -fpic/PIC because the plt call stubs use r30. */
> +  if (DEFAULT_ABI == ABI_V4
> +      && (!TARGET_SECURE_PLT
> +          || !flag_pic
> +          || (decl
> +              && (*targetm.binds_local_p) (decl))))
> +    return true;
> +
> +  return false;
> +}
> +
>  /* Say whether a function is a candidate for sibcall handling or not. */
>
>  static bool
> @@ -24748,22 +24825,7 @@ rs6000_function_ok_for_sibcall (tree decl, tree exp)
>        return false;
>      }
>
> -  /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local
> -     functions, because the callee may have a different TOC pointer to
> -     the caller and there's no way to ensure we restore the TOC when
> -     we return. With the secure-plt SYSV ABI we can't make non-local
> -     calls when -fpic/PIC because the plt call stubs use r30. */
> -  if (DEFAULT_ABI == ABI_DARWIN
> -      || ((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> -          && decl
> -          && !DECL_EXTERNAL (decl)
> -          && !DECL_WEAK (decl)
> -          && (*targetm.binds_local_p) (decl))
> -      || (DEFAULT_ABI == ABI_V4
> -          && (!TARGET_SECURE_PLT
> -              || !flag_pic
> -              || (decl
> -                  && (*targetm.binds_local_p) (decl)))))
> +  if (rs6000_decl_ok_for_sibcall (decl))
>      {
>        tree attr_list = TYPE_ATTRIBUTES (fntype);
>
> @@ -32592,12 +32654,18 @@ rs6000_longcall_ref (rtx call_ref, rtx arg)
>    if (TARGET_PLTSEQ)
>      {
>        rtx base = const0_rtx;
> -      int regno;
> -      if (DEFAULT_ABI == ABI_ELFv2)
> +      int regno = 12;
> +      if (rs6000_pcrel_p (cfun))
>          {
> -          base = gen_rtx_REG (Pmode, TOC_REGISTER);
> -          regno = 12;
> +          rtx reg = gen_rtx_REG (Pmode, regno);
> +          rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
> +                                  UNSPEC_PLT_PCREL);
> +          emit_insn (gen_rtx_SET (reg, u));
> +          return reg;
>          }
> +
> +      if (DEFAULT_ABI == ABI_ELFv2)
> +        base = gen_rtx_REG (Pmode, TOC_REGISTER);
>        else
>          {
>            if (flag_pic)
> @@ -37706,37 +37774,38 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>    if (!SYMBOL_REF_P (func)
>        || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (func)))
>      {
> -      /* Save the TOC into its reserved slot before the call,
> -         and prepare to restore it after the call. */
> -      rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
> -      rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
> -                                             gen_rtvec (1, stack_toc_offset),
> -                                             UNSPEC_TOCSLOT);
> -      toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
> -
> -      /* Can we optimize saving the TOC in the prologue or
> -         do we need to do it at every call? */
> -      if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
> -        cfun->machine->save_toc_in_prologue = true;
> -      else
> +      if (!rs6000_pcrel_p (cfun))
>          {
> -          rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
> -          rtx stack_toc_mem = gen_frame_mem (Pmode,
> -                                             gen_rtx_PLUS (Pmode, stack_ptr,
> -                                                           stack_toc_offset));
> -          MEM_VOLATILE_P (stack_toc_mem) = 1;
> -          if (is_pltseq_longcall)
> +          /* Save the TOC into its reserved slot before the call,
> +             and prepare to restore it after the call. */
> +          rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
> +          rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode,
> +                                                 gen_rtvec (1, stack_toc_offset),
> +                                                 UNSPEC_TOCSLOT);
> +          toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec);
> +
> +          /* Can we optimize saving the TOC in the prologue or
> +             do we need to do it at every call? */
> +          if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
> +            cfun->machine->save_toc_in_prologue = true;
> +          else
>              {
> -              /* Use USPEC_PLTSEQ here to emit every instruction in an
> -                 inline PLT call sequence with a reloc, enabling the
> -                 linker to edit the sequence back to a direct call
> -                 when that makes sense. */
> -              rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
> -              rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
> -              emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
> +              rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
> +              rtx stack_toc_mem = gen_frame_mem (Pmode,
> +                                                 gen_rtx_PLUS (Pmode, stack_ptr,
> +                                                               stack_toc_offset));
> +              MEM_VOLATILE_P (stack_toc_mem) = 1;
> +              if (HAVE_AS_PLTSEQ
> +                  && DEFAULT_ABI == ABI_ELFv2
> +                  && GET_CODE (func_desc) == SYMBOL_REF)
> +                {
> +                  rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg);
> +                  rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ);
> +                  emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg));
> +                }
> +              else
> +                emit_move_insn (stack_toc_mem, toc_reg);
>              }
> -          else
> -            emit_move_insn (stack_toc_mem, toc_reg);
>          }
>
>        if (DEFAULT_ABI == ABI_ELFv2)
> @@ -37813,10 +37882,12 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>      }
>    else
>      {
> -      /* Direct calls use the TOC: for local calls, the callee will
> -         assume the TOC register is set; for non-local calls, the
> -         PLT stub needs the TOC register. */
> -      abi_reg = toc_reg;
> +      /* No TOC register needed for calls from PC-relative callers. */
> +      if (!rs6000_pcrel_p (cfun))
> +        /* Direct calls use the TOC: for local calls, the callee will
> +           assume the TOC register is set; for non-local calls, the
> +           PLT stub needs the TOC register. */
> +        abi_reg = toc_reg;
>        func_addr = func;
>      }
>
> @@ -37866,7 +37937,9 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>    insn = emit_call_insn (insn);
>
>    /* Note use of the TOC register. */
> -  use_reg (&CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (Pmode, TOC_REGNUM));
> +  if (!rs6000_pcrel_p (cfun))
> +    use_reg (&CALL_INSN_FUNCTION_USAGE (insn),
> +             gen_rtx_REG (Pmode, TOC_REGNUM));
>  }
>
>  /* Expand code to perform a call under the SYSV4 ABI. */
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 71613e21384..e1d9045c5bb 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -147,6 +147,7 @@
>     UNSPEC_PLTSEQ
>     UNSPEC_PLT16_HA
>     UNSPEC_PLT16_LO
> +   UNSPEC_PLT_PCREL
>    ])
>
>  ;;
> @@ -10267,6 +10268,20 @@
>  {
>    return rs6000_pltseq_template (operands, 3);
>  })
> +
> +(define_insn "*pltseq_plt_pcrel<mode>"
> +  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
> +        (unspec:P [(match_operand:P 1 "" "")
> +                   (match_operand:P 2 "symbol_ref_operand" "s")
> +                   (match_operand:P 3 "" "")]
> +                  UNSPEC_PLT_PCREL))]
> +  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
> +   && rs6000_pcrel_p (cfun)"
> +{
> +  return rs6000_pltseq_template (operands, 4);
> +}
> +  [(set_attr "type" "load")
> +   (set_attr "length" "12")])
>
>  ;; Call and call_value insns
>  ;; For the purposes of expanding calls, Darwin is very similar to SYSV.
> @@ -10582,7 +10597,11 @@
>           (match_operand 1))
>     (clobber (reg:P LR_REGNO))]
>    "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2"
> -  "bl %z0"
> +{
> +  if (rs6000_pcrel_p (cfun))
> +    return "bl %z0@notoc";
> +  return "bl %z0";
> +}
>    [(set_attr "type" "branch")])
>
>  (define_insn "*call_value_local_aix<mode>"
> @@ -10592,7 +10611,11 @@
>     (clobber (reg:P LR_REGNO))]
>    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>     && !IS_NOMARK_TLSGETADDR (operands[2])"
> -  "bl %z1"
> +{
> +  if (rs6000_pcrel_p (cfun))
> +    return "bl %z1@notoc";
> +  return "bl %z1";
> +}
>    [(set_attr "type" "branch")])
>
>  ;; Call to AIX abi function which may be in another module.
> @@ -10607,7 +10630,10 @@
>    return rs6000_call_template (operands, 0);
>  }
>    [(set_attr "type" "branch")
> -   (set_attr "length" "8")])
> +   (set (attr "length")
> +        (if_then_else (match_test "rs6000_pcrel_p (cfun)")
> +          (const_int 4)
> +          (const_int 8)))])
>
>  (define_insn "*call_value_nonlocal_aix<mode>"
>    [(set (match_operand 0 "" "")
> @@ -10623,11 +10649,14 @@
>  }
>    [(set_attr "type" "branch")
>     (set (attr "length")
> -        (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
> -          (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
> -            (const_int 16)
> -            (const_int 12))
> -          (const_int 8)))])
> +        (plus (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
> +                (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
> +                  (const_int 8)
> +                  (const_int 4))
> +                (const_int 0))
> +              (if_then_else (match_test "rs6000_pcrel_p (cfun)")
> +                (const_int 4)
> +                (const_int 8))))])
>
>  ;; Call to indirect functions with the AIX abi using a 3 word descriptor.
>  ;; Operand0 is the addresss of the function to call
> @@ -10700,6 +10729,21 @@
>            (const_string "12")
>            (const_string "8")))])
>
> +(define_insn "*call_indirect_pcrel<mode>"
> +  [(call (mem:SI (match_operand:P 0 "indirect_call_operand" "c,*l,X"))
> +         (match_operand 1))
> +   (clobber (reg:P LR_REGNO))]
> +  "rs6000_pcrel_p (cfun)"
> +{
> +  return rs6000_indirect_call_template (operands, 0);
> +}
> +  [(set_attr "type" "jmpreg")
> +   (set (attr "length")
> +        (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
> +                           (match_test "which_alternative != 1"))
> +          (const_string "8")
> +          (const_string "4")))])
> +
>  (define_insn "*call_value_indirect_elfv2<mode>"
>    [(set (match_operand 0 "" "")
>          (call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
> @@ -10728,6 +10772,31 @@
>              (const_string "12")
>              (const_string "8"))))])
>
> +(define_insn "*call_value_indirect_pcrel<mode>"
> +  [(set (match_operand 0 "" "")
> +        (call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X"))
> +              (match_operand:P 2 "unspec_tls" "")))
> +   (clobber (reg:P LR_REGNO))]
> +  "rs6000_pcrel_p (cfun)"
> +{
> +  if (IS_NOMARK_TLSGETADDR (operands[2]))
> +    rs6000_output_tlsargs (operands);
> +
> +  return rs6000_indirect_call_template (operands, 1);
> +}
> +  [(set_attr "type" "jmpreg")
> +   (set (attr "length")
> +        (plus
> +          (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
> +            (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
> +              (const_int 8)
> +              (const_int 4))
> +            (const_int 0))
> +          (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
> +                             (match_test "which_alternative != 1"))
> +            (const_string "8")
> +            (const_string "4"))))])
> +
>  ;; Call subroutine returning any type.
>  (define_expand "untyped_call"
>    [(parallel [(call (match_operand 0 "")
> diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
> new file mode 100644
> index 00000000000..c7d322c1c96
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
> @@ -0,0 +1,41 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=future -O2" } */
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +
> +/* Test that calls generated from PC-relative code are
> +   annotated with @notoc. */
> +
> +extern int yy0 (int);
> +extern void yy1 (int);
> +
> +int zz0 (void) __attribute__((noinline));
> +void zz1 (int) __attribute__((noinline));
> +
> +int xx (void)
> +{
> +  yy1 (7);
> +  return yy0 (5);
> +}
> +
> +int zz0 ()
> +{
> +  asm ("");
> +  return 16;
> +};
> +
> +void zz1 (int a __attribute__((__unused__)))
> +{
> +  asm ("");
> +};
> +
> +int ww (void)
> +{
> +  zz1 (zz0 ());
> +  return 4;
> +}
> +
> +/* { dg-final { scan-assembler {yy1@notoc} } } */
> +/* { dg-final { scan-assembler {yy0@notoc} } } */
> +/* { dg-final { scan-assembler {zz1@notoc} } } */
> +/* { dg-final { scan-assembler {zz0@notoc} } } */
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
> new file mode 100644
> index 00000000000..7c767e2ba32
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c
> @@ -0,0 +1,46 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=future -O2" } */
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +
> +/* Test that potential sibcalls are not generated when the caller preserves
> +   the TOC and the callee doesn't, or vice versa. */
> +
> +int x (void) __attribute__((noinline));
> +int y (void) __attribute__((noinline));
> +int xx (void) __attribute__((noinline));
> +
> +int x (void)
> +{
> +  return 1;
> +}
> +
> +int y (void)
> +{
> +  return 2;
> +}
> +
> +int sib_call (void)
> +{
> +  return x ();
> +}
> +
> +#pragma GCC target ("cpu=power9")
> +int normal_call (void)
> +{
> +  return y ();
> +}
> +
> +int xx (void)
> +{
> +  return 1;
> +}
> +
> +#pragma GCC target ("cpu=future")
> +int notoc_call (void)
> +{
> +  return xx ();
> +}
> +
> +/* { dg-final { scan-assembler {\mb x@notoc\M} } } */
> +/* { dg-final { scan-assembler {\mbl y\M} } } */
> +/* { dg-final { scan-assembler {\mbl xx@notoc\M} } } */
>
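
For anyone following item (2) of the description, the decision that rs6000_decl_ok_for_sibcall makes can be modelled outside the compiler. What follows is only a sketch and is not code from the patch: the enum, struct, and function names are hypothetical, and the boolean fields are stand-ins for DECL_EXTERNAL, DECL_WEAK, targetm.binds_local_p, rs6000_fndecl_pcrel_p, and rs6000_pcrel_p (cfun), under the assumption that each of those reduces to a yes/no answer for a given callee.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the ABIs the patch distinguishes.  */
enum model_abi
{
  MODEL_ABI_AIX,
  MODEL_ABI_ELFv2,
  MODEL_ABI_V4,
  MODEL_ABI_DARWIN
};

/* Hypothetical summary of what the compiler knows about a callee.  */
struct model_callee
{
  bool has_decl;     /* false models a null decl, e.g. an indirect call */
  bool is_external;  /* stands in for DECL_EXTERNAL */
  bool is_weak;      /* stands in for DECL_WEAK */
  bool binds_local;  /* stands in for targetm.binds_local_p */
  bool is_pcrel;     /* stands in for rs6000_fndecl_pcrel_p */
};

/* Return true if a sibcall to CALLEE would be allowed under the rules
   described in the patch.  CALLER_IS_PCREL stands in for
   rs6000_pcrel_p (cfun); SECURE_PLT and PIC stand in for
   TARGET_SECURE_PLT and flag_pic.  */
bool
model_decl_ok_for_sibcall (enum model_abi abi, bool caller_is_pcrel,
                           bool secure_plt, bool pic,
                           const struct model_callee *callee)
{
  assert (callee);

  /* Sibcalls are always fine for the Darwin ABI.  */
  if (abi == MODEL_ABI_DARWIN)
    return true;

  if (abi == MODEL_ABI_AIX || abi == MODEL_ABI_ELFv2)
    {
      /* Non-local callees may use a different TOC pointer.  */
      if (!callee->has_decl || callee->is_external || callee->is_weak
          || !callee->binds_local)
        return false;

      /* Caller and callee must agree on whether r2 is preserved.  */
      return callee->is_pcrel == caller_is_pcrel;
    }

  /* Remaining case in this model: SYSV (V4).  Non-local calls are a
     problem only with secure-plt and PIC.  */
  return !secure_plt || !pic || (callee->has_decl && callee->binds_local);
}
```

In terms of pcrel-sibcall-1.c above, sib_call and x are both compiled PC-relative, so the model allows the sibcall ("b x@notoc"), while normal_call (cpu=power9) calling y (cpu=future) and notoc_call (cpu=future) calling xx (cpu=power9) disagree on TOC handling and fall back to ordinary "bl" calls.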
New test case ICEs, so consider this withdrawn. Sorry again about this.

Bill

On 5/23/19 9:17 PM, Bill Schmidt wrote:
> Hm, I got ahead of myself on this one. I haven't done the regstrap yet,
> so please hold off reviewing for now.
>
> Sorry for the noise. I shouldn't post when I'm tired...
>
> Thanks,
> Bill
(Pmode, TOC_REGISTER); >> else >> { >> if (flag_pic) >> @@ -37706,37 +37774,38 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie) >> if (!SYMBOL_REF_P (func) >> || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (func))) >> { >> - /* Save the TOC into its reserved slot before the call, >> - and prepare to restore it after the call. */ >> - rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT); >> - rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode, >> - gen_rtvec (1, stack_toc_offset), >> - UNSPEC_TOCSLOT); >> - toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec); >> - >> - /* Can we optimize saving the TOC in the prologue or >> - do we need to do it at every call? */ >> - if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca) >> - cfun->machine->save_toc_in_prologue = true; >> - else >> + if (!rs6000_pcrel_p (cfun)) >> { >> - rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); >> - rtx stack_toc_mem = gen_frame_mem (Pmode, >> - gen_rtx_PLUS (Pmode, stack_ptr, >> - stack_toc_offset)); >> - MEM_VOLATILE_P (stack_toc_mem) = 1; >> - if (is_pltseq_longcall) >> + /* Save the TOC into its reserved slot before the call, >> + and prepare to restore it after the call. */ >> + rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT); >> + rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode, >> + gen_rtvec (1, stack_toc_offset), >> + UNSPEC_TOCSLOT); >> + toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec); >> + >> + /* Can we optimize saving the TOC in the prologue or >> + do we need to do it at every call? */ >> + if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca) >> + cfun->machine->save_toc_in_prologue = true; >> + else >> { >> - /* Use USPEC_PLTSEQ here to emit every instruction in an >> - inline PLT call sequence with a reloc, enabling the >> - linker to edit the sequence back to a direct call >> - when that makes sense. 
*/ >> - rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg); >> - rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ); >> - emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg)); >> + rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); >> + rtx stack_toc_mem = gen_frame_mem (Pmode, >> + gen_rtx_PLUS (Pmode, stack_ptr, >> + stack_toc_offset)); >> + MEM_VOLATILE_P (stack_toc_mem) = 1; >> + if (HAVE_AS_PLTSEQ >> + && DEFAULT_ABI == ABI_ELFv2 >> + && GET_CODE (func_desc) == SYMBOL_REF) >> + { >> + rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg); >> + rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ); >> + emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg)); >> + } >> + else >> + emit_move_insn (stack_toc_mem, toc_reg); >> } >> - else >> - emit_move_insn (stack_toc_mem, toc_reg); >> } >> >> if (DEFAULT_ABI == ABI_ELFv2) >> @@ -37813,10 +37882,12 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie) >> } >> else >> { >> - /* Direct calls use the TOC: for local calls, the callee will >> - assume the TOC register is set; for non-local calls, the >> - PLT stub needs the TOC register. */ >> - abi_reg = toc_reg; >> + /* No TOC register needed for calls from PC-relative callers. */ >> + if (!rs6000_pcrel_p (cfun)) >> + /* Direct calls use the TOC: for local calls, the callee will >> + assume the TOC register is set; for non-local calls, the >> + PLT stub needs the TOC register. */ >> + abi_reg = toc_reg; >> func_addr = func; >> } >> >> @@ -37866,7 +37937,9 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie) >> insn = emit_call_insn (insn); >> >> /* Note use of the TOC register. */ >> - use_reg (&CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (Pmode, TOC_REGNUM)); >> + if (!rs6000_pcrel_p (cfun)) >> + use_reg (&CALL_INSN_FUNCTION_USAGE (insn), >> + gen_rtx_REG (Pmode, TOC_REGNUM)); >> } >> >> /* Expand code to perform a call under the SYSV4 ABI. 
*/ >> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md >> index 71613e21384..e1d9045c5bb 100644 >> --- a/gcc/config/rs6000/rs6000.md >> +++ b/gcc/config/rs6000/rs6000.md >> @@ -147,6 +147,7 @@ >> UNSPEC_PLTSEQ >> UNSPEC_PLT16_HA >> UNSPEC_PLT16_LO >> + UNSPEC_PLT_PCREL >> ]) >> >> ;; >> @@ -10267,6 +10268,20 @@ >> { >> return rs6000_pltseq_template (operands, 3); >> }) >> + >> +(define_insn "*pltseq_plt_pcrel<mode>" >> + [(set (match_operand:P 0 "gpc_reg_operand" "=r") >> + (unspec:P [(match_operand:P 1 "" "") >> + (match_operand:P 2 "symbol_ref_operand" "s") >> + (match_operand:P 3 "" "")] >> + UNSPEC_PLT_PCREL))] >> + "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS >> + && rs6000_pcrel_p (cfun)" >> +{ >> + return rs6000_pltseq_template (operands, 4); >> +} >> + [(set_attr "type" "load") >> + (set_attr "length" "12")]) >> >> ;; Call and call_value insns >> ;; For the purposes of expanding calls, Darwin is very similar to SYSV. >> @@ -10582,7 +10597,11 @@ >> (match_operand 1)) >> (clobber (reg:P LR_REGNO))] >> "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2" >> - "bl %z0" >> +{ >> + if (rs6000_pcrel_p (cfun)) >> + return "bl %z0@notoc"; >> + return "bl %z0"; >> +} >> [(set_attr "type" "branch")]) >> >> (define_insn "*call_value_local_aix<mode>" >> @@ -10592,7 +10611,11 @@ >> (clobber (reg:P LR_REGNO))] >> "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) >> && !IS_NOMARK_TLSGETADDR (operands[2])" >> - "bl %z1" >> +{ >> + if (rs6000_pcrel_p (cfun)) >> + return "bl %z1@notoc"; >> + return "bl %z1"; >> +} >> [(set_attr "type" "branch")]) >> >> ;; Call to AIX abi function which may be in another module. 
>> @@ -10607,7 +10630,10 @@ >> return rs6000_call_template (operands, 0); >> } >> [(set_attr "type" "branch") >> - (set_attr "length" "8")]) >> + (set (attr "length") >> + (if_then_else (match_test "rs6000_pcrel_p (cfun)") >> + (const_int 4) >> + (const_int 8)))]) >> >> (define_insn "*call_value_nonlocal_aix<mode>" >> [(set (match_operand 0 "" "") >> @@ -10623,11 +10649,14 @@ >> } >> [(set_attr "type" "branch") >> (set (attr "length") >> - (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])") >> - (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL") >> - (const_int 16) >> - (const_int 12)) >> - (const_int 8)))]) >> + (plus (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])") >> + (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL") >> + (const_int 8) >> + (const_int 4)) >> + (const_int 0)) >> + (if_then_else (match_test "rs6000_pcrel_p (cfun)") >> + (const_int 4) >> + (const_int 8))))]) >> >> ;; Call to indirect functions with the AIX abi using a 3 word descriptor. 
>> ;; Operand0 is the addresss of the function to call >> @@ -10700,6 +10729,21 @@ >> (const_string "12") >> (const_string "8")))]) >> >> +(define_insn "*call_indirect_pcrel<mode>" >> + [(call (mem:SI (match_operand:P 0 "indirect_call_operand" "c,*l,X")) >> + (match_operand 1)) >> + (clobber (reg:P LR_REGNO))] >> + "rs6000_pcrel_p (cfun)" >> +{ >> + return rs6000_indirect_call_template (operands, 0); >> +} >> + [(set_attr "type" "jmpreg") >> + (set (attr "length") >> + (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps") >> + (match_test "which_alternative != 1")) >> + (const_string "8") >> + (const_string "4")))]) >> + >> (define_insn "*call_value_indirect_elfv2<mode>" >> [(set (match_operand 0 "" "") >> (call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X")) >> @@ -10728,6 +10772,31 @@ >> (const_string "12") >> (const_string "8"))))]) >> >> +(define_insn "*call_value_indirect_pcrel<mode>" >> + [(set (match_operand 0 "" "") >> + (call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X")) >> + (match_operand:P 2 "unspec_tls" ""))) >> + (clobber (reg:P LR_REGNO))] >> + "rs6000_pcrel_p (cfun)" >> +{ >> + if (IS_NOMARK_TLSGETADDR (operands[2])) >> + rs6000_output_tlsargs (operands); >> + >> + return rs6000_indirect_call_template (operands, 1); >> +} >> + [(set_attr "type" "jmpreg") >> + (set (attr "length") >> + (plus >> + (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])") >> + (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL") >> + (const_int 8) >> + (const_int 4)) >> + (const_int 0)) >> + (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps") >> + (match_test "which_alternative != 1")) >> + (const_string "8") >> + (const_string "4"))))]) >> + >> ;; Call subroutine returning any type. 
>> (define_expand "untyped_call" >> [(parallel [(call (match_operand 0 "") >> diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c >> new file mode 100644 >> index 00000000000..c7d322c1c96 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c >> @@ -0,0 +1,41 @@ >> ++/* { dg-do compile } */ >> ++/* { dg-options "-mdejagnu-cpu=future -O2" } */ >> ++/* { dg-require-effective-target powerpc_elfv2 } */ >> + >> +/* Test that calls generated from PC-relative code are >> + annotated with @notoc. */ >> + >> +extern int yy0 (int); >> +extern void yy1 (int); >> + >> +int zz0 (void) __attribute__((noinline)); >> +void zz1 (int) __attribute__((noinline)); >> + >> +int xx (void) >> +{ >> + yy1 (7); >> + return yy0 (5); >> +} >> + >> +int zz0 () >> +{ >> + asm (""); >> + return 16; >> +}; >> + >> +void zz1 (int a __attribute__((__unused__))) >> +{ >> + asm (""); >> +}; >> + >> +int ww (void) >> +{ >> + zz1 (zz0 ()); >> + return 4; >> +} >> + >> +/* { dg-final { scan-assembler {yy1@notoc} } } */ >> +/* { dg-final { scan-assembler {yy0@notoc} } } */ >> +/* { dg-final { scan-assembler {zz1@notoc} } } */ >> +/* { dg-final { scan-assembler {zz0@notoc} } } */ >> + >> diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c >> new file mode 100644 >> index 00000000000..7c767e2ba32 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c >> @@ -0,0 +1,46 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-mdejagnu-cpu=future -O2" } */ >> +/* { dg-require-effective-target powerpc_elfv2 } */ >> + >> +/* Test that potential sibcalls are not generated when the caller preserves >> + the TOC and the callee doesn't, or vice versa. 
*/ >> + >> +int x (void) __attribute__((noinline)); >> +int y (void) __attribute__((noinline)); >> +int xx (void) __attribute__((noinline)); >> + >> +int x (void) >> +{ >> + return 1; >> +} >> + >> +int y (void) >> +{ >> + return 2; >> +} >> + >> +int sib_call (void) >> +{ >> + return x (); >> +} >> + >> +#pragma GCC target ("cpu=power9") >> +int normal_call (void) >> +{ >> + return y (); >> +} >> + >> +int xx (void) >> +{ >> + return 1; >> +} >> + >> +#pragma GCC target ("cpu=future") >> +int notoc_call (void) >> +{ >> + return xx (); >> +} >> + >> +/* { dg-final { scan-assembler {\mb x@notoc\M} } } */ >> +/* { dg-final { scan-assembler {\mbl y\M} } } */ >> +/* { dg-final { scan-assembler {\mbl xx@notoc\M} } } */ >>
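For readers following the sibcall rule in item (2), here is a minimal standalone sketch of the decision that rs6000_decl_ok_for_sibcall makes in the patch above. It is a model only, not GCC code: the enum, the two structs, and the name decl_ok_for_sibcall are hypothetical stand-ins for DEFAULT_ABI, the tree decl predicates, and rs6000_pcrel_p (cfun).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC internals; these types do not exist in GCC. */
enum abi { ABI_AIX, ABI_ELFV2, ABI_V4, ABI_DARWIN };

struct callee {
  bool known;        /* models decl != NULL */
  bool external;     /* models DECL_EXTERNAL (decl) */
  bool weak;         /* models DECL_WEAK (decl) */
  bool binds_local;  /* models (*targetm.binds_local_p) (decl) */
  bool pcrel;        /* models rs6000_fndecl_pcrel_p (decl) */
};

struct target {
  enum abi abi;      /* models DEFAULT_ABI */
  bool secure_plt;   /* models TARGET_SECURE_PLT */
  bool pic;          /* models flag_pic != 0 */
};

/* Return true if a call from a caller with the given pcrel setting to
   callee C may be turned into a sibcall, following the structure of the
   helper in the patch.  */
bool
decl_ok_for_sibcall (const struct target *t, bool caller_pcrel,
                     const struct callee *c)
{
  /* Sibcalls are always fine for the Darwin ABI.  */
  if (t->abi == ABI_DARWIN)
    return true;

  if (t->abi == ABI_AIX || t->abi == ABI_ELFV2)
    {
      /* Non-local callees may use a different TOC pointer.  */
      if (!c->known || c->external || c->weak || !c->binds_local)
        return false;

      /* Caller and callee must agree on whether r2 is preserved.  */
      return c->pcrel == caller_pcrel;
    }

  /* Secure-plt SYSV with PIC cannot sibcall non-local functions.  */
  if (t->abi == ABI_V4)
    return !t->secure_plt || !t->pic || (c->known && c->binds_local);

  return false;
}
```

Under this model, and assuming as the test does that cpu=future selects PC-relative code, the three calls in pcrel-sibcall-1.c come out as the scan-assembler lines expect: sib_call to x (both PC-relative) is eligible, while normal_call to y and notoc_call to xx each pair a TOC-preserving function with a PC-relative one and so remain ordinary bl calls.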
Hi,

Please go ahead and review this. In the test case gcc.target/powerpc/notoc-direct-1.c, I accidentally left in '+' characters in column 1 of the first three lines, which caused the test case failure.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with that fixed. Is this okay for trunk?

Thanks,
Bill

On 5/24/19 9:06 AM, Bill Schmidt wrote:
> New test case ICEs, so consider this withdrawn. Sorry again about this.
>
> Bill
>
> On 5/23/19 9:17 PM, Bill Schmidt wrote:
>> Hm, I got ahead of myself on this one. I haven't done the regstrap yet,
>> so please hold off reviewing for now.
>>
>> Sorry for the noise. I shouldn't post when I'm tired...
>>
>> Thanks,
>> Bill
>>
>> On 5/23/19 9:11 PM, Bill Schmidt wrote:
>>> Hi,
>>>
>>> This patch contains the changes to implement call flow for PC-relative addressing.
>>> It's an amalgam of several internal patches that Alan and I worked on, and as a
>>> result it's hard to tease apart individual pieces much further. So I apologize
>>> that this patch is a little larger than the others. Also, I've CC'd Alan so he
>>> can help answer questions about the patch, particularly the PLT bits I'm not very
>>> familiar with.
>>>
>>> Following are descriptions of the individual patches that are combined here.
>>>
>>> (1) When a function uses PC-relative code generation, all direct calls (other than
>>> sibcalls) that the function makes to local or external callees should appear as
>>> "bl sym@notoc" and should not be followed by a nop instruction. @notoc indicates
>>> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
>>> that the caller does not guarantee that r2 contains a valid TOC pointer. Thus
>>> the linker should not try to replace a subsequent "nop" with a TOC restore
>>> instruction.
>>>
>>> I've added a test case for the four cases handled here: local calls with/without
>>> a return value, and external cases with/without a return value.
>>>
>>> (2) If a caller preserves the TOC pointer and the callee does not, or vice versa,
>>> then a sibcall will cause an inconsistency. Don't allow that.
>>>
>>> (3) The linker needs a @notoc directive on sibcall targets when the caller does not
>>> provide or preserve a TOC pointer. This patch provides for that.
>>>
>>> In creating the new sibcall patterns, I did not duplicate the "c" alternatives
>>> that allow for bctr or blr sibcalls. I don't think there's a way to generate
>>> those currently. The bctr would be legitimate for PC-relative sibcalls if you
>>> can prove that the target function is in the same binary, but we don't appear
>>> to detect that possibility today.
>>>
>>> (4) This patch deletes all the extra insns added to handle pcrel calls,
>>> instead opting to use existing insns but making their output
>>> conditional on rs6000_pcrel_p(cfun). There isn't a need to
>>> differentiate between pcrel and non-pcrel calls at the point rtl is
>>> created; rs6000_pcrel_p is valid right up to the final pass, as
>>> evidenced by use of rs6000_pcrel_p to emit .localentry.
>>>
>>> There is one case however where we do need new insns: The existing
>>> nonlocal indirect call insns mention r2 in their rtl. That isn't
>>> correct for pcrel indirect calls, and may cause problems when/if r2
>>> is allocated as any other volatile gpr in pcrel code.
>>>
>>> The patch also fixes pcrel inline PLT calls (which are used for
>>> -fno-plt and -mlongcall) to use a pcrel load from the PLT rather than
>>> attempting (and failing) to use TOC-relative loads. This requires
>>> some changes in the way relocs are emitted. For prefix insns we can't
>>> write
>>>     .reloc .,R_PPC64_PLT_PCREL34_NOTOC,foo
>>>     pld 12,0(0),1
>>> since the pld may require a padding nop. Instead it's necessary to
>>> put the .reloc after the instruction or use a label on the insn. Like
>>> this (which is what the patch does):
>>>     pld 12,0(0),1
>>>     .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
>>> or this:
>>>     .reloc 0f,R_PPC64_PLT_PCREL34_NOTOC,foo
>>>  0: pld 12,0(0),1
>>>
>>> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
>>> Is this okay for trunk?
>>>
>>> Thanks!
>>> Bill
>>>
>>>
>>> [gcc]
>>>
>>> 2019-05-23 Bill Schmidt <wschmidt@linux.ibm.com>
>>>            Alan Modra <amodra@gmail.com>
>>>
>>> * config/rs6000/rs6000.c (rs6000_call_template_1): Handle pcrel
>>> calls here...
>>> (rs6000_indirect_call_template_1): ...and here.
>>> (rs6000_indirect_sibcall_template): Handle plt_pcrel34. Rework
>>> tocsave, plt16_ha, plt16_lo, mtctr indirect calls.
>>> (rs6000_decl_ok_for_sibcall): New function.
>>> (rs6000_function_ok_for_sibcall): Refactor.
>>> (rs6000_longcall_ref): Use UNSPEC_PLT_PCREL when pcrel.
>>> (rs6000_call_aix): Don't emit toc restore rtl for indirect calls
>>> when pcrel. Reorganize.
>>> (rs6000_sibcall_aix): Don't add r2 to function usage when pcrel.
>>> * rs6000.md (UNSPEC_PLT_PCREL): New unspec.
>>> (*pltseq_plt_pcrel): New insn.
>>> (*call_local_aix): Handle @notoc calls.
>>> (*call_value_local_aix): Likewise.
>>> (*call_nonlocal_aix): Adjust lengths for pcrel calls.
>>> (*call_value_nonlocal_aix): Likewise.
>>> (*call_indirect_pcrel): New insn.
>>> (*call_value_indirect_pcrel): Likewise.
>>>
>>>
>>> [gcc/testsuite]
>>>
>>> 2019-05-23 Bill Schmidt <wschmidt@linux.ibm.com>
>>>
>>> * gcc.target/powerpc/notoc-direct-1.c: New.
>>> * gcc.target/powerpc/pcrel-sibcall-1.c: New.
>>>
>>>
>>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>>> index 3d5cf9e4ece..9229bad6acc 100644
>>> --- a/gcc/config/rs6000/rs6000.c
>>> +++ b/gcc/config/rs6000/rs6000.c
>>> @@ -21268,7 +21268,9 @@ rs6000_call_template_1 (rtx *operands, unsigned int funop, bool sibcall)
>>> ? 
"+32768" : "")); >>> >>> static char str[32]; /* 2 spare */ >>> - if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) >>> + if (rs6000_pcrel_p (cfun)) >>> + sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg); >>> + else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) >>> sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg, >>> sibcall ? "" : "\n\tnop"); >>> else if (DEFAULT_ABI == ABI_V4) >>> @@ -21333,6 +21335,16 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop, >>> /* Currently, funop is either 0 or 1. The maximum string is always >>> a !speculate 64-bit __tls_get_addr call. >>> >>> + ABI_ELFv2, pcrel: >>> + . 27 .reloc .,R_PPC64_TLSGD,%2\n\t >>> + . 35 .reloc .,R_PPC64_PLTSEQ_NOTOC,%z1\n\t >>> + . 9 crset 2\n\t >>> + . 27 .reloc .,R_PPC64_TLSGD,%2\n\t >>> + . 36 .reloc .,R_PPC64_PLTCALL_NOTOC,%z1\n\t >>> + . 8 beq%T1l- >>> + .--- >>> + .142 >>> + >>> ABI_AIX: >>> . 9 ld 2,%3\n\t >>> . 27 .reloc .,R_PPC64_TLSGD,%2\n\t >>> @@ -21398,23 +21410,31 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop, >>> gcc_unreachable (); >>> } >>> >>> + const char *notoc = rs6000_pcrel_p (cfun) ? "_NOTOC" : ""; >>> const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT >>> && flag_pic == 2 ? 
"+32768" : ""); >>> if (!speculate) >>> { >>> s += sprintf (s, >>> - "%s.reloc .,R_PPC%s_PLTSEQ,%%z%u%s\n\t", >>> - tls, rel64, funop, addend); >>> + "%s.reloc .,R_PPC%s_PLTSEQ%s,%%z%u%s\n\t", >>> + tls, rel64, notoc, funop, addend); >>> s += sprintf (s, "crset 2\n\t"); >>> } >>> s += sprintf (s, >>> - "%s.reloc .,R_PPC%s_PLTCALL,%%z%u%s\n\t", >>> - tls, rel64, funop, addend); >>> + "%s.reloc .,R_PPC%s_PLTCALL%s,%%z%u%s\n\t", >>> + tls, rel64, notoc, funop, addend); >>> } >>> else if (!speculate) >>> s += sprintf (s, "crset 2\n\t"); >>> >>> - if (DEFAULT_ABI == ABI_AIX) >>> + if (rs6000_pcrel_p (cfun)) >>> + { >>> + if (speculate) >>> + sprintf (s, "b%%T%ul", funop); >>> + else >>> + sprintf (s, "beq%%T%ul-", funop); >>> + } >>> + else if (DEFAULT_ABI == ABI_AIX) >>> { >>> if (speculate) >>> sprintf (s, >>> @@ -21468,63 +21488,73 @@ rs6000_indirect_sibcall_template (rtx *operands, unsigned int funop) >>> >>> #if HAVE_AS_PLTSEQ >>> /* Output indirect call insns. >>> - WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr. */ >>> + WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr, >>> + 4 for plt_pcrel34. */ >>> const char * >>> rs6000_pltseq_template (rtx *operands, int which) >>> { >>> const char *rel64 = TARGET_64BIT ? "64" : ""; >>> - char tls[28]; >>> + char tls[30]; >>> tls[0] = 0; >>> if (TARGET_TLS_MARKERS && GET_CODE (operands[3]) == UNSPEC) >>> { >>> + char off = which == 4 ? 
'8' : '4'; >>> if (XINT (operands[3], 1) == UNSPEC_TLSGD) >>> - sprintf (tls, ".reloc .,R_PPC%s_TLSGD,%%3\n\t", >>> - rel64); >>> + sprintf (tls, ".reloc .-%c,R_PPC%s_TLSGD,%%3\n\t", >>> + off, rel64); >>> else if (XINT (operands[3], 1) == UNSPEC_TLSLD) >>> - sprintf (tls, ".reloc .,R_PPC%s_TLSLD,%%&\n\t", >>> - rel64); >>> + sprintf (tls, ".reloc .-%c,R_PPC%s_TLSLD,%%&\n\t", >>> + off, rel64); >>> else >>> gcc_unreachable (); >>> } >>> >>> gcc_assert (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_V4); >>> - static char str[96]; /* 15 spare */ >>> - const char *off = WORDS_BIG_ENDIAN ? "+2" : ""; >>> + static char str[96]; /* 10 spare */ >>> + char off = WORDS_BIG_ENDIAN ? '2' : '4'; >>> const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT >>> && flag_pic == 2 ? "+32768" : ""); >>> switch (which) >>> { >>> case 0: >>> sprintf (str, >>> - "%s.reloc .,R_PPC%s_PLTSEQ,%%z2\n\t" >>> - "st%s", >>> - tls, rel64, TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)"); >>> + "st%s\n\t" >>> + "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2", >>> + TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)", >>> + tls, rel64); >>> break; >>> case 1: >>> if (DEFAULT_ABI == ABI_V4 && !flag_pic) >>> sprintf (str, >>> - "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2\n\t" >>> - "lis %%0,0", >>> + "lis %%0,0\n\t" >>> + "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2", >>> tls, off, rel64); >>> else >>> sprintf (str, >>> - "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2%s\n\t" >>> - "addis %%0,%%1,0", >>> + "addis %%0,%%1,0\n\t" >>> + "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2%s", >>> tls, off, rel64, addend); >>> break; >>> case 2: >>> sprintf (str, >>> - "%s.reloc .%s,R_PPC%s_PLT16_LO%s,%%z2%s\n\t" >>> - "l%s %%0,0(%%1)", >>> - tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend, >>> - TARGET_64BIT ? "d" : "wz"); >>> + "l%s %%0,0(%%1)\n\t" >>> + "%s.reloc .-%c,R_PPC%s_PLT16_LO%s,%%z2%s", >>> + TARGET_64BIT ? "d" : "wz", >>> + tls, off, rel64, TARGET_64BIT ? 
"_DS" : "", addend); >>> break; >>> case 3: >>> sprintf (str, >>> - "%s.reloc .,R_PPC%s_PLTSEQ,%%z2%s\n\t" >>> - "mtctr %%1", >>> + "mtctr %%1\n\t" >>> + "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2%s", >>> tls, rel64, addend); >>> break; >>> + case 4: >>> + sprintf (str, >>> + "pl%s %%0,0(0),1\n\t" >>> + "%s.reloc .-8,R_PPC%s_PLT_PCREL34_NOTOC,%%z2", >>> + TARGET_64BIT ? "d" : "wz", >>> + tls, rel64); >>> + break; >>> default: >>> gcc_unreachable (); >>> } >>> @@ -24703,6 +24733,53 @@ rs6000_return_addr (int count, rtx frame) >>> return get_hard_reg_initial_val (Pmode, LR_REGNO); >>> } >>> >>> +/* Helper function for rs6000_function_ok_for_sibcall. */ >>> + >>> +static bool >>> +rs6000_decl_ok_for_sibcall (tree decl) >>> +{ >>> + /* Sibcalls are always fine for the Darwin ABI. */ >>> + if (DEFAULT_ABI == ABI_DARWIN) >>> + return true; >>> + >>> + if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) >>> + { >>> + /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local >>> + functions, because the callee may have a different TOC pointer to >>> + the caller and there's no way to ensure we restore the TOC when >>> + we return. */ >>> + if (!decl || DECL_EXTERNAL (decl) || DECL_WEAK (decl) >>> + || !(*targetm.binds_local_p) (decl)) >>> + return false; >>> + >>> + /* Similarly, if the caller preserves the TOC pointer and the callee >>> + doesn't (or vice versa), proper TOC setup or restoration will be >>> + missed. For example, suppose A, B, and C are in the same binary >>> + and A -> B -> C. A and B preserve the TOC pointer but C does not, >>> + and B -> C is eligible as a sibcall. A will call B through its >>> + local entry point, so A will not restore its TOC itself. B calls >>> + C with a sibcall, so it will not restore the TOC. C does not >>> + preserve the TOC, so it may clobber r2 with impunity. Returning >>> + from C will result in a corrupted TOC for A. 
*/ >>> + else if (rs6000_fndecl_pcrel_p (decl) != rs6000_pcrel_p (cfun)) >>> + return false; >>> + >>> + else >>> + return true; >>> + } >>> + >>> + /* With the secure-plt SYSV ABI we can't make non-local calls when >>> + -fpic/PIC because the plt call stubs use r30. */ >>> + if (DEFAULT_ABI == ABI_V4 >>> + && (!TARGET_SECURE_PLT >>> + || !flag_pic >>> + || (decl >>> + && (*targetm.binds_local_p) (decl)))) >>> + return true; >>> + >>> + return false; >>> +} >>> + >>> /* Say whether a function is a candidate for sibcall handling or not. */ >>> >>> static bool >>> @@ -24748,22 +24825,7 @@ rs6000_function_ok_for_sibcall (tree decl, tree exp) >>> return false; >>> } >>> >>> - /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local >>> - functions, because the callee may have a different TOC pointer to >>> - the caller and there's no way to ensure we restore the TOC when >>> - we return. With the secure-plt SYSV ABI we can't make non-local >>> - calls when -fpic/PIC because the plt call stubs use r30. 
*/ >>> - if (DEFAULT_ABI == ABI_DARWIN >>> - || ((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) >>> - && decl >>> - && !DECL_EXTERNAL (decl) >>> - && !DECL_WEAK (decl) >>> - && (*targetm.binds_local_p) (decl)) >>> - || (DEFAULT_ABI == ABI_V4 >>> - && (!TARGET_SECURE_PLT >>> - || !flag_pic >>> - || (decl >>> - && (*targetm.binds_local_p) (decl))))) >>> + if (rs6000_decl_ok_for_sibcall (decl)) >>> { >>> tree attr_list = TYPE_ATTRIBUTES (fntype); >>> >>> @@ -32592,12 +32654,18 @@ rs6000_longcall_ref (rtx call_ref, rtx arg) >>> if (TARGET_PLTSEQ) >>> { >>> rtx base = const0_rtx; >>> - int regno; >>> - if (DEFAULT_ABI == ABI_ELFv2) >>> + int regno = 12; >>> + if (rs6000_pcrel_p (cfun)) >>> { >>> - base = gen_rtx_REG (Pmode, TOC_REGISTER); >>> - regno = 12; >>> + rtx reg = gen_rtx_REG (Pmode, regno); >>> + rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg), >>> + UNSPEC_PLT_PCREL); >>> + emit_insn (gen_rtx_SET (reg, u)); >>> + return reg; >>> } >>> + >>> + if (DEFAULT_ABI == ABI_ELFv2) >>> + base = gen_rtx_REG (Pmode, TOC_REGISTER); >>> else >>> { >>> if (flag_pic) >>> @@ -37706,37 +37774,38 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie) >>> if (!SYMBOL_REF_P (func) >>> || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (func))) >>> { >>> - /* Save the TOC into its reserved slot before the call, >>> - and prepare to restore it after the call. */ >>> - rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT); >>> - rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode, >>> - gen_rtvec (1, stack_toc_offset), >>> - UNSPEC_TOCSLOT); >>> - toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec); >>> - >>> - /* Can we optimize saving the TOC in the prologue or >>> - do we need to do it at every call? 
*/ >>> - if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca) >>> - cfun->machine->save_toc_in_prologue = true; >>> - else >>> + if (!rs6000_pcrel_p (cfun)) >>> { >>> - rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); >>> - rtx stack_toc_mem = gen_frame_mem (Pmode, >>> - gen_rtx_PLUS (Pmode, stack_ptr, >>> - stack_toc_offset)); >>> - MEM_VOLATILE_P (stack_toc_mem) = 1; >>> - if (is_pltseq_longcall) >>> + /* Save the TOC into its reserved slot before the call, >>> + and prepare to restore it after the call. */ >>> + rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT); >>> + rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode, >>> + gen_rtvec (1, stack_toc_offset), >>> + UNSPEC_TOCSLOT); >>> + toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec); >>> + >>> + /* Can we optimize saving the TOC in the prologue or >>> + do we need to do it at every call? */ >>> + if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca) >>> + cfun->machine->save_toc_in_prologue = true; >>> + else >>> { >>> - /* Use USPEC_PLTSEQ here to emit every instruction in an >>> - inline PLT call sequence with a reloc, enabling the >>> - linker to edit the sequence back to a direct call >>> - when that makes sense. 
*/ >>> - rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg); >>> - rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ); >>> - emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg)); >>> + rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); >>> + rtx stack_toc_mem = gen_frame_mem (Pmode, >>> + gen_rtx_PLUS (Pmode, stack_ptr, >>> + stack_toc_offset)); >>> + MEM_VOLATILE_P (stack_toc_mem) = 1; >>> + if (HAVE_AS_PLTSEQ >>> + && DEFAULT_ABI == ABI_ELFv2 >>> + && GET_CODE (func_desc) == SYMBOL_REF) >>> + { >>> + rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg); >>> + rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ); >>> + emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg)); >>> + } >>> + else >>> + emit_move_insn (stack_toc_mem, toc_reg); >>> } >>> - else >>> - emit_move_insn (stack_toc_mem, toc_reg); >>> } >>> >>> if (DEFAULT_ABI == ABI_ELFv2) >>> @@ -37813,10 +37882,12 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie) >>> } >>> else >>> { >>> - /* Direct calls use the TOC: for local calls, the callee will >>> - assume the TOC register is set; for non-local calls, the >>> - PLT stub needs the TOC register. */ >>> - abi_reg = toc_reg; >>> + /* No TOC register needed for calls from PC-relative callers. */ >>> + if (!rs6000_pcrel_p (cfun)) >>> + /* Direct calls use the TOC: for local calls, the callee will >>> + assume the TOC register is set; for non-local calls, the >>> + PLT stub needs the TOC register. */ >>> + abi_reg = toc_reg; >>> func_addr = func; >>> } >>> >>> @@ -37866,7 +37937,9 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie) >>> insn = emit_call_insn (insn); >>> >>> /* Note use of the TOC register. */ >>> - use_reg (&CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (Pmode, TOC_REGNUM)); >>> + if (!rs6000_pcrel_p (cfun)) >>> + use_reg (&CALL_INSN_FUNCTION_USAGE (insn), >>> + gen_rtx_REG (Pmode, TOC_REGNUM)); >>> } >>> >>> /* Expand code to perform a call under the SYSV4 ABI. 
*/ >>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md >>> index 71613e21384..e1d9045c5bb 100644 >>> --- a/gcc/config/rs6000/rs6000.md >>> +++ b/gcc/config/rs6000/rs6000.md >>> @@ -147,6 +147,7 @@ >>> UNSPEC_PLTSEQ >>> UNSPEC_PLT16_HA >>> UNSPEC_PLT16_LO >>> + UNSPEC_PLT_PCREL >>> ]) >>> >>> ;; >>> @@ -10267,6 +10268,20 @@ >>> { >>> return rs6000_pltseq_template (operands, 3); >>> }) >>> + >>> +(define_insn "*pltseq_plt_pcrel<mode>" >>> + [(set (match_operand:P 0 "gpc_reg_operand" "=r") >>> + (unspec:P [(match_operand:P 1 "" "") >>> + (match_operand:P 2 "symbol_ref_operand" "s") >>> + (match_operand:P 3 "" "")] >>> + UNSPEC_PLT_PCREL))] >>> + "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS >>> + && rs6000_pcrel_p (cfun)" >>> +{ >>> + return rs6000_pltseq_template (operands, 4); >>> +} >>> + [(set_attr "type" "load") >>> + (set_attr "length" "12")]) >>> >>> ;; Call and call_value insns >>> ;; For the purposes of expanding calls, Darwin is very similar to SYSV. >>> @@ -10582,7 +10597,11 @@ >>> (match_operand 1)) >>> (clobber (reg:P LR_REGNO))] >>> "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2" >>> - "bl %z0" >>> +{ >>> + if (rs6000_pcrel_p (cfun)) >>> + return "bl %z0@notoc"; >>> + return "bl %z0"; >>> +} >>> [(set_attr "type" "branch")]) >>> >>> (define_insn "*call_value_local_aix<mode>" >>> @@ -10592,7 +10611,11 @@ >>> (clobber (reg:P LR_REGNO))] >>> "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) >>> && !IS_NOMARK_TLSGETADDR (operands[2])" >>> - "bl %z1" >>> +{ >>> + if (rs6000_pcrel_p (cfun)) >>> + return "bl %z1@notoc"; >>> + return "bl %z1"; >>> +} >>> [(set_attr "type" "branch")]) >>> >>> ;; Call to AIX abi function which may be in another module. 
>>> @@ -10607,7 +10630,10 @@ >>> return rs6000_call_template (operands, 0); >>> } >>> [(set_attr "type" "branch") >>> - (set_attr "length" "8")]) >>> + (set (attr "length") >>> + (if_then_else (match_test "rs6000_pcrel_p (cfun)") >>> + (const_int 4) >>> + (const_int 8)))]) >>> >>> (define_insn "*call_value_nonlocal_aix<mode>" >>> [(set (match_operand 0 "" "") >>> @@ -10623,11 +10649,14 @@ >>> } >>> [(set_attr "type" "branch") >>> (set (attr "length") >>> - (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])") >>> - (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL") >>> - (const_int 16) >>> - (const_int 12)) >>> - (const_int 8)))]) >>> + (plus (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])") >>> + (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL") >>> + (const_int 8) >>> + (const_int 4)) >>> + (const_int 0)) >>> + (if_then_else (match_test "rs6000_pcrel_p (cfun)") >>> + (const_int 4) >>> + (const_int 8))))]) >>> >>> ;; Call to indirect functions with the AIX abi using a 3 word descriptor. 
>>> ;; Operand0 is the addresss of the function to call >>> @@ -10700,6 +10729,21 @@ >>> (const_string "12") >>> (const_string "8")))]) >>> >>> +(define_insn "*call_indirect_pcrel<mode>" >>> + [(call (mem:SI (match_operand:P 0 "indirect_call_operand" "c,*l,X")) >>> + (match_operand 1)) >>> + (clobber (reg:P LR_REGNO))] >>> + "rs6000_pcrel_p (cfun)" >>> +{ >>> + return rs6000_indirect_call_template (operands, 0); >>> +} >>> + [(set_attr "type" "jmpreg") >>> + (set (attr "length") >>> + (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps") >>> + (match_test "which_alternative != 1")) >>> + (const_string "8") >>> + (const_string "4")))]) >>> + >>> (define_insn "*call_value_indirect_elfv2<mode>" >>> [(set (match_operand 0 "" "") >>> (call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X")) >>> @@ -10728,6 +10772,31 @@ >>> (const_string "12") >>> (const_string "8"))))]) >>> >>> +(define_insn "*call_value_indirect_pcrel<mode>" >>> + [(set (match_operand 0 "" "") >>> + (call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X")) >>> + (match_operand:P 2 "unspec_tls" ""))) >>> + (clobber (reg:P LR_REGNO))] >>> + "rs6000_pcrel_p (cfun)" >>> +{ >>> + if (IS_NOMARK_TLSGETADDR (operands[2])) >>> + rs6000_output_tlsargs (operands); >>> + >>> + return rs6000_indirect_call_template (operands, 1); >>> +} >>> + [(set_attr "type" "jmpreg") >>> + (set (attr "length") >>> + (plus >>> + (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])") >>> + (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL") >>> + (const_int 8) >>> + (const_int 4)) >>> + (const_int 0)) >>> + (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps") >>> + (match_test "which_alternative != 1")) >>> + (const_string "8") >>> + (const_string "4"))))]) >>> + >>> ;; Call subroutine returning any type. 
>>> (define_expand "untyped_call" >>> [(parallel [(call (match_operand 0 "") >>> diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c >>> new file mode 100644 >>> index 00000000000..c7d322c1c96 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c >>> @@ -0,0 +1,41 @@ >>> ++/* { dg-do compile } */ >>> ++/* { dg-options "-mdejagnu-cpu=future -O2" } */ >>> ++/* { dg-require-effective-target powerpc_elfv2 } */ >>> + >>> +/* Test that calls generated from PC-relative code are >>> + annotated with @notoc. */ >>> + >>> +extern int yy0 (int); >>> +extern void yy1 (int); >>> + >>> +int zz0 (void) __attribute__((noinline)); >>> +void zz1 (int) __attribute__((noinline)); >>> + >>> +int xx (void) >>> +{ >>> + yy1 (7); >>> + return yy0 (5); >>> +} >>> + >>> +int zz0 () >>> +{ >>> + asm (""); >>> + return 16; >>> +}; >>> + >>> +void zz1 (int a __attribute__((__unused__))) >>> +{ >>> + asm (""); >>> +}; >>> + >>> +int ww (void) >>> +{ >>> + zz1 (zz0 ()); >>> + return 4; >>> +} >>> + >>> +/* { dg-final { scan-assembler {yy1@notoc} } } */ >>> +/* { dg-final { scan-assembler {yy0@notoc} } } */ >>> +/* { dg-final { scan-assembler {zz1@notoc} } } */ >>> +/* { dg-final { scan-assembler {zz0@notoc} } } */ >>> + >>> diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c >>> new file mode 100644 >>> index 00000000000..7c767e2ba32 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c >>> @@ -0,0 +1,46 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-mdejagnu-cpu=future -O2" } */ >>> +/* { dg-require-effective-target powerpc_elfv2 } */ >>> + >>> +/* Test that potential sibcalls are not generated when the caller preserves >>> + the TOC and the callee doesn't, or vice versa. 
*/ >>> + >>> +int x (void) __attribute__((noinline)); >>> +int y (void) __attribute__((noinline)); >>> +int xx (void) __attribute__((noinline)); >>> + >>> +int x (void) >>> +{ >>> + return 1; >>> +} >>> + >>> +int y (void) >>> +{ >>> + return 2; >>> +} >>> + >>> +int sib_call (void) >>> +{ >>> + return x (); >>> +} >>> + >>> +#pragma GCC target ("cpu=power9") >>> +int normal_call (void) >>> +{ >>> + return y (); >>> +} >>> + >>> +int xx (void) >>> +{ >>> + return 1; >>> +} >>> + >>> +#pragma GCC target ("cpu=future") >>> +int notoc_call (void) >>> +{ >>> + return xx (); >>> +} >>> + >>> +/* { dg-final { scan-assembler {\mb x@notoc\M} } } */ >>> +/* { dg-final { scan-assembler {\mbl y\M} } } */ >>> +/* { dg-final { scan-assembler {\mbl xx@notoc\M} } } */ >>>
Hi Bill,

On Thu, May 23, 2019 at 09:11:44PM -0500, Bill Schmidt wrote:
> (1) When a function uses PC-relative code generation, all direct calls (other than
> sibcalls) that the function makes to local or external callees should appear as
> "bl sym@notoc" and should not be followed by a nop instruction. @notoc indicates
> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
> that the caller does not guarantee that r2 contains a valid TOC pointer. Thus
> the linker should not try to replace a subsequent "nop" with a TOC restore
> instruction.

All necessary linker (and binutils and GAS) support is upstream already, right?

> In creating the new sibcall patterns, I did not duplicate the "c" alternatives
> that allow for bctr or blr sibcalls. I don't think there's a way to generate
> those currently. The bctr would be legitimate for PC-relative sibcalls if you
> can prove that the target function is in the same binary, but we don't appear
> to detect that possibility today.

But you could see that the target is in the same translation unit, for example?
That should be a simple test to make, too.

> pld 12,0(0),1
> .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo

Are we guaranteed the assembler always writes a pld like this as 8 bytes?

> * gcc.target/powerpc/notoc-direct-1.c: New.
> * gcc.target/powerpc/pcrel-sibcall-1.c: New.

A few more testcases would be useful. Well, we'll gain a lot of 'em soon
enough, I suppose.

> static char str[32]; /* 2 spare */
> - if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> + if (rs6000_pcrel_p (cfun))
> + sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
> + else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
> sibcall ? "" : "\n\tnop");

Two spare, and you add one char (@notoc vs. ..nop), so at a minimum you
need to correct the comment?

> + if (DEFAULT_ABI == ABI_V4
> + && (!TARGET_SECURE_PLT
> + || !flag_pic
> + || (decl
> + && (*targetm.binds_local_p) (decl))))
> + return true;
> +
> + return false;

Please invert this (put the "return false" condition in the if, like the
preceding comment says).

> if (TARGET_PLTSEQ)
> {
> rtx base = const0_rtx;
> - int regno;
> - if (DEFAULT_ABI == ABI_ELFv2)
> + int regno = 12;
> + if (rs6000_pcrel_p (cfun))
> {
> - base = gen_rtx_REG (Pmode, TOC_REGISTER);
> - regno = 12;
> + rtx reg = gen_rtx_REG (Pmode, regno);
> + rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
> + UNSPEC_PLT_PCREL);
> + emit_insn (gen_rtx_SET (reg, u));
> + return reg;
> }

You don't need a regno variable here, so don't use it, only set it later
where it _is_ used?

> +(define_insn "*pltseq_plt_pcrel<mode>"
> + [(set (match_operand:P 0 "gpc_reg_operand" "=r")
> + (unspec:P [(match_operand:P 1 "" "")
> + (match_operand:P 2 "symbol_ref_operand" "s")
> + (match_operand:P 3 "" "")]
> + UNSPEC_PLT_PCREL))]
> + "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
> + && rs6000_pcrel_p (cfun)"
> +{
> + return rs6000_pltseq_template (operands, 4);

Maybe those "4" magic constants should be an enum?

> +int zz0 ()
> +{
> + asm ("");
> + return 16;
> +};

You might want to put in a comment what this asm is for.

Please consider those things. Okay for trunk with that. Thanks!

Segher
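The request to invert the ABI_V4 test can be read as an application of De Morgan's laws. What follows is only a sketch of that reading, assuming the five inputs can be modelled as plain booleans. The function names and parameters are hypothetical stand-ins and do not exist in rs6000.c; the first function mirrors the condition as posted, the second puts the "return false" case in the if, and the third checks that they agree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the final test in rs6000_decl_ok_for_sibcall.
   is_v4       stands for DEFAULT_ABI == ABI_V4
   secure_plt  stands for TARGET_SECURE_PLT
   pic         stands for flag_pic != 0
   has_decl    stands for decl != NULL
   binds_local stands for (*targetm.binds_local_p) (decl)  */

/* The condition as posted: the "return true" case is in the if.  */
static bool
v4_sibcall_ok_posted (bool is_v4, bool secure_plt, bool pic,
                      bool has_decl, bool binds_local)
{
  if (is_v4 && (!secure_plt || !pic || (has_decl && binds_local)))
    return true;

  return false;
}

/* The same condition negated, so the if now holds the case the
   comment describes: secure-plt, PIC, and a call that is not local.  */
static bool
v4_sibcall_ok_inverted (bool is_v4, bool secure_plt, bool pic,
                        bool has_decl, bool binds_local)
{
  if (!is_v4 || (secure_plt && pic && !(has_decl && binds_local)))
    return false;

  return true;
}

/* Exhaustively confirm that both forms agree on all 32 inputs.  */
static void
check_inversion (void)
{
  for (int m = 0; m < 32; m++)
    {
      bool a = m & 1, b = (m >> 1) & 1, c = (m >> 2) & 1;
      bool d = (m >> 3) & 1, e = (m >> 4) & 1;
      assert (v4_sibcall_ok_posted (a, b, c, d, e)
              == v4_sibcall_ok_inverted (a, b, c, d, e));
    }
}
```

Calling check_inversion () walks every combination of inputs, which is cheap here and avoids relying on the algebra alone.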
On Wed, May 29, 2019 at 07:40:46AM -0500, Segher Boessenkool wrote:
> All necessary linker (and binutils and GAS) support is upstream already, right?

I believe so, except gold support is lacking right now.

> > pld 12,0(0),1
> > .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
>
> Are we guaranteed the assembler always writes a pld like this as 8 bytes?

Strictly speaking the assembler might nop pad *before* the pld making
a total of 12 bytes, and that's the reason to put the .reloc *after*
the prefix instruction.
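To illustrate why the placement of the .reloc matters, here is a small model of the padding. It assumes (this is not stated in the thread) that the pad is a single 4-byte nop inserted when an 8-byte prefixed instruction would otherwise cross a 64-byte boundary. The names struct emitted and emit_prefixed are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Result of placing one prefixed instruction.  */
struct emitted
{
  uint64_t insn_start;  /* address of the prefix word */
  uint64_t dot_after;   /* value of "." right after the instruction */
  unsigned bytes;       /* bytes emitted, padding included */
};

/* Assumption: an 8-byte prefixed instruction may not straddle a
   64-byte boundary, so a start at offset 60 mod 64 gets one nop
   in front of it.  */
static struct emitted
emit_prefixed (uint64_t dot_before)
{
  assert (dot_before % 4 == 0);

  struct emitted e;
  unsigned pad = (dot_before % 64 == 60) ? 4 : 0;

  e.insn_start = dot_before + pad;
  e.dot_after = e.insn_start + 8;
  e.bytes = pad + 8;
  return e;
}
```

With dot_before equal to 60, a reloc written before the instruction at "." would land on the padding nop at 60, while ".-8" written afterwards still names the prefix word at 64. The 12-byte worst case also matches the "length" attribute of the new *pltseq_plt_pcrel<mode> pattern.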
On 5/29/19 7:40 AM, Segher Boessenkool wrote:
> Please consider those things. Okay for trunk with that. Thanks!

Thanks! Will make appropriate changes and commit. Much obliged for the review!

Bill
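Among the review suggestions was replacing the magic "4" passed to rs6000_pltseq_template with an enum. A minimal sketch of what that could look like follows. The enumerator names and the helper pltseq_reloc_backoff are hypothetical and not taken from the patch; only the numeric values and their meanings come from the comment the patch adds above rs6000_pltseq_template.

```c
#include <assert.h>

/* Hypothetical names for the WHICH argument of rs6000_pltseq_template.
   Values follow the comment in the patch: 0 for tocsave, 1 for
   plt16_ha, 2 for plt16_lo, 3 for mtctr, 4 for plt_pcrel34.  */
enum pltseq_which
{
  PLTSEQ_TOCSAVE = 0,
  PLTSEQ_PLT16_HA = 1,
  PLTSEQ_PLT16_LO = 2,
  PLTSEQ_MTCTR = 3,
  PLTSEQ_PLT_PCREL34 = 4
};

/* Distance in bytes from "." back to the start of the instruction that
   a TLS marker reloc applies to.  This mirrors "which == 4 ? '8' : '4'"
   in the patch: only the pcrel load is a two-word instruction.  */
static int
pltseq_reloc_backoff (enum pltseq_which which)
{
  assert ((int) which >= 0 && (int) which <= PLTSEQ_PLT_PCREL34);
  return which == PLTSEQ_PLT_PCREL34 ? 8 : 4;
}
```

A call site in the machine description would then read rs6000_pltseq_template (operands, PLTSEQ_PLT_PCREL34), which says what is being emitted without a trip to the comment.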
On Thu, May 30, 2019 at 12:44:35AM +0930, Alan Modra wrote:
> On Wed, May 29, 2019 at 07:40:46AM -0500, Segher Boessenkool wrote:
> > All necessary linker (and binutils and GAS) support is upstream already, right?
>
> I believe so, except gold support is lacking right now.

Excellent :-)

> > > pld 12,0(0),1
> > > .reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
> >
> > Are we guaranteed the assembler always writes a pld like this as 8 bytes?
>
> Strictly speaking the assembler might nop pad *before* the pld making
> a total of 12 bytes, and that's the reason to put the .reloc *after*
> the prefix instruction.

Ah, okay. That probably warrants a comment...

Thanks,

Segher
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 3d5cf9e4ece..9229bad6acc 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -21268,7 +21268,9 @@ rs6000_call_template_1 (rtx *operands, unsigned int funop, bool sibcall) ? "+32768" : "")); static char str[32]; /* 2 spare */ - if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) + if (rs6000_pcrel_p (cfun)) + sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg); + else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg, sibcall ? "" : "\n\tnop"); else if (DEFAULT_ABI == ABI_V4) @@ -21333,6 +21335,16 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop, /* Currently, funop is either 0 or 1. The maximum string is always a !speculate 64-bit __tls_get_addr call. + ABI_ELFv2, pcrel: + . 27 .reloc .,R_PPC64_TLSGD,%2\n\t + . 35 .reloc .,R_PPC64_PLTSEQ_NOTOC,%z1\n\t + . 9 crset 2\n\t + . 27 .reloc .,R_PPC64_TLSGD,%2\n\t + . 36 .reloc .,R_PPC64_PLTCALL_NOTOC,%z1\n\t + . 8 beq%T1l- + .--- + .142 + ABI_AIX: . 9 ld 2,%3\n\t . 27 .reloc .,R_PPC64_TLSGD,%2\n\t @@ -21398,23 +21410,31 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop, gcc_unreachable (); } + const char *notoc = rs6000_pcrel_p (cfun) ? "_NOTOC" : ""; const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT && flag_pic == 2 ? 
"+32768" : ""); if (!speculate) { s += sprintf (s, - "%s.reloc .,R_PPC%s_PLTSEQ,%%z%u%s\n\t", - tls, rel64, funop, addend); + "%s.reloc .,R_PPC%s_PLTSEQ%s,%%z%u%s\n\t", + tls, rel64, notoc, funop, addend); s += sprintf (s, "crset 2\n\t"); } s += sprintf (s, - "%s.reloc .,R_PPC%s_PLTCALL,%%z%u%s\n\t", - tls, rel64, funop, addend); + "%s.reloc .,R_PPC%s_PLTCALL%s,%%z%u%s\n\t", + tls, rel64, notoc, funop, addend); } else if (!speculate) s += sprintf (s, "crset 2\n\t"); - if (DEFAULT_ABI == ABI_AIX) + if (rs6000_pcrel_p (cfun)) + { + if (speculate) + sprintf (s, "b%%T%ul", funop); + else + sprintf (s, "beq%%T%ul-", funop); + } + else if (DEFAULT_ABI == ABI_AIX) { if (speculate) sprintf (s, @@ -21468,63 +21488,73 @@ rs6000_indirect_sibcall_template (rtx *operands, unsigned int funop) #if HAVE_AS_PLTSEQ /* Output indirect call insns. - WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr. */ + WHICH is 0 for tocsave, 1 for plt16_ha, 2 for plt16_lo, 3 for mtctr, + 4 for plt_pcrel34. */ const char * rs6000_pltseq_template (rtx *operands, int which) { const char *rel64 = TARGET_64BIT ? "64" : ""; - char tls[28]; + char tls[30]; tls[0] = 0; if (TARGET_TLS_MARKERS && GET_CODE (operands[3]) == UNSPEC) { + char off = which == 4 ? '8' : '4'; if (XINT (operands[3], 1) == UNSPEC_TLSGD) - sprintf (tls, ".reloc .,R_PPC%s_TLSGD,%%3\n\t", - rel64); + sprintf (tls, ".reloc .-%c,R_PPC%s_TLSGD,%%3\n\t", + off, rel64); else if (XINT (operands[3], 1) == UNSPEC_TLSLD) - sprintf (tls, ".reloc .,R_PPC%s_TLSLD,%%&\n\t", - rel64); + sprintf (tls, ".reloc .-%c,R_PPC%s_TLSLD,%%&\n\t", + off, rel64); else gcc_unreachable (); } gcc_assert (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_V4); - static char str[96]; /* 15 spare */ - const char *off = WORDS_BIG_ENDIAN ? "+2" : ""; + static char str[96]; /* 10 spare */ + char off = WORDS_BIG_ENDIAN ? '2' : '4'; const char *addend = (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT && flag_pic == 2 ? 
"+32768" : ""); switch (which) { case 0: sprintf (str, - "%s.reloc .,R_PPC%s_PLTSEQ,%%z2\n\t" - "st%s", - tls, rel64, TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)"); + "st%s\n\t" + "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2", + TARGET_64BIT ? "d 2,24(1)" : "w 2,12(1)", + tls, rel64); break; case 1: if (DEFAULT_ABI == ABI_V4 && !flag_pic) sprintf (str, - "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2\n\t" - "lis %%0,0", + "lis %%0,0\n\t" + "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2", tls, off, rel64); else sprintf (str, - "%s.reloc .%s,R_PPC%s_PLT16_HA,%%z2%s\n\t" - "addis %%0,%%1,0", + "addis %%0,%%1,0\n\t" + "%s.reloc .-%c,R_PPC%s_PLT16_HA,%%z2%s", tls, off, rel64, addend); break; case 2: sprintf (str, - "%s.reloc .%s,R_PPC%s_PLT16_LO%s,%%z2%s\n\t" - "l%s %%0,0(%%1)", - tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend, - TARGET_64BIT ? "d" : "wz"); + "l%s %%0,0(%%1)\n\t" + "%s.reloc .-%c,R_PPC%s_PLT16_LO%s,%%z2%s", + TARGET_64BIT ? "d" : "wz", + tls, off, rel64, TARGET_64BIT ? "_DS" : "", addend); break; case 3: sprintf (str, - "%s.reloc .,R_PPC%s_PLTSEQ,%%z2%s\n\t" - "mtctr %%1", + "mtctr %%1\n\t" + "%s.reloc .-4,R_PPC%s_PLTSEQ,%%z2%s", tls, rel64, addend); break; + case 4: + sprintf (str, + "pl%s %%0,0(0),1\n\t" + "%s.reloc .-8,R_PPC%s_PLT_PCREL34_NOTOC,%%z2", + TARGET_64BIT ? "d" : "wz", + tls, rel64); + break; default: gcc_unreachable (); } @@ -24703,6 +24733,53 @@ rs6000_return_addr (int count, rtx frame) return get_hard_reg_initial_val (Pmode, LR_REGNO); } +/* Helper function for rs6000_function_ok_for_sibcall. */ + +static bool +rs6000_decl_ok_for_sibcall (tree decl) +{ + /* Sibcalls are always fine for the Darwin ABI. */ + if (DEFAULT_ABI == ABI_DARWIN) + return true; + + if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) + { + /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local + functions, because the callee may have a different TOC pointer to + the caller and there's no way to ensure we restore the TOC when + we return. 
*/ + if (!decl || DECL_EXTERNAL (decl) || DECL_WEAK (decl) + || !(*targetm.binds_local_p) (decl)) + return false; + + /* Similarly, if the caller preserves the TOC pointer and the callee + doesn't (or vice versa), proper TOC setup or restoration will be + missed. For example, suppose A, B, and C are in the same binary + and A -> B -> C. A and B preserve the TOC pointer but C does not, + and B -> C is eligible as a sibcall. A will call B through its + local entry point, so A will not restore its TOC itself. B calls + C with a sibcall, so it will not restore the TOC. C does not + preserve the TOC, so it may clobber r2 with impunity. Returning + from C will result in a corrupted TOC for A. */ + else if (rs6000_fndecl_pcrel_p (decl) != rs6000_pcrel_p (cfun)) + return false; + + else + return true; + } + + /* With the secure-plt SYSV ABI we can't make non-local calls when + -fpic/PIC because the plt call stubs use r30. */ + if (DEFAULT_ABI == ABI_V4 + && (!TARGET_SECURE_PLT + || !flag_pic + || (decl + && (*targetm.binds_local_p) (decl)))) + return true; + + return false; +} + /* Say whether a function is a candidate for sibcall handling or not. */ static bool @@ -24748,22 +24825,7 @@ rs6000_function_ok_for_sibcall (tree decl, tree exp) return false; } - /* Under the AIX or ELFv2 ABIs we can't allow calls to non-local - functions, because the callee may have a different TOC pointer to - the caller and there's no way to ensure we restore the TOC when - we return. With the secure-plt SYSV ABI we can't make non-local - calls when -fpic/PIC because the plt call stubs use r30. 
*/ - if (DEFAULT_ABI == ABI_DARWIN - || ((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) - && decl - && !DECL_EXTERNAL (decl) - && !DECL_WEAK (decl) - && (*targetm.binds_local_p) (decl)) - || (DEFAULT_ABI == ABI_V4 - && (!TARGET_SECURE_PLT - || !flag_pic - || (decl - && (*targetm.binds_local_p) (decl))))) + if (rs6000_decl_ok_for_sibcall (decl)) { tree attr_list = TYPE_ATTRIBUTES (fntype); @@ -32592,12 +32654,18 @@ rs6000_longcall_ref (rtx call_ref, rtx arg) if (TARGET_PLTSEQ) { rtx base = const0_rtx; - int regno; - if (DEFAULT_ABI == ABI_ELFv2) + int regno = 12; + if (rs6000_pcrel_p (cfun)) { - base = gen_rtx_REG (Pmode, TOC_REGISTER); - regno = 12; + rtx reg = gen_rtx_REG (Pmode, regno); + rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg), + UNSPEC_PLT_PCREL); + emit_insn (gen_rtx_SET (reg, u)); + return reg; } + + if (DEFAULT_ABI == ABI_ELFv2) + base = gen_rtx_REG (Pmode, TOC_REGISTER); else { if (flag_pic) @@ -37706,37 +37774,38 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie) if (!SYMBOL_REF_P (func) || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (func))) { - /* Save the TOC into its reserved slot before the call, - and prepare to restore it after the call. */ - rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT); - rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode, - gen_rtvec (1, stack_toc_offset), - UNSPEC_TOCSLOT); - toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec); - - /* Can we optimize saving the TOC in the prologue or - do we need to do it at every call? 
*/ - if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca) - cfun->machine->save_toc_in_prologue = true; - else + if (!rs6000_pcrel_p (cfun)) { - rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); - rtx stack_toc_mem = gen_frame_mem (Pmode, - gen_rtx_PLUS (Pmode, stack_ptr, - stack_toc_offset)); - MEM_VOLATILE_P (stack_toc_mem) = 1; - if (is_pltseq_longcall) + /* Save the TOC into its reserved slot before the call, + and prepare to restore it after the call. */ + rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT); + rtx stack_toc_unspec = gen_rtx_UNSPEC (Pmode, + gen_rtvec (1, stack_toc_offset), + UNSPEC_TOCSLOT); + toc_restore = gen_rtx_SET (toc_reg, stack_toc_unspec); + + /* Can we optimize saving the TOC in the prologue or + do we need to do it at every call? */ + if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca) + cfun->machine->save_toc_in_prologue = true; + else { - /* Use USPEC_PLTSEQ here to emit every instruction in an - inline PLT call sequence with a reloc, enabling the - linker to edit the sequence back to a direct call - when that makes sense. 
*/ - rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg); - rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ); - emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg)); + rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); + rtx stack_toc_mem = gen_frame_mem (Pmode, + gen_rtx_PLUS (Pmode, stack_ptr, + stack_toc_offset)); + MEM_VOLATILE_P (stack_toc_mem) = 1; + if (HAVE_AS_PLTSEQ + && DEFAULT_ABI == ABI_ELFv2 + && GET_CODE (func_desc) == SYMBOL_REF) + { + rtvec v = gen_rtvec (3, toc_reg, func_desc, tlsarg); + rtx mark_toc_reg = gen_rtx_UNSPEC (Pmode, v, UNSPEC_PLTSEQ); + emit_insn (gen_rtx_SET (stack_toc_mem, mark_toc_reg)); + } + else + emit_move_insn (stack_toc_mem, toc_reg); } - else - emit_move_insn (stack_toc_mem, toc_reg); } if (DEFAULT_ABI == ABI_ELFv2) @@ -37813,10 +37882,12 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie) } else { - /* Direct calls use the TOC: for local calls, the callee will - assume the TOC register is set; for non-local calls, the - PLT stub needs the TOC register. */ - abi_reg = toc_reg; + /* No TOC register needed for calls from PC-relative callers. */ + if (!rs6000_pcrel_p (cfun)) + /* Direct calls use the TOC: for local calls, the callee will + assume the TOC register is set; for non-local calls, the + PLT stub needs the TOC register. */ + abi_reg = toc_reg; func_addr = func; } @@ -37866,7 +37937,9 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie) insn = emit_call_insn (insn); /* Note use of the TOC register. */ - use_reg (&CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (Pmode, TOC_REGNUM)); + if (!rs6000_pcrel_p (cfun)) + use_reg (&CALL_INSN_FUNCTION_USAGE (insn), + gen_rtx_REG (Pmode, TOC_REGNUM)); } /* Expand code to perform a call under the SYSV4 ABI. 
*/ diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 71613e21384..e1d9045c5bb 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -147,6 +147,7 @@ UNSPEC_PLTSEQ UNSPEC_PLT16_HA UNSPEC_PLT16_LO + UNSPEC_PLT_PCREL ]) ;; @@ -10267,6 +10268,20 @@ { return rs6000_pltseq_template (operands, 3); }) + +(define_insn "*pltseq_plt_pcrel<mode>" + [(set (match_operand:P 0 "gpc_reg_operand" "=r") + (unspec:P [(match_operand:P 1 "" "") + (match_operand:P 2 "symbol_ref_operand" "s") + (match_operand:P 3 "" "")] + UNSPEC_PLT_PCREL))] + "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS + && rs6000_pcrel_p (cfun)" +{ + return rs6000_pltseq_template (operands, 4); +} + [(set_attr "type" "load") + (set_attr "length" "12")]) ;; Call and call_value insns ;; For the purposes of expanding calls, Darwin is very similar to SYSV. @@ -10582,7 +10597,11 @@ (match_operand 1)) (clobber (reg:P LR_REGNO))] "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2" - "bl %z0" +{ + if (rs6000_pcrel_p (cfun)) + return "bl %z0@notoc"; + return "bl %z0"; +} [(set_attr "type" "branch")]) (define_insn "*call_value_local_aix<mode>" @@ -10592,7 +10611,11 @@ (clobber (reg:P LR_REGNO))] "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) && !IS_NOMARK_TLSGETADDR (operands[2])" - "bl %z1" +{ + if (rs6000_pcrel_p (cfun)) + return "bl %z1@notoc"; + return "bl %z1"; +} [(set_attr "type" "branch")]) ;; Call to AIX abi function which may be in another module. 
@@ -10607,7 +10630,10 @@ return rs6000_call_template (operands, 0); } [(set_attr "type" "branch") - (set_attr "length" "8")]) + (set (attr "length") + (if_then_else (match_test "rs6000_pcrel_p (cfun)") + (const_int 4) + (const_int 8)))]) (define_insn "*call_value_nonlocal_aix<mode>" [(set (match_operand 0 "" "") @@ -10623,11 +10649,14 @@ } [(set_attr "type" "branch") (set (attr "length") - (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])") - (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL") - (const_int 16) - (const_int 12)) - (const_int 8)))]) + (plus (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])") + (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL") + (const_int 8) + (const_int 4)) + (const_int 0)) + (if_then_else (match_test "rs6000_pcrel_p (cfun)") + (const_int 4) + (const_int 8))))]) ;; Call to indirect functions with the AIX abi using a 3 word descriptor. ;; Operand0 is the addresss of the function to call @@ -10700,6 +10729,21 @@ (const_string "12") (const_string "8")))]) +(define_insn "*call_indirect_pcrel<mode>" + [(call (mem:SI (match_operand:P 0 "indirect_call_operand" "c,*l,X")) + (match_operand 1)) + (clobber (reg:P LR_REGNO))] + "rs6000_pcrel_p (cfun)" +{ + return rs6000_indirect_call_template (operands, 0); +} + [(set_attr "type" "jmpreg") + (set (attr "length") + (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps") + (match_test "which_alternative != 1")) + (const_string "8") + (const_string "4")))]) + (define_insn "*call_value_indirect_elfv2<mode>" [(set (match_operand 0 "" "") (call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X")) @@ -10728,6 +10772,31 @@ (const_string "12") (const_string "8"))))]) +(define_insn "*call_value_indirect_pcrel<mode>" + [(set (match_operand 0 "" "") + (call (mem:SI (match_operand:P 1 "indirect_call_operand" "c,*l,X")) + (match_operand:P 2 "unspec_tls" ""))) + (clobber (reg:P LR_REGNO))] + "rs6000_pcrel_p (cfun)" +{ + if (IS_NOMARK_TLSGETADDR 
(operands[2])) + rs6000_output_tlsargs (operands); + + return rs6000_indirect_call_template (operands, 1); +} + [(set_attr "type" "jmpreg") + (set (attr "length") + (plus + (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])") + (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL") + (const_int 8) + (const_int 4)) + (const_int 0)) + (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps") + (match_test "which_alternative != 1")) + (const_string "8") + (const_string "4"))))]) + ;; Call subroutine returning any type. (define_expand "untyped_call" [(parallel [(call (match_operand 0 "") diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c new file mode 100644 index 00000000000..c7d322c1c96 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c @@ -0,0 +1,41 @@ ++/* { dg-do compile } */ ++/* { dg-options "-mdejagnu-cpu=future -O2" } */ ++/* { dg-require-effective-target powerpc_elfv2 } */ + +/* Test that calls generated from PC-relative code are + annotated with @notoc. 
*/ + +extern int yy0 (int); +extern void yy1 (int); + +int zz0 (void) __attribute__((noinline)); +void zz1 (int) __attribute__((noinline)); + +int xx (void) +{ + yy1 (7); + return yy0 (5); +} + +int zz0 () +{ + asm (""); + return 16; +}; + +void zz1 (int a __attribute__((__unused__))) +{ + asm (""); +}; + +int ww (void) +{ + zz1 (zz0 ()); + return 4; +} + +/* { dg-final { scan-assembler {yy1@notoc} } } */ +/* { dg-final { scan-assembler {yy0@notoc} } } */ +/* { dg-final { scan-assembler {zz1@notoc} } } */ +/* { dg-final { scan-assembler {zz0@notoc} } } */ + diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c new file mode 100644 index 00000000000..7c767e2ba32 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pcrel-sibcall-1.c @@ -0,0 +1,46 @@ +/* { dg-do compile } */ +/* { dg-options "-mdejagnu-cpu=future -O2" } */ +/* { dg-require-effective-target powerpc_elfv2 } */ + +/* Test that potential sibcalls are not generated when the caller preserves + the TOC and the callee doesn't, or vice versa. */ + +int x (void) __attribute__((noinline)); +int y (void) __attribute__((noinline)); +int xx (void) __attribute__((noinline)); + +int x (void) +{ + return 1; +} + +int y (void) +{ + return 2; +} + +int sib_call (void) +{ + return x (); +} + +#pragma GCC target ("cpu=power9") +int normal_call (void) +{ + return y (); +} + +int xx (void) +{ + return 1; +} + +#pragma GCC target ("cpu=future") +int notoc_call (void) +{ + return xx (); +} + +/* { dg-final { scan-assembler {\mb x@notoc\M} } } */ +/* { dg-final { scan-assembler {\mbl y\M} } } */ +/* { dg-final { scan-assembler {\mbl xx@notoc\M} } } */
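The three scan-assembler patterns in pcrel-sibcall-1.c can be traced with a small model of the AIX/ELFv2 branch of rs6000_decl_ok_for_sibcall. This is a sketch only. It assumes that code built for cpu=future uses PC-relative addressing and code built for cpu=power9 uses the TOC, and it leaves out the nop that can follow a TOC-based call. struct fn_model and the model_* functions are hypothetical names, not part of GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of what the real code reads from a FUNCTION_DECL.  */
struct fn_model
{
  const char *name;
  bool pcrel;        /* built with PC-relative addressing (no TOC) */
  bool binds_local;  /* defined here, neither external nor weak */
};

/* A sibcall needs a local callee that agrees with the caller on
   whether r2 is preserved.  */
static bool
model_sibcall_ok (const struct fn_model *caller,
                  const struct fn_model *callee)
{
  if (!callee->binds_local)
    return false;

  return caller->pcrel == callee->pcrel;
}

/* Write the branch a tail-position call would use into BUF.  The
   @notoc suffix depends on the caller only.  */
static const char *
model_tail_call (const struct fn_model *caller,
                 const struct fn_model *callee, char *buf, size_t len)
{
  assert (buf != NULL && len > 0);
  snprintf (buf, len, "b%s %s%s",
            model_sibcall_ok (caller, callee) ? "" : "l",
            callee->name,
            caller->pcrel ? "@notoc" : "");
  return buf;
}

/* Replay the three calls in pcrel-sibcall-1.c against the model.  */
static bool
model_matches_testcase (void)
{
  const struct fn_model x = { "x", true, true };
  const struct fn_model y = { "y", true, true };
  const struct fn_model xx = { "xx", false, true };
  const struct fn_model sib_call = { "sib_call", true, true };
  const struct fn_model normal_call = { "normal_call", false, true };
  const struct fn_model notoc_call = { "notoc_call", true, true };
  char buf[32];

  return strcmp (model_tail_call (&sib_call, &x, buf, sizeof buf),
                 "b x@notoc") == 0
         && strcmp (model_tail_call (&normal_call, &y, buf, sizeof buf),
                    "bl y") == 0
         && strcmp (model_tail_call (&notoc_call, &xx, buf, sizeof buf),
                    "bl xx@notoc") == 0;
}
```

Only sib_call and x agree on PC-relative addressing, so only that pair produces a plain "b"; the two mixed pairs fall back to "bl", as the test expects.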