[RFC,X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=

Message ID CAAs8Hmy5RXk89ea=uxHip-t0-2J2-2PTo_y0uGqrvyCFb1oqnw@mail.gmail.com
State New

Commit Message

Sriraman Tallam June 3, 2015, 6:38 p.m. UTC
>
> I agree now that it will be much cleaner just to punt this into the backend,
> so it may be worth noting that making this work properly for the non-PIC
> case requires quite a degree of massaging in the backends.
>
> Objections withdrawn.

Thanks! I have attached the latest patch after making the changes
Bernhard suggested.  I also added a comment saying that the non-PIC case
needs to be handled specially by the backend.

	* c-family/c-common.c (no_plt): New attribute.
	(handle_no_plt_attribute): New handler.
	* calls.c (prepare_call_address): Check for no_plt
	attribute.
	* config/i386/i386.c (ix86_function_ok_for_sibcall): Check
	for no_plt attribute.
	(ix86_expand_call): Ditto.
	(ix86_nopic_no_plt_attribute_p): New function.
	(ix86_output_call_insn): Output indirect call for non-pic
	no plt calls.
	* doc/extend.texi (no_plt): Document new attribute.
	* doc/invoke.texi: Document new attribute.
	* testsuite/gcc.target/i386/noplt-1.c: New test.
	* testsuite/gcc.target/i386/noplt-2.c: New test.
	* testsuite/gcc.target/i386/noplt-3.c: New test.
	* testsuite/gcc.target/i386/noplt-4.c: New test.

This patch does two things:

* Adds a new generic function attribute "no_plt" that is similar in
  functionality to -fno-plt, except that it applies only to calls to
  functions that are marked with this attribute.
* For x86_64, makes -fno-plt (and the attribute) also work for
  non-PIC code by directly generating an indirect call via a GOT entry.
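
To make the second point concrete, here is a minimal C model of what a
GOT-indirect call amounts to.  It is only a sketch: external_foo,
foo_got_slot, model_load_time_bind and call_via_got are hypothetical
names invented for illustration, and a real GOT entry is created by the
linker and filled in by the dynamic loader, not by program code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a function defined in another shared object
   (hypothetical, for illustration only).  */
static int
external_foo (int x)
{
  return x + 1;
}

/* Model of the GOT entry for foo: a pointer-sized slot that holds the
   function's final address.  */
static int (*foo_got_slot) (int) = NULL;

/* Model of binding at load time: the slot is filled in once, before
   any call goes through it.  */
static void
model_load_time_bind (void)
{
  foo_got_slot = external_foo;
}

/* Model of the call site.  With the patch, non-PIC x86_64 code is meant
   to emit "call *foo@GOTPCREL(%rip)": load the address from the slot
   and call through it, with no PLT stub in between.  */
static int
call_via_got (int x)
{
  assert (foo_got_slot != NULL);	/* Slot must be bound before use.  */
  return foo_got_slot (x);
}
```

Calling model_load_time_bind () once and then call_via_got (41) reaches
external_foo through the slot.  The model also shows the trade-off noted
in the -fno-plt documentation: the slot has to be resolved up front, so
lazy binding is not available for such calls.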


Sri


>
> Thanks,
> Ramana

Comments

Richard Henderson June 3, 2015, 8:09 p.m. UTC | #1
On 06/03/2015 11:38 AM, Sriraman Tallam wrote:
> +  { "no_plt",                   0, 0, true,  false, false,
> +			      handle_no_plt_attribute, false },

Call it noplt.  We don't add the underscore for noinline, noclone, etc.



> Index: config/i386/i386.c
> ===================================================================
> --- config/i386/i386.c	(revision 223720)
> +++ config/i386/i386.c	(working copy)
> @@ -5479,7 +5479,10 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
>        && !TARGET_64BIT
>        && flag_pic
>        && flag_plt
> -      && decl && !targetm.binds_local_p (decl))
> +      && decl
> +      && (TREE_CODE (decl) != FUNCTION_DECL
> +	  || !lookup_attribute ("no_plt", DECL_ATTRIBUTES (decl)))
> +      && !targetm.binds_local_p (decl))
>      return false;
>  
>    /* If we need to align the outgoing stack, then sibcalling would

Is this really necessary?  I'd expect DECL to be NULL in this case,
since the non-use of the PLT will mean that the (sib)call is indirect.


> @@ -25497,13 +25500,19 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call
>      }
>    else
>      {
> -      /* Static functions and indirect calls don't need the pic register.  */
> +      /* Static functions and indirect calls don't need the pic register.  Also,
> +	 check if PLT was explicitly avoided via no-plt or "no_plt" attribute, making
> +	 it an indirect call.  */
>        if (flag_pic
>  	  && (!TARGET_64BIT
>  	      || (ix86_cmodel == CM_LARGE_PIC
>  		  && DEFAULT_ABI != MS_ABI))
>  	  && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
> -	  && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0)))
> +	  && !SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0))
> +	  && flag_plt
> +	  && (TREE_CODE (SYMBOL_REF_DECL (XEXP(fnaddr, 0))) != FUNCTION_DECL
> +	      || !lookup_attribute ("no_plt",
> +		     DECL_ATTRIBUTES (SYMBOL_REF_DECL (XEXP(fnaddr, 0))))))
>  	{
>  	  use_reg (&use, gen_rtx_REG (Pmode, REAL_PIC_OFFSET_TABLE_REGNUM));
>  	  if (ix86_use_pseudo_pic_reg ())

Why are you testing FUNCTION_DECL?  Even if, somehow, the user were producing a
function call to a data symbol, why do you think that lookup_attribute would
produce incorrect results?

Similarly in ix86_nopic_no_plt_attribute_p.
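
The point about lookup_attribute can be illustrated with a small
stand-alone model.  This is not GCC's implementation: struct model_attr,
struct model_decl, model_lookup_attribute and model_avoids_plt are
hypothetical stand-ins for a decl and the attribute list hanging off it.
The sketch only shows that a by-name search over an attribute list gives
a well-defined "not found" answer for any kind of decl, so a kind check
in front of it adds nothing as long as the attribute handler never
attaches the attribute to non-functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decl kinds, loosely mirroring FUNCTION_DECL and VAR_DECL.  */
enum model_code { MODEL_FUNCTION_DECL, MODEL_VAR_DECL };

/* Hypothetical singly linked attribute list.  */
struct model_attr
{
  const char *name;
  const struct model_attr *next;
};

/* Hypothetical decl: a kind plus its attribute list (possibly empty).  */
struct model_decl
{
  enum model_code code;
  const struct model_attr *attributes;
};

/* Return the first attribute in LIST called NAME, or NULL if none.
   An empty list is handled like any other list.  */
static const struct model_attr *
model_lookup_attribute (const char *name, const struct model_attr *list)
{
  for (; list != NULL; list = list->next)
    if (strcmp (list->name, name) == 0)
      return list;
  return NULL;
}

/* Return 1 if DECL carries "no_plt", else 0.  Note that DECL's kind is
   never inspected: the lookup alone gives the right answer.  */
static int
model_avoids_plt (const struct model_decl *decl)
{
  assert (decl != NULL);
  return model_lookup_attribute ("no_plt", decl->attributes) != NULL;
}
```

In this model a variable decl with no attributes simply yields 0, the
same result the extra FUNCTION_DECL test would have produced.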


r~

Patch

Index: c-family/c-common.c
===================================================================
--- c-family/c-common.c	(revision 223720)
+++ c-family/c-common.c	(working copy)
@@ -357,6 +357,7 @@  static tree handle_mode_attribute (tree *, tree, t
 static tree handle_section_attribute (tree *, tree, tree, int, bool *);
 static tree handle_aligned_attribute (tree *, tree, tree, int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
+static tree handle_no_plt_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
 static tree handle_ifunc_attribute (tree *, tree, tree, int, bool *);
 static tree handle_alias_attribute (tree *, tree, tree, int, bool *);
@@ -706,6 +707,8 @@  const struct attribute_spec c_common_attribute_tab
 			      handle_aligned_attribute, false },
   { "weak",                   0, 0, true,  false, false,
 			      handle_weak_attribute, false },
+  { "no_plt",                   0, 0, true,  false, false,
+			      handle_no_plt_attribute, false },
   { "ifunc",                  1, 1, true,  false, false,
 			      handle_ifunc_attribute, false },
   { "alias",                  1, 1, true,  false, false,
@@ -8185,6 +8188,25 @@  handle_weak_attribute (tree *node, tree name,
   return NULL_TREE;
 }
 
+/* Handle a "no_plt" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_plt_attribute (tree *node, tree name,
+		       tree ARG_UNUSED (args),
+		       int ARG_UNUSED (flags),
+		       bool * ARG_UNUSED (no_add_attrs))
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+    {
+      warning (OPT_Wattributes,
+	       "%qE attribute is only applicable on functions", name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+  return NULL_TREE;
+}
+
 /* Handle an "alias" or "ifunc" attribute; arguments as in
    struct attribute_spec.handler, except that IS_ALIAS tells us
    whether this is an alias as opposed to ifunc attribute.  */
Index: calls.c
===================================================================
--- calls.c	(revision 223720)
+++ calls.c	(working copy)
@@ -226,10 +226,16 @@  prepare_call_address (tree fndecl_or_type, rtx fun
 	       && targetm.small_register_classes_for_mode_p (FUNCTION_MODE))
 	      ? force_not_mem (memory_address (FUNCTION_MODE, funexp))
 	      : memory_address (FUNCTION_MODE, funexp));
-  else if (flag_pic && !flag_plt && fndecl_or_type
+  else if (flag_pic
+	   && fndecl_or_type
 	   && TREE_CODE (fndecl_or_type) == FUNCTION_DECL
+	   && (!flag_plt
+	       || lookup_attribute ("no_plt", DECL_ATTRIBUTES (fndecl_or_type)))
 	   && !targetm.binds_local_p (fndecl_or_type))
     {
+      /* This is done only for PIC code.  There is no easy interface to force the
+	 function address into GOT for non-PIC case.  non-PIC case needs to be
+	 handled specially by the backend.  */
       funexp = force_reg (Pmode, funexp);
     }
   else if (! sibcallp)
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 223720)
+++ config/i386/i386.c	(working copy)
@@ -5479,7 +5479,10 @@  ix86_function_ok_for_sibcall (tree decl, tree exp)
       && !TARGET_64BIT
       && flag_pic
       && flag_plt
-      && decl && !targetm.binds_local_p (decl))
+      && decl
+      && (TREE_CODE (decl) != FUNCTION_DECL
+	  || !lookup_attribute ("no_plt", DECL_ATTRIBUTES (decl)))
+      && !targetm.binds_local_p (decl))
     return false;
 
   /* If we need to align the outgoing stack, then sibcalling would
@@ -25497,13 +25500,19 @@  ix86_expand_call (rtx retval, rtx fnaddr, rtx call
     }
   else
     {
-      /* Static functions and indirect calls don't need the pic register.  */
+      /* Static functions and indirect calls don't need the pic register.  Also,
+	 check if PLT was explicitly avoided via no-plt or "no_plt" attribute, making
+	 it an indirect call.  */
       if (flag_pic
 	  && (!TARGET_64BIT
 	      || (ix86_cmodel == CM_LARGE_PIC
 		  && DEFAULT_ABI != MS_ABI))
 	  && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
-	  && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0)))
+	  && !SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0))
+	  && flag_plt
+	  && (TREE_CODE (SYMBOL_REF_DECL (XEXP(fnaddr, 0))) != FUNCTION_DECL
+	      || !lookup_attribute ("no_plt",
+		     DECL_ATTRIBUTES (SYMBOL_REF_DECL (XEXP(fnaddr, 0))))))
 	{
 	  use_reg (&use, gen_rtx_REG (Pmode, REAL_PIC_OFFSET_TABLE_REGNUM));
 	  if (ix86_use_pseudo_pic_reg ())
@@ -25598,7 +25607,32 @@  ix86_expand_call (rtx retval, rtx fnaddr, rtx call
 
   return call;
 }
+/* Return true if the function being called was marked with attribute "no_plt"
+   or using -fno-plt and we are compiling for non-PIC and x86_64.  We need to
+   handle the non-PIC case in the backend because there is no easy interface
+   for the front-end to force non-PLT calls to use the GOT.  This is currently
+   used only with 64-bit ELF targets to call the function marked "no_plt"
+   indirectly.  */
 
+static bool
+ix86_nopic_no_plt_attribute_p (rtx call_op)
+{
+  if (flag_pic || ix86_cmodel == CM_LARGE
+      || !TARGET_64BIT || TARGET_MACHO|| TARGET_SEH || TARGET_PECOFF
+      || SYMBOL_REF_LOCAL_P (call_op))
+    return false;
+
+  tree symbol_decl = SYMBOL_REF_DECL (call_op);
+
+  if (symbol_decl != NULL_TREE
+      && TREE_CODE (symbol_decl) == FUNCTION_DECL
+      && (!flag_plt
+          || lookup_attribute ("no_plt", DECL_ATTRIBUTES (symbol_decl))))
+    return true;
+
+  return false;
+}
+
 /* Output the assembly for a call instruction.  */
 
 const char *
@@ -25610,7 +25644,9 @@  ix86_output_call_insn (rtx_insn *insn, rtx call_op
 
   if (SIBLING_CALL_P (insn))
     {
-      if (direct_p)
+      if (direct_p && ix86_nopic_no_plt_attribute_p (call_op))
+	xasm = "%!jmp\t*%p0@GOTPCREL(%%rip)";
+      else if (direct_p)
 	xasm = "%!jmp\t%P0";
       /* SEH epilogue detection requires the indirect branch case
 	 to include REX.W.  */
@@ -25653,7 +25689,9 @@  ix86_output_call_insn (rtx_insn *insn, rtx call_op
 	seh_nop_p = true;
     }
 
-  if (direct_p)
+  if (direct_p && ix86_nopic_no_plt_attribute_p (call_op))
+    xasm = "%!call\t*%p0@GOTPCREL(%%rip)";
+  else if (direct_p)
     xasm = "%!call\t%P0";
   else
     xasm = "%!call\t%A0";
Index: doc/extend.texi
===================================================================
--- doc/extend.texi	(revision 223720)
+++ doc/extend.texi	(working copy)
@@ -2916,6 +2916,35 @@  the standard C library can be guaranteed not to th
 with the notable exceptions of @code{qsort} and @code{bsearch} that
 take function pointer arguments.
 
+@item no_plt
+@cindex @code{no_plt} function attribute
+The @code{no_plt} attribute is the counterpart to option @option{-fno-plt} and
+does not use PLT for calls to functions marked with this attribute in position
+independent code. 
+
+@smallexample
+@group
+/* Externally defined function foo.  */
+int foo () __attribute__ ((no_plt));
+
+int
+main (/* @r{@dots{}} */)
+@{
+  /* @r{@dots{}} */
+  foo ();
+  /* @r{@dots{}} */
+@}
+@end group
+@end smallexample
+
+The @code{no_plt} attribute on function foo tells the compiler to assume that
+the function foo is externally defined and the call to foo must avoid the PLT
+in position independent code.
+
+Additionally, a few targets also convert calls to those functions that are
+marked to not use the PLT to use the GOT instead for non-position independent
+code.
+
 @item optimize
 @cindex @code{optimize} function attribute
 The @code{optimize} attribute is used to specify that a function is to
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 223720)
+++ doc/invoke.texi	(working copy)
@@ -23868,6 +23868,14 @@  PLT stubs expect GOT pointer in a specific registe
 register allocation freedom to the compiler.  Lazy binding requires PLT:
 with @option{-fno-plt} all external symbols are resolved at load time.
 
+Alternatively, function attribute @code{no_plt} can be used to avoid PLT
+for calls to specific external functions by marking those functions with
+this attribute.
+
+Additionally, a few targets also convert calls to those functions that are
+marked to not use the PLT to use the GOT instead for non-position independent
+code.
+
 @item -fno-jump-tables
 @opindex fno-jump-tables
 Do not use jump tables for switch statements even where it would be
Index: testsuite/gcc.target/i386/noplt-1.c
===================================================================
--- testsuite/gcc.target/i386/noplt-1.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-1.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-fno-pic" } */
+
+__attribute__ ((no_plt))
+void foo();
+
+int main()
+{
+  foo();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "call\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */ 
Index: testsuite/gcc.target/i386/noplt-2.c
===================================================================
--- testsuite/gcc.target/i386/noplt-2.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-2.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-O2 -fno-pic" } */
+
+
+__attribute__ ((no_plt))
+int foo();
+
+int main()
+{
+  return foo();
+}
+
+/* { dg-final { scan-assembler "jmp\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */ 
Index: testsuite/gcc.target/i386/noplt-3.c
===================================================================
--- testsuite/gcc.target/i386/noplt-3.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-3.c	(working copy)
@@ -0,0 +1,12 @@ 
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-fno-pic -fno-plt" } */
+
+void foo();
+
+int main()
+{
+  foo();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "call\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */ 
Index: testsuite/gcc.target/i386/noplt-4.c
===================================================================
--- testsuite/gcc.target/i386/noplt-4.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-4.c	(working copy)
@@ -0,0 +1,11 @@ 
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-O2 -fno-pic -fno-plt" } */
+
+int foo();
+
+int main()
+{
+  return foo();
+}
+
+/* { dg-final { scan-assembler "jmp\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */
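
For readers unfamiliar with the dg-final patterns used in the four tests
above, the following sketch checks the same expression against a sample
line using POSIX regex.h.  Several things here are assumptions:
line_matches and noplt_call_pattern are hypothetical names, the pattern
is the scan-assembler string with its Tcl quoting removed and rewritten
as a POSIX extended regular expression, and the sample assembly lines in
the usage note are hand-written, not captured compiler output.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* The noplt-1.c / noplt-3.c pattern after removing Tcl quoting:
   "call", one blank or tab, a literal '*', then foo, then @GOTPCREL(%rip).  */
static const char noplt_call_pattern[] =
  "call[ \t]\\*.*foo.*@GOTPCREL\\(%rip\\)";

/* Return 1 if LINE matches PATTERN, 0 if it does not, and -1 if
   PATTERN fails to compile.  */
static int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && line != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With this helper, a line such as "\tcall\t*foo@GOTPCREL(%rip)" matches,
while a plain direct call such as "\tcall\tfoo" does not, which is the
distinction the tests rely on.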