[RFC,X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=

Message ID CAAs8HmwVab+rgGYbzCWSzTfJ36Cs9fYpQMPn8NfzZcu9eaOayA@mail.gmail.com
State New

Commit Message

Sriraman Tallam June 2, 2015, 7:59 p.m. UTC
On Tue, Jun 2, 2015 at 12:32 PM, Bernhard Reutner-Fischer
<rep.dot.nop@gmail.com> wrote:
> On June 2, 2015 8:15:42 PM GMT+02:00, Sriraman Tallam <tmsriram@google.com> wrote:
> []
>
>>I have now modified this patch.
>>
>>This patch does two things:
>>
>>1) Adds new generic function attribute "no_plt" that is similar in
>>functionality  to -fno-plt except that it applies only to calls to
>>functions that are marked  with this attribute.
>>2) For x86_64, it makes -fno-plt(and the attribute) also work for
>>non-PIC code by  directly generating an indirect call via a GOT entry.
>>
>>For PIC code, no_plt merely shadows the implementation of -fno-plt, no
>>surprises here.
>>
>>* c-family/c-common.c (no_plt): New attribute.
>>(handle_no_plt_attribute): New handler.
>>* calls.c (prepare_call_address): Check for no_plt
>>attribute.
>>* config/i386/i386.c (ix86_function_ok_for_sibcall): Check
>>for no_plt attribute.
>>(ix86_expand_call):  Ditto.
>>(nopic_no_plt_attribute): New function.
>>(ix86_output_call_insn): Output indirect call for non-pic
>>no plt calls.
>>* doc/extend.texi (no_plt): Document new attribute.
>>* testsuite/gcc.target/i386/noplt-1.c: New test.
>>* testsuite/gcc.target/i386/noplt-2.c: New test.
>>* testsuite/gcc.target/i386/noplt-3.c: New test.
>>* testsuite/gcc.target/i386/noplt-4.c: New test.
>>
>>
>>Please review.
>
> --- config/i386/i386.c  (revision 223720)
> +++ config/i386/i386.c  (working copy)
> @@ -5479,6 +5479,8 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
>        && !TARGET_64BIT
>        && flag_pic
>        && flag_plt
> +      && (TREE_CODE (decl) != FUNCTION_DECL
> +         || !lookup_attribute ("no_plt", DECL_ATTRIBUTES (decl)))
>        && decl && !targetm.binds_local_p (decl))
>      return false;
>
> Wrong order or && decl is redundant. Stopped reading here.

Fixed and new patch attached.

Thanks
Sri

>
> Thanks,
>
* c-family/c-common.c (no_plt): New attribute.
	(handle_no_plt_attribute): New handler.
	* calls.c (prepare_call_address): Check for no_plt
	attribute.
	* config/i386/i386.c (ix86_function_ok_for_sibcall): Check
	for no_plt attribute.
	(ix86_expand_call):  Ditto.
	(nopic_no_plt_attribute): New function.
	(ix86_output_call_insn): Output indirect call for non-pic
	no plt calls.
	* doc/extend.texi (no_plt): Document new attribute.
	* testsuite/gcc.target/i386/noplt-1.c: New test.
	* testsuite/gcc.target/i386/noplt-2.c: New test.
	* testsuite/gcc.target/i386/noplt-3.c: New test.
	* testsuite/gcc.target/i386/noplt-4.c: New test.

This patch does two things:

* Adds a new generic function attribute "no_plt" that is similar in functionality
  to -fno-plt except that it applies only to calls to functions that are marked
  with this attribute.
* For x86_64, it makes -fno-plt (and the attribute) also work for non-PIC code by
  directly generating an indirect call via a GOT entry.
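
As a rough illustration (not part of the patch), the attribute could be used
as sketched below. CALL_NO_PLT, ext_add, call_ext_add and self_check are
hypothetical names made up for this sketch. ext_add is defined in the same
file only to keep the sketch self-contained; the attribute only matters when
the callee lives in a shared object. The __has_attribute guard makes the
macro expand to nothing on compilers that do not know no_plt.

```c
#include <assert.h>

/* Hypothetical guard macro: expands to the proposed attribute only when the
   compiler reports support for it, and to nothing otherwise.  */
#if defined (__has_attribute)
# if __has_attribute (no_plt)
#  define CALL_NO_PLT __attribute__ ((no_plt))
# endif
#endif
#ifndef CALL_NO_PLT
# define CALL_NO_PLT
#endif

/* Stand-in for a function that would normally live in a shared object.  */
CALL_NO_PLT int ext_add (int a, int b);

int
call_ext_add (int a, int b)
{
  /* With the patch applied and ext_add defined in a shared object, this
     call is the one that would be emitted as an indirect call via the GOT
     instead of going through a PLT stub.  */
  return ext_add (a, b);
}

/* Defined here only so that the sketch links on its own.  */
int
ext_add (int a, int b)
{
  return a + b;
}

/* Small sanity check; the attribute must not change program behaviour.  */
void
self_check (void)
{
  assert (call_ext_add (2, 3) == 5);
}
```

Going by the noplt-1.c test in the patch, a non-PIC x86_64 build of a call to
an external foo marked this way would be expected to contain
call *foo@GOTPCREL(%rip).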

Comments

Bernhard Reutner-Fischer June 2, 2015, 9:18 p.m. UTC | #1
On June 2, 2015 9:59:40 PM GMT+02:00, Sriraman Tallam <tmsriram@google.com> wrote:
>On Tue, Jun 2, 2015 at 12:32 PM, Bernhard Reutner-Fischer
><rep.dot.nop@gmail.com> wrote:
>> On June 2, 2015 8:15:42 PM GMT+02:00, Sriraman Tallam
><tmsriram@google.com> wrote:
>> []
>>
>>>I have now modified this patch.
>>>
>>>This patch does two things:
>>>
>>>1) Adds new generic function attribute "no_plt" that is similar in
>>>functionality  to -fno-plt except that it applies only to calls to
>>>functions that are marked  with this attribute.
>>>2) For x86_64, it makes -fno-plt(and the attribute) also work for
>>>non-PIC code by  directly generating an indirect call via a GOT
>entry.
>>>
>>>For PIC code, no_plt merely shadows the implementation of -fno-plt,
>no
>>>surprises here.
>>>
>>>* c-family/c-common.c (no_plt): New attribute.
>>>(handle_no_plt_attribute): New handler.
>>>* calls.c (prepare_call_address): Check for no_plt
>>>attribute.
>>>* config/i386/i386.c (ix86_function_ok_for_sibcall): Check
>>>for no_plt attribute.
>>>(ix86_expand_call):  Ditto.
>>>(nopic_no_plt_attribute): New function.
>>>(ix86_output_call_insn): Output indirect call for non-pic
>>>no plt calls.
>>>* doc/extend.texi (no_plt): Document new attribute.
>>>* testsuite/gcc.target/i386/noplt-1.c: New test.
>>>* testsuite/gcc.target/i386/noplt-2.c: New test.
>>>* testsuite/gcc.target/i386/noplt-3.c: New test.
>>>* testsuite/gcc.target/i386/noplt-4.c: New test.
>>>
>>>
>>>Please review.
>>
>> --- config/i386/i386.c  (revision 223720)
>> +++ config/i386/i386.c  (working copy)
>> @@ -5479,6 +5479,8 @@ ix86_function_ok_for_sibcall (tree decl, tree
>exp)
>>        && !TARGET_64BIT
>>        && flag_pic
>>        && flag_plt
>> +      && (TREE_CODE (decl) != FUNCTION_DECL
>> +         || !lookup_attribute ("no_plt", DECL_ATTRIBUTES (decl)))
>>        && decl && !targetm.binds_local_p (decl))
>>      return false;
>>
>> Wrong order or && decl is redundant. Stopped reading here.
>
>Fixed and new patch

Just reading the diff I do not grok the different conditions in
ix86_function_ok_for_sibcall
ix86_expand_call
especially regarding CM_LARGE_PIC but I take it you've read more context.

-	  && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0)))
+	  && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0))
+	  && flag_plt

s/! /!/ while you touch this line, or maybe that's OK -- check_GNU.sh would know, hopefully.

+/* Return true if the function being called was marked with attribute
+   "no_plt" or using -fno-plt and we are compiling for no-PIC and x86_64.
+   This is currently used only with 64-bit ELF targets to call the function

a function

+   marked "no_plt" indirectly.  */
+
+static bool
+nopic_no_plt_attribute (rtx call_op)

IIRC predicates ought to have a _p suffix but maybe that's outdated nowadays?

+{
+  if (flag_pic)
+    return false;
+
+  if (!TARGET_64BIT || TARGET_MACHO|| TARGET_SEH || TARGET_PECOFF)

Missing space before ||.
There is a contrib/check*.sh style checker for patches.

+    return false;
+
+  if (SYMBOL_REF_LOCAL_P (call_op))
+    return false;
+
+  tree symbol_decl = SYMBOL_REF_DECL (call_op);
+
+  if (symbol_decl != NULL_TREE
+      && TREE_CODE (symbol_decl) == FUNCTION_DECL
+      && (!flag_plt
+          || lookup_attribute ("no_plt", DECL_ATTRIBUTES (symbol_decl))))
+    return true;
+
+  return false;
+}

 
+@item no_plt
+@cindex @code{no_plt} function attribute
+The @code{no_plt} attribute is used to inform the compiler that a calls

Doesn't parse. a call / calls

+to the function should not use the PLT.  For example, external functions

would be nice to have an xref to PLT definition for the casual reader, iff we have one or could have one easily.

+defined in shared objects are called from the executable using the PLT.
+This attribute on the function declaration calls these functions indirectly
+rather than going via the PLT.  This is similar to @option{-fno-plt} but
+is only applicable to calls to the function marked with this attribute.
+

smallexample (or you-name-it counterpart) for code-avoidance for bonus points, maybe.

Not a conceptual review due to current cellphone-impairedness, but looks somewhat plausible at first glance.

HTH && cheers,

Patch

Index: c-family/c-common.c
===================================================================
--- c-family/c-common.c	(revision 223720)
+++ c-family/c-common.c	(working copy)
@@ -357,6 +357,7 @@  static tree handle_mode_attribute (tree *, tree, t
 static tree handle_section_attribute (tree *, tree, tree, int, bool *);
 static tree handle_aligned_attribute (tree *, tree, tree, int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
+static tree handle_no_plt_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
 static tree handle_ifunc_attribute (tree *, tree, tree, int, bool *);
 static tree handle_alias_attribute (tree *, tree, tree, int, bool *);
@@ -706,6 +707,8 @@  const struct attribute_spec c_common_attribute_tab
 			      handle_aligned_attribute, false },
   { "weak",                   0, 0, true,  false, false,
 			      handle_weak_attribute, false },
+  { "no_plt",                   0, 0, true,  false, false,
+			      handle_no_plt_attribute, false },
   { "ifunc",                  1, 1, true,  false, false,
 			      handle_ifunc_attribute, false },
   { "alias",                  1, 1, true,  false, false,
@@ -8185,6 +8188,25 @@  handle_weak_attribute (tree *node, tree name,
   return NULL_TREE;
 }
 
+/* Handle a "no_plt" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_plt_attribute (tree *node, tree name,
+		       tree ARG_UNUSED (args),
+		       int ARG_UNUSED (flags),
+		       bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+    {
+      warning (OPT_Wattributes,
+	       "%qE attribute is only applicable on functions", name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+  return NULL_TREE;
+}
+
 /* Handle an "alias" or "ifunc" attribute; arguments as in
    struct attribute_spec.handler, except that IS_ALIAS tells us
    whether this is an alias as opposed to ifunc attribute.  */
Index: calls.c
===================================================================
--- calls.c	(revision 223720)
+++ calls.c	(working copy)
@@ -226,8 +226,10 @@  prepare_call_address (tree fndecl_or_type, rtx fun
 	       && targetm.small_register_classes_for_mode_p (FUNCTION_MODE))
 	      ? force_not_mem (memory_address (FUNCTION_MODE, funexp))
 	      : memory_address (FUNCTION_MODE, funexp));
-  else if (flag_pic && !flag_plt && fndecl_or_type
+  else if (flag_pic && fndecl_or_type
 	   && TREE_CODE (fndecl_or_type) == FUNCTION_DECL
+	   && (!flag_plt
+	       || lookup_attribute ("no_plt", DECL_ATTRIBUTES (fndecl_or_type)))
 	   && !targetm.binds_local_p (fndecl_or_type))
     {
       funexp = force_reg (Pmode, funexp);
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 223720)
+++ config/i386/i386.c	(working copy)
@@ -5479,7 +5479,10 @@  ix86_function_ok_for_sibcall (tree decl, tree exp)
       && !TARGET_64BIT
       && flag_pic
       && flag_plt
-      && decl && !targetm.binds_local_p (decl))
+      && decl
+      && (TREE_CODE (decl) != FUNCTION_DECL
+	  || !lookup_attribute ("no_plt", DECL_ATTRIBUTES (decl)))
+      && !targetm.binds_local_p (decl))
     return false;
 
   /* If we need to align the outgoing stack, then sibcalling would
@@ -25497,13 +25500,19 @@  ix86_expand_call (rtx retval, rtx fnaddr, rtx call
     }
   else
     {
-      /* Static functions and indirect calls don't need the pic register.  */
+      /* Static functions and indirect calls don't need the pic register.  Also,
+	 check if PLT was explicitly avoided via no-plt or "no_plt" attribute, making
+	 it an indirect call.  */
       if (flag_pic
 	  && (!TARGET_64BIT
 	      || (ix86_cmodel == CM_LARGE_PIC
 		  && DEFAULT_ABI != MS_ABI))
 	  && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
-	  && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0)))
+	  && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0))
+	  && flag_plt
+	  && (SYMBOL_REF_DECL (XEXP (fnaddr, 0)) == NULL_TREE
+	      || !lookup_attribute ("no_plt",
+		     DECL_ATTRIBUTES (SYMBOL_REF_DECL (XEXP (fnaddr, 0))))))
 	{
 	  use_reg (&use, gen_rtx_REG (Pmode, REAL_PIC_OFFSET_TABLE_REGNUM));
 	  if (ix86_use_pseudo_pic_reg ())
@@ -25599,6 +25608,34 @@  ix86_expand_call (rtx retval, rtx fnaddr, rtx call
   return call;
 }
 
+/* Return true if the function being called was marked with attribute
+   "no_plt" or -fno-plt is in effect, and we are compiling non-PIC code for
+   x86_64.  This is currently used only with 64-bit ELF targets to call a
+   function marked "no_plt" indirectly.  */
+
+static bool
+nopic_no_plt_attribute (rtx call_op)
+{
+  if (flag_pic)
+    return false;
+
+  if (!TARGET_64BIT || TARGET_MACHO || TARGET_SEH || TARGET_PECOFF)
+    return false;
+
+  if (SYMBOL_REF_LOCAL_P (call_op))
+    return false;
+
+  tree symbol_decl = SYMBOL_REF_DECL (call_op);
+
+  if (symbol_decl != NULL_TREE
+      && TREE_CODE (symbol_decl) == FUNCTION_DECL
+      && (!flag_plt
+          || lookup_attribute ("no_plt", DECL_ATTRIBUTES (symbol_decl))))
+    return true;
+
+  return false;
+}
+
 /* Output the assembly for a call instruction.  */
 
 const char *
@@ -25610,7 +25647,9 @@  ix86_output_call_insn (rtx_insn *insn, rtx call_op
 
   if (SIBLING_CALL_P (insn))
     {
-      if (direct_p)
+      if (direct_p && nopic_no_plt_attribute (call_op))
+	xasm = "%!jmp\t*%p0@GOTPCREL(%%rip)";
+      else if (direct_p)
 	xasm = "%!jmp\t%P0";
       /* SEH epilogue detection requires the indirect branch case
 	 to include REX.W.  */
@@ -25653,7 +25692,9 @@  ix86_output_call_insn (rtx_insn *insn, rtx call_op
 	seh_nop_p = true;
     }
 
-  if (direct_p)
+  if (direct_p && nopic_no_plt_attribute (call_op))
+    xasm = "%!call\t*%p0@GOTPCREL(%%rip)";
+  else if (direct_p)
     xasm = "%!call\t%P0";
   else
     xasm = "%!call\t%A0";
Index: doc/extend.texi
===================================================================
--- doc/extend.texi	(revision 223720)
+++ doc/extend.texi	(working copy)
@@ -2916,6 +2916,15 @@  the standard C library can be guaranteed not to th
 with the notable exceptions of @code{qsort} and @code{bsearch} that
 take function pointer arguments.
 
+@item no_plt
+@cindex @code{no_plt} function attribute
+The @code{no_plt} attribute is used to inform the compiler that calls
+to the function should not use the PLT.  For example, external functions
+defined in shared objects are called from the executable using the PLT.
+With this attribute on the function declaration, such calls are made
+indirectly through the GOT instead.  This is similar to @option{-fno-plt}
+but applies only to calls to the function marked with this attribute.
+
 @item optimize
 @cindex @code{optimize} function attribute
 The @code{optimize} attribute is used to specify that a function is to
Index: testsuite/gcc.target/i386/noplt-1.c
===================================================================
--- testsuite/gcc.target/i386/noplt-1.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-1.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-fno-pic" } */
+
+__attribute__ ((no_plt))
+void foo();
+
+int main()
+{
+  foo();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "call\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */ 
Index: testsuite/gcc.target/i386/noplt-2.c
===================================================================
--- testsuite/gcc.target/i386/noplt-2.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-2.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-O2 -fno-pic" } */
+
+
+__attribute__ ((no_plt))
+int foo();
+
+int main()
+{
+  return foo();
+}
+
+/* { dg-final { scan-assembler "jmp\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */ 
Index: testsuite/gcc.target/i386/noplt-3.c
===================================================================
--- testsuite/gcc.target/i386/noplt-3.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-3.c	(working copy)
@@ -0,0 +1,12 @@ 
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-fno-pic -fno-plt" } */
+
+void foo();
+
+int main()
+{
+  foo();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "call\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */ 
Index: testsuite/gcc.target/i386/noplt-4.c
===================================================================
--- testsuite/gcc.target/i386/noplt-4.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-4.c	(working copy)
@@ -0,0 +1,11 @@ 
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-O2 -fno-pic -fno-plt" } */
+
+int foo();
+
+int main()
+{
+  return foo();
+}
+
+/* { dg-final { scan-assembler "jmp\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */
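
For readers unfamiliar with scan-assembler, the check made by the new tests
can be approximated outside DejaGnu as sketched below. This is only an
approximation: DejaGnu uses Tcl regular expressions, while the sketch uses
POSIX extended regular expressions, which should behave the same for this
particular pattern. asm_line_matches and call_via_got are hypothetical names
made up for this sketch.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* POSIX ERE approximation of the scan-assembler pattern used by the
   noplt-1.c and noplt-3.c tests.  The "\t" is a C escape, so the bracket
   expression contains a literal space and a literal tab.  */
static const char call_via_got[] =
  "call[ \t]\\*.*foo.*@GOTPCREL\\(%rip\\)";

/* Hypothetical helper: return true if LINE matches the regex PATTERN.  */
bool
asm_line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);  /* The patterns used here are fixed literals.  */
  bool matched = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

A line such as "call *foo@GOTPCREL(%rip)" matches call_via_got, whereas a
plain direct "call foo" does not.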