
[x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

Message ID CAMe9rOrUhR5xUU_9o+JCQez0qoJ-ZPzF0e6YDFy9sAP9pHBn-A@mail.gmail.com
State New

Commit Message

H.J. Lu Dec. 4, 2014, 11:54 p.m. UTC
On Thu, Dec 4, 2014 at 2:19 PM, Dominique Dhumieres <dominiq@lps.ens.fr> wrote:
>> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
>> module using the GOT.  This is two instructions, one to get the address
>> of the global from the GOT and the other to get the value.  If it turns
>> out that the global gets defined in the executable at link-time, it still
>> needs to go through the GOT as it is too late then to generate a direct
>> access.
>>
>> Examples:
>>
>> foo.cc
>> ------
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> PC-relative insn:
>>
>> 5e0   <main>:
>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>
>> foo.cc
>> ------
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern; defined elsewhere
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses the global via the GOT
>> using two memory loads:
>>
>> 6f0  <main>:
>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>    mov    (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on Google benchmarks show that the extra memory loads
>> affect performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable.  For globals that are truly
>> extern (come from shared objects), the linker will create copy relocations
>> and have them defined in the executable.  As a result, no global access
>> needs to go through the GOT, which improves performance.
>>
>> This optimization only applies to undefined, non-weak global data.
>> Undefined, weak global data access still must go through the GOT.
>>
>> This patch checks at configure time whether the linker supports PIE with
>> copy reloc, which is enabled in the gold and bfd linkers in binutils 2.25,
>> and enables this optimization if the linker support is available.
>>
>> gcc/
>>
>> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
>> Linux/x86-64 linker supports PIE with copy reloc.
>> * config.in: Regenerated.
>> * configure: Likewise.
>>
>> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
>> pc-relative address for undefined, non-weak, non-function
>> symbol reference in 64-bit PIE if linker supports PIE with
>> copy reloc.
>>
>> * doc/sourcebuild.texi: Document pie_copyreloc target.
>>
>> gcc/testsuite/
>>
>> * gcc.target/i386/pie-copyrelocs-1.c: New test.
>> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>>
>> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
>> New procedure.
>
> It caused pr64189.
>

I checked this in as an obvious fix.  Sorry for the inconvenience.
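
A minimal sketch of why the macro has to be defined on every target, not
only where the linker test runs.  It assumes, as the ChangeLog entry for
legitimate_pic_address_disp_p suggests, that HAVE_LD_PIE_COPYRELOC is
consumed as an ordinary C expression rather than under #ifdef, so a
configuration that skipped the define would presumably fail to compile.
The function name can_use_pc_relative and its boolean parameters are
hypothetical simplifications for illustration, not the actual i386.c code.

```c
#include <stdbool.h>

/* Stand-in for the value configure writes into the generated config
   header.  With the fix it is always defined, to 0 or 1.  The fallback
   below only exists so this sketch compiles on its own.  */
#ifndef HAVE_LD_PIE_COPYRELOC
#define HAVE_LD_PIE_COPYRELOC 0
#endif

/* Hypothetical simplification of the decision described in the commit
   message: a symbol may be addressed PC-relatively if it binds locally,
   or if the linker supports PIE with copy relocs and the symbol is
   undefined, non-weak, non-function data in a PIE build.  */
bool
can_use_pc_relative (bool is_local, bool pie, bool is_weak, bool is_function)
{
  if (is_local)
    return true;

  /* The macro is used as a value here, so it must expand to 0 or 1 on
     every target; leaving it undefined would be a compile error.  */
  return HAVE_LD_PIE_COPYRELOC && pie && !is_weak && !is_function;
}
```

With the macro at 0, only locally bound symbols get the PC-relative form.
Compiling the sketch with -DHAVE_LD_PIE_COPYRELOC=1 additionally allows
undefined, non-weak data symbols in PIE, while weak symbols and functions
still go through the GOT or PLT.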

Patch

Index: ChangeLog
===================================================================
--- ChangeLog (revision 218407)
+++ ChangeLog (working copy)
@@ -1,3 +1,9 @@ 
+2014-12-04  H.J. Lu  <hongjiu.lu@intel.com>
+
+ PR bootstrap/64189
+ * configure.ac (HAVE_LD_PIE_COPYRELOC): Always define.
+ * configure: Regenerated.
+
 2014-12-04  Manuel López-Ibáñez  <manu@gcc.gnu.org>

  * diagnostic.c (diagnostic_color_init): New.
Index: configure
===================================================================
--- configure (revision 218407)
+++ configure (working copy)
@@ -27063,12 +27063,12 @@  EOF
       ;;
     esac
   fi
+fi

 cat >>confdefs.h <<_ACEOF
 #define HAVE_LD_PIE_COPYRELOC `if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`
 _ACEOF

-fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_pie_copyreloc" >&5
 $as_echo "$gcc_cv_ld_pie_copyreloc" >&6; }

Index: configure.ac
===================================================================
--- configure.ac (revision 218407)
+++ configure.ac (working copy)
@@ -4730,10 +4730,10 @@  EOF
       ;;
     esac
   fi
-  AC_DEFINE_UNQUOTED(HAVE_LD_PIE_COPYRELOC,
-    [`if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`],
-    [Define 0/1 if your linker supports -pie option with copy reloc.])
 fi
+AC_DEFINE_UNQUOTED(HAVE_LD_PIE_COPYRELOC,
+  [`if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`],
+  [Define 0/1 if your linker supports -pie option with copy reloc.])
 AC_MSG_RESULT($gcc_cv_ld_pie_copyreloc)