
Fix large code model with the ELFv2 ABI

Message ID 20151202194433.41A86F8BD@oc7340732750.ibm.com
State New

Commit Message

Ulrich Weigand Dec. 2, 2015, 7:44 p.m. UTC
Hello,

this patch fixes support for the large code model with the ELFv2 ABI.

The global entry point prologue currently assumes that the TOC associated
with a function is less than 2GB away from the function entry point.  This
is always true when using the medium or small code model, but may not be
the case when using the large code model.
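
To make the 2GB limit concrete: the existing prologue adds the TOC offset
to r12 in two signed 16-bit pieces, a high-adjusted half (@ha) and a low
half (@l).  The C sketch below is illustrative only and not part of the
patch; all helper names are made up for this example.  It models the split
and checks whether a given offset survives it.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of v, avoiding implementation-defined
   narrowing conversions.  */
static int64_t sext16(uint64_t v)
{
    v &= 0xffff;
    return (v & 0x8000) ? (int64_t)v - 0x10000 : (int64_t)v;
}

/* The @ha half: upper 16 bits, adjusted for the sign of the low half.  */
static int64_t toc_ha(int64_t delta)
{
    return sext16(((uint64_t)delta + 0x8000) >> 16);
}

/* The @l half: low 16 bits, sign-extended when used by addi.  */
static int64_t toc_lo(int64_t delta)
{
    return sext16((uint64_t)delta);
}

/* Offset from r12 produced by "addis 2,12,ha" then "addi 2,2,lo".  */
static int64_t addis_addi(int64_t ha, int64_t lo)
{
    return ha * 65536 + lo;
}

/* Nonzero if the two-instruction sequence reproduces delta exactly.  */
static int toc_delta_reachable(int64_t delta)
{
    return addis_addi(toc_ha(delta), toc_lo(delta)) == delta;
}
```

Under this model, an offset of 0x7fff7fff is still reachable, while
0x7fff8000 or anything around 4GB is not, which is the case the large
code model has to handle.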

This patch adds a new variant of the ELFv2 global entry point prologue that
lifts the 2GB restriction when building with -mcmodel=large.  This works by
emitting a quadword containing the distance from the function entry point
to its associated TOC immediately before the entry point (this is done in
rs6000_elf_declare_function_name), and then using a prologue like:

	ld r2,-8(r12)
	add r2,r2,r12
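
As a rough model of what these two instructions do (a sketch, not part of
the patch), the C code below treats a byte array as the text section, with
the 8-byte TOC delta stored directly in front of the entry point.  The
function names and the image/base parameters are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Store the TOC delta in the 8 bytes before the entry point, as
   ".quad .TOC.-<entry>" would.  Unsigned arithmetic wraps modulo 2^64,
   matching 64-bit register arithmetic, so a TOC below the entry point
   works too.  */
static void place_toc_delta(unsigned char *image, uint64_t image_base,
                            uint64_t entry, uint64_t toc)
{
    uint64_t delta = toc - entry;
    memcpy(image + (entry - 8 - image_base), &delta, sizeof delta);
}

/* Model of the prologue; r12 holds the global entry point address.  */
static uint64_t large_model_r2(const unsigned char *image,
                               uint64_t image_base, uint64_t r12)
{
    uint64_t r2;
    memcpy(&r2, image + (r12 - 8 - image_base), sizeof r2); /* ld r2,-8(r12) */
    return r2 + r12;                                        /* add r2,r2,r12 */
}
```

Because the delta is a full 64-bit value loaded from memory, the result is
correct for any distance between entry point and TOC.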

In addition, if the assembler is found to support the new R_PPC64_ENTRY
relocation, the compiler emits that relocation on the first instruction of
this new prologue.  This allows the linker to rewrite the prologue to the
original addis/addi form if it turns out at link time that the distance
between entry point and TOC is in fact less than 2GB.
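
The link-time decision can be pictured as a simple range test.  The C
sketch below is only an illustration of that idea under the assumption
that the short form is usable exactly when the offset fits the addis/addi
pair; it is not the linker's actual code, and the enum and function names
are invented here.

```c
#include <stdint.h>

enum entry_prologue {
    PROLOGUE_ADDIS_ADDI,   /* original two-instruction form */
    PROLOGUE_LD_ADD        /* large-model form loading the delta */
};

/* Choose a prologue form from the final entry point and TOC addresses.
   addis+addi can add any offset in [-0x80008000, 0x7fff7fff]; biasing
   by 0x80008000 turns that into a single unsigned comparison.  */
static enum entry_prologue choose_entry_prologue(uint64_t entry, uint64_t toc)
{
    uint64_t delta = toc - entry;   /* wraps modulo 2^64 */

    return (delta + 0x80008000ULL <= 0xffffffffULL)
           ? PROLOGUE_ADDIS_ADDI
           : PROLOGUE_LD_ADD;
}
```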

The patch also introduces a new function rs6000_global_entry_point_needed_p,
which is used instead of directly checking cfun->machine->r2_setup_needed.
This allows handling global entry point prologues for C++ thunks.  This was
previously done by having rs6000_output_mi_thunk set the r2_setup_needed
flag, but this no longer works, since we now need to check whether we need
a global entry point prologue in rs6000_elf_declare_function_name, which
is already called *before* rs6000_output_mi_thunk.
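
The ordering problem can be shown with a toy model.  This C sketch is not
GCC code: the struct and function names are hypothetical, and the ABI and
-msingle-pic-base checks of the real predicate are left out for brevity.

```c
#include <stdbool.h>

/* Toy stand-in for the per-function state consulted by the back end.  */
struct fn_model {
    bool is_thunk;          /* known as soon as the function exists */
    bool r2_setup_needed;   /* set later, while the body is processed */
};

/* Old scheme: only the flag is consulted.  */
static bool entry_point_needed_old(const struct fn_model *fn)
{
    return fn->r2_setup_needed;
}

/* New scheme: thunks are recognised without relying on the flag.  */
static bool entry_point_needed_new(const struct fn_model *fn)
{
    return fn->is_thunk || fn->r2_setup_needed;
}

/* Models the old thunk output code, which set the flag itself.  */
static void output_thunk_body_old(struct fn_model *fn)
{
    fn->r2_setup_needed = true;
}
```

If the predicate is queried when the function name is declared, that is
before output_thunk_body_old has run, the old check still answers "no"
for a thunk, while the new check already answers "yes".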

Finally, the patch removes use of the GNU local label extension ("0b")
in favour of compiler-emitted internal labels.  This seems clearer now
that the entry point code may be split across two different functions
(rs6000_output_function_prologue vs. rs6000_elf_declare_function_name)
and makes it simpler to move the location of the TOC delta quadword
at some future time.  Also, it removes the implicit assumption that
the system assembler supports this GNU extension, which I understand
we don't assume anywhere else.
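
To show how the two internal labels tie the pieces together, here is a
small C sketch that formats the assembly text each of the two functions
contributes in the large code model.  It is an illustration only: the
".L" label prefix is an assumption about typical ELF targets, the
function symbol and the optional .reloc marker are omitted, and the
helper names are made up.

```c
#include <stdio.h>

/* Data word placed ahead of the entry point (the part handled in
   rs6000_elf_declare_function_name).  */
static int format_toc_delta_word(char *out, size_t size, unsigned labelno)
{
    return snprintf(out, size,
                    ".LCL%u:\n"
                    "\t.quad .TOC.-.LCF%u\n",
                    labelno, labelno);
}

/* Large-model prologue (the part handled in
   rs6000_output_function_prologue).  The displacement .LCL-.LCF is the
   distance from the entry point back to the data word.  */
static int format_large_model_prologue(char *out, size_t size,
                                       unsigned labelno)
{
    return snprintf(out, size,
                    ".LCF%u:\n"
                    "\tld 2,.LCL%u-.LCF%u(12)\n"
                    "\tadd 2,2,12\n",
                    labelno, labelno, labelno);
}
```

Since both pieces refer to each other by name, the data word could later
be moved elsewhere without touching the prologue text.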

Tested on powerpc64le-linux.  Also tested bootstrap/regtest with an
extra patch to switch the default code model to CMODEL_LARGE.  Tested
both with an assembler supporting R_PPC64_ENTRY and with an assembler
that doesn't support it.

OK for mainline?

Bye,
Ulrich

ChangeLog:

	* configure.ac: Check assembler support for R_PPC64_ENTRY relocation.
	* configure: Regenerate.
	* config.in: Regenerate.
	* config/rs6000/rs6000.c (rs6000_global_entry_point_needed_p): New
	function.
	(rs6000_output_function_prologue): Use it instead of checking
	cfun->machine->r2_setup_needed.  Use internal labels instead of
	GNU as local label extension.  Handle ELFv2 large code model.
	(rs6000_output_mi_thunk): Do not set cfun->machine->r2_setup_needed.
	(rs6000_elf_declare_function_name): Handle ELFv2 large code model.
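
For an in-tree assembler, the generated configure test does not run the
assembler; it compares a packed version number against 2.26.0.  The C
sketch below mirrors that arithmetic for illustration; the function names
are hypothetical and this is not how configure itself is implemented.

```c
/* Pack a version triple as (major * 1000 + minor) * 1000 + patch.  */
static long gas_version_code(int major, int minor, int patch)
{
    return ((long)major * 1000 + minor) * 1000 + patch;
}

/* Nonzero if an in-tree gas of this version is taken to support
   R_PPC64_ENTRY, i.e. if it is at least 2.26.0.  */
static int in_tree_gas_has_entry_markers(int major, int minor, int patch)
{
    return gas_version_code(major, minor, patch)
           >= gas_version_code(2, 26, 0);
}
```

With an out-of-tree assembler, configure instead assembles a one-line
test file containing the relocation and checks the exit status.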

Comments

David Edelsohn Dec. 2, 2015, 7:46 p.m. UTC | #1
On Wed, Dec 2, 2015 at 2:44 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> [...]

Okay.

Thanks, David

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 231177)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -24888,6 +24888,31 @@ 
   return bitmap_bit_p (DF_LR_OUT (bb), 12);
 }
 
+/* Return whether we need to emit an ELFv2 global entry point prologue.  */
+
+static bool
+rs6000_global_entry_point_needed_p (void)
+{
+  /* Only needed for the ELFv2 ABI.  */
+  if (DEFAULT_ABI != ABI_ELFv2)
+    return false;
+
+  /* With -msingle-pic-base, we assume the whole program shares the same
+     TOC, so no global entry point prologues are needed anywhere.  */
+  if (TARGET_SINGLE_PIC_BASE)
+    return false;
+
+  /* Ensure we have a global entry point for thunks.   ??? We could
+     avoid that if the target routine doesn't need a global entry point,
+     but we do not know whether this is the case at this point.  */
+  if (cfun->is_thunk)
+    return true;
+
+  /* For regular functions, rs6000_emit_prologue sets this flag if the
+     routine ever uses the TOC pointer.  */
+  return cfun->machine->r2_setup_needed;
+}
+
 /* Emit function prologue as insns.  */
 
 void
@@ -25951,13 +25976,53 @@ 
 
   /* ELFv2 ABI r2 setup code and local entry point.  This must follow
      immediately after the global entry point label.  */
-  if (DEFAULT_ABI == ABI_ELFv2 && cfun->machine->r2_setup_needed)
+  if (rs6000_global_entry_point_needed_p ())
     {
       const char *name = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);
 
-      fprintf (file, "0:\taddis 2,12,.TOC.-0b@ha\n");
-      fprintf (file, "\taddi 2,2,.TOC.-0b@l\n");
+      (*targetm.asm_out.internal_label) (file, "LCF", rs6000_pic_labelno);
 
+      if (TARGET_CMODEL != CMODEL_LARGE)
+	{
+	  /* In the small and medium code models, we assume the TOC is less
+	     than 2 GB away from the text section, so it can be computed via
+	     the following two-instruction sequence.  */
+	  char buf[256];
+
+	  ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+	  fprintf (file, "\taddis 2,12,.TOC.-");
+	  assemble_name (file, buf);
+	  fprintf (file, "@ha\n");
+	  fprintf (file, "\taddi 2,2,.TOC.-");
+	  assemble_name (file, buf);
+	  fprintf (file, "@l\n");
+	}
+      else
+	{
+	  /* In the large code model, we allow arbitrary offsets between the
+	     TOC and the text section, so we have to load the offset from
+	     memory.  The data field is emitted directly before the global
+	     entry point in rs6000_elf_declare_function_name.  */
+	  char buf[256];
+
+#ifdef HAVE_AS_ENTRY_MARKERS
+	  /* If supported by the assembler, emit a marker relocation.  If the
+	     total code size of the final executable or shared library
+	     happens to fit into 2 GB after all, the linker will replace
+	     this code sequence with the sequence for the small or medium
+	     code model.  */
+	  fprintf (file, "\t.reloc .,R_PPC64_ENTRY\n");
+#endif
+	  fprintf (file, "\tld 2,");
+	  ASM_GENERATE_INTERNAL_LABEL (buf, "LCL", rs6000_pic_labelno);
+	  assemble_name (file, buf);
+	  fprintf (file, "-");
+	  ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+	  assemble_name (file, buf);
+	  fprintf (file, "(12)\n");
+	  fprintf (file, "\tadd 2,2,12\n");
+	}
+
       fputs ("\t.localentry\t", file);
       assemble_name (file, name);
       fputs (",.-", file);
@@ -27620,13 +27685,6 @@ 
   SIBLING_CALL_P (insn) = 1;
   emit_barrier ();
 
-  /* Ensure we have a global entry point for the thunk.   ??? We could
-     avoid that if the target routine doesn't need a global entry point,
-     but we do not know whether this is the case at this point.  */
-  if (DEFAULT_ABI == ABI_ELFv2
-      && !TARGET_SINGLE_PIC_BASE)
-    cfun->machine->r2_setup_needed = true;
-
   /* Run just enough of rest_of_compilation to get the insns emitted.
      There's not really enough bulk here to make other passes such as
      instruction scheduling worth while.  Note that use_thunk calls
@@ -31493,6 +31551,18 @@ 
   ASM_OUTPUT_TYPE_DIRECTIVE (file, name, "function");
   ASM_DECLARE_RESULT (file, DECL_RESULT (decl));
 
+  if (TARGET_CMODEL == CMODEL_LARGE && rs6000_global_entry_point_needed_p ())
+    {
+      char buf[256];
+
+      (*targetm.asm_out.internal_label) (file, "LCL", rs6000_pic_labelno);
+
+      fprintf (file, "\t.quad .TOC.-");
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+      assemble_name (file, buf);
+      putc ('\n', file);
+    }
+
   if (DEFAULT_ABI == ABI_AIX)
     {
       const char *desc_name, *orig_name;
Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac	(revision 231177)
+++ gcc/configure.ac	(working copy)
@@ -4371,6 +4371,12 @@ 
       [AC_DEFINE(HAVE_AS_TLS_MARKERS, 1,
 	  [Define if your assembler supports arg info for __tls_get_addr.])])
 
+    gcc_GAS_CHECK_FEATURE([prologue entry point marker support],
+      gcc_cv_as_powerpc_entry_markers, [2,26,0],-a64 --fatal-warnings,
+      [ .reloc .,R_PPC64_ENTRY; nop],,
+      [AC_DEFINE(HAVE_AS_ENTRY_MARKERS, 1,
+	  [Define if your assembler supports the R_PPC64_ENTRY relocation.])])
+
     case $target in
       *-*-aix*)
 	gcc_GAS_CHECK_FEATURE([.ref support],
Index: gcc/config.in
===================================================================
--- gcc/config.in	(revision 231177)
+++ gcc/config.in	(working copy)
@@ -327,6 +327,12 @@ 
 #endif
 
 
+/* Define if your assembler supports the R_PPC64_ENTRY relocation. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_ENTRY_MARKERS
+#endif
+
+
 /* Define if your assembler supports explicit relocations. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_AS_EXPLICIT_RELOCS
Index: gcc/configure
===================================================================
--- gcc/configure	(revision 231177)
+++ gcc/configure	(working copy)
@@ -26534,6 +26534,41 @@ 
 
 fi
 
+    { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for prologue entry point marker support" >&5
+$as_echo_n "checking assembler for prologue entry point marker support... " >&6; }
+if test "${gcc_cv_as_powerpc_entry_markers+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_powerpc_entry_markers=no
+    if test $in_tree_gas = yes; then
+    if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 26 \) \* 1000 + 0`
+  then gcc_cv_as_powerpc_entry_markers=yes
+fi
+  elif test x$gcc_cv_as != x; then
+    $as_echo ' .reloc .,R_PPC64_ENTRY; nop' > conftest.s
+    if { ac_try='$gcc_cv_as $gcc_cv_as_flags -a64 --fatal-warnings -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+    then
+	gcc_cv_as_powerpc_entry_markers=yes
+    else
+      echo "configure: failed program was" >&5
+      cat conftest.s >&5
+    fi
+    rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_powerpc_entry_markers" >&5
+$as_echo "$gcc_cv_as_powerpc_entry_markers" >&6; }
+if test $gcc_cv_as_powerpc_entry_markers = yes; then
+
+$as_echo "#define HAVE_AS_ENTRY_MARKERS 1" >>confdefs.h
+
+fi
+
     case $target in
       *-*-aix*)
 	{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for .ref support" >&5