PATCH RFA: Split stack [2/7]: Middle-end support

Submitter Ian Taylor
Date Sept. 22, 2010, 10:02 p.m.
Message ID <mcrpqw577yn.fsf@google.com>
Permalink /patch/65463/
State New

Comments

Ian Taylor - Sept. 22, 2010, 10:02 p.m.
This is the second of the -fsplit-stack patches.  This patch is the
general middle-end support.  It implements the following:

* Let the backend insert a split stack prologue at the start of the
  function.

* Handle dynamic stack space allocation (for alloca or variably sized
  arrays) by calling a support function if necessary.

* Add magic sections which let the linker automatically adjust
  split-stack functions which call non-split-stack functions.
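
As a rough illustration of the second item, the following is a minimal user-level model of the decision made for alloca, not the RTL the patch emits. The names struct segment, space_available, fallback_allocate, model_alloca and release_segment are hypothetical; the first two only stand in for the backend's split_stack_space_check pattern and for __morestack_allocate_stack_space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of one stack segment.  The stack grows down
   from SP towards LIMIT.  */
struct segment
{
  char *limit;           /* Lowest usable address.  */
  char *sp;              /* Current stack pointer.  */
  void *heap_blocks[8];  /* Fallback blocks owned by this segment.  */
  size_t n_heap_blocks;
};

/* Stand-in for the backend space check: nonzero if SIZE bytes fit
   on the current segment.  */
static int
space_available (const struct segment *seg, size_t size)
{
  return (size_t) (seg->sp - seg->limit) >= size;
}

/* Stand-in for the support function: get the memory some other way
   and remember it so it can be released with the segment.  */
static void *
fallback_allocate (struct segment *seg, size_t size)
{
  void *p;

  assert (seg->n_heap_blocks < 8);
  p = malloc (size != 0 ? size : 1);
  if (p == NULL)
    abort ();
  seg->heap_blocks[seg->n_heap_blocks++] = p;
  return p;
}

/* Fast path: bump the stack pointer.  Slow path: call the support
   function and leave the stack pointer alone.  */
static void *
model_alloca (struct segment *seg, size_t size)
{
  if (space_available (seg, size))
    {
      seg->sp -= size;
      return seg->sp;
    }
  return fallback_allocate (seg, size);
}

/* Releasing the segment also releases the fallback blocks.  */
static void
release_segment (struct segment *seg)
{
  while (seg->n_heap_blocks > 0)
    free (seg->heap_blocks[--seg->n_heap_blocks]);
}
```

In this model a request that fits is served from the segment itself, and one that does not fit is served from the heap and freed in release_segment, which mirrors the trade-off of a slower allocation instead of a stack overflow.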

This patch is all middle-end code and hence I do not require any
specific approval.  However I would of course be happy to see any
comments.

Ian


2010-09-21  Ian Lance Taylor  <iant@google.com>

	* function.c (thread_prologue_and_epilogue_insns): If
	flag_split_stack, add split stack prologue.
	* explow.c (allocate_dynamic_stack_space): Support -fsplit-stack.
	* varasm.c (saw_no_split_stack): New static variable.
	(assemble_start_function): Set saw_no_split_stack if the function
	has the no_split_stack attribute.
	(file_end_indicate_split_stack): New function.
	* output.h (file_end_indicate_split_stack): Declare.
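
For readers unfamiliar with the attribute that the varasm.c change tracks, here is a small sketch of how a function might be marked no_split_stack. The function names count_bytes and count_twice are made up for illustration; only the attribute name comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Opted out: when the file is compiled with -fsplit-stack, this
   function should not get a split-stack prologue.  */
static __attribute__ ((no_split_stack)) size_t
count_bytes (const char *s)
{
  size_t n = 0;

  while (s[n] != '\0')
    n++;
  return n;
}

/* No attribute: under -fsplit-stack this function is handled as
   ordinary split-stack code.  */
static size_t
count_twice (const char *s)
{
  return 2 * count_bytes (s);
}
```

Going by the varasm.c entries above, an object file built from such a source with -fsplit-stack would be expected to carry both the .note.GNU-split-stack and .note.GNU-no-split-stack sections.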

Richard Guenther - Sept. 23, 2010, 9:46 a.m.
On Thu, Sep 23, 2010 at 12:02 AM, Ian Lance Taylor <iant@google.com> wrote:
> This is the second of the -fsplit-stack patches.  This patch is the
> general middle-end support.  It implements the following:
>
> * Let the backend insert a split stack prologue at the start of the
>  function.
>
> * Handle dynamic stack space allocation (for alloca or variably sized
>  arrays) by calling a support function if necessary.
>
> * Add magic sections which let the linker automatically adjust
>  split-stack functions which call non-split-stack functions.
>
> This patch is all middle-end code and hence I do not require any
> specific approval.  However I would of course be happy to see any
> comments.

Are there any optimizations for functions which either do not require
any stack or are leaf functions, and thus would be fine using only the
red zone?  (Does -fsplit-stack make sure that space for the red zone
is available as well?)

Thanks,
Richard.

Ian Taylor - Sept. 23, 2010, 4:28 p.m.
Richard Guenther <richard.guenther@gmail.com> writes:

> On Thu, Sep 23, 2010 at 12:02 AM, Ian Lance Taylor <iant@google.com> wrote:
>> This is the second of the -fsplit-stack patches.  This patch is the
>> general middle-end support.  It implements the following:
>>
>> * Let the backend insert a split stack prologue at the start of the
>>  function.
>>
>> * Handle dynamic stack space allocation (for alloca or variably sized
>>  arrays) by calling a support function if necessary.
>>
>> * Add magic sections which let the linker automatically adjust
>>  split-stack functions which call non-split-stack functions.
>>
>> This patch is all middle-end code and hence I do not require any
>> specific approval.  However I would of course be happy to see any
>> comments.
>
> Are there any optimizations for functions which either do not require
> any stack or are leaf functions, and thus would be fine using only the
> red zone?  (Does -fsplit-stack make sure that space for the red zone
> is available as well?)

At present there are no such optimizations, though of course they can be
added easily enough.

-fsplit-stack does ensure that there is enough space for the red zone;
in fact on x86_64 there is always at least 1536 bytes (from
libgcc/config/i386/morestack.S) available on the stack, because signal
handlers need to be able to run.
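
To make the question concrete, the kind of test such an optimization might use can be sketched as below. This is only an assumption about what the check could look like; the patch contains nothing of the sort, and could_skip_split_stack_check is a hypothetical name. The 128-byte red zone is the x86-64 System V ABI value, and 1536 is the figure quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Red zone below the stack pointer in the x86-64 System V ABI.  */
#define RED_ZONE_BYTES 128

/* Minimum headroom on x86_64 as quoted in this thread.  */
#define SPLIT_STACK_HEADROOM 1536

/* The red zone must fit inside the guaranteed headroom for the
   shortcut below to make sense.  */
_Static_assert (RED_ZONE_BYTES <= SPLIT_STACK_HEADROOM,
		"red zone larger than guaranteed headroom");

/* Hypothetical predicate: a leaf function whose whole frame fits in
   the red zone never moves the stack pointer, so it could skip the
   split-stack check.  */
static bool
could_skip_split_stack_check (bool is_leaf, size_t frame_size)
{
  return is_leaf && frame_size <= RED_ZONE_BYTES;
}
```

Non-leaf functions are deliberately excluded in this sketch, since their callees may not be split-stack code.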

Ian

Patch

Index: gcc/function.c
===================================================================
--- gcc/function.c	(revision 164490)
+++ gcc/function.c	(working copy)
@@ -5209,17 +5209,50 @@  emit_return_into_block (basic_block bb)
 static void
 thread_prologue_and_epilogue_insns (void)
 {
-  int inserted = 0;
+  bool inserted;
+  rtx seq, epilogue_end;
+  edge entry_edge;
   edge e;
-#if defined (HAVE_sibcall_epilogue) || defined (HAVE_epilogue) || defined (HAVE_return) || defined (HAVE_prologue)
-  rtx seq;
-#endif
-#if defined (HAVE_epilogue) || defined(HAVE_return)
-  rtx epilogue_end = NULL_RTX;
-#endif
   edge_iterator ei;
 
   rtl_profile_for_bb (ENTRY_BLOCK_PTR);
+
+  inserted = false;
+  seq = NULL_RTX;
+  epilogue_end = NULL_RTX;
+
+  /* Can't deal with multiple successors of the entry block at the
+     moment.  Function should always have at least one entry
+     point.  */
+  gcc_assert (single_succ_p (ENTRY_BLOCK_PTR));
+  entry_edge = single_succ_edge (ENTRY_BLOCK_PTR);
+
+  if (flag_split_stack
+      && (lookup_attribute ("no_split_stack", DECL_ATTRIBUTES (cfun->decl))
+	  == NULL))
+    {
+#ifndef HAVE_split_stack_prologue
+      gcc_unreachable ();
+#else
+      gcc_assert (HAVE_split_stack_prologue);
+
+      start_sequence ();
+      emit_insn (gen_split_stack_prologue ());
+      seq = get_insns ();
+      end_sequence ();
+
+      record_insns (seq, NULL, &prologue_insn_hash);
+      set_insn_locators (seq, prologue_locator);
+
+      /* This relies on the fact that committing the edge insertion
+	 will look for basic blocks within the inserted instructions,
+	 which in turn relies on the fact that we are not in CFG
+	 layout mode here.  */
+      insert_insn_on_edge (seq, entry_edge);
+      inserted = true;
+#endif
+    }
+
 #ifdef HAVE_prologue
   if (HAVE_prologue)
     {
@@ -5246,13 +5279,8 @@  thread_prologue_and_epilogue_insns (void
       end_sequence ();
       set_insn_locators (seq, prologue_locator);
 
-      /* Can't deal with multiple successors of the entry block
-         at the moment.  Function should always have at least one
-         entry point.  */
-      gcc_assert (single_succ_p (ENTRY_BLOCK_PTR));
-
-      insert_insn_on_edge (seq, single_succ_edge (ENTRY_BLOCK_PTR));
-      inserted = 1;
+      insert_insn_on_edge (seq, entry_edge);
+      inserted = true;
     }
 #endif
 
@@ -5422,7 +5450,7 @@  thread_prologue_and_epilogue_insns (void
       end_sequence ();
 
       insert_insn_on_edge (seq, e);
-      inserted = 1;
+      inserted = true;
     }
   else
 #endif
Index: gcc/explow.c
===================================================================
--- gcc/explow.c	(revision 164490)
+++ gcc/explow.c	(working copy)
@@ -1129,6 +1129,7 @@  allocate_dynamic_stack_space (rtx size, 
 {
   HOST_WIDE_INT stack_usage_size = -1;
   bool known_align_valid = true;
+  rtx final_label, final_target;
 
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
@@ -1266,6 +1267,14 @@  allocate_dynamic_stack_space (rtx size, 
 	}
     }
 
+  /* Don't use a TARGET that isn't a pseudo or is the wrong mode.  */
+  if (target == 0 || !REG_P (target)
+      || REGNO (target) < FIRST_PSEUDO_REGISTER
+      || GET_MODE (target) != Pmode)
+    target = gen_reg_rtx (Pmode);
+
+  mark_reg_pointer (target, known_align);
+
   /* The size is supposed to be fully adjusted at this point so record it
      if stack usage info is requested.  */
   if (flag_stack_usage)
@@ -1278,6 +1287,52 @@  allocate_dynamic_stack_space (rtx size, 
 	current_function_has_unbounded_dynamic_stack_size = 1;
     }
 
+  final_label = NULL_RTX;
+  final_target = NULL_RTX;
+
+  /* If we are splitting the stack, we need to ask the backend whether
+     there is enough room on the current stack.  If there isn't, or if
+     the backend doesn't know how to tell, then we need to call a
+     function to allocate memory in some other way.  This memory will
+     be released when we release the current stack segment.  The
+     effect is that stack allocation becomes less efficient, but at
+     least it doesn't cause a stack overflow.  */
+  if (flag_split_stack)
+    {
+      rtx available_label, space, func;
+
+      available_label = NULL_RTX;
+
+#ifdef HAVE_split_stack_space_check
+      if (HAVE_split_stack_space_check)
+	{
+	  available_label = gen_label_rtx ();
+
+	  /* This instruction will branch to AVAILABLE_LABEL if there
+	     are SIZE bytes available on the stack.  */
+	  emit_insn (gen_split_stack_space_check (size, available_label));
+	}
+#endif
+
+      func = init_one_libfunc ("__morestack_allocate_stack_space");
+
+      space = emit_library_call_value (func, target, LCT_NORMAL, Pmode,
+				       1, size, Pmode);
+
+      if (available_label == NULL_RTX)
+	return space;
+
+      final_target = gen_reg_rtx (Pmode);
+      mark_reg_pointer (final_target, known_align);
+
+      emit_move_insn (final_target, space);
+
+      final_label = gen_label_rtx ();
+      emit_jump (final_label);
+
+      emit_label (available_label);
+    }
+
   do_pending_stack_adjust ();
 
  /* We ought to be called always on the toplevel and stack ought to be aligned
@@ -1295,14 +1350,6 @@  allocate_dynamic_stack_space (rtx size, 
   else if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
     probe_stack_range (STACK_CHECK_PROTECT, size);
 
-  /* Don't use a TARGET that isn't a pseudo or is the wrong mode.  */
-  if (target == 0 || !REG_P (target)
-      || REGNO (target) < FIRST_PSEUDO_REGISTER
-      || GET_MODE (target) != Pmode)
-    target = gen_reg_rtx (Pmode);
-
-  mark_reg_pointer (target, known_align);
-
   /* Perform the required allocation from the stack.  Some systems do
      this differently than simply incrementing/decrementing from the
      stack pointer, such as acquiring the space by calling malloc().  */
@@ -1388,6 +1435,15 @@  allocate_dynamic_stack_space (rtx size, 
   if (cfun->nonlocal_goto_save_area != 0)
     update_nonlocal_goto_save_area ();
 
+  /* Finish up the split stack handling.  */
+  if (final_label != NULL_RTX)
+    {
+      gcc_assert (flag_split_stack);
+      emit_move_insn (final_target, target);
+      emit_label (final_label);
+      target = final_target;
+    }
+
   return target;
 }
 
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	(revision 164490)
+++ gcc/varasm.c	(working copy)
@@ -99,6 +99,10 @@  bool first_function_block_is_cold;
 
 static alias_set_type const_alias_set;
 
+/* Whether we saw any functions with no_split_stack.  */
+
+static bool saw_no_split_stack;
+
 static const char *strip_reg_name (const char *);
 static int contains_pointers_p (tree);
 #ifdef ASM_OUTPUT_EXTERNAL
@@ -1549,6 +1553,9 @@  assemble_start_function (tree decl, cons
   /* Standard thing is just output label for the function.  */
   ASM_OUTPUT_FUNCTION_LABEL (asm_out_file, fnname, current_function_decl);
 #endif /* ASM_DECLARE_FUNCTION_NAME */
+
+  if (lookup_attribute ("no_split_stack", DECL_ATTRIBUTES (decl)))
+    saw_no_split_stack = true;
 }
 
 /* Output assembler code associated with defining the size of the
@@ -6533,6 +6540,28 @@  file_end_indicate_exec_stack (void)
   switch_to_section (get_section (".note.GNU-stack", flags, NULL));
 }
 
+/* Emit a special section directive to indicate that this object file
+   was compiled with -fsplit-stack.  This is used to let the linker
+   detect calls between split-stack code and non-split-stack code, so
+   that it can modify the split-stack code to allocate a sufficiently
+   large stack.  We emit another special section if there are any
+   functions in this file which have the no_split_stack attribute, to
+   prevent the linker from warning about being unable to convert the
+   functions if they call non-split-stack code.  */
+
+void
+file_end_indicate_split_stack (void)
+{
+  if (flag_split_stack)
+    {
+      switch_to_section (get_section (".note.GNU-split-stack", SECTION_DEBUG,
+				      NULL));
+      if (saw_no_split_stack)
+	switch_to_section (get_section (".note.GNU-no-split-stack",
+					SECTION_DEBUG, NULL));
+    }
+}
+
 /* Output DIRECTIVE (a C string) followed by a newline.  This is used as
    a get_unnamed_section callback.  */
 
Index: gcc/output.h
===================================================================
--- gcc/output.h	(revision 164490)
+++ gcc/output.h	(working copy)
@@ -632,6 +632,7 @@  extern void default_asm_declare_constant
 					       const_tree, HOST_WIDE_INT);
 extern void default_file_start (void);
 extern void file_end_indicate_exec_stack (void);
+extern void file_end_indicate_split_stack (void);
 
 extern void default_elf_asm_output_external (FILE *file, tree,
 					     const char *);