Remove SETJMP_VIA_SAVE_AREA support

Message ID 201106021241.49426.ebotcazou@adacore.com
State New

Commit Message

Eric Botcazou June 2, 2011, 10:41 a.m. UTC
This removes the (undocumented) support for SETJMP_VIA_SAVE_AREA from the 
compiler.  This is a trick implemented on the SPARC exclusively to reuse the 
register save area present in all frames (because of the register windows) for 
part of the setjmp buffer.  The benefits are marginal and dwarfed by the usual 
drawbacks of using setjmp/longjmp (to implement exceptions for example).

This exposed a couple of similar bugs in cse.c and postreload-gcse.c: the code 
was effectively treating a basic block with a single, abnormal incoming edge 
as if the edge was normal.
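
The kind of check involved can be sketched in isolation.  This is only an
illustration, not the GCC code: struct edge, struct block and the value of
EDGE_ABNORMAL below are simplified, hypothetical stand-ins for the real CFG
types, and single_normal_pred_p is a name made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value only; GCC defines its own edge flags.  */
#define EDGE_ABNORMAL 0x2u

/* Hypothetical, stripped-down stand-ins for GCC's edge and basic_block.  */
struct edge
{
  unsigned int flags;
};

struct block
{
  size_t n_preds;            /* number of incoming edges */
  const struct edge *preds;  /* array of n_preds incoming edges */
};

/* Return true only if BB has exactly one incoming edge and that edge is
   not abnormal.  A block reached solely through an abnormal edge (for
   example a setjmp receiver) is rejected instead of being handled like
   an ordinary fallthrough successor.  */
static bool
single_normal_pred_p (const struct block *bb)
{
  if (bb->n_preds != 1)
    return false;
  return (bb->preds[0].flags & EDGE_ABNORMAL) == 0;
}
```

Testing only the predecessor count accepts such a block; the extra flag test
is what distinguishes the two cases.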

Bootstrapped/regtested on x86_64-suse-linux and sparc-sun-solaris2.10.  I also 
verified that ACATS is clean with the SJLJ EH scheme.  Applied on the mainline.


2011-06-02  Eric Botcazou  <ebotcazou@adacore.com>

	* function.h (struct stack_usage): Remove dynamic_alloc_count field.
	(current_function_dynamic_alloc_count): Delete.
	* builtins.c (expand_builtin_setjmp_setup): Do not set calls_setjmp.
	(expand_builtin_nonlocal_goto): Remove obsolete comment.
	(expand_builtin_update_setjmp_buf): Remove dead code.
	* cse.c (cse_find_path): Do not follow a single abnormal incoming edge.
	* explow.c (allocate_dynamic_stack_space): Remove SETJMP_VIA_SAVE_AREA
	support.
	* function.c (instantiate_virtual_regs): Likewise.
	* postreload-gcse.c (bb_has_well_behaved_predecessors): Return false
	for a block with a single abnormal incoming edge.
	* config/sparc/sparc.h (STACK_SAVEAREA_MODE): Define.
	* config/sparc/sparc-protos.h (load_got_register): Declare.
	* config/sparc/sparc.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Define.
	(load_got_register): Make global.
	(sparc_frame_pointer_required): Add 'static'.
	(sparc_can_eliminate): Likewise.  Call sparc_frame_pointer_required.
	(sparc_builtin_setjmp_frame_value): New function.
	* config/sparc/sparc.md (UNSPECV_SETJMP): Remove.
	(save_stack_nonlocal): New expander.
	(restore_stack_nonlocal): Likewise.
	(nonlocal_goto): Remove modes, adjust predicates and reimplement.
	(nonlocal_goto_internal): New insn.
	(goto_handler_and_restore): Delete.
	(builtin_setjmp_setup): Likewise.
	(do_builtin_setjmp_setup): Likewise.
	(setjmp): Likewise.
	(builtin_setjmp_receiver): New expander.

Comments

Hans-Peter Nilsson June 9, 2011, 4:38 a.m. UTC | #1
On Thu, 2 Jun 2011, Eric Botcazou wrote:

> This removes the (undocumented) support for SETJMP_VIA_SAVE_AREA from the
> compiler.

Poison it (in system.h)?

brgds, H-P

Eric Botcazou June 9, 2011, 12:37 p.m. UTC | #2
> Poison it (in system.h)?

Let's keep pretending that it never existed. :-)

Mike Stump April 18, 2012, 7:18 p.m. UTC | #3
On Jun 2, 2011, at 3:41 AM, Eric Botcazou wrote:
> This removes the (undocumented) support for SETJMP_VIA_SAVE_AREA from the 
> compiler.

> 2011-06-02  Eric Botcazou  <ebotcazou@adacore.com>

> 	* builtins.c (expand_builtin_setjmp_setup): Do not set calls_setjmp.

> --- builtins.c	(revision 174559)
> +++ builtins.c	(working copy)
> @@ -806,10 +806,6 @@ expand_builtin_setjmp_setup (rtx buf_add
>      emit_insn (gen_builtin_setjmp_setup (buf_addr));
>  #endif
>  
> -  /* Tell optimize_save_area_alloca that extra work is going to
> -     need to go on during alloca.  */
> -  cfun->calls_setjmp = 1;
> -
>    /* We have a nonlocal label.   */
>    cfun->has_nonlocal_label = 1;
>  }

You do know that at least rtl hoisting is dependent upon calls_setjmp being set, right?  :-(

This part breaks my port.  I think you read the comment and thought it was exhaustive, I don't believe it is.

Any objection to putting it back, or, would you like me to drill down on rtl hoisting?

Eric Botcazou April 23, 2012, 10:04 a.m. UTC | #4
> You do know that at least rtl hoisting is dependent upon calls_setjmp being
> set, right?  :-(

Sure, clearly a straightforward dependency. ;-)

> This part breaks my port.  I think you read the comment and thought it was
> exhaustive, I don't believe it is.

I think it should be, though.  __builtin_setjmp doesn't need the calls_setjmp big 
hammer, since everything is supposed to be exposed in the IR.

> Any objection to putting it back, or, would you like me to drill down on
> rtl hoisting?

Yes, the problem needs to be understood first.  This change broke something for 
Darwin too and the fix was a one-liner in the back-end:
  http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00351.html
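
For readers unfamiliar with the builtins discussed in this thread, here is a
minimal usage sketch of __builtin_setjmp and __builtin_longjmp.  It is not
the code that failed on either port; the names jump_buf, bail_out and
came_back_via_longjmp are invented for the example, and it assumes a GCC
compatible compiler that provides these builtins.

```c
/* GCC documents a five-word buffer for __builtin_setjmp.  */
static void *jump_buf[5];

/* Kept out of line: __builtin_longjmp must not be called from the same
   function that called __builtin_setjmp on this buffer.  */
__attribute__ ((noinline)) static void
bail_out (void)
{
  __builtin_longjmp (jump_buf, 1);  /* the second argument must be 1 */
}

/* Return 1 if control came back through the non-local jump, 0 otherwise.  */
int
came_back_via_longjmp (void)
{
  if (__builtin_setjmp (jump_buf) == 0)
    {
      bail_out ();
      return 0;  /* not reached if the jump works */
    }
  return 1;
}
```

Unlike a call to the library setjmp, both halves of this construct are
expanded by the compiler itself, which is the basis for the argument that
calls_setjmp should not be needed here.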
Patch

Index: function.h
===================================================================
--- function.h	(revision 174559)
+++ function.h	(working copy)
@@ -1,6 +1,6 @@ 
 /* Structure for saving state for a nested function.
    Copyright (C) 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
-   1999, 2000, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+   1999, 2000, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
    Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -476,9 +476,6 @@  struct GTY(()) stack_usage
      !ACCUMULATE_OUTGOING_ARGS, it contains the outgoing arguments.  */
   int pushed_stack_size;
 
-  /* # of dynamic allocations in the function.  */
-  unsigned int dynamic_alloc_count : 31;
-
   /* Nonzero if the amount of stack space allocated dynamically cannot
      be bounded at compile-time.  */
   unsigned int has_unbounded_dynamic_stack_size : 1;
@@ -487,7 +484,6 @@  struct GTY(()) stack_usage
 #define current_function_static_stack_size (cfun->su->static_stack_size)
 #define current_function_dynamic_stack_size (cfun->su->dynamic_stack_size)
 #define current_function_pushed_stack_size (cfun->su->pushed_stack_size)
-#define current_function_dynamic_alloc_count (cfun->su->dynamic_alloc_count)
 #define current_function_has_unbounded_dynamic_stack_size \
   (cfun->su->has_unbounded_dynamic_stack_size)
 #define current_function_allocates_dynamic_stack_space    \
Index: builtins.c
===================================================================
--- builtins.c	(revision 174559)
+++ builtins.c	(working copy)
@@ -806,10 +806,6 @@  expand_builtin_setjmp_setup (rtx buf_add
     emit_insn (gen_builtin_setjmp_setup (buf_addr));
 #endif
 
-  /* Tell optimize_save_area_alloca that extra work is going to
-     need to go on during alloca.  */
-  cfun->calls_setjmp = 1;
-
   /* We have a nonlocal label.   */
   cfun->has_nonlocal_label = 1;
 }
@@ -992,8 +988,8 @@  expand_builtin_nonlocal_goto (tree exp)
   r_label = convert_memory_address (Pmode, r_label);
   r_save_area = expand_normal (t_save_area);
   r_save_area = convert_memory_address (Pmode, r_save_area);
-  /* Copy the address of the save location to a register just in case it was based
-    on the frame pointer.   */
+  /* Copy the address of the save location to a register just in case it was
+     based on the frame pointer.   */
   r_save_area = copy_to_reg (r_save_area);
   r_fp = gen_rtx_MEM (Pmode, r_save_area);
   r_sp = gen_rtx_MEM (STACK_SAVEAREA_MODE (SAVE_NONLOCAL),
@@ -1013,11 +1009,7 @@  expand_builtin_nonlocal_goto (tree exp)
       emit_clobber (gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode)));
       emit_clobber (gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx));
 
-      /* Restore frame pointer for containing function.
-	 This sets the actual hard register used for the frame pointer
-	 to the location of the function's incoming static chain info.
-	 The non-local goto handler will then adjust it to contain the
-	 proper value and reload the argument pointer, if needed.  */
+      /* Restore frame pointer for containing function.  */
       emit_move_insn (hard_frame_pointer_rtx, r_fp);
       emit_stack_restore (SAVE_NONLOCAL, r_sp);
 
@@ -1066,29 +1058,13 @@  expand_builtin_nonlocal_goto (tree exp)
 static void
 expand_builtin_update_setjmp_buf (rtx buf_addr)
 {
-  enum machine_mode sa_mode = Pmode;
-  rtx stack_save;
-
-
-#ifdef HAVE_save_stack_nonlocal
-  if (HAVE_save_stack_nonlocal)
-    sa_mode = insn_data[(int) CODE_FOR_save_stack_nonlocal].operand[0].mode;
-#endif
-#ifdef STACK_SAVEAREA_MODE
-  sa_mode = STACK_SAVEAREA_MODE (SAVE_NONLOCAL);
-#endif
-
-  stack_save
+  enum machine_mode sa_mode = STACK_SAVEAREA_MODE (SAVE_NONLOCAL);
+  rtx stack_save
     = gen_rtx_MEM (sa_mode,
 		   memory_address
 		   (sa_mode,
 		    plus_constant (buf_addr, 2 * GET_MODE_SIZE (Pmode))));
 
-#ifdef HAVE_setjmp
-  if (HAVE_setjmp)
-    emit_insn (gen_setjmp ());
-#endif
-
   emit_stack_save (SAVE_NONLOCAL, &stack_save);
 }
 
Index: cse.c
===================================================================
--- cse.c	(revision 174559)
+++ cse.c	(working copy)
@@ -1,7 +1,7 @@ 
 /* Common subexpression elimination for GNU compiler.
    Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998
-   1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
-   Free Software Foundation, Inc.
+   1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
+   2011 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -6192,7 +6192,9 @@  cse_find_path (basic_block first_bb, str
 	  else
 	    e = NULL;
 
-	  if (e && e->dest != EXIT_BLOCK_PTR
+	  if (e
+	      && (e->flags & EDGE_ABNORMAL) == 0
+	      && e->dest != EXIT_BLOCK_PTR
 	      && single_pred_p (e->dest)
 	      /* Avoid visiting basic blocks twice.  The large comment
 		 above explains why this can happen.  */
Index: explow.c
===================================================================
--- explow.c	(revision 174559)
+++ explow.c	(working copy)
@@ -1,6 +1,6 @@ 
 /* Subroutines for manipulating rtx's in semantically interesting ways.
-   Copyright (C) 1987, 1991, 1994, 1995, 1996, 1997, 1998,
-   1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+   Copyright (C) 1987, 1991, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
+   2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
    Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -1249,38 +1249,6 @@  allocate_dynamic_stack_space (rtx size,
 	size_align = extra_align;
     }
 
-#ifdef SETJMP_VIA_SAVE_AREA
-  /* If setjmp restores regs from a save area in the stack frame,
-     avoid clobbering the reg save area.  Note that the offset of
-     virtual_incoming_args_rtx includes the preallocated stack args space.
-     It would be no problem to clobber that, but it's on the wrong side
-     of the old save area.
-
-     What used to happen is that, since we did not know for sure
-     whether setjmp() was invoked until after RTL generation, we
-     would use reg notes to store the "optimized" size and fix things
-     up later.  These days we know this information before we ever
-     start building RTL so the reg notes are unnecessary.  */
-  if (cfun->calls_setjmp)
-    {
-      rtx dynamic_offset
-	= expand_binop (Pmode, sub_optab, virtual_stack_dynamic_rtx,
-			stack_pointer_rtx, NULL_RTX, 1, OPTAB_LIB_WIDEN);
-
-      size = expand_binop (Pmode, add_optab, size, dynamic_offset,
-			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
-
-      /* The above dynamic offset cannot be computed statically at this
-	 point, but it will be possible to do so after RTL expansion is
-	 done.  Record how many times we will need to add it.  */
-      if (flag_stack_usage_info)
-	current_function_dynamic_alloc_count++;
-
-      /* ??? Can we infer a minimum of STACK_BOUNDARY here?  */
-      size_align = BITS_PER_UNIT;
-    }
-#endif /* SETJMP_VIA_SAVE_AREA */
-
   /* Round the size to a multiple of the required stack alignment.
      Since the stack if presumed to be rounded before this allocation,
      this will maintain the required alignment.
Index: function.c
===================================================================
--- function.c	(revision 174559)
+++ function.c	(working copy)
@@ -1937,17 +1937,6 @@  instantiate_virtual_regs (void)
      frame_pointer_rtx.  */
   virtuals_instantiated = 1;
 
-  /* See allocate_dynamic_stack_space for the rationale.  */
-#ifdef SETJMP_VIA_SAVE_AREA
-  if (flag_stack_usage_info && cfun->calls_setjmp)
-    {
-      int align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
-      dynamic_offset = (dynamic_offset + align - 1) / align * align;
-      current_function_dynamic_stack_size
-	+= current_function_dynamic_alloc_count * dynamic_offset;
-    }
-#endif
-
   return 0;
 }
 
Index: postreload-gcse.c
===================================================================
--- postreload-gcse.c	(revision 174559)
+++ postreload-gcse.c	(working copy)
@@ -1,5 +1,5 @@ 
 /* Post reload partially redundant load elimination
-   Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010
+   Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011
    Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -912,10 +912,12 @@  get_avail_load_store_reg (rtx insn)
 static bool
 bb_has_well_behaved_predecessors (basic_block bb)
 {
+  unsigned int edge_count = EDGE_COUNT (bb->preds);
   edge pred;
   edge_iterator ei;
 
-  if (EDGE_COUNT (bb->preds) == 0)
+  if (edge_count == 0
+      || (edge_count == 1 && (single_pred_edge (bb)->flags & EDGE_ABNORMAL)))
     return false;
 
   FOR_EACH_EDGE (pred, ei, bb->preds)
Index: config/sparc/sparc.md
===================================================================
--- config/sparc/sparc.md	(revision 174559)
+++ config/sparc/sparc.md	(working copy)
@@ -1,7 +1,7 @@ 
 ;; Machine description for SPARC chip for GCC
 ;;  Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
-;;  1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
-;;  Free Software Foundation, Inc.
+;;  1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
+;;  2011 Free Software Foundation, Inc.
 ;;  Contributed by Michael Tiemann (tiemann@cygnus.com)
 ;;  64-bit SPARC-V9 support by Michael Tiemann, Jim Wilson, and Doug Evans,
 ;;  at Cygnus Support.
@@ -70,7 +70,6 @@  (define_constants
    (UNSPECV_FLUSHW		1)
    (UNSPECV_GOTO		2)
    (UNSPECV_FLUSH		4)
-   (UNSPECV_SETJMP		5)
    (UNSPECV_SAVEW		6)
    (UNSPECV_CAS			8)
    (UNSPECV_SWAP		9)
@@ -6444,136 +6443,100 @@  (define_insn "*branch_sp64"
   "jmp\t%a0%#"
   [(set_attr "type" "uncond_branch")])
 
+(define_expand "save_stack_nonlocal"
+  [(set (match_operand 0 "memory_operand" "")
+	(match_operand 1 "register_operand" ""))
+   (set (match_dup 2) (match_dup 3))]
+  ""
+{
+  operands[0] = adjust_address_nv (operands[0], Pmode, 0);
+  operands[2] = adjust_address_nv (operands[0], Pmode, GET_MODE_SIZE (Pmode));
+  operands[3] = gen_rtx_REG (Pmode, 31); /* %i7 */
+})
+
+(define_expand "restore_stack_nonlocal"
+  [(set (match_operand 0 "register_operand" "")
+	(match_operand 1 "memory_operand" ""))]
+  ""
+{
+  operands[1] = adjust_address_nv (operands[1], Pmode, 0);
+})
+
 (define_expand "nonlocal_goto"
-  [(match_operand:SI 0 "general_operand" "")
-   (match_operand:SI 1 "general_operand" "")
-   (match_operand:SI 2 "general_operand" "")
-   (match_operand:SI 3 "" "")]
+  [(match_operand 0 "general_operand" "")
+   (match_operand 1 "general_operand" "")
+   (match_operand 2 "memory_operand" "")
+   (match_operand 3 "memory_operand" "")]
   ""
 {
-  rtx lab = operands[1];
-  rtx stack = operands[2];
-  rtx fp = operands[3];
-  rtx labreg;
+  rtx r_label = copy_to_reg (operands[1]);
+  rtx r_sp = adjust_address_nv (operands[2], Pmode, 0);
+  rtx r_fp = operands[3];
+  rtx r_i7 = adjust_address_nv (operands[2], Pmode, GET_MODE_SIZE (Pmode));
 
-  /* Trap instruction to flush all the register windows.  */
+  /* We need to flush all the register windows so that their contents will
+     be re-synchronized by the restore insn of the target function.  */
   emit_insn (gen_flush_register_windows ());
 
-  /* Load the fp value for the containing fn into %fp.  This is needed
-     because STACK refers to %fp.  Note that virtual register instantiation
-     fails if the virtual %fp isn't set from a register.  */
-  if (GET_CODE (fp) != REG)
-    fp = force_reg (Pmode, fp);
-  emit_move_insn (virtual_stack_vars_rtx, fp);
-
-  /* Find the containing function's current nonlocal goto handler,
-     which will do any cleanups and then jump to the label.  */
-  labreg = gen_rtx_REG (Pmode, 8);
-  emit_move_insn (labreg, lab);
-
-  /* Restore %fp from stack pointer value for containing function.
-     The restore insn that follows will move this to %sp,
-     and reload the appropriate value into %fp.  */
-  emit_move_insn (hard_frame_pointer_rtx, stack);
+  emit_clobber (gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode)));
+  emit_clobber (gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx));
 
+  /* Restore frame pointer for containing function.  */
+  emit_move_insn (hard_frame_pointer_rtx, r_fp);
+  emit_stack_restore (SAVE_NONLOCAL, r_sp);
+
+  /* USE of hard_frame_pointer_rtx added for consistency;
+     not clear if really needed.  */
+  emit_use (hard_frame_pointer_rtx);
   emit_use (stack_pointer_rtx);
 
-  /* ??? The V9-specific version was disabled in rev 1.65.  */
-  emit_jump_insn (gen_goto_handler_and_restore (labreg));
+  /* We need to smuggle the load of %i7 as it is a fixed register.  */
+  emit_jump_insn (gen_nonlocal_goto_internal (r_label, r_i7));
   emit_barrier ();
   DONE;
 })
 
-;; Special trap insn to flush register windows.
-(define_insn "flush_register_windows"
-  [(unspec_volatile [(const_int 0)] UNSPECV_FLUSHW)]
-  ""
-  { return TARGET_V9 ? "flushw" : "ta\t3"; }
-  [(set_attr "type" "flushw")])
-
-(define_insn "goto_handler_and_restore"
-  [(unspec_volatile [(match_operand 0 "register_operand" "=r")] UNSPECV_GOTO)]
-  "GET_MODE (operands[0]) == Pmode"
+(define_insn "nonlocal_goto_internal"
+  [(unspec_volatile [(match_operand 0 "register_operand" "r")
+                     (match_operand 1 "memory_operand" "m")] UNSPECV_GOTO)]
+  "GET_MODE (operands[0]) == Pmode && GET_MODE (operands[1]) == Pmode"
 {
   if (flag_delayed_branch)
-    return "jmp\t%0\n\t restore";
+    {
+      if (TARGET_ARCH64)
+	return "jmp\t%0\n\t ldx\t%1, %%i7";
+      else
+	return "jmp\t%0\n\t ld\t%1, %%i7";
+    }
   else
-    return "mov\t%0,%%g1\n\trestore\n\tjmp\t%%g1\n\t nop";
+    {
+      if (TARGET_ARCH64)
+	return "ldx\t%1, %%i7\n\tjmp\t%0\n\t nop";
+      else
+	return "ld\t%1, %%i7\n\tjmp\t%0\n\t nop";
+    }
 }
   [(set (attr "type") (const_string "multi"))
    (set (attr "length")
 	(if_then_else (eq_attr "delayed_branch" "true")
 		      (const_int 2)
-		      (const_int 4)))])
+		      (const_int 3)))])
 
-;; For __builtin_setjmp we need to flush register windows iff the function
-;; calls alloca as well, because otherwise the current register window might
-;; be saved after the %sp adjustment and thus setjmp would crash.
-(define_expand "builtin_setjmp_setup"
-  [(match_operand 0 "register_operand" "r")]
-  ""
+(define_expand "builtin_setjmp_receiver"
+  [(label_ref (match_operand 0 "" ""))]
+  "flag_pic"
 {
-  emit_insn (gen_do_builtin_setjmp_setup ());
+  load_got_register ();
   DONE;
 })
 
-(define_insn "do_builtin_setjmp_setup"
-  [(unspec_volatile [(const_int 0)] UNSPECV_SETJMP)]
-  ""
-{
-  if (!cfun->calls_alloca)
-    return "";
-  if (!TARGET_V9)
-    return "ta\t3";
-  fputs ("\tflushw\n", asm_out_file);
-  if (flag_pic)
-    fprintf (asm_out_file, "\tst%c\t%%l7, [%%sp+%d]\n",
-	     TARGET_ARCH64 ? 'x' : 'w',
-	     SPARC_STACK_BIAS + 7 * UNITS_PER_WORD);
-  fprintf (asm_out_file, "\tst%c\t%%fp, [%%sp+%d]\n",
-	   TARGET_ARCH64 ? 'x' : 'w',
-	   SPARC_STACK_BIAS + 14 * UNITS_PER_WORD);
-  fprintf (asm_out_file, "\tst%c\t%%i7, [%%sp+%d]\n",
-	   TARGET_ARCH64 ? 'x' : 'w',
-	   SPARC_STACK_BIAS + 15 * UNITS_PER_WORD);
-  return "";
-}
-  [(set_attr "type" "multi")
-   (set (attr "length")
-        (cond [(eq_attr "calls_alloca" "false")
-                 (const_int 0)
-               (eq_attr "isa" "!v9")
-                 (const_int 1)
-               (eq_attr "pic" "true")
-                 (const_int 4)] (const_int 3)))])
-
-;; Pattern for use after a setjmp to store registers into the save area.
+;; Special insn to flush register windows.
 
-(define_expand "setjmp"
-  [(const_int 0)]
+(define_insn "flush_register_windows"
+  [(unspec_volatile [(const_int 0)] UNSPECV_FLUSHW)]
   ""
-{
-  rtx mem;
-
-  if (flag_pic)
-    {
-      mem = gen_rtx_MEM (Pmode,
-			 plus_constant (stack_pointer_rtx,
-					SPARC_STACK_BIAS + 7 * UNITS_PER_WORD));
-      emit_insn (gen_rtx_SET (VOIDmode, mem, pic_offset_table_rtx));
-    }
-
-  mem = gen_rtx_MEM (Pmode,
-		     plus_constant (stack_pointer_rtx,
-				    SPARC_STACK_BIAS + 14 * UNITS_PER_WORD));
-  emit_insn (gen_rtx_SET (VOIDmode, mem, hard_frame_pointer_rtx));
-
-  mem = gen_rtx_MEM (Pmode,
-		     plus_constant (stack_pointer_rtx,
-				    SPARC_STACK_BIAS + 15 * UNITS_PER_WORD));
-  emit_insn (gen_rtx_SET (VOIDmode, mem, gen_rtx_REG (Pmode, 31)));
-  DONE;
-})
+  { return TARGET_V9 ? "flushw" : "ta\t3"; }
+  [(set_attr "type" "flushw")])
 
 ;; Special pattern for the FLUSH instruction.
 
Index: config/sparc/sparc-protos.h
===================================================================
--- config/sparc/sparc-protos.h	(revision 174559)
+++ config/sparc/sparc-protos.h	(working copy)
@@ -60,6 +60,7 @@  extern bool constant_address_p (rtx);
 extern bool legitimate_pic_operand_p (rtx);
 extern rtx sparc_legitimize_reload_address (rtx, enum machine_mode, int, int,
 					    int, int *win);
+extern void load_got_register (void);
 extern void sparc_emit_call_insn (rtx, rtx);
 extern void sparc_defer_case_vector (rtx, rtx, int);
 extern bool sparc_expand_move (enum machine_mode, rtx *);
Index: config/sparc/sparc.c
===================================================================
--- config/sparc/sparc.c	(revision 174559)
+++ config/sparc/sparc.c	(working copy)
@@ -387,7 +387,6 @@  static rtx sparc_builtin_saveregs (void)
 static int epilogue_renumber (rtx *, int);
 static bool sparc_assemble_integer (rtx, unsigned int, int);
 static int set_extends (rtx);
-static void load_got_register (void);
 static int save_or_restore_regs (int, int, rtx, int, int);
 static void emit_save_or_restore_regs (int);
 static void sparc_asm_function_prologue (FILE *, HOST_WIDE_INT);
@@ -464,6 +463,7 @@  static void sparc_output_dwarf_dtprel (F
 static void sparc_file_end (void);
 static bool sparc_frame_pointer_required (void);
 static bool sparc_can_eliminate (const int, const int);
+static rtx sparc_builtin_setjmp_frame_value (void);
 static void sparc_conditional_register_usage (void);
 #ifdef TARGET_ALTERNATE_LONG_DOUBLE_MANGLING
 static const char *sparc_mangle_type (const_tree);
@@ -650,8 +650,12 @@  static const struct default_options spar
 #undef TARGET_FRAME_POINTER_REQUIRED
 #define TARGET_FRAME_POINTER_REQUIRED sparc_frame_pointer_required
 
+#undef TARGET_BUILTIN_SETJMP_FRAME_VALUE
+#define TARGET_BUILTIN_SETJMP_FRAME_VALUE sparc_builtin_setjmp_frame_value
+
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE sparc_can_eliminate
+
 #undef  TARGET_PREFERRED_RELOAD_CLASS
 #define TARGET_PREFERRED_RELOAD_CLASS sparc_preferred_reload_class
 
@@ -3770,7 +3774,7 @@  gen_load_pcrel_sym (rtx op0, rtx op1, rt
 
 /* Emit code to load the GOT register.  */
 
-static void
+void
 load_got_register (void)
 {
   /* In PIC mode, this will retrieve pic_offset_table_rtx.  */
@@ -9801,7 +9805,7 @@  sparc_expand_compare_and_swap_12 (rtx re
 
 /* Implement TARGET_FRAME_POINTER_REQUIRED.  */
 
-bool
+static bool
 sparc_frame_pointer_required (void)
 {
   return !(current_function_is_leaf && only_leaf_regs_used ());
@@ -9812,11 +9816,18 @@  sparc_frame_pointer_required (void)
    in that case.  But the test in update_eliminables doesn't know we are
    assuming below that we only do the former elimination.  */
 
-bool
+static bool
 sparc_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
 {
-  return (to == HARD_FRAME_POINTER_REGNUM
-          || !targetm.frame_pointer_required ());
+  return to == HARD_FRAME_POINTER_REGNUM || !sparc_frame_pointer_required ();
+}
+
+/* Return the hard frame pointer directly to bypass the stack bias.  */
+
+static rtx
+sparc_builtin_setjmp_frame_value (void)
+{
+  return hard_frame_pointer_rtx;
 }
 
 /* If !TARGET_FPU, then make the fp registers and fp cc regs fixed so that
Index: config/sparc/sparc.h
===================================================================
--- config/sparc/sparc.h	(revision 174559)
+++ config/sparc/sparc.h	(working copy)
@@ -1384,11 +1384,19 @@  do {									\
 #define EPILOGUE_USES(REGNO) ((REGNO) == 31 \
   || (crtl->calls_eh_return && (REGNO) == 1))
 
+/* We need 2 words, so we can save the stack pointer and the return register
+   of the function containing a non-local goto target.  */
+
+#define STACK_SAVEAREA_MODE(LEVEL) \
+  ((LEVEL) == SAVE_NONLOCAL ? (TARGET_ARCH64 ? TImode : DImode) : Pmode)
+
 /* Length in units of the trampoline for entering a nested function.  */
 
 #define TRAMPOLINE_SIZE (TARGET_ARCH64 ? 32 : 16)
 
-#define TRAMPOLINE_ALIGNMENT 128 /* 16 bytes */
+/* Alignment required for trampolines, in bits.  */
+
+#define TRAMPOLINE_ALIGNMENT 128
 
 /* Generate RTL to flush the register windows so as to make arbitrary frames
    available.  */