PowerPC shrink-wrap support 3 of 3

Submitter Alan Modra
Date Sept. 28, 2011, 3:47 p.m.
Message ID <20110928154747.GO10321@bubble.grove.modra.org>
Permalink /patch/116830/
State New

Comments

Alan Modra - Sept. 28, 2011, 3:47 p.m.
This supersedes http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01004.html
and http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01593.html, fixing
the two regressions introduced by those patches.  The first patch is
unchanged except to leave all the out-of-line restore functions using
"return" rather than "simple_return".  We don't want these being
confused with a plain "simple_return" and perhaps used by the shrink-
wrapping to return from pre-prologue code.

The second of these two patches was way too simplistic.  It was a real
pain getting the cfa_restores correct.  A lot were missing, or emitted
at the wrong place (due to a bug in rs6000_emit_stack_reset).  I also
had the real restore insn move past the cfa_restores ("mtlr 0" insn
scheduled over loads from stack).
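
As a rough illustration of how the patch's emit_cfa_restores helper hangs
a pending list of notes off the last insn, here is a minimal standalone
sketch.  The struct note and struct insn types and the push_note,
append_notes and count_notes helpers are hypothetical stand-ins, not
GCC's rtx API; only the pointer-to-pointer walk mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's rtx note chain; not the real types.  */
struct note
{
  int regno;           /* register the note describes */
  struct note *next;   /* plays the role of XEXP (note, 1) */
};

struct insn
{
  struct note *notes;  /* plays the role of REG_NOTES (insn) */
  int frame_related;   /* plays the role of RTX_FRAME_RELATED_P (insn) */
};

/* Push a note on the front of PENDING, the way the patch builds up
   cfa_restores by passing the old list as the last argument.  */
static struct note *
push_note (struct note *storage, int regno, struct note *pending)
{
  storage->regno = regno;
  storage->next = pending;
  return storage;
}

/* Append PENDING after any notes already on INSN and mark the insn
   frame related.  Walking a pointer to the link field means the empty
   list needs no special case.  */
static void
append_notes (struct insn *insn, struct note *pending)
{
  struct note **loc;

  assert (insn != NULL);
  loc = &insn->notes;
  while (*loc)
    loc = &(*loc)->next;
  *loc = pending;
  insn->frame_related = 1;
}

/* Count the notes currently attached to INSN.  */
static int
count_notes (const struct insn *insn)
{
  int n = 0;
  const struct note *p;

  for (p = insn->notes; p != NULL; p = p->next)
    n++;
  return n;
}
```

In this sketch, existing notes on the insn are kept in place and the
pending list is spliced on at the tail, so nothing already attached is
lost.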

	* config/rs6000/rs6000.c (rs6000_make_savres_rtx): Delete unneeded
	declaration.
	(rs6000_emit_stack_reset): Only return insn emitted when it adjusts sp.
	(rs6000_make_savres_rtx): Rename to rs6000_emit_savres_rtx.  Use
	simple_return in pattern, emit instruction, and set jump_label.
	(rs6000_emit_prologue): Update for rs6000_emit_savres_rtx.  Use
	simple_return rather than return.
	(emit_cfa_restores): New function.
	(rs6000_emit_epilogue): Emit cfa_restores when flag_shrink_wrap.
	Add missing cfa_restores for SAVE_WORLD.  Add missing LR cfa_restore
	when using out-of-line gpr restore.  Add missing LR and FP regs
	cfa_restores for out-of-line fpr restore.  Consolidate code setting
	up cfa_restores.  Formatting.  Use LR_REGNO define.
	(rs6000_output_mi_thunk): Use simple_return rather than return.
	* config/rs6000/rs6000.md (sibcall*, sibcall_value*): Likewise.
	(return_internal*): Likewise.
	(any_return, return_pred, return_str): New iterators.
	(return, conditional return insns): Provide both return and
	simple_return variants.
	* config/rs6000/rs6000.h (EARLY_R12, LATE_R12): Define.
	(REG_ALLOC_ORDER): Move r12 before call-saved regs when FIXED_R13.
	Move r11 and r0 later to suit shrink-wrapping.
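
The EARLY_R12/LATE_R12 entry relies on macros that expand either to
"12," or to nothing, so r12 lands in exactly one of two slots in the
initializer.  A minimal standalone sketch of that trick follows.
DEMO_FIXED_R13, DEMO_ALLOC_ORDER, demo_order, demo_order_len and
position_of are hypothetical names for illustration, and the register
list is heavily shortened compared with the real REG_ALLOC_ORDER.

```c
#include <stddef.h>

/* Hypothetical switch standing in for FIXED_R13.  */
#ifndef DEMO_FIXED_R13
#define DEMO_FIXED_R13 1
#endif

/* Exactly one of the two macros carries "12," including its comma;
   the other expands to nothing.  */
#if DEMO_FIXED_R13 == 1
#define EARLY_R12 12,
#define LATE_R12
#else
#define EARLY_R12
#define LATE_R12 12,
#endif

/* Shortened order: volatile regs, then call-saved regs, then the rest.  */
#define DEMO_ALLOC_ORDER \
  { 9, 10, 3, EARLY_R12 11, 0, 31, 30, 13, LATE_R12 64 }

static const int demo_order[] = DEMO_ALLOC_ORDER;

/* Number of entries in the shortened order.  */
static size_t
demo_order_len (void)
{
  return sizeof demo_order / sizeof demo_order[0];
}

/* Index of REGNO in the order, or -1 if it is absent.  */
static int
position_of (int regno)
{
  size_t i;

  for (i = 0; i < demo_order_len (); i++)
    if (demo_order[i] == regno)
      return (int) i;
  return -1;
}
```

With DEMO_FIXED_R13 left at 1, position_of (12) is smaller than
position_of (31), so r12 is preferred over the call-saved registers.
Building with -DDEMO_FIXED_R13=0 moves it after them.  The array length
is the same either way.
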
David Edelsohn - Oct. 16, 2011, 6:51 p.m.
On Wed, Sep 28, 2011 at 11:47 AM, Alan Modra <amodra@gmail.com> wrote:

>        * config/rs6000/rs6000.c (rs6000_make_savres_rtx): Delete unneeded
>        declaration.
>        (rs6000_emit_stack_reset): Only return insn emitted when it adjusts sp.
>        (rs6000_make_savres_rtx): Rename to rs6000_emit_savres_rtx.  Use
>        simple_return in pattern, emit instruction, and set jump_label.
>        (rs6000_emit_prologue): Update for rs6000_emit_savres_rtx.  Use
>        simple_return rather than return.
>        (emit_cfa_restores): New function.
>        (rs6000_emit_epilogue): Emit cfa_restores when flag_shrink_wrap.
>        Add missing cfa_restores for SAVE_WORLD.  Add missing LR cfa_restore
>        when using out-of-line gpr restore.  Add missing LR and FP regs
>        cfa_restores for out-of-line fpr restore.  Consolidate code setting
>        up cfa_restores.  Formatting.  Use LR_REGNO define.
>        (rs6000_output_mi_thunk): Use simple_return rather than return.
>        * config/rs6000/rs6000.md (sibcall*, sibcall_value*): Likewise.
>        (return_internal*): Likewise.
>        (any_return, return_pred, return_str): New iterators.
>        (return, conditional return insns): Provide both return and
>        simple_return variants.
>        * gcc/config/rs6000/rs6000.h (EARLY_R12, LATE_R12): Define.
>        (REG_ALLOC_ORDER): Move r12 before call-saved regs when FIXED_R13.
>        Move r11 and r0 later to suit shrink-wrapping.

Alan,

The patch is okay, although I am not thrilled about the need to change
the register allocation order.

Thanks, David

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 178876)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -899,8 +900,6 @@  static const char *rs6000_mangle_type (c
 static void rs6000_set_default_type_attributes (tree);
 static rtx rs6000_savres_routine_sym (rs6000_stack_t *, bool, bool, bool);
 static rtx rs6000_emit_stack_reset (rs6000_stack_t *, rtx, rtx, int, bool);
-static rtx rs6000_make_savres_rtx (rs6000_stack_t *, rtx, int,
-				   enum machine_mode, bool, bool, bool);
 static bool rs6000_reg_live_or_pic_offset_p (int);
 static tree rs6000_builtin_vectorized_libmass (tree, tree, tree);
 static tree rs6000_builtin_vectorized_function (tree, tree, tree);
@@ -19704,8 +19728,10 @@  rs6000_emit_stack_reset (rs6000_stack_t 
       if (sp_offset != 0)
 	{
 	  rtx dest_reg = savres ? gen_rtx_REG (Pmode, 11) : sp_reg_rtx;
-	  return emit_insn (gen_add3_insn (dest_reg, frame_reg_rtx,
-					   GEN_INT (sp_offset)));
+	  rtx insn = emit_insn (gen_add3_insn (dest_reg, frame_reg_rtx,
+					       GEN_INT (sp_offset)));
+	  if (!savres)
+	    return insn;
 	}
       else if (!savres)
 	return emit_move_insn (sp_reg_rtx, frame_reg_rtx);
@@ -19729,10 +19755,11 @@  rs6000_emit_stack_reset (rs6000_stack_t 
 }
 
 /* Construct a parallel rtx describing the effect of a call to an
-   out-of-line register save/restore routine.  */
+   out-of-line register save/restore routine, and emit the insn
+   or jump_insn as appropriate.  */
 
 static rtx
-rs6000_make_savres_rtx (rs6000_stack_t *info,
+rs6000_emit_savres_rtx (rs6000_stack_t *info,
 			rtx frame_reg_rtx, int save_area_offset,
 			enum machine_mode reg_mode,
 			bool savep, bool gpr, bool lr)
@@ -19742,6 +19769,7 @@  rs6000_make_savres_rtx (rs6000_stack_t *
   int reg_size = GET_MODE_SIZE (reg_mode);
   rtx sym;
   rtvec p;
+  rtx par, insn;
 
   offset = 0;
   start_reg = (gpr
@@ -19755,7 +19783,7 @@  rs6000_make_savres_rtx (rs6000_stack_t *
     RTVEC_ELT (p, offset++) = ret_rtx;
 
   RTVEC_ELT (p, offset++)
-    = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, 65));
+    = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, LR_REGNO));
 
   sym = rs6000_savres_routine_sym (info, savep, gpr, lr);
   RTVEC_ELT (p, offset++) = gen_rtx_USE (VOIDmode, sym);
@@ -19788,7 +19816,16 @@  rs6000_make_savres_rtx (rs6000_stack_t *
       RTVEC_ELT (p, i + offset) = gen_rtx_SET (VOIDmode, mem, reg);
     }
 
-  return gen_rtx_PARALLEL (VOIDmode, p);
+  par = gen_rtx_PARALLEL (VOIDmode, p);
+
+  if (!savep && lr)
+    {
+      insn = emit_jump_insn (par);
+      JUMP_LABEL (insn) = ret_rtx;
+    }
+  else
+    insn = emit_insn (par);
+  return insn;
 }
 
 /* Determine whether the gp REG is really used.  */
@@ -20087,16 +20124,13 @@  rs6000_emit_prologue (void)
     }
   else if (!WORLD_SAVE_P (info) && info->first_fp_reg_save != 64)
     {
-      rtx par;
-
-      par = rs6000_make_savres_rtx (info, frame_reg_rtx,
-				    info->fp_save_offset + sp_offset,
-				    DFmode,
-				    /*savep=*/true, /*gpr=*/false,
-				    /*lr=*/(strategy
-					    & SAVE_NOINLINE_FPRS_SAVES_LR)
-					   != 0);
-      insn = emit_insn (par);
+      insn = rs6000_emit_savres_rtx (info, frame_reg_rtx,
+				     info->fp_save_offset + sp_offset,
+				     DFmode,
+				     /*savep=*/true, /*gpr=*/false,
+				     /*lr=*/((strategy
+					      & SAVE_NOINLINE_FPRS_SAVES_LR)
+					     != 0));
       rs6000_frame_related (insn, frame_ptr_rtx, info->total_size,
 			    NULL_RTX, NULL_RTX);
     }
@@ -20186,13 +20220,10 @@  rs6000_emit_prologue (void)
 	}
       else
 	{
-	  rtx par;
-
-	  par = rs6000_make_savres_rtx (info, gen_rtx_REG (Pmode, 11),
-					0, reg_mode,
-					/*savep=*/true, /*gpr=*/true,
-					/*lr=*/false);
-	  insn = emit_insn (par);
+	  insn = rs6000_emit_savres_rtx (info, gen_rtx_REG (Pmode, 11),
+					 0, reg_mode,
+					 /*savep=*/true, /*gpr=*/true,
+					 /*lr=*/false);
 	  rs6000_frame_related (insn, frame_ptr_rtx, info->total_size,
 				NULL_RTX, NULL_RTX);
 	}
@@ -20204,8 +20235,6 @@  rs6000_emit_prologue (void)
     }
   else if (!WORLD_SAVE_P (info) && !saving_GPRs_inline)
     {
-      rtx par;
-
       /* Need to adjust r11 (r12) if we saved any FPRs.  */
       if (info->first_fp_reg_save != 64)
         {
@@ -20216,14 +20245,13 @@  rs6000_emit_prologue (void)
 	  emit_insn (gen_add3_insn (dest_reg, frame_reg_rtx, offset));
         }
 
-      par = rs6000_make_savres_rtx (info, frame_reg_rtx,
-				    info->gp_save_offset + sp_offset,
-				    reg_mode,
-				    /*savep=*/true, /*gpr=*/true,
-				    /*lr=*/(strategy
-					    & SAVE_NOINLINE_GPRS_SAVES_LR)
-					   != 0);
-      insn = emit_insn (par);
+      insn = rs6000_emit_savres_rtx (info, frame_reg_rtx,
+				     info->gp_save_offset + sp_offset,
+				     reg_mode,
+				     /*savep=*/true, /*gpr=*/true,
+				     /*lr=*/((strategy
+					      & SAVE_NOINLINE_GPRS_SAVES_LR)
+					     != 0));
       rs6000_frame_related (insn, frame_ptr_rtx, info->total_size,
 			    NULL_RTX, NULL_RTX);
     }
@@ -20672,6 +20718,20 @@  offset_below_red_zone_p (HOST_WIDE_INT o
 		   : TARGET_32BIT ? -220 : -288);
 }
 
+/* Append CFA_RESTORES to any existing REG_NOTES on the last insn.  */
+
+static void
+emit_cfa_restores (rtx cfa_restores)
+{
+  rtx insn = get_last_insn ();
+  rtx *loc = &REG_NOTES (insn);
+
+  while (*loc)
+    loc = &XEXP (*loc, 1);
+  *loc = cfa_restores;
+  RTX_FRAME_RELATED_P (insn) = 1;
+}
+
 /* Emit function epilogue as insns.  */
 
 void
@@ -20769,6 +20829,14 @@  rs6000_emit_epilogue (int sibcall)
 	rtx mem = gen_frame_mem (reg_mode, addr);
 
 	RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, reg, mem);
+
+	if (flag_shrink_wrap)
+	  {
+	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE,
+					   gen_rtx_REG (Pmode, LR_REGNO),
+					   cfa_restores);
+	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
+	  }
       }
 
       for (i = 0; i < 32 - info->first_gp_reg_save; i++)
@@ -20780,6 +20848,8 @@  rs6000_emit_epilogue (int sibcall)
 	  rtx mem = gen_frame_mem (reg_mode, addr);
 
 	  RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, reg, mem);
+	  if (flag_shrink_wrap)
+	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
       for (i = 0; info->first_altivec_reg_save + i <= LAST_ALTIVEC_REGNO; i++)
 	{
@@ -20790,6 +20860,8 @@  rs6000_emit_epilogue (int sibcall)
 	  rtx mem = gen_frame_mem (V4SImode, addr);
 
 	  RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, reg, mem);
+	  if (flag_shrink_wrap)
+	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
       for (i = 0; info->first_fp_reg_save + i <= 63; i++)
 	{
@@ -20803,6 +20875,8 @@  rs6000_emit_epilogue (int sibcall)
 				     ? DFmode : SFmode), addr);
 
 	  RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, reg, mem);
+	  if (flag_shrink_wrap)
+	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
       RTVEC_ELT (p, j++)
 	= gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, 0));
@@ -20814,8 +20888,14 @@  rs6000_emit_epilogue (int sibcall)
 	= gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (SImode, 8));
       RTVEC_ELT (p, j++)
 	= gen_rtx_USE (VOIDmode, gen_rtx_REG (SImode, 10));
-      emit_jump_insn (gen_rtx_PARALLEL (VOIDmode, p));
+      insn = emit_jump_insn (gen_rtx_PARALLEL (VOIDmode, p));
 
+      if (flag_shrink_wrap)
+	{
+	  REG_NOTES (insn) = cfa_restores;
+	  add_reg_note (insn, REG_CFA_DEF_CFA, sp_reg_rtx);
+	  RTX_FRAME_RELATED_P (insn) = 1;
+	}
       return;
     }
 
@@ -20860,9 +20940,10 @@  rs6000_emit_epilogue (int sibcall)
 
 	    reg = gen_rtx_REG (V4SImode, i);
 	    emit_move_insn (reg, mem);
-	    if (offset_below_red_zone_p (info->altivec_save_offset
-					 + (i - info->first_altivec_reg_save)
-					   * 16))
+	    if (flag_shrink_wrap
+		|| offset_below_red_zone_p (info->altivec_save_offset
+					    + (i - info->first_altivec_reg_save)
+					    * 16))
 	      cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
 					     cfa_restores);
 	  }
@@ -21001,7 +21082,7 @@  rs6000_emit_epilogue (int sibcall)
 
 	    reg = gen_rtx_REG (V4SImode, i);
 	    emit_move_insn (reg, mem);
-	    if (DEFAULT_ABI == ABI_V4)
+	    if (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
 	      cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
 					     cfa_restores);
 	  }
@@ -21051,8 +21132,7 @@  rs6000_emit_epilogue (int sibcall)
       emit_move_insn (cr_save_reg, mem);
     }
 
-  /* Set LR here to try to overlap restores below.  LR is always saved
-     above incoming stack, so it never needs REG_CFA_RESTORE.  */
+  /* Set LR here to try to overlap restores below.  */
   if (restore_lr && restoring_GPRs_inline)
     emit_move_insn (gen_rtx_REG (Pmode, LR_REGNO),
 		    gen_rtx_REG (Pmode, 0));
@@ -21090,7 +21170,7 @@  rs6000_emit_epilogue (int sibcall)
   /* Restore GPRs.  This is done as a PARALLEL if we are using
      the load-multiple instructions.  */
   if (TARGET_SPE_ABI
-      && info->spe_64bit_regs_used != 0
+      && info->spe_64bit_regs_used
       && info->first_gp_reg_save != 32)
     {
       /* Determine whether we can address all of the registers that need
@@ -21114,7 +21194,7 @@  rs6000_emit_epilogue (int sibcall)
 	  int ool_adjust = (restoring_GPRs_inline
 			    ? 0
 			    : (info->first_gp_reg_save
-			       - (FIRST_SAVRES_REGISTER+1))*8);
+			       - (FIRST_SAVRES_REGISTER + 1)) * 8);
 
 	  if (frame_reg_rtx == sp_reg_rtx)
 	    frame_reg_rtx = gen_rtx_REG (Pmode, 11);
@@ -21145,48 +21225,28 @@  rs6000_emit_epilogue (int sibcall)
 		mem = gen_rtx_MEM (V2SImode, addr);
 		reg = gen_rtx_REG (reg_mode, info->first_gp_reg_save + i);
 
-		insn = emit_move_insn (reg, mem);
-		if (DEFAULT_ABI == ABI_V4)
-		  {
-		    if (frame_pointer_needed
-			&& info->first_gp_reg_save + i
-			   == HARD_FRAME_POINTER_REGNUM)
-		      {
-			add_reg_note (insn, REG_CFA_DEF_CFA,
-				      plus_constant (frame_reg_rtx,
-						     sp_offset));
-			RTX_FRAME_RELATED_P (insn) = 1;
-		      }
-
-		    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
-						   cfa_restores);
-		  }
+		emit_move_insn (reg, mem);
 	      }
 	}
       else
-	{
-	  rtx par;
-
-	  par = rs6000_make_savres_rtx (info, gen_rtx_REG (Pmode, 11),
-					0, reg_mode,
-					/*savep=*/false, /*gpr=*/true,
-					/*lr=*/true);
-	  emit_jump_insn (par);
-	  /* We don't want anybody else emitting things after we jumped
-	     back.  */
-	  return;
-	}
+	rs6000_emit_savres_rtx (info, gen_rtx_REG (Pmode, 11),
+				0, reg_mode,
+				/*savep=*/false, /*gpr=*/true,
+				/*lr=*/true);
     }
   else if (!restoring_GPRs_inline)
     {
       /* We are jumping to an out-of-line function.  */
       bool can_use_exit = info->first_fp_reg_save == 64;
-      rtx par;
 
       /* Emit stack reset code if we need it.  */
       if (can_use_exit)
-	rs6000_emit_stack_reset (info, sp_reg_rtx, frame_reg_rtx,
-				 sp_offset, can_use_exit);
+	{
+	  rs6000_emit_stack_reset (info, sp_reg_rtx, frame_reg_rtx,
+				   sp_offset, can_use_exit);
+	  if (info->cr_save_p)
+	    rs6000_restore_saved_cr (cr_save_reg, using_mtcr_multiple);
+	}
       else
 	{
 	  emit_insn (gen_add3_insn (gen_rtx_REG (Pmode, DEFAULT_ABI == ABI_AIX
@@ -21197,45 +21257,10 @@  rs6000_emit_epilogue (int sibcall)
 	    sp_offset += info->fp_size;
 	}
 
-      par = rs6000_make_savres_rtx (info, frame_reg_rtx,
-				    info->gp_save_offset, reg_mode,
-				    /*savep=*/false, /*gpr=*/true,
-				    /*lr=*/can_use_exit);
-
-      if (can_use_exit)
-	{
-	  if (info->cr_save_p)
-	    {
-	      rs6000_restore_saved_cr (cr_save_reg, using_mtcr_multiple);
-	      if (DEFAULT_ABI == ABI_V4)
-		cfa_restores
-		  = alloc_reg_note (REG_CFA_RESTORE,
-				    gen_rtx_REG (SImode, CR2_REGNO),
-				    cfa_restores);
-	    }
-
-	  emit_jump_insn (par);
-
-	  /* We don't want anybody else emitting things after we jumped
-	     back.  */
-	  return;
-	}
-
-      insn = emit_insn (par);
-      if (DEFAULT_ABI == ABI_V4)
-	{
-	  if (frame_pointer_needed)
-	    {
-	      add_reg_note (insn, REG_CFA_DEF_CFA,
-			    plus_constant (frame_reg_rtx, sp_offset));
-	      RTX_FRAME_RELATED_P (insn) = 1;
-	    }
-
-	  for (i = info->first_gp_reg_save; i < 32; i++)
-	    cfa_restores
-	      = alloc_reg_note (REG_CFA_RESTORE,
-				gen_rtx_REG (reg_mode, i), cfa_restores);
-	}
+      rs6000_emit_savres_rtx (info, frame_reg_rtx,
+			      info->gp_save_offset, reg_mode,
+			      /*savep=*/false, /*gpr=*/true,
+			      /*lr=*/can_use_exit);
     }
   else if (using_load_multiple)
     {
@@ -21251,17 +21276,8 @@  rs6000_emit_epilogue (int sibcall)
 	  rtx reg = gen_rtx_REG (reg_mode, info->first_gp_reg_save + i);
 
 	  RTVEC_ELT (p, i) = gen_rtx_SET (VOIDmode, reg, mem);
-	  if (DEFAULT_ABI == ABI_V4)
-	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
-					   cfa_restores);
-	}
-      insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
-      if (DEFAULT_ABI == ABI_V4 && frame_pointer_needed)
-	{
-	  add_reg_note (insn, REG_CFA_DEF_CFA,
-			plus_constant (frame_reg_rtx, sp_offset));
-	  RTX_FRAME_RELATED_P (insn) = 1;
 	}
+      emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
     }
   else
     {
@@ -21275,24 +21291,70 @@  rs6000_emit_epilogue (int sibcall)
             rtx mem = gen_frame_mem (reg_mode, addr);
 	    rtx reg = gen_rtx_REG (reg_mode, info->first_gp_reg_save + i);
 
-	    insn = emit_move_insn (reg, mem);
-	    if (DEFAULT_ABI == ABI_V4)
-	      {
-	        if (frame_pointer_needed
-		    && info->first_gp_reg_save + i
-		       == HARD_FRAME_POINTER_REGNUM)
-		  {
-		    add_reg_note (insn, REG_CFA_DEF_CFA,
-				  plus_constant (frame_reg_rtx, sp_offset));
-		    RTX_FRAME_RELATED_P (insn) = 1;
-		  }
-
-		cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
-					       cfa_restores);
-	      }
+	    emit_move_insn (reg, mem);
           }
     }
 
+  if (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+    {
+      /* If the frame pointer was used then we can't delay emitting
+	 a REG_CFA_DEF_CFA note.  This must happen on the insn that
+	 restores the frame pointer, r31.  We may have already emitted
+	 a REG_CFA_DEF_CFA note, but that's OK;  A duplicate is
+	 discarded by dwarf2cfi.c/dwarf2out.c, and in any case would
+	 be harmless if emitted.  */
+      if (frame_pointer_needed)
+	{
+	  insn = get_last_insn ();
+	  add_reg_note (insn, REG_CFA_DEF_CFA,
+			plus_constant (frame_reg_rtx, sp_offset));
+	  RTX_FRAME_RELATED_P (insn) = 1;
+	}
+
+      /* Set up cfa_restores.  We always need these when
+	 shrink-wrapping.  If not shrink-wrapping then we only need
+	 the cfa_restore when the stack location is no longer valid.
+	 The cfa_restores must be emitted on or before the insn that
+	 invalidates the stack, and of course must not be emitted
+	 before the insn that actually does the restore.  The latter
+	 is why the LR cfa_restore condition below is a little
+	 complicated.  It's also why it is a bad idea to emit the
+	 cfa_restores as a group on the last instruction here that
+	 actually does a restore: That insn may be reordered with
+	 respect to others doing restores.  */
+      if (info->cr_save_p)
+	cfa_restores = alloc_reg_note (REG_CFA_RESTORE,
+				       gen_rtx_REG (SImode, CR2_REGNO),
+				       cfa_restores);
+      if (flag_shrink_wrap
+	  && (restore_lr
+	      || (info->lr_save_p
+		  && !restoring_GPRs_inline
+		  && info->first_fp_reg_save == 64)))
+	cfa_restores = alloc_reg_note (REG_CFA_RESTORE,
+				       gen_rtx_REG (Pmode, LR_REGNO),
+				       cfa_restores);
+
+      for (i = info->first_gp_reg_save; i < 32; i++)
+	if (!restoring_GPRs_inline
+	    || using_load_multiple
+	    || rs6000_reg_live_or_pic_offset_p (i))
+	  {
+	    rtx reg = gen_rtx_REG (reg_mode, i);
+
+	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
+	  }
+    }
+
+  if (!restoring_GPRs_inline
+      && info->first_fp_reg_save == 64)
+    {
+      /* We are jumping to an out-of-line function.  */
+      if (cfa_restores)
+	emit_cfa_restores (cfa_restores);
+      return;
+    }
+
   if (restore_lr && !restoring_GPRs_inline)
     {
       rtx mem = gen_frame_mem_offset (Pmode, frame_reg_rtx,
@@ -21306,8 +21368,8 @@  rs6000_emit_epilogue (int sibcall)
   /* Restore fpr's if we need to do it without calling a function.  */
   if (restoring_FPRs_inline)
     for (i = 0; i < 64 - info->first_fp_reg_save; i++)
-      if ((df_regs_ever_live_p (info->first_fp_reg_save+i)
-	   && ! call_used_regs[info->first_fp_reg_save+i]))
+      if ((df_regs_ever_live_p (info->first_fp_reg_save + i)
+	   && !call_used_regs[info->first_fp_reg_save + i]))
 	{
 	  rtx addr, mem, reg;
 	  addr = gen_rtx_PLUS (Pmode, frame_reg_rtx,
@@ -21321,20 +21383,13 @@  rs6000_emit_epilogue (int sibcall)
 			     info->first_fp_reg_save + i);
 
  	  emit_move_insn (reg, mem);
-	  if (DEFAULT_ABI == ABI_V4)
-	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
-					   cfa_restores);
+	  if (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
 
   /* If we saved cr, restore it here.  Just those that were used.  */
   if (info->cr_save_p)
-    {
-      rs6000_restore_saved_cr (cr_save_reg, using_mtcr_multiple);
-      if (DEFAULT_ABI == ABI_V4)
-	cfa_restores
-	  = alloc_reg_note (REG_CFA_RESTORE, gen_rtx_REG (SImode, CR2_REGNO),
-			    cfa_restores);
-    }
+    rs6000_restore_saved_cr (cr_save_reg, using_mtcr_multiple);
 
   /* If this is V.4, unwind the stack pointer after all of the loads
      have been done.  */
@@ -21362,15 +21417,40 @@  rs6000_emit_epilogue (int sibcall)
       rtvec p;
       bool lr = (strategy & REST_NOINLINE_FPRS_DOESNT_RESTORE_LR) == 0;
       if (! restoring_FPRs_inline)
-	p = rtvec_alloc (4 + 64 - info->first_fp_reg_save);
+	{
+	  p = rtvec_alloc (4 + 64 - info->first_fp_reg_save);
+	  RTVEC_ELT (p, 0) = ret_rtx;
+	}
       else
-	p = rtvec_alloc (2);
+	{
+	  if (cfa_restores)
+	    {
+	      /* We can't hang the cfa_restores off a simple return,
+		 since the shrink-wrap code sometimes uses an existing
+		 return.  This means there might be a path from
+		 pre-prologue code to this return, and dwarf2cfi code
+		 wants the eh_frame unwinder state to be the same on
+		 all paths to any point.  So we need to emit the
+		 cfa_restores before the return.  For -m64 we really
+		 don't need epilogue cfa_restores at all, except for
+		 this irritating dwarf2cfi with shrink-wrap
+		 requirement;  The stack red-zone means eh_frame info
+		 from the prologue telling the unwinder to restore
+		 from the stack is perfectly good right to the end of
+		 the function.  */
+	      emit_insn (gen_blockage ());
+	      emit_cfa_restores (cfa_restores);
+	      cfa_restores = NULL_RTX;
+	    }
+	  p = rtvec_alloc (2);
+	  RTVEC_ELT (p, 0) = simple_return_rtx;
+	}
 
-      RTVEC_ELT (p, 0) = ret_rtx;
       RTVEC_ELT (p, 1) = ((restoring_FPRs_inline || !lr)
-			  ? gen_rtx_USE (VOIDmode, gen_rtx_REG (Pmode, 65))
+			  ? gen_rtx_USE (VOIDmode,
+					 gen_rtx_REG (Pmode, LR_REGNO))
 			  : gen_rtx_CLOBBER (VOIDmode,
-					     gen_rtx_REG (Pmode, 65)));
+					     gen_rtx_REG (Pmode, LR_REGNO)));
 
       /* If we have to restore more than two FP registers, branch to the
 	 restore function.  It will return to our caller.  */
@@ -21379,6 +21459,12 @@  rs6000_emit_epilogue (int sibcall)
 	  int i;
 	  rtx sym;
 
+	  if ((DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+	      && lr)
+	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE,
+					   gen_rtx_REG (Pmode, LR_REGNO),
+					   cfa_restores);
+
 	  sym = rs6000_savres_routine_sym (info,
 					   /*savep=*/false,
 					   /*gpr=*/false,
@@ -21390,20 +21476,32 @@  rs6000_emit_epilogue (int sibcall)
 						       ? 1 : 11));
 	  for (i = 0; i < 64 - info->first_fp_reg_save; i++)
 	    {
-	      rtx addr, mem;
+	      rtx addr, mem, reg;
+
 	      addr = gen_rtx_PLUS (Pmode, sp_reg_rtx,
-				   GEN_INT (info->fp_save_offset + 8*i));
+				   GEN_INT (info->fp_save_offset + 8 * i));
 	      mem = gen_frame_mem (DFmode, addr);
+	      reg = gen_rtx_REG (DFmode, info->first_fp_reg_save + i);
 
-	      RTVEC_ELT (p, i+4) =
-		gen_rtx_SET (VOIDmode,
-			     gen_rtx_REG (DFmode, info->first_fp_reg_save + i),
-			     mem);
+	      RTVEC_ELT (p, i + 4) = gen_rtx_SET (VOIDmode, reg, mem);
+	      if (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+		cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
+					       cfa_restores);
 	    }
 	}
 
       emit_jump_insn (gen_rtx_PARALLEL (VOIDmode, p));
     }
+
+  if (cfa_restores)
+    {
+      if (sibcall)
+	/* Ensure the cfa_restores are hung off an insn that won't
+	   be reordered above other restores.  */
+	emit_insn (gen_blockage ());
+
+      emit_cfa_restores (cfa_restores);
+    }
 }
 
 /* Write function epilogue.  */
@@ -21768,7 +21866,7 @@  rs6000_output_mi_thunk (FILE *file, tree
 			gen_rtx_USE (VOIDmode,
 				     gen_rtx_REG (SImode,
 						  LR_REGNO)),
-			ret_rtx)));
+			simple_return_rtx)));
   SIBLING_CALL_P (insn) = 1;
   emit_barrier ();
 
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 178876)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -894,10 +894,11 @@  extern unsigned rs6000_pointer_size;
 	cr1		(not saved, but used for FP operations)
 	cr0		(not saved, but used for arithmetic operations)
 	cr4, cr3, cr2	(saved)
-	r0		(not saved; cannot be base reg)
 	r9		(not saved; best for TImode)
-	r11, r10, r8-r4	(not saved; highest used first to make less conflict)
+	r10, r8-r4	(not saved; highest first for less conflict with params)
 	r3		(not saved; return value register)
+	r11		(not saved; later alloc to help shrink-wrap)
+	r0		(not saved; cannot be base reg)
 	r31 - r13	(saved; order given to save least number)
 	r12		(not saved; if used for DImode or DFmode would use r13)
 	mq		(not saved; best to use it if we can)
@@ -922,6 +923,14 @@  extern unsigned rs6000_pointer_size;
 #define MAYBE_R2_FIXED
 #endif
 
+#if FIXED_R13 == 1
+#define EARLY_R12 12,
+#define LATE_R12
+#else
+#define EARLY_R12
+#define LATE_R12 12,
+#endif
+
 #define REG_ALLOC_ORDER						\
   {32,								\
    45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34,		\
@@ -929,11 +938,11 @@  extern unsigned rs6000_pointer_size;
    63, 62, 61, 60, 59, 58, 57, 56, 55, 54, 53, 52, 51,		\
    50, 49, 48, 47, 46,						\
    75, 74, 69, 68, 72, 71, 70,					\
-   0, MAYBE_R2_AVAILABLE					\
-   9, 11, 10, 8, 7, 6, 5, 4,					\
-   3,								\
+   MAYBE_R2_AVAILABLE						\
+   9, 10, 8, 7, 6, 5, 4,					\
+   3, EARLY_R12 11, 0,						\
    31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19,		\
-   18, 17, 16, 15, 14, 13, 12,					\
+   18, 17, 16, 15, 14, 13, LATE_R12				\
    64, 66, 65,							\
    73, 1, MAYBE_R2_FIXED 67, 76,				\
    /* AltiVec registers.  */					\
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 178876)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -264,6 +265,12 @@  (define_mode_iterator RECIPF [SF DF V4SF
 ; Iterator for just SF/DF
 (define_mode_iterator SFDF [SF DF])
 
+; Conditional returns.
+(define_code_iterator any_return [return simple_return])
+(define_code_attr return_pred [(return "direct_return ()")
+			       (simple_return "")])
+(define_code_attr return_str [(return "") (simple_return "simple_")])
+
 ; Various instructions that come in SI and DI forms.
 ; A generic w/d attribute, for things like cmpw/cmpd.
 (define_mode_attr wd [(QI "b") (HI "h") (SI "w") (DI "d")])
@@ -12814,7 +12831,7 @@  (define_expand "sibcall"
 		    (match_operand 1 "" ""))
 	      (use (match_operand 2 "" ""))
 	      (use (reg:SI LR_REGNO))
-	      (return)])]
+	      (simple_return)])]
   ""
   "
 {
@@ -12838,7 +12855,7 @@  (define_insn "*sibcall_local32"
 	 (match_operand 1 "" "g,g"))
    (use (match_operand:SI 2 "immediate_operand" "O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "(INTVAL (operands[2]) & CALL_LONG) == 0"
   "*
 {
@@ -12858,7 +12875,7 @@  (define_insn "*sibcall_local64"
 	 (match_operand 1 "" "g,g"))
    (use (match_operand:SI 2 "immediate_operand" "O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "TARGET_64BIT && (INTVAL (operands[2]) & CALL_LONG) == 0"
   "*
 {
@@ -12879,7 +12896,7 @@  (define_insn "*sibcall_value_local32"
 	      (match_operand 2 "" "g,g")))
    (use (match_operand:SI 3 "immediate_operand" "O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "(INTVAL (operands[3]) & CALL_LONG) == 0"
   "*
 {
@@ -12901,7 +12918,7 @@  (define_insn "*sibcall_value_local64"
 	      (match_operand 2 "" "g,g")))
    (use (match_operand:SI 3 "immediate_operand" "O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "TARGET_64BIT && (INTVAL (operands[3]) & CALL_LONG) == 0"
   "*
 {
@@ -12921,7 +12938,7 @@  (define_insn "*sibcall_nonlocal_aix<mode
 	 (match_operand 1 "" "g,g"))
    (use (match_operand:SI 2 "immediate_operand" "O,O"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "DEFAULT_ABI == ABI_AIX
    && (INTVAL (operands[2]) & CALL_LONG) == 0"
   "@
@@ -12936,7 +12953,7 @@  (define_insn "*sibcall_value_nonlocal_ai
 	      (match_operand 2 "" "g,g")))
    (use (match_operand:SI 3 "immediate_operand" "O,O"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "DEFAULT_ABI == ABI_AIX
    && (INTVAL (operands[3]) & CALL_LONG) == 0"
   "@
@@ -12950,7 +12967,7 @@  (define_insn "*sibcall_nonlocal_sysv<mod
 	 (match_operand 1 "" ""))
    (use (match_operand 2 "immediate_operand" "O,n,O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "(DEFAULT_ABI == ABI_DARWIN
     || DEFAULT_ABI == ABI_V4)
    && (INTVAL (operands[2]) & CALL_LONG) == 0"
@@ -12981,7 +12998,7 @@  (define_expand "sibcall_value"
 		      (match_operand 2 "" "")))
 	      (use (match_operand 3 "" ""))
 	      (use (reg:SI LR_REGNO))
-	      (return)])]
+	      (simple_return)])]
   ""
   "
 {
@@ -13002,7 +13019,7 @@  (define_insn "*sibcall_value_nonlocal_sy
 	      (match_operand 2 "" "")))
    (use (match_operand:SI 3 "immediate_operand" "O,n,O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "(DEFAULT_ABI == ABI_DARWIN
     || DEFAULT_ABI == ABI_V4)
    && (INTVAL (operands[3]) & CALL_LONG) == 0"
@@ -15328,9 +15345,9 @@  (define_insn ""
 				      [(match_operand 1
 						      "cc_reg_operand" "y")
 				       (const_int 0)])
-		      (return)
+		      (any_return)
 		      (pc)))]
-  "direct_return ()"
+  "<return_pred>"
   "*
 {
   return output_cbranch (operands[0], NULL, 0, insn);
@@ -15360,8 +15377,8 @@  (define_insn ""
 						      "cc_reg_operand" "y")
 				       (const_int 0)])
 		      (pc)
-		      (return)))]
-  "direct_return ()"
+		      (any_return)))]
+  "<return_pred>"
   "*
 {
   return output_cbranch (operands[0], NULL, 1, insn);
@@ -15491,9 +15508,9 @@  (define_insn "jump"
   "b %l0"
   [(set_attr "type" "branch")])
 
-(define_insn "return"
-  [(return)]
-  "direct_return ()"
+(define_insn "<return_str>return"
+  [(any_return)]
+  "<return_pred>"
   "{br|blr}"
   [(set_attr "type" "jmpreg")])
 
@@ -16015,7 +16032,7 @@  (define_insn "*lmw"
    (set_attr "cell_micro" "always")])
 
 (define_insn "*return_internal_<mode>"
-  [(return)
+  [(simple_return)
    (use (match_operand:P 0 "register_operand" "lc"))]
   ""
   "b%T0"