From patchwork Wed Sep 28 15:47:47 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Modra
X-Patchwork-Id: 116830
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	by ozlabs.org (Postfix) with SMTP id 1520FB6F6F
	for ; Thu, 29 Sep 2011 01:48:35 +1000 (EST)
Received: (qmail 11263 invoked by alias); 28 Sep 2011 15:48:31 -0000
Received: (qmail 11250 invoked by uid 22791); 28 Sep 2011 15:48:25 -0000
X-SWARE-Spam-Status: No, hits=-2.6 required=5.0
	tests=AWL, BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU,
	FREEMAIL_FROM, RCVD_IN_DNSWL_LOW
X-Spam-Check-By: sourceware.org
Received: from mail-gy0-f175.google.com (HELO mail-gy0-f175.google.com)
	(209.85.160.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP;
	Wed, 28 Sep 2011 15:48:00 +0000
Received: by gyg8 with SMTP id 8so7202643gyg.20
	for ; Wed, 28 Sep 2011 08:47:59 -0700 (PDT)
Received: by 10.236.170.167 with SMTP id p27mr6965671yhl.0.1317224877957;
	Wed, 28 Sep 2011 08:47:57 -0700 (PDT)
Received: from bubble.grove.modra.org ([115.187.252.19])
	by mx.google.com with ESMTPS id p46sm38807243yhh.15.2011.09.28.08.47.52
	(version=TLSv1/SSLv3 cipher=OTHER);
	Wed, 28 Sep 2011 08:47:56 -0700 (PDT)
Received: by bubble.grove.modra.org (Postfix, from userid 1000)
	id BD099170C2BC; Thu, 29 Sep 2011 01:17:47 +0930 (CST)
Date: Thu, 29 Sep 2011 01:17:47 +0930
From: Alan Modra
To: gcc-patches@gcc.gnu.org, David Edelsohn
Cc: Bernd Schmidt
Subject: Re: PowerPC shrink-wrap support 3 of 3
Message-ID: <20110928154747.GO10321@bubble.grove.modra.org>
Mail-Followup-To: gcc-patches@gcc.gnu.org, David Edelsohn , Bernd Schmidt
References: <20110917071643.GT10321@bubble.grove.modra.org>
	<20110917071914.GW10321@bubble.grove.modra.org>
	<20110926135254.GJ10321@bubble.grove.modra.org>
	<20110926223241.GL10321@bubble.grove.modra.org>
	<4E80FF28.2070509@codesourcery.com>
	<20110927001120.GM10321@bubble.grove.modra.org>
	<4E81159C.8090503@codesourcery.com>
	<20110927004906.GN10321@bubble.grove.modra.org>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20110927004906.GN10321@bubble.grove.modra.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

This supersedes
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01004.html and
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01593.html, fixing the
two regressions introduced by those patches.  The first patch is
unchanged except to leave all the out-of-line restore functions using
"return" rather than "simple_return".  We don't want these being
confused with a plain "simple_return" and perhaps used by the
shrink-wrapping to return from pre-prologue code.

The second of these two patches was way too simplistic.  It was a
real pain getting the cfa_restores correct.  A lot were missing, or
emitted at the wrong place (due to a bug in rs6000_emit_stack_reset).
I also had the real restore insn move past the cfa_restores ("mtlr 0"
insn scheduled over loads from stack).

	* config/rs6000/rs6000.c (rs6000_make_savres_rtx): Delete unneeded
	declaration.
	(rs6000_emit_stack_reset): Only return insn emitted when it adjusts sp.
	(rs6000_make_savres_rtx): Rename to rs6000_emit_savres_rtx.  Use
	simple_return in pattern, emit instruction, and set jump_label.
	(rs6000_emit_prologue): Update for rs6000_emit_savres_rtx.  Use
	simple_return rather than return.
	(emit_cfa_restores): New function.
	(rs6000_emit_epilogue): Emit cfa_restores when flag_shrink_wrap.
	Add missing cfa_restores for SAVE_WORLD.  Add missing LR cfa_restore
	when using out-of-line gpr restore.  Add missing LR and FP regs
	cfa_restores for out-of-line fpr restore.
	Consolidate code setting up cfa_restores.  Formatting.  Use LR_REGNO
	define.
	(rs6000_output_mi_thunk): Use simple_return rather than return.
	* config/rs6000/rs6000.md (sibcall*, sibcall_value*): Likewise.
	(return_internal*): Likewise.
	(any_return, return_pred, return_str): New iterators.
	(return, conditional return insns): Provide both return and
	simple_return variants.
	* gcc/config/rs6000/rs6000.h (EARLY_R12, LATE_R12): Define.
	(REG_ALLOC_ORDER): Move r12 before call-saved regs when FIXED_R13.
	Move r11 and r0 later to suit shrink-wrapping.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 178876)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -899,8 +900,6 @@ static const char *rs6000_mangle_type (c
 static void rs6000_set_default_type_attributes (tree);
 static rtx rs6000_savres_routine_sym (rs6000_stack_t *, bool, bool, bool);
 static rtx rs6000_emit_stack_reset (rs6000_stack_t *, rtx, rtx, int, bool);
-static rtx rs6000_make_savres_rtx (rs6000_stack_t *, rtx, int,
-                                   enum machine_mode, bool, bool, bool);
 static bool rs6000_reg_live_or_pic_offset_p (int);
 static tree rs6000_builtin_vectorized_libmass (tree, tree, tree);
 static tree rs6000_builtin_vectorized_function (tree, tree, tree);
@@ -19704,8 +19728,10 @@ rs6000_emit_stack_reset (rs6000_stack_t
   if (sp_offset != 0)
     {
       rtx dest_reg = savres ? gen_rtx_REG (Pmode, 11) : sp_reg_rtx;
-      return emit_insn (gen_add3_insn (dest_reg, frame_reg_rtx,
-                                       GEN_INT (sp_offset)));
+      rtx insn = emit_insn (gen_add3_insn (dest_reg, frame_reg_rtx,
+                                           GEN_INT (sp_offset)));
+      if (!savres)
+        return insn;
     }
   else if (!savres)
     return emit_move_insn (sp_reg_rtx, frame_reg_rtx);
@@ -19729,10 +19755,11 @@ rs6000_emit_stack_reset (rs6000_stack_t
 }

 /* Construct a parallel rtx describing the effect of a call to an
-   out-of-line register save/restore routine.  */
+   out-of-line register save/restore routine, and emit the insn
+   or jump_insn as appropriate.  */

 static rtx
-rs6000_make_savres_rtx (rs6000_stack_t *info,
+rs6000_emit_savres_rtx (rs6000_stack_t *info,
                         rtx frame_reg_rtx, int save_area_offset,
                         enum machine_mode reg_mode,
                         bool savep, bool gpr, bool lr)
@@ -19742,6 +19769,7 @@ rs6000_make_savres_rtx (rs6000_stack_t *
   int reg_size = GET_MODE_SIZE (reg_mode);
   rtx sym;
   rtvec p;
+  rtx par, insn;

   offset = 0;
   start_reg = (gpr
@@ -19755,7 +19783,7 @@ rs6000_make_savres_rtx (rs6000_stack_t *
   RTVEC_ELT (p, offset++) = ret_rtx;
   RTVEC_ELT (p, offset++)
-    = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, 65));
+    = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, LR_REGNO));

   sym = rs6000_savres_routine_sym (info, savep, gpr, lr);
   RTVEC_ELT (p, offset++) = gen_rtx_USE (VOIDmode, sym);
@@ -19788,7 +19816,16 @@ rs6000_make_savres_rtx (rs6000_stack_t *
       RTVEC_ELT (p, i + offset) = gen_rtx_SET (VOIDmode, mem, reg);
     }

-  return gen_rtx_PARALLEL (VOIDmode, p);
+  par = gen_rtx_PARALLEL (VOIDmode, p);
+
+  if (!savep && lr)
+    {
+      insn = emit_jump_insn (par);
+      JUMP_LABEL (insn) = ret_rtx;
+    }
+  else
+    insn = emit_insn (par);
+  return insn;
 }

 /* Determine whether the gp REG is really used.  */
@@ -20087,16 +20124,13 @@ rs6000_emit_prologue (void)
     }
   else if (!WORLD_SAVE_P (info) && info->first_fp_reg_save != 64)
     {
-      rtx par;
-
-      par = rs6000_make_savres_rtx (info, frame_reg_rtx,
-                                    info->fp_save_offset + sp_offset,
-                                    DFmode,
-                                    /*savep=*/true, /*gpr=*/false,
-                                    /*lr=*/(strategy
-                                            & SAVE_NOINLINE_FPRS_SAVES_LR)
-                                           != 0);
-      insn = emit_insn (par);
+      insn = rs6000_emit_savres_rtx (info, frame_reg_rtx,
+                                     info->fp_save_offset + sp_offset,
+                                     DFmode,
+                                     /*savep=*/true, /*gpr=*/false,
+                                     /*lr=*/((strategy
+                                              & SAVE_NOINLINE_FPRS_SAVES_LR)
+                                             != 0));
       rs6000_frame_related (insn, frame_ptr_rtx, info->total_size,
                             NULL_RTX, NULL_RTX);
     }
@@ -20186,13 +20220,10 @@ rs6000_emit_prologue (void)
             }
           else
             {
-              rtx par;
-
-              par = rs6000_make_savres_rtx (info, gen_rtx_REG (Pmode, 11),
-                                            0, reg_mode,
-                                            /*savep=*/true, /*gpr=*/true,
-                                            /*lr=*/false);
-              insn = emit_insn (par);
+              insn = rs6000_emit_savres_rtx (info, gen_rtx_REG (Pmode, 11),
+                                             0, reg_mode,
+                                             /*savep=*/true, /*gpr=*/true,
+                                             /*lr=*/false);
               rs6000_frame_related (insn, frame_ptr_rtx, info->total_size,
                                     NULL_RTX, NULL_RTX);
             }
@@ -20204,8 +20235,6 @@ rs6000_emit_prologue (void)
     }
   else if (!WORLD_SAVE_P (info) && !saving_GPRs_inline)
     {
-      rtx par;
-
       /* Need to adjust r11 (r12) if we saved any FPRs.  */
       if (info->first_fp_reg_save != 64)
         {
@@ -20216,14 +20245,13 @@ rs6000_emit_prologue (void)
           emit_insn (gen_add3_insn (dest_reg, frame_reg_rtx, offset));
         }

-      par = rs6000_make_savres_rtx (info, frame_reg_rtx,
-                                    info->gp_save_offset + sp_offset,
-                                    reg_mode,
-                                    /*savep=*/true, /*gpr=*/true,
-                                    /*lr=*/(strategy
-                                            & SAVE_NOINLINE_GPRS_SAVES_LR)
-                                           != 0);
-      insn = emit_insn (par);
+      insn = rs6000_emit_savres_rtx (info, frame_reg_rtx,
+                                     info->gp_save_offset + sp_offset,
+                                     reg_mode,
+                                     /*savep=*/true, /*gpr=*/true,
+                                     /*lr=*/((strategy
+                                              & SAVE_NOINLINE_GPRS_SAVES_LR)
+                                             != 0));
       rs6000_frame_related (insn, frame_ptr_rtx, info->total_size,
                             NULL_RTX, NULL_RTX);
     }
@@ -20672,6 +20718,20 @@ offset_below_red_zone_p (HOST_WIDE_INT o
           : TARGET_32BIT ? -220 : -288);
 }

+/* Append CFA_RESTORES to any existing REG_NOTES on the last insn.  */
+
+static void
+emit_cfa_restores (rtx cfa_restores)
+{
+  rtx insn = get_last_insn ();
+  rtx *loc = &REG_NOTES (insn);
+
+  while (*loc)
+    loc = &XEXP (*loc, 1);
+  *loc = cfa_restores;
+  RTX_FRAME_RELATED_P (insn) = 1;
+}
+
 /* Emit function epilogue as insns.  */

 void
@@ -20769,6 +20829,14 @@ rs6000_emit_epilogue (int sibcall)
           rtx mem = gen_frame_mem (reg_mode, addr);

           RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, reg, mem);
+
+          if (flag_shrink_wrap)
+            {
+              cfa_restores = alloc_reg_note (REG_CFA_RESTORE,
+                                             gen_rtx_REG (Pmode, LR_REGNO),
+                                             cfa_restores);
+              cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
+            }
         }

       for (i = 0; i < 32 - info->first_gp_reg_save; i++)
@@ -20780,6 +20848,8 @@ rs6000_emit_epilogue (int sibcall)
           rtx mem = gen_frame_mem (reg_mode, addr);

           RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, reg, mem);
+          if (flag_shrink_wrap)
+            cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
         }
       for (i = 0; info->first_altivec_reg_save + i <= LAST_ALTIVEC_REGNO; i++)
         {
@@ -20790,6 +20860,8 @@ rs6000_emit_epilogue (int sibcall)
           rtx mem = gen_frame_mem (V4SImode, addr);

           RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, reg, mem);
+          if (flag_shrink_wrap)
+            cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
         }
       for (i = 0; info->first_fp_reg_save + i <= 63; i++)
         {
@@ -20803,6 +20875,8 @@ rs6000_emit_epilogue (int sibcall)
                                      ? DFmode : SFmode), addr);
           RTVEC_ELT (p, j++) = gen_rtx_SET (VOIDmode, reg, mem);
+          if (flag_shrink_wrap)
+            cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
         }
       RTVEC_ELT (p, j++)
         = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, 0));
@@ -20814,8 +20888,14 @@ rs6000_emit_epilogue (int sibcall)
         = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (SImode, 8));
       RTVEC_ELT (p, j++)
         = gen_rtx_USE (VOIDmode, gen_rtx_REG (SImode, 10));
-      emit_jump_insn (gen_rtx_PARALLEL (VOIDmode, p));
+      insn = emit_jump_insn (gen_rtx_PARALLEL (VOIDmode, p));

+      if (flag_shrink_wrap)
+        {
+          REG_NOTES (insn) = cfa_restores;
+          add_reg_note (insn, REG_CFA_DEF_CFA, sp_reg_rtx);
+          RTX_FRAME_RELATED_P (insn) = 1;
+        }
       return;
     }

@@ -20860,9 +20940,10 @@ rs6000_emit_epilogue (int sibcall)
             reg = gen_rtx_REG (V4SImode, i);
             emit_move_insn (reg, mem);
-            if (offset_below_red_zone_p (info->altivec_save_offset
-                                         + (i - info->first_altivec_reg_save)
-                                         * 16))
+            if (flag_shrink_wrap
+                || offset_below_red_zone_p (info->altivec_save_offset
+                                            + (i - info->first_altivec_reg_save)
+                                            * 16))
               cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
                                              cfa_restores);
           }
@@ -21001,7 +21082,7 @@ rs6000_emit_epilogue (int sibcall)
             reg = gen_rtx_REG (V4SImode, i);
             emit_move_insn (reg, mem);
-            if (DEFAULT_ABI == ABI_V4)
+            if (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
               cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
                                              cfa_restores);
           }
@@ -21051,8 +21132,7 @@ rs6000_emit_epilogue (int sibcall)
       emit_move_insn (cr_save_reg, mem);
     }

-  /* Set LR here to try to overlap restores below.  LR is always saved
-     above incoming stack, so it never needs REG_CFA_RESTORE.  */
+  /* Set LR here to try to overlap restores below.  */
   if (restore_lr && restoring_GPRs_inline)
     emit_move_insn (gen_rtx_REG (Pmode, LR_REGNO),
                     gen_rtx_REG (Pmode, 0));
@@ -21090,7 +21170,7 @@ rs6000_emit_epilogue (int sibcall)
   /* Restore GPRs.  This is done as a PARALLEL if we are using
      the load-multiple instructions.  */
   if (TARGET_SPE_ABI
-      && info->spe_64bit_regs_used != 0
+      && info->spe_64bit_regs_used
       && info->first_gp_reg_save != 32)
     {
       /* Determine whether we can address all of the registers that need
@@ -21114,7 +21194,7 @@ rs6000_emit_epilogue (int sibcall)
           int ool_adjust = (restoring_GPRs_inline
                             ? 0
                             : (info->first_gp_reg_save
-                               - (FIRST_SAVRES_REGISTER+1))*8);
+                               - (FIRST_SAVRES_REGISTER + 1)) * 8);

           if (frame_reg_rtx == sp_reg_rtx)
             frame_reg_rtx = gen_rtx_REG (Pmode, 11);
@@ -21145,48 +21225,28 @@ rs6000_emit_epilogue (int sibcall)
               mem = gen_rtx_MEM (V2SImode, addr);
               reg = gen_rtx_REG (reg_mode, info->first_gp_reg_save + i);

-              insn = emit_move_insn (reg, mem);
-              if (DEFAULT_ABI == ABI_V4)
-                {
-                  if (frame_pointer_needed
-                      && info->first_gp_reg_save + i
-                         == HARD_FRAME_POINTER_REGNUM)
-                    {
-                      add_reg_note (insn, REG_CFA_DEF_CFA,
-                                    plus_constant (frame_reg_rtx,
-                                                   sp_offset));
-                      RTX_FRAME_RELATED_P (insn) = 1;
-                    }
-
-                  cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
-                                                 cfa_restores);
-                }
+              emit_move_insn (reg, mem);
             }
         }
       else
-        {
-          rtx par;
-
-          par = rs6000_make_savres_rtx (info, gen_rtx_REG (Pmode, 11),
-                                        0, reg_mode,
-                                        /*savep=*/false, /*gpr=*/true,
-                                        /*lr=*/true);
-          emit_jump_insn (par);
-          /* We don't want anybody else emitting things after we jumped
-             back.  */
-          return;
-        }
+        rs6000_emit_savres_rtx (info, gen_rtx_REG (Pmode, 11),
+                                0, reg_mode,
+                                /*savep=*/false, /*gpr=*/true,
+                                /*lr=*/true);
     }
   else if (!restoring_GPRs_inline)
     {
       /* We are jumping to an out-of-line function.  */
       bool can_use_exit = info->first_fp_reg_save == 64;
-      rtx par;

       /* Emit stack reset code if we need it.  */
       if (can_use_exit)
-        rs6000_emit_stack_reset (info, sp_reg_rtx, frame_reg_rtx,
-                                 sp_offset, can_use_exit);
+        {
+          rs6000_emit_stack_reset (info, sp_reg_rtx, frame_reg_rtx,
+                                   sp_offset, can_use_exit);
+          if (info->cr_save_p)
+            rs6000_restore_saved_cr (cr_save_reg, using_mtcr_multiple);
+        }
       else
         {
           emit_insn (gen_add3_insn (gen_rtx_REG (Pmode, DEFAULT_ABI
                                                          == ABI_AIX
@@ -21197,45 +21257,10 @@ rs6000_emit_epilogue (int sibcall)
           sp_offset += info->fp_size;
         }

-      par = rs6000_make_savres_rtx (info, frame_reg_rtx,
-                                    info->gp_save_offset, reg_mode,
-                                    /*savep=*/false, /*gpr=*/true,
-                                    /*lr=*/can_use_exit);
-
-      if (can_use_exit)
-        {
-          if (info->cr_save_p)
-            {
-              rs6000_restore_saved_cr (cr_save_reg, using_mtcr_multiple);
-              if (DEFAULT_ABI == ABI_V4)
-                cfa_restores
-                  = alloc_reg_note (REG_CFA_RESTORE,
-                                    gen_rtx_REG (SImode, CR2_REGNO),
-                                    cfa_restores);
-            }
-
-          emit_jump_insn (par);
-
-          /* We don't want anybody else emitting things after we jumped
-             back.  */
-          return;
-        }
-
-      insn = emit_insn (par);
-      if (DEFAULT_ABI == ABI_V4)
-        {
-          if (frame_pointer_needed)
-            {
-              add_reg_note (insn, REG_CFA_DEF_CFA,
-                            plus_constant (frame_reg_rtx, sp_offset));
-              RTX_FRAME_RELATED_P (insn) = 1;
-            }
-
-          for (i = info->first_gp_reg_save; i < 32; i++)
-            cfa_restores
-              = alloc_reg_note (REG_CFA_RESTORE,
-                                gen_rtx_REG (reg_mode, i), cfa_restores);
-        }
+      rs6000_emit_savres_rtx (info, frame_reg_rtx,
+                              info->gp_save_offset, reg_mode,
+                              /*savep=*/false, /*gpr=*/true,
+                              /*lr=*/can_use_exit);
     }
   else if (using_load_multiple)
     {
@@ -21251,17 +21276,8 @@ rs6000_emit_epilogue (int sibcall)
           rtx reg = gen_rtx_REG (reg_mode, info->first_gp_reg_save + i);

           RTVEC_ELT (p, i) = gen_rtx_SET (VOIDmode, reg, mem);
-          if (DEFAULT_ABI == ABI_V4)
-            cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
-                                           cfa_restores);
-        }
-      insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
-      if (DEFAULT_ABI == ABI_V4 && frame_pointer_needed)
-        {
-          add_reg_note (insn, REG_CFA_DEF_CFA,
-                        plus_constant (frame_reg_rtx, sp_offset));
-          RTX_FRAME_RELATED_P (insn) = 1;
         }
+      emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
     }
   else
     {
@@ -21275,24 +21291,70 @@ rs6000_emit_epilogue (int sibcall)
             rtx mem = gen_frame_mem (reg_mode, addr);
             rtx reg = gen_rtx_REG (reg_mode, info->first_gp_reg_save + i);

-            insn = emit_move_insn (reg, mem);
-            if (DEFAULT_ABI == ABI_V4)
-              {
-                if (frame_pointer_needed
-                    && info->first_gp_reg_save + i
-                       == HARD_FRAME_POINTER_REGNUM)
-                  {
-                    add_reg_note (insn, REG_CFA_DEF_CFA,
-                                  plus_constant (frame_reg_rtx, sp_offset));
-                    RTX_FRAME_RELATED_P (insn) = 1;
-                  }
-
-                cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
-                                               cfa_restores);
-              }
+            emit_move_insn (reg, mem);
           }
     }

+  if (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+    {
+      /* If the frame pointer was used then we can't delay emitting
+         a REG_CFA_DEF_CFA note.  This must happen on the insn that
+         restores the frame pointer, r31.  We may have already emitted
+         a REG_CFA_DEF_CFA note, but that's OK;  A duplicate is
+         discarded by dwarf2cfi.c/dwarf2out.c, and in any case would
+         be harmless if emitted.  */
+      if (frame_pointer_needed)
+        {
+          insn = get_last_insn ();
+          add_reg_note (insn, REG_CFA_DEF_CFA,
+                        plus_constant (frame_reg_rtx, sp_offset));
+          RTX_FRAME_RELATED_P (insn) = 1;
+        }
+
+      /* Set up cfa_restores.  We always need these when
+         shrink-wrapping.  If not shrink-wrapping then we only need
+         the cfa_restore when the stack location is no longer valid.
+         The cfa_restores must be emitted on or before the insn that
+         invalidates the stack, and of course must not be emitted
+         before the insn that actually does the restore.  The latter
+         is why the LR cfa_restore condition below is a little
+         complicated.  It's also why it is a bad idea to emit the
+         cfa_restores as a group on the last instruction here that
+         actually does a restore:  That insn may be reordered with
+         respect to others doing restores.  */
+      if (info->cr_save_p)
+        cfa_restores = alloc_reg_note (REG_CFA_RESTORE,
+                                       gen_rtx_REG (SImode, CR2_REGNO),
+                                       cfa_restores);
+      if (flag_shrink_wrap
+          && (restore_lr
+              || (info->lr_save_p
+                  && !restoring_GPRs_inline
+                  && info->first_fp_reg_save == 64)))
+        cfa_restores = alloc_reg_note (REG_CFA_RESTORE,
+                                       gen_rtx_REG (Pmode, LR_REGNO),
+                                       cfa_restores);
+
+      for (i = info->first_gp_reg_save; i < 32; i++)
+        if (!restoring_GPRs_inline
+            || using_load_multiple
+            || rs6000_reg_live_or_pic_offset_p (i))
+          {
+            rtx reg = gen_rtx_REG (reg_mode, i);
+
+            cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
+          }
+    }
+
+  if (!restoring_GPRs_inline
+      && info->first_fp_reg_save == 64)
+    {
+      /* We are jumping to an out-of-line function.  */
+      if (cfa_restores)
+        emit_cfa_restores (cfa_restores);
+      return;
+    }
+
   if (restore_lr && !restoring_GPRs_inline)
     {
       rtx mem = gen_frame_mem_offset (Pmode, frame_reg_rtx,
@@ -21306,8 +21368,8 @@ rs6000_emit_epilogue (int sibcall)
   /* Restore fpr's if we need to do it without calling a function.  */
   if (restoring_FPRs_inline)
     for (i = 0; i < 64 - info->first_fp_reg_save; i++)
-      if ((df_regs_ever_live_p (info->first_fp_reg_save+i)
-           && ! call_used_regs[info->first_fp_reg_save+i]))
+      if ((df_regs_ever_live_p (info->first_fp_reg_save + i)
+           && !call_used_regs[info->first_fp_reg_save + i]))
         {
           rtx addr, mem, reg;
           addr = gen_rtx_PLUS (Pmode, frame_reg_rtx,
@@ -21321,20 +21383,13 @@ rs6000_emit_epilogue (int sibcall)
                              info->first_fp_reg_save + i);

           emit_move_insn (reg, mem);
-          if (DEFAULT_ABI == ABI_V4)
-            cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
-                                           cfa_restores);
+          if (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+            cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
         }

   /* If we saved cr, restore it here.  Just those that were used.  */
   if (info->cr_save_p)
-    {
-      rs6000_restore_saved_cr (cr_save_reg, using_mtcr_multiple);
-      if (DEFAULT_ABI == ABI_V4)
-        cfa_restores
-          = alloc_reg_note (REG_CFA_RESTORE, gen_rtx_REG (SImode, CR2_REGNO),
-                            cfa_restores);
-    }
+    rs6000_restore_saved_cr (cr_save_reg, using_mtcr_multiple);

   /* If this is V.4, unwind the stack pointer after all of the loads
      have been done.  */
@@ -21362,15 +21417,40 @@ rs6000_emit_epilogue (int sibcall)
       rtvec p;
       bool lr = (strategy & REST_NOINLINE_FPRS_DOESNT_RESTORE_LR) == 0;
       if (! restoring_FPRs_inline)
-        p = rtvec_alloc (4 + 64 - info->first_fp_reg_save);
+        {
+          p = rtvec_alloc (4 + 64 - info->first_fp_reg_save);
+          RTVEC_ELT (p, 0) = ret_rtx;
+        }
       else
-        p = rtvec_alloc (2);
+        {
+          if (cfa_restores)
+            {
+              /* We can't hang the cfa_restores off a simple return,
+                 since the shrink-wrap code sometimes uses an existing
+                 return.  This means there might be a path from
+                 pre-prologue code to this return, and dwarf2cfi code
+                 wants the eh_frame unwinder state to be the same on
+                 all paths to any point.  So we need to emit the
+                 cfa_restores before the return.  For -m64 we really
+                 don't need epilogue cfa_restores at all, except for
+                 this irritating dwarf2cfi with shrink-wrap
+                 requirement;  The stack red-zone means eh_frame info
+                 from the prologue telling the unwinder to restore
+                 from the stack is perfectly good right to the end of
+                 the function.  */
+              emit_insn (gen_blockage ());
+              emit_cfa_restores (cfa_restores);
+              cfa_restores = NULL_RTX;
+            }
+          p = rtvec_alloc (2);
+          RTVEC_ELT (p, 0) = simple_return_rtx;
+        }

-      RTVEC_ELT (p, 0) = ret_rtx;
       RTVEC_ELT (p, 1) = ((restoring_FPRs_inline || !lr)
-                          ? gen_rtx_USE (VOIDmode, gen_rtx_REG (Pmode, 65))
+                          ? gen_rtx_USE (VOIDmode,
+                                         gen_rtx_REG (Pmode, LR_REGNO))
                           : gen_rtx_CLOBBER (VOIDmode,
-                                             gen_rtx_REG (Pmode, 65)));
+                                             gen_rtx_REG (Pmode, LR_REGNO)));

       /* If we have to restore more than two FP registers, branch to the
          restore function.  It will return to our caller.  */
@@ -21379,6 +21459,12 @@ rs6000_emit_epilogue (int sibcall)
           int i;
           rtx sym;

+          if ((DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+              && lr)
+            cfa_restores = alloc_reg_note (REG_CFA_RESTORE,
+                                           gen_rtx_REG (Pmode, LR_REGNO),
+                                           cfa_restores);
+
           sym = rs6000_savres_routine_sym (info,
                                            /*savep=*/false,
                                            /*gpr=*/false,
@@ -21390,20 +21476,32 @@ rs6000_emit_epilogue (int sibcall)
                                                ? 1 : 11));
           for (i = 0; i < 64 - info->first_fp_reg_save; i++)
             {
-              rtx addr, mem;
+              rtx addr, mem, reg;
+
               addr = gen_rtx_PLUS (Pmode, sp_reg_rtx,
-                                   GEN_INT (info->fp_save_offset + 8*i));
+                                   GEN_INT (info->fp_save_offset + 8 * i));
               mem = gen_frame_mem (DFmode, addr);
+              reg = gen_rtx_REG (DFmode, info->first_fp_reg_save + i);

-              RTVEC_ELT (p, i+4) =
-                gen_rtx_SET (VOIDmode,
-                             gen_rtx_REG (DFmode, info->first_fp_reg_save + i),
-                             mem);
+              RTVEC_ELT (p, i + 4) = gen_rtx_SET (VOIDmode, reg, mem);
+              if (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+                cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg,
+                                               cfa_restores);
             }
         }

       emit_jump_insn (gen_rtx_PARALLEL (VOIDmode, p));
     }
+
+  if (cfa_restores)
+    {
+      if (sibcall)
+        /* Ensure the cfa_restores are hung off an insn that won't
+           be reordered above other restores.  */
+        emit_insn (gen_blockage ());
+
+      emit_cfa_restores (cfa_restores);
+    }
 }

 /* Write function epilogue.  */
@@ -21768,7 +21866,7 @@ rs6000_output_mi_thunk (FILE *file, tree
                         gen_rtx_USE (VOIDmode,
                                      gen_rtx_REG (SImode,
                                                   LR_REGNO)),
-                        ret_rtx)));
+                        simple_return_rtx)));
   SIBLING_CALL_P (insn) = 1;
   emit_barrier ();

Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h (revision 178876)
+++ gcc/config/rs6000/rs6000.h (working copy)
@@ -894,10 +894,11 @@ extern unsigned rs6000_pointer_size;
         cr1             (not saved, but used for FP operations)
         cr0             (not saved, but used for arithmetic operations)
         cr4, cr3, cr2   (saved)
-        r0              (not saved; cannot be base reg)
         r9              (not saved; best for TImode)
-        r11, r10, r8-r4 (not saved; highest used first to make less conflict)
+        r10, r8-r4      (not saved; highest first for less conflict with params)
         r3              (not saved; return value register)
+        r11             (not saved; later alloc to help shrink-wrap)
+        r0              (not saved; cannot be base reg)
         r31 - r13       (saved; order given to save least number)
         r12             (not saved; if used for DImode or DFmode would use r13)
         mq              (not saved; best to use it if we can)
@@ -922,6 +923,14 @@ extern unsigned rs6000_pointer_size;
 #define MAYBE_R2_FIXED
 #endif

+#if FIXED_R13 == 1
+#define EARLY_R12 12,
+#define LATE_R12
+#else
+#define EARLY_R12
+#define LATE_R12 12,
+#endif
+
 #define REG_ALLOC_ORDER \
   {32, \
    45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, \
@@ -929,11 +938,11 @@ extern unsigned rs6000_pointer_size;
    63, 62, 61, 60, 59, 58, 57, 56, 55, 54, 53, 52, 51, \
    50, 49, 48, 47, 46, \
    75, 74, 69, 68, 72, 71, 70, \
-   0, MAYBE_R2_AVAILABLE \
-   9, 11, 10, 8, 7, 6, 5, 4, \
-   3, \
+   MAYBE_R2_AVAILABLE \
+   9, 10, 8, 7, 6, 5, 4, \
+   3, EARLY_R12 11, 0, \
    31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, \
-   18, 17, 16, 15, 14, 13, 12, \
+   18, 17, 16, 15, 14, 13, LATE_R12 \
    64, 66, 65, \
    73, 1, MAYBE_R2_FIXED 67, 76, \
    /* AltiVec registers.
*/ \
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 178876)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -264,6 +265,12 @@ (define_mode_iterator RECIPF [SF DF V4SF
 ; Iterator for just SF/DF
 (define_mode_iterator SFDF [SF DF])

+; Conditional returns.
+(define_code_iterator any_return [return simple_return])
+(define_code_attr return_pred [(return "direct_return ()")
+                               (simple_return "")])
+(define_code_attr return_str [(return "") (simple_return "simple_")])
+
 ; Various instructions that come in SI and DI forms.
 ; A generic w/d attribute, for things like cmpw/cmpd.
 (define_mode_attr wd [(QI "b") (HI "h") (SI "w") (DI "d")])
@@ -12814,7 +12831,7 @@ (define_expand "sibcall"
                     (match_operand 1 "" ""))
               (use (match_operand 2 "" ""))
               (use (reg:SI LR_REGNO))
-              (return)])]
+              (simple_return)])]
   ""
   "
 {
@@ -12838,7 +12855,7 @@ (define_insn "*sibcall_local32"
          (match_operand 1 "" "g,g"))
    (use (match_operand:SI 2 "immediate_operand" "O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "(INTVAL (operands[2]) & CALL_LONG) == 0"
   "*
 {
@@ -12858,7 +12875,7 @@ (define_insn "*sibcall_local64"
          (match_operand 1 "" "g,g"))
    (use (match_operand:SI 2 "immediate_operand" "O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "TARGET_64BIT && (INTVAL (operands[2]) & CALL_LONG) == 0"
   "*
 {
@@ -12879,7 +12896,7 @@ (define_insn "*sibcall_value_local32"
               (match_operand 2 "" "g,g")))
    (use (match_operand:SI 3 "immediate_operand" "O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "(INTVAL (operands[3]) & CALL_LONG) == 0"
   "*
 {
@@ -12901,7 +12918,7 @@ (define_insn "*sibcall_value_local64"
               (match_operand 2 "" "g,g")))
    (use (match_operand:SI 3 "immediate_operand" "O,n"))
    (use (reg:SI LR_REGNO))
-   (return)]
+   (simple_return)]
   "TARGET_64BIT && (INTVAL (operands[3]) & CALL_LONG) == 0"
   "*
 {
@@ -12921,7 +12938,7 @@ (define_insn "*sibcall_nonlocal_aix"
   "*
 {
   return output_cbranch
(operands[0], NULL, 0, insn);
@@ -15360,8 +15377,8 @@ (define_insn ""
                                        "cc_reg_operand" "y")
                                       (const_int 0)])
                       (pc)
-                      (return)))]
-  "direct_return ()"
+                      (any_return)))]
+  "<return_pred>"
   "*
 {
   return output_cbranch (operands[0], NULL, 1, insn);
@@ -15491,9 +15508,9 @@ (define_insn "jump"
   "b %l0"
   [(set_attr "type" "branch")])

-(define_insn "return"
-  [(return)]
-  "direct_return ()"
+(define_insn "<return_str>return"
+  [(any_return)]
+  "<return_pred>"
   "{br|blr}"
   [(set_attr "type" "jmpreg")])

@@ -16015,7 +16032,7 @@ (define_insn "*lmw"
    (set_attr "cell_micro" "always")])

 (define_insn "*return_internal_"
-  [(return)
+  [(simple_return)
    (use (match_operand:P 0 "register_operand" "lc"))]
   ""
   "b%T0"
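
For anyone not used to the idiom, emit_cfa_restores in the patch appends one note chain to the end of another by walking a pointer to the link field rather than a pointer to the node.  The following is only a standalone sketch of that technique, not GCC code: struct note, append_notes and note_count are hypothetical names, and a plain singly linked list stands in for the REG_NOTES chain (with the next field playing the role of XEXP (note, 1)).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chain of reg notes.  */
struct note
{
  int kind;            /* which register the note describes, say */
  struct note *next;   /* plays the role of XEXP (note, 1) */
};

/* Append the chain EXTRA to the end of the chain at *HEAD.
   LOC always addresses the link that would have to change: first
   *HEAD itself, then the NEXT field of each node in turn.  When the
   loop stops, *LOC is the null link at the tail, so the empty-list
   case needs no special handling.  */
static void
append_notes (struct note **head, struct note *extra)
{
  struct note **loc = head;

  assert (head != NULL);
  while (*loc)
    loc = &(*loc)->next;
  *loc = extra;
}

/* Count the nodes in a chain.  */
static size_t
note_count (const struct note *n)
{
  size_t len = 0;

  for (; n; n = n->next)
    len++;
  return len;
}
```

Appending to an empty chain and to a non-empty one go through the same code path, which is presumably why the patch can call emit_cfa_restores without first checking whether the last insn already carries notes.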