From patchwork Thu Dec 20 09:53:16 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joey Ye
X-Patchwork-Id: 207642
From: "Joey Ye"
To: , "Ramana Radhakrishnan"
Cc: "Joey Ye"
Subject: [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump
Date: Thu, 20 Dec 2012 17:53:16 +0800
Message-ID: <000f01cdde97$d7a90a20$86fb1e60$@ye@arm.com>
Delivered-To: mailing list gcc-patches@gcc.gnu.org

Current GCC has an annoying problem on thumb1: it always assumes that a
branch may be a far jump, so it forces lr to be saved even when that is
unnecessary.
The most extreme case, reported by a partner, is:

// compiled with "-mthumb -mcpu=cortex-m0 -Os".
void foo() { for (;;); }
=>
foo:
        push    {lr}  // Crazy!!!
.L2:
        b       .L2

The reason is that a thumb1 far jump is only resolved in the very late
"shorten_branch" pass. The prologue/epilogue pass cannot tell from the
far_jump attribute whether a branch is really far, so it has to
conservatively save/restore lr whenever there is a branch.

This patch tries to fix that with a simple heuristic: use the function size
to decide whether a far jump is likely to be used. Function size information
is meaningful in the prologue/epilogue pass. The heuristic uses the following
check to decide whether lr should be saved for a far jump:

function_size * 3 >= 2048 // yes: save lr for possible far jump. No: don't save lr for far jump

The scheme has one issue: if some corner case does break the above condition,
there is no chance to fix things up and the only option is to ICE. However,
the heuristic condition is very conservative. It is based on the worst normal
case, in which each 2-byte instruction is associated with a 4-byte literal
((2+4)/2 = 3, inflating the size by 3 times). I can't think of a real case
that would trigger the ICE, so I think it should work.

Approaches other than the heuristic scheme are too expensive to implement for
this small size/performance issue. I did explore some, but none of them
convinced me.

Tests passed:
* build libgcc, libstdc++, newlib, libm
* make check-gcc with cpu=cortex-m0
* Small and extreme test cases

ChangeLog:

2012-12-20  Joey Ye

        * config/arm/arm.c (thumb1_final_prescan_insn):
        Assert lr save for real far jump.
        (thumb_far_jump_used_p): Count instruction size and set
        far_jump_used.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 327ef22..ad79451 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn)
       else if (conds != CONDS_NOCOND)
         cfun->machine->thumb1_cc_insn = NULL_RTX;
     }
+
+    /* Check whether an unexpected far jump is used.  */
+    if (cfun->machine->lr_save_eliminated
+        && get_attr_far_jump (insn) == FAR_JUMP_YES)
+      internal_error("Unexpected thumb1 far jump");
 }
 
 int
@@ -21815,6 +21887,8 @@ static int
 thumb_far_jump_used_p (void)
 {
   rtx insn;
+  bool far_jump = false;
+  unsigned int func_size = 0;
 
   /* This test is only important for leaf functions.  */
   /* assert (!leaf_function_p ()); */
@@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void)
           && get_attr_far_jump (insn) == FAR_JUMP_YES
           )
         {
+          far_jump = true;
+        }
+      func_size += get_attr_length (insn);
+    }
+
+  /* Attribute far_jump will always be true for thumb1 before the
+     shorten_branch pass.  So checking the far_jump attribute before
+     shorten_branch isn't very useful.
+
+     The following heuristic tries to estimate more accurately whether a far
+     jump may finally be used.  The heuristic is very conservative as there is
+     no chance to roll back the decision not to use a far jump.
+
+     Thumb1 long branch offset is -2048 to 2046.  The worst case is each 2-byte
+     insn is associated with a 4-byte constant pool.  Using function size
+     2048/3 as the threshold is conservative enough.  */
+  if (far_jump)
+    {
+      if ((func_size * 3) >= 2048)
+        {
           /* Record the fact that we have decided that
              the function does use far jumps.  */
           cfun->machine->far_jump_used = 1;
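
For illustration only, the size check can be exercised outside the compiler. This is a minimal standalone sketch, assuming the function size is already known in bytes. The helper name may_need_far_jump and the two macros are hypothetical and are not part of GCC or of this patch. Only the "func_size * 3 >= 2048" comparison is taken from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Reach of a thumb1 unconditional branch in bytes, per the patch comment
   (offsets -2048 to 2046).  Hypothetical name.  */
#define THUMB1_BRANCH_REACH 2048u

/* Worst-case growth assumed by the patch: every 2-byte insn drags along a
   4-byte literal, so (2 + 4) / 2 = 3.  Hypothetical name.  */
#define WORST_CASE_GROWTH 3u

/* Hypothetical stand-in for the decision made in thumb_far_jump_used_p.
   FUNC_SIZE is the summed insn length in bytes; FAR_JUMP_ATTR_SEEN says
   whether any jump carried the far_jump attribute before shorten_branch.  */
static bool
may_need_far_jump (unsigned int func_size, bool far_jump_attr_seen)
{
  /* Guard the multiplication below against unsigned wrap-around.  */
  assert (func_size <= UINT_MAX / WORST_CASE_GROWTH);

  /* Without any far_jump attribute there is nothing to be conservative
     about, so lr need not be saved on this account.  */
  if (!far_jump_attr_seen)
    return false;

  /* Same comparison as in the patch: save lr only if the worst-case
     inflated size could exceed the branch reach.  */
  return func_size * WORST_CASE_GROWTH >= THUMB1_BRANCH_REACH;
}
```

With these definitions a 682-byte function stays below the threshold (682 * 3 = 2046), while a 683-byte function crosses it (683 * 3 = 2049), so lr would be saved only in the second case.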