From patchwork Thu Jan 7 09:22:52 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrill Tkachov
X-Patchwork-Id: 564210
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Message-ID: <568E2E6C.4050806@foss.arm.com>
Date: Thu, 07 Jan 2016 09:22:52 +0000
From: Kyrill Tkachov
To: Bernd Schmidt, GCC Patches
Subject: Re: [PATCH][RTL-ifcvt] PR rtl-optimization/68841: Make sure one basic block doesn't clobber CC reg usage of the other
References: <56740520.30908@foss.arm.com> <56740734.1080107@redhat.com> <5677C50F.2010302@foss.arm.com> <568BC82D.20004@redhat.com> <568BD193.9020802@foss.arm.com> <568BF08E.1010900@redhat.com> <568BF7FA.2060108@foss.arm.com> <568C093C.90402@foss.arm.com> <568CEE1B.6080800@foss.arm.com>
In-Reply-To: <568CEE1B.6080800@foss.arm.com>

Hi Bernd,

On 06/01/16 10:36, Kyrill Tkachov wrote:
> Hi Bernd,
>
> On 05/01/16 18:19, Kyrill Tkachov wrote:
>>
>> On 05/01/16 17:06, Kyrill Tkachov wrote:
>>>
>>> On 05/01/16 16:34, Bernd Schmidt wrote:
>>>> On 01/05/2016 03:22 PM, Kyrill Tkachov wrote:
>>>>>
>>>>> This works around the issue but we don't want to perform the check
>>>>> for pairs of simple basic blocks because then we'll end up rejecting
>>>>> code that does things like:
>>>>> x = cond ? x + 1 : x - 1
>>>>> i.e. the source of the set in both blocks reads and writes the same register.
>>>>> We can deal with this safely later on in the function since we rename
>>>>> the destinations of the two sets, so we don't want to reject this case here.
>>>>
>>>> So we need to teach bbs_ok_for_cmove_arith that this is going to happen. How about the approach below? Still seems to fix the issue, and it looks like the CC set is present in the df info so everything should work as intended. Right?
>>>>
>>>
>>> Yeah, this looks like it works.
>>> However, now we reject if-conversion whereas with my patch we still tried switching the order in which
>>> the blocks were emitted, which allowed for a bit more aggressive if-conversion.
>>> I don't know if this approach is overly restrictive yet.
>>> I'll try its effects on codegen quality on SPEC as soon as I get some cycles.
>>> But this approach does look appealing to me.
>>>
>>
>> Hmm, from a first look at SPEC, it seems to still overly restrict ifconversion in the
>> x = cond ? x + 1 : x - 1 case.
>> I'll look deeper tomorrow as to what's going on there.
>>
>
> Ok, found the problem.
> bbs_ok_for_cmove_arith also checks that we don't perform any stores in the basic block, which kills opportunities
> to convert operations of the form [addr] = c ? a : b. Since with your patch we now call this even for
> simple basic blocks we miss these opportunities.
>
> bb_valid_for_noce_process_p called earlier should have already ensured that for complex
> blocks we allow only the last set to be a store, so we should be able to relax the restriction in
> bbs_ok_for_cmove_arith. That change in combination with your patch fixes the testcase and has no code quality
> fallout that I can see (it even slightly increases if-conversion opportunities). I'll test more thoroughly
> and post a patch in due time.
>

And here is the updated patch.
It is a modified version of the patch with the restriction on stores in the basic block lifted.
Also I chose instead to pass the register 'x' on which we should not be recording conflicts
rather than a boolean will_remain.
I found if we don't do this we will pessimise cases where one basic block is something like:
t1 = x + y;
x = t1 + 2;

and the other is something like:
x = x + 1.

We want to ignore the conflict on x in both invocations of bbs_ok_for_cmove_arith.

With this patch the testcase passes and there is no codegen fallout on SPEC.

Bootstrapped and tested on arm, aarch64, x86_64.

How does this look?

Thanks,
Kyrill

2016-01-06  Bernd Schmidt
            Kyrylo Tkachov

    PR rtl-optimization/68841
    * ifcvt.c (bbs_ok_for_cmove_arith): Add to_rename parameter.
    Don't record conflicts on to_rename if it's present.
    Allow memory destinations in sets.
    (noce_try_cmove_arith): Call bbs_ok_for_cmove_arith even on simple
    blocks.

2016-01-06  Kyrylo Tkachov

    PR rtl-optimization/68841
    * gcc.dg/pr68841.c: New test.
    * gcc.c-torture/execute/pr68841.c: New test.

> Thanks,
> Kyrill
>
>> Kyrill
>>
>>> Thanks for the help,
>>> Kyrill
>>>
>>>>
>>>> Bernd
>>>
>>
>

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 5812ce30151ed7425780890c66e7763f5758df7e..96957aac5481acbc7ac5f186a653fe63ad3e5f51 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1896,11 +1896,13 @@ insn_valid_noce_process_p (rtx_insn *insn, rtx cc)
 }
 
 
-/* Return true iff the registers that the insns in BB_A set do not
-   get used in BB_B.  */
+/* Return true iff the registers that the insns in BB_A set do not get
+   used in BB_B.  If TO_RENAME is non NULL then it is a REG that will be
+   renamed later by the caller and so conflicts on it should be ignored
+   in this function.  */
 
 static bool
-bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
+bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b, rtx to_rename)
 {
   rtx_insn *a_insn;
   bitmap bba_sets = BITMAP_ALLOC (&reg_obstack);
@@ -1920,10 +1922,10 @@ bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
           BITMAP_FREE (bba_sets);
           return false;
         }
-
       /* Record all registers that BB_A sets.  */
       FOR_EACH_INSN_DEF (def, a_insn)
-        bitmap_set_bit (bba_sets, DF_REF_REGNO (def));
+        if (!(to_rename && DF_REF_REG (def) == to_rename))
+          bitmap_set_bit (bba_sets, DF_REF_REGNO (def));
     }
 
   rtx_insn *b_insn;
@@ -1942,8 +1944,12 @@ bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
         }
 
       /* Make sure this is a REG and not some instance
-         of ZERO_EXTRACT or SUBREG or other dangerous stuff.  */
-      if (!REG_P (SET_DEST (sset_b)))
+         of ZERO_EXTRACT or SUBREG or other dangerous stuff.
+         If we have a memory destination then we have a pair of simple
+         basic blocks performing an operation of the form [addr] = c ? a : b.
+         bb_valid_for_noce_process_p will have ensured that these are
+         the only stores present.  */
+      if (!REG_P (SET_DEST (sset_b)) && !MEM_P (SET_DEST (sset_b)))
         {
           BITMAP_FREE (bba_sets);
           return false;
@@ -2112,9 +2118,9 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
         }
     }
 
-  if (then_bb && else_bb && !a_simple && !b_simple
-      && (!bbs_ok_for_cmove_arith (then_bb, else_bb)
-          || !bbs_ok_for_cmove_arith (else_bb, then_bb)))
+  if (then_bb && else_bb
+      && (!bbs_ok_for_cmove_arith (then_bb, else_bb, x)
+          || !bbs_ok_for_cmove_arith (else_bb, then_bb, x)))
     return FALSE;
 
   start_sequence ();
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68841.c b/gcc/testsuite/gcc.c-torture/execute/pr68841.c
new file mode 100644
index 0000000000000000000000000000000000000000..15a27e7dc382d97398ca05427f431f5ecd3b89da
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68841.c
@@ -0,0 +1,31 @@
+static inline int
+foo (int *x, int y)
+{
+  int z = *x;
+  while (y > z)
+    z *= 2;
+  return z;
+}
+
+int
+main ()
+{
+  int i;
+  for (i = 1; i < 17; i++)
+    {
+      int j;
+      int k;
+      j = foo (&i, 7);
+      if (i >= 7)
+        k = i;
+      else if (i >= 4)
+        k = 8 + (i - 4) * 2;
+      else if (i == 3)
+        k = 12;
+      else
+        k = 8;
+      if (j != k)
+        __builtin_abort ();
+    }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr68841.c b/gcc/testsuite/gcc.dg/pr68841.c
new file mode 100644
index 0000000000000000000000000000000000000000..470048cc24f0d7150ed1e3141181bc1e8472ae12
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68841.c
@@ -0,0 +1,34 @@
+/* { dg-do run } */
+/* { dg-options "-Og -fif-conversion -flive-range-shrinkage -fpeel-loops -frerun-cse-after-loop" } */
+
+static inline int
+foo (int *x, int y)
+{
+  int z = *x;
+  while (y > z)
+    z *= 2;
+  return z;
+}
+
+int
+main ()
+{
+  int i;
+  for (i = 1; i < 17; i++)
+    {
+      int j;
+      int k;
+      j = foo (&i, 7);
+      if (i >= 7)
+        k = i;
+      else if (i >= 4)
+        k = 8 + (i - 4) * 2;
+      else if (i == 3)
+        k = 12;
+      else
+        k = 8;
+      if (j != k)
+        __builtin_abort ();
+    }
+  return 0;
+}
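
For readers following the to_rename discussion, here is a rough standalone
sketch of the idea. It is not GCC code. It assumes a toy model in which a
basic block is just a bitmask of the registers it sets and a bitmask of the
registers it uses. All names (toy_block, toy_bbs_ok, toy_example_ok,
toy_cmove_arith, REG_X and so on) are made up for illustration and do not
exist in ifcvt.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not GCC's data structures: one bit per
   register number.  */
struct toy_block
{
  unsigned sets;   /* registers written by the block */
  unsigned uses;   /* registers read by the block */
};

enum { NO_REG = -1, REG_X = 0, REG_Y = 1, REG_T1 = 2 };

/* Return true iff no register set in A is used in B.  Conflicts on
   TO_RENAME are ignored, mirroring the intent of the new parameter;
   NO_REG means nothing is ignored.  */
static bool
toy_bbs_ok (struct toy_block a, struct toy_block b, int to_rename)
{
  assert (to_rename == NO_REG || (to_rename >= 0 && to_rename < 32));
  unsigned a_sets = a.sets;
  if (to_rename != NO_REG)
    a_sets &= ~(1u << to_rename);
  return (a_sets & b.uses) == 0;
}

/* The pair of blocks from the mail:
     A:  t1 = x + y;  x = t1 + 2;
     B:  x = x + 1;
   Check both directions, as noce_try_cmove_arith does.  */
static bool
toy_example_ok (int to_rename)
{
  struct toy_block a = { (1u << REG_T1) | (1u << REG_X),
                         (1u << REG_X) | (1u << REG_Y) | (1u << REG_T1) };
  struct toy_block b = { 1u << REG_X, 1u << REG_X };
  return toy_bbs_ok (a, b, to_rename) && toy_bbs_ok (b, a, to_rename);
}

/* Why the conflict on x is harmless: once both destinations are
   renamed to temporaries, x = cond ? x + 1 : x - 1 becomes two
   independent computations followed by a select.  */
static int
toy_cmove_arith (bool cond, int x)
{
  int tmp_a = x + 1;   /* THEN arm, destination renamed */
  int tmp_b = x - 1;   /* ELSE arm, destination renamed */
  return cond ? tmp_a : tmp_b;
}
```

In this model toy_example_ok (NO_REG) is false because both blocks set and
use x, while toy_example_ok (REG_X) is true because the only remaining set,
t1, is not read by the other block. The real function additionally walks the
insns and validates each set, which the sketch leaves out.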