From patchwork Wed Jul 29 13:49:31 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyrylo Tkachov X-Patchwork-Id: 501722 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id B75A314030C for ; Wed, 29 Jul 2015 23:49:45 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=Do6iq4fH; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; q= dns; s=default; b=fbhB85YPyrVL3OgfmUVKCbbdLY34gG9cO5N84ZJam0NV0Y 0P3oQ6qkekbQ3q71k9677D16YrTNsi9ACoQyF6T0XhzK6yvlNk5cVuz+tdmS5Tpd stVFlSzQV3aS7PPpc19IjwjXKXqVS/9NxXcZ795ZP6VXDg7AxGV/u+5sDpki4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; s= default; bh=p5cU13nkM2oLRho+V+3odNqTjSE=; b=Do6iq4fHbdd3FBNzRM4X 9OFmrRO/y1v4zsU5IV5Zxp7WUaVVz+31AeVStAkZTKGxfAn75x92pbrKu1NhgxgB xDPkQB192ad6Nl8VaY1bpNU6GhocMwOVA42wwmL1CXhfZHGp0xPE1xtjqoiCGBsB NwiqLwNIySacY0pB8oZOKLw= Received: (qmail 58957 invoked by alias); 29 Jul 2015 13:49:39 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 58947 invoked by uid 89); 29 Jul 2015 13:49:38 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL, BAYES_00, SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 29 Jul 2015 13:49:36 +0000 Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-4-u5iXsyRiRMSCin23T3Clww-1; Wed, 29 Jul 2015 14:49:31 +0100 Received: from [10.2.207.50] ([10.1.2.79]) by cam-owa2.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 29 Jul 2015 14:49:31 +0100 Message-ID: <55B8D9EB.3060806@arm.com> Date: Wed, 29 Jul 2015 14:49:31 +0100 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: GCC Patches Subject: [PATCH][RTL-ifcvt] Improve conditional select ops on immediates X-MC-Unique: u5iXsyRiRMSCin23T3Clww-1 X-IsSubscribed: yes Hi all, This patch improves RTL if-conversion on sequences that perform a conditional select on integer constants. Most of the smart logic to do that already exists in the noce_try_store_flag_constants function. However, currently that function is tried after noce_try_cmove. noce_try_cmove is a simple catch-all function that just loads the two immediates and performs a conditional select between them. It returns true and then the caller noce_process_if_block doesn't try any other transformations, completely skipping the more aggressive transformations that noce_try_store_flag_constants allows! Calling noce_try_store_flag_constants before noce_try_cmove allows for the smarter if-conversion transformations to be used. An example that occurs a lot in the gcc code itself is for the C code: int foo (int a, int b) { return ((a & (1 << 25)) ? 5 : 4); } i.e. test a bit in a and return 5 or 4. Currently on aarch64 this generates the naive: and w2, w0, 33554432 // mask away all bits except bit 25 mov w1, 4 cmp w2, wzr mov w0, 5 csel w0, w0, w1, ne whereas with this patch this can be transformed into the much better: ubfx x0, x0, 25, 1 // extract bit 25 add w0, w0, 4 Another issue I encountered is that the code that claims to perform the transformation: /* if (test) x = 3; else x = 4; => x = 3 + (test == 0); */ doesn't seem to do exactly that in all cases. In fact for that case it will try something like: x = 4 - (test == 0) which is suboptimal for targets like aarch64 which have a conditional increment operation. This patch tweaks that code to always try to generate an addition of the condition rather than a subtraction. Anyway, for code: int fooinc (int x) { return x ? 1025 : 1026; } we currently generate: mov w2, 1025 mov w1, 1026 cmp w0, wzr csel w0, w2, w1, ne whereas with this patch we will generate: cmp w0, wzr cset w0, eq add w0, w0, 1025 Bootstrapped and tested on arm, aarch64, x86_64. Ok for trunk? Thanks, Kyrill P.S. noce_try_store_flag_constants is currently gated on !targetm.have_conditional_execution () but I don't see any reason to restrict it on targets with conditional execution. For example, I think the first example above would show a benefit on arm if it was enabled there. But that can be a separate investigation. 2015-07-29 Kyrylo Tkachov * ifcvt.c (noce_try_store_flag_constants): Reverse when diff is -STORE_FLAG and condition is reversable. Prefer to add to the flag value. (noce_process_if_block): Try noce_try_store_flag_constants before noce_try_cmove. 2015-07-29 Kyrylo Tkachov * gcc.target/aarch64/csel_bfx_1.c: New test. * gcc.target/aarch64/csel_imms_inc_1.c: Likewise. commit 0164ef164483bdf0b2f73e267e2ff1df7800dd6d Author: Kyrylo Tkachov Date: Tue Jul 28 14:59:46 2015 +0100 [RTL-ifcvt] Improve conditional increment ops on immediates diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index a57d78c..80d0285 100644 --- a/gcc/ifcvt.c +++ b/gcc/ifcvt.c @@ -1222,7 +1222,7 @@ noce_try_store_flag_constants (struct noce_if_info *if_info) reversep = 0; if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE) - normalize = 0; + normalize = 0, reversep = (diff == -STORE_FLAG_VALUE) && can_reverse; else if (ifalse == 0 && exact_log2 (itrue) >= 0 && (STORE_FLAG_VALUE == 1 || if_info->branch_cost >= 2)) @@ -1261,10 +1261,13 @@ noce_try_store_flag_constants (struct noce_if_info *if_info) => x = 3 + (test == 0); */ if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE) { - target = expand_simple_binop (mode, - (diff == STORE_FLAG_VALUE - ? PLUS : MINUS), - gen_int_mode (ifalse, mode), target, + rtx_code code = reversep ? PLUS : + (diff == STORE_FLAG_VALUE ? PLUS + : MINUS); + HOST_WIDE_INT to_add = reversep ? MIN (ifalse, itrue) : ifalse; + + target = expand_simple_binop (mode, code, + gen_int_mode (to_add, mode), target, if_info->x, 0, OPTAB_WIDEN); } @@ -3120,13 +3123,14 @@ noce_process_if_block (struct noce_if_info *if_info) goto success; if (noce_try_abs (if_info)) goto success; + if (!targetm.have_conditional_execution () + && noce_try_store_flag_constants (if_info)) + goto success; if (HAVE_conditional_move && noce_try_cmove (if_info)) goto success; if (! targetm.have_conditional_execution ()) { - if (noce_try_store_flag_constants (if_info)) - goto success; if (noce_try_addcc (if_info)) goto success; if (noce_try_store_flag_mask (if_info)) diff --git a/gcc/testsuite/gcc.target/aarch64/csel_bfx_1.c b/gcc/testsuite/gcc.target/aarch64/csel_bfx_1.c new file mode 100644 index 0000000..c20597f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/csel_bfx_1.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-save-temps -O2" } */ + +int +foo (int a, int b) +{ + return ((a & (1 << 25)) ? 5 : 4); +} + +/* { dg-final { scan-assembler "ubfx\t\[xw\]\[0-9\]*.*" } } */ +/* { dg-final { scan-assembler-not "csel\tw\[0-9\]*.*" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/csel_imms_inc_1.c b/gcc/testsuite/gcc.target/aarch64/csel_imms_inc_1.c new file mode 100644 index 0000000..2ae434d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/csel_imms_inc_1.c @@ -0,0 +1,42 @@ +/* { dg-do run } */ +/* { dg-options "-save-temps -O2 -fno-inline" } */ + +extern void abort (void); + +int +fooinc (int x) +{ + if (x) + return 1025; + else + return 1026; +} + +int +fooinc2 (int x) +{ + if (x) + return 1026; + else + return 1025; +} + +int +main (void) +{ + if (fooinc (0) != 1026) + abort (); + + if (fooinc (1) != 1025) + abort (); + + if (fooinc2 (0) != 1025) + abort (); + + if (fooinc2 (1) != 1026) + abort (); + + return 0; +} + +/* { dg-final { scan-assembler-not "csel\tw\[0-9\]*.*" } } */