From patchwork Mon Aug 10 11:14:01 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrylo Tkachov
X-Patchwork-Id: 505559
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Message-ID: <55C88779.4080801@arm.com>
Date: Mon, 10 Aug 2015 12:14:01 +0100
From: Kyrill Tkachov
To: GCC Patches
CC: Ramana Radhakrishnan, Richard Earnshaw, Marcus Shawcroft, James Greenhalgh
Subject: Re: [PATCH][ARM][3/3] Expand mod by power of 2
References: <55B219AE.6010102@arm.com> <55BB2FDD.2010900@arm.com>
In-Reply-To: <55BB2FDD.2010900@arm.com>

Here is a slight respin.
The important parts are the same; the expander now uses the slightly shorter
arm_gen_compare_reg, and the rtx costs hunk is moved under an explicit case MOD.

Note that the tests still require patch 1/3, which does this for aarch64;
I hope to post a respun version of it soon.

Ok after the prerequisite goes in?

Thanks,
Kyrill

2015-08-10  Kyrylo Tkachov

    * config/arm/arm.md (*subsi3_compare0): Rename to...
    (subsi3_compare0): ... This.
    (*arm_andsi3_insn): Rename to...
    (arm_andsi3_insn): ... This.
    (modsi3): New define_expand.
    * config/arm/arm.c (arm_new_rtx_costs, MOD case): Handle case
    when operand is power of 2.

2015-08-10  Kyrylo Tkachov

    * gcc.target/aarch64/mod_2.x: New file.
    * gcc.target/aarch64/mod_256.x: Likewise.
    * gcc.target/arm/mod_2.c: New test.
    * gcc.target/arm/mod_256.c: Likewise.
    * gcc.target/aarch64/mod_2.c: Likewise.
    * gcc.target/aarch64/mod_256.c: Likewise.

On 31/07/15 09:20, Kyrill Tkachov wrote:
> Ping.
>
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02037.html
> Thanks,
> Kyrill
>
> On 24/07/15 11:55, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This third patch implements the same algorithm as patch 1/3 but for arm.
>> That is, for X % N where N is a power of 2 we do:
>>
>>   rsbs    r1, r0, #0
>>   and     r0, r0, #(N - 1)
>>   and     r1, r1, #(N - 1)
>>   rsbpl   r0, r1, #0
>>
>> For the special case where N is 2 we do the shorter:
>>   cmp     r0, #0
>>   and     r0, r0, #1
>>   rsblt   r0, r0, #0
>>
>> Note that for the final conditional negate we expand to an IF_THEN_ELSE of a NEG
>> rather than a cond_exec rtx because the lra dataflow analysis doesn't always deal
>> with cond_execs correctly. The splitters fixed in patch 2/3 then break it into a
>> cond_exec after reload, so it all works out.
>>
>> Bootstrapped and tested on arm, with both ARM and Thumb2 states.
>>
>> Tests are added and shared with aarch64.
>>
>> Ok for trunk?
>>
>> Thanks,
>> Kyrill
>>
>> 2015-07-24  Kyrylo Tkachov
>>
>>     * config/arm/arm.md (*subsi3_compare0): Rename to...
>>     (subsi3_compare0): ... This.
>>     (*arm_andsi3_insn): Rename to...
>>     (arm_andsi3_insn): ... This.
>>     (modsi3): New define_expand.
>>     * config/arm/arm.c (arm_new_rtx_costs, MOD case): Handle case
>>     operand is power of 2.
>>
>>
>> 2015-07-24  Kyrylo Tkachov
>>
>>     * gcc.target/aarch64/mod_2.x: New file.
>>     * gcc.target/aarch64/mod_256.x: Likewise.
>>     * gcc.target/arm/mod_2.c: New test.
>>     * gcc.target/arm/mod_256.c: Likewise.
>>     * gcc.target/aarch64/mod_2.c: Likewise.
>>     * gcc.target/aarch64/mod_256.c: Likewise.
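
For readers following the two instruction sequences quoted above, here is a minimal C sketch of what they compute. It is only a model under stated assumptions, not code from the patch: the names mod_pow2 and mod_2 are made up for illustration, n is assumed to be a power of 2 between 2 and 2^30, and the negation is done in unsigned arithmetic so that the INT32_MIN case wraps the way rsbs does instead of being undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the four-instruction sequence for x % n, n a power of 2.
   mod_pow2 is an illustrative name, not something defined by the patch.  */
static int32_t mod_pow2(int32_t x, uint32_t n)
{
    /* Precondition assumed by this sketch: n is a power of 2, 2 <= n <= 2^30.  */
    assert(n >= 2 && n <= (UINT32_C(1) << 30) && (n & (n - 1)) == 0);

    uint32_t mask = n - 1;
    uint32_t ux = (uint32_t)x;
    uint32_t neg = 0u - ux;        /* rsbs  r1, r0, #0  (wraps for INT32_MIN) */
    uint32_t lo = ux & mask;       /* and   r0, r0, #(n - 1) */
    uint32_t neg_lo = neg & mask;  /* and   r1, r1, #(n - 1) */

    /* rsbpl r0, r1, #0: executed when the N flag set by rsbs is clear,
       i.e. when the sign bit of -x is 0.  */
    if ((neg >> 31) == 0)
        return -(int32_t)neg_lo;
    return (int32_t)lo;
}

/* Model of the shorter sequence for x % 2 (illustrative name).  */
static int32_t mod_2(int32_t x)
{
    int32_t r = (int32_t)((uint32_t)x & 1u);  /* and   r0, r0, #1 */
    return x < 0 ? -r : r;                    /* cmp   r0, #0 ; rsblt r0, r0, #0 */
}
```

A quick way to gain confidence in the model is to compare mod_pow2 (x, 256) and mod_2 (x) against the C expressions x % 256 and x % 2 over a range of positive and negative inputs, including INT32_MIN.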
commit 7d0da77d73552d8e683525f4e6fb8bc660ed1c56
Author: Kyrylo Tkachov
Date:   Fri Jul 17 16:30:01 2015 +0100

    [ARM][3/3] Expand mod by power of 2

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 1ea9e27..a607a5c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9559,6 +9559,24 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
       return false; /* All arguments must be in registers.  */
 
     case MOD:
+      /* MOD by a power of 2 can be expanded as:
+         rsbs    r1, r0, #0
+         and     r0, r0, #(n - 1)
+         and     r1, r1, #(n - 1)
+         rsbpl   r0, r1, #0.  */
+      if (CONST_INT_P (XEXP (x, 1))
+          && exact_log2 (INTVAL (XEXP (x, 1))) > 0
+          && mode == SImode)
+        {
+          *cost += COSTS_N_INSNS (3);
+
+          if (speed_p)
+            *cost += 2 * extra_cost->alu.logical
+                     + extra_cost->alu.arith;
+          return true;
+        }
+
+      /* Fall-through.  */
     case UMOD:
       *cost = LIBCALL_COST (2);
       return false; /* All arguments must be in registers.  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 817860d..652ec51 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1229,7 +1229,7 @@ (define_peephole2
   ""
 )
 
-(define_insn "*subsi3_compare0"
+(define_insn "subsi3_compare0"
   [(set (reg:CC_NOOV CC_REGNUM)
         (compare:CC_NOOV
          (minus:SI (match_operand:SI 1 "arm_rhs_operand" "r,r,I")
@@ -2158,7 +2158,7 @@ (define_expand "andsi3"
 )
 
 ; ??? Check split length for Thumb-2
-(define_insn_and_split "*arm_andsi3_insn"
+(define_insn_and_split "arm_andsi3_insn"
   [(set (match_operand:SI 0 "s_register_operand" "=r,l,r,r,r")
         (and:SI (match_operand:SI 1 "s_register_operand" "%r,0,r,r,r")
                 (match_operand:SI 2 "reg_or_int_operand" "I,l,K,r,?n")))]
@@ -11143,6 +11143,76 @@ (define_expand "thumb_legacy_rev"
   ""
 )
 
+;; ARM-specific expansion of signed mod by power of 2
+;; using conditional negate.
+;; For r0 % n where n is a power of 2 produce:
+;; rsbs    r1, r0, #0
+;; and     r0, r0, #(n - 1)
+;; and     r1, r1, #(n - 1)
+;; rsbpl   r0, r1, #0
+
+(define_expand "modsi3"
+  [(match_operand:SI 0 "register_operand" "")
+   (match_operand:SI 1 "register_operand" "")
+   (match_operand:SI 2 "const_int_operand" "")]
+  "TARGET_32BIT"
+  {
+    HOST_WIDE_INT val = INTVAL (operands[2]);
+
+    if (val <= 0
+       || exact_log2 (INTVAL (operands[2])) <= 0
+       || !const_ok_for_arm (INTVAL (operands[2]) - 1))
+      FAIL;
+
+    rtx mask = GEN_INT (val - 1);
+
+    /* In the special case of r0 % 2 we can do the even shorter:
+        cmp     r0, #0
+        and     r0, r0, #1
+        rsblt   r0, r0, #0.  */
+
+    if (val == 2)
+      {
+        rtx cc_reg = arm_gen_compare_reg (LT,
+                                          operands[1], const0_rtx, NULL_RTX);
+        rtx cond = gen_rtx_LT (SImode, cc_reg, const0_rtx);
+        rtx masked = gen_reg_rtx (SImode);
+
+        emit_insn (gen_arm_andsi3_insn (masked, operands[1], mask));
+        emit_move_insn (operands[0],
+                        gen_rtx_IF_THEN_ELSE (SImode, cond,
+                                              gen_rtx_NEG (SImode,
+                                                           masked),
+                                              masked));
+        DONE;
+      }
+
+    rtx neg_op = gen_reg_rtx (SImode);
+    rtx_insn *insn = emit_insn (gen_subsi3_compare0 (neg_op, const0_rtx,
+                                                      operands[1]));
+
+    /* Extract the condition register and mode.  */
+    rtx cmp = XVECEXP (PATTERN (insn), 0, 0);
+    rtx cc_reg = SET_DEST (cmp);
+    rtx cond = gen_rtx_GE (SImode, cc_reg, const0_rtx);
+
+    emit_insn (gen_arm_andsi3_insn (operands[0], operands[1], mask));
+
+    rtx masked_neg = gen_reg_rtx (SImode);
+    emit_insn (gen_arm_andsi3_insn (masked_neg, neg_op, mask));
+
+    /* We want a conditional negate here, but emitting COND_EXEC rtxes
+       during expand does not always work.  Do an IF_THEN_ELSE instead.  */
+    emit_move_insn (operands[0],
+                    gen_rtx_IF_THEN_ELSE (SImode, cond,
+                                          gen_rtx_NEG (SImode, masked_neg),
+                                          operands[0]));
+
+
+    DONE;
+  }
+)
+
 (define_expand "bswapsi2"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
   (bswap:SI (match_operand:SI 1 "s_register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/mod_2.c b/gcc/testsuite/gcc.target/aarch64/mod_2.c
new file mode 100644
index 0000000..2645c18
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/mod_2.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+#include "mod_2.x"
+
+/* { dg-final { scan-assembler "csneg\t\[wx\]\[0-9\]*" } } */
+/* { dg-final { scan-assembler-times "and\t\[wx\]\[0-9\]*" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/mod_2.x b/gcc/testsuite/gcc.target/aarch64/mod_2.x
new file mode 100644
index 0000000..2b079a4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/mod_2.x
@@ -0,0 +1,5 @@
+int
+f (int x)
+{
+  return x % 2;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/mod_256.c b/gcc/testsuite/gcc.target/aarch64/mod_256.c
new file mode 100644
index 0000000..567332c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/mod_256.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+#include "mod_256.x"
+
+/* { dg-final { scan-assembler "csneg\t\[wx\]\[0-9\]*" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/mod_256.x b/gcc/testsuite/gcc.target/aarch64/mod_256.x
new file mode 100644
index 0000000..c1de42c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/mod_256.x
@@ -0,0 +1,5 @@
+int
+f (int x)
+{
+  return x % 256;
+}
diff --git a/gcc/testsuite/gcc.target/arm/mod_2.c b/gcc/testsuite/gcc.target/arm/mod_2.c
new file mode 100644
index 0000000..93017a1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mod_2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm32 } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+#include "../aarch64/mod_2.x"
+
+/* { dg-final { scan-assembler "rsblt\tr\[0-9\]*" } } */
+/* { dg-final { scan-assembler-times "and\tr\[0-9\].*1" 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/mod_256.c b/gcc/testsuite/gcc.target/arm/mod_256.c
new file mode 100644
index 0000000..92ab05a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mod_256.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm32 } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+#include "../aarch64/mod_256.x"
+
+/* { dg-final { scan-assembler "rsbpl\tr\[0-9\]*" } } */
+/* { dg-final { scan-assembler "and\tr\[0-9\].*255" } } */