From patchwork Mon Jun 29 16:50:08 2020
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 1318933
From: "Roger Sayle" <roger@nextmovesoftware.com>
To: "'Richard Sandiford'"
Subject: [committed] middle-end: Optimize (A&C)^(B&C) to (A^B)&C in simplify_rtx (take 3).
Date: Mon, 29 Jun 2020 17:50:08 +0100
Message-ID: <00ca01d64e35$59cb3090$0d6191b0$@nextmovesoftware.com>
Cc: gcc-patches@gcc.gnu.org

Hi Richard,

Many thanks for the review.  Many thanks also to Hans-Peter Nilsson,
Joseph Myers and overseers@gcc.gnu.org for helping get my ssh keys
updated.  I've taken the opportunity of committing this patch to check
that everything is working.  For the record, here's the final version
as committed.  I've added the (xor (ashiftrt x c) (ashiftrt y c)) case
as per your suggestion, which fires 6 times during make -k check on
x86_64-pc-linux-gnu.

Cheers,
Roger
---

-----Original Message-----
From: Richard Sandiford
Sent: 22 June 2020 20:41
To: Roger Sayle
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH take 2] middle-end: Optimize (A&C)^(B&C) to (A^B)&C in simplify_rtx.
Hi Roger,

Thanks for the update and sorry for the slow reply.

"Roger Sayle" writes:
> As suggested by Richard Sandiford, this patch splits out the previous
> RTL simplification, of (X&C)^(Y&C) to (X^Y)&C, to its own function,
> simplify_distributive_operation, and calls it when appropriate for
> IOR, XOR and AND.
>
> Instrumenting a bootstrap reveals this optimization triggers
> 393358 times during stage2, stage3 and building the libraries, and a
> further 263659 times during make check.  By order of occurrence the
> RTL transformation opportunities are:
>
> 284447 01 and ior
> 156798 00 xor and
> 131110 11 and ior
>  47253 00 and ior
>  28035 00 ior and
>   2804 01 ior and
>   2698 11 ior and
>   2262 01 xor and
>    602 11 xor and
>    312 00 xor xor
>    298 00 xor rotate
>    120 00 and ashift
>    108 00 ior lshiftrt
>     60 00 and ashiftrt
>     54 00 ior ashift
>     18 00 and lshiftrt
>     12 10 xor and
>     12 00 xor rotatert
>      8 00 xor lshiftrt
>      4 10 ior and
>      2 00 ior ashiftrt

That's an impressive number of hits :-)

> where the two binary digits denote the surviving inner unique
> operands, so "00 xor and" corresponds to the original (xor (and X Z)
> (and Y Z)), and "01 and ior" corresponds to (and (ior X Y) (ior Y Z)).
>
> Many thanks also to Richard for pointing out simplify_rtx_c_tests, the
> self-testing framework in simplify-rtx.c, which is new since my day.
> This patch supplements the existing vector mode testing, with a suite
> of scalar integer mode tests that confirm that many of the expected
> integer simplifications in simplify-rtx are being applied as expected.
> This includes three tests of the new simplify_distributive_operation.
>
> Before:
> xgcc -xc -fself-test: 59693 pass(es) in 0.820956 seconds
> xgcc -xc++ -fself-test: 59723 pass(es) in 0.786662 seconds
> After:
> xgcc -xc -fself-test: 60003 pass(es) in 0.860637 seconds
> xgcc -xc++ -fself-test: 60033 pass(es) in 0.794624 seconds
>
> I do have one thought/suggestion around test_scalar_ops for future
> generations.
> These tests are extremely strict; instead of an
> unexpected failure in the testsuite, breaking a self-test stops the
> build.  Instead of reverting this patch, should anything go wrong (in
> future on a misbehaving platform), might I instead propose simply
> commenting out the call to test_scalar_ops in simplify_rtx_c_tests as
> a mitigation strategy whilst the build is restored.  In fact, removing
> the "static" from test_scalar_ops would avoid the "defined but not
> used" complication from this disaster recovery plan.

Yeah, we can work around it rather than revert the patch.

> This patch has been tested with "make bootstrap" and "make -k check"
> on x86_64-pc-linux-gnu with no regressions.
>
> 2020-06-16  Roger Sayle
>	      Richard Sandiford

Thanks for the gesture, but I don't think I should be co-author here.
I didn't write anything :-)

> […]
> @@ -3064,6 +3112,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
> 	}
>      }
>
> +  /* Convert (ior (and A C) (and B C)) into (and (ior A B) C).  */
> +  if (GET_CODE (op0) == GET_CODE (op1)
> +      && (GET_CODE (op0) == AND
> +	  || GET_CODE (op0) == IOR
> +	  || GET_CODE (op0) == LSHIFTRT
> +	  || GET_CODE (op0) == ASHIFTRT
> +	  || GET_CODE (op0) == ASHIFT
> +	  || GET_CODE (op0) == ROTATE
> +	  || GET_CODE (op0) == ROTATERT))
> +    {
> +      tem = simplify_distributive_operation (code, mode, op0, op1);
> +      if (tem)
> +	return tem;
> +    }
> +
>    tem = simplify_byte_swapping_operation (code, mode, op0, op1);
>    if (tem)
>      return tem;
> @@ -3302,6 +3365,20 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
> 	  && (reversed = reversed_comparison (op0, int_mode)))
> 	return reversed;
>
> +  /* Convert (xor (and A C) (and B C)) into (and (xor A B) C).  */
> +  if (GET_CODE (op0) == GET_CODE (op1)
> +      && (GET_CODE (op0) == AND
> +	  || GET_CODE (op0) == XOR
> +	  || GET_CODE (op0) == LSHIFTRT
> +	  || GET_CODE (op0) == ASHIFT
> +	  || GET_CODE (op0) == ROTATE
> +	  || GET_CODE (op0) == ROTATERT))
> +    {
> +      tem = simplify_distributive_operation (code, mode, op0, op1);
> +      if (tem)
> +	return tem;
> +    }
> +
>    tem = simplify_byte_swapping_operation (code, mode, op0, op1);
>    if (tem)
>      return tem;
> @@ -3500,6 +3577,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
> 	  && rtx_equal_p (op1, XEXP (XEXP (op0, 1), 0)))
> 	return simplify_gen_binary (AND, mode, op1, XEXP (op0, 0));
>
> +  /* Convert (and (ior A C) (ior B C)) into (ior (and A B) C).  */
> +  if (GET_CODE (op0) == GET_CODE (op1)
> +      && (GET_CODE (op0) == AND
> +	  || GET_CODE (op0) == IOR
> +	  || GET_CODE (op0) == LSHIFTRT
> +	  || GET_CODE (op0) == ASHIFTRT
> +	  || GET_CODE (op0) == ASHIFT
> +	  || GET_CODE (op0) == ROTATE
> +	  || GET_CODE (op0) == ROTATERT))
> +    {
> +      tem = simplify_distributive_operation (code, mode, op0, op1);
> +      if (tem)
> +	return tem;
> +    }
> +
>    tem = simplify_byte_swapping_operation (code, mode, op0, op1);
>    if (tem)
>      return tem;

Any reason to exclude ASHIFTRT from the XOR case?  AFAICT it distributes
in the same way: for equal shifts, the ASHIFTRT reproduces the same
number of sign bits, which are handled by all logical ops in the same
way as the original sign bit.  Thus doing the logical OP first on the
original sign bit and then shifting will have the same effect.

LGTM otherwise, thanks.  If you can update the patch then I'll push it.

Richard

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 28c2dc6..cfb338e 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -2392,6 +2392,54 @@ simplify_binary_operation_series (rtx_code code, machine_mode mode,
   return gen_vec_series (mode, new_base, new_step);
 }
 
+/* Subroutine of simplify_binary_operation_1.  Un-distribute a binary
+   operation CODE with result mode MODE, operating on OP0 and OP1.
+   e.g. simplify (xor (and A C) (and B C)) to (and (xor A B) C).
+   Returns NULL_RTX if no simplification is possible.  */
+
+static rtx
+simplify_distributive_operation (enum rtx_code code, machine_mode mode,
+				 rtx op0, rtx op1)
+{
+  enum rtx_code op = GET_CODE (op0);
+  gcc_assert (GET_CODE (op1) == op);
+
+  if (rtx_equal_p (XEXP (op0, 1), XEXP (op1, 1))
+      && ! side_effects_p (XEXP (op0, 1)))
+    return simplify_gen_binary (op, mode,
+				simplify_gen_binary (code, mode,
+						     XEXP (op0, 0),
+						     XEXP (op1, 0)),
+				XEXP (op0, 1));
+
+  if (GET_RTX_CLASS (op) == RTX_COMM_ARITH)
+    {
+      if (rtx_equal_p (XEXP (op0, 0), XEXP (op1, 0))
+	  && ! side_effects_p (XEXP (op0, 0)))
+	return simplify_gen_binary (op, mode,
+				    simplify_gen_binary (code, mode,
+							 XEXP (op0, 1),
+							 XEXP (op1, 1)),
+				    XEXP (op0, 0));
+      if (rtx_equal_p (XEXP (op0, 0), XEXP (op1, 1))
+	  && ! side_effects_p (XEXP (op0, 0)))
+	return simplify_gen_binary (op, mode,
+				    simplify_gen_binary (code, mode,
+							 XEXP (op0, 1),
+							 XEXP (op1, 0)),
+				    XEXP (op0, 0));
+      if (rtx_equal_p (XEXP (op0, 1), XEXP (op1, 0))
+	  && ! side_effects_p (XEXP (op0, 1)))
+	return simplify_gen_binary (op, mode,
+				    simplify_gen_binary (code, mode,
+							 XEXP (op0, 0),
+							 XEXP (op1, 1)),
+				    XEXP (op0, 1));
+    }
+
+  return NULL_RTX;
+}
+
 /* Subroutine of simplify_binary_operation.  Simplify a binary operation
    CODE with result mode MODE, operating on OP0 and OP1.  If OP0 and/or
    OP1 are constant pool references, TRUEOP0 and TRUEOP1 represent the
@@ -3064,6 +3112,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	}
     }
 
+  /* Convert (ior (and A C) (and B C)) into (and (ior A B) C).  */
+  if (GET_CODE (op0) == GET_CODE (op1)
+      && (GET_CODE (op0) == AND
+	  || GET_CODE (op0) == IOR
+	  || GET_CODE (op0) == LSHIFTRT
+	  || GET_CODE (op0) == ASHIFTRT
+	  || GET_CODE (op0) == ASHIFT
+	  || GET_CODE (op0) == ROTATE
+	  || GET_CODE (op0) == ROTATERT))
+    {
+      tem = simplify_distributive_operation (code, mode, op0, op1);
+      if (tem)
+	return tem;
+    }
+
   tem = simplify_byte_swapping_operation (code, mode, op0, op1);
   if (tem)
     return tem;
@@ -3302,6 +3365,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	  && (reversed = reversed_comparison (op0, int_mode)))
 	return reversed;
 
+  /* Convert (xor (and A C) (and B C)) into (and (xor A B) C).  */
+  if (GET_CODE (op0) == GET_CODE (op1)
+      && (GET_CODE (op0) == AND
+	  || GET_CODE (op0) == XOR
+	  || GET_CODE (op0) == LSHIFTRT
+	  || GET_CODE (op0) == ASHIFTRT
+	  || GET_CODE (op0) == ASHIFT
+	  || GET_CODE (op0) == ROTATE
+	  || GET_CODE (op0) == ROTATERT))
+    {
+      tem = simplify_distributive_operation (code, mode, op0, op1);
+      if (tem)
+	return tem;
+    }
+
   tem = simplify_byte_swapping_operation (code, mode, op0, op1);
   if (tem)
     return tem;
@@ -3500,6 +3578,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	  && rtx_equal_p (op1, XEXP (XEXP (op0, 1), 0)))
 	return simplify_gen_binary (AND, mode, op1, XEXP (op0, 0));
 
+  /* Convert (and (ior A C) (ior B C)) into (ior (and A B) C).  */
+  if (GET_CODE (op0) == GET_CODE (op1)
+      && (GET_CODE (op0) == AND
+	  || GET_CODE (op0) == IOR
+	  || GET_CODE (op0) == LSHIFTRT
+	  || GET_CODE (op0) == ASHIFTRT
+	  || GET_CODE (op0) == ASHIFT
+	  || GET_CODE (op0) == ROTATE
+	  || GET_CODE (op0) == ROTATERT))
+    {
+      tem = simplify_distributive_operation (code, mode, op0, op1);
+      if (tem)
+	return tem;
+    }
+
   tem = simplify_byte_swapping_operation (code, mode, op0, op1);
   if (tem)
     return tem;
@@ -7218,6 +7311,81 @@ make_test_reg (machine_mode mode)
   return gen_rtx_REG (mode, test_reg_num++);
 }
 
+static void
+test_scalar_int_ops (machine_mode mode)
+{
+  rtx op0 = make_test_reg (mode);
+  rtx op1 = make_test_reg (mode);
+  rtx six = GEN_INT (6);
+
+  rtx neg_op0 = simplify_gen_unary (NEG, mode, op0, mode);
+  rtx not_op0 = simplify_gen_unary (NOT, mode, op0, mode);
+  rtx bswap_op0 = simplify_gen_unary (BSWAP, mode, op0, mode);
+
+  rtx and_op0_op1 = simplify_gen_binary (AND, mode, op0, op1);
+  rtx ior_op0_op1 = simplify_gen_binary (IOR, mode, op0, op1);
+  rtx xor_op0_op1 = simplify_gen_binary (XOR, mode, op0, op1);
+
+  rtx and_op0_6 = simplify_gen_binary (AND, mode, op0, six);
+  rtx and_op1_6 = simplify_gen_binary (AND, mode, op1, six);
+
+  /* Test some binary identities.  */
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (PLUS, mode, op0, const0_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (PLUS, mode, const0_rtx, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (MINUS, mode, op0, const0_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (MULT, mode, op0, const1_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (MULT, mode, const1_rtx, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (DIV, mode, op0, const1_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (AND, mode, op0, constm1_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (AND, mode, constm1_rtx, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (IOR, mode, op0, const0_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (IOR, mode, const0_rtx, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (XOR, mode, op0, const0_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (XOR, mode, const0_rtx, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (ASHIFT, mode, op0, const0_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (ROTATE, mode, op0, const0_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (ASHIFTRT, mode, op0, const0_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (LSHIFTRT, mode, op0, const0_rtx));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (ROTATERT, mode, op0, const0_rtx));
+
+  /* Test some self-inverse operations.  */
+  ASSERT_RTX_EQ (op0, simplify_gen_unary (NEG, mode, neg_op0, mode));
+  ASSERT_RTX_EQ (op0, simplify_gen_unary (NOT, mode, not_op0, mode));
+  ASSERT_RTX_EQ (op0, simplify_gen_unary (BSWAP, mode, bswap_op0, mode));
+
+  /* Test some reflexive operations.  */
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (AND, mode, op0, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (IOR, mode, op0, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (SMIN, mode, op0, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (SMAX, mode, op0, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (UMIN, mode, op0, op0));
+  ASSERT_RTX_EQ (op0, simplify_gen_binary (UMAX, mode, op0, op0));
+
+  ASSERT_RTX_EQ (const0_rtx, simplify_gen_binary (MINUS, mode, op0, op0));
+  ASSERT_RTX_EQ (const0_rtx, simplify_gen_binary (XOR, mode, op0, op0));
+
+  /* Test simplify_distributive_operation.  */
+  ASSERT_RTX_EQ (simplify_gen_binary (AND, mode, xor_op0_op1, six),
+		 simplify_gen_binary (XOR, mode, and_op0_6, and_op1_6));
+  ASSERT_RTX_EQ (simplify_gen_binary (AND, mode, ior_op0_op1, six),
+		 simplify_gen_binary (IOR, mode, and_op0_6, and_op1_6));
+  ASSERT_RTX_EQ (simplify_gen_binary (AND, mode, and_op0_op1, six),
+		 simplify_gen_binary (AND, mode, and_op0_6, and_op1_6));
+}
+
+/* Verify some simplifications involving scalar expressions.  */
+
+static void
+test_scalar_ops ()
+{
+  for (unsigned int i = 0; i < NUM_MACHINE_MODES; ++i)
+    {
+      machine_mode mode = (machine_mode) i;
+      if (SCALAR_INT_MODE_P (mode) && mode != BImode)
+	test_scalar_int_ops (mode);
+    }
+}
+
 /* Test vector simplifications involving VEC_DUPLICATE in which the
    operands and result have vector mode MODE.  SCALAR_REG is a pseudo
    register that holds one element of MODE.  */
@@ -7735,6 +7903,7 @@ simplify_const_poly_int_tests::run ()
 void
 simplify_rtx_c_tests ()
 {
+  test_scalar_ops ();
   test_vector_ops ();
   simplify_const_poly_int_tests::run ();
 }