From patchwork Sun Aug 9 10:30:32 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 1342583
From: "Roger Sayle"
To: "'GCC Patches'"
Subject: [PATCH] middle-end: Correct calculation of mul_widen_cost and mul_highpart_cost.
Date: Sun, 9 Aug 2020 11:30:32 +0100
Message-ID: <00a801d66e38$1d22eb60$5768c220$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

This patch fixes a subtle bug in the depths of GCC's synth_mult, where the
middle-end queries whether (and how well) the target supports widening and
highpart multiplications by calling targetm.rtx_costs.  The code in
init_expmed and init_expmed_one_mode iterates over various RTL patterns,
querying the cost of each.  To avoid generating and garbage collecting too
much junk, it reuses the same RTL over and over, adjusting the modes
between calls.  Alas, this reuse of state is a little fragile, and at some
point a change to init_expmed_one_conv resulted in the state (the mode of
a register) being changed but not reset before being used again.

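To make the failure mode described above concrete, here is a minimal sketch of the same class of bug, outside of GCC.  All names (struct scratch, one_conv, mode_seen_by_widen_query, the enum values) are hypothetical stand-ins chosen for illustration; this is not the code in expmed.c, only a model of one shared scratch object whose mode is changed by one query and silently inherited by the next.

```c
/* Hypothetical stand-ins for machine modes; not GCC's real enumeration.  */
enum mode { QI = 8, HI = 16, TI = 128 };

/* One scratch "register" shared by every cost query (models all->reg).  */
struct scratch { enum mode reg_mode; };

/* Models a conversion-cost query: it retargets the scratch register to
   FROM_MODE.  When RESTORE is zero the changed mode is left behind.  */
void
one_conv (struct scratch *all, enum mode to_mode, enum mode from_mode,
          int restore)
{
  all->reg_mode = from_mode;
  /* ... a cost query on the shared RTL would happen here ...  */
  if (restore)
    all->reg_mode = to_mode;
}

/* Returns the register mode that a later widening-multiply query would
   observe after a series of conversion queries for QImode.  */
enum mode
mode_seen_by_widen_query (int restore)
{
  struct scratch all = { QI };
  one_conv (&all, QI, HI, restore);
  one_conv (&all, QI, TI, restore);
  return all.reg_mode;
}
```

In this model, mode_seen_by_widen_query (0) returns TI (the stale mode left by the last conversion query), while mode_seen_by_widen_query (1) returns QI, the mode the later query assumes.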
The fallout is that GCC currently queries the backend for the cost of
nonsense RTL such as:

(mult:HI (zero_extend:HI (reg:TI 82))
    (zero_extend:HI (reg:TI 82)))

and

(lshiftrt:HI (mult:HI (zero_extend:HI (reg:TI 82))
        (zero_extend:HI (reg:TI 82)))
    (const_int 8 [0x8]))

The fix is to set the mode of the register back to its assumed state, as
(reg:QI 82) in the above patterns makes much more sense.  Using the old
software engineering/defensive programming maxim of "why fix a bug just
once, if it can be fixed in multiple places", this patch both restores the
original value in init_expmed_one_conv, and also sets it to the expected
value in init_expmed_one_mode.  This should hopefully signal the need to be
careful of invariants for anyone modifying this code in future.

Alas, things are rarely simple...  Fixing this obviously incorrect logic
causes a failure of gcc.target/i386/pr71321.c, which tests for a particular
expansion from synth_mult.  The issue here is that this test checks for a
specific multiplication expansion, when it should really check that we
don't generate the inefficient "leal 0(,%rax,4), %edx" forms that were
produced in GCC v6, as reported in PR target/71321.  Now that we use
correct costs, GCC uses a multiply instruction, matching icc, LLVM and GCC
prior to v4.8.  I've even microbenchmarked this function (over several
minutes) with (disappointingly) no difference in timings.  Three dependent
leas have 3-cycle latency, exactly the same as a widening byte multiply (on
Haswell), so the shorter code splits the tie.  [I have a follow-up patch
that may improve things further].

Before:
        movzbl  %dil, %eax
        leal    (%rax,%rax,4), %edx
        leal    (%rax,%rdx,8), %edx
        leal    (%rdx,%rdx,4), %edx
        shrw    $11, %dx
        leal    (%rdx,%rdx,4), %eax
        ...

After:
        movl    $-51, %edx
        movl    %edx, %eax
        mulb    %dil
        movl    %eax, %edx
        shrw    $11, %dx
        leal    (%rdx,%rdx,4), %eax
        ...

This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
and "make -k check" with no new failures.  Ok for mainline?

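As an aside on the "After" sequence: the constant -51, taken as an unsigned byte, is 205 (0xCD), so "mulb %dil" followed by "shrw $11" computes (i * 205) >> 11.  Reading that as an unsigned 8-bit division by 10 is an assumption on my part, based on the call cvt_to_2digit(i, 10) in the test case.  The sketch below (function names are made up for illustration) checks that the multiply-and-shift form agrees with i / 10 for every byte value.

```c
#include <stdint.h>

/* Divide an 8-bit value by 10 using multiply-and-shift: multiply by 205
   (0xCD, i.e. -51 as a signed byte) and shift the 16-bit product right
   by 11.  The largest product is 255 * 205 = 52275, which fits in 16 bits.  */
unsigned
div10_by_multiply (uint8_t i)
{
  uint16_t product = (uint16_t) (i * 205u);  /* models the 8x8->16 mulb */
  return (unsigned) (product >> 11);         /* models shrw $11 */
}

/* Returns 1 if the multiply-and-shift form equals i / 10 for every
   8-bit input, and 0 otherwise.  */
int
div10_matches_for_all_bytes (void)
{
  for (unsigned i = 0; i < 256; i++)
    if (div10_by_multiply ((uint8_t) i) != i / 10)
      return 0;
  return 1;
}
```

Calling div10_matches_for_all_bytes () from a small test driver is enough to exercise all 256 inputs.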
2020-08-09  Roger Sayle

gcc/ChangeLog
	* expmed.c (init_expmed_one_conv): Restore all->reg's mode.
	(init_expmed_one_mode): Set all->reg to desired mode.

gcc/testsuite/ChangeLog
	PR target/71321
	* gcc.target/i386/pr71321.c: Check that the code doesn't use
	the 4B zero displacement lea, not that it uses lea.

Thanks in advance,
Roger
---
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/expmed.c b/gcc/expmed.c
index 3d2d234..d34f0fb 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -155,6 +155,8 @@ init_expmed_one_conv (struct init_expmed_rtl *all, scalar_int_mode to_mode,
   PUT_MODE (all->reg, from_mode);
   set_convert_cost (to_mode, from_mode, speed,
 		    set_src_cost (which, to_mode, speed));
+  /* Restore all->reg's mode.  */
+  PUT_MODE (all->reg, to_mode);
 }
 
 static void
@@ -229,6 +231,7 @@ init_expmed_one_mode (struct init_expmed_rtl *all,
   if (GET_MODE_CLASS (int_mode_to) == MODE_INT
       && GET_MODE_WIDER_MODE (int_mode_to).exists (&wider_mode))
     {
+      PUT_MODE (all->reg, mode);
       PUT_MODE (all->zext, wider_mode);
       PUT_MODE (all->wide_mult, wider_mode);
       PUT_MODE (all->wide_lshr, wider_mode);
diff --git a/gcc/testsuite/gcc.target/i386/pr71321.c b/gcc/testsuite/gcc.target/i386/pr71321.c
index 86ad812..24d144b 100644
--- a/gcc/testsuite/gcc.target/i386/pr71321.c
+++ b/gcc/testsuite/gcc.target/i386/pr71321.c
@@ -12,5 +12,4 @@ unsigned cvt_to_2digit_ascii(uint8_t i)
 {
   return cvt_to_2digit(i, 10) + 0x0a3030;
 }
-/* { dg-final { scan-assembler-times "lea.\t\\(%\[0-9a-z\]+,%\[0-9a-z\]+,4" 3 } } */
-/* { dg-final { scan-assembler-times "lea.\t\\(%\[0-9a-z\]+,%\[0-9a-z\]+,8" 1 } } */
+/* { dg-final { scan-assembler-not "lea.*0" } } */