From patchwork Thu Sep 10 21:58:03 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Bergner
X-Patchwork-Id: 1362103
From: Peter Bergner
To: Segher Boessenkool
Cc: Bill Schmidt, GCC Patches
Subject: [PATCH] rs6000: inefficient 64-bit constant generation for consecutive 1-bits
Date: Thu, 10 Sep 2020 16:58:03 -0500
Message-ID: <838b2e97-dfa9-3ca0-c3c6-1767d60ddf05@linux.ibm.com>
List-Id: Gcc-patches mailing list

Generating arbitrary 64-bit constants on POWER can take up to 5 instructions.
However, some special constants can be generated in fewer instructions.
One special class of constants we don't handle is constants that consist of
a single set of consecutive 1-bits.  These can be generated with a "li rT,-1"
followed by a "rldic rX,rT,SH,MB" instruction.  The following patch
implements this idea.

This has passed bootstrap and regtesting on powerpc64le-linux with no
regressions.  Testing on powerpc64-linux is still running.  Ok for trunk
if the BE testing comes back clean too?

Peter


gcc/
	PR target/93176
	* config/rs6000/rs6000.c (has_consecutive_ones): New function.
	(num_insns_constant_gpr): Use it.
	(rs6000_emit_set_long_const): Likewise.
	* config/rs6000/rs6000.md (UNSPEC_RLDIC): New unspec.
	(rldic): New.

gcc/testsuite/
	PR target/93176
	* gcc.target/powerpc/pr93176.c: New test.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ca5b71ecdd3..273cab14138 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5728,6 +5728,47 @@ direct_return (void)
   return 0;
 }
 
+/* Helper for num_insns_constant_gpr and rs6000_emit_set_long_const.
+   Return TRUE if VALUE contains one set of consecutive 1-bits.  Also set
+   *SH and *MB to values needed to generate VALUE with the rldic instruction.
+   We accept consecutive 1-bits that wrap from MSB to LSB, ex: 0xff00...00ff.
+   Otherwise, return FALSE.  */
+
+static bool
+has_consecutive_ones (unsigned HOST_WIDE_INT value, int *sh, int *mb)
+{
+  unsigned HOST_WIDE_INT nlz, ntz, mask;
+  unsigned HOST_WIDE_INT allones = -1;
+
+  ntz = ctz_hwi (value);
+  nlz = clz_hwi (value);
+  mask = (allones >> nlz) & (allones << ntz);
+  if (value == mask)
+    {
+      /* Compute beginning and ending bit numbers, using IBM bit numbering.  */
+      *mb = nlz;
+      *sh = ntz;
+      return true;
+    }
+
+  /* Check if the inverted value contains consecutive ones.  We can create
+     that constant by basically swapping the MB and ME bit numbers.  */
+  value = ~value;
+  ntz = ctz_hwi (value);
+  nlz = clz_hwi (value);
+  mask = (allones >> nlz) & (allones << ntz);
+  if (value == mask)
+    {
+      /* Compute beginning and ending bit numbers, using IBM bit numbering.  */
+      *mb = GET_MODE_BITSIZE (DImode) - ntz;
+      *sh = GET_MODE_BITSIZE (DImode) - nlz;
+      return true;
+    }
+
+  *sh = *mb = 0;
+  return false;
+}
+
 /* Helper for num_insns_constant.  Calculate number of instructions to
    load VALUE to a single gpr using combinations of addi, addis, ori,
    oris and sldi instructions.  */
@@ -5752,10 +5793,14 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
     {
       HOST_WIDE_INT low = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
       HOST_WIDE_INT high = value >> 31;
+      int sh, mb;
 
       if (high == 0 || high == -1)
         return 2;
 
+      if (has_consecutive_ones (value, &sh, &mb))
+        return 2;
+
       high >>= 1;
 
       if (low == 0)
@@ -9427,7 +9472,8 @@ static void
 rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
 {
   rtx temp;
-  HOST_WIDE_INT ud1, ud2, ud3, ud4;
+  HOST_WIDE_INT ud1, ud2, ud3, ud4, value = c;
+  int sh, mb;
 
   ud1 = c & 0xffff;
   c = c >> 16;
@@ -9453,6 +9499,12 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
                          gen_rtx_IOR (DImode, copy_rtx (temp),
                                       GEN_INT (ud1)));
     }
+  else if (has_consecutive_ones (value, &sh, &mb))
+    {
+      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
+      emit_insn (gen_rtx_SET (copy_rtx (temp), CONSTM1_RTX (DImode)));
+      emit_insn (gen_rldic (dest, copy_rtx (temp), GEN_INT (sh), GEN_INT (mb)));
+    }
   else if (ud3 == 0 && ud4 == 0)
     {
       temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 43b620ae1c0..feb5884505c 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -154,6 +154,7 @@
    UNSPEC_CNTTZDM
    UNSPEC_PDEPD
    UNSPEC_PEXTD
+   UNSPEC_RLDIC
   ])
 
 ;;
@@ -9173,6 +9174,14 @@
   DONE;
 })
 
+(define_insn "rldic"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+        (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "r")
+                    (match_operand:DI 2 "u6bit_cint_operand" "n")
+                    (match_operand:DI 3 "u6bit_cint_operand" "n")]
+                   UNSPEC_RLDIC))]
+  "TARGET_POWERPC64"
+  "rldic %0,%1,%2,%3")
 
 ;; TImode/PTImode is similar, except that we usually want to compute the
 ;; address into a register and use lsi/stsi (the exception is during reload).
diff --git a/gcc/testsuite/gcc.target/powerpc/pr93176.c b/gcc/testsuite/gcc.target/powerpc/pr93176.c
new file mode 100644
index 00000000000..d4d93f8f1b3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr93176.c
@@ -0,0 +1,49 @@
+/* PR target/93176 */
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2" } */
+
+/* Verify we generate the correct 2 instruction sequence:
+   li rT,-1; rldic rX,rT,SH,MB for the constants below.  */
+
+unsigned long
+test0 (void)
+{
+  return 0x00ffffffffffff00UL;
+}
+
+unsigned long
+test1 (void)
+{
+  return 0x00ffffffff000000UL;
+}
+
+unsigned long
+test2 (void)
+{
+  return 0x00ffff0000000000UL;
+}
+
+unsigned long
+test3 (void)
+{
+  return 0xffffff0000000000UL;
+}
+
+unsigned long
+test4 (void)
+{
+  return 0xffffff000000ffffUL;
+}
+
+unsigned long
+test5 (void)
+{
+  return 0x0000010000000000UL;
+}
+
+/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,8,8" } } */
+/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,24,8" } } */
+/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,8" } } */
+/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,48" } } */
+/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,23" } } */
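
The SH/MB computation in has_consecutive_ones can be sanity-checked outside
the compiler.  What follows is a minimal standalone sketch, not part of the
patch.  The names clz64, ctz64, single_run, find_rldic_operands, rldic_model
and li_rldic_yields are hypothetical, and rldic_model assumes the usual
reading of rldic: rotate left by SH, then AND with a mask of 1-bits running
from bit MB through bit 63-SH in IBM numbering, wrapping when MB > 63-SH.
Unlike the patch, the sketch rejects 0 up front so that no shift count can
reach 64.

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable leading/trailing zero counts; callers pass nonzero values.  */
static int clz64 (uint64_t v)
{
  int n = 0;
  while (!(v >> 63)) { v <<= 1; n++; }
  return n;
}

static int ctz64 (uint64_t v)
{
  int n = 0;
  while (!(v & 1)) { v >>= 1; n++; }
  return n;
}

/* True if V is one contiguous, non-wrapping run of 1-bits.  */
static bool single_run (uint64_t v, int *nlz, int *ntz)
{
  if (v == 0)
    return false;
  *nlz = clz64 (v);
  *ntz = ctz64 (v);
  return v == ((UINT64_MAX >> *nlz) & (UINT64_MAX << *ntz));
}

/* Mirrors the logic of has_consecutive_ones (assumption: 0 is rejected
   here, whereas the patch relies on its callers for that case).  */
static bool find_rldic_operands (uint64_t value, int *sh, int *mb)
{
  int nlz, ntz;

  *sh = *mb = 0;
  if (value == 0)
    return false;
  if (single_run (value, &nlz, &ntz))
    {
      *mb = nlz;
      *sh = ntz;
      return true;
    }
  /* Wrapping run: the inverted value is a plain run.  */
  if (single_run (~value, &nlz, &ntz))
    {
      *mb = 64 - ntz;
      *sh = 64 - nlz;
      return true;
    }
  return false;
}

/* Software model of "rldic rX,rS,SH,MB" under the assumption above.  */
static uint64_t rldic_model (uint64_t rs, int sh, int mb)
{
  uint64_t rot = sh ? (rs << sh) | (rs >> (64 - sh)) : rs;
  int me = 63 - sh;
  uint64_t from_mb = UINT64_MAX >> mb;         /* IBM bits MB..63.  */
  uint64_t upto_me = UINT64_MAX << (63 - me);  /* IBM bits 0..ME.  */
  return rot & (mb <= me ? (from_mb & upto_me) : (from_mb | upto_me));
}

/* True if "li rT,-1; rldic rX,rT,SH,MB" would reproduce VALUE.  */
static bool li_rldic_yields (uint64_t value)
{
  int sh, mb;
  return find_rldic_operands (value, &sh, &mb)
         && rldic_model (UINT64_MAX, sh, mb) == value;
}
```

Feeding the pr93176.c constants through find_rldic_operands gives the SH,MB
pairs that the scan-assembler patterns look for, for example 40,48 for
0xffffff000000ffff and 40,23 for 0x0000010000000000.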