From patchwork Wed Mar 16 17:20:18 2022
X-Patchwork-Submitter: will schmidt
X-Patchwork-Id: 1606261
Message-ID: <3b4976974ca4a9e481c462ef2b9a4892f1d4174f.camel@vnet.ibm.com>
Subject: rs6000: RFC/Update support for addg6s instruction. PR100693
From: will schmidt
To: gcc-patches@gcc.gnu.org
Cc: Peter Bergner, David Edelsohn, Segher Boessenkool
Date: Wed, 16 Mar 2022 12:20:18 -0500

Hi,

RFC/Update support for addg6s instruction.
PR100693

For PR100693, we currently provide an addg6s builtin that takes unsigned int
arguments, but we are missing an unsigned long long equivalent.  This patch
adds an overload to provide the long long version of the builtin.

  unsigned long long __builtin_addg6s (unsigned long long, unsigned long long);

RFC/concerns: This patch works, but the intermediate stages do not behave
quite as I expected.  In the intermediate dump pr100693.original, calls that
I expected to be routed to the internal __builtin_addg6s_si(), which takes
(unsigned int) arguments, are instead handled by __builtin_addg6s_di(), with
casts that convert the arguments to (unsigned long long), i.e.

  return (unsigned int) __builtin_addg6s_di ((long long unsigned int) a,
                                             (long long unsigned int) b);

As a test, if I swap the order of the builtins in rs6000-overload.def, I end
up with code that casts the ULL values to UI.  That produces truncated
results, similar to what occurs today without this patch.

All that said, this patch seems to work.  OK for next stage 1?
Tested on power8 BE as well as LE power8, power9, and power10.

2022-03-15  Will Schmidt

gcc/
	PR target/100693
	* config/rs6000/rs6000-builtins.def: Remove entry for
	__builtin_addg6s() and add entries for __builtin_addg6s_di()
	and __builtin_addg6s_si().
	* config/rs6000/rs6000-overload.def: Add overloaded entries
	allowing __builtin_addg6s() to map to either of the
	__builtin_addg6s_{di,si} builtins.
	* config/rs6000/rs6000.md: Add UNSPEC_ADDG6S_SI and
	UNSPEC_ADDG6S_DI unspecs.  Add define_insn entries for
	addg6s_si and addg6s_di based on those unspecs.
	* doc/extend.texi: Add entry for ULL __builtin_addg6s (ULL, ULL);

testsuite/
	PR target/100693
	* gcc.target/powerpc/pr100693.c: New test.
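For anyone who wants to reason about the cast behaviour described above
without Power hardware, here is a small portable C model.  It is only a
sketch: it assumes that addg6s produces a 6 in every 4-bit digit whose
addition generates no carry out of that digit, and a 0 otherwise, which is
my reading of the ISA 2.06 description.  The names model_addg6s_di,
model_addg6s_si and model_self_check are made up for this note and are not
part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of addg6s (assumed semantics, see above):
   digit i of the result is 6 when adding the low 4*(i+1) bits of the
   operands produces no carry out, and 0 when it does.  */
static uint64_t model_digits (uint64_t a, uint64_t b, int ndigits)
{
  uint64_t result = 0;
  for (int digit = 0; digit < ndigits; digit++)
    {
      int bits = 4 * (digit + 1);
      int carry;
      if (bits == 64)
        carry = (a + b) < a;    /* Unsigned wrap-around means carry out.  */
      else
        {
          uint64_t mask = (UINT64_C (1) << bits) - 1;
          carry = (int) ((((a & mask) + (b & mask)) >> bits) & 1);
        }
      if (!carry)
        result |= UINT64_C (6) << (4 * digit);
    }
  return result;
}

/* Model of the 64-bit (DI) form: sixteen digits.  */
uint64_t model_addg6s_di (uint64_t a, uint64_t b)
{
  return model_digits (a, b, 16);
}

/* Model of the 32-bit (SI) form: eight digits.  */
uint32_t model_addg6s_si (uint32_t a, uint32_t b)
{
  return (uint32_t) model_digits (a, b, 8);
}

/* Returns 1 if the properties discussed in this note hold for the model.  */
int model_self_check (void)
{
  uint32_t a32 = UINT32_C (1) << 6, b32 = UINT32_C (0xffffffff);
  uint64_t a64 = UINT64_C (1) << 36, b64 = ~UINT64_C (0);

  /* Widen, use the 64-bit form, truncate: same as the 32-bit form.  */
  assert ((uint32_t) model_addg6s_di (a32, b32) == model_addg6s_si (a32, b32));
  assert (model_addg6s_si (a32, b32) == UINT32_C (0x00000006));

  /* 64-bit operands through the 64-bit form keep all sixteen digits.  */
  assert (model_addg6s_di (a64, b64) == UINT64_C (0x0000000666666666));

  /* Truncating the operands first loses the upper digits.  */
  assert (model_addg6s_si ((uint32_t) a64, (uint32_t) b64)
          == UINT32_C (0x66666666));
  return 1;
}
```

Under that assumption, widening unsigned int operands, using the 64-bit
form, and truncating the result gives the same value as the 32-bit form,
because the carry out of a digit depends only on lower-order bits.
Truncating unsigned long long operands instead discards the upper digits,
which would be consistent with the truncated results I see when the
overload order is swapped.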
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index ae2760c33389..4c23cac26932 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1993,12 +1993,16 @@
     XXSPLTD_V2DI vsx_xxspltd_v2di {}
 
 
 ; Power7 builtins (ISA 2.06).
 [power7]
-  const unsigned int __builtin_addg6s (unsigned int, unsigned int);
-    ADDG6S addg6s {}
+  const unsigned long long __builtin_addg6s_di (unsigned long long, \
+						unsigned long long);
+    ADDG6S_DI addg6s_di {}
+
+  const unsigned int __builtin_addg6s_si (unsigned int, unsigned int);
+    ADDG6S_SI addg6s_si {}
 
   const signed long __builtin_bpermd (signed long, signed long);
     BPERMD bpermd_di {32bit}
 
   const unsigned int __builtin_cbcdtd (unsigned int);
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 44e2945aaa0e..931f85b738c5 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -76,10 +76,15 @@
 ; Blank lines may be used as desired in this file between the lines as
 ; defined above; that is, you can introduce as many extra newlines as you
 ; like after a required newline, but nowhere else.  Lines beginning with
 ; a semicolon are also treated as blank lines.
 
+[ADDG6S, __builtin_i_addg6s, __builtin_addg6s]
+  unsigned long long __builtin_addg6s_di (unsigned long long, unsigned long long);
+    ADDG6S_DI
+  unsigned int __builtin_addg6s_si (unsigned int, unsigned int);
+    ADDG6S_SI
 
 [BCDADD, __builtin_bcdadd, __builtin_vec_bcdadd]
   vsq __builtin_vec_bcdadd (vsq, vsq, const int);
     BCDADD_V1TI
   vuc __builtin_vec_bcdadd (vuc, vuc, const int);
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index fdfbc6566a5c..d040f127eb55 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -122,11 +122,12 @@ (define_c_enum "unspec"
    UNSPEC_P8V_MTVSRWZ
    UNSPEC_P8V_RELOAD_FROM_GPR
    UNSPEC_P8V_MTVSRD
    UNSPEC_P8V_XXPERMDI
    UNSPEC_P8V_RELOAD_FROM_VSX
-   UNSPEC_ADDG6S
+   UNSPEC_ADDG6S_SI
+   UNSPEC_ADDG6S_DI
    UNSPEC_CDTBCD
    UNSPEC_CBCDTD
    UNSPEC_DIVE
    UNSPEC_DIVEU
    UNSPEC_UNPACK_128BIT
@@ -14495,15 +14496,24 @@ (define_peephole2
   operands[5] = change_address (mem, mode, new_addr);
 })
 
 
 ;; Miscellaneous ISA 2.06 (power7) instructions
-(define_insn "addg6s"
+(define_insn "addg6s_si"
   [(set (match_operand:SI 0 "register_operand" "=r")
         (unspec:SI [(match_operand:SI 1 "register_operand" "r")
                     (match_operand:SI 2 "register_operand" "r")]
-                   UNSPEC_ADDG6S))]
+                   UNSPEC_ADDG6S_SI))]
+  "TARGET_POPCNTD"
+  "addg6s %0,%1,%2"
+  [(set_attr "type" "integer")])
+
+(define_insn "addg6s_di"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (unspec:DI [(match_operand:DI 1 "register_operand" "r")
+                    (match_operand:DI 2 "register_operand" "r")]
+                   UNSPEC_ADDG6S_DI))]
   "TARGET_POPCNTD"
   "addg6s %0,%1,%2"
   [(set_attr "type" "integer")])
 
 (define_insn "cdtbcd"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 0dc752e8aadd..9eeb962f7363 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18072,10 +18072,11 @@ addition to the @option{-maltivec}, @option{-mpopcntd}, and
 @option{-mvsx} options.
 
 The following basic built-in functions require @option{-mpopcntd}:
 @smallexample
 unsigned int __builtin_addg6s (unsigned int, unsigned int);
+unsigned long long __builtin_addg6s (unsigned long long, unsigned long long);
 long long __builtin_bpermd (long long, long long);
 unsigned int __builtin_cbcdtd (unsigned int);
 unsigned int __builtin_cdtbcd (unsigned int);
 long long __builtin_divde (long long, long long);
 unsigned long long __builtin_divdeu (unsigned long long, unsigned long long);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100693.c b/gcc/testsuite/gcc.target/powerpc/pr100693.c
new file mode 100644
index 000000000000..31fd118ee0d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100693.c
@@ -0,0 +1,68 @@
+/* { dg-do compile { target { powerpc*-*-linux* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mdejagnu-cpu=power7 -O2" } */
+/* { dg-final { scan-assembler-times {\maddg6s\M} 4 } } */
+/* { dg-final { scan-assembler-not "bl __builtin" } } */
+
+/* Test case for the addg6s builtin, exercising both
+ * unsigned int and unsigned long long arguments.  See also bcd-1.c.  */
+
+#include
+
+unsigned int test1 (unsigned int a, unsigned int b)
+{
+  return __builtin_addg6s (a, b);
+}
+
+unsigned long long test2 (unsigned long long a, unsigned long long b)
+{
+  return __builtin_addg6s (a, b);
+}
+
+/* Expected values, Not a full pattern, these are tuned
+ * to match the sparse iterations as seen below.  */
+unsigned int exp_int[] = {
+0x00000000,
+0x00000006,
+0x00000666,
+0x00006666,
+0x00666666,
+0x06666666,
+0x77777777
+};
+
+unsigned long long exp_longlong[] = {
+0x0000000000000000,
+0x0000000000000006,
+0x0000000000000666,
+0x0000000000006666,
+0x0000000000666666,
+0x0000000006666666,
+0x0000000666666666,
+0x0000006666666666,
+0x0000666666666666,
+0x0006666666666666,
+0x0666666666666666,
+0x7777777777777777
+};
+
+int main() {
+  unsigned long long z;
+  unsigned int ux;
+  unsigned long long uxl;
+  int idx;
+
+  for (z=0,idx=0 ; z<=31; z+=6,idx++) {
+    ux = test1(0x01<