From patchwork Tue Jan 19 22:33:25 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Carl Love
X-Patchwork-Id: 1428909
Subject: [PATCH 1/6 ver 3] rs6000, Fix arguments in altivec_vrlwmi and altivec_rlwdi builtins
To: Segher Boessenkool, will schmidt, cel@us.ibm.com
Date: Tue, 19 Jan 2021 14:33:25 -0800
In-Reply-To: <20201013002313.GV2672@gate.crashing.org>
References: <815d6b091f4b8bf3ab7c7e203c41d03c6c0e8d81.camel@us.ibm.com>
 <8acbb7bc3964944154491037884523c94ac3bdb1.camel@us.ibm.com>
 <384c17c8b764c850f8a9a08e963ed34ec89de28b.camel@vnet.ibm.com>
 <82b546ae55356938b9002ca4a9d0d4eb62961dae.camel@vnet.ibm.com>
 <20201013002313.GV2672@gate.crashing.org>
List-Id: Gcc-patches mailing list
From: Carl Love
Reply-To: Carl Love
Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com

Will, Segher:

This patch fixes the order of the arguments in the vec_rlmi and
vec_rlnm builtins.
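For reviewers who want the intended argument roles spelled out, here is a
minimal scalar sketch of one word element of vec_rlmi (a, b, c), written from
the description used in the new test case: a is rotated left, and the rotated
value is inserted into b under a mask, with mask begin, mask end and shift
packed into c as begin << 16 | end << 8 | shift.  The helper names rotl32,
mask32 and rlmi32_model are hypothetical and exist only for this
illustration; bit 0 is taken to be the most significant bit, and
wrap-around masks (begin > end) are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit word left by n bits (n taken mod 32).  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Hypothetical helper: mask with bits begin..end set, where bit 0 is the
   most significant bit.  Assumes begin <= end <= 31.  */
static uint32_t
mask32 (unsigned begin, unsigned end)
{
  uint32_t m = 0;
  for (unsigned i = begin; i <= end; i++)
    m |= UINT32_C (0x80000000) >> i;
  return m;
}

/* Scalar model (an assumption, not GCC code) of one word element of
   vec_rlmi (a, b, c): rotate a left, then insert it into b under the mask.  */
static uint32_t
rlmi32_model (uint32_t a, uint32_t b, uint32_t c)
{
  unsigned begin = (c >> 16) & 0x1f;
  unsigned end = (c >> 8) & 0x1f;
  unsigned shift = c & 0x1f;
  uint32_t m = mask32 (begin, end);

  return (rotl32 (a, shift) & m) | (b & ~m);
}
```

With the values from the test case (a = 0x12345678, b = 0xA1B1CDEF, mask bits
0..4, shift 16) this model keeps the low 27 bits of b and takes the top 5 bits
from the rotated a.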
The patch also adds new test cases to verify the fix.

The patch has been tested on

  powerpc64-linux (Power 8 BE)
  powerpc64-linux (Power 9 LE)
  powerpc64-linux (Power 10 LE)

Please let me know if the patch is acceptable for mainline.

                  Carl Love

----------------------------------------------------------------------

gcc/ChangeLog

2021-01-12  Carl Love

gcc/
	* config/rs6000/altivec.md (altivec_vrl<VI_char>mi): Fix bug in
	argument generation.

gcc/testsuite/
	* gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c:
	New runnable test case.
	* gcc.target/powerpc/vec-rlmi-rlnm.c: Update scan assembler times
	for xxlor instruction.
---
 gcc/config/rs6000/altivec.md                  |   6 +-
 .../powerpc/check-builtin-vec_rlnm-runnable.c | 233 ++++++++++++++++++
 .../gcc.target/powerpc/vec-rlmi-rlnm.c        |   2 +-
 3 files changed, 237 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index fc19a8fc807..4d08cca2228 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1982,12 +1982,12 @@

 (define_insn "altivec_vrl<VI_char>mi"
   [(set (match_operand:VIlong 0 "register_operand" "=v")
-        (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "0")
-                        (match_operand:VIlong 2 "register_operand" "v")
+        (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "v")
+                        (match_operand:VIlong 2 "register_operand" "0")
                         (match_operand:VIlong 3 "register_operand" "v")]
                        UNSPEC_VRLMI))]
   "TARGET_P9_VECTOR"
-  "vrl<VI_char>mi %0,%2,%3"
+  "vrl<VI_char>mi %0,%1,%3"
   [(set_attr "type" "veclogical")])

 (define_insn "altivec_vrl<VI_char>nm"
diff --git a/gcc/testsuite/gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c b/gcc/testsuite/gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c
new file mode 100644
index 00000000000..b97bc519c87
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c
@@ -0,0 +1,233 @@
+/* { dg-do run } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9 -save-temps" } */
+
+/* Verify the vec_rlnm and vec_rlmi builtins work correctly.  */
+/* { dg-final { scan-assembler-times {\mvrldmi\M} 1 } } */
+
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+#include <stdlib.h>
+#endif
+
+void abort (void);
+
+int main ()
+{
+  int i;
+
+  vector unsigned int vec_arg1_int, vec_arg2_int, vec_arg3_int;
+  vector unsigned int vec_result_int, vec_expected_result_int;
+
+  vector unsigned long long int vec_arg1_di, vec_arg2_di, vec_arg3_di;
+  vector unsigned long long int vec_result_di, vec_expected_result_di;
+
+  unsigned int mask_begin, mask_end, shift;
+  unsigned long long int mask;
+
+/* Check vec int version of vec_rlmi builtin */
+  mask = 0;
+  mask_begin = 0;
+  mask_end = 4;
+  shift = 16;
+
+  for (i = 0; i < 31; i++)
+    if ((i >= mask_begin) && (i <= mask_end))
+      mask |= 0x80000000ULL >> i;
+
+  for (i = 0; i < 4; i++) {
+    vec_arg1_int[i] = 0x12345678 + i*0x11111111;
+    vec_arg2_int[i] = 0xA1B1CDEF;
+    vec_arg3_int[i] = mask_begin << 16 | mask_end << 8 | shift;
+
+    /* do rotate */
+    vec_expected_result_int[i] = ( vec_arg2_int[i] & ~mask)
+      | ((vec_arg1_int[i] << shift) | (vec_arg1_int[i] >> (32-shift))) & mask;
+
+  }
+
+  /* vec_rlmi(arg1, arg2, arg3)
+     result - rotate each element of arg1 left and inserting it into arg2
+       element of arg2 based on the mask specified in arg3.  The shift, mask
+       start and end is specified in arg3.  */
+  vec_result_int = vec_rlmi (vec_arg1_int, vec_arg2_int, vec_arg3_int);
+
+  for (i = 0; i < 4; i++) {
+    if (vec_result_int[i] != vec_expected_result_int[i])
+#if DEBUG
+      printf("ERROR: i = %d, vec_rlmi int result 0x%x, does not match "
+             "expected result 0x%x\n", i, vec_result_int[i],
+             vec_expected_result_int[i]);
+#else
+      abort();
+#endif
+    }
+
+/* Check vec long long int version of vec_rlmi builtin */
+  mask = 0;
+  mask_begin = 0;
+  mask_end = 4;
+  shift = 16;
+
+  for (i = 0; i < 31; i++)
+    if ((i >= mask_begin) && (i <= mask_end))
+      mask |= 0x80000000ULL >> i;
+
+  for (i = 0; i < 2; i++) {
+    vec_arg1_di[i] = 0x1234567800000000 + i*0x11111111;
+    vec_arg2_di[i] = 0xA1B1C1D1E1F12345;
+    vec_arg3_di[i] = mask_begin << 16 | mask_end << 8 | shift;
+
+    /* do rotate */
+    vec_expected_result_di[i] = ( vec_arg2_di[i] & ~mask)
+      | ((vec_arg1_di[i] << shift) | (vec_arg1_di[i] >> (64-shift))) & mask;
+  }
+
+  /* vec_rlmi(arg1, arg2, arg3)
+     result - rotate each element of arg1 left and inserting it into arg2
+       element of arg2 based on the mask specified in arg3.  The shift, mask
+       start and end is specified in arg3.  */
+  vec_result_di = vec_rlmi (vec_arg1_di, vec_arg2_di, vec_arg3_di);
+
+  for (i = 0; i < 2; i++) {
+    if (vec_result_di[i] != vec_expected_result_di[i])
+#if DEBUG
+      printf("ERROR: i = %d, vec_rlmi long long int result 0x%llx, does not match "
+             "expected result 0x%llx\n", i, vec_result_di[i],
+             vec_expected_result_di[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* Check vec int version of vec_rlnm builtin */
+  mask = 0;
+  mask_begin = 0;
+  mask_end = 4;
+  shift = 16;
+
+  for (i = 0; i < 31; i++)
+    if ((i >= mask_begin) && (i <= mask_end))
+      mask |= 0x80000000ULL >> i;
+
+  for (i = 0; i < 4; i++) {
+    vec_arg1_int[i] = 0x12345678 + i*0x11111111;
+    vec_arg2_int[i] = shift;
+    vec_arg3_int[i] = mask_begin << 8 | mask_end;
+    vec_expected_result_int[i] = (vec_arg1_int[i] << shift) & mask;
+  }
+
+  /* vec_rlnm(arg1, arg2, arg3)
+     result - rotate each element of arg1 left by shift in element of arg2.
+       Then AND with mask whose start/stop bits are specified in element of
+       arg3.  */
+  vec_result_int = vec_rlnm (vec_arg1_int, vec_arg2_int, vec_arg3_int);
+  for (i = 0; i < 4; i++) {
+    if (vec_result_int[i] != vec_expected_result_int[i])
+#if DEBUG
+      printf("ERROR: vec_rlnm, i = %d, int result 0x%x does not match "
+             "expected result 0x%x\n", i, vec_result_int[i],
+             vec_expected_result_int[i]);
+#else
+      abort();
+#endif
+    }
+
+/* Check vec long int version of builtin */
+  mask = 0;
+  mask_begin = 0;
+  mask_end = 4;
+  shift = 20;
+
+  for (i = 0; i < 63; i++)
+    if ((i >= mask_begin) && (i <= mask_end))
+      mask |= 0x8000000000000000ULL >> i;
+
+  for (i = 0; i < 2; i++) {
+    vec_arg1_di[i] = 0x123456789ABCDE00ULL + i*0x1111111111111111ULL;
+    vec_arg2_di[i] = shift;
+    vec_arg3_di[i] = mask_begin << 8 | mask_end;
+    vec_expected_result_di[i] = (vec_arg1_di[i] << shift) & mask;
+  }
+
+  vec_result_di = vec_rlnm (vec_arg1_di, vec_arg2_di, vec_arg3_di);
+
+  for (i = 0; i < 2; i++) {
+    if (vec_result_di[i] != vec_expected_result_di[i])
+#if DEBUG
+      printf("ERROR: vec_rlnm, i = %d, long long int result 0x%llx does not "
+             "match expected result 0x%llx\n", i, vec_result_di[i],
+             vec_expected_result_di[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* Check vec int version of vec_vrlnm builtin */
+  mask = 0;
+  mask_begin = 0;
+  mask_end = 4;
+  shift = 16;
+
+  for (i = 0; i < 31; i++)
+    if ((i >= mask_begin) && (i <= mask_end))
+      mask |= 0x80000000ULL >> i;
+
+  for (i = 0; i < 4; i++) {
+    vec_arg1_int[i] = 0x12345678 + i*0x11111111;
+    vec_arg2_int[i] = mask_begin << 16 | mask_end << 8 | shift;
+    vec_expected_result_int[i] = (vec_arg1_int[i] << shift) & mask;
+  }
+
+  /* vec_vrlnm(arg1, arg2, arg3)
+     result - rotate each element of arg1 left then AND with mask.  The mask
+       start, stop bits is specified in the second argument.  The shift amount
+       is also specified in the second argument.  */
+  vec_result_int = vec_vrlnm (vec_arg1_int, vec_arg2_int);
+
+  for (i = 0; i < 4; i++) {
+    if (vec_result_int[i] != vec_expected_result_int[i])
+#if DEBUG
+      printf("ERROR: vec_vrlnm, i = %d, int result 0x%x does not match "
+             "expected result 0x%x\n", i, vec_result_int[i],
+             vec_expected_result_int[i]);
+#else
+      abort();
+#endif
+    }
+
+/* Check vec long int version of vec_vrlnm builtin */
+  mask = 0;
+  mask_begin = 0;
+  mask_end = 4;
+  shift = 20;
+
+  for (i = 0; i < 63; i++)
+    if ((i >= mask_begin) && (i <= mask_end))
+      mask |= 0x8000000000000000ULL >> i;
+
+  for (i = 0; i < 2; i++) {
+    vec_arg1_di[i] = 0x123456789ABCDE00ULL + i*0x1111111111111111ULL;
+    vec_arg2_di[i] = mask_begin << 16 | mask_end << 8 | shift;
+    vec_expected_result_di[i] = (vec_arg1_di[i] << shift) & mask;
+  }
+
+  vec_result_di = vec_vrlnm (vec_arg1_di, vec_arg2_di);
+
+  for (i = 0; i < 2; i++) {
+    if (vec_result_di[i] != vec_expected_result_di[i])
+#if DEBUG
+      printf("ERROR: vec_vrlnm, i = %d, long long int result 0x%llx does not "
+             "match expected result 0x%llx\n", i, vec_result_di[i],
+             vec_expected_result_di[i]);
+#else
+      abort();
+#endif
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
index 1e7d7390c5b..b0f26c8f4cb 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
@@ -62,6 +62,6 @@ rlnm_test_2 (vector unsigned long long x, vector unsigned long long y,
 /* { dg-final { scan-assembler-times "vextsb2d" 1 } } */
 /* { dg-final { scan-assembler-times "vslw" 1 } } */
 /* { dg-final { scan-assembler-times "vsld" 1 } } */
-/* { dg-final { scan-assembler-times "xxlor" 3 } } */
+/* { dg-final { scan-assembler-times "xxlor" 5 } } */
 /* { dg-final { scan-assembler-times "vrlwnm" 2 } } */
 /* { dg-final { scan-assembler-times "vrldnm" 2 } } */

From patchwork Tue Jan 19 22:33:29 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Carl Love
X-Patchwork-Id: 1428910
Subject: [PATCH 2/6 ver 3] RS6000 Add 128-bit Binary Integer sign extend operations
To: Segher Boessenkool, will schmidt, cel@us.ibm.com
Date: Tue, 19 Jan 2021 14:33:29 -0800
In-Reply-To: <20201013002313.GV2672@gate.crashing.org>
References: <815d6b091f4b8bf3ab7c7e203c41d03c6c0e8d81.camel@us.ibm.com>
 <8acbb7bc3964944154491037884523c94ac3bdb1.camel@us.ibm.com>
 <384c17c8b764c850f8a9a08e963ed34ec89de28b.camel@vnet.ibm.com>
 <82b546ae55356938b9002ca4a9d0d4eb62961dae.camel@vnet.ibm.com>
 <20201013002313.GV2672@gate.crashing.org>
List-Id: Gcc-patches mailing list
From: Carl Love
Reply-To: Carl Love
Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com

Will, Segher:

This patch adds the 128-bit sign extension instruction support and the
corresponding builtin support.

version 3:
  doc/extend.texi:  Fixed the "uThe" typo and added the colon at the
  end of the line.
  p9-sign_extend-runnable.c:  Changed the dg-do run to *-*-linux
  instead of powerpc*-*-linux.
  Tested on Power 8 BE, Power 9, Power 10.

version 2:
  Removed the blank line per Will's latest feedback.
  Retested the patch on Power 9 with no regression errors.

                  Carl Love

----------------------------------------------------------

gcc/ChangeLog

2021-01-12  Carl Love

	* config/rs6000/altivec.h (vec_signextll, vec_signexti): Add define
	for new builtins.
	* config/rs6000/rs6000-builtin.def (VSIGNEXTI, VSIGNEXTLL): Add
	overloaded builtin definitions.
	(VSIGNEXTSB2W, VSIGNEXTSH2W, VSIGNEXTSB2D, VSIGNEXTSH2D,
	VSIGNEXTSW2D): Add builtin expansions.
	* config/rs6000/rs6000-call.c (P9V_BUILTIN_VEC_VSIGNEXTI,
	P9V_BUILTIN_VEC_VSIGNEXTLL): Add overloaded argument definitions.
	* config/rs6000/vsx.md: Make define_insn vsx_sign_extend_si_v2di
	visible.
	* doc/extend.texi: Add documentation for the vec_signexti and
	vec_signextll builtins.

gcc/testsuite/ChangeLog

2021-01-12  Carl Love

	* gcc.target/powerpc/p9-sign_extend-runnable.c: New test case.
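
As a rough scalar illustration of what the sign extension builtins compute
for each result element, the sketch below sign-extends the least significant
byte, halfword or word of an element, following the wording in the
documentation added by this patch ("sign extend the rightmost byte of each
doubleword").  The function name sign_extend_low_bits is hypothetical and is
not part of GCC; the width is assumed to be between 1 and 63 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model: sign-extend the low WIDTH bits of ELEMENT
   to 64 bits.  WIDTH is assumed to be in the range 1..63.  */
static int64_t
sign_extend_low_bits (uint64_t element, unsigned width)
{
  uint64_t sign = UINT64_C (1) << (width - 1);   /* sign bit of the field */
  uint64_t low = element & ((sign << 1) - 1);    /* keep only the low bits */

  /* Flipping the sign bit and subtracting it again maps the field onto
     the signed range without relying on implementation-defined casts.  */
  return (int64_t) (low ^ sign) - (int64_t) sign;
}
```

For example, a doubleword whose rightmost byte is 0xFF yields -1 for a byte
to doubleword extension, regardless of the upper bytes.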
---
 gcc/config/rs6000/altivec.h                   |   2 +
 gcc/config/rs6000/rs6000-builtin.def          |   9 ++
 gcc/config/rs6000/rs6000-call.c               |  13 ++
 gcc/config/rs6000/vsx.md                      |   2 +-
 gcc/doc/extend.texi                           |  15 ++
 .../powerpc/p9-sign_extend-runnable.c         | 128 ++++++++++++++++++
 6 files changed, 168 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 06f0d4d9f14..460310a5132 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -497,6 +497,8 @@

 #define vec_xlx __builtin_vec_vextulx
 #define vec_xrx __builtin_vec_vexturx
+#define vec_signexti __builtin_vec_vsignexti
+#define vec_signextll __builtin_vec_vsignextll

 #endif

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 8aa31ad0a06..842f07196de 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2800,6 +2800,8 @@ BU_P9V_OVERLOAD_1 (VPRTYBD, "vprtybd")
 BU_P9V_OVERLOAD_1 (VPRTYBQ, "vprtybq")
 BU_P9V_OVERLOAD_1 (VPRTYBW, "vprtybw")
 BU_P9V_OVERLOAD_1 (VPARITY_LSBB, "vparity_lsbb")
+BU_P9V_OVERLOAD_1 (VSIGNEXTI, "vsignexti")
+BU_P9V_OVERLOAD_1 (VSIGNEXTLL, "vsignextll")

 /* 2 argument functions added in ISA 3.0 (power9).  */
 BU_P9_2 (CMPRB, "byte_in_range", CONST, cmprb)
@@ -2811,6 +2813,13 @@ BU_P9_OVERLOAD_2 (CMPRB, "byte_in_range")
 BU_P9_OVERLOAD_2 (CMPRB2, "byte_in_either_range")
 BU_P9_OVERLOAD_2 (CMPEQB, "byte_in_set")

+
+BU_P9V_AV_1 (VSIGNEXTSB2W, "vsignextsb2w", CONST, vsx_sign_extend_qi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSH2W, "vsignextsh2w", CONST, vsx_sign_extend_hi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSB2D, "vsignextsb2d", CONST, vsx_sign_extend_qi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSH2D, "vsignextsh2d", CONST, vsx_sign_extend_hi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSW2D, "vsignextsw2d", CONST, vsx_sign_extend_si_v2di)
+
 /* Builtins for scalar instructions added in ISA 3.1 (power10).  */
 BU_P10_POWERPC64_MISC_2 (CFUGED, "cfuged", CONST, cfuged)
 BU_P10_POWERPC64_MISC_2 (CNTLZDM, "cntlzdm", CONST, cntlzdm)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 2308cc8b4a2..3af325317a1 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -5660,6 +5660,19 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
     RS6000_BTI_INTSI, RS6000_BTI_INTSI },

+  /* Sign extend builtins that work on ISA 3.0, not added until ISA 3.1 */
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSB2W,
+    RS6000_BTI_V4SI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSH2W,
+    RS6000_BTI_V4SI, RS6000_BTI_V8HI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSB2D,
+    RS6000_BTI_V2DI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSH2D,
+    RS6000_BTI_V2DI, RS6000_BTI_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSW2D,
+    RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
+
   /* Overloaded built-in functions for ISA3.1 (power10). */
   { P10_BUILTIN_VEC_CLRL, P10V_BUILTIN_VCLRLB,
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0c1bda522a9..e17b9c556d4 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4807,7 +4807,7 @@
   "vextsh2<wd> %0,%1"
   [(set_attr "type" "vecexts")])

-(define_insn "*vsx_sign_extend_si_v2di"
+(define_insn "vsx_sign_extend_si_v2di"
   [(set (match_operand:V2DI 0 "vsx_register_operand" "=v")
         (unspec:V2DI [(match_operand:V4SI 1 "vsx_register_operand" "v")]
                      UNSPEC_VSX_SIGN_EXTEND))]
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 2748e98462e..feaa4929697 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -21146,6 +21146,21 @@ void vec_xst (vector unsigned char, int, vector unsigned char *);
 void vec_xst (vector unsigned char, int, unsigned char *);
 @end smallexample

+The following sign extension builtins are provided:
+
+@smallexample
+vector signed int vec_signexti (vector signed char a)
+vector signed long long vec_signextll (vector signed char a)
+vector signed int vec_signexti (vector signed short a)
+vector signed long long vec_signextll (vector signed short a)
+vector signed long long vec_signextll (vector signed int a)
+@end smallexample
+
+Each element of the result is produced by sign-extending the element of the
+input vector that would fall in the least significant portion of the result
+element.  For example, a sign-extension of a vector signed char to a vector
+signed long long will sign extend the rightmost byte of each doubleword.
+
 @node PowerPC AltiVec Built-in Functions Available on ISA 3.1
 @subsubsection PowerPC AltiVec Built-in Functions Available on ISA 3.1

diff --git a/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c b/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c
new file mode 100644
index 00000000000..fdcad019b96
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c
@@ -0,0 +1,128 @@
+/* { dg-do run { target { *-*-linux* && { lp64 && p9vector_hw } } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9 -save-temps" } */
+
+/* These builtins were not defined until ISA 3.1 but only require ISA 3.0
+   support.  */
+
+/* { dg-final { scan-assembler-times {\mvextsb2w\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsh2w\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */
+
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+#include <stdlib.h>
+#endif
+
+void abort (void);
+
+int main ()
+{
+  int i;
+
+  vector signed char vec_arg_qi, vec_result_qi;
+  vector signed short int vec_arg_hi, vec_result_hi, vec_expected_hi;
+  vector signed int vec_arg_wi, vec_result_wi, vec_expected_wi;
+  vector signed long long vec_result_di, vec_expected_di;
+
+  /* test sign extend byte to word */
+  vec_arg_qi = (vector signed char) {1, 2, 3, 4, 5, 6, 7, 8,
+                                     -1, -2, -3, -4, -5, -6, -7, -8};
+  vec_expected_wi = (vector signed int) {1, 5, -1, -5};
+
+  vec_result_wi = vec_signexti (vec_arg_qi);
+
+  for (i = 0; i < 4; i++)
+    if (vec_result_wi[i] != vec_expected_wi[i]) {
+#if DEBUG
+      printf("ERROR: vec_signexti(char, int): ");
+      printf("vec_result_wi[%d] != vec_expected_wi[%d]\n",
+             i, i);
+      printf("vec_result_wi[%d] = %d\n", i, vec_result_wi[i]);
+      printf("vec_expected_wi[%d] = %d\n", i, vec_expected_wi[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* test sign extend byte to double */
+  vec_arg_qi = (vector signed char){1, 2, 3, 4, 5, 6, 7, 8,
+                                    -1, -2, -3, -4, -5, -6, -7, -8};
+  vec_expected_di = (vector signed long long int){1, -1};
+
+  vec_result_di = vec_signextll(vec_arg_qi);
+
+  for (i = 0; i < 2; i++)
+    if (vec_result_di[i] != vec_expected_di[i]) {
+#if DEBUG
+      printf("ERROR: vec_signextll(byte, long long int): ");
+      printf("vec_result_di[%d] != vec_expected_di[%d]\n", i, i);
+      printf("vec_result_di[%d] = %lld\n", i, vec_result_di[i]);
+      printf("vec_expected_di[%d] = %lld\n", i, vec_expected_di[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* test sign extend short to word */
+  vec_arg_hi = (vector signed short int){1, 2, 3, 4, -1, -2, -3, -4};
+  vec_expected_wi = (vector signed int){1, 3, -1, -3};
+
+  vec_result_wi = vec_signexti(vec_arg_hi);
+
+  for (i = 0; i < 4; i++)
+    if (vec_result_wi[i] != vec_expected_wi[i]) {
+#if DEBUG
+      printf("ERROR: vec_signexti(short, int): ");
+      printf("vec_result_wi[%d] != vec_expected_wi[%d]\n", i, i);
+      printf("vec_result_wi[%d] = %d\n", i, vec_result_wi[i]);
+      printf("vec_expected_wi[%d] = %d\n", i, vec_expected_wi[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* test sign extend short to double word */
+  vec_arg_hi = (vector signed short int ){1, 3, 5, 7, -1, -3, -5, -7};
+  vec_expected_di = (vector signed long long int){1, -1};
+
+  vec_result_di = vec_signextll(vec_arg_hi);
+
+  for (i = 0; i < 2; i++)
+    if (vec_result_di[i] != vec_expected_di[i]) {
+#if DEBUG
+      printf("ERROR: vec_signextll(short, double): ");
+      printf("vec_result_di[%d] != vec_expected_di[%d]\n", i, i);
+      printf("vec_result_di[%d] = %lld\n", i, vec_result_di[i]);
+      printf("vec_expected_di[%d] = %lld\n", i, vec_expected_di[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* test sign extend word to double word */
+  vec_arg_wi = (vector signed int ){1, 3, -1, -3};
+  vec_expected_di = (vector signed long long int){1, -1};
+
+  vec_result_di = vec_signextll(vec_arg_wi);
+
+  for (i = 0; i < 2; i++)
+    if (vec_result_di[i] != vec_expected_di[i]) {
+#if DEBUG
+      printf("ERROR: vec_signextll(word, double): ");
+      printf("vec_result_di[%d] != vec_expected_di[%d]\n", i, i);
+      printf("vec_result_di[%d] = %lld\n", i, vec_result_di[i]);
+      printf("vec_expected_di[%d] = %lld\n", i, vec_expected_di[i]);
+#else
+      abort();
+#endif
+    }
+
+  return 0;
+}

From patchwork Tue Jan 19 22:33:33 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Carl Love
X-Patchwork-Id: 1428914
Subject: [PATCH 3/6 ver 3] RS6000 add 128-bit Integer Operations part 1
To: Segher Boessenkool, will schmidt, cel@us.ibm.com
Date: Tue, 19 Jan 2021 14:33:33 -0800
In-Reply-To: <20201013002313.GV2672@gate.crashing.org>
References: <815d6b091f4b8bf3ab7c7e203c41d03c6c0e8d81.camel@us.ibm.com>
 <8acbb7bc3964944154491037884523c94ac3bdb1.camel@us.ibm.com>
 <384c17c8b764c850f8a9a08e963ed34ec89de28b.camel@vnet.ibm.com>
 <82b546ae55356938b9002ca4a9d0d4eb62961dae.camel@vnet.ibm.com>
 <20201013002313.GV2672@gate.crashing.org>
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Carl Love via Gcc-patches
From: Carl Love
Reply-To: Carl Love
Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Errors-To: gcc-patches-bounces@gcc.gnu.org
Sender: "Gcc-patches"

Will, Segher:

This patch adds instruction and builtin support for 128-bit integer
divide, modulo, shift, and compare operations.

version 3:
  int_128bit-runnable.c: Removed ppc_native_128bit from
  dg-require-effective-target.  It was missed in an earlier cleanup.
  Tested on Power8 BE, Power9, and Power10.

version 2:
  Fixed the references to 128-bit in the ChangeLog that were missed in
  the last go round.  Fixed missing spaces in emit_insn calls.
  Re-tested the patch on Power9 with no regression errors.

                  Carl Love

----------------------------------------------------------------------

gcc/ChangeLog

2021-01-12  Carl Love

	* config/rs6000/altivec.h (vec_signextq, vec_dive, vec_mod): Add
	define for new builtins.
	* config/rs6000/altivec.md (UNSPEC_VMULEUD, UNSPEC_VMULESD,
	UNSPEC_VMULOUD, UNSPEC_VMULOSD): New unspecs.
	(altivec_eqv1ti, altivec_gtv1ti, altivec_gtuv1ti, altivec_vmuleud,
	altivec_vmuloud, altivec_vmulesd, altivec_vmulosd, altivec_vrlq,
	altivec_vrlqmi, altivec_vrlqmi_inst, altivec_vrlqnm,
	altivec_vrlqnm_inst, altivec_vslq, altivec_vsrq, altivec_vsraq,
	altivec_vcmpequt_p, altivec_vcmpgtst_p, altivec_vcmpgtut_p): New
	define_insns.
	(vec_widen_umult_even_v2di, vec_widen_smult_even_v2di,
	vec_widen_umult_odd_v2di, vec_widen_smult_odd_v2di, altivec_vrlqmi,
	altivec_vrlqnm): New define_expands.
	* config/rs6000/rs6000-builtin.def (VCMPEQUT_P, VCMPGTST_P,
	VCMPGTUT_P): Add macro expansions.
	(BU_P10V_AV_P): Add builtin predicate definition.
	(VCMPGTUT, VCMPGTST, VCMPEQUT, CMPNET, CMPGE_1TI, CMPGE_U1TI,
	CMPLE_1TI, CMPLE_U1TI, VNOR_V1TI_UNS, VNOR_V1TI, VCMPNET_P,
	VCMPAET_P, VSIGNEXTSD2Q, VMULEUD, VMULESD, VMULOUD, VMULOSD, VRLQ,
	VSLQ, VSRQ, VSRAQ, VRLQNM, DIV_V1TI, UDIV_V1TI, DIVES_V1TI,
	DIVEU_V1TI, MODS_V1TI, MODU_V1TI, VRLQMI): New macro expansions.
	(VRLQ, VSLQ, VSRQ, VSRAQ, DIVE, MOD, SIGNEXT): New overload
	expansions.
	* config/rs6000/rs6000-call.c (P10V_BUILTIN_VCMPEQUT,
	P10V_BUILTIN_CMPGE_1TI, P10V_BUILTIN_CMPGE_U1TI,
	P10V_BUILTIN_VCMPGTUT, P10V_BUILTIN_VCMPGTST,
	P10V_BUILTIN_CMPLE_1TI, P10V_BUILTIN_CMPLE_U1TI,
	P10V_BUILTIN_DIV_V1TI, P10V_BUILTIN_UDIV_V1TI,
	P10V_BUILTIN_VMULESD, P10V_BUILTIN_VMULEUD,
	P10V_BUILTIN_VMULOSD, P10V_BUILTIN_VMULOUD,
	P10V_BUILTIN_VNOR_V1TI, P10V_BUILTIN_VNOR_V1TI_UNS,
	P10V_BUILTIN_VRLQ, P10V_BUILTIN_VRLQMI, P10V_BUILTIN_VRLQNM,
	P10V_BUILTIN_VSLQ, P10V_BUILTIN_VSRQ, P10V_BUILTIN_VSRAQ,
	P10V_BUILTIN_VCMPGTUT_P, P10V_BUILTIN_VCMPGTST_P,
	P10V_BUILTIN_VCMPEQUT_P, P10V_BUILTIN_CMPNET,
	P10V_BUILTIN_VCMPNET_P, P10V_BUILTIN_VCMPAET_P,
	P10V_BUILTIN_VSIGNEXTSD2Q, P10V_BUILTIN_DIVES_V1TI,
	P10V_BUILTIN_MODS_V1TI, P10V_BUILTIN_MODU_V1TI): New overloaded
	definitions.
	(rs6000_gimple_fold_builtin) [P10V_BUILTIN_VCMPEQUT,
	P10_BUILTIN_CMPNET, P10_BUILTIN_CMPGE_1TI,
	P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT,
	P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
	P10_BUILTIN_CMPLE_U1TI]: New case statements.
	(rs6000_init_builtins) [bool_V1TI_type_node,
	int_ftype_int_v1ti_v1ti]: New assignments.
	(altivec_init_builtins): New E_V1TImode case statement.
	(builtin_function_type) [P10_BUILTIN_128BIT_VMULEUD,
	P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_128BIT_DIVEU_V1TI,
	P10_BUILTIN_128BIT_MODU_V1TI, P10_BUILTIN_CMPGE_U1TI,
	P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPEQUT]: New case statements.
	* config/rs6000/rs6000.c (rs6000_handle_altivec_attribute)
	[E_TImode, E_V1TImode]: New case statements.
	* config/rs6000/rs6000.h (rs6000_builtin_type_index): New enum
	value RS6000_BTI_bool_V1TI.
	* config/rs6000/vector.md (vector_gtv1ti, vector_nltv1ti,
	vector_gtuv1ti, vector_nltuv1ti, vector_ngtv1ti, vector_ngtuv1ti,
	vector_eq_v1ti_p, vector_ne_v1ti_p, vector_ae_v1ti_p,
	vector_gt_v1ti_p, vector_gtu_v1ti_p, vrotlv1ti3, vashlv1ti3,
	vlshrv1ti3, vashrv1ti3): New define_expands.
	* config/rs6000/vsx.md (UNSPEC_VSX_DIVSQ, UNSPEC_VSX_DIVUQ,
	UNSPEC_VSX_DIVESQ, UNSPEC_VSX_DIVEUQ, UNSPEC_VSX_MODSQ,
	UNSPEC_VSX_MODUQ): New unspecs.
	(mulv2di3, vsx_div_v1ti, vsx_udiv_v1ti, vsx_dives_v1ti,
	vsx_diveu_v1ti, vsx_mods_v1ti, vsx_modu_v1ti, xxswapd_v1ti,
	vsx_sign_extend_v2di_v1ti): New define_insns.
	(vcmpnet): New define_expand.
	* gcc/doc/extend.texi: Add documentation for the new builtins vec_rl,
	vec_rlmi, vec_rlnm, vec_sl, vec_sr, vec_sra, vec_mule, vec_mulo,
	vec_div, vec_dive, vec_mod, vec_cmpeq, vec_cmpne, vec_cmpgt,
	vec_cmplt, vec_cmpge, vec_cmple, vec_all_eq, vec_all_ne, vec_all_gt,
	vec_all_lt, vec_all_ge, vec_all_le, vec_any_eq, vec_any_ne,
	vec_any_gt, vec_any_lt, vec_any_ge, vec_any_le.

gcc/testsuite/ChangeLog

2021-01-12  Carl Love

	* gcc.target/powerpc/int_128bit-runnable.c: New test file.
---
 gcc/config/rs6000/altivec.h                   |    4 +
 gcc/config/rs6000/altivec.md                  |  241 ++
 gcc/config/rs6000/rs6000-builtin.def          |   53 +-
 gcc/config/rs6000/rs6000-call.c               |  142 +-
 gcc/config/rs6000/rs6000.c                    |    1 +
 gcc/config/rs6000/rs6000.h                    |    3 +-
 gcc/config/rs6000/vector.md                   |  191 ++
 gcc/config/rs6000/vsx.md                      |  107 +
 gcc/doc/extend.texi                           |  174 ++
 .../gcc.target/powerpc/int_128bit-runnable.c  | 2301 +++++++++++++++++
 10 files changed, 3214 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h index 460310a5132..3dedccca189 100644 --- a/gcc/config/rs6000/altivec.h +++ b/gcc/config/rs6000/altivec.h @@ -717,6 +717,10 @@ __altivec_scalar_pred(vec_any_nle, #define vec_step(x) __builtin_vec_step (* (__typeof__ (x) *) 0) #ifdef _ARCH_PWR10 +#define vec_signextq __builtin_vec_vsignextq +#define vec_dive __builtin_vec_dive +#define vec_mod __builtin_vec_mod + /* May modify these macro definitions if future capabilities overload with support for different vector argument and result types. 
*/ #define vec_cntlzm(a, b) __builtin_altivec_vclzdm (a, b) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 4d08cca2228..cb83c5ce012 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -39,12 +39,16 @@ UNSPEC_VMULESH UNSPEC_VMULEUW UNSPEC_VMULESW + UNSPEC_VMULEUD + UNSPEC_VMULESD UNSPEC_VMULOUB UNSPEC_VMULOSB UNSPEC_VMULOUH UNSPEC_VMULOSH UNSPEC_VMULOUW UNSPEC_VMULOSW + UNSPEC_VMULOUD + UNSPEC_VMULOSD UNSPEC_VPKPX UNSPEC_VPACK_SIGN_SIGN_SAT UNSPEC_VPACK_SIGN_UNS_SAT @@ -629,6 +633,14 @@ "vcmpequ %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "altivec_eqv1ti" + [(set (match_operand:V1TI 0 "altivec_register_operand" "=v") + (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v") + (match_operand:V1TI 2 "altivec_register_operand" "v")))] + "TARGET_POWER10" + "vcmpequq %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_gt" [(set (match_operand:VI2 0 "altivec_register_operand" "=v") (gt:VI2 (match_operand:VI2 1 "altivec_register_operand" "v") @@ -637,6 +649,14 @@ "vcmpgts %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "*altivec_gtv1ti" + [(set (match_operand:V1TI 0 "altivec_register_operand" "=v") + (gt:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v") + (match_operand:V1TI 2 "altivec_register_operand" "v")))] + "TARGET_POWER10" + "vcmpgtsq %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_gtu" [(set (match_operand:VI2 0 "altivec_register_operand" "=v") (gtu:VI2 (match_operand:VI2 1 "altivec_register_operand" "v") @@ -645,6 +665,14 @@ "vcmpgtu %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "*altivec_gtuv1ti" + [(set (match_operand:V1TI 0 "altivec_register_operand" "=v") + (gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v") + (match_operand:V1TI 2 "altivec_register_operand" "v")))] + "TARGET_POWER10" + "vcmpgtuq %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_eqv4sf" [(set (match_operand:V4SF 0 
"altivec_register_operand" "=v") (eq:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v") @@ -1688,6 +1716,19 @@ DONE; }) +(define_expand "vec_widen_umult_even_v2di" + [(use (match_operand:V1TI 0 "register_operand")) + (use (match_operand:V2DI 1 "register_operand")) + (use (match_operand:V2DI 2 "register_operand"))] + "TARGET_POWER10" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_altivec_vmuleud (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmuloud (operands[0], operands[1], operands[2])); + DONE; +}) + (define_expand "vec_widen_smult_even_v4si" [(use (match_operand:V2DI 0 "register_operand")) (use (match_operand:V4SI 1 "register_operand")) @@ -1701,6 +1742,19 @@ DONE; }) +(define_expand "vec_widen_smult_even_v2di" + [(use (match_operand:V1TI 0 "register_operand")) + (use (match_operand:V2DI 1 "register_operand")) + (use (match_operand:V2DI 2 "register_operand"))] + "TARGET_POWER10" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_altivec_vmulesd (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmulosd (operands[0], operands[1], operands[2])); + DONE; +}) + (define_expand "vec_widen_umult_odd_v16qi" [(use (match_operand:V8HI 0 "register_operand")) (use (match_operand:V16QI 1 "register_operand")) @@ -1766,6 +1820,19 @@ DONE; }) +(define_expand "vec_widen_umult_odd_v2di" + [(use (match_operand:V1TI 0 "register_operand")) + (use (match_operand:V2DI 1 "register_operand")) + (use (match_operand:V2DI 2 "register_operand"))] + "TARGET_POWER10" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_altivec_vmuloud (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmuleud (operands[0], operands[1], operands[2])); + DONE; +}) + (define_expand "vec_widen_smult_odd_v4si" [(use (match_operand:V2DI 0 "register_operand")) (use (match_operand:V4SI 1 "register_operand")) @@ -1779,6 +1846,19 @@ DONE; }) +(define_expand "vec_widen_smult_odd_v2di" + [(use (match_operand:V1TI 0 "register_operand")) + (use 
(match_operand:V2DI 1 "register_operand")) + (use (match_operand:V2DI 2 "register_operand"))] + "TARGET_POWER10" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_altivec_vmulosd (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmulesd (operands[0], operands[1], operands[2])); + DONE; +}) + (define_insn "altivec_vmuleub" [(set (match_operand:V8HI 0 "register_operand" "=v") (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v") @@ -1860,6 +1940,15 @@ "vmuleuw %0,%1,%2" [(set_attr "type" "veccomplex")]) +(define_insn "altivec_vmuleud" + [(set (match_operand:V1TI 0 "register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v")] + UNSPEC_VMULEUD))] + "TARGET_POWER10" + "vmuleud %0,%1,%2" + [(set_attr "type" "veccomplex")]) + (define_insn "altivec_vmulouw" [(set (match_operand:V2DI 0 "register_operand" "=v") (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v") @@ -1869,6 +1958,15 @@ "vmulouw %0,%1,%2" [(set_attr "type" "veccomplex")]) +(define_insn "altivec_vmuloud" + [(set (match_operand:V1TI 0 "register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v")] + UNSPEC_VMULOUD))] + "TARGET_POWER10" + "vmuloud %0,%1,%2" + [(set_attr "type" "veccomplex")]) + (define_insn "altivec_vmulesw" [(set (match_operand:V2DI 0 "register_operand" "=v") (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v") @@ -1878,6 +1976,15 @@ "vmulesw %0,%1,%2" [(set_attr "type" "veccomplex")]) +(define_insn "altivec_vmulesd" + [(set (match_operand:V1TI 0 "register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v")] + UNSPEC_VMULESD))] + "TARGET_POWER10" + "vmulesd %0,%1,%2" + [(set_attr "type" "veccomplex")]) + (define_insn "altivec_vmulosw" [(set (match_operand:V2DI 0 "register_operand" "=v") (unspec:V2DI [(match_operand:V4SI 1 "register_operand" 
"v") @@ -1887,6 +1994,15 @@ "vmulosw %0,%1,%2" [(set_attr "type" "veccomplex")]) +(define_insn "altivec_vmulosd" + [(set (match_operand:V1TI 0 "register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v")] + UNSPEC_VMULOSD))] + "TARGET_POWER10" + "vmulosd %0,%1,%2" + [(set_attr "type" "veccomplex")]) + ;; Vector pack/unpack (define_insn "altivec_vpkpx" [(set (match_operand:V8HI 0 "register_operand" "=v") @@ -1980,6 +2096,15 @@ "vrl %0,%1,%2" [(set_attr "type" "vecsimple")]) +(define_insn "altivec_vrlq" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +;; rotate amount in needs to be in bits[57:63] of operand2. + "vrlq %0,%1,%2" + [(set_attr "type" "vecsimple")]) + (define_insn "altivec_vrlmi" [(set (match_operand:VIlong 0 "register_operand" "=v") (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "v") @@ -1990,6 +2115,34 @@ "vrlmi %0,%1,%3" [(set_attr "type" "veclogical")]) +(define_expand "altivec_vrlqmi" + [(set (match_operand:V1TI 0 "vsx_register_operand") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand") + (match_operand:V1TI 2 "vsx_register_operand") + (match_operand:V1TI 3 "vsx_register_operand")] + UNSPEC_VRLMI))] + "TARGET_POWER10" +{ + /* Mask bit begin, end fields need to be in bits [41:55] of 128-bit operand2. + Shift amount in needs to be put in bits[57:63] of 128-bit operand2. 
*/ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[3])); + emit_insn (gen_altivec_vrlqmi_inst (operands[0], operands[1], operands[2], + tmp)); + DONE; +}) + +(define_insn "altivec_vrlqmi_inst" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "0") + (match_operand:V1TI 3 "vsx_register_operand" "v")] + UNSPEC_VRLMI))] + "TARGET_POWER10" + "vrlqmi %0,%1,%3" + [(set_attr "type" "veclogical")]) + (define_insn "altivec_vrlnm" [(set (match_operand:VIlong 0 "register_operand" "=v") (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "v") @@ -1999,6 +2152,31 @@ "vrlnm %0,%1,%2" [(set_attr "type" "veclogical")]) +(define_expand "altivec_vrlqnm" + [(set (match_operand:V1TI 0 "vsx_register_operand") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand") + (match_operand:V1TI 2 "vsx_register_operand")] + UNSPEC_VRLNM))] + "TARGET_POWER10" +{ + /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vrlqnm_inst (operands[0], operands[1], tmp)); + DONE; +}) + +(define_insn "altivec_vrlqnm_inst" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VRLNM))] + "TARGET_POWER10" + ;; rotate and mask bits need to be in upper 64-bits of operand2. 
+ "vrlqnm %0,%1,%2" + [(set_attr "type" "veclogical")]) + (define_insn "altivec_vsl" [(set (match_operand:V4SI 0 "register_operand" "=v") (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") @@ -2043,6 +2221,15 @@ "vsl %0,%1,%2" [(set_attr "type" "vecsimple")]) +(define_insn "altivec_vslq" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" + /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */ + "vslq %0,%1,%2" + [(set_attr "type" "vecsimple")]) + (define_insn "*altivec_vsr" [(set (match_operand:VI2 0 "register_operand" "=v") (lshiftrt:VI2 (match_operand:VI2 1 "register_operand" "v") @@ -2051,6 +2238,15 @@ "vsr %0,%1,%2" [(set_attr "type" "vecsimple")]) +(define_insn "altivec_vsrq" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" + /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */ + "vsrq %0,%1,%2" + [(set_attr "type" "vecsimple")]) + (define_insn "*altivec_vsra" [(set (match_operand:VI2 0 "register_operand" "=v") (ashiftrt:VI2 (match_operand:VI2 1 "register_operand" "v") @@ -2059,6 +2255,15 @@ "vsra %0,%1,%2" [(set_attr "type" "vecsimple")]) +(define_insn "altivec_vsraq" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" + /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */ + "vsraq %0,%1,%2" + [(set_attr "type" "vecsimple")]) + (define_insn "altivec_vsr" [(set (match_operand:V4SI 0 "register_operand" "=v") (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") @@ -2619,6 +2824,18 @@ "vcmpequ. 
%0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "altivec_vcmpequt_p" + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand" "v") + (match_operand:V1TI 2 "altivec_register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "altivec_register_operand" "=v") + (eq:V1TI (match_dup 1) + (match_dup 2)))] + "TARGET_POWER10" + "vcmpequq. %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_vcmpgts_p" [(set (reg:CC CR6_REGNO) (unspec:CC [(gt:CC (match_operand:VI2 1 "register_operand" "v") @@ -2631,6 +2848,18 @@ "vcmpgts. %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "*altivec_vcmpgtst_p" + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gt:CC (match_operand:V1TI 1 "register_operand" "v") + (match_operand:V1TI 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "register_operand" "=v") + (gt:V1TI (match_dup 1) + (match_dup 2)))] + "TARGET_POWER10" + "vcmpgtsq. %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_vcmpgtu_p" [(set (reg:CC CR6_REGNO) (unspec:CC [(gtu:CC (match_operand:VI2 1 "register_operand" "v") @@ -2643,6 +2872,18 @@ "vcmpgtu. %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "*altivec_vcmpgtut_p" + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gtu:CC (match_operand:V1TI 1 "register_operand" "v") + (match_operand:V1TI 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "register_operand" "=v") + (gtu:V1TI (match_dup 1) + (match_dup 2)))] + "TARGET_POWER10" + "vcmpgtuq. 
%0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_vcmpeqfp_p" [(set (reg:CC CR6_REGNO) (unspec:CC [(eq:CC (match_operand:V4SF 1 "register_operand" "v") diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 842f07196de..623907216af 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1201,6 +1201,15 @@ | RS6000_BTC_TERNARY), \ CODE_FOR_ ## ICODE) /* ICODE */ +/* See the comment on BU_ALTIVEC_P. */ +#define BU_P10V_AV_P(ENUM, NAME, ATTR, ICODE) \ + RS6000_BUILTIN_P (P10V_BUILTIN_ ## ENUM, /* ENUM */ \ + "__builtin_altivec_" NAME, /* NAME */ \ + RS6000_BTM_P10, /* MASK */ \ + (RS6000_BTC_ ## ATTR /* ATTR */ \ + | RS6000_BTC_PREDICATE), \ + CODE_FOR_ ## ICODE) /* ICODE */ + #define BU_P10V_AV_X(ENUM, NAME, ATTR) \ RS6000_BUILTIN_X (P10_BUILTIN_ ## ENUM, /* ENUM */ \ "__builtin_altivec_" NAME, /* NAME */ \ @@ -2821,6 +2830,10 @@ BU_P9V_AV_1 (VSIGNEXTSH2D, "vsignextsh2d", CONST, vsx_sign_extend_hi_v2di) BU_P9V_AV_1 (VSIGNEXTSW2D, "vsignextsw2d", CONST, vsx_sign_extend_si_v2di) /* Builtins for scalar instructions added in ISA 3.1 (power10). 
*/ +BU_P10V_AV_P (VCMPEQUT_P, "vcmpequt_p", CONST, vector_eq_v1ti_p) +BU_P10V_AV_P (VCMPGTST_P, "vcmpgtst_p", CONST, vector_gt_v1ti_p) +BU_P10V_AV_P (VCMPGTUT_P, "vcmpgtut_p", CONST, vector_gtu_v1ti_p) + BU_P10_POWERPC64_MISC_2 (CFUGED, "cfuged", CONST, cfuged) BU_P10_POWERPC64_MISC_2 (CNTLZDM, "cntlzdm", CONST, cntlzdm) BU_P10_POWERPC64_MISC_2 (CNTTZDM, "cnttzdm", CONST, cnttzdm) @@ -2841,7 +2854,38 @@ BU_P10V_VSX_2 (XXGENPCVM_V16QI, "xxgenpcvm_v16qi", CONST, xxgenpcvm_v16qi) BU_P10V_VSX_2 (XXGENPCVM_V8HI, "xxgenpcvm_v8hi", CONST, xxgenpcvm_v8hi) BU_P10V_VSX_2 (XXGENPCVM_V4SI, "xxgenpcvm_v4si", CONST, xxgenpcvm_v4si) BU_P10V_VSX_2 (XXGENPCVM_V2DI, "xxgenpcvm_v2di", CONST, xxgenpcvm_v2di) - +BU_P10V_AV_2 (VCMPGTUT, "vcmpgtut", CONST, vector_gtuv1ti) +BU_P10V_AV_2 (VCMPGTST, "vcmpgtst", CONST, vector_gtv1ti) +BU_P10V_AV_2 (VCMPEQUT, "vcmpequt", CONST, eqvv1ti3) +BU_P10V_AV_2 (CMPNET, "vcmpnet", CONST, vcmpnet) +BU_P10V_AV_2 (CMPGE_1TI, "cmpge_1ti", CONST, vector_nltv1ti) +BU_P10V_AV_2 (CMPGE_U1TI, "cmpge_u1ti", CONST, vector_nltuv1ti) +BU_P10V_AV_2 (CMPLE_1TI, "cmple_1ti", CONST, vector_ngtv1ti) +BU_P10V_AV_2 (CMPLE_U1TI, "cmple_u1ti", CONST, vector_ngtuv1ti) +BU_P10V_AV_2 (VNOR_V1TI_UNS, "vnor_v1ti_uns",CONST, norv1ti3) +BU_P10V_AV_2 (VNOR_V1TI, "vnor_v1ti", CONST, norv1ti3) +BU_P10V_AV_2 (VCMPNET_P, "vcmpnet_p", CONST, vector_ne_v1ti_p) +BU_P10V_AV_2 (VCMPAET_P, "vcmpaet_p", CONST, vector_ae_v1ti_p) + +BU_P10V_AV_1 (VSIGNEXTSD2Q, "vsignext", CONST, vsx_sign_extend_v2di_v1ti) + +BU_P10V_AV_2 (VMULEUD, "vmuleud", CONST, vec_widen_umult_even_v2di) +BU_P10V_AV_2 (VMULESD, "vmulesd", CONST, vec_widen_smult_even_v2di) +BU_P10V_AV_2 (VMULOUD, "vmuloud", CONST, vec_widen_umult_odd_v2di) +BU_P10V_AV_2 (VMULOSD, "vmulosd", CONST, vec_widen_smult_odd_v2di) +BU_P10V_AV_2 (VRLQ, "vrlq", CONST, vrotlv1ti3) +BU_P10V_AV_2 (VSLQ, "vslq", CONST, vashlv1ti3) +BU_P10V_AV_2 (VSRQ, "vsrq", CONST, vlshrv1ti3) +BU_P10V_AV_2 (VSRAQ, "vsraq", CONST, vashrv1ti3) +BU_P10V_AV_2 (VRLQNM, 
"vrlqnm", CONST, altivec_vrlqnm) +BU_P10V_AV_2 (DIV_V1TI, "div_1ti", CONST, vsx_div_v1ti) +BU_P10V_AV_2 (UDIV_V1TI, "udiv_1ti", CONST, vsx_udiv_v1ti) +BU_P10V_AV_2 (DIVES_V1TI, "dives", CONST, vsx_dives_v1ti) +BU_P10V_AV_2 (DIVEU_V1TI, "diveu", CONST, vsx_diveu_v1ti) +BU_P10V_AV_2 (MODS_V1TI, "mods", CONST, vsx_mods_v1ti) +BU_P10V_AV_2 (MODU_V1TI, "modu", CONST, vsx_modu_v1ti) + +BU_P10V_AV_3 (VRLQMI, "vrlqmi", CONST, altivec_vrlqmi) BU_P10V_AV_3 (VEXTRACTBL, "vextdubvlx", CONST, vextractlv16qi) BU_P10V_AV_3 (VEXTRACTHL, "vextduhvlx", CONST, vextractlv8hi) BU_P10V_AV_3 (VEXTRACTWL, "vextduwvlx", CONST, vextractlv4si) @@ -2948,6 +2992,12 @@ BU_P10_OVERLOAD_2 (CLRR, "clrr") BU_P10_OVERLOAD_2 (GNB, "gnb") BU_P10_OVERLOAD_4 (XXEVAL, "xxeval") BU_P10_OVERLOAD_2 (XXGENPCVM, "xxgenpcvm") +BU_P10_OVERLOAD_2 (VRLQ, "vrlq") +BU_P10_OVERLOAD_2 (VSLQ, "vslq") +BU_P10_OVERLOAD_2 (VSRQ, "vsrq") +BU_P10_OVERLOAD_2 (VSRAQ, "vsraq") +BU_P10_OVERLOAD_2 (DIVE, "dive") +BU_P10_OVERLOAD_2 (MOD, "mod") BU_P10_OVERLOAD_3 (EXTRACTL, "extractl") BU_P10_OVERLOAD_3 (EXTRACTH, "extracth") @@ -2967,6 +3017,7 @@ BU_P10_OVERLOAD_1 (VSTRIL_P, "stril_p") BU_P10_OVERLOAD_1 (XVTLSBB_ZEROS, "xvtlsbb_all_zeros") BU_P10_OVERLOAD_1 (XVTLSBB_ONES, "xvtlsbb_all_ones") +BU_P10_OVERLOAD_1 (SIGNEXT, "vsignextq") BU_P10_OVERLOAD_1 (MTVSRBM, "mtvsrbm") BU_P10_OVERLOAD_1 (MTVSRHM, "mtvsrhm") diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index 3af325317a1..e9ba4751cd4 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -837,6 +837,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPEQ, P8V_BUILTIN_VCMPEQUD, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPEQ, P10V_BUILTIN_VCMPEQUT, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPEQ, 
P10V_BUILTIN_VCMPEQUT, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_CMPEQ, VSX_BUILTIN_XVCMPEQDP, @@ -883,6 +887,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_CMPGE, VSX_BUILTIN_CMPGE_U2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0}, + + { ALTIVEC_BUILTIN_VEC_CMPGE, P10V_BUILTIN_CMPGE_1TI, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0}, + { ALTIVEC_BUILTIN_VEC_CMPGE, P10V_BUILTIN_CMPGE_U1TI, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0}, { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTUB, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTSB, @@ -897,8 +907,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, P8V_BUILTIN_VCMPGTUD, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPGT, P10V_BUILTIN_VCMPGTUT, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, P8V_BUILTIN_VCMPGTSD, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPGT, P10V_BUILTIN_VCMPGTST, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, VSX_BUILTIN_XVCMPGTDP, @@ -941,6 +955,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_CMPLE, VSX_BUILTIN_CMPLE_U2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0}, + { 
ALTIVEC_BUILTIN_VEC_CMPLE, P10V_BUILTIN_CMPLE_1TI, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0}, + { ALTIVEC_BUILTIN_VEC_CMPLE, P10V_BUILTIN_CMPLE_U1TI, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0}, { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTUB, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTSB, @@ -1069,6 +1088,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, { VSX_BUILTIN_VEC_DIV, VSX_BUILTIN_UDIV_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { VSX_BUILTIN_VEC_DIV, P10V_BUILTIN_DIV_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { VSX_BUILTIN_VEC_DIV, P10V_BUILTIN_UDIV_V1TI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, + { VSX_BUILTIN_VEC_DOUBLE, VSX_BUILTIN_XVCVSXDDP, RS6000_BTI_V2DF, RS6000_BTI_V2DI, 0, 0 }, { VSX_BUILTIN_VEC_DOUBLE, VSX_BUILTIN_XVCVUXDDP, @@ -1922,6 +1947,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_MULE, P8V_BUILTIN_VMULEUW, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_MULE, P10V_BUILTIN_VMULESD, + RS6000_BTI_V1TI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_MULE, P10V_BUILTIN_VMULEUD, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULEUB, ALTIVEC_BUILTIN_VMULEUB, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULESB, ALTIVEC_BUILTIN_VMULESB, @@ -1945,6 +1975,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_MULO, P8V_BUILTIN_VMULOUW, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + 
{ ALTIVEC_BUILTIN_VEC_MULO, P10V_BUILTIN_VMULOSD, + RS6000_BTI_V1TI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_MULO, P10V_BUILTIN_VMULOUD, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH, RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULOSH, ALTIVEC_BUILTIN_VMULOSH, @@ -1987,6 +2022,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_bool_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI_UNS, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI_UNS, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_bool_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI_UNS, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI_UNS, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI_UNS, @@ -2248,6 +2293,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_RL, P8V_BUILTIN_VRLD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_RL, P10V_BUILTIN_VRLQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_RL, P10V_BUILTIN_VRLQ, + 
RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_VRLW, ALTIVEC_BUILTIN_VRLW, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VRLW, ALTIVEC_BUILTIN_VRLW, @@ -2266,12 +2316,23 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P9V_BUILTIN_VEC_RLMI, P9V_BUILTIN_VRLDMI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI }, + { P9V_BUILTIN_VEC_RLMI, P10V_BUILTIN_VRLQMI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI }, + { P9V_BUILTIN_VEC_RLMI, P10V_BUILTIN_VRLQMI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { P9V_BUILTIN_VEC_RLNM, P9V_BUILTIN_VRLWNM, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { P9V_BUILTIN_VEC_RLNM, P9V_BUILTIN_VRLDNM, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { P9V_BUILTIN_VEC_RLNM, P10V_BUILTIN_VRLQNM, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, + { P9V_BUILTIN_VEC_RLNM, P10V_BUILTIN_VRLQNM, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_SL, ALTIVEC_BUILTIN_VSLB, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_SL, ALTIVEC_BUILTIN_VSLB, @@ -2288,6 +2349,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_SL, P8V_BUILTIN_VSLD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_SL, P10V_BUILTIN_VSLQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_SL, P10V_BUILTIN_VSLQ, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { 
ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTDP, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 }, { ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTSP, @@ -2484,6 +2550,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_SR, P8V_BUILTIN_VSRD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_SR, P10V_BUILTIN_VSRQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_SR, P10V_BUILTIN_VSRQ, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRW, ALTIVEC_BUILTIN_VSRW, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRW, ALTIVEC_BUILTIN_VSRW, @@ -2512,6 +2583,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_SRA, P8V_BUILTIN_VSRAD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_SRA, P10V_BUILTIN_VSRAQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_SRA, P10V_BUILTIN_VSRAQ, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRAW, ALTIVEC_BUILTIN_VSRAW, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRAW, ALTIVEC_BUILTIN_VSRAW, @@ -4129,12 +4205,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTUD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P10V_BUILTIN_VCMPGTUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, 
RS6000_BTI_unsigned_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P10V_BUILTIN_VCMPGTST_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, ALTIVEC_BUILTIN_VCMPGTFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, VSX_BUILTIN_XVCMPGTDP_P, @@ -4199,6 +4279,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P8V_BUILTIN_VCMPEQUD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P10V_BUILTIN_VCMPEQUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI }, + { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P10V_BUILTIN_VCMPEQUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, ALTIVEC_BUILTIN_VCMPEQFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, VSX_BUILTIN_XVCMPEQDP_P, @@ -4250,12 +4334,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTUD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P10V_BUILTIN_VCMPGTUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { 
ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P10V_BUILTIN_VCMPGTST_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, ALTIVEC_BUILTIN_VCMPGEFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, VSX_BUILTIN_XVCMPGEDP_P, @@ -4904,6 +4992,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNEW, RS6000_BTI_bool_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPNE, P10V_BUILTIN_CMPNET, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, + RS6000_BTI_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPNE, P10V_BUILTIN_CMPNET, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, /* The following 2 entries have been deprecated. 
*/ { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEB_P, @@ -5004,6 +5098,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNED_P, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 }, + { P9V_BUILTIN_VEC_VCMPNE_P, P10V_BUILTIN_VCMPNET_P, + RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { P9V_BUILTIN_VEC_VCMPNE_P, P10V_BUILTIN_VCMPNET_P, + RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEFP_P, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, @@ -5109,7 +5207,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAED_P, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 }, - + { P9V_BUILTIN_VEC_VCMPAE_P, P10V_BUILTIN_VCMPAET_P, + RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { P9V_BUILTIN_VEC_VCMPAE_P, P10V_BUILTIN_VCMPAET_P, + RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEFP_P, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEDP_P, @@ -6036,6 +6137,21 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P10_BUILTIN_VEC_XVTLSBB_ONES, P10V_BUILTIN_XVTLSBB_ONES, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 }, + { P10_BUILTIN_VEC_SIGNEXT, P10V_BUILTIN_VSIGNEXTSD2Q, + RS6000_BTI_V1TI, RS6000_BTI_V2DI, 0, 0 }, + + { P10_BUILTIN_VEC_DIVE, P10V_BUILTIN_DIVES_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { P10_BUILTIN_VEC_DIVE, P10V_BUILTIN_DIVEU_V1TI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, + + { P10_BUILTIN_VEC_MOD, P10V_BUILTIN_MODS_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { P10_BUILTIN_VEC_MOD, P10V_BUILTIN_MODU_V1TI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 
+ RS6000_BTI_unsigned_V1TI, 0 }, + { RS6000_BUILTIN_NONE, RS6000_BUILTIN_NONE, 0, 0, 0, 0 } }; @@ -12530,12 +12646,14 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case ALTIVEC_BUILTIN_VCMPEQUH: case ALTIVEC_BUILTIN_VCMPEQUW: case P8V_BUILTIN_VCMPEQUD: + case P10V_BUILTIN_VCMPEQUT: fold_compare_helper (gsi, EQ_EXPR, stmt); return true; case P9V_BUILTIN_CMPNEB: case P9V_BUILTIN_CMPNEH: case P9V_BUILTIN_CMPNEW: + case P10V_BUILTIN_CMPNET: fold_compare_helper (gsi, NE_EXPR, stmt); return true; @@ -12547,6 +12665,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case VSX_BUILTIN_CMPGE_U4SI: case VSX_BUILTIN_CMPGE_2DI: case VSX_BUILTIN_CMPGE_U2DI: + case P10V_BUILTIN_CMPGE_1TI: + case P10V_BUILTIN_CMPGE_U1TI: fold_compare_helper (gsi, GE_EXPR, stmt); return true; @@ -12558,6 +12678,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case ALTIVEC_BUILTIN_VCMPGTUW: case P8V_BUILTIN_VCMPGTUD: case P8V_BUILTIN_VCMPGTSD: + case P10V_BUILTIN_VCMPGTUT: + case P10V_BUILTIN_VCMPGTST: fold_compare_helper (gsi, GT_EXPR, stmt); return true; @@ -12569,6 +12691,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case VSX_BUILTIN_CMPLE_U4SI: case VSX_BUILTIN_CMPLE_2DI: case VSX_BUILTIN_CMPLE_U2DI: + case P10V_BUILTIN_CMPLE_1TI: + case P10V_BUILTIN_CMPLE_U1TI: fold_compare_helper (gsi, LE_EXPR, stmt); return true; @@ -13296,6 +13420,8 @@ rs6000_init_builtins (void) ? 
"__vector __bool long" : "__vector __bool long long", bool_long_long_type_node, 2); + bool_V1TI_type_node = rs6000_vector_type ("__vector __bool __int128", + intTI_type_node, 1); pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel", pixel_type_node, 8); @@ -13481,6 +13607,10 @@ altivec_init_builtins (void) = build_function_type_list (integer_type_node, integer_type_node, V2DI_type_node, V2DI_type_node, NULL_TREE); + tree int_ftype_int_v1ti_v1ti + = build_function_type_list (integer_type_node, + integer_type_node, V1TI_type_node, + V1TI_type_node, NULL_TREE); tree void_ftype_v4si = build_function_type_list (void_type_node, V4SI_type_node, NULL_TREE); tree v8hi_ftype_void @@ -13848,6 +13978,9 @@ altivec_init_builtins (void) case E_VOIDmode: type = int_ftype_int_opaque_opaque; break; + case E_V1TImode: + type = int_ftype_int_v1ti_v1ti; + break; case E_V2DImode: type = int_ftype_int_v2di_v2di; break; @@ -14451,6 +14584,10 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case P10V_BUILTIN_XXGENPCVM_V8HI: case P10V_BUILTIN_XXGENPCVM_V4SI: case P10V_BUILTIN_XXGENPCVM_V2DI: + case P10V_BUILTIN_VMULEUD: + case P10V_BUILTIN_VMULOUD: + case P10V_BUILTIN_DIVEU_V1TI: + case P10V_BUILTIN_MODU_V1TI: h.uns_p[0] = 1; h.uns_p[1] = 1; h.uns_p[2] = 1; @@ -14550,10 +14687,13 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case VSX_BUILTIN_CMPGE_U8HI: case VSX_BUILTIN_CMPGE_U4SI: case VSX_BUILTIN_CMPGE_U2DI: + case P10V_BUILTIN_CMPGE_U1TI: case ALTIVEC_BUILTIN_VCMPGTUB: case ALTIVEC_BUILTIN_VCMPGTUH: case ALTIVEC_BUILTIN_VCMPGTUW: case P8V_BUILTIN_VCMPGTUD: + case P10V_BUILTIN_VCMPGTUT: + case P10V_BUILTIN_VCMPEQUT: h.uns_p[1] = 1; h.uns_p[2] = 1; break; diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 67681d18150..0c51491cbe7 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -19855,6 +19855,7 @@ rs6000_handle_altivec_attribute (tree *node, case 'b': switch (mode) { + case 
E_TImode: case E_V1TImode: result = bool_V1TI_type_node; break; case E_DImode: case E_V2DImode: result = bool_V2DI_type_node; break; case E_SImode: case E_V4SImode: result = bool_V4SI_type_node; break; case E_HImode: case E_V8HImode: result = bool_V8HI_type_node; break; diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index b05dd827b13..d6c3ba040be 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -2330,7 +2330,6 @@ extern int frame_pointer_needed; #define RS6000_BTM_MMA MASK_MMA /* ISA 3.1 MMA. */ #define RS6000_BTM_P10 MASK_POWER10 - #define RS6000_BTM_COMMON (RS6000_BTM_ALTIVEC \ | RS6000_BTM_VSX \ | RS6000_BTM_P8_VECTOR \ @@ -2443,6 +2442,7 @@ enum rs6000_builtin_type_index RS6000_BTI_bool_V8HI, /* __vector __bool short */ RS6000_BTI_bool_V4SI, /* __vector __bool int */ RS6000_BTI_bool_V2DI, /* __vector __bool long */ + RS6000_BTI_bool_V1TI, /* __vector __bool 128-bit */ RS6000_BTI_pixel_V8HI, /* __vector __pixel */ RS6000_BTI_long, /* long_integer_type_node */ RS6000_BTI_unsigned_long, /* long_unsigned_type_node */ @@ -2496,6 +2496,7 @@ enum rs6000_builtin_type_index #define bool_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V8HI]) #define bool_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V4SI]) #define bool_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V2DI]) +#define bool_V1TI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V1TI]) #define pixel_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_pixel_V8HI]) #define long_long_integer_type_internal_node (rs6000_builtin_types[RS6000_BTI_long_long]) diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md index e5191bd1424..0f252c915b0 100644 --- a/gcc/config/rs6000/vector.md +++ b/gcc/config/rs6000/vector.md @@ -685,6 +685,13 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_gtv1ti" + [(set (match_operand:V1TI 0 "vlogical_operand") + (gt:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 
"vlogical_operand")))] + "TARGET_POWER10" + "") + ; >= for integer vectors: swap operands and apply not-greater-than (define_expand "vector_nlt" [(set (match_operand:VEC_I 3 "vlogical_operand") @@ -697,6 +704,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_nltv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gt:V1TI (match_operand:V1TI 2 "vlogical_operand") + (match_operand:V1TI 1 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + (define_expand "vector_gtu" [(set (match_operand:VEC_I 0 "vint_operand") (gtu:VEC_I (match_operand:VEC_I 1 "vint_operand") @@ -704,6 +722,13 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_gtuv1ti" + [(set (match_operand:V1TI 0 "altivec_register_operand") + (gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand")))] + "TARGET_POWER10" + "") + ; >= for integer vectors: swap operands and apply not-greater-than (define_expand "vector_nltu" [(set (match_operand:VEC_I 3 "vlogical_operand") @@ -716,6 +741,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_nltuv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gtu:V1TI (match_operand:V1TI 2 "vlogical_operand") + (match_operand:V1TI 1 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + (define_expand "vector_geu" [(set (match_operand:VEC_I 0 "vint_operand") (geu:VEC_I (match_operand:VEC_I 1 "vint_operand") @@ -735,6 +771,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_ngtv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gt:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand"))) + (set 
(match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + (define_expand "vector_ngtu" [(set (match_operand:VEC_I 3 "vlogical_operand") (gtu:VEC_I (match_operand:VEC_I 1 "vlogical_operand") @@ -746,6 +793,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_ngtuv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gtu:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + ; There are 14 possible vector FP comparison operators, gt and eq of them have ; been expanded above, so just support 12 remaining operators here. @@ -894,6 +952,18 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_eq_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "vlogical_operand") + (eq:V1TI (match_dup 1) + (match_dup 2)))])] + "TARGET_POWER10" + "") + ;; This expansion handles the V16QI, V8HI, and V4SI modes in the ;; implementation of the vec_all_ne built-in functions on Power9. 
(define_expand "vector_ne__p" @@ -976,6 +1046,23 @@ operands[3] = gen_reg_rtx (V2DImode); }) +(define_expand "vector_ne_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_dup 3) + (eq:V1TI (match_dup 1) + (match_dup 2)))]) + (set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (reg:CC CR6_REGNO) + (const_int 0)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx (V1TImode); +}) + ;; This expansion handles the V2DI mode in the implementation of the ;; vec_any_eq built-in function on Power9. ;; @@ -1002,6 +1089,26 @@ operands[3] = gen_reg_rtx (V2DImode); }) +(define_expand "vector_ae_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_dup 3) + (eq:V1TI (match_dup 1) + (match_dup 2)))]) + (set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (reg:CC CR6_REGNO) + (const_int 0))) + (set (match_dup 0) + (xor:SI (match_dup 0) + (const_int 1)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx (V1TImode); +}) + ;; This expansion handles the V4SF and V2DF modes in the Power9 ;; implementation of the vec_all_ne built-in functions. 
Note that the ;; expansions for this pattern with these modes makes no use of power9- @@ -1061,6 +1168,18 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_gt_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gt:CC (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "vlogical_operand") + (gt:V1TI (match_dup 1) + (match_dup 2)))])] + "TARGET_POWER10" + "") + (define_expand "vector_ge__p" [(parallel [(set (reg:CC CR6_REGNO) @@ -1085,6 +1204,18 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_gtu_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gtu:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "altivec_register_operand") + (gtu:V1TI (match_dup 1) + (match_dup 2)))])] + "TARGET_POWER10" + "") + ;; AltiVec/VSX predicates. ;; This expansion is triggered during expansion of predicate built-in @@ -1460,6 +1591,20 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vrotlv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. 
*/ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vrlq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Expanders for rotatert to make use of vrotl (define_expand "vrotr3" [(set (match_operand:VEC_I 0 "vint_operand") @@ -1481,6 +1626,21 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +;; No immediate version of this 128-bit instruction +(define_expand "vashlv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vslq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Expanders for logical shift right on each vector element (define_expand "vlshr3" [(set (match_operand:VEC_I 0 "vint_operand") @@ -1489,6 +1649,21 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +;; No immediate version of this 128-bit instruction +(define_expand "vlshrv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + /* Shift amount in needs to be put into bits[57:63] of 128-bit operand2. 
*/ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vsrq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Expanders for arithmetic shift right on each vector element (define_expand "vashr3" [(set (match_operand:VEC_I 0 "vint_operand") @@ -1496,6 +1671,22 @@ (match_operand:VEC_I 2 "vint_operand")))] "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") + +;; No immediate version of this 128-bit instruction +(define_expand "vashrv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + /* Shift amount in needs to be put into bits[57:63] of 128-bit operand2. */ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vsraq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Vector reduction expanders for VSX ; The (VEC_reduc:... diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index e17b9c556d4..fd779435390 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -298,6 +298,12 @@ UNSPEC_VSX_XXSPLTD UNSPEC_VSX_DIVSD UNSPEC_VSX_DIVUD + UNSPEC_VSX_DIVSQ + UNSPEC_VSX_DIVUQ + UNSPEC_VSX_DIVESQ + UNSPEC_VSX_DIVEUQ + UNSPEC_VSX_MODSQ + UNSPEC_VSX_MODUQ UNSPEC_VSX_MULSD UNSPEC_VSX_SIGN_EXTEND UNSPEC_VSX_XVCVBF16SPN @@ -1752,6 +1758,70 @@ } [(set_attr "type" "div")]) +;; 64-bit multiply +(define_insn "mulv2di3" + [(set (match_operand:V2DI 0 "register_operand" "=v") + (mult:V2DI (match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v")))] + "TARGET_POWER10" + "vmulld %0,%1,%2" + [(set_attr "type" "veccomplex")]) + +;; Vector integer signed/unsigned divide +(define_insn "vsx_div_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 
"vsx_register_operand" "v")] + UNSPEC_VSX_DIVSQ))] + "TARGET_POWER10" + "vdivsq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_udiv_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVUQ))] + "TARGET_POWER10" + "vdivuq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_dives_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVESQ))] + "TARGET_POWER10" + "vdivesq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_diveu_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVEUQ))] + "TARGET_POWER10" + "vdiveuq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_mods_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_MODSQ))] + "TARGET_POWER10" + "vmodsq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_modu_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_MODUQ))] + "TARGET_POWER10" + "vmoduq %0,%1,%2" + [(set_attr "type" "div")]) + ;; *tdiv* instruction returning the FG flag (define_expand "vsx_tdiv3_fg" [(set (match_dup 3) @@ -3103,6 +3173,21 @@ "xxpermdi %x0,%x1,%x1,2" [(set_attr "type" "vecperm")]) +;; Swap upper/lower 64-bit values in a 128-bit vector +(define_insn "xxswapd_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (subreg:V1TI + (vec_select:V2DI + (subreg:V2DI + 
(match_operand:V1TI 1 "vsx_register_operand" "v") 0 ) + (parallel [(const_int 1)(const_int 0)])) + 0))] + "TARGET_POWER10" +;; AIX does not support extended mnemonic xxswapd. Use the basic +;; mnemonic xxpermdi instead. + "xxpermdi %x0,%x1,%x1,2" + [(set_attr "type" "vecperm")]) + (define_insn "xxgenpcvm__internal" [(set (match_operand:VSX_EXTRACT_I4 0 "altivec_register_operand" "=wa") (unspec:VSX_EXTRACT_I4 @@ -4787,6 +4872,15 @@ (set_attr "type" "vecload")]) +;; ISA 3.1 vector extend sign support +(define_insn "vsx_sign_extend_v2di_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "vsx_register_operand" "v")] + UNSPEC_VSX_SIGN_EXTEND))] + "TARGET_POWER10" + "vextsd2q %0,%1" + [(set_attr "type" "vecexts")]) + ;; ISA 3.0 vector extend sign support (define_insn "vsx_sign_extend_qi_" @@ -5502,6 +5596,19 @@ "vcmpneb %0,%1,%2" [(set_attr "type" "vecsimple")]) +;; Vector Compare Not Equal v1ti (specified/not+eq:) +(define_expand "vcmpnet" + [(set (match_operand:V1TI 0 "altivec_register_operand") + (not:V1TI + (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))))] + "TARGET_POWER10" +{ + emit_insn (gen_eqvv1ti3 (operands[0], operands[1], operands[2])); + emit_insn (gen_one_cmplv1ti2 (operands[0], operands[0])); + DONE; +}) + ;; Vector Compare Not Equal or Zero Byte (define_insn "vcmpnezb" [(set (match_operand:V16QI 0 "altivec_register_operand" "=v") diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index feaa4929697..f2efd73e12d 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -21662,6 +21662,180 @@ Generate PCV from specified Mask size, as if implemented by the immediate value is either 0, 1, 2 or 3. 
@findex vec_genpcvm +@smallexample +@exdent vector unsigned __int128 vec_rl (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_rl (vector signed __int128, + vector unsigned __int128); +@end smallexample + +Returns the result of rotating the first input left by the number of bits +specified in the most significant quad word of the second input truncated to +7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_rlmi (vector unsigned __int128, + vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_rlmi (vector signed __int128, + vector signed __int128, + vector unsigned __int128); +@end smallexample + +Returns the result of rotating the first input and inserting it under mask +into the second input. The first bit in the mask and the last bit in the mask are +obtained from the two 7-bit fields bits [108:115] and bits [117:123] +respectively of the second input. The shift is obtained from the third input +in the 7-bit field [125:131] where all bits are counted from zero at the left. + +@smallexample +@exdent vector unsigned __int128 vec_rlnm (vector unsigned __int128, + vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_rlnm (vector signed __int128, + vector unsigned __int128, + vector unsigned __int128); +@end smallexample + +Returns the result of rotating the first input and ANDing it with a mask. The +first bit in the mask and the last bit in the mask are obtained from the two +7-bit fields bits [117:123] and bits [125:131] respectively of the second +input. The shift is obtained from the third input in the 7-bit field bits +[125:131] where all bits are counted from zero at the left.
+ +@smallexample +@exdent vector unsigned __int128 vec_sl(vector unsigned __int128, vector unsigned __int128); +@exdent vector signed __int128 vec_sl(vector signed __int128, vector unsigned __int128); +@end smallexample + +Returns the result of shifting the first input left by the number of bits +specified in the most significant bits of the second input truncated to +7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_sr(vector unsigned __int128, vector unsigned __int128); +@exdent vector signed __int128 vec_sr(vector signed __int128, vector unsigned __int128); +@end smallexample + +Returns the result of performing a logical right shift of the first argument +by the number of bits specified in the most significant double word of the +second input truncated to 7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_sra(vector unsigned __int128, vector unsigned __int128); +@exdent vector signed __int128 vec_sra(vector signed __int128, vector unsigned __int128); +@end smallexample + +Returns the result of performing an arithmetic right shift of the first argument +by the number of bits specified in the most significant bits of the +second input truncated to 7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_mule (vector unsigned long long, + vector unsigned long long); +@exdent vector signed __int128 vec_mule (vector signed long long, + vector signed long long); +@end smallexample + +Returns a vector containing a 128-bit integer result of multiplying the even +doubleword elements of the two inputs. + +@smallexample +@exdent vector unsigned __int128 vec_mulo (vector unsigned long long, + vector unsigned long long); +@exdent vector signed __int128 vec_mulo (vector signed long long, + vector signed long long); +@end smallexample + +Returns a vector containing a 128-bit integer result of multiplying the odd +doubleword elements of the two inputs.
+ +@smallexample +@exdent vector unsigned __int128 vec_div (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_div (vector signed __int128, + vector signed __int128); +@end smallexample + +Returns the result of dividing the first operand by the second operand. An +attempt to divide any value by zero or to divide the most negative signed +128-bit integer by negative one results in an undefined value. + +@smallexample +@exdent vector unsigned __int128 vec_dive (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_dive (vector signed __int128, + vector signed __int128); +@end smallexample + +The result is produced by shifting the first input left by 128 bits and +dividing by the second. If an attempt is made to divide by zero or the result +is larger than 128 bits, the result is undefined. + +@smallexample +@exdent vector unsigned __int128 vec_mod (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_mod (vector signed __int128, + vector signed __int128); +@end smallexample + +The result is the remainder of dividing the first input by the second +input. + +The following builtins perform 128-bit vector comparisons. The +@code{vec_all_xx}, @code{vec_any_xx}, and @code{vec_cmpxx} functions, where @code{xx} is +one of the operations @code{eq, ne, gt, lt, ge, le}, perform pairwise +comparisons between the elements at the same positions within their two vector +arguments. The @code{vec_all_xx} function returns a non-zero value if and only +if all pairwise comparisons are true. The @code{vec_any_xx} function returns +a non-zero value if and only if at least one pairwise comparison is true. The +@code{vec_cmpxx} function returns a vector of the same type as its two +arguments, within which each element consists of all ones to denote that the +specified logical comparison of the corresponding elements was true.
+Otherwise, the element of the returned vector contains all zeros. + +@smallexample +vector bool __int128 vec_cmpeq (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpeq (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmpne (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpne (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmpgt (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpgt (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmplt (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmplt (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmpge (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpge (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmple (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmple (vector unsigned __int128, vector unsigned __int128); + +int vec_all_eq (vector signed __int128, vector signed __int128); +int vec_all_eq (vector unsigned __int128, vector unsigned __int128); +int vec_all_ne (vector signed __int128, vector signed __int128); +int vec_all_ne (vector unsigned __int128, vector unsigned __int128); +int vec_all_gt (vector signed __int128, vector signed __int128); +int vec_all_gt (vector unsigned __int128, vector unsigned __int128); +int vec_all_lt (vector signed __int128, vector signed __int128); +int vec_all_lt (vector unsigned __int128, vector unsigned __int128); +int vec_all_ge (vector signed __int128, vector signed __int128); +int vec_all_ge (vector unsigned __int128, vector unsigned __int128); +int vec_all_le (vector signed __int128, vector signed __int128); +int vec_all_le (vector unsigned __int128, vector unsigned __int128); + +int vec_any_eq (vector signed __int128, vector signed __int128); +int vec_any_eq 
(vector unsigned __int128, vector unsigned __int128); +int vec_any_ne (vector signed __int128, vector signed __int128); +int vec_any_ne (vector unsigned __int128, vector unsigned __int128); +int vec_any_gt (vector signed __int128, vector signed __int128); +int vec_any_gt (vector unsigned __int128, vector unsigned __int128); +int vec_any_lt (vector signed __int128, vector signed __int128); +int vec_any_lt (vector unsigned __int128, vector unsigned __int128); +int vec_any_ge (vector signed __int128, vector signed __int128); +int vec_any_ge (vector unsigned __int128, vector unsigned __int128); +int vec_any_le (vector signed __int128, vector signed __int128); +int vec_any_le (vector unsigned __int128, vector unsigned __int128); +@end smallexample + + @node PowerPC Hardware Transactional Memory Built-in Functions @subsection PowerPC Hardware Transactional Memory Built-in Functions GCC provides two interfaces for accessing the Hardware Transactional diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c new file mode 100644 index 00000000000..3f8892b39d6 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c @@ -0,0 +1,2301 @@ +/* { dg-do run } */ +/* { dg-options "-mcpu=power10 -save-temps" } */ +/* { dg-require-effective-target power10_hw } */ + +/* Check that the expected 128-bit instructions are generated if the processor + supports the 128-bit integer instructions. 
*/ +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvslq\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvsrq\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvsraq\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvrlq\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvcmpuq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpsq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpequq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpequq.\M} 16 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtsq.\M} 16 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtuq.\M} 16 } } */ +/* { dg-final { scan-assembler-times {\mvmuleud\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmuloud\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmulesd\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmulosd\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmulld\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvdivsq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvdivuq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvdivesq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvdiveuq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmodsq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmoduq\M} 1 } } */ + +#include + +#define DEBUG 0 + +#if DEBUG +#include +#include + + +void print_i128(__int128_t val) +{ + printf(" %lld %llu (0x%llx %llx)", + (signed long long)(val >> 64), + (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF), + (unsigned long long)(val >> 64), + (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF)); +} +#endif + +void abort (void); + +int main () +{ + int i, result_int; + + __int128_t arg1, result; + 
__uint128_t uarg2; + + vector signed long long int vec_arg1_di, vec_arg2_di; + vector signed long long int vec_result_di, vec_expected_result_di; + vector unsigned long long int vec_uarg1_di, vec_uarg2_di, vec_uarg3_di; + vector unsigned long long int vec_uresult_di; + vector unsigned long long int vec_uexpected_result_di; + + __int128_t expected_result; + __uint128_t uexpected_result; + + vector __int128 vec_arg1, vec_arg2, vec_result; + vector unsigned __int128 vec_uarg1, vec_uarg2, vec_uarg3, vec_uresult; + vector bool __int128 vec_result_bool; + + /* sign extend double to 128-bit integer */ + vec_arg1_di[0] = 1000; + vec_arg1_di[1] = -123456; + + expected_result = 1000; + + vec_result = vec_signextq (vec_arg1_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_signextq ((long long) %lld) = ", vec_arg1_di[0]); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1_di[0] = -123456; + vec_arg1_di[1] = 1000; + + expected_result = -123456; + + vec_result = vec_signextq (vec_arg1_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_signextq ((long long) %lld) = ", vec_arg1_di[0]); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* test shift 128-bit integers. + Note, shift amount is given by the lower 7-bits of the shift amount. 
*/ + vec_arg1[0] = 3; + vec_uarg2[0] = 2; + expected_result = vec_arg1[0]*4; + + vec_result = vec_sl (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_sl(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" << %lld", vec_uarg2[0] & 0xFF); + printf(" = "); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + arg1 = 3; + uarg2 = 4; + expected_result = arg1*16; + + result = arg1 << uarg2; + + if (result != expected_result) { +#if DEBUG + printf("ERROR: int128 << uint128): "); + print_i128(arg1); + printf(" << %lld", uarg2 & 0xFF); + printf(" = "); + print_i128(result); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 3; + vec_uarg2[0] = 2; + uexpected_result = vec_uarg1[0]*4; + + vec_uresult = vec_sl (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_sl(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" << %lld", vec_uarg2[0] & 0xFF); + printf(" = "); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12; + vec_uarg2[0] = 2; + expected_result = vec_arg1[0]/4; + + vec_result = vec_sr (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_sr(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" >> %lld", vec_uarg2[0] & 0xFF); + printf(" = "); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 48; + vec_uarg2[0] = 2; + uexpected_result = vec_uarg1[0]/4; + + vec_uresult = vec_sr (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if 
DEBUG + printf("ERROR: vec_sr(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" >> %lld", vec_uarg2[0] & 0xFF); + printf(" = "); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + arg1 = 48; + uarg2 = 4; + expected_result = arg1/16; + + result = arg1 >> uarg2; + + if (result != expected_result) { +#if DEBUG + printf("ERROR: int128 >> uint128: "); + print_i128(arg1); + printf(" >> %lld", uarg2 & 0xFF); + printf(" = "); + print_i128(result); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_uarg2[0] = 32; + expected_result = 0x0000000012345678ULL; + expected_result = (expected_result << 64) | 0x90ABCDEFAABBCCDDULL; + + vec_result = vec_sra (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_sra(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" >> %lld = \n", vec_uarg2[0]); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 48; + uexpected_result = 0xFFFFFFFFFFFFAABBLL; + uexpected_result = (uexpected_result << 64) | 0xCCDDEEFF11221234ULL; + + vec_uresult = vec_sra (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_sra(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" >> %lld = \n", vec_uarg2[0] & 0xFF); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = 
(vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_uarg2[0] = 32; + expected_result = 0x90ABCDEFAABBCCDDULL; + expected_result = (expected_result << 64) | 0xEEFF112212345678ULL; + + vec_result = vec_rl (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_rl(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" >> %lld = \n", vec_uarg2[0]); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 48; + uexpected_result = 0x11221234567890ABULL; + uexpected_result = (uexpected_result << 64) | 0xCDEFAABBCCDDEEFFULL; + + vec_uresult = vec_rl (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_rl(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" >> %lld = \n", vec_uarg2[0]); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* vec_rlnm(arg1, arg2, arg3) + result - rotate each element of arg1 left by shift in element of arg2. + Then AND with mask whose start/stop bits are specified in element of + arg3. 
*/ + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_uarg2[0] = 32; + vec_uarg3[0] = (32 << 8) | 95; + expected_result = 0xaabbccddULL; + expected_result = (expected_result << 64) | 0xeeff112200000000ULL; + + vec_result = vec_rlnm (vec_arg1, vec_uarg2, vec_uarg3); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_rlnm(int128, uint128, uint128): "); + print_i128(vec_arg1[0]); + printf(" << %lld = \n", (long long) (vec_uarg2[0] & 0xFF)); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + + + /* vec_rlnm(arg1, arg2, arg3) + result - rotate each element of arg1 left by shift in element of arg2; + then AND with mask whose start/stop bits are specified in element of + arg3. */ + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 48; + vec_uarg3[0] = (8 << 8) | 119; + + uexpected_result = 0x00221234567890ABULL; + uexpected_result = (uexpected_result << 64) | 0xCDEFAABBCCDDEE00ULL; + + vec_uresult = vec_rlnm (vec_uarg1, vec_uarg2, vec_uarg3); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_rlnm(uint128, uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" << %lld = \n", (long long) (vec_uarg2[0] & 0xFF)); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* vec_rlmi(arg1, arg2, arg3) + result - rotate each element of arg1 left and insert it into the + element of arg2 based on the mask specified in arg3. The shift, mask + start, and mask end are specified in arg3.
*/ + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_arg2[0] = 0x000000000000DEADULL; + vec_arg2[0] = (vec_arg2[0] << 64) | 0x0000BEEF00000000ULL; + vec_uarg3[0] = 96 << 16 | 127 << 8 | 32; + expected_result = 0x000000000000DEADULL; + expected_result = (expected_result << 64) | 0x0000BEEF12345678ULL; + + vec_result = vec_rlmi (vec_arg1, vec_arg2, vec_uarg3); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_rlmi(int128, int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" << %lld = \n", (long long) (vec_uarg3[0] & 0xFF)); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* vec_rlmi(arg1, arg2, arg3) + result - rotate each element of arg1 left and insert it into the + element of arg2 based on the mask specified in arg3. The shift, mask + start, and mask end are specified in arg3. */ + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 0xDEAD000000000000ULL; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 0x000000000000BEEFULL; + vec_uarg3[0] = 16 << 16 | 111 << 8 | 48; + uexpected_result = 0xDEAD1234567890ABULL; + uexpected_result = (uexpected_result << 64) | 0xCDEFAABBCCDDBEEFULL; + + vec_uresult = vec_rlmi (vec_uarg1, vec_uarg2, vec_uarg3); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_rlmi(uint128, uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" << %lld = \n", (long long) (vec_uarg3[0] & 0xFF)); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* 128-bit compare tests, result is all 1's if true */ + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1[0] = 2468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + uexpected_result = 0xFFFFFFFFFFFFFFFFULL; +
uexpected_result = (uexpected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpgt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != uexpected_result) { +#if DEBUG + printf("ERROR: unsigned vec_cmpgt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpgt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed vec_cmpgt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpeq (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR:not equal signed vec_cmpeq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpeq (vec_arg1, 
vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed equal vec_cmpeq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpeq (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned not equal vec_cmpeq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpeq (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: equal unsigned vec_cmpeq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpne (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + 
printf("ERROR: unsigned not equal vec_cmpne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpne (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: equal unsigned vec_cmpne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpne (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR:not equal signed vec_cmpne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpne (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed equal vec_cmpne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = 
"); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmplt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 > arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 1234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 12468; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmplt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 < arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmplt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 = arg2 vec_cmplt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + 
vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmplt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 > arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -1234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 12468; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmplt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 < arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmplt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmple (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != 
expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 > arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 1234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 12468; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 < arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 = arg2 vec_cmple ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmple (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 > arg2 vec_cmple ( 
"); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -1234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 12468; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 < arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 > arg2 vec_cmpge ( "); 
+ print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 1234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 12468; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmpge (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 < arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 = arg2 vec_cmpge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 > arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + 
printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -1234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 12468; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmpge (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 < arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + +#if 1 + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_eq (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_eq (vec_arg1, vec_arg2); + + if (result_int) { 
+#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_eq (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_eq (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_ne (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_ne (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_ne (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 
vec_all_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_ne (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_lt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_lt (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_lt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_lt (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_lt ( "); + 
print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + 
printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_ge (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_ge (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") 
failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_ge (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_ge (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_eq (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_eq (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_eq (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } 
+ + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_eq (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_ne (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_ne (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_ne (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_ne (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 
1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_lt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_lt (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_lt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_lt (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + 
vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = 
vec_uarg2; + + result_int = vec_any_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_ge (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_ge (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_ge (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + 
result_int = vec_any_ge (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } +#endif + + /* Vector multiply Even and Odd tests */ + vec_arg1_di[0] = 200; + vec_arg1_di[1] = 400; + vec_arg2_di[0] = 1234; + vec_arg2_di[1] = 4567; + expected_result = vec_arg1_di[0] * vec_arg2_di[0]; + + vec_result = vec_mule (vec_arg1_di, vec_arg2_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_mule (signed, signed) failed.\n"); + printf(" vec_arg1_di[0] = %lld\n", vec_arg1_di[0]); + printf(" vec_arg2_di[0] = %lld\n", vec_arg2_di[0]); + printf("Result = "); + print_i128(vec_result[0]); + printf("\nExpected Result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1_di[0] = -200; + vec_arg1_di[1] = -400; + vec_arg2_di[0] = 1234; + vec_arg2_di[1] = 4567; + expected_result = vec_arg1_di[1] * vec_arg2_di[1]; + + vec_result = vec_mulo (vec_arg1_di, vec_arg2_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_mulo (signed, signed) failed.\n"); + printf(" vec_arg1_di[1] = %lld\n", vec_arg1_di[1]); + printf(" vec_arg2_di[1] = %lld\n", vec_arg2_di[1]); + printf("Result = "); + print_i128(vec_result[0]); + printf("\nExpected Result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1_di[0] = 200; + vec_uarg1_di[1] = 400; + vec_uarg2_di[0] = 1234; + vec_uarg2_di[1] = 4567; + uexpected_result = vec_uarg1_di[0] * vec_uarg2_di[0]; + + vec_uresult = vec_mule (vec_uarg1_di, vec_uarg2_di); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_mule (unsigned, unsigned) failed.\n"); + printf(" vec_uarg1_di[0] = %llu\n", vec_uarg1_di[0]); + printf(" vec_uarg2_di[0] = %llu\n", vec_uarg2_di[0]); + printf("Result = "); +
print_i128(vec_uresult[0]); + printf("\nExpected Result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1_di[0] = 200; + vec_uarg1_di[1] = 400; + vec_uarg2_di[0] = 1234; + vec_uarg2_di[1] = 4567; + uexpected_result = vec_uarg1_di[1] * vec_uarg2_di[1]; + + vec_uresult = vec_mulo (vec_uarg1_di, vec_uarg2_di); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_mulo (unsigned, unsigned) failed.\n"); + printf(" vec_uarg1_di[1] = %llu\n", vec_uarg1_di[1]); + printf(" vec_uarg2_di[1] = %llu\n", vec_uarg2_di[1]); + printf("Result = "); + print_i128(vec_uresult[0]); + printf("\nExpected Result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* Vector Multiply Longword */ + vec_arg1_di[0] = 100; + vec_arg1_di[1] = -123456; + + vec_arg2_di[0] = 123; + vec_arg2_di[1] = 1000; + + vec_expected_result_di[0] = 12300; + vec_expected_result_di[1] = -123456000; + + vec_result_di = vec_arg1_di * vec_arg2_di; + + for (i = 0; i<2; i++) { + if (vec_result_di[i] != vec_expected_result_di[i]) { +#if DEBUG + printf("ERROR: vector multiply [%d] ((long long) %lld) = ", i, + vec_result_di[i]); + printf("\n does not match expected_result [%d] = ((long long) %lld)", i, + vec_expected_result_di[i]); + printf("\n\n"); +#else + abort(); +#endif + } + } + + /* Vector Divide Quadword */ + vec_arg1[0] = -12345678; + vec_arg2[0] = 2; + expected_result = -6172839; + + vec_result = vec_div (vec_arg1, vec_arg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_div (signed, signed) failed.\n"); + printf("vec_arg1[0] = "); + print_i128(vec_arg1[0]); + printf("\nvec_arg2[0] = "); + print_i128(vec_arg2[0]); + printf("\nResult = "); + print_i128(vec_result[0]); + printf("\nExpected result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 24680; + vec_uarg2[0] = 4; + uexpected_result = 6170; + + vec_uresult =
vec_div (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_div (unsigned, unsigned) failed.\n"); + printf("vec_uarg1[0] = "); + print_i128(vec_uarg1[0]); + printf("\nvec_uarg2[0] = "); + print_i128(vec_uarg2[0]); + printf("\nResult = "); + print_i128(vec_uresult[0]); + printf("\nExpected result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* Vector Divide Extended Quadword */ + vec_arg1[0] = -20; // has 128-bit of zero concatenated onto it + vec_arg2[0] = 0x2000000000000000; + vec_arg2[0] = vec_arg2[0] << 64; + expected_result = -160; + + vec_result = vec_dive (vec_arg1, vec_arg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_dive (signed, signed) failed.\n"); + printf("vec_arg1[0] = "); + print_i128(vec_arg1[0]); + printf("\nvec_arg2[0] = "); + print_i128(vec_arg2[0]); + printf("\nResult = "); + print_i128(vec_result[0]); + printf("\nExpected result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 20; // has 128-bit of zero concatenated onto it + vec_uarg2[0] = 0x4000000000000000; + vec_uarg2[0] = vec_uarg2[0] << 64; + uexpected_result = 80; + + vec_uresult = vec_dive (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_dive (unsigned, unsigned) failed.\n"); + printf("vec_uarg1[0] = "); + print_i128(vec_uarg1[0]); + printf("\nvec_uarg2[0] = "); + print_i128(vec_uarg2[0]); + printf("\nResult = "); + print_i128(vec_uresult[0]); + printf("\nExpected result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* Vector modulo quad word */ + vec_arg1[0] = -12345675; + vec_arg2[0] = 2; + expected_result = -1; + + vec_result = vec_mod (vec_arg1, vec_arg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_mod (signed, signed) failed.\n"); + printf("vec_arg1[0] = "); + 
print_i128(vec_arg1[0]); + printf("\nvec_arg2[0] = "); + print_i128(vec_arg2[0]); + printf("\nResult = "); + print_i128(vec_result[0]); + printf("\nExpected result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 24685; + vec_uarg2[0] = 4; + uexpected_result = 1; + + vec_uresult = vec_mod (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_mod (unsigned, unsigned) failed.\n"); + printf("vec_uarg1[0] = "); + print_i128(vec_uarg1[0]); + printf("\nvec_uarg2[0] = "); + print_i128(vec_uarg2[0]); + printf("\nResult = "); + print_i128(vec_uresult[0]); + printf("\nExpected result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + return 0; +} From patchwork Tue Jan 19 22:33:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Carl Love X-Patchwork-Id: 1428911 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=rr8QlWFa; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DL3NB3t9Zz9sRR for ; Wed, 20 Jan 2021 09:33:46 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3943239450D4; Tue, 19 Jan 2021 
22:33:43 +0000 (GMT)
Message-ID: <9d7a7a8451a4c3f27e82b630c949fe43e3ab4d49.camel@us.ibm.com>
Subject: [PATCH 4/6 ver 3] Add TI to TD (128-bit DFP) and TD to TI support
To: Segher Boessenkool , will schmidt , cel@us.ibm.com
Date: Tue, 19 Jan 2021 14:33:36 -0800
In-Reply-To: <20201013002313.GV2672@gate.crashing.org>
References: <815d6b091f4b8bf3ab7c7e203c41d03c6c0e8d81.camel@us.ibm.com> <8acbb7bc3964944154491037884523c94ac3bdb1.camel@us.ibm.com> <384c17c8b764c850f8a9a08e963ed34ec89de28b.camel@vnet.ibm.com> <82b546ae55356938b9002ca4a9d0d4eb62961dae.camel@vnet.ibm.com> <20201013002313.GV2672@gate.crashing.org>
X-Mailer: Evolution 3.28.5 (3.28.5-12.el8)
Mime-Version: 1.0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 priorityscore=1501 mlxlogscore=999 mlxscore=0 suspectscore=0 spamscore=0 bulkscore=0 lowpriorityscore=0 malwarescore=0 impostorscore=0 adultscore=0 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1
engine=8.12.0-2009150000 definitions=main-2101190117
X-Spam-Status: No, score=-12.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Carl Love via Gcc-patches
From: Carl Love
Reply-To: Carl Love
Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Errors-To: gcc-patches-bounces@gcc.gnu.org
Sender: "Gcc-patches"

Will, Segher:

This patch adds support for converting to/from 128-bit integers and
128-bit decimal floating point formats.

Version 3: No functional changes.  Tested on Power 8BE, Power9, Power10.

Version 2: Updated ChangeLog comments.  Fixed up comments in the test
program.  Re-tested the patch on Power 9 with no regression errors.

                 Carl

-----------------------------------------------------------

gcc/ChangeLog

2021-01-12  Carl Love

	* config/rs6000/dfp.md (floattitd2, fixtdti2): New define_insns.
	* config/rs6000/rs6000-call.c (P10V_BUILTIN_VCMPNET_P,
	P10V_BUILTIN_VCMPAET_P): New overloaded definitions.

gcc/testsuite/ChangeLog

2021-01-12  Carl Love

	* gcc.target/powerpc/int_128bit-runnable.c: Add 128-bit DFP
	conversion tests.
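
Note on the test constants: the conversion tests in this patch compare
against the decimal128 bit pattern 0x2208000000000000 (upper half) and 0x4
(lower half) for the integer 4.  The sketch below is only an illustration
of where that pattern comes from, assuming the IEEE 754-2008 decimal128
DPD layout (1 sign bit, 5-bit combination field, 12-bit exponent
continuation, exponent bias 6176).  The names d128_bits and
dfp128_bits_for_digit are hypothetical and are not part of the patch or of
GCC.  It is restricted to single digits 0..9, where the densely packed
declet equals the digit itself.

```c
#include <stdint.h>

/* Hypothetical helper type, not part of the patch: the two 64-bit halves
   of a decimal128 value.  */
struct d128_bits
{
  uint64_t hi;
  uint64_t lo;
};

/* Build the decimal128 (DPD) encoding of a single digit 0..9 with a
   decimal exponent of 0.  Assumes the IEEE 754-2008 layout.  */
static struct d128_bits
dfp128_bits_for_digit (unsigned digit)
{
  const uint64_t bias = 6176;              /* decimal128 exponent bias */
  uint64_t biased_exp = bias + 0;          /* exponent 0 */
  uint64_t exp_top2 = biased_exp >> 12;    /* top 2 of the 14 exponent bits */
  uint64_t exp_cont = biased_exp & 0xFFF;  /* low 12 exponent bits */
  uint64_t msd = 0;                        /* leading coefficient digit */
  uint64_t comb = (exp_top2 << 3) | msd;   /* 5-bit combination field */
  struct d128_bits r;

  r.hi = (comb << 58) | (exp_cont << 46);  /* sign bit (bit 63) stays 0 */
  r.lo = digit;                            /* declet for 00d is just d */
  return r;
}
```

Calling dfp128_bits_for_digit (4) yields hi = 0x2208000000000000 and
lo = 0x4, which are the two halves the test loads into conv.u128.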
--- gcc/config/rs6000/dfp.md | 14 +++++ .../gcc.target/powerpc/int_128bit-runnable.c | 61 +++++++++++++++++++ 2 files changed, 75 insertions(+) diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md index c8cdb645865..876ab2ed682 100644 --- a/gcc/config/rs6000/dfp.md +++ b/gcc/config/rs6000/dfp.md @@ -222,6 +222,13 @@ "dcffixq %0,%1" [(set_attr "type" "dfp")]) +(define_insn "floattitd2" + [(set (match_operand:TD 0 "gpc_reg_operand" "=d") + (float:TD (match_operand:TI 1 "gpc_reg_operand" "v")))] + "TARGET_POWER10" + "dcffixqq %0,%1" + [(set_attr "type" "dfp")]) + ;; Convert a decimal64/128 to a decimal64/128 whose value is an integer. ;; This is the first stage of converting it to an integer type. @@ -241,6 +248,13 @@ "TARGET_DFP" "dctfix %0,%1" [(set_attr "type" "dfp")]) + +(define_insn "fixtdti2" + [(set (match_operand:TI 0 "gpc_reg_operand" "=v") + (fix:TI (match_operand:TD 1 "gpc_reg_operand" "d")))] + "TARGET_POWER10" + "dctfixqq %0,%1" + [(set_attr "type" "dfp")]) ;; Decimal builtin support diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c index 3f8892b39d6..42cb91c7ba9 100644 --- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c @@ -38,6 +38,7 @@ #if DEBUG #include #include +#include void print_i128(__int128_t val) @@ -59,6 +60,13 @@ int main () __int128_t arg1, result; __uint128_t uarg2; + _Decimal128 arg1_dfp128, result_dfp128, expected_result_dfp128; + + struct conv_t { + __uint128_t u128; + _Decimal128 d128; + } conv, conv2; + vector signed long long int vec_arg1_di, vec_arg2_di; vector signed long long int vec_result_di, vec_expected_result_di; vector unsigned long long int vec_uarg1_di, vec_uarg2_di, vec_uarg3_di; @@ -2296,6 +2304,59 @@ int main () abort(); #endif } + + /* DFP to __int128 and __int128 to DFP conversions */ + /* Print the DFP value as an unsigned int so we can see the bit patterns. 
*/ + conv.u128 = 0x2208000000000000ULL; + conv.u128 = (conv.u128 << 64) | 0x4ULL; //DFP bit pattern for integer 4 + expected_result_dfp128 = conv.d128; + arg1 = 4; + + conv.d128 = (_Decimal128) arg1; + + result_dfp128 = (_Decimal128) arg1; + if (((conv.u128 >>64) != 0x2208000000000000ULL) && + ((conv.u128 & 0xFFFFFFFFFFFFFFFF) != 0x4ULL)) { +#if DEBUG + printf("ERROR: convert int128 value "); + print_i128 (arg1); + conv.d128 = result_dfp128; + printf("\nto DFP value 0x%llx %llx (printed as hex bit string) ", + (unsigned long long)((conv.u128) >>64), + (unsigned long long)((conv.u128) & 0xFFFFFFFFFFFFFFFF)); + + conv.d128 = expected_result_dfp128; + printf("\ndoes not match expected_result = 0x%llx %llx\n\n", + (unsigned long long) (conv.u128>>64), + (unsigned long long) (conv.u128 & 0xFFFFFFFFFFFFFFFF)); +#else + abort(); +#endif + } + + expected_result = 4; + + conv.u128 = 0x2208000000000000ULL; + conv.u128 = (conv.u128 << 64) | 0x4ULL; // 4 as DFP + arg1_dfp128 = conv.d128; + + result = (__int128_t) arg1_dfp128; + + if (result != expected_result) { +#if DEBUG + printf("ERROR: convert DFP value "); + printf("0x%llx %llx (printed as hex bit string) ", + (unsigned long long)(conv.u128>>64), + (unsigned long long)(conv.u128 & 0xFFFFFFFFFFFFFFFF)); + printf("to __int128 value = "); + print_i128 (result); + printf("\ndoes not match expected_result = "); + print_i128 (expected_result); + printf("\n"); +#else + abort(); +#endif + } return 0; } From patchwork Tue Jan 19 22:33:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Carl Love X-Patchwork-Id: 1428913 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; 
dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=TxCzVmLH; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DL3NN2vM9z9sRR for ; Wed, 20 Jan 2021 09:33:56 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5400739450E8; Tue, 19 Jan 2021 22:33:48 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5400739450E8 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1611095628; bh=da+P3+9yRaZYanhCfpsdtNrkGk1rPfK6cu7bxl0SrX8=; h=Subject:To:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=TxCzVmLHN2rWq737PG/qJSn5qLveSwj+xIVaD6lKwZtSckOY/FEt+S7AjiXUUk/1E HF0ydgBL9mklbWBZfBbRJgWnsji3GFZBeSSNZVIb7Vxf09hnDKFzrNvp2R9RqKDS81 8kEQrrG6wcI67hl57mFx06ghJGA3uldarXTCeVlg= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 5739E39450CE for ; Tue, 19 Jan 2021 22:33:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 5739E39450CE Received: from pps.filterd (m0098393.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 10JM9mN3043831; Tue, 19 Jan 2021 17:33:44 -0500 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3667yg0hgn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 19 Jan 2021 17:33:44 -0500 Received: from m0098393.ppops.net (m0098393.ppops.net [127.0.0.1]) by pps.reinject 
(8.16.0.36/8.16.0.36) with SMTP id 10JMQ2Eq112627; Tue, 19 Jan 2021 17:33:44 -0500 Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com [169.63.214.131]) by mx0a-001b2d01.pphosted.com with ESMTP id 3667yg0hga-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 19 Jan 2021 17:33:43 -0500 Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1]) by ppma01dal.us.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 10JMWS0v022053; Tue, 19 Jan 2021 22:33:43 GMT Received: from b01cxnp23034.gho.pok.ibm.com (b01cxnp23034.gho.pok.ibm.com [9.57.198.29]) by ppma01dal.us.ibm.com with ESMTP id 363qs9qvt6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 19 Jan 2021 22:33:43 +0000 Received: from b01ledav003.gho.pok.ibm.com (b01ledav003.gho.pok.ibm.com [9.57.199.108]) by b01cxnp23034.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 10JMXgAM45547988 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 19 Jan 2021 22:33:42 GMT Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 34D95B2078; Tue, 19 Jan 2021 22:33:42 +0000 (GMT) Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4F01BB2077; Tue, 19 Jan 2021 22:33:41 +0000 (GMT) Received: from li-e362e14c-2378-11b2-a85c-87d605f3c641.ibm.com (unknown [9.163.70.85]) by b01ledav003.gho.pok.ibm.com (Postfix) with ESMTP; Tue, 19 Jan 2021 22:33:41 +0000 (GMT) Message-ID: <1822d31054984200a3349f0669182557212a56b2.camel@us.ibm.com> Subject: [PATCH 5/6 ver 3] rs6000, Add test 128-bit shifts for just the int128 type. 
To: Segher Boessenkool , will schmidt , cel@us.ibm.com Date: Tue, 19 Jan 2021 14:33:40 -0800 In-Reply-To: <20201013002313.GV2672@gate.crashing.org> References: <815d6b091f4b8bf3ab7c7e203c41d03c6c0e8d81.camel@us.ibm.com> <8acbb7bc3964944154491037884523c94ac3bdb1.camel@us.ibm.com> <384c17c8b764c850f8a9a08e963ed34ec89de28b.camel@vnet.ibm.com> <82b546ae55356938b9002ca4a9d0d4eb62961dae.camel@vnet.ibm.com> <20201013002313.GV2672@gate.crashing.org> X-Mailer: Evolution 3.28.5 (3.28.5-12.el8) Mime-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.343, 18.0.737 definitions=2021-01-19_12:2021-01-18, 2021-01-19 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 lowpriorityscore=0 mlxlogscore=999 adultscore=0 spamscore=0 phishscore=0 priorityscore=1501 clxscore=1015 suspectscore=0 impostorscore=0 malwarescore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2101190117 X-Spam-Status: No, score=-12.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Carl Love via Gcc-patches From: Carl Love Reply-To: Carl Love Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Will, Segher: Patch 4 adds the vector 128-bit integer shift instruction support for the V1TI type. This patch also renames and moves the VSX_TI iterator from vsx.md to VEC_TI in vector.md. The uses of VEC_TI are also updated. 
version 3: No additional functional changes. Tested on Power 8BE, Power 9, Power 10. version 2: Re-tested the patch on Power 9 with no regression errors. Carl Love -------------------------------------------------------- gcc/ChangeLog 2021-01-12 Carl Love * config/rs6000/altivec.md (altivec_vslq, altivec_vsrq): Rename to altivec_vslq_<mode>, altivec_vsrq_<mode>, mode VEC_TI. * config/rs6000/vector.md (VEC_TI): Was named VSX_TI in vsx.md. (vashlv1ti3): Change to vashl<mode>3, mode VEC_TI. (vlshrv1ti3): Change to vlshr<mode>3, mode VEC_TI. * config/rs6000/vsx.md (VSX_TI): Remove define_mode_iterator. Update uses of VSX_TI to VEC_TI. gcc/testsuite/ChangeLog 2021-01-12 Carl Love * gcc.target/powerpc/int_128bit-runnable.c: Add shift_right, shift_left tests. --- gcc/config/rs6000/altivec.md | 16 ++++----- gcc/config/rs6000/vector.md | 27 ++++++++------- gcc/config/rs6000/vsx.md | 33 +++++++++---------- .../gcc.target/powerpc/int_128bit-runnable.c | 16 +++++++-- 4 files changed, 52 insertions(+), 40 deletions(-) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index cb83c5ce012..61ab5c9afb6 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -2221,10 +2221,10 @@ "vsl %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "altivec_vslq" - [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") - (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") - (match_operand:V1TI 2 "vsx_register_operand" "v")))] +(define_insn "altivec_vslq_<mode>" + [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v") + (ashift:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand" "v") + (match_operand:VEC_TI 2 "vsx_register_operand" "v")))] "TARGET_POWER10" /* Shift amount in needs to be in bits[57:63] of 128-bit operand.
*/ "vslq %0,%1,%2" @@ -2238,10 +2238,10 @@ "vsr %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "altivec_vsrq" - [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") - (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") - (match_operand:V1TI 2 "vsx_register_operand" "v")))] +(define_insn "altivec_vsrq_<mode>" + [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v") + (lshiftrt:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand" "v") + (match_operand:VEC_TI 2 "vsx_register_operand" "v")))] "TARGET_POWER10" /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */ "vsrq %0,%1,%2" diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md index 0f252c915b0..6a4cd69d866 100644 --- a/gcc/config/rs6000/vector.md +++ b/gcc/config/rs6000/vector.md @@ -26,6 +26,9 @@ ;; Vector int modes (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI]) +;; 128-bit int modes +(define_mode_iterator VEC_TI [V1TI TI]) + ;; Vector int modes for parity (define_mode_iterator VEC_IP [V8HI V4SI @@ -1627,17 +1630,17 @@ "") ;; No immediate version of this 128-bit instruction -(define_expand "vashlv1ti3" - [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") - (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") - (match_operand:V1TI 2 "vsx_register_operand" "v")))] +(define_expand "vashl<mode>3" + [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v") + (ashift:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand") + (match_operand:VEC_TI 2 "vsx_register_operand")))] "TARGET_POWER10" { /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2.
*/ - rtx tmp = gen_reg_rtx (V1TImode); + rtx tmp = gen_reg_rtx (<MODE>mode); emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); - emit_insn (gen_altivec_vslq (operands[0], operands[1], tmp)); + emit_insn(gen_altivec_vslq_<mode> (operands[0], operands[1], tmp)); DONE; }) @@ -1650,17 +1653,17 @@ "") ;; No immediate version of this 128-bit instruction -(define_expand "vlshrv1ti3" - [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") - (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") - (match_operand:V1TI 2 "vsx_register_operand" "v")))] +(define_expand "vlshr<mode>3" + [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v") + (lshiftrt:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand") + (match_operand:VEC_TI 2 "vsx_register_operand")))] "TARGET_POWER10" { /* Shift amount in needs to be put into bits[57:63] of 128-bit operand2. */ - rtx tmp = gen_reg_rtx (V1TImode); + rtx tmp = gen_reg_rtx (<MODE>mode); emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); - emit_insn (gen_altivec_vsrq (operands[0], operands[1], tmp)); + emit_insn(gen_altivec_vsrq_<mode> (operands[0], operands[1], tmp)); DONE; }) diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index fd779435390..e5db5793488 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -37,9 +37,6 @@ TI V1TI]) -;; Iterator for 128-bit integer types that go in a single vector register. -(define_mode_iterator VSX_TI [TI V1TI]) - ;; Iterator for the 2 32-bit vector types (define_mode_iterator VSX_W [V4SF V4SI]) @@ -946,9 +943,9 @@ ;; special V1TI container class, which it is not appropriate to use vec_select ;; for the type.
(define_insn "*vsx_le_permute_<mode>" - [(set (match_operand:VSX_TI 0 "nonimmediate_operand" "=wa,wa,Z,&r,&r,Q") - (rotate:VSX_TI - (match_operand:VSX_TI 1 "input_operand" "wa,Z,wa,r,Q,r") + [(set (match_operand:VEC_TI 0 "nonimmediate_operand" "=wa,wa,Z,&r,&r,Q") + (rotate:VEC_TI + (match_operand:VEC_TI 1 "input_operand" "wa,Z,wa,r,Q,r") (const_int 64)))] "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR" "@ @@ -962,10 +959,10 @@ (set_attr "type" "vecperm,vecload,vecstore,*,load,store")]) (define_insn_and_split "*vsx_le_undo_permute_<mode>" - [(set (match_operand:VSX_TI 0 "vsx_register_operand" "=wa,wa") - (rotate:VSX_TI - (rotate:VSX_TI - (match_operand:VSX_TI 1 "vsx_register_operand" "0,wa") + [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=wa,wa") + (rotate:VEC_TI + (rotate:VEC_TI + (match_operand:VEC_TI 1 "vsx_register_operand" "0,wa") (const_int 64)) (const_int 64)))] "!BYTES_BIG_ENDIAN && TARGET_VSX" @@ -1033,11 +1030,11 @@ ;; Peepholes to catch loads and stores for TImode if TImode landed in ;; GPR registers on a little endian system.
(define_peephole2 - [(set (match_operand:VSX_TI 0 "int_reg_operand") - (rotate:VSX_TI (match_operand:VSX_TI 1 "memory_operand") + [(set (match_operand:VEC_TI 0 "int_reg_operand") + (rotate:VEC_TI (match_operand:VEC_TI 1 "memory_operand") (const_int 64))) - (set (match_operand:VSX_TI 2 "int_reg_operand") - (rotate:VSX_TI (match_dup 0) + (set (match_operand:VEC_TI 2 "int_reg_operand") + (rotate:VEC_TI (match_dup 0) (const_int 64)))] "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && (rtx_equal_p (operands[0], operands[2]) @@ -1045,11 +1042,11 @@ [(set (match_dup 2) (match_dup 1))]) (define_peephole2 - [(set (match_operand:VSX_TI 0 "int_reg_operand") - (rotate:VSX_TI (match_operand:VSX_TI 1 "int_reg_operand") + [(set (match_operand:VEC_TI 0 "int_reg_operand") + (rotate:VEC_TI (match_operand:VEC_TI 1 "int_reg_operand") (const_int 64))) - (set (match_operand:VSX_TI 2 "memory_operand") - (rotate:VSX_TI (match_dup 0) + (set (match_operand:VEC_TI 2 "memory_operand") + (rotate:VEC_TI (match_dup 0) (const_int 64)))] "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && peep2_reg_dead_p (2, operands[0])" diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c index 42cb91c7ba9..953c23ec046 100644 --- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c @@ -53,6 +53,18 @@ void print_i128(__int128_t val) void abort (void); +__attribute__((noinline)) +__int128_t shift_right (__int128_t a, __uint128_t b) +{ + return a >> b; +} + +__attribute__((noinline)) +__int128_t shift_left (__int128_t a, __uint128_t b) +{ + return a << b; +} + int main () { int i, result_int; @@ -142,7 +154,7 @@ int main () #endif } - arg1 = 3; + arg1 = vec_result[0]; uarg2 = 4; expected_result = arg1*16; @@ -226,7 +238,7 @@ int main () #endif } - arg1 = 48; + arg1 = vec_uresult[0]; uarg2 = 4; expected_result = arg1/16; From patchwork Tue Jan 19 22:33:44 2021 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Carl Love X-Patchwork-Id: 1428915 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=nPIh2an2; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DL3NZ4Rl0z9sWF for ; Wed, 20 Jan 2021 09:34:06 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BA8F239450EC; Tue, 19 Jan 2021 22:33:55 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BA8F239450EC DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1611095635; bh=CDVa+gFC4sa8gvPEY7b7Ago72T8l26i2YqMD3+u2qTk=; h=Subject:To:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=nPIh2an28QhxRRbUz5OrgolVhJKltqm1d4nmKQpG1S2Io2AhZBrdni6VHTUxL4FGs kV/7scA6xZzRfENJwd7RHoiwfkEvpdwgTrwPsQ6RA3nSjdFeIkxRcMCKxDxQDDyTCJ WkrBU6L7WWqaIl/oEy9x3Wjby2qYQ0OdbXAMZRzI= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 029DC38708D6 for ; Tue, 19 Jan 2021 22:33:49 +0000 (GMT) DMARC-Filter: OpenDMARC 
Filter v1.3.2 sourceware.org 029DC38708D6 Received: from pps.filterd (m0187473.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 10JMVaML101045; Tue, 19 Jan 2021 17:33:49 -0500 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3667x88k8q-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 19 Jan 2021 17:33:48 -0500 Received: from m0187473.ppops.net (m0187473.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 10JMVsQP101812; Tue, 19 Jan 2021 17:33:48 -0500 Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com [169.63.214.131]) by mx0a-001b2d01.pphosted.com with ESMTP id 3667x88k8d-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 19 Jan 2021 17:33:48 -0500 Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1]) by ppma01dal.us.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 10JMWSEq022050; Tue, 19 Jan 2021 22:33:47 GMT Received: from b03cxnp08026.gho.boulder.ibm.com (b03cxnp08026.gho.boulder.ibm.com [9.17.130.18]) by ppma01dal.us.ibm.com with ESMTP id 363qs9qvtc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 19 Jan 2021 22:33:47 +0000 Received: from b03ledav002.gho.boulder.ibm.com (b03ledav002.gho.boulder.ibm.com [9.17.130.233]) by b03cxnp08026.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 10JMXjVY19726736 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 19 Jan 2021 22:33:45 GMT Received: from b03ledav002.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 70D5C136063; Tue, 19 Jan 2021 22:33:45 +0000 (GMT) Received: from b03ledav002.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C2AA1136068; Tue, 19 Jan 2021 22:33:44 +0000 (GMT) Received: from li-e362e14c-2378-11b2-a85c-87d605f3c641.ibm.com (unknown [9.163.70.85]) by b03ledav002.gho.boulder.ibm.com 
(Postfix) with ESMTP; Tue, 19 Jan 2021 22:33:44 +0000 (GMT) Message-ID: Subject: [PATCH 6/6 ver 3] Conversions between 128-bit integer and floating point values. To: Segher Boessenkool , will schmidt , cel@us.ibm.com, Michael Meissner Date: Tue, 19 Jan 2021 14:33:44 -0800 In-Reply-To: <20201013002313.GV2672@gate.crashing.org> References: <815d6b091f4b8bf3ab7c7e203c41d03c6c0e8d81.camel@us.ibm.com> <8acbb7bc3964944154491037884523c94ac3bdb1.camel@us.ibm.com> <384c17c8b764c850f8a9a08e963ed34ec89de28b.camel@vnet.ibm.com> <82b546ae55356938b9002ca4a9d0d4eb62961dae.camel@vnet.ibm.com> <20201013002313.GV2672@gate.crashing.org> X-Mailer: Evolution 3.28.5 (3.28.5-12.el8) X-TM-AS-GCONF: 00 X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.343, 18.0.737 definitions=2021-01-19_12:2021-01-18, 2021-01-19 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 malwarescore=0 impostorscore=0 suspectscore=0 mlxlogscore=999 bulkscore=0 priorityscore=1501 mlxscore=0 clxscore=1015 phishscore=0 spamscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2101190117 X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Carl Love via Gcc-patches From: Carl Love Reply-To: Carl Love Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Will, Segher: This patch 
adds support for converting between 128-bit integers and IEEE 128-bit floating point values using the new P10 instructions xscvsqqp, xscvuqqp, xscvqpsqz, and xscvqpuqz. The new instructions are only used on P10 HW; otherwise the conversions continue to use the existing SW routines. The files fixkfti-sw.c and fixunskfti-sw.c are renamed versions of fixkfti.c and fixunskfti.c respectively. The function names in the files were updated to match the rename, along with some whitespace fixes. version 3: Numerous changes with help/input from Michael Meissner. Add assembler checks for the 128-bit conversion instructions, see configure and configure.ac. Add the libgcc resolvers to select sw or hw support for the conversions. Rename and rewrite the existing conversion files (fixkfti.c, fixunskfti.c, floattikf.c, floatuntikf.c) to create the sw conversion files. Tested on Power 8BE, Power9, Power10. version 2: Fixed a typo in the ChangeLog noted by Will. Removed the target ppc_native_128bit from the test case as we no longer have the 128-bit flag. Re-tested the patch on Power 9 with no regression errors. Carl Love ---------------------------------------------------- gcc/ChangeLog 2021-01-15 Carl Love * config/rs6000/rs6000.md (floatti<mode>2, floatunsti<mode>2, fix_trunc<mode>ti2, fixuns_trunc<mode>ti2): Add define_insn for mode IEEE 128. gcc/testsuite/ChangeLog 2021-01-15 Carl Love * gcc.target/powerpc/fp128_conversions.c: New file. * gcc.target/powerpc/int_128bit-runnable.c (vextsd2ppc_native_128bitq, vcmpuq, vcmpsq, vcmpequq, vcmpequq., vcmpgtsq, vcmpgtsq., vcmpgtuq, vcmpgtuq.): Update scan-assembler-times. (ppc_native_128bit): Remove dg-require-effective-target. libgcc/ChangeLog 2021-01-15 Carl Love * config.host: Add if test and set for libgcc_cv_powerpc_3_1_float128_hw. * libgcc/config/rs6000/fixkfti.c: Renamed to fixkfti-sw.c. Change calls of __fixkfti to __fixkfti_sw. * libgcc/config/rs6000/fixunskfti.c: Renamed to fixunskfti-sw.c. Change calls of __fixunskfti to __fixunskfti_sw.
* libgcc/config/rs6000/float128-p10.c (__floattikf_hw, __floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw): New file. * libgcc/config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1): New macro. (__floattikf_resolve, __floatuntikf_resolve, __fixkfti_resolve, __fixunskfti_resolve): Add resolve functions. (__floattikf, __floatuntikf, __fixkfti, __fixunskfti): New functions. * libgcc/config/rs6000/float128-sed (__floattitf, __floatuntitf, __fixtfti, __fixunstfti): Add editor commands to change names. * libgcc/config/rs6000/float128-sed-hw (__floattitf, __floatuntitf, __fixtfti, __fixunstfti): Add editor commands to change names. * libgcc/config/rs6000/floattikf.c: Renamed to floattikf-sw.c. * libgcc/config/rs6000/floatuntikf.c: Renamed to floatuntikf-sw.c. * libgcc/config/rs6000/quad-float128.h (__floattikf_sw, __floatuntikf_sw, __fixkfti_sw, __fixunskfti_sw, __floattikf_hw, __floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw, __floattikf, __floatuntikf, __fixkfti, __fixunskfti): New extern declarations. * libgcc/config/rs6000/t-float128 (floattikf, floatuntikf, fixkfti, fixunskfti): Remove file names from fp128_ppc_funcs. (floattikf-sw, floatuntikf-sw, fixkfti-sw, fixunskfti-sw): Add file names to fp128_ppc_funcs. * libgcc/config/rs6000/t-float128-hw (fp128_3_1_hw_funcs, fp128_3_1_hw_src, fp128_3_1_hw_static_obj, fp128_3_1_hw_shared_obj, fp128_3_1_hw_obj): Add variables for ISA 3.1 support. * libgcc/config/rs6000/t-float128-p10-hw: New file. * configure: Update script for ISA 3.1 128-bit float support. * configure.ac: Add check for 128-bit float hardware support.
--- gcc/config/rs6000/rs6000.md | 36 +++ .../gcc.target/powerpc/fp128_conversions.c | 294 ++++++++++++++++++ .../gcc.target/powerpc/int_128bit-runnable.c | 14 +- libgcc/config.host | 4 + .../config/rs6000/{fixkfti.c => fixkfti-sw.c} | 4 +- .../rs6000/{fixunskfti.c => fixunskfti-sw.c} | 4 +- libgcc/config/rs6000/float128-ifunc.c | 44 ++- libgcc/config/rs6000/float128-p10.c | 71 +++++ libgcc/config/rs6000/float128-sed | 4 + libgcc/config/rs6000/float128-sed-hw | 4 + .../rs6000/{floattikf.c => floattikf-sw.c} | 4 +- .../{floatuntikf.c => floatuntikf-sw.c} | 4 +- libgcc/config/rs6000/quad-float128.h | 17 +- libgcc/config/rs6000/t-float128 | 12 +- libgcc/config/rs6000/t-float128-hw | 16 + libgcc/config/rs6000/t-float128-p10-hw | 24 ++ libgcc/configure | 39 ++- libgcc/configure.ac | 25 ++ 18 files changed, 586 insertions(+), 34 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/fp128_conversions.c rename libgcc/config/rs6000/{fixkfti.c => fixkfti-sw.c} (96%) rename libgcc/config/rs6000/{fixunskfti.c => fixunskfti-sw.c} (96%) create mode 100644 libgcc/config/rs6000/float128-p10.c rename libgcc/config/rs6000/{floattikf.c => floattikf-sw.c} (96%) rename libgcc/config/rs6000/{floatuntikf.c => floatuntikf-sw.c} (96%) create mode 100644 libgcc/config/rs6000/t-float128-p10-hw diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index bb9fb42f82a..096c874f6e4 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -6390,6 +6390,42 @@ xscvsxddp %x0,%x1" [(set_attr "type" "fp")]) +(define_insn "floatti<mode>2" + [(set (match_operand:IEEE128 0 "vsx_register_operand" "=v") + (float:IEEE128 (match_operand:TI 1 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + return "xscvsqqp %0,%1"; +} + [(set_attr "type" "fp")]) + +(define_insn "floatunsti<mode>2" + [(set (match_operand:IEEE128 0 "vsx_register_operand" "=v") + (unsigned_float:IEEE128 (match_operand:TI 1 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + return "xscvuqqp %0,%1"; +}
+ [(set_attr "type" "fp")]) + +(define_insn "fix_trunc<mode>ti2" + [(set (match_operand:TI 0 "vsx_register_operand" "=v") + (fix:TI (match_operand:IEEE128 1 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + return "xscvqpsqz %0,%1"; +} + [(set_attr "type" "fp")]) + +(define_insn "fixuns_trunc<mode>ti2" + [(set (match_operand:TI 0 "vsx_register_operand" "=v") + (unsigned_fix:TI (match_operand:IEEE128 1 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + return "xscvqpuqz %0,%1"; +} + [(set_attr "type" "fp")]) + ; Allow the combiner to merge source memory operands to the conversion so that ; the optimizer/register allocator doesn't try to load the value too early in a ; GPR and then use store/load to move it to a FPR and suffer from a store-load diff --git a/gcc/testsuite/gcc.target/powerpc/fp128_conversions.c b/gcc/testsuite/gcc.target/powerpc/fp128_conversions.c new file mode 100644 index 00000000000..c20282fa0e0 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/fp128_conversions.c @@ -0,0 +1,294 @@ +/* { dg-do run } */ +/* { dg-require-effective-target power10_hw } */ +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */ + +/* Check that the expected 128-bit instructions are generated if the processor + supports the 128-bit integer instructions.
*/ +/* { dg-final { scan-assembler-times {\mxscvsqqp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxscvuqqp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxscvqpsqz\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxscvqpuqz\M} 1 } } */ + +#include +#include +#include +#include +#include + +#define DEBUG 0 + +void +abort (void); + +float +conv_i_2_fp( long long int a) +{ + return (float) a; +} + +double +conv_i_2_fpd( long long int a) +{ + return (double) a; +} + +double +conv_ui_2_fpd( unsigned long long int a) +{ + return (double) a; +} + +__float128 +conv_i128_2_fp128 (__int128_t a) +{ + // default, gen inst KF mode + // -mabi=ibmlongdouble, gen inst floattiieee KF mode + // -mabi=ieeelongdouble gen inst floattiieee TF mode + return (__float128) a; +} + +__float128 +conv_ui128_2_fp128 (__uint128_t a) +{ + // default, gen inst KF mode + // -mabi=ibmlongdouble, gen inst floattiieee KF mode + // -mabi=ieeelongdouble gen inst floattiieee TF mode + return (__float128) a; +} + +__int128_t +conv_fp128_2_i128 (__float128 a) +{ + // default, gen inst KF mode + // -mabi=ibmlongdouble, gen inst floattiieee KF mode + // -mabi=ieeelongdouble gen inst floattiieee TF mode + return (__int128_t) a; +} + +__uint128_t +conv_fp128_2_ui128 (__float128 a) +{ + // default, gen inst KF mode + // -mabi=ibmlongdouble, gen inst floattiieee KF mode + // -mabi=ieeelongdouble gen inst floattiieee TF mode + return (__uint128_t) a; +} + +long double +conv_i128_2_ld (__int128_t a) +{ + // default, gen call __floattitf + // -mabi=ibmlongdouble, gen call __floattitf + // -mabi=ieeelongdouble gen inst floattiieee TF mode + return (long double) a; +} + +__ibm128 +conv_i128_2_ibm128 (__int128_t a) +{ + // default, gen call __floattitf + // -mabi=ibmlongdouble, gen call __floattitf + // -mabi=ieeelongdouble, message uses IBM long double, no binary output + return (__ibm128) a; +} + +int +main() +{ + float a, expected_result_float; + double b, expected_result_double; + long long int c, 
expected_result_llint; + unsigned long long int u; + __int128_t d; + __uint128_t u128; + unsigned long long expected_result_uint128[2] ; + __float128 e; + long double ld; // another 128-bit float version + + union conv_t { + float a; + double b; + long long int c; + long long int128[2] ; + unsigned long long uint128[2] ; + unsigned long long int u; + __int128_t d; + __uint128_t u128; + __float128 e; + long double ld; // another 128-bit float version + } conv, conv_result; + + c = 20; + expected_result_llint = 20.00000; + a = conv_i_2_fp (c); + + if (a != expected_result_llint) { +#if DEBUG + printf("ERROR: conv_i_2_fp(%lld) = %10.5f\n", c, a); + printf("\n does not match expected_result = %10.5f\n\n", + (double) expected_result_llint); +#else + abort(); +#endif + } + + c = 20; + expected_result_double = 20.00000; + b = conv_i_2_fpd (c); + + if (b != expected_result_double) { +#if DEBUG + printf("ERROR: conv_i_2_fpd(%lld) = %10.5f\n", c, b); + printf("\n does not match expected_result = %10.5f\n\n", + expected_result_double); + #else + abort(); +#endif + } + + u = 20; + expected_result_double = 20.00000; + b = conv_ui_2_fpd (u); + + if (b != expected_result_double) { +#if DEBUG + printf("ERROR: conv_ui_2_fpd(%llu) = %10.5f\n", u, b); + printf("\n does not match expected_result = %10.5f\n\n", + expected_result_double); + #else + abort(); +#endif + } + + d = -3210; + d = (d * 10000000000) + 9876543210; + conv_result.e = conv_i128_2_fp128 (d); + expected_result_uint128[1] = 0xc02bd2f9068d1160; + expected_result_uint128[0] = 0x0; + + if ((conv_result.uint128[1] != expected_result_uint128[1]) + && (conv_result.uint128[0] != expected_result_uint128[0])) { +#if DEBUG + printf("ERROR: conv_i128_2_fp128(-32090123456790) = (result in hex) 0x%llx %llx\n", + conv_result.uint128[1], conv_result.uint128[0]); + printf("\n does not match expected_result = (result in hex) 0x%llx %llx\n\n", + expected_result_uint128[1], expected_result_uint128[0]); + #else + abort(); +#endif + } + + d = 123; + d = (d * 
10000000000) + 1234567890; + conv_result.ld = conv_i128_2_fp128 (d); + expected_result_uint128[1] = 0x0; + expected_result_uint128[0] = 0x4271eab4c8ed2000; + + if ((conv_result.uint128[1] != expected_result_uint128[1]) + && (conv_result.uint128[0] != expected_result_uint128[0])) { +#if DEBUG + printf("ERROR: conv_i128_2_fp128(1231234567890) = (result in hex) 0x%llx %llx\n", + conv.uint128[1], conv.uint128[0]); + printf("\n does not match expected_result = (result in hex) 0x%llx %llx\n\n", + expected_result_uint128[1], expected_result_uint128[0]); + #else + abort(); +#endif + } + + u128 = 8760; + u128 = (u128 * 10000000000) + 1234567890; + conv_result.e = conv_ui128_2_fp128 (u128); + expected_result_uint128[1] = 0x402d3eb101df8b48; + expected_result_uint128[0] = 0x0; + + if ((conv_result.uint128[1] != expected_result_uint128[1]) + && (conv_result.uint128[0] != expected_result_uint128[0])) { +#if DEBUG + printf("ERROR: conv_ui128_2_fp128(87601234567890) = (result in hex) 0x%llx %llx\n", + conv.uint128[1], conv.uint128[0]); + printf("\n does not match expected_result = (result in hex) 0x%llx %llx\n\n", + expected_result_uint128[1], expected_result_uint128[0]); + #else + abort(); +#endif + } + + u128 = 3210; + u128 = (u128 * 10000000000) + 9876543210; + expected_result_uint128[1] = 0x402bd3429c8feea0; + expected_result_uint128[0] = 0x0; + conv_result.e = conv_ui128_2_fp128 (u128); + + if ((conv_result.uint128[1] != expected_result_uint128[1]) + && (conv_result.uint128[0] != expected_result_uint128[0])) { +#if DEBUG + printf("ERROR: conv_ui128_2_fp128(32109876543210) = (result in hex) 0x%llx %llx\n", + conv.uint128[1], conv.uint128[0]); + printf("\n does not match expected_result = (result in hex) 0x%llx %llx\n\n", + expected_result_uint128[1], expected_result_uint128[0]); + #else + abort(); +#endif + } + + conv.e = 12345.6789; + expected_result_uint128[1] = 0x1407374883526960; + expected_result_uint128[0] = 0x3039; + + conv_result.d = conv_fp128_2_i128 (conv.e); + + if 
((conv_result.uint128[1] != expected_result_uint128[1]) + && (conv_result.uint128[0] != expected_result_uint128[0])) { +#if DEBUG + printf("ERROR: conv_fp128_2_i128(0x%llx %llx) = ", + conv.uint128[1], conv.uint128[0]); + printf("0x%llx %llx\n", conv_result.uint128[1], conv_result.uint128[0]); + + printf("\n does not match expected_result = (result in hex) 0x%llx %llx\n\n", + expected_result_uint128[1], expected_result_uint128[0]); + #else + abort(); +#endif + } + + conv.e = -6789.12345; + expected_result_uint128[1] = 0x0; + expected_result_uint128[0] = 0xffffffffffffe57b; + conv_result.d = conv_fp128_2_i128 (conv.e); + + if ((conv_result.uint128[1] != expected_result_uint128[1]) + && (conv_result.uint128[0] != expected_result_uint128[0])) { +#if DEBUG + printf("ERROR: conv_fp128_2_i128(0x%llx %llx) = ", + conv.uint128[1], conv.uint128[0]); + printf("0x%llx %llx\n", conv_result.uint128[1], conv_result.uint128[0]); + + printf("\n does not match expected_result = (result in hex) 0x%llx %llx\n\n", + expected_result_uint128[1], expected_result_uint128[0]); + #else + abort(); +#endif + } + + conv.e = 6789.12345; + expected_result_uint128[1] = 0x0; + expected_result_uint128[0] = 0x1a85; + conv_result.d = conv_fp128_2_ui128 (conv.e); + + if ((conv_result.uint128[1] != expected_result_uint128[1]) + && (conv_result.uint128[0] != expected_result_uint128[0])) { +#if DEBUG + printf("ERROR: conv_fp128_2_ui128(0x%llx %llx) = ", + conv.uint128[1], conv.uint128[0]); + printf("0x%llx %llx\n", conv_result.uint128[1], conv_result.uint128[0]); + + printf("\n does not match expected_result = (result in hex) 0x%llx %llx\n\n", + expected_result_uint128[1], expected_result_uint128[0]); + #else + abort(); +#endif + } + + return 0; +} diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c index 953c23ec046..5c3aa4fa5ca 100644 --- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c +++ 
b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c @@ -4,22 +4,16 @@ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ -/* { dg-final { scan-assembler-times {\mvextsd2q\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 6 } } */ /* { dg-final { scan-assembler-times {\mvslq\M} 2 } } */ /* { dg-final { scan-assembler-times {\mvsrq\M} 2 } } */ /* { dg-final { scan-assembler-times {\mvsraq\M} 2 } } */ /* { dg-final { scan-assembler-times {\mvrlq\M} 2 } } */ /* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 } } */ /* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 } } */ -/* { dg-final { scan-assembler-times {\mvcmpuq\M} 0 } } */ -/* { dg-final { scan-assembler-times {\mvcmpsq\M} 0 } } */ -/* { dg-final { scan-assembler-times {\mvcmpequq\M} 0 } } */ -/* { dg-final { scan-assembler-times {\mvcmpequq.\M} 16 } } */ -/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 0 } } */ -/* { dg-final { scan-assembler-times {\mvcmpgtsq.\M} 16 } } */ -/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 0 } } */ -/* { dg-final { scan-assembler-times {\mvcmpgtuq.\M} 16 } } */ -/* { dg-final { scan-assembler-times {\mvmuleud\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvcmpequq\M} 16 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 16 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 16 } } */ /* { dg-final { scan-assembler-times {\mvmuloud\M} 1 } } */ /* { dg-final { scan-assembler-times {\mvmulesd\M} 1 } } */ /* { dg-final { scan-assembler-times {\mvmulosd\M} 1 } } */ diff --git a/libgcc/config.host b/libgcc/config.host index f808b61be70..50f00062232 100644 --- a/libgcc/config.host +++ b/libgcc/config.host @@ -1224,6 +1224,10 @@ powerpc*-*-linux*) tmake_file="${tmake_file} rs6000/t-float128-hw" fi + if test $libgcc_cv_powerpc_3_1_float128_hw = yes; then + tmake_file="${tmake_file} rs6000/t-float128-p10-hw" + fi + extra_parts="$extra_parts ecrti.o ecrtn.o 
ncrti.o ncrtn.o" md_unwind_header=rs6000/linux-unwind.h ;; diff --git a/libgcc/config/rs6000/fixkfti.c b/libgcc/config/rs6000/fixkfti-sw.c similarity index 96% rename from libgcc/config/rs6000/fixkfti.c rename to libgcc/config/rs6000/fixkfti-sw.c index 0d965bc6253..cc000fca0f8 100644 --- a/libgcc/config/rs6000/fixkfti.c +++ b/libgcc/config/rs6000/fixkfti-sw.c @@ -5,7 +5,7 @@ This file is part of the GNU C Library. Contributed by Steven Munroe (munroesj@linux.vnet.ibm.com) Code is based on the main soft-fp library written by: - Uros Bizjak (ubizjak@gmail.com). + Uros Bizjak (ubizjak@gmail.com). The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public @@ -35,7 +35,7 @@ #include "quad-float128.h" TItype -__fixkfti (TFtype a) +__fixkfti_sw (TFtype a) { FP_DECL_EX; FP_DECL_Q (A); diff --git a/libgcc/config/rs6000/fixunskfti.c b/libgcc/config/rs6000/fixunskfti-sw.c similarity index 96% rename from libgcc/config/rs6000/fixunskfti.c rename to libgcc/config/rs6000/fixunskfti-sw.c index f285b4e3fbd..7a04d1a489a 100644 --- a/libgcc/config/rs6000/fixunskfti.c +++ b/libgcc/config/rs6000/fixunskfti-sw.c @@ -5,7 +5,7 @@ This file is part of the GNU C Library. Contributed by Steven Munroe (munroesj@linux.vnet.ibm.com) Code is based on the main soft-fp library written by: - Uros Bizjak (ubizjak@gmail.com). + Uros Bizjak (ubizjak@gmail.com). The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public @@ -35,7 +35,7 @@ #include "quad-float128.h" UTItype -__fixunskfti (TFtype a) +__fixunskfti_sw (TFtype a) { FP_DECL_EX; FP_DECL_Q (A); diff --git a/libgcc/config/rs6000/float128-ifunc.c b/libgcc/config/rs6000/float128-ifunc.c index 85380471c5e..57545dd7edb 100644 --- a/libgcc/config/rs6000/float128-ifunc.c +++ b/libgcc/config/rs6000/float128-ifunc.c @@ -46,14 +46,9 @@ #endif #define SW_OR_HW(SW, HW) (__builtin_cpu_supports ("ieee128") ? 
HW : SW) +#define SW_OR_HW_ISA3_1(SW, HW) (__builtin_cpu_supports ("arch_3_1") ? HW : SW) /* Resolvers. */ - -/* We do not provide ifunc resolvers for __fixkfti, __fixunskfti, __floattikf, - and __floatuntikf. There is no ISA 3.0 instruction that converts between - 128-bit integer types and 128-bit IEEE floating point, or vice versa. So - use the emulator functions for these conversions. */ - static __typeof__ (__addkf3_sw) * __addkf3_resolve (void) { @@ -102,6 +97,18 @@ __floatdikf_resolve (void) return SW_OR_HW (__floatdikf_sw, __floatdikf_hw); } +static __typeof__ (__floattikf_sw) * +__floattikf_resolve (void) +{ + return SW_OR_HW_ISA3_1 (__floattikf_sw, __floattikf_hw); +} + +static __typeof__ (__floatuntikf_sw) * +__floatuntikf_resolve (void) +{ + return SW_OR_HW_ISA3_1 (__floatuntikf_sw, __floatuntikf_hw); +} + static __typeof__ (__floatunsikf_sw) * __floatunsikf_resolve (void) { @@ -114,6 +121,19 @@ __floatundikf_resolve (void) return SW_OR_HW (__floatundikf_sw, __floatundikf_hw); } + +static __typeof__ (__fixkfti_sw) * +__fixkfti_resolve (void) +{ + return SW_OR_HW_ISA3_1 (__fixkfti_sw, __fixkfti_hw); +} + +static __typeof__ (__fixunskfti_sw) * +__fixunskfti_resolve (void) +{ + return SW_OR_HW_ISA3_1 (__fixunskfti_sw, __fixunskfti_hw); +} + static __typeof__ (__fixkfsi_sw) * __fixkfsi_resolve (void) { @@ -303,6 +323,18 @@ TFtype __floatsikf (SItype_ppc) TFtype __floatdikf (DItype_ppc) __attribute__ ((__ifunc__ ("__floatdikf_resolve"))); +TFtype __floattikf (TItype_ppc) + __attribute__ ((__ifunc__ ("__floattikf_resolve"))); + +TFtype __floatuntikf (UTItype_ppc) + __attribute__ ((__ifunc__ ("__floatuntikf_resolve"))); + +TItype_ppc __fixkfti (TFtype) + __attribute__ ((__ifunc__ ("__fixkfti_resolve"))); + +UTItype_ppc __fixunskfti (TFtype) + __attribute__ ((__ifunc__ ("__fixunskfti_resolve"))); + TFtype __floatunsikf (USItype_ppc) __attribute__ ((__ifunc__ ("__floatunsikf_resolve"))); diff --git a/libgcc/config/rs6000/float128-p10.c 
b/libgcc/config/rs6000/float128-p10.c new file mode 100644 index 00000000000..7f5d317631a --- /dev/null +++ b/libgcc/config/rs6000/float128-p10.c @@ -0,0 +1,71 @@ +/* Automatic switching between software and hardware IEEE 128-bit + ISA 3.1 floating-point emulation for PowerPC. + + Copyright (C) 2016-2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Carl Love (cel@us.ibm.com) + Code is based on the main soft-fp library written by: + Richard Henderson (rth@cygnus.com) and + Jakub Jelinek (jj@ultra.linux.cz). + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Note, the hardware conversion instructions for 128-bit integers are + supported for ISA 3.1 and later. Only compile this file with -mcpu=power10 + or newer support. 
*/ + +#include +#include + +#ifndef __FLOAT128_HARDWARE__ +#error "This module must be compiled with IEEE 128-bit hardware support" +#endif + +#ifndef _ARCH_PWR10 +#error "This module must be compiled for Power 10 support" +#endif + +TFtype +__floattikf_hw (TItype_ppc a) +{ + return (TFtype) a; +} + +TFtype +__floatuntikf_hw (UTItype_ppc a) +{ + return (TFtype) a; +} + +TItype_ppc +__fixkfti_hw (TFtype a) +{ + return (TItype_ppc) a; +} + +UTItype_ppc +__fixunskfti_hw (TFtype a) +{ + return (UTItype_ppc) a; +} diff --git a/libgcc/config/rs6000/float128-sed b/libgcc/config/rs6000/float128-sed index d9a089ff9ba..c0fcddb1959 100644 --- a/libgcc/config/rs6000/float128-sed +++ b/libgcc/config/rs6000/float128-sed @@ -8,6 +8,10 @@ s/__fixtfsi/__fixkfsi/g s/__fixunstfdi/__fixunskfdi/g s/__fixunstfsi/__fixunskfsi/g s/__floatditf/__floatdikf/g +s/__floattitf/__floattikf/g +s/__floatuntitf/__floatuntikf/g +s/__fixtfti/__fixkfti/g +s/__fixunstfti/__fixunskfti/g s/__floatsitf/__floatsikf/g s/__floatunditf/__floatundikf/g s/__floatunsitf/__floatunsikf/g diff --git a/libgcc/config/rs6000/float128-sed-hw b/libgcc/config/rs6000/float128-sed-hw index acf36b0c17d..3d2bf556da1 100644 --- a/libgcc/config/rs6000/float128-sed-hw +++ b/libgcc/config/rs6000/float128-sed-hw @@ -8,6 +8,10 @@ s/__fixtfsi/__fixkfsi_sw/g s/__fixunstfdi/__fixunskfdi_sw/g s/__fixunstfsi/__fixunskfsi_sw/g s/__floatditf/__floatdikf_sw/g +s/__floattitf/__floattikf_sw/g +s/__floatuntitf/__floatuntikf_sw/g +s/__fixtfti/__fixkfti_sw/g +s/__fixunstfti/__fixunskfti_sw/g s/__floatsitf/__floatsikf_sw/g s/__floatunditf/__floatundikf_sw/g s/__floatunsitf/__floatunsikf_sw/g diff --git a/libgcc/config/rs6000/floattikf.c b/libgcc/config/rs6000/floattikf-sw.c similarity index 96% rename from libgcc/config/rs6000/floattikf.c rename to libgcc/config/rs6000/floattikf-sw.c index cc5c7ca0fd0..4e1786cd229 100644 --- a/libgcc/config/rs6000/floattikf.c +++ b/libgcc/config/rs6000/floattikf-sw.c @@ -5,7 +5,7 @@ This file is part of the GNU 
C Library. Contributed by Steven Munroe (munroesj@linux.vnet.ibm.com) Code is based on the main soft-fp library written by: - Uros Bizjak (ubizjak@gmail.com). + Uros Bizjak (ubizjak@gmail.com). The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public @@ -35,7 +35,7 @@ #include "quad-float128.h" TFtype -__floattikf (TItype i) +__floattikf_sw (TItype i) { FP_DECL_EX; FP_DECL_Q (A); diff --git a/libgcc/config/rs6000/floatuntikf.c b/libgcc/config/rs6000/floatuntikf-sw.c similarity index 96% rename from libgcc/config/rs6000/floatuntikf.c rename to libgcc/config/rs6000/floatuntikf-sw.c index 96f2d3bdcb8..c4b814ddd68 100644 --- a/libgcc/config/rs6000/floatuntikf.c +++ b/libgcc/config/rs6000/floatuntikf-sw.c @@ -5,7 +5,7 @@ This file is part of the GNU C Library. Contributed by Steven Munroe (munroesj@linux.vnet.ibm.com) Code is based on the main soft-fp library written by: - Uros Bizjak (ubizjak@gmail.com). + Uros Bizjak (ubizjak@gmail.com). 
The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public @@ -35,7 +35,7 @@ #include "quad-float128.h" TFtype -__floatuntikf (UTItype i) +__floatuntikf_sw (UTItype i) { FP_DECL_EX; FP_DECL_Q (A); diff --git a/libgcc/config/rs6000/quad-float128.h b/libgcc/config/rs6000/quad-float128.h index 0eb1d34691f..bbb7fcb422e 100644 --- a/libgcc/config/rs6000/quad-float128.h +++ b/libgcc/config/rs6000/quad-float128.h @@ -87,19 +87,18 @@ extern USItype_ppc __fixunskfsi_sw (TFtype); extern UDItype_ppc __fixunskfdi_sw (TFtype); extern TFtype __floatsikf_sw (SItype_ppc); extern TFtype __floatdikf_sw (DItype_ppc); +extern TFtype __floattikf_sw (TItype_ppc); extern TFtype __floatunsikf_sw (USItype_ppc); extern TFtype __floatundikf_sw (UDItype_ppc); +extern TFtype __floatuntikf_sw (UTItype_ppc); +extern TItype_ppc __fixkfti_sw (TFtype); +extern UTItype_ppc __fixunskfti_sw (TFtype); extern IBM128_TYPE __extendkftf2_sw (TFtype); extern TFtype __trunctfkf2_sw (IBM128_TYPE); extern TCtype __mulkc3_sw (TFtype, TFtype, TFtype, TFtype); extern TCtype __divkc3_sw (TFtype, TFtype, TFtype, TFtype); #ifdef _ARCH_PPC64 -/* We do not provide ifunc resolvers for __fixkfti, __fixunskfti, __floattikf, - and __floatuntikf. There is no ISA 3.0 instruction that converts between - 128-bit integer types and 128-bit IEEE floating point, or vice versa. So - use the emulator functions for these conversions. 
*/ - extern TItype_ppc __fixkfti (TFtype); extern UTItype_ppc __fixunskfti (TFtype); extern TFtype __floattikf (TItype_ppc); @@ -130,8 +129,12 @@ extern USItype_ppc __fixunskfsi_hw (TFtype); extern UDItype_ppc __fixunskfdi_hw (TFtype); extern TFtype __floatsikf_hw (SItype_ppc); extern TFtype __floatdikf_hw (DItype_ppc); +extern TFtype __floattikf_hw (TItype_ppc); extern TFtype __floatunsikf_hw (USItype_ppc); extern TFtype __floatundikf_hw (UDItype_ppc); +extern TFtype __floatuntikf_hw (UTItype_ppc); +extern TItype_ppc __fixkfti_hw (TFtype); +extern UTItype_ppc __fixunskfti_hw (TFtype); extern IBM128_TYPE __extendkftf2_hw (TFtype); extern TFtype __trunctfkf2_hw (IBM128_TYPE); extern TCtype __mulkc3_hw (TFtype, TFtype, TFtype, TFtype); @@ -162,8 +165,12 @@ extern USItype_ppc __fixunskfsi (TFtype); extern UDItype_ppc __fixunskfdi (TFtype); extern TFtype __floatsikf (SItype_ppc); extern TFtype __floatdikf (DItype_ppc); +extern TFtype __floattikf (TItype_ppc); extern TFtype __floatunsikf (USItype_ppc); extern TFtype __floatundikf (UDItype_ppc); +extern TFtype __floatuntikf (UTItype_ppc); +extern TItype_ppc __fixkfti (TFtype); +extern UTItype_ppc __fixunskfti (TFtype); extern IBM128_TYPE __extendkftf2 (TFtype); extern TFtype __trunctfkf2 (IBM128_TYPE); diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128 index d5413445189..fe67b831888 100644 --- a/libgcc/config/rs6000/t-float128 +++ b/libgcc/config/rs6000/t-float128 @@ -23,7 +23,8 @@ fp128_softfp_shared_obj = $(addsuffix -sw_s$(objext),$(fp128_softfp_funcs)) fp128_softfp_obj = $(fp128_softfp_static_obj) $(fp128_softfp_shared_obj) # New functions for software emulation -fp128_ppc_funcs = floattikf floatuntikf fixkfti fixunskfti \ +fp128_ppc_funcs = floattikf-sw floatuntikf-sw \ + fixkfti-sw fixunskfti-sw \ extendkftf2-sw trunctfkf2-sw \ sfp-exceptions _mulkc3 _divkc3 _powikf2 @@ -35,13 +36,16 @@ fp128_ppc_obj = $(fp128_ppc_static_obj) $(fp128_ppc_shared_obj) # All functions fp128_funcs = 
$(fp128_softfp_funcs) $(fp128_ppc_funcs) \ - $(fp128_hw_funcs) $(fp128_ifunc_funcs) + $(fp128_hw_funcs) $(fp128_ifunc_funcs) \ + $(fp128_3_1_hw_funcs) fp128_src = $(fp128_softfp_src) $(fp128_ppc_src) \ - $(fp128_hw_src) $(fp128_ifunc_src) + $(fp128_hw_src) $(fp128_ifunc_src) \ + $(fp128_3_1_hw_src) fp128_obj = $(fp128_softfp_obj) $(fp128_ppc_obj) \ - $(fp128_hw_obj) $(fp128_ifunc_obj) + $(fp128_hw_obj) $(fp128_ifunc_obj) \ + $(fp128_3_1_hw_obj) fp128_sed = $(srcdir)/config/rs6000/float128-sed$(fp128_sed_hw) fp128_dep = $(fp128_sed) $(srcdir)/config/rs6000/t-float128 diff --git a/libgcc/config/rs6000/t-float128-hw b/libgcc/config/rs6000/t-float128-hw index d64ca4dd694..c0827366cc4 100644 --- a/libgcc/config/rs6000/t-float128-hw +++ b/libgcc/config/rs6000/t-float128-hw @@ -13,6 +13,13 @@ fp128_hw_static_obj = $(addsuffix $(objext),$(fp128_hw_funcs)) fp128_hw_shared_obj = $(addsuffix _s$(objext),$(fp128_hw_funcs)) fp128_hw_obj = $(fp128_hw_static_obj) $(fp128_hw_shared_obj) +# New functions for ISA 3.1 hardware support +fp128_3_1_hw_funcs = float128-p10 +fp128_3_1_hw_src = $(srcdir)/config/rs6000/float128-p10.c +fp128_3_1_hw_static_obj = $(addsuffix $(objext),$(fp128_3_1_hw_funcs)) +fp128_3_1_hw_shared_obj = $(addsuffix _s$(objext),$(fp128_3_1_hw_funcs)) +fp128_3_1_hw_obj = $(fp128_3_1_hw_static_obj) $(fp128_3_1_hw_shared_obj) + fp128_ifunc_funcs = float128-ifunc fp128_ifunc_src = $(srcdir)/config/rs6000/float128-ifunc.c fp128_ifunc_static_obj = float128-ifunc$(objext) @@ -30,9 +37,18 @@ FP128_CFLAGS_HW = -Wno-type-limits -mvsx -mfloat128 \ -I$(srcdir)/config/rs6000 \ $(FLOAT128_HW_INSNS) +FP128_3_1_CFLAGS_HW = -Wno-type-limits -mvsx -mcpu=power10 \ + -mfloat128-hardware -mno-gnu-attribute \ + -I$(srcdir)/soft-fp \ + -I$(srcdir)/config/rs6000 \ + $(FLOAT128_HW_INSNS) + $(fp128_hw_obj) : INTERNAL_CFLAGS += $(FP128_CFLAGS_HW) $(fp128_hw_obj) : $(srcdir)/config/rs6000/t-float128-hw +$(fp128_3_1_hw_obj) : INTERNAL_CFLAGS += $(FP128_3_1_CFLAGS_HW) +$(fp128_3_1_hw_obj) : 
$(srcdir)/config/rs6000/t-float128-p10-hw + $(fp128_ifunc_obj) : INTERNAL_CFLAGS += $(FP128_CFLAGS_SW) $(fp128_ifunc_obj) : $(srcdir)/config/rs6000/t-float128-hw diff --git a/libgcc/config/rs6000/t-float128-p10-hw b/libgcc/config/rs6000/t-float128-p10-hw new file mode 100644 index 00000000000..de36227c3d1 --- /dev/null +++ b/libgcc/config/rs6000/t-float128-p10-hw @@ -0,0 +1,24 @@ +# Support for adding __float128 hardware support to the powerpc. +# Tell the float128 functions that the ISA 3.1 hardware support can +# be compiled in, to be selected via IFUNC functions. + +FLOAT128_HW_INSNS = -DFLOAT128_HW_INSNS + +# New functions for hardware support + +fp128_3_1_hw_funcs = float128-p10 +fp128_3_1_hw_src = $(srcdir)/config/rs6000/float128-p10.c +fp128_3_1_hw_static_obj = $(addsuffix $(objext),$(fp128_3_1_hw_funcs)) +fp128_3_1_hw_shared_obj = $(addsuffix _s$(objext),$(fp128_3_1_hw_funcs)) +fp128_3_1_hw_obj = $(fp128_3_1_hw_static_obj) $(fp128_3_1_hw_shared_obj) + +# Build the hardware support functions with appropriate hardware support +FP128_3_1_CFLAGS_HW = -Wno-type-limits -mvsx -mfloat128 \ + -mpower10 \ + -mfloat128-hardware -mno-gnu-attribute \ + -I$(srcdir)/soft-fp \ + -I$(srcdir)/config/rs6000 \ + $(FLOAT128_HW_INSNS) + +$(fp128_3_1_hw_obj) : INTERNAL_CFLAGS += $(FP128_3_1_CFLAGS_HW) +$(fp128_3_1_hw_obj) : $(srcdir)/config/rs6000/t-float128-p10-hw diff --git a/libgcc/configure b/libgcc/configure index 78fc22a5784..db00c9ac7ca 100755 --- a/libgcc/configure +++ b/libgcc/configure @@ -4913,7 +4913,7 @@ case "$host" in case "$enable_cet" in auto) # Check if target supports multi-byte NOPs - # and if assembler supports CET insn. + # and if compiler and assembler support CET insn. 
cet_save_CFLAGS="$CFLAGS" CFLAGS="$CFLAGS -fcf-protection" cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -5261,6 +5261,43 @@ fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $libgcc_cv_powerpc_float128_hw" >&5 +$as_echo "$libgcc_cv_powerpc_float128_hw" >&6; } + CFLAGS="$saved_CFLAGS" + + saved_CFLAGS="$CFLAGS" + CFLAGS="$CFLAGS -mpower10 -mfloat128-hardware" + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for PowerPC ISA 3.1 to build hardware __float128 libraries" >&5 +$as_echo_n "checking for PowerPC ISA 3.1 to build hardware __float128 libraries... " >&6; } +if ${libgcc_cv_powerpc_float128_hw+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +#include + #ifndef AT_PLATFORM + #error "AT_PLATFORM is not defined" + #endif + #ifndef __BUILTIN_CPU_SUPPORTS__ + #error "__builtin_cpu_supports is not available" + #endif + vector unsigned char add (vector unsigned char a, vector unsigned char b) + { + vector unsigned char ret; + __asm__ ("xscvsqqp %0,%1,%2" : "=v" (ret) : "v" (a), "v" (b)); + return ret; + } + void *add_resolver (void) { return (void *) add; } + __float128 add_ifunc (__float128, __float128) + __attribute__ ((__ifunc__ ("add_resolver"))); +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libgcc_cv_powerpc_3_1_float128_hw=yes +else + libgcc_cv_powerpc_3_1_float128_hw=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libgcc_cv_powerpc_float128_hw" >&5 $as_echo "$libgcc_cv_powerpc_float128_hw" >&6; } CFLAGS="$saved_CFLAGS" esac diff --git a/libgcc/configure.ac b/libgcc/configure.ac index ed50c0e9b49..b0088a16372 100644 --- a/libgcc/configure.ac +++ b/libgcc/configure.ac @@ -458,6 +458,31 @@ powerpc*-*-linux*) [libgcc_cv_powerpc_float128_hw=yes], [libgcc_cv_powerpc_float128_hw=no])]) CFLAGS="$saved_CFLAGS" + + saved_CFLAGS="$CFLAGS" + 
CFLAGS="$CFLAGS -mpower10 -mfloat128-hardware" + AC_CACHE_CHECK([for PowerPC ISA 3.1 to build hardware __float128 libraries], + [libgcc_cv_powerpc_3_1_float128_hw], + [AC_COMPILE_IFELSE( + [AC_LANG_SOURCE([#include + #ifndef AT_PLATFORM + #error "AT_PLATFORM is not defined" + #endif + #ifndef __BUILTIN_CPU_SUPPORTS__ + #error "__builtin_cpu_supports is not available" + #endif + vector unsigned char add (vector unsigned char a, vector unsigned char b) + { + vector unsigned char ret; + __asm__ ("xscvsqqp %0,%1,%2" : "=v" (ret) : "v" (a), "v" (b)); + return ret; + } + void *add_resolver (void) { return (void *) add; } + __float128 add_ifunc (__float128, __float128) + __attribute__ ((__ifunc__ ("add_resolver")));])], + [libgcc_cv_powerpc_3_1_float128_hw=yes], + [libgcc_cv_powerpc_3_1_float128_hw=no])]) + CFLAGS="$saved_CFLAGS" esac # Collect host-machine-specific information.
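
The resolvers added to float128-ifunc.c all follow one pattern: test a CPU capability and return either the _sw or the _hw implementation of a conversion. The sketch below illustrates that pattern in portable C. It is not the libgcc code. The flag fake_cpu_has_isa_3_1 is a stand-in for __builtin_cpu_supports ("arch_3_1"), every to_double_* name is hypothetical, and an ordinary function pointer replaces the __ifunc__ attribute so that the example builds with any C compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for __builtin_cpu_supports ("arch_3_1").  A real
   resolver would query the CPU instead of reading a flag.  */
static bool fake_cpu_has_isa_3_1 = false;

/* "Software" path: build the value by hand, one bit at a time.  Every
   partial sum is exact because a 32-bit integer fits in a double.  */
static double
to_double_sw (unsigned int a)
{
  double result = 0.0;
  double weight = 1.0;

  while (a != 0)
    {
      if (a & 1u)
        result += weight;
      weight *= 2.0;
      a >>= 1;
    }
  return result;
}

/* "Hardware" path: let the compiler emit its native conversion.  */
static double
to_double_hw (unsigned int a)
{
  return (double) a;
}

typedef double (*to_double_fn) (unsigned int);

/* Same shape as SW_OR_HW_ISA3_1 (SW, HW): pick one implementation.  */
static to_double_fn
to_double_resolve (void)
{
  return fake_cpu_has_isa_3_1 ? to_double_hw : to_double_sw;
}

/* Public entry point.  An ifunc symbol is bound by the dynamic loader,
   whereas this sketch simply resolves on every call.  */
double
to_double (unsigned int a)
{
  to_double_fn fn = to_double_resolve ();

  assert (fn != NULL);
  return fn (a);
}
```

Callers only ever see to_double; which body runs depends on the capability test, which is the property the _sw/_hw split in this patch relies on.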