From patchwork Wed Oct 26 15:01:42 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 687118 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3t3tXl5wvQz9sQw for ; Thu, 27 Oct 2016 02:02:13 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=igEoU3ih; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:references:in-reply-to :mime-version:content-type; q=dns; s=default; b=fs/B6AuNCZlOqKnC nofjL9TGX1hF5hN7mguDf19cDcAHM/LV01WaN7C/ZFMbxZjMXUdOVy2vYbOCNkSC H1Fd3oLgqGHkKe6f2etfKo4NZgboPEg4EImpmvFFwWHurUf4+3m8BL96tYVqqx1r jquzbo9L+L5k4PGbx/biwjhQF3Y= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:references:in-reply-to :mime-version:content-type; s=default; bh=Wz162PpiLKHwd4qjOXldXM JbGCA=; b=igEoU3ihC8uwcJnzzmfNNrQLJFgqOJY2WOWtmMUugsUhsu5gqkMPxf xHQYUSTlBTNUl1c9KoFC6SsuRP4cIEE2eXV9buzhJn9Z2YPdmCrM8XgtsWMjmQfx zwbHW7+YlToIQtk4vFEk5joT8JQ4Ef79zWNQNT9PosTI+Em6b9tUo= Received: (qmail 85343 invoked by alias); 26 Oct 2016 15:02:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 85304 invoked by uid 89); 26 Oct 2016 15:02:03 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.4 required=5.0 tests=AWL, BAYES_00, KAM_LOTSOFHASH, SPF_PASS autolearn=no version=3.3.2 spammy=nn, Patches, 1160, fmax X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 26 Oct 2016 15:01:53 +0000 Received: from EUR03-VE1-obe.outbound.protection.outlook.com (mail-ve1eur03lp0148.outbound.protection.outlook.com [213.199.154.148]) (Using TLS) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-6-6_kfmX92MeyYqBNn2qWw_A-1; Wed, 26 Oct 2016 16:01:49 +0100 Received: from VI1PR0801MB2031.eurprd08.prod.outlook.com (10.173.74.140) by HE1PR0802MB2346.eurprd08.prod.outlook.com (10.172.129.12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.669.16; Wed, 26 Oct 2016 15:01:43 +0000 Received: from VI1PR0801MB2031.eurprd08.prod.outlook.com ([10.173.74.140]) by VI1PR0801MB2031.eurprd08.prod.outlook.com ([10.173.74.140]) with mapi id 15.01.0679.015; Wed, 26 Oct 2016 15:01:43 +0000 From: Tamar Christina To: Christophe Lyon CC: GCC Patches , Kyrylo Tkachov , nd Subject: Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs. Date: Wed, 26 Oct 2016 15:01:42 +0000 Message-ID: References: , In-Reply-To: x-ms-office365-filtering-correlation-id: 066902a1-b96f-416b-2a50-08d3fdb0ff77 x-microsoft-exchange-diagnostics: 1; HE1PR0802MB2346; 7:vVqyh4UF7KX+IvM42xryjjlqGgqpvxFS6htd9q6Ub+ZrgDKq++oUYFMc3p1fLCveget8TsCk2E8QDOjfcr9lVU3TZlw3Jr9KKHz4Vv8iP8+HSEDzKhVMbXNBXASzwidMU8pyJXCsseVYKjCLhkNJ/qD4tcodF84seGaBTJL03+5Owm6nUhfgRzs3fe4l3758uR2UAxBQ8v/JDiqybisyeTbmiOgVSduUjeMCzJ+C89YTHfhmFvd4DxI6+FI3CPpc3zAzRxMp3Hxf/tsTWTK999cXl8qLBP1HgRrybtBzRaylyreQ637+3zCCm8fQMMCzPN1U11wGmOB7R6FGDZ5r+hnoggGUrFL/OwIMlSb5Ke0= x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:HE1PR0802MB2346; nodisclaimer: True x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(180628864354917)(22074186197030)(183786458502308); x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(102415321)(6040176)(601004)(2401047)(5005006)(8121501046)(3002001)(10201501046)(6055026); SRVR:HE1PR0802MB2346; BCL:0; PCL:0; RULEID:; SRVR:HE1PR0802MB2346; x-forefront-prvs: 0107098B6C x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(6009001)(7916002)(24454002)(377424004)(377454003)(199003)(189002)(53754006)(189998001)(11100500001)(92566002)(8936002)(106356001)(6916009)(5660300001)(68736007)(50986999)(106116001)(2950100002)(9686002)(105586002)(110136003)(19580405001)(87936001)(3846002)(7696004)(97736004)(19580395003)(4001150100001)(6116002)(5890100001)(102836003)(15975445007)(2900100001)(7736002)(4326007)(3660700001)(10400500002)(8676002)(586003)(66066001)(101416001)(2906002)(33656002)(76176999)(5002640100001)(99936001)(7846002)(77096005)(305945005)(54356999)(74316002)(81156014)(86362001)(122556002)(76576001)(3280700002)(81166006); DIR:OUT; SFP:1101; SCL:1; SRVR:HE1PR0802MB2346; H:VI1PR0801MB2031.eurprd08.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords; A:1; MX:1; LANG:en; spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM MIME-Version: 1.0 X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-originalarrivaltime: 26 Oct 2016 15:01:42.8390 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-Transport-CrossTenantHeadersStamped: HE1PR0802MB2346 X-MC-Unique: 6_kfmX92MeyYqBNn2qWw_A-1 X-IsSubscribed: yes Hi Christophe, Here's the updated patch. Cheers, Tamar diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c index 72837001d1011e366233236a6ba3d1e5775583b1..dcb883d750506a02257e6e2e49880f2d1b9888fa 100644 --- a/gcc/config/arm/arm-c.c +++ b/gcc/config/arm/arm-c.c @@ -86,6 +86,9 @@ arm_cpu_builtins (struct cpp_reader* pfile) ((TARGET_ARM_ARCH >= 5 && !TARGET_THUMB) || TARGET_ARM_ARCH_ISA_THUMB >=2)); + def_or_undef_macro (pfile, "__ARM_FEATURE_NUMERIC_MAXMIN", + TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_FPU_ARMV8); + def_or_undef_macro (pfile, "__ARM_FEATURE_SIMD32", TARGET_INT_SIMD); builtin_define_with_int_value ("__ARM_SIZEOF_MINIMAL_ENUM", diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h index 54bbc7dd83cf979b6fad7724ba1d4b327b311f5c..3898ff7302dc3f21e6b50a8a7b835033c1ae2021 100644 --- a/gcc/config/arm/arm_neon.h +++ b/gcc/config/arm/arm_neon.h @@ -2956,6 +2956,34 @@ vmaxq_f32 (float32x4_t __a, float32x4_t __b) return (float32x4_t)__builtin_neon_vmaxfv4sf (__a, __b); } +#pragma GCC push_options +#pragma GCC target ("fpu=neon-fp-armv8") +__extension__ static __inline float32x2_t __attribute__ ((__always_inline__)) +vmaxnm_f32 (float32x2_t a, float32x2_t b) +{ + return (float32x2_t)__builtin_neon_vmaxnmv2sf (a, b); +} + +__extension__ static __inline float32x4_t __attribute__ ((__always_inline__)) +vmaxnmq_f32 (float32x4_t a, float32x4_t b) +{ + return (float32x4_t)__builtin_neon_vmaxnmv4sf (a, b); +} + +__extension__ static __inline float32x2_t __attribute__ ((__always_inline__)) +vminnm_f32 (float32x2_t a, float32x2_t b) +{ + return (float32x2_t)__builtin_neon_vminnmv2sf (a, b); +} + +__extension__ static __inline float32x4_t __attribute__ ((__always_inline__)) +vminnmq_f32 (float32x4_t a, float32x4_t b) +{ + return (float32x4_t)__builtin_neon_vminnmv4sf (a, b); +} +#pragma GCC pop_options + + __extension__ static __inline uint8x16_t __attribute__ ((__always_inline__)) vmaxq_u8 (uint8x16_t __a, uint8x16_t __b) { diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def index b29aa91a64ecb85dfb5eb9661ed67d4fa326062f..58b10207c1f5c0380cb01fdb4a92a3f0b4dec591 100644 --- a/gcc/config/arm/arm_neon_builtins.def +++ b/gcc/config/arm/arm_neon_builtins.def @@ -147,12 +147,12 @@ VAR6 (BINOP, vmaxs, v8qi, v4hi, v2si, v16qi, v8hi, v4si) VAR6 (BINOP, vmaxu, v8qi, v4hi, v2si, v16qi, v8hi, v4si) VAR2 (BINOP, vmaxf, v2sf, v4sf) VAR2 (BINOP, vmaxf, v8hf, v4hf) -VAR2 (BINOP, vmaxnm, v4hf, v8hf) +VAR4 (BINOP, vmaxnm, v2sf, v4sf, v4hf, v8hf) VAR6 (BINOP, vmins, v8qi, v4hi, v2si, v16qi, v8hi, v4si) VAR6 (BINOP, vminu, v8qi, v4hi, v2si, v16qi, v8hi, v4si) VAR2 (BINOP, vminf, v2sf, v4sf) VAR2 (BINOP, vminf, v4hf, v8hf) -VAR2 (BINOP, vminnm, v8hf, v4hf) +VAR4 (BINOP, vminnm, v2sf, v4sf, v8hf, v4hf) VAR3 (BINOP, vpmaxs, v8qi, v4hi, v2si) VAR3 (BINOP, vpmaxu, v8qi, v4hi, v2si) diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 05323334ffd81aeff33ee407b96c788d123b3fe3..4f7358effdbbd7b8e7667af68dd54c2732459ced 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -2841,6 +2841,17 @@ [(set_attr "type" "neon_fp_minmax_s")] ) +;; vnm intrinsics. +(define_insn "neon_" + [(set (match_operand:VCVTF 0 "s_register_operand" "=w") + (unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w") + (match_operand:VCVTF 2 "s_register_operand" "w")] + VMAXMINFNM))] + "TARGET_NEON && TARGET_FPU_ARMV8" + ".\t%0, %1, %2" + [(set_attr "type" "neon_fp_minmax_s")] +) + ;; Vector forms for the IEEE-754 fmax()/fmin() functions (define_insn "3" [(set (match_operand:VCVTF 0 "s_register_operand" "=w") diff --git a/gcc/testsuite/gcc.target/arm/simd/vmaxnm_f32_1.c b/gcc/testsuite/gcc.target/arm/simd/vmaxnm_f32_1.c new file mode 100644 index 0000000000000000000000000000000000000000..c3a9f3671b36a1491ed6d33dc894a3b4b559c4ae --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vmaxnm_f32_1.c @@ -0,0 +1,159 @@ +/* Test the `vmaxnmf32' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_neon_ok } */ +/* { dg-options "-save-temps -O3 -march=armv8-a" } */ +/* { dg-add-options arm_v8_neon } */ + +#include "arm_neon.h" + +extern void abort (); + +void __attribute__ ((noinline)) +test_vmaxnm_f32__regular_input1 () +{ + float32_t a1[] = {1,2}; + float32_t b1[] = {3,4}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vmaxnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (actual[i] != b1[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnm_f32__regular_input2 () +{ + float32_t a1[] = {3,2}; + float32_t b1[] = {1,4}; + float32_t e[] = {3,4}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vmaxnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnm_f32__quiet_NaN_one_arg () +{ + /* When given a quiet NaN, vmaxnm returns the other operand. + In this test case we have NaNs in only one operand. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {1,2}; + float32_t b1[] = {n,n}; + float32_t e[] = {1,2}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vmaxnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnm_f32__quiet_NaN_both_args () +{ + /* When given a quiet NaN, vmaxnm returns the other operand. + In this test case we have NaNs in both operands. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {n,2}; + float32_t b1[] = {1,n}; + float32_t e[] = {1,2}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vmaxnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnm_f32__zero_both_args () +{ + /* For 0 and -0, vmaxnm returns 0. Since 0 == -0, check sign bit. */ + float32_t a1[] = {0.0, 0.0}; + float32_t b1[] = {-0.0, -0.0}; + float32_t e[] = {0.0, 0.0}; + + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vmaxnm_f32 (a, b); + + float32_t actual1[2]; + vst1_f32 (actual1, c); + + for (int i = 0; i < 2; ++i) + if (actual1[i] != e[i] || __builtin_signbit (actual1[i]) != 0) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnm_f32__inf_both_args () +{ + /* The max of inf and inf is inf. The max of -inf and -inf is -inf. */ + float32_t inf = __builtin_huge_valf (); + float32_t a1[] = {inf, -inf}; + float32_t b1[] = {inf, -inf}; + float32_t e[] = {inf, -inf}; + + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vmaxnm_f32 (a, b); + + float32_t actual1[2]; + vst1_f32 (actual1, c); + + for (int i = 0; i < 2; ++i) + if (actual1[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnm_f32__two_quiet_NaNs_both_args () +{ + /* When given 2 NaNs, return a NaN. Since a NaN is not equal to anything, + not even another NaN, use __builtin_isnan () to check. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {n,n}; + float32_t b1[] = {n,n}; + float32_t e[] = {n,n}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vmaxnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (!__builtin_isnan (actual[i])) + abort (); +} + +int +main () +{ + test_vmaxnm_f32__regular_input1 (); + test_vmaxnm_f32__regular_input2 (); + test_vmaxnm_f32__quiet_NaN_one_arg (); + test_vmaxnm_f32__quiet_NaN_both_args (); + test_vmaxnm_f32__zero_both_args (); + test_vmaxnm_f32__inf_both_args (); + test_vmaxnm_f32__two_quiet_NaNs_both_args (); + return 0; +} + +/* { dg-final { scan-assembler-times "vmaxnm\.f32\t\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n" 7 } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vmaxnmq_f32_1.c b/gcc/testsuite/gcc.target/arm/simd/vmaxnmq_f32_1.c new file mode 100644 index 0000000000000000000000000000000000000000..80c4e9aa18810fea318b865e8c4e503238e826f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vmaxnmq_f32_1.c @@ -0,0 +1,160 @@ +/* Test the `vmaxnmqf32' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_neon_ok } */ +/* { dg-options "-save-temps -O3 -march=armv8-a" } */ +/* { dg-add-options arm_v8_neon } */ + +#include "arm_neon.h" + +extern void abort (); + +void __attribute__ ((noinline)) +test_vmaxnmq_f32__regular_input1 () +{ + float32_t a1[] = {1,2,5,6}; + float32_t b1[] = {3,4,7,8}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vmaxnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (actual[i] != b1[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnmq_f32__regular_input2 () +{ + float32_t a1[] = {3,2,7,6}; + float32_t b1[] = {1,4,5,8}; + float32_t e[] = {3,4,7,8}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vmaxnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (actual[i] != e[i]) + abort (); +} + + +void __attribute__ ((noinline)) +test_vmaxnmq_f32__quiet_NaN_one_arg () +{ + /* When given a quiet NaN, vmaxnmq returns the other operand. + In this test case we have NaNs in only one operand. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {1,2,3,4}; + float32_t b1[] = {n,n,n,n}; + float32_t e[] = {1,2,3,4}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vmaxnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnmq_f32__quiet_NaN_both_args () +{ + /* When given a quiet NaN, vmaxnmq returns the other operand. + In this test case we have NaNs in both operands. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {n,2,n,4}; + float32_t b1[] = {1,n,3,n}; + float32_t e[] = {1,2,3,4}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vmaxnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnmq_f32__zero_both_args () +{ + /* For 0 and -0, vmaxnmq returns 0. Since 0 == -0, check sign bit. */ + float32_t a1[] = {0.0, 0.0, -0.0, -0.0}; + float32_t b1[] = {-0.0, -0.0, 0.0, 0.0}; + float32_t e[] = {0.0, 0.0, 0.0, 0.0}; + + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vmaxnmq_f32 (a, b); + + float32_t actual1[4]; + vst1q_f32 (actual1, c); + + for (int i = 0; i < 4; ++i) + if (actual1[i] != e[i] || __builtin_signbit (actual1[i]) != 0) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnmq_f32__inf_both_args () +{ + /* The max of inf and inf is inf. The max of -inf and -inf is -inf. */ + float32_t inf = __builtin_huge_valf (); + float32_t a1[] = {inf, -inf, inf, inf}; + float32_t b1[] = {inf, -inf, -inf, -inf}; + float32_t e[] = {inf, -inf, inf, inf}; + + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vmaxnmq_f32 (a, b); + + float32_t actual1[4]; + vst1q_f32 (actual1, c); + + for (int i = 0; i < 4; ++i) + if (actual1[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vmaxnmq_f32__two_quiet_NaNs_both_args () +{ + /* When given 2 NaNs, return a NaN. Since a NaN is not equal to anything, + not even another NaN, use __builtin_isnan () to check. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {n,n,n,n}; + float32_t b1[] = {n,n,n,n}; + float32_t e[] = {n,n}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vmaxnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (!__builtin_isnan (actual[i])) + abort (); +} + +int +main () +{ + test_vmaxnmq_f32__regular_input1 (); + test_vmaxnmq_f32__regular_input2 (); + test_vmaxnmq_f32__quiet_NaN_one_arg (); + test_vmaxnmq_f32__quiet_NaN_both_args (); + test_vmaxnmq_f32__zero_both_args (); + test_vmaxnmq_f32__inf_both_args (); + test_vmaxnmq_f32__two_quiet_NaNs_both_args (); + return 0; +} + +/* { dg-final { scan-assembler-times "vmaxnm\.f32\t\[qQ\]\[0-9\]+, ?\[qQ\]\[0-9\]+, ?\[qQ\]\[0-9\]+\n" 7 } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vminnm_f32_1.c b/gcc/testsuite/gcc.target/arm/simd/vminnm_f32_1.c new file mode 100644 index 0000000000000000000000000000000000000000..9a1d097911748108591a11f3bd7fbf3e44adebaa --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vminnm_f32_1.c @@ -0,0 +1,159 @@ +/* Test the `vminnmf32' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_neon_ok } */ +/* { dg-options "-save-temps -O3 -march=armv8-a" } */ +/* { dg-add-options arm_v8_neon } */ + +#include "arm_neon.h" + +extern void abort (); + +void __attribute__ ((noinline)) +test_vminnm_f32__regular_input1 () +{ + float32_t a1[] = {1,2}; + float32_t b1[] = {3,4}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vminnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (actual[i] != a1[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnm_f32__regular_input2 () +{ + float32_t a1[] = {3,2}; + float32_t b1[] = {1,4}; + float32_t e[] = {1,2}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vminnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnm_f32__quiet_NaN_one_arg () +{ + /* When given a quiet NaN, vminnm returns the other operand. + In this test case we have NaNs in only one operand. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {1,2}; + float32_t b1[] = {n,n}; + float32_t e[] = {1,2}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vminnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnm_f32__quiet_NaN_both_args () +{ + /* When given a quiet NaN, vminnm returns the other operand. + In this test case we have NaNs in both operands. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {n,2}; + float32_t b1[] = {1,n}; + float32_t e[] = {1,2}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vminnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnm_f32__zero_both_args () +{ + /* For 0 and -0, vminnm returns -0. Since 0 == -0, check sign bit. */ + float32_t a1[] = {0.0,0.0}; + float32_t b1[] = {-0.0, -0.0}; + float32_t e[] = {-0.0, -0.0}; + + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vminnm_f32 (a, b); + + float32_t actual1[2]; + vst1_f32 (actual1, c); + + for (int i = 0; i < 2; ++i) + if (actual1[i] != e[i] || __builtin_signbit (actual1[i]) == 0) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnm_f32__inf_both_args () +{ + /* The min of inf and inf is inf. The min of -inf and -inf is -inf. */ + float32_t inf = __builtin_huge_valf (); + float32_t a1[] = {inf, -inf}; + float32_t b1[] = {inf, -inf}; + float32_t e[] = {inf, -inf}; + + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vminnm_f32 (a, b); + + float32_t actual1[2]; + vst1_f32 (actual1, c); + + for (int i = 0; i < 2; ++i) + if (actual1[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnm_f32__two_quiet_NaNs_both_args () +{ + /* When given 2 NaNs, return a NaN. Since a NaN is not equal to anything, + not even another NaN, use __builtin_isnan () to check. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {n,n}; + float32_t b1[] = {n,n}; + float32_t e[] = {n,n}; + float32x2_t a = vld1_f32 (a1); + float32x2_t b = vld1_f32 (b1); + float32x2_t c = vminnm_f32 (a, b); + float32_t actual[2]; + vst1_f32 (actual, c); + + for (int i = 0; i < 2; ++i) + if (!__builtin_isnan (actual[i])) + abort (); +} + +int +main () +{ + test_vminnm_f32__regular_input1 (); + test_vminnm_f32__regular_input2 (); + test_vminnm_f32__quiet_NaN_one_arg (); + test_vminnm_f32__quiet_NaN_both_args (); + test_vminnm_f32__zero_both_args (); + test_vminnm_f32__inf_both_args (); + test_vminnm_f32__two_quiet_NaNs_both_args (); + return 0; +} + +/* { dg-final { scan-assembler-times "vminnm\.f32\t\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n" 7 } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vminnmq_f32_1.c b/gcc/testsuite/gcc.target/arm/simd/vminnmq_f32_1.c new file mode 100644 index 0000000000000000000000000000000000000000..a778abecd857e9ea83d249e0ab52886209030aa4 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vminnmq_f32_1.c @@ -0,0 +1,159 @@ +/* Test the `vminnmqf32' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_neon_ok } */ +/* { dg-options "-save-temps -O3 -march=armv8-a" } */ +/* { dg-add-options arm_v8_neon } */ + +#include "arm_neon.h" + +extern void abort (); + +void __attribute__ ((noinline)) +test_vminnmq_f32__regular_input1 () +{ + float32_t a1[] = {1,2,5,6}; + float32_t b1[] = {3,4,7,8}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vminnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (actual[i] != a1[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnmq_f32__regular_input2 () +{ + float32_t a1[] = {3,2,7,6}; + float32_t b1[] = {1,4,5,8}; + float32_t e[] = {1,2,5,6}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vminnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnmq_f32__quiet_NaN_one_arg () +{ + /* When given a quiet NaN, vminnmq returns the other operand. + In this test case we have NaNs in only one operand. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {1,2,3,4}; + float32_t b1[] = {n,n,n,n}; + float32_t e[] = {1,2,3,4}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vminnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnmq_f32__quiet_NaN_both_args () +{ + /* When given a quiet NaN, vminnmq returns the other operand. + In this test case we have NaNs in both operands. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {n,2,n,4}; + float32_t b1[] = {1,n,3,n}; + float32_t e[] = {1,2,3,4}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vminnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (actual[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnmq_f32__zero_both_args () +{ + /* For 0 and -0, vminnmq returns -0. Since 0 == -0, check sign bit. */ + float32_t a1[] = {0.0, 0.0, -0.0, -0.0}; + float32_t b1[] = {-0.0, -0.0, 0.0, 0.0}; + float32_t e[] = {-0.0, -0.0, -0.0, -0.0}; + + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vminnmq_f32 (a, b); + + float32_t actual1[4]; + vst1q_f32 (actual1, c); + + for (int i = 0; i < 4; ++i) + if (actual1[i] != e[i] || __builtin_signbit (actual1[i]) == 0) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnmq_f32__inf_both_args () +{ + /* The min of inf and inf is inf. The min of -inf and -inf is -inf. */ + float32_t inf = __builtin_huge_valf (); + float32_t a1[] = {inf, -inf, inf, inf}; + float32_t b1[] = {inf, -inf, -inf, -inf}; + float32_t e[] = {inf, -inf, -inf, -inf}; + + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vminnmq_f32 (a, b); + + float32_t actual1[4]; + vst1q_f32 (actual1, c); + + for (int i = 0; i < 4; ++i) + if (actual1[i] != e[i]) + abort (); +} + +void __attribute__ ((noinline)) +test_vminnmq_f32__two_quiet_NaNs_both_args () +{ + /* When given 2 NaNs, return a NaN. Since a NaN is not equal to anything, + not even another NaN, use __builtin_isnan () to check. */ + float32_t n = __builtin_nanf (""); + float32_t a1[] = {n,n,n,n}; + float32_t b1[] = {n,n,n,n}; + float32_t e[] = {n,n}; + float32x4_t a = vld1q_f32 (a1); + float32x4_t b = vld1q_f32 (b1); + float32x4_t c = vminnmq_f32 (a, b); + float32_t actual[4]; + vst1q_f32 (actual, c); + + for (int i = 0; i < 4; ++i) + if (!__builtin_isnan (actual[i])) + abort (); +} + +int +main () +{ + test_vminnmq_f32__regular_input1 (); + test_vminnmq_f32__regular_input2 (); + test_vminnmq_f32__quiet_NaN_one_arg (); + test_vminnmq_f32__quiet_NaN_both_args (); + test_vminnmq_f32__zero_both_args (); + test_vminnmq_f32__inf_both_args (); + test_vminnmq_f32__two_quiet_NaNs_both_args (); + return 0; +} + +/* { dg-final { scan-assembler-times "vminnm\.f32\t\[qQ\]\[0-9\]+, ?\[qQ\]\[0-9\]+, ?\[qQ\]\[0-9\]+\n" 7 } } */