From patchwork Thu Jul 7 16:19:25 2016
X-Patchwork-Submitter: Jiong Wang
X-Patchwork-Id: 645964
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Jiong Wang
Subject: [AArch64][13/14] ARMv8.2-A testsuite for new vector intrinsics
To: GCC Patches
Date: Thu, 7 Jul 2016 17:19:25 +0100

This patch adds test cases for the new vector intrinsics that are
available only on AArch64.

gcc/testsuite/
2016-07-07  Jiong Wang

	* gcc.target/aarch64/advsimd-intrinsics/vdiv_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vfmas_lane_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vfmas_n_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vmaxnmv_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vmaxv_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vminnmv_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vminv_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vmul_lane_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vmulx_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vmulx_lane_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vmulx_n_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vpminmaxnm_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vrndi_f16_1.c: New.
	* gcc.target/aarch64/advsimd-intrinsics/vsqrt_f16_1.c: New.

From 774c4cf2488dff693c7130a62561f2da88639283 Mon Sep 17 00:00:00 2001
From: Jiong Wang
Date: Tue, 5 Jul 2016 10:39:28 +0100
Subject: [PATCH 13/14] [13/14] TESTSUITE for new vector intrinsics

---
 .../aarch64/advsimd-intrinsics/vdiv_f16_1.c        |  86 ++
 .../aarch64/advsimd-intrinsics/vfmas_lane_f16_1.c  | 908 +++++++++++++++++++++
 .../aarch64/advsimd-intrinsics/vfmas_n_f16_1.c     | 469 +++++++++++
 .../aarch64/advsimd-intrinsics/vmaxnmv_f16_1.c     | 131 +++
 .../aarch64/advsimd-intrinsics/vmaxv_f16_1.c       | 131 +++
 .../aarch64/advsimd-intrinsics/vminnmv_f16_1.c     | 131 +++
 .../aarch64/advsimd-intrinsics/vminv_f16_1.c       | 131 +++
 .../aarch64/advsimd-intrinsics/vmul_lane_f16_1.c   | 454 +++++++++++
 .../aarch64/advsimd-intrinsics/vmulx_f16_1.c       |  84 ++
 .../aarch64/advsimd-intrinsics/vmulx_lane_f16_1.c  | 452 ++++++++++
 .../aarch64/advsimd-intrinsics/vmulx_n_f16_1.c     | 177 ++++
 .../aarch64/advsimd-intrinsics/vpminmaxnm_f16_1.c  | 114 +++
 .../aarch64/advsimd-intrinsics/vrndi_f16_1.c       |  71 ++
 .../aarch64/advsimd-intrinsics/vsqrt_f16_1.c       |  72 ++
 14 files changed, 3411 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdiv_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmas_lane_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmas_n_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmaxnmv_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmaxv_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vminnmv_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vminv_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmul_lane_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_lane_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_n_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpminmaxnm_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrndi_f16_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vsqrt_f16_1.c

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdiv_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdiv_f16_1.c
new file mode 100644
index 0000000..c0103fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdiv_f16_1.c
@@ -0,0 +1,86 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */
+/* { dg-add-options arm_v8_2a_fp16_neon } */
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include 
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+#define FP16_C(a) ((__fp16) a)
+#define A FP16_C (13.4)
+#define B FP16_C (-56.8)
+#define C FP16_C (-34.8)
+#define D FP16_C (12)
+#define E FP16_C (63.1)
+#define F FP16_C (19.1)
+#define G FP16_C (-4.8)
+#define H FP16_C (77)
+
+#define I FP16_C (0.7)
+#define J FP16_C (-78)
+#define K FP16_C (11.23)
+#define L FP16_C (98)
+#define M FP16_C (87.1)
+#define N FP16_C (-8)
+#define O FP16_C (-1.1)
+#define P FP16_C (-9.7)
+
+/* Expected results for vdiv.  */
+VECT_VAR_DECL (expected_div_static, hfloat, 16, 4) []
+  = { 0x32CC /* A / E.  */, 0xC1F3 /* B / F.  */,
+      0x4740 /* C / G.  */, 0x30FD /* D / H.  */ };
+
+VECT_VAR_DECL (expected_div_static, hfloat, 16, 8) []
+  = { 0x32CC /* A / E.  */, 0xC1F3 /* B / F.  */,
+      0x4740 /* C / G.  */, 0x30FD /* D / H.  */,
+      0x201D /* I / M.  */, 0x48E0 /* J / N.  */,
+      0xC91B /* K / O.  */, 0xC90D /* L / P.  */ };
+
+void exec_vdiv_f16 (void)
+{
+#undef TEST_MSG
+#define TEST_MSG "VDIV (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc_1, float, 16, 4);
+  DECL_VARIABLE(vsrc_2, float, 16, 4);
+  VECT_VAR_DECL (buf_src_1, float, 16, 4) [] = {A, B, C, D};
+  VECT_VAR_DECL (buf_src_2, float, 16, 4) [] = {E, F, G, H};
+  VLOAD (vsrc_1, buf_src_1, , float, f, 16, 4);
+  VLOAD (vsrc_2, buf_src_2, , float, f, 16, 4);
+
+  DECL_VARIABLE (vector_res, float, 16, 4)
+    = vdiv_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                VECT_VAR (vsrc_2, float, 16, 4));
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_div_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VDIVQ (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc_1, float, 16, 8);
+  DECL_VARIABLE(vsrc_2, float, 16, 8);
+  VECT_VAR_DECL (buf_src_1, float, 16, 8) [] = {A, B, C, D, I, J, K, L};
+  VECT_VAR_DECL (buf_src_2, float, 16, 8) [] = {E, F, G, H, M, N, O, P};
+  VLOAD (vsrc_1, buf_src_1, q, float, f, 16, 8);
+  VLOAD (vsrc_2, buf_src_2, q, float, f, 16, 8);
+
+  DECL_VARIABLE (vector_res, float, 16, 8)
+    = vdivq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                 VECT_VAR (vsrc_2, float, 16, 8));
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_div_static, "");
+}
+
+int
+main (void)
+{
+  exec_vdiv_f16 ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmas_lane_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmas_lane_f16_1.c
new file mode 100644
index 0000000..00c95d3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmas_lane_f16_1.c
@@ -0,0 +1,908 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */
+/* {
dg-add-options arm_v8_2a_fp16_neon } */ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define FP16_C(a) ((__fp16) a) +#define A0 FP16_C (123.4) +#define A1 FP16_C (-5.8) +#define A2 FP16_C (-0.0) +#define A3 FP16_C (10) +#define A4 FP16_C (123412.43) +#define A5 FP16_C (-5.8) +#define A6 FP16_C (90.8) +#define A7 FP16_C (24) + +#define B0 FP16_C (23.4) +#define B1 FP16_C (-5.8) +#define B2 FP16_C (8.9) +#define B3 FP16_C (4.0) +#define B4 FP16_C (3.4) +#define B5 FP16_C (-550.8) +#define B6 FP16_C (-31.8) +#define B7 FP16_C (20000.0) + +/* Expected results for vfma_lane. */ +VECT_VAR_DECL (expected0_static, hfloat, 16, 4) [] + = { 0x613E /* A0 + B0 * B0. */, + 0xD86D /* A1 + B1 * B0. */, + 0x5A82 /* A2 + B2 * B0. */, + 0x567A /* A3 + B3 * B0. */}; + +VECT_VAR_DECL (expected1_static, hfloat, 16, 4) [] + = { 0xCA33 /* A0 + B0 * B1. */, + 0x4EF6 /* A1 + B1 * B1. */, + 0xD274 /* A2 + B2 * B1. */, + 0xCA9A /* A3 + B3 * B1. */ }; + +VECT_VAR_DECL (expected2_static, hfloat, 16, 4) [] + = { 0x5D2F /* A0 + B0 * B2. */, + 0xD32D /* A1 + B1 * B2. */, + 0x54F3 /* A2 + B2 * B2. */, + 0x51B3 /* A3 + B3 * B2. */ }; + +VECT_VAR_DECL (expected3_static, hfloat, 16, 4) [] + = { 0x5AC8 /* A0 + B0 * B3. */, + 0xCF40 /* A1 + B1 * B3. */, + 0x5073 /* A2 + B2 * B3. */, + 0x4E80 /* A3 + B3 * B3. */ }; + +/* Expected results for vfmaq_lane. */ +VECT_VAR_DECL (expected0_static, hfloat, 16, 8) [] + = { 0x613E /* A0 + B0 * B0. */, + 0xD86D /* A1 + B1 * B0. */, + 0x5A82 /* A2 + B2 * B0. */, + 0x567A /* A3 + B3 * B0. */, + 0x7C00 /* A4 + B4 * B0. */, + 0xF24D /* A5 + B5 * B0. */, + 0xE11B /* A6 + B6 * B0. */, + 0x7C00 /* A7 + B7 * B0. */ }; + +VECT_VAR_DECL (expected1_static, hfloat, 16, 8) [] + = { 0xCA33 /* A0 + B0 * B1. */, + 0x4EF6 /* A1 + B1 * B1. */, + 0xD274 /* A2 + B2 * B1. */, + 0xCA9A /* A3 + B3 * B1. */, + 0x7C00 /* A4 + B4 * B1. */, + 0x6A3B /* A5 + B5 * B1. */, + 0x5C4D /* A6 + B6 * B1. */, + 0xFC00 /* A7 + B7 * B1. 
*/ }; + +VECT_VAR_DECL (expected2_static, hfloat, 16, 8) [] + = { 0x5D2F /* A0 + B0 * B2. */, + 0xD32D /* A1 + B1 * B2. */, + 0x54F3 /* A2 + B2 * B2. */, + 0x51B3 /* A3 + B3 * B2. */, + 0x7C00 /* A4 + B4 * B2. */, + 0xECCB /* A5 + B5 * B2. */, + 0xDA01 /* A6 + B6 * B2. */, + 0x7C00 /* A7 + B7 * B2. */ }; + +VECT_VAR_DECL (expected3_static, hfloat, 16, 8) [] + = { 0x5AC8 /* A0 + B0 * B3. */, + 0xCF40 /* A1 + B1 * B3. */, + 0x5073 /* A2 + B2 * B3. */, + 0x4E80 /* A3 + B3 * B3. */, + 0x7C00 /* A4 + B4 * B3. */, + 0xE851 /* A5 + B5 * B3. */, + 0xD08C /* A6 + B6 * B3. */, + 0x7C00 /* A7 + B7 * B3. */ }; + +/* Expected results for vfma_laneq. */ +VECT_VAR_DECL (expected0_laneq_static, hfloat, 16, 4) [] + = { 0x613E /* A0 + B0 * B0. */, + 0xD86D /* A1 + B1 * B0. */, + 0x5A82 /* A2 + B2 * B0. */, + 0x567A /* A3 + B3 * B0. */ }; + +VECT_VAR_DECL (expected1_laneq_static, hfloat, 16, 4) [] + = { 0xCA33 /* A0 + B0 * B1. */, + 0x4EF6 /* A1 + B1 * B1. */, + 0xD274 /* A2 + B2 * B1. */, + 0xCA9A /* A3 + B3 * B1. */ }; + +VECT_VAR_DECL (expected2_laneq_static, hfloat, 16, 4) [] + = { 0x5D2F /* A0 + B0 * B2. */, + 0xD32D /* A1 + B1 * B2. */, + 0x54F3 /* A2 + B2 * B2. */, + 0x51B3 /* A3 + B3 * B2. */ }; + +VECT_VAR_DECL (expected3_laneq_static, hfloat, 16, 4) [] + = { 0x5AC8 /* A0 + B0 * B3. */, + 0xCF40 /* A1 + B1 * B3. */, + 0x5073 /* A2 + B2 * B3. */, + 0x4E80 /* A3 + B3 * B3. */ }; + +VECT_VAR_DECL (expected4_laneq_static, hfloat, 16, 4) [] + = { 0x5A58 /* A0 + B0 * B4. */, + 0xCE62 /* A1 + B1 * B4. */, + 0x4F91 /* A2 + B2 * B4. */, + 0x4DE6 /* A3 + B3 * B4. */ }; + +VECT_VAR_DECL (expected5_laneq_static, hfloat, 16, 4) [] + = { 0xF23D /* A0 + B0 * B5. */, + 0x6A3B /* A1 + B1 * B5. */, + 0xECCA /* A2 + B2 * B5. */, + 0xE849 /* A3 + B3 * B5. */ }; + +VECT_VAR_DECL (expected6_laneq_static, hfloat, 16, 4) [] + = { 0xE0DA /* A0 + B0 * B6. */, + 0x5995 /* A1 + B1 * B6. */, + 0xDC6C /* A2 + B2 * B6. */, + 0xD753 /* A3 + B3 * B6. 
*/ }; + +VECT_VAR_DECL (expected7_laneq_static, hfloat, 16, 4) [] + = { 0x7C00 /* A0 + B0 * B7. */, + 0xFC00 /* A1 + B1 * B7. */, + 0x7C00 /* A2 + B2 * B7. */, + 0x7C00 /* A3 + B3 * B7. */ }; + +/* Expected results for vfmaq_laneq. */ +VECT_VAR_DECL (expected0_laneq_static, hfloat, 16, 8) [] + = { 0x613E /* A0 + B0 * B0. */, + 0xD86D /* A1 + B1 * B0. */, + 0x5A82 /* A2 + B2 * B0. */, + 0x567A /* A3 + B3 * B0. */, + 0x7C00 /* A4 + B4 * B0. */, + 0xF24D /* A5 + B5 * B0. */, + 0xE11B /* A6 + B6 * B0. */, + 0x7C00 /* A7 + B7 * B0. */ }; + +VECT_VAR_DECL (expected1_laneq_static, hfloat, 16, 8) [] + = { 0xCA33 /* A0 + B0 * B1. */, + 0x4EF6 /* A1 + B1 * B1. */, + 0xD274 /* A2 + B2 * B1. */, + 0xCA9A /* A3 + B3 * B1. */, + 0x7C00 /* A4 + B4 * B1. */, + 0x6A3B /* A5 + B5 * B1. */, + 0x5C4D /* A6 + B6 * B1. */, + 0xFC00 /* A7 + B7 * B1. */ }; + +VECT_VAR_DECL (expected2_laneq_static, hfloat, 16, 8) [] + = { 0x5D2F /* A0 + B0 * B2. */, + 0xD32D /* A1 + B1 * B2. */, + 0x54F3 /* A2 + B2 * B2. */, + 0x51B3 /* A3 + B3 * B2. */, + 0x7C00 /* A4 + B4 * B2. */, + 0xECCB /* A5 + B5 * B2. */, + 0xDA01 /* A6 + B6 * B2. */, + 0x7C00 /* A7 + B7 * B2. */ }; + +VECT_VAR_DECL (expected3_laneq_static, hfloat, 16, 8) [] + = { 0x5AC8 /* A0 + B0 * B3. */, + 0xCF40 /* A1 + B1 * B3. */, + 0x5073 /* A2 + B2 * B3. */, + 0x4E80 /* A3 + B3 * B3. */, + 0x7C00 /* A4 + B4 * B3. */, + 0xE851 /* A5 + B5 * B3. */, + 0xD08C /* A6 + B6 * B3. */, + 0x7C00 /* A7 + B7 * B3. */ }; + +VECT_VAR_DECL (expected4_laneq_static, hfloat, 16, 8) [] + = { 0x5A58 /* A0 + B0 * B4. */, + 0xCE62 /* A1 + B1 * B4. */, + 0x4F91 /* A2 + B2 * B4. */, + 0x4DE6 /* A3 + B3 * B4. */, + 0x7C00 /* A4 + B4 * B4. */, + 0xE757 /* A5 + B5 * B4. */, + 0xCC54 /* A6 + B6 * B4. */, + 0x7C00 /* A7 + B7 * B4. */ }; + +VECT_VAR_DECL (expected5_laneq_static, hfloat, 16, 8) [] + = { 0xF23D /* A0 + B0 * B5. */, + 0x6A3B /* A1 + B1 * B5. */, + 0xECCA /* A2 + B2 * B5. */, + 0xE849 /* A3 + B3 * B5. */, + 0x7C00 /* A4 + B4 * B5. 
*/, + 0x7C00 /* A5 + B5 * B5. */, + 0x744D /* A6 + B6 * B5. */, + 0xFC00 /* A7 + B7 * B5. */ }; + +VECT_VAR_DECL (expected6_laneq_static, hfloat, 16, 8) [] + = { 0xE0DA /* A0 + B0 * B6. */, + 0x5995 /* A1 + B1 * B6. */, + 0xDC6C /* A2 + B2 * B6. */, + 0xD753 /* A3 + B3 * B6. */, + 0x7C00 /* A4 + B4 * B6. */, + 0x7447 /* A5 + B5 * B6. */, + 0x644E /* A6 + B6 * B6. */, + 0xFC00 /* A7 + B7 * B6. */ }; + +VECT_VAR_DECL (expected7_laneq_static, hfloat, 16, 8) [] + = { 0x7C00 /* A0 + B0 * B7. */, + 0xFC00 /* A1 + B1 * B7. */, + 0x7C00 /* A2 + B2 * B7. */, + 0x7C00 /* A3 + B3 * B7. */, + 0x7C00 /* A4 + B4 * B7. */, + 0xFC00 /* A5 + B5 * B7. */, + 0xFC00 /* A6 + B6 * B7. */, + 0x7C00 /* A7 + B7 * B7. */ }; + +/* Expected results for vfms_lane. */ +VECT_VAR_DECL (expected0_fms_static, hfloat, 16, 4) [] + = { 0xDEA2 /* A0 + (-B0) * B0. */, + 0x5810 /* A1 + (-B1) * B0. */, + 0xDA82 /* A2 + (-B2) * B0. */, + 0xD53A /* A3 + (-B3) * B0. */ }; + +VECT_VAR_DECL (expected1_fms_static, hfloat, 16, 4) [] + = { 0x5C0D /* A0 + (-B0) * B1. */, + 0xD0EE /* A1 + (-B1) * B1. */, + 0x5274 /* A2 + (-B2) * B1. */, + 0x5026 /* A3 + (-B3) * B1. */ }; + +VECT_VAR_DECL (expected2_fms_static, hfloat, 16, 4) [] + = { 0xD54E /* A0 + (-B0) * B2. */, + 0x51BA /* A1 + (-B1) * B2. */, + 0xD4F3 /* A2 + (-B2) * B2. */, + 0xCE66 /* A3 + (-B3) * B2. */ }; + +VECT_VAR_DECL (expected3_fms_static, hfloat, 16, 4) [] + = { 0x4F70 /* A0 + (-B0) * B3. */, + 0x4C5A /* A1 + (-B1) * B3. */, + 0xD073 /* A2 + (-B2) * B3. */, + 0xC600 /* A3 + (-B3) * B3. */ }; + +/* Expected results for vfmsq_lane. */ +VECT_VAR_DECL (expected0_fms_static, hfloat, 16, 8) [] + = { 0xDEA2 /* A0 + (-B0) * B0. */, + 0x5810 /* A1 + (-B1) * B0. */, + 0xDA82 /* A2 + (-B2) * B0. */, + 0xD53A /* A3 + (-B3) * B0. */, + 0x7C00 /* A4 + (-B4) * B0. */, + 0x724B /* A5 + (-B5) * B0. */, + 0x6286 /* A6 + (-B6) * B0. */, + 0xFC00 /* A7 + (-B7) * B0. */ }; + +VECT_VAR_DECL (expected1_fms_static, hfloat, 16, 8) [] + = { 0x5C0D /* A0 + (-B0) * B1. 
*/, + 0xD0EE /* A1 + (-B1) * B1. */, + 0x5274 /* A2 + (-B2) * B1. */, + 0x5026 /* A3 + (-B3) * B1. */, + 0x7C00 /* A4 + (-B4) * B1. */, + 0xEA41 /* A5 + (-B5) * B1. */, + 0xD5DA /* A6 + (-B6) * B1. */, + 0x7C00 /* A7 + (-B7) * B1. */ }; + +VECT_VAR_DECL (expected2_fms_static, hfloat, 16, 8) [] + = { 0xD54E /* A0 + (-B0) * B2. */, + 0x51BA /* A1 + (-B1) * B2. */, + 0xD4F3 /* A2 + (-B2) * B2. */, + 0xCE66 /* A3 + (-B3) * B2. */, + 0x7C00 /* A4 + (-B4) * B2. */, + 0x6CC8 /* A5 + (-B5) * B2. */, + 0x5DD7 /* A6 + (-B6) * B2. */, + 0xFC00 /* A7 + (-B7) * B2. */ }; + +VECT_VAR_DECL (expected3_fms_static, hfloat, 16, 8) [] + = { 0x4F70 /* A0 + (-B0) * B3. */, + 0x4C5A /* A1 + (-B1) * B3. */, + 0xD073 /* A2 + (-B2) * B3. */, + 0xC600 /* A3 + (-B3) * B3. */, + 0x7C00 /* A4 + (-B4) * B3. */, + 0x684B /* A5 + (-B5) * B3. */, + 0x5AD0 /* A6 + (-B6) * B3. */, + 0xFC00 /* A7 + (-B7) * B3. */ }; + +/* Expected results for vfms_laneq. */ +VECT_VAR_DECL (expected0_fms_laneq_static, hfloat, 16, 4) [] + = { 0xDEA2 /* A0 + (-B0) * B0. */, + 0x5810 /* A1 + (-B1) * B0. */, + 0xDA82 /* A2 + (-B2) * B0. */, + 0xD53A /* A3 + (-B3) * B0. */ }; + +VECT_VAR_DECL (expected1_fms_laneq_static, hfloat, 16, 4) [] + = { 0x5C0D /* A0 + (-B0) * B1. */, + 0xD0EE /* A1 + (-B1) * B1. */, + 0x5274 /* A2 + (-B2) * B1. */, + 0x5026 /* A3 + (-B3) * B1. */ }; + +VECT_VAR_DECL (expected2_fms_laneq_static, hfloat, 16, 4) [] + = { 0xD54E /* A0 + (-B0) * B2. */, + 0x51BA /* A1 + (-B1) * B2. */, + 0xD4F3 /* A2 + (-B2) * B2. */, + 0xCE66 /* A3 + (-B3) * B2. */ }; + +VECT_VAR_DECL (expected3_fms_laneq_static, hfloat, 16, 4) [] + = { 0x4F70 /* A0 + (-B0) * B3. */, + 0x4C5A /* A1 + (-B1) * B3. */, + 0xD073 /* A2 + (-B2) * B3. */, + 0xC600 /* A3 + (-B3) * B3. */ }; + +VECT_VAR_DECL (expected4_fms_laneq_static, hfloat, 16, 4) [] + = { 0x5179 /* A0 + (-B0) * B4. */, + 0x4AF6 /* A1 + (-B1) * B4. */, + 0xCF91 /* A2 + (-B2) * B4. */, + 0xC334 /* A3 + (-B3) * B4. 
*/ }; + +VECT_VAR_DECL (expected5_fms_laneq_static, hfloat, 16, 4) [] + = { 0x725C /* A0 + (-B0) * B5. */, + 0xEA41 /* A1 + (-B1) * B5. */, + 0x6CCA /* A2 + (-B2) * B5. */, + 0x6853 /* A3 + (-B3) * B5. */ }; + +VECT_VAR_DECL (expected6_fms_laneq_static, hfloat, 16, 4) [] + = { 0x62C7 /* A0 + (-B0) * B6. */, + 0xD9F2 /* A1 + (-B1) * B6. */, + 0x5C6C /* A2 + (-B2) * B6. */, + 0x584A /* A3 + (-B3) * B6. */ }; + +VECT_VAR_DECL (expected7_fms_laneq_static, hfloat, 16, 4) [] + = { 0xFC00 /* A0 + (-B0) * B7. */, + 0x7C00 /* A1 + (-B1) * B7. */, + 0xFC00 /* A2 + (-B2) * B7. */, + 0xFC00 /* A3 + (-B3) * B7. */ }; + +/* Expected results for vfmsq_laneq. */ +VECT_VAR_DECL (expected0_fms_laneq_static, hfloat, 16, 8) [] + = { 0xDEA2 /* A0 + (-B0) * B0. */, + 0x5810 /* A1 + (-B1) * B0. */, + 0xDA82 /* A2 + (-B2) * B0. */, + 0xD53A /* A3 + (-B3) * B0. */, + 0x7C00 /* A4 + (-B4) * B0. */, + 0x724B /* A5 + (-B5) * B0. */, + 0x6286 /* A6 + (-B6) * B0. */, + 0xFC00 /* A7 + (-B7) * B0. */ }; + +VECT_VAR_DECL (expected1_fms_laneq_static, hfloat, 16, 8) [] + = { 0x5C0D /* A0 + (-B0) * B1. */, + 0xD0EE /* A1 + (-B1) * B1. */, + 0x5274 /* A2 + (-B2) * B1. */, + 0x5026 /* A3 + (-B3) * B1. */, + 0x7C00 /* A4 + (-B4) * B1. */, + 0xEA41 /* A5 + (-B5) * B1. */, + 0xD5DA /* A6 + (-B6) * B1. */, + 0x7C00 /* A7 + (-B7) * B1. */ }; + +VECT_VAR_DECL (expected2_fms_laneq_static, hfloat, 16, 8) [] + = { 0xD54E /* A0 + (-B0) * B2. */, + 0x51BA /* A1 + (-B1) * B2. */, + 0xD4F3 /* A2 + (-B2) * B2. */, + 0xCE66 /* A3 + (-B3) * B2. */, + 0x7C00 /* A4 + (-B4) * B2. */, + 0x6CC8 /* A5 + (-B5) * B2. */, + 0x5DD7 /* A6 + (-B6) * B2. */, + 0xFC00 /* A7 + (-B7) * B2. */ }; + +VECT_VAR_DECL (expected3_fms_laneq_static, hfloat, 16, 8) [] + = { 0x4F70 /* A0 + (-B0) * B3. */, + 0x4C5A /* A1 + (-B1) * B3. */, + 0xD073 /* A2 + (-B2) * B3. */, + 0xC600 /* A3 + (-B3) * B3. */, + 0x7C00 /* A4 + (-B4) * B3. */, + 0x684B /* A5 + (-B5) * B3. */, + 0x5AD0 /* A6 + (-B6) * B3. */, + 0xFC00 /* A7 + (-B7) * B3. 
*/ }; + +VECT_VAR_DECL (expected4_fms_laneq_static, hfloat, 16, 8) [] + = { 0x5179 /* A0 + (-B0) * B4. */, + 0x4AF6 /* A1 + (-B1) * B4. */, + 0xCF91 /* A2 + (-B2) * B4. */, + 0xC334 /* A3 + (-B3) * B4. */, + 0x7C00 /* A4 + (-B4) * B4. */, + 0x674C /* A5 + (-B5) * B4. */, + 0x5A37 /* A6 + (-B6) * B4. */, + 0xFC00 /* A7 + (-B7) * B4. */ }; + +VECT_VAR_DECL (expected5_fms_laneq_static, hfloat, 16, 8) [] + = { 0x725C /* A0 + (-B0) * B5. */, + 0xEA41 /* A1 + (-B1) * B5. */, + 0x6CCA /* A2 + (-B2) * B5. */, + 0x6853 /* A3 + (-B3) * B5. */, + 0x7C00 /* A4 + (-B4) * B5. */, + 0xFC00 /* A5 + (-B5) * B5. */, + 0xF441 /* A6 + (-B6) * B5. */, + 0x7C00 /* A7 + (-B7) * B5. */ }; + +VECT_VAR_DECL (expected6_fms_laneq_static, hfloat, 16, 8) [] + = { 0x62C7 /* A0 + (-B0) * B6. */, + 0xD9F2 /* A1 + (-B1) * B6. */, + 0x5C6C /* A2 + (-B2) * B6. */, + 0x584A /* A3 + (-B3) * B6. */, + 0x7C00 /* A4 + (-B4) * B6. */, + 0xF447 /* A5 + (-B5) * B6. */, + 0xE330 /* A6 + (-B6) * B6. */, + 0x7C00 /* A7 + (-B7) * B6. */ }; + +VECT_VAR_DECL (expected7_fms_laneq_static, hfloat, 16, 8) [] + = { 0xFC00 /* A0 + (-B0) * B7. */, + 0x7C00 /* A1 + (-B1) * B7. */, + 0xFC00 /* A2 + (-B2) * B7. */, + 0xFC00 /* A3 + (-B3) * B7. */, + 0x7C00 /* A4 + (-B4) * B7. */, + 0x7C00 /* A5 + (-B5) * B7. */, + 0x7C00 /* A6 + (-B6) * B7. */, + 0xFC00 /* A7 + (-B7) * B7. 
 */ };
+
+void exec_vfmas_lane_f16 (void)
+{
+#undef TEST_MSG
+#define TEST_MSG "VFMA_LANE (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc_1, float, 16, 4);
+  DECL_VARIABLE(vsrc_2, float, 16, 4);
+  VECT_VAR_DECL (buf_src_1, float, 16, 4) [] = {A0, A1, A2, A3};
+  VECT_VAR_DECL (buf_src_2, float, 16, 4) [] = {B0, B1, B2, B3};
+  VLOAD (vsrc_1, buf_src_1, , float, f, 16, 4);
+  VLOAD (vsrc_2, buf_src_2, , float, f, 16, 4);
+  DECL_VARIABLE (vector_res, float, 16, 4)
+    = vfma_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                     VECT_VAR (vsrc_2, float, 16, 4),
+                     VECT_VAR (vsrc_2, float, 16, 4), 0);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected0_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vfma_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                     VECT_VAR (vsrc_2, float, 16, 4),
+                     VECT_VAR (vsrc_2, float, 16, 4), 1);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected1_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vfma_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                     VECT_VAR (vsrc_2, float, 16, 4),
+                     VECT_VAR (vsrc_2, float, 16, 4), 2);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected2_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vfma_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                     VECT_VAR (vsrc_2, float, 16, 4),
+                     VECT_VAR (vsrc_2, float, 16, 4), 3);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected3_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VFMAQ_LANE (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc_1, float, 16, 8);
+  DECL_VARIABLE(vsrc_2, float, 16, 8);
+  VECT_VAR_DECL (buf_src_1, float, 16, 8) [] = {A0, A1, A2, A3, A4, A5, A6, A7};
+  VECT_VAR_DECL (buf_src_2, float, 16, 8) [] = {B0,
B1, B2, B3, B4, B5, B6, B7}; + VLOAD (vsrc_1, buf_src_1, q, float, f, 16, 8); + VLOAD (vsrc_2, buf_src_2, q, float, f, 16, 8); + DECL_VARIABLE (vector_res, float, 16, 8) + = vfmaq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 0); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected0_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 1); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected1_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 2); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected2_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 3); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected3_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VFMA_LANEQ (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_3, float, 16, 8); + VECT_VAR_DECL (buf_src_3, float, 16, 8) [] = {B0, B1, B2, B3, B4, B5, B6, B7}; + VLOAD (vsrc_3, buf_src_3, q, float, f, 16, 8); + VECT_VAR (vector_res, float, 16, 4) + = vfma_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 0); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected0_laneq_static, ""); + + 
VECT_VAR (vector_res, float, 16, 4) + = vfma_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 1); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected1_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfma_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 2); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected2_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfma_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 3); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected3_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfma_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 4); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected4_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfma_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 5); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected5_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfma_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 6); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected6_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + 
= vfma_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 7); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected7_laneq_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VFMAQ_LANEQ (FP16)" + clean_results (); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 0); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected0_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 1); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected1_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 2); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected2_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 3); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected3_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 4); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, 
expected4_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 5); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected5_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 6); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected6_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 7); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected7_laneq_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VFMS_LANE (FP16)" + clean_results (); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 0); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected0_fms_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 1); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected1_fms_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 2); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + 
CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected2_fms_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 3); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected3_fms_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VFMSQ_LANE (FP16)" + clean_results (); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 0); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected0_fms_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 1); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected1_fms_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 2); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected2_fms_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 3); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected3_fms_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VFMS_LANEQ (FP16)" + clean_results (); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, 
float, 16, 8), 0); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected0_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 1); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected1_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 2); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected2_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 3); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected3_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 4); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected4_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 5); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected5_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 
8), 6); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected6_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), + VECT_VAR (vsrc_3, float, 16, 8), 7); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected7_fms_laneq_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VFMSQ_LANEQ (FP16)" + clean_results (); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 0); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected0_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 1); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected1_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 2); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected2_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 3); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected3_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_laneq_f16 (VECT_VAR (vsrc_1, float, 
16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 4); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected4_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 5); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected5_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 6); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected6_fms_laneq_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), + VECT_VAR (vsrc_3, float, 16, 8), 7); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected7_fms_laneq_static, ""); +} + +int +main (void) +{ + exec_vfmas_lane_f16 (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmas_n_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmas_n_f16_1.c new file mode 100644 index 0000000..f01aefb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmas_n_f16_1.c @@ -0,0 +1,469 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */ +/* { dg-add-options arm_v8_2a_fp16_neon } */ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include <arm_neon.h> +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define FP16_C(a) ((__fp16) a) +#define A0 FP16_C (123.4) +#define A1 FP16_C (-5.8) +#define A2 FP16_C (-0.0) +#define A3 FP16_C (10) +#define A4
FP16_C (123412.43) +#define A5 FP16_C (-5.8) +#define A6 FP16_C (90.8) +#define A7 FP16_C (24) + +#define B0 FP16_C (23.4) +#define B1 FP16_C (-5.8) +#define B2 FP16_C (8.9) +#define B3 FP16_C (4.0) +#define B4 FP16_C (3.4) +#define B5 FP16_C (-550.8) +#define B6 FP16_C (-31.8) +#define B7 FP16_C (20000.0) + +/* Expected results for vfma_n. */ +VECT_VAR_DECL (expected_fma0_static, hfloat, 16, 4) [] + = { 0x613E /* A0 + B0 * B0. */, + 0xD86D /* A1 + B1 * B0. */, + 0x5A82 /* A2 + B2 * B0. */, + 0x567A /* A3 + B3 * B0. */ }; + +VECT_VAR_DECL (expected_fma1_static, hfloat, 16, 4) [] + = { 0xCA33 /* A0 + B0 * B1. */, + 0x4EF6 /* A1 + B1 * B1. */, + 0xD274 /* A2 + B2 * B1. */, + 0xCA9A /* A3 + B3 * B1. */ }; + +VECT_VAR_DECL (expected_fma2_static, hfloat, 16, 4) [] + = { 0x5D2F /* A0 + B0 * B2. */, + 0xD32D /* A1 + B1 * B2. */, + 0x54F3 /* A2 + B2 * B2. */, + 0x51B3 /* A3 + B3 * B2. */ }; + +VECT_VAR_DECL (expected_fma3_static, hfloat, 16, 4) [] + = { 0x5AC8 /* A0 + B0 * B3. */, + 0xCF40 /* A1 + B1 * B3. */, + 0x5073 /* A2 + B2 * B3. */, + 0x4E80 /* A3 + B3 * B3. */ }; + +VECT_VAR_DECL (expected_fma0_static, hfloat, 16, 8) [] + = { 0x613E /* A0 + B0 * B0. */, + 0xD86D /* A1 + B1 * B0. */, + 0x5A82 /* A2 + B2 * B0. */, + 0x567A /* A3 + B3 * B0. */, + 0x7C00 /* A4 + B4 * B0. */, + 0xF24D /* A5 + B5 * B0. */, + 0xE11B /* A6 + B6 * B0. */, + 0x7C00 /* A7 + B7 * B0. */ }; + +VECT_VAR_DECL (expected_fma1_static, hfloat, 16, 8) [] + = { 0xCA33 /* A0 + B0 * B1. */, + 0x4EF6 /* A1 + B1 * B1. */, + 0xD274 /* A2 + B2 * B1. */, + 0xCA9A /* A3 + B3 * B1. */, + 0x7C00 /* A4 + B4 * B1. */, + 0x6A3B /* A5 + B5 * B1. */, + 0x5C4D /* A6 + B6 * B1. */, + 0xFC00 /* A7 + B7 * B1. */ }; + +VECT_VAR_DECL (expected_fma2_static, hfloat, 16, 8) [] + = { 0x5D2F /* A0 + B0 * B2. */, + 0xD32D /* A1 + B1 * B2. */, + 0x54F3 /* A2 + B2 * B2. */, + 0x51B3 /* A3 + B3 * B2. */, + 0x7C00 /* A4 + B4 * B2. */, + 0xECCB /* A5 + B5 * B2. */, + 0xDA01 /* A6 + B6 * B2. */, + 0x7C00 /* A7 + B7 * B2. 
*/ }; + +VECT_VAR_DECL (expected_fma3_static, hfloat, 16, 8) [] + = { 0x5AC8 /* A0 + B0 * B3. */, + 0xCF40 /* A1 + B1 * B3. */, + 0x5073 /* A2 + B2 * B3. */, + 0x4E80 /* A3 + B3 * B3. */, + 0x7C00 /* A4 + B4 * B3. */, + 0xE851 /* A5 + B5 * B3. */, + 0xD08C /* A6 + B6 * B3. */, + 0x7C00 /* A7 + B7 * B3. */ }; + +VECT_VAR_DECL (expected_fma4_static, hfloat, 16, 8) [] + = { 0x5A58 /* A0 + B0 * B4. */, + 0xCE62 /* A1 + B1 * B4. */, + 0x4F91 /* A2 + B2 * B4. */, + 0x4DE6 /* A3 + B3 * B4. */, + 0x7C00 /* A4 + B4 * B4. */, + 0xE757 /* A5 + B5 * B4. */, + 0xCC54 /* A6 + B6 * B4. */, + 0x7C00 /* A7 + B7 * B4. */ }; + +VECT_VAR_DECL (expected_fma5_static, hfloat, 16, 8) [] + = { 0xF23D /* A0 + B0 * B5. */, + 0x6A3B /* A1 + B1 * B5. */, + 0xECCA /* A2 + B2 * B5. */, + 0xE849 /* A3 + B3 * B5. */, + 0x7C00 /* A4 + B4 * B5. */, + 0x7C00 /* A5 + B5 * B5. */, + 0x744D /* A6 + B6 * B5. */, + 0xFC00 /* A7 + B7 * B5. */ }; + +VECT_VAR_DECL (expected_fma6_static, hfloat, 16, 8) [] + = { 0xE0DA /* A0 + B0 * B6. */, + 0x5995 /* A1 + B1 * B6. */, + 0xDC6C /* A2 + B2 * B6. */, + 0xD753 /* A3 + B3 * B6. */, + 0x7C00 /* A4 + B4 * B6. */, + 0x7447 /* A5 + B5 * B6. */, + 0x644E /* A6 + B6 * B6. */, + 0xFC00 /* A7 + B7 * B6. */ }; + +VECT_VAR_DECL (expected_fma7_static, hfloat, 16, 8) [] + = { 0x7C00 /* A0 + B0 * B7. */, + 0xFC00 /* A1 + B1 * B7. */, + 0x7C00 /* A2 + B2 * B7. */, + 0x7C00 /* A3 + B3 * B7. */, + 0x7C00 /* A4 + B4 * B7. */, + 0xFC00 /* A5 + B5 * B7. */, + 0xFC00 /* A6 + B6 * B7. */, + 0x7C00 /* A7 + B7 * B7. */ }; + +/* Expected results for vfms_n. */ +VECT_VAR_DECL (expected_fms0_static, hfloat, 16, 4) [] + = { 0xDEA2 /* A0 + (-B0) * B0. */, + 0x5810 /* A1 + (-B1) * B0. */, + 0xDA82 /* A2 + (-B2) * B0. */, + 0xD53A /* A3 + (-B3) * B0. */ }; + +VECT_VAR_DECL (expected_fms1_static, hfloat, 16, 4) [] + = { 0x5C0D /* A0 + (-B0) * B1. */, + 0xD0EE /* A1 + (-B1) * B1. */, + 0x5274 /* A2 + (-B2) * B1. */, + 0x5026 /* A3 + (-B3) * B1. 
*/ }; + +VECT_VAR_DECL (expected_fms2_static, hfloat, 16, 4) [] + = { 0xD54E /* A0 + (-B0) * B2. */, + 0x51BA /* A1 + (-B1) * B2. */, + 0xD4F3 /* A2 + (-B2) * B2. */, + 0xCE66 /* A3 + (-B3) * B2. */ }; + +VECT_VAR_DECL (expected_fms3_static, hfloat, 16, 4) [] + = { 0x4F70 /* A0 + (-B0) * B3. */, + 0x4C5A /* A1 + (-B1) * B3. */, + 0xD073 /* A2 + (-B2) * B3. */, + 0xC600 /* A3 + (-B3) * B3. */ }; + +VECT_VAR_DECL (expected_fms0_static, hfloat, 16, 8) [] + = { 0xDEA2 /* A0 + (-B0) * B0. */, + 0x5810 /* A1 + (-B1) * B0. */, + 0xDA82 /* A2 + (-B2) * B0. */, + 0xD53A /* A3 + (-B3) * B0. */, + 0x7C00 /* A4 + (-B4) * B0. */, + 0x724B /* A5 + (-B5) * B0. */, + 0x6286 /* A6 + (-B6) * B0. */, + 0xFC00 /* A7 + (-B7) * B0. */ }; + +VECT_VAR_DECL (expected_fms1_static, hfloat, 16, 8) [] + = { 0x5C0D /* A0 + (-B0) * B1. */, + 0xD0EE /* A1 + (-B1) * B1. */, + 0x5274 /* A2 + (-B2) * B1. */, + 0x5026 /* A3 + (-B3) * B1. */, + 0x7C00 /* A4 + (-B4) * B1. */, + 0xEA41 /* A5 + (-B5) * B1. */, + 0xD5DA /* A6 + (-B6) * B1. */, + 0x7C00 /* A7 + (-B7) * B1. */ }; + +VECT_VAR_DECL (expected_fms2_static, hfloat, 16, 8) [] + = { 0xD54E /* A0 + (-B0) * B2. */, + 0x51BA /* A1 + (-B1) * B2. */, + 0xD4F3 /* A2 + (-B2) * B2. */, + 0xCE66 /* A3 + (-B3) * B2. */, + 0x7C00 /* A4 + (-B4) * B2. */, + 0x6CC8 /* A5 + (-B5) * B2. */, + 0x5DD7 /* A6 + (-B6) * B2. */, + 0xFC00 /* A7 + (-B7) * B2. */ }; + +VECT_VAR_DECL (expected_fms3_static, hfloat, 16, 8) [] + = { 0x4F70 /* A0 + (-B0) * B3. */, + 0x4C5A /* A1 + (-B1) * B3. */, + 0xD073 /* A2 + (-B2) * B3. */, + 0xC600 /* A3 + (-B3) * B3. */, + 0x7C00 /* A4 + (-B4) * B3. */, + 0x684B /* A5 + (-B5) * B3. */, + 0x5AD0 /* A6 + (-B6) * B3. */, + 0xFC00 /* A7 + (-B7) * B3. */ }; + +VECT_VAR_DECL (expected_fms4_static, hfloat, 16, 8) [] + = { 0x5179 /* A0 + (-B0) * B4. */, + 0x4AF6 /* A1 + (-B1) * B4. */, + 0xCF91 /* A2 + (-B2) * B4. */, + 0xC334 /* A3 + (-B3) * B4. */, + 0x7C00 /* A4 + (-B4) * B4. */, + 0x674C /* A5 + (-B5) * B4. */, + 0x5A37 /* A6 + (-B6) * B4. 
*/, + 0xFC00 /* A7 + (-B7) * B4. */ }; + +VECT_VAR_DECL (expected_fms5_static, hfloat, 16, 8) [] + = { 0x725C /* A0 + (-B0) * B5. */, + 0xEA41 /* A1 + (-B1) * B5. */, + 0x6CCA /* A2 + (-B2) * B5. */, + 0x6853 /* A3 + (-B3) * B5. */, + 0x7C00 /* A4 + (-B4) * B5. */, + 0xFC00 /* A5 + (-B5) * B5. */, + 0xF441 /* A6 + (-B6) * B5. */, + 0x7C00 /* A7 + (-B7) * B5. */ }; + +VECT_VAR_DECL (expected_fms6_static, hfloat, 16, 8) [] + = { 0x62C7 /* A0 + (-B0) * B6. */, + 0xD9F2 /* A1 + (-B1) * B6. */, + 0x5C6C /* A2 + (-B2) * B6. */, + 0x584A /* A3 + (-B3) * B6. */, + 0x7C00 /* A4 + (-B4) * B6. */, + 0xF447 /* A5 + (-B5) * B6. */, + 0xE330 /* A6 + (-B6) * B6. */, + 0x7C00 /* A7 + (-B7) * B6. */ }; + +VECT_VAR_DECL (expected_fms7_static, hfloat, 16, 8) [] + = { 0xFC00 /* A0 + (-B0) * B7. */, + 0x7C00 /* A1 + (-B1) * B7. */, + 0xFC00 /* A2 + (-B2) * B7. */, + 0xFC00 /* A3 + (-B3) * B7. */, + 0x7C00 /* A4 + (-B4) * B7. */, + 0x7C00 /* A5 + (-B5) * B7. */, + 0x7C00 /* A6 + (-B6) * B7. */, + 0xFC00 /* A7 + (-B7) * B7. 
*/ }; + +void exec_vfmas_n_f16 (void) +{ +#undef TEST_MSG +#define TEST_MSG "VFMA_N (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_1, float, 16, 4); + DECL_VARIABLE(vsrc_2, float, 16, 4); + VECT_VAR_DECL (buf_src_1, float, 16, 4) [] = {A0, A1, A2, A3}; + VECT_VAR_DECL (buf_src_2, float, 16, 4) [] = {B0, B1, B2, B3}; + VLOAD (vsrc_1, buf_src_1, , float, f, 16, 4); + VLOAD (vsrc_2, buf_src_2, , float, f, 16, 4); + DECL_VARIABLE (vector_res, float, 16, 4) + = vfma_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), B0); + + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_fma0_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfma_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), B1); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_fma1_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfma_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), B2); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_fma2_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfma_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), B3); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_fma3_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VFMAQ_N (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_1, float, 16, 8); + DECL_VARIABLE(vsrc_2, float, 16, 8); + VECT_VAR_DECL (buf_src_1, float, 16, 8) [] = {A0, A1, A2, A3, A4, A5, A6, A7}; + VECT_VAR_DECL (buf_src_2, float, 16, 8) [] = {B0, B1, B2, B3, B4, B5, B6, B7}; + VLOAD (vsrc_1, buf_src_1, q, float, f, 16, 8); + VLOAD (vsrc_2, buf_src_2, q, float, f, 16, 8); + 
DECL_VARIABLE (vector_res, float, 16, 8) + = vfmaq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B0); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fma0_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B1); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fma1_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B2); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fma2_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B3); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fma3_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B4); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fma4_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B5); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fma5_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B6); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, 
expected_fma6_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmaq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B7); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fma7_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VFMS_N (FP16)" + clean_results (); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), B0); + + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_fms0_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), B1); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_fms1_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), B2); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_fms2_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vfms_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), B3); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_fms3_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VFMSQ_N (FP16)" + clean_results (); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B0); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fms0_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + 
VECT_VAR (vsrc_2, float, 16, 8), B1); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fms1_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B2); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fms2_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B3); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fms3_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B4); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fms4_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B5); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fms5_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B6); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fms6_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vfmsq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), B7); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_fms7_static, ""); +} + +int +main (void) +{ + exec_vfmas_n_f16 (); + return 0; +} diff 
--git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmaxnmv_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmaxnmv_f16_1.c new file mode 100644 index 0000000..ce9872f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmaxnmv_f16_1.c @@ -0,0 +1,131 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */ +/* { dg-add-options arm_v8_2a_fp16_neon } */ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include <arm_neon.h> +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define FP16_C(a) ((__fp16) a) +#define A0 FP16_C (34.8) +#define B0 FP16_C (__builtin_nanf ("")) +#define C0 FP16_C (-__builtin_nanf ("")) +#define D0 FP16_C (0.0) + +#define A1 FP16_C (1025.8) +#define B1 FP16_C (13.4) +#define C1 FP16_C (__builtin_nanf ("")) +#define D1 FP16_C (10) +#define E1 FP16_C (-0.0) +#define F1 FP16_C (-__builtin_nanf ("")) +#define G1 FP16_C (0.0) +#define H1 FP16_C (10) + +/* Expected results for vmaxnmv. */ +uint16_t expect = 0x505A /* A0. */; +uint16_t expect_alt = 0x6402 /* A1. 
*/; + +void exec_vmaxnmv_f16 (void) +{ +#undef TEST_MSG +#define TEST_MSG "VMAXNMV (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc, float, 16, 4); + VECT_VAR_DECL (buf_src, float, 16, 4) [] = {A0, B0, C0, D0}; + VLOAD (vsrc, buf_src, , float, f, 16, 4); + float16_t vector_res = vmaxnmv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src1, float, 16, 4) [] = {B0, A0, C0, D0}; + VLOAD (vsrc, buf_src1, , float, f, 16, 4); + vector_res = vmaxnmv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src2, float, 16, 4) [] = {B0, C0, A0, D0}; + VLOAD (vsrc, buf_src2, , float, f, 16, 4); + vector_res = vmaxnmv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src3, float, 16, 4) [] = {B0, C0, D0, A0}; + VLOAD (vsrc, buf_src3, , float, f, 16, 4); + vector_res = vmaxnmv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + +#undef TEST_MSG +#define TEST_MSG "VMAXNMVQ (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc, float, 16, 8); + VECT_VAR_DECL (buf_src, float, 16, 8) [] = {A1, B1, C1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src, q, float, f, 16, 8); + vector_res = vmaxnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src1, float, 16, 8) [] = {B1, A1, C1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src1, q, float, f, 16, 8); + vector_res = vmaxnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src2, float, 16, 8) [] = {B1, C1, A1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src2, q, float, f, 16, 8); + vector_res = vmaxnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src3, float, 16, 8) [] = {B1, C1, D1, A1, 
E1, F1, G1, H1}; + VLOAD (vsrc, buf_src3, q, float, f, 16, 8); + vector_res = vmaxnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src4, float, 16, 8) [] = {B1, C1, D1, E1, A1, F1, G1, H1}; + VLOAD (vsrc, buf_src4, q, float, f, 16, 8); + vector_res = vmaxnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src5, float, 16, 8) [] = {B1, C1, D1, E1, F1, A1, G1, H1}; + VLOAD (vsrc, buf_src5, q, float, f, 16, 8); + vector_res = vmaxnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src6, float, 16, 8) [] = {B1, C1, D1, E1, F1, G1, A1, H1}; + VLOAD (vsrc, buf_src6, q, float, f, 16, 8); + vector_res = vmaxnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src7, float, 16, 8) [] = {B1, C1, D1, E1, F1, G1, H1, A1}; + VLOAD (vsrc, buf_src7, q, float, f, 16, 8); + vector_res = vmaxnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); +} + +int +main (void) +{ + exec_vmaxnmv_f16 (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmaxv_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmaxv_f16_1.c new file mode 100644 index 0000000..39c4897 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmaxv_f16_1.c @@ -0,0 +1,131 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */ +/* { dg-add-options arm_v8_2a_fp16_neon } */ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include <arm_neon.h> +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define FP16_C(a) ((__fp16) a) +#define A0 FP16_C (123.4) +#define B0 FP16_C (-567.8) +#define C0 FP16_C (34.8) +#define D0 FP16_C (0.0) + +#define A1 FP16_C (1025.8) +#define B1 FP16_C (13.4) +#define C1
FP16_C (-567.8) +#define D1 FP16_C (10) +#define E1 FP16_C (-0.0) +#define F1 FP16_C (567.8) +#define G1 FP16_C (0.0) +#define H1 FP16_C (10) + +/* Expected results for vmaxv. */ +uint16_t expect = 0x57B6 /* A0. */; +uint16_t expect_alt = 0x6402 /* A1. */; + +void exec_vmaxv_f16 (void) +{ +#undef TEST_MSG +#define TEST_MSG "VMAXV (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc, float, 16, 4); + VECT_VAR_DECL (buf_src, float, 16, 4) [] = {A0, B0, C0, D0}; + VLOAD (vsrc, buf_src, , float, f, 16, 4); + float16_t vector_res = vmaxv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src1, float, 16, 4) [] = {B0, A0, C0, D0}; + VLOAD (vsrc, buf_src1, , float, f, 16, 4); + vector_res = vmaxv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src2, float, 16, 4) [] = {B0, C0, A0, D0}; + VLOAD (vsrc, buf_src2, , float, f, 16, 4); + vector_res = vmaxv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src3, float, 16, 4) [] = {B0, C0, D0, A0}; + VLOAD (vsrc, buf_src3, , float, f, 16, 4); + vector_res = vmaxv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + +#undef TEST_MSG +#define TEST_MSG "VMAXVQ (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc, float, 16, 8); + VECT_VAR_DECL (buf_src, float, 16, 8) [] = {A1, B1, C1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src, q, float, f, 16, 8); + vector_res = vmaxvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src1, float, 16, 8) [] = {B1, A1, C1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src1, q, float, f, 16, 8); + vector_res = vmaxvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src2, float, 16, 8) [] = {B1, C1, A1, D1, E1, F1, G1, 
H1}; + VLOAD (vsrc, buf_src2, q, float, f, 16, 8); + vector_res = vmaxvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src3, float, 16, 8) [] = {B1, C1, D1, A1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src3, q, float, f, 16, 8); + vector_res = vmaxvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src4, float, 16, 8) [] = {B1, C1, D1, E1, A1, F1, G1, H1}; + VLOAD (vsrc, buf_src4, q, float, f, 16, 8); + vector_res = vmaxvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src5, float, 16, 8) [] = {B1, C1, D1, E1, F1, A1, G1, H1}; + VLOAD (vsrc, buf_src5, q, float, f, 16, 8); + vector_res = vmaxvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src6, float, 16, 8) [] = {B1, C1, D1, E1, F1, G1, A1, H1}; + VLOAD (vsrc, buf_src6, q, float, f, 16, 8); + vector_res = vmaxvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src7, float, 16, 8) [] = {B1, C1, D1, E1, F1, G1, H1, A1}; + VLOAD (vsrc, buf_src7, q, float, f, 16, 8); + vector_res = vmaxvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); +} + +int +main (void) +{ + exec_vmaxv_f16 (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vminnmv_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vminnmv_f16_1.c new file mode 100644 index 0000000..b7c5101 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vminnmv_f16_1.c @@ -0,0 +1,131 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */ +/* { dg-add-options arm_v8_2a_fp16_neon } */ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include <arm_neon.h> +#include "arm-neon-ref.h" +#include 
"compute-ref-data.h" + +#define FP16_C(a) ((__fp16) a) +#define A0 FP16_C (-567.8) +#define B0 FP16_C (__builtin_nanf ("")) +#define C0 FP16_C (34.8) +#define D0 FP16_C (-__builtin_nanf ("")) + +#define A1 FP16_C (-567.8) +#define B1 FP16_C (1025.8) +#define C1 FP16_C (-__builtin_nanf ("")) +#define D1 FP16_C (10) +#define E1 FP16_C (-0.0) +#define F1 FP16_C (__builtin_nanf ("")) +#define G1 FP16_C (0.0) +#define H1 FP16_C (10) + +/* Expected results for vminnmv. */ +uint16_t expect = 0xE070 /* A0. */; +uint16_t expect_alt = 0xE070 /* A1. */; + +void exec_vminnmv_f16 (void) +{ +#undef TEST_MSG +#define TEST_MSG "VMINNMV (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc, float, 16, 4); + VECT_VAR_DECL (buf_src, float, 16, 4) [] = {A0, B0, C0, D0}; + VLOAD (vsrc, buf_src, , float, f, 16, 4); + float16_t vector_res = vminnmv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src1, float, 16, 4) [] = {B0, A0, C0, D0}; + VLOAD (vsrc, buf_src1, , float, f, 16, 4); + vector_res = vminnmv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src2, float, 16, 4) [] = {B0, C0, A0, D0}; + VLOAD (vsrc, buf_src2, , float, f, 16, 4); + vector_res = vminnmv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src3, float, 16, 4) [] = {B0, C0, D0, A0}; + VLOAD (vsrc, buf_src3, , float, f, 16, 4); + vector_res = vminnmv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + +#undef TEST_MSG +#define TEST_MSG "VMINNMVQ (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc, float, 16, 8); + VECT_VAR_DECL (buf_src, float, 16, 8) [] = {A1, B1, C1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src, q, float, f, 16, 8); + vector_res = vminnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL 
(buf_src1, float, 16, 8) [] = {B1, A1, C1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src1, q, float, f, 16, 8); + vector_res = vminnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src2, float, 16, 8) [] = {B1, C1, A1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src2, q, float, f, 16, 8); + vector_res = vminnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src3, float, 16, 8) [] = {B1, C1, D1, A1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src3, q, float, f, 16, 8); + vector_res = vminnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src4, float, 16, 8) [] = {B1, C1, D1, E1, A1, F1, G1, H1}; + VLOAD (vsrc, buf_src4, q, float, f, 16, 8); + vector_res = vminnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src5, float, 16, 8) [] = {B1, C1, D1, E1, F1, A1, G1, H1}; + VLOAD (vsrc, buf_src5, q, float, f, 16, 8); + vector_res = vminnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src6, float, 16, 8) [] = {B1, C1, D1, E1, F1, G1, A1, H1}; + VLOAD (vsrc, buf_src6, q, float, f, 16, 8); + vector_res = vminnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src7, float, 16, 8) [] = {B1, C1, D1, E1, F1, G1, H1, A1}; + VLOAD (vsrc, buf_src7, q, float, f, 16, 8); + vector_res = vminnmvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); +} + +int +main (void) +{ + exec_vminnmv_f16 (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vminv_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vminv_f16_1.c new file mode 100644 index 0000000..c454a53 --- 
/dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vminv_f16_1.c @@ -0,0 +1,131 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */ +/* { dg-add-options arm_v8_2a_fp16_neon } */ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include <arm_neon.h> +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define FP16_C(a) ((__fp16) a) +#define A0 FP16_C (-567.8) +#define B0 FP16_C (123.4) +#define C0 FP16_C (34.8) +#define D0 FP16_C (0.0) + +#define A1 FP16_C (-567.8) +#define B1 FP16_C (1025.8) +#define C1 FP16_C (13.4) +#define D1 FP16_C (10) +#define E1 FP16_C (-0.0) +#define F1 FP16_C (567.8) +#define G1 FP16_C (0.0) +#define H1 FP16_C (10) + +/* Expected results for vminv. */ +uint16_t expect = 0xE070 /* A0. */; +uint16_t expect_alt = 0xE070 /* A1. */; + +void exec_vminv_f16 (void) +{ +#undef TEST_MSG +#define TEST_MSG "VMINV (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc, float, 16, 4); + VECT_VAR_DECL (buf_src, float, 16, 4) [] = {A0, B0, C0, D0}; + VLOAD (vsrc, buf_src, , float, f, 16, 4); + float16_t vector_res = vminv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src1, float, 16, 4) [] = {B0, A0, C0, D0}; + VLOAD (vsrc, buf_src1, , float, f, 16, 4); + vector_res = vminv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src2, float, 16, 4) [] = {B0, C0, A0, D0}; + VLOAD (vsrc, buf_src2, , float, f, 16, 4); + vector_res = vminv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + + VECT_VAR_DECL (buf_src3, float, 16, 4) [] = {B0, C0, D0, A0}; + VLOAD (vsrc, buf_src3, , float, f, 16, 4); + vector_res = vminv_f16 (VECT_VAR (vsrc, float, 16, 4)); + + if (* (uint16_t *) &vector_res != expect) + abort (); + +#undef TEST_MSG +#define TEST_MSG "VMINVQ (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc, float, 16, 8); + VECT_VAR_DECL (buf_src,
float, 16, 8) [] = {A1, B1, C1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src, q, float, f, 16, 8); + vector_res = vminvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src1, float, 16, 8) [] = {B1, A1, C1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src1, q, float, f, 16, 8); + vector_res = vminvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src2, float, 16, 8) [] = {B1, C1, A1, D1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src2, q, float, f, 16, 8); + vector_res = vminvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src3, float, 16, 8) [] = {B1, C1, D1, A1, E1, F1, G1, H1}; + VLOAD (vsrc, buf_src3, q, float, f, 16, 8); + vector_res = vminvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src4, float, 16, 8) [] = {B1, C1, D1, E1, A1, F1, G1, H1}; + VLOAD (vsrc, buf_src4, q, float, f, 16, 8); + vector_res = vminvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src5, float, 16, 8) [] = {B1, C1, D1, E1, F1, A1, G1, H1}; + VLOAD (vsrc, buf_src5, q, float, f, 16, 8); + vector_res = vminvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src6, float, 16, 8) [] = {B1, C1, D1, E1, F1, G1, A1, H1}; + VLOAD (vsrc, buf_src6, q, float, f, 16, 8); + vector_res = vminvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); + + VECT_VAR_DECL (buf_src7, float, 16, 8) [] = {B1, C1, D1, E1, F1, G1, H1, A1}; + VLOAD (vsrc, buf_src7, q, float, f, 16, 8); + vector_res = vminvq_f16 (VECT_VAR (vsrc, float, 16, 8)); + + if (* (uint16_t *) &vector_res != expect_alt) + abort (); +} + +int +main (void) +{ + 
exec_vminv_f16 (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmul_lane_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmul_lane_f16_1.c new file mode 100644 index 0000000..1719d56 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmul_lane_f16_1.c @@ -0,0 +1,454 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */ +/* { dg-add-options arm_v8_2a_fp16_neon } */ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include <arm_neon.h> +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define FP16_C(a) ((__fp16) a) +#define A FP16_C (13.4) +#define B FP16_C (-56.8) +#define C FP16_C (-34.8) +#define D FP16_C (12) +#define E FP16_C (63.1) +#define F FP16_C (19.1) +#define G FP16_C (-4.8) +#define H FP16_C (77) + +#define I FP16_C (0.7) +#define J FP16_C (-78) +#define K FP16_C (11.23) +#define L FP16_C (98) +#define M FP16_C (87.1) +#define N FP16_C (-8) +#define O FP16_C (-1.1) +#define P FP16_C (-9.7) + +/* Expected results for vmul_lane. */ +VECT_VAR_DECL (expected0_static, hfloat, 16, 4) [] + = { 0x629B /* A * E. */, + 0xEB00 /* B * E. */, + 0xE84A /* C * E. */, + 0x61EA /* D * E. */ }; + +VECT_VAR_DECL (expected1_static, hfloat, 16, 4) [] + = { 0x5BFF /* A * F. */, + 0xE43D /* B * F. */, + 0xE131 /* C * F. */, + 0x5B29 /* D * F. */ }; + +VECT_VAR_DECL (expected2_static, hfloat, 16, 4) [] + = { 0xD405 /* A * G. */, + 0x5C43 /* B * G. */, + 0x5939 /* C * G. */, + 0xD334 /* D * G. */ }; + +VECT_VAR_DECL (expected3_static, hfloat, 16, 4) [] + = { 0x6408 /* A * H. */, + 0xEC46 /* B * H. */, + 0xE93C /* C * H. */, + 0x6338 /* D * H. */ }; + +/* Expected results for vmulq_lane. */ +VECT_VAR_DECL (expected0_static, hfloat, 16, 8) [] + = { 0x629B /* A * E. */, + 0xEB00 /* B * E. */, + 0xE84A /* C * E. */, + 0x61EA /* D * E. */, + 0x5186 /* I * E. */, + 0xECCE /* J * E. */, + 0x6189 /* K * E. */, + 0x6E0A /* L * E. 
*/ }; + +VECT_VAR_DECL (expected1_static, hfloat, 16, 8) [] + = { 0x5BFF /* A * F. */, + 0xE43D /* B * F. */, + 0xE131 /* C * F. */, + 0x5B29 /* D * F. */, + 0x4AAF /* I * F. */, + 0xE5D1 /* J * F. */, + 0x5AB3 /* K * F. */, + 0x674F /* L * F. */ }; + +VECT_VAR_DECL (expected2_static, hfloat, 16, 8) [] + = { 0xD405 /* A * G. */, + 0x5C43 /* B * G. */, + 0x5939 /* C * G. */, + 0xD334 /* D * G. */, + 0xC2B9 /* I * G. */, + 0x5DDA /* J * G. */, + 0xD2BD /* K * G. */, + 0xDF5A /* L * G. */ }; + +VECT_VAR_DECL (expected3_static, hfloat, 16, 8) [] + = { 0x6408 /* A * H. */, + 0xEC46 /* B * H. */, + 0xE93C /* C * H. */, + 0x6338 /* D * H. */, + 0x52BD /* I * H. */, + 0xEDDE /* J * H. */, + 0x62C1 /* K * H. */, + 0x6F5E /* L * H. */ }; + +/* Expected results for vmul_laneq. */ +VECT_VAR_DECL (expected_laneq0_static, hfloat, 16, 4) [] + = { 0x629B /* A * E. */, + 0xEB00 /* B * E. */, + 0xE84A /* C * E. */, + 0x61EA /* D * E. */ }; + +VECT_VAR_DECL (expected_laneq1_static, hfloat, 16, 4) [] + = { 0x5BFF /* A * F. */, + 0xE43D /* B * F. */, + 0xE131 /* C * F. */, + 0x5B29 /* D * F. */ }; + +VECT_VAR_DECL (expected_laneq2_static, hfloat, 16, 4) [] + = { 0xD405 /* A * G. */, + 0x5C43 /* B * G. */, + 0x5939 /* C * G. */, + 0xD334 /* D * G. */ }; + +VECT_VAR_DECL (expected_laneq3_static, hfloat, 16, 4) [] + = { 0x6408 /* A * H. */, + 0xEC46 /* B * H. */, + 0xE93C /* C * H. */, + 0x6338 /* D * H. */ }; + +VECT_VAR_DECL (expected_laneq4_static, hfloat, 16, 4) [] + = { 0x648F /* A * M. */, + 0xECD5 /* B * M. */, + 0xE9ED /* C * M. */, + 0x6416 /* D * M. */ }; + +VECT_VAR_DECL (expected_laneq5_static, hfloat, 16, 4) [] + = { 0xD6B3 /* A * N. */, + 0x5F1A /* B * N. */, + 0x5C5A /* C * N. */, + 0xD600 /* D * N. */ }; + +VECT_VAR_DECL (expected_laneq6_static, hfloat, 16, 4) [] + = { 0xCB5E /* A * O. */, + 0x53CF /* B * O. */, + 0x50C9 /* C * O. */, + 0xCA99 /* D * O. */ }; + +VECT_VAR_DECL (expected_laneq7_static, hfloat, 16, 4) [] + = { 0xD810 /* A * P. */, + 0x604F /* B * P. 
*/, + 0x5D47 /* C * P. */, + 0xD747 /* D * P. */ }; + +/* Expected results for vmulq_laneq. */ +VECT_VAR_DECL (expected_laneq0_static, hfloat, 16, 8) [] + = { 0x629B /* A * E. */, + 0xEB00 /* B * E. */, + 0xE84A /* C * E. */, + 0x61EA /* D * E. */, + 0x5186 /* I * E. */, + 0xECCE /* J * E. */, + 0x6189 /* K * E. */, + 0x6E0A /* L * E. */ }; + +VECT_VAR_DECL (expected_laneq1_static, hfloat, 16, 8) [] + = { 0x5BFF /* A * F. */, + 0xE43D /* B * F. */, + 0xE131 /* C * F. */, + 0x5B29 /* D * F. */, + 0x4AAF /* I * F. */, + 0xE5D1 /* J * F. */, + 0x5AB3 /* K * F. */, + 0x674F /* L * F. */ }; + +VECT_VAR_DECL (expected_laneq2_static, hfloat, 16, 8) [] + = { 0xD405 /* A * G. */, + 0x5C43 /* B * G. */, + 0x5939 /* C * G. */, + 0xD334 /* D * G. */, + 0xC2B9 /* I * G. */, + 0x5DDA /* J * G. */, + 0xD2BD /* K * G. */, + 0xDF5A /* L * G. */ }; + +VECT_VAR_DECL (expected_laneq3_static, hfloat, 16, 8) [] + = { 0x6408 /* A * H. */, + 0xEC46 /* B * H. */, + 0xE93C /* C * H. */, + 0x6338 /* D * H. */, + 0x52BD /* I * H. */, + 0xEDDE /* J * H. */, + 0x62C1 /* K * H. */, + 0x6F5E /* L * H. */ }; + +VECT_VAR_DECL (expected_laneq4_static, hfloat, 16, 8) [] + = { 0x648F /* A * M. */, + 0xECD5 /* B * M. */, + 0xE9ED /* C * M. */, + 0x6416 /* D * M. */, + 0x53A0 /* I * M. */, + 0xEEA3 /* J * M. */, + 0x63A4 /* K * M. */, + 0x702B /* L * M. */ }; + +VECT_VAR_DECL (expected_laneq5_static, hfloat, 16, 8) [] + = { 0xD6B3 /* A * N. */, + 0x5F1A /* B * N. */, + 0x5C5A /* C * N. */, + 0xD600 /* D * N. */, + 0xC59A /* I * N. */, + 0x60E0 /* J * N. */, + 0xD59D /* K * N. */, + 0xE220 /* L * N. */ }; + +VECT_VAR_DECL (expected_laneq6_static, hfloat, 16, 8) [] + = { 0xCB5E /* A * O. */, + 0x53CF /* B * O. */, + 0x50C9 /* C * O. */, + 0xCA99 /* D * O. */, + 0xBA29 /* I * O. */, + 0x555C /* J * O. */, + 0xCA2C /* K * O. */, + 0xD6BC /* L * O. */ }; + +VECT_VAR_DECL (expected_laneq7_static, hfloat, 16, 8) [] + = { 0xD810 /* A * P. */, + 0x604F /* B * P. */, + 0x5D47 /* C * P. */, + 0xD747 /* D * P. 
*/, + 0xC6CB /* I * P. */, + 0x61EA /* J * P. */, + 0xD6CF /* K * P. */, + 0xE36E /* L * P. */ }; + +void exec_vmul_lane_f16 (void) +{ +#undef TEST_MSG +#define TEST_MSG "VMUL_LANE (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_1, float, 16, 4); + DECL_VARIABLE(vsrc_2, float, 16, 4); + VECT_VAR_DECL (buf_src_1, float, 16, 4) [] = {A, B, C, D}; + VECT_VAR_DECL (buf_src_2, float, 16, 4) [] = {E, F, G, H}; + VLOAD (vsrc_1, buf_src_1, , float, f, 16, 4); + VLOAD (vsrc_2, buf_src_2, , float, f, 16, 4); + DECL_VARIABLE (vector_res, float, 16, 4) + = vmul_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 0); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected0_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 1); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected1_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 2); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected2_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 3); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected3_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VMULQ_LANE (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_1, float, 16, 8); + VECT_VAR_DECL (buf_src_1, float, 16, 8) [] = {A, B, C, D, I, J, K, L}; + VLOAD (vsrc_1, buf_src_1, q, float, f, 16, 8); + DECL_VARIABLE (vector_res, float, 16, 8) + = vmulq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR 
(vsrc_2, float, 16, 4), 0); + + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected0_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 1); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected1_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 2); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected2_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 3); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected3_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VMUL_LANEQ (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_2, float, 16, 8); + VECT_VAR_DECL (buf_src_2, float, 16, 8) [] = {E, F, G, H, M, N, O, P}; + VLOAD (vsrc_2, buf_src_2, q, float, f, 16, 8); + VECT_VAR (vector_res, float, 16, 4) + = vmul_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 8), 0); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq0_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 8), 1); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq1_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 8), 2); 
+ vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq2_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 8), 3); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq3_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 8), 4); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq4_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 8), 5); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq5_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 8), 6); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq6_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmul_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 8), 7); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq7_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VMULQ_LANEQ (FP16)" + clean_results (); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), 0); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq0_static, ""); + + 
VECT_VAR (vector_res, float, 16, 8) + = vmulq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), 1); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq1_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), 2); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq2_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), 3); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq3_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), 4); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq4_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), 5); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq5_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), 6); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq6_static, ""); + + VECT_VAR (vector_res, float, 16, 8) + = vmulq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8), 7); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP 
(TEST_MSG, float, 16, 8, PRIx16, expected_laneq7_static, ""); +} + +int +main (void) +{ + exec_vmul_lane_f16 (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_f16_1.c new file mode 100644 index 0000000..51bbead --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_f16_1.c @@ -0,0 +1,84 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */ +/* { dg-add-options arm_v8_2a_fp16_neon } */ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include <arm_neon.h> +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define FP16_C(a) ((__fp16) a) +#define A FP16_C (13.4) +#define B FP16_C (__builtin_inff ()) +#define C FP16_C (-34.8) +#define D FP16_C (-__builtin_inff ()) +#define E FP16_C (63.1) +#define F FP16_C (0.0) +#define G FP16_C (-4.8) +#define H FP16_C (0.0) + +#define I FP16_C (0.7) +#define J FP16_C (-__builtin_inff ()) +#define K FP16_C (11.23) +#define L FP16_C (98) +#define M FP16_C (87.1) +#define N FP16_C (-0.0) +#define O FP16_C (-1.1) +#define P FP16_C (7) + +/* Expected results for vmulx. */ +VECT_VAR_DECL (expected_static, hfloat, 16, 4) [] + = { 0x629B /* A * E. */, 0x4000 /* FP16_C (2.0f). */, + 0x5939 /* C * G. */, 0xC000 /* FP16_C (-2.0f). */ }; + +VECT_VAR_DECL (expected_static, hfloat, 16, 8) [] + = { 0x629B /* A * E. */, 0x4000 /* FP16_C (2.0f). */, + 0x5939 /* C * G. */, 0xC000 /* FP16_C (-2.0f). */, + 0x53A0 /* I * M. */, 0x4000 /* FP16_C (2.0f). */, + 0xCA2C /* K * O. */, 0x615C /* L * P. 
*/ }; + +void exec_vmulx_f16 (void) +{ +#undef TEST_MSG +#define TEST_MSG "VMULX (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_1, float, 16, 4); + DECL_VARIABLE(vsrc_2, float, 16, 4); + VECT_VAR_DECL (buf_src_1, float, 16, 4) [] = {A, B, C, D}; + VECT_VAR_DECL (buf_src_2, float, 16, 4) [] = {E, F, G, H}; + VLOAD (vsrc_1, buf_src_1, , float, f, 16, 4); + VLOAD (vsrc_2, buf_src_2, , float, f, 16, 4); + DECL_VARIABLE (vector_res, float, 16, 4) + = vmulx_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4)); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VMULXQ (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_1, float, 16, 8); + DECL_VARIABLE(vsrc_2, float, 16, 8); + VECT_VAR_DECL (buf_src_1, float, 16, 8) [] = {A, B, C, D, I, J, K, L}; + VECT_VAR_DECL (buf_src_2, float, 16, 8) [] = {E, F, G, H, M, N, O, P}; + VLOAD (vsrc_1, buf_src_1, q, float, f, 16, 8); + VLOAD (vsrc_2, buf_src_2, q, float, f, 16, 8); + DECL_VARIABLE (vector_res, float, 16, 8) + = vmulxq_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 8)); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR (vector_res, float, 16, 8)); + + CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_static, ""); +} + +int +main (void) +{ + exec_vmulx_f16 (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_lane_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_lane_f16_1.c new file mode 100644 index 0000000..f90a36d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_lane_f16_1.c @@ -0,0 +1,452 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */ +/* { dg-add-options arm_v8_2a_fp16_neon } */ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include <arm_neon.h> +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define FP16_C(a) 
((__fp16) a) +#define A FP16_C (13.4) +#define B FP16_C (__builtin_inff ()) +#define C FP16_C (-34.8) +#define D FP16_C (-__builtin_inff ()) +#define E FP16_C (-0.0) +#define F FP16_C (19.1) +#define G FP16_C (-4.8) +#define H FP16_C (0.0) + +#define I FP16_C (0.7) +#define J FP16_C (-78) +#define K FP16_C (-__builtin_inff ()) +#define L FP16_C (98) +#define M FP16_C (87.1) +#define N FP16_C (-8) +#define O FP16_C (-1.1) +#define P FP16_C (-0.0) + +/* Expected results for vmulx_lane. */ +VECT_VAR_DECL (expected0_static, hfloat, 16, 4) [] + = { 0x8000 /* A * E. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* C * E. */, + 0x4000 /* FP16_C (2.0f). */ }; + +VECT_VAR_DECL (expected1_static, hfloat, 16, 4) [] + = { 0x5BFF /* A * F. */, + 0x7C00 /* B * F. */, + 0xE131 /* C * F. */, + 0xFC00 /* D * F. */ }; + +VECT_VAR_DECL (expected2_static, hfloat, 16, 4) [] + = { 0xD405 /* A * G. */, + 0xFC00 /* B * G. */, + 0x5939 /* C * G. */, + 0x7C00 /* D * G. */ }; + +VECT_VAR_DECL (expected3_static, hfloat, 16, 4) [] + = { 0x0000 /* A * H. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* C * H. */, + 0xC000 /* FP16_C (-2.0f). */ }; + +/* Expected results for vmulxq_lane. */ +VECT_VAR_DECL (expected0_static, hfloat, 16, 8) [] + = { 0x8000 /* A * E. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* C * E. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* I * E. */, + 0x0000 /* J * E. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* L * E. */ }; + +VECT_VAR_DECL (expected1_static, hfloat, 16, 8) [] + = { 0x5BFF /* A * F. */, + 0x7C00 /* B * F. */, + 0xE131 /* C * F. */, + 0xFC00 /* D * F. */, + 0x4AAF /* I * F. */, + 0xE5D1 /* J * F. */, + 0xFC00 /* K * F. */, + 0x674F /* L * F. */ }; + +VECT_VAR_DECL (expected2_static, hfloat, 16, 8) [] + = { 0xD405 /* A * G. */, + 0xFC00 /* B * G. */, + 0x5939 /* C * G. */, + 0x7C00 /* D * G. */, + 0xC2B9 /* I * G. */, + 0x5DDA /* J * G. */, + 0x7C00 /* K * G. */, + 0xDF5A /* L * G. 
*/ }; + +VECT_VAR_DECL (expected3_static, hfloat, 16, 8) [] + = { 0x0000 /* A * H. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* C * H. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* I * H. */, + 0x8000 /* J * H. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* L * H. */}; + +/* Expected results for vmulx_laneq. */ +VECT_VAR_DECL (expected_laneq0_static, hfloat, 16, 4) [] + = { 0x8000 /* A * E. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* C * E. */, + 0x4000 /* FP16_C (2.0f). */ }; + +VECT_VAR_DECL (expected_laneq1_static, hfloat, 16, 4) [] + = { 0x5BFF /* A * F. */, + 0x7C00 /* B * F. */, + 0xE131 /* C * F. */, + 0xFC00 /* D * F. */ }; + +VECT_VAR_DECL (expected_laneq2_static, hfloat, 16, 4) [] + = { 0xD405 /* A * G. */, + 0xFC00 /* B * G. */, + 0x5939 /* C * G. */, + 0x7C00 /* D * G. */ }; + +VECT_VAR_DECL (expected_laneq3_static, hfloat, 16, 4) [] + = { 0x0000 /* A * H. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* C * H. */, + 0xC000 /* FP16_C (-2.0f). */ }; + +VECT_VAR_DECL (expected_laneq4_static, hfloat, 16, 4) [] + = { 0x648F /* A * M. */, + 0x7C00 /* B * M. */, + 0xE9ED /* C * M. */, + 0xFC00 /* D * M. */ }; + +VECT_VAR_DECL (expected_laneq5_static, hfloat, 16, 4) [] + = { 0xD6B3 /* A * N. */, + 0xFC00 /* B * N. */, + 0x5C5A /* C * N. */, + 0x7C00 /* D * N. */ }; + +VECT_VAR_DECL (expected_laneq6_static, hfloat, 16, 4) [] + = { 0xCB5E /* A * O. */, + 0xFC00 /* B * O. */, + 0x50C9 /* C * O. */, + 0x7C00 /* D * O. */ }; + +VECT_VAR_DECL (expected_laneq7_static, hfloat, 16, 4) [] + = { 0x8000 /* A * P. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* C * P. */, + 0x4000 /* FP16_C (2.0f). */ }; + +VECT_VAR_DECL (expected_laneq0_static, hfloat, 16, 8) [] + = { 0x8000 /* A * E. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* C * E. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* I * E. */, + 0x0000 /* J * E. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* L * E. */ }; + +VECT_VAR_DECL (expected_laneq1_static, hfloat, 16, 8) [] + = { 0x5BFF /* A * F. 
*/, + 0x7C00 /* B * F. */, + 0xE131 /* C * F. */, + 0xFC00 /* D * F. */, + 0x4AAF /* I * F. */, + 0xE5D1 /* J * F. */, + 0xFC00 /* K * F. */, + 0x674F /* L * F. */ }; + +VECT_VAR_DECL (expected_laneq2_static, hfloat, 16, 8) [] + = { 0xD405 /* A * G. */, + 0xFC00 /* B * G. */, + 0x5939 /* C * G. */, + 0x7C00 /* D * G. */, + 0xC2B9 /* I * G. */, + 0x5DDA /* J * G. */, + 0x7C00 /* K * G. */, + 0xDF5A /* L * G. */ }; + +VECT_VAR_DECL (expected_laneq3_static, hfloat, 16, 8) [] + = { 0x0000 /* A * H. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* C * H. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* I * H. */, + 0x8000 /* J * H. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* L * H. */ }; + +VECT_VAR_DECL (expected_laneq4_static, hfloat, 16, 8) [] + = { 0x648F /* A * M. */, + 0x7C00 /* B * M. */, + 0xE9ED /* C * M. */, + 0xFC00 /* D * M. */, + 0x53A0 /* I * M. */, + 0xEEA3 /* J * M. */, + 0xFC00 /* K * M. */, + 0x702B /* L * M. */ }; + +VECT_VAR_DECL (expected_laneq5_static, hfloat, 16, 8) [] + = { 0xD6B3 /* A * N. */, + 0xFC00 /* B * N. */, + 0x5C5A /* C * N. */, + 0x7C00 /* D * N. */, + 0xC59A /* I * N. */, + 0x60E0 /* J * N. */, + 0x7C00 /* K * N. */, + 0xE220 /* L * N. */ }; + +VECT_VAR_DECL (expected_laneq6_static, hfloat, 16, 8) [] + = { 0xCB5E /* A * O. */, + 0xFC00 /* B * O. */, + 0x50C9 /* C * O. */, + 0x7C00 /* D * O. */, + 0xBA29 /* I * O. */, + 0x555C /* J * O. */, + 0x7C00 /* K * O. */, + 0xD6BC /* L * O. */ }; + +VECT_VAR_DECL (expected_laneq7_static, hfloat, 16, 8) [] + = { 0x8000 /* A * P. */, + 0xC000 /* FP16_C (-2.0f). */, + 0x0000 /* C * P. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* I * P. */, + 0x0000 /* J * P. */, + 0x4000 /* FP16_C (2.0f). */, + 0x8000 /* L * P. 
*/ }; + +void exec_vmulx_lane_f16 (void) +{ +#undef TEST_MSG +#define TEST_MSG "VMULX_LANE (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_1, float, 16, 4); + DECL_VARIABLE(vsrc_2, float, 16, 4); + VECT_VAR_DECL (buf_src_1, float, 16, 4) [] = {A, B, C, D}; + VECT_VAR_DECL (buf_src_2, float, 16, 4) [] = {E, F, G, H}; + VLOAD (vsrc_1, buf_src_1, , float, f, 16, 4); + VLOAD (vsrc_2, buf_src_2, , float, f, 16, 4); + DECL_VARIABLE (vector_res, float, 16, 4) + = vmulx_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 0); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected0_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmulx_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 1); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected1_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmulx_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 2); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected2_static, ""); + + VECT_VAR (vector_res, float, 16, 4) + = vmulx_lane_f16 (VECT_VAR (vsrc_1, float, 16, 4), + VECT_VAR (vsrc_2, float, 16, 4), 3); + vst1_f16 (VECT_VAR (result, float, 16, 4), + VECT_VAR (vector_res, float, 16, 4)); + + CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected3_static, ""); + +#undef TEST_MSG +#define TEST_MSG "VMULXQ_LANE (FP16)" + clean_results (); + + DECL_VARIABLE(vsrc_1, float, 16, 8); + VECT_VAR_DECL (buf_src_1, float, 16, 8) [] = {A, B, C, D, I, J, K, L}; + VLOAD (vsrc_1, buf_src_1, q, float, f, 16, 8); + DECL_VARIABLE (vector_res, float, 16, 8) + = vmulxq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8), + VECT_VAR (vsrc_2, float, 16, 4), 0); + vst1q_f16 (VECT_VAR (result, float, 16, 8), + VECT_VAR 
(vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected0_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                       VECT_VAR (vsrc_2, float, 16, 4), 1);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected1_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                       VECT_VAR (vsrc_2, float, 16, 4), 2);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected2_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_lane_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                       VECT_VAR (vsrc_2, float, 16, 4), 3);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected3_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VMULX_LANEQ (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc_2, float, 16, 8);
+  VECT_VAR_DECL (buf_src_2, float, 16, 8) [] = {E, F, G, H, M, N, O, P};
+  VLOAD (vsrc_2, buf_src_2, q, float, f, 16, 8);
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                       VECT_VAR (vsrc_2, float, 16, 8), 0);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq0_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                       VECT_VAR (vsrc_2, float, 16, 8), 1);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq1_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                       VECT_VAR (vsrc_2, float, 16, 8), 2);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq2_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                       VECT_VAR (vsrc_2, float, 16, 8), 3);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq3_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                       VECT_VAR (vsrc_2, float, 16, 8), 4);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq4_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                       VECT_VAR (vsrc_2, float, 16, 8), 5);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq5_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                       VECT_VAR (vsrc_2, float, 16, 8), 6);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq6_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                       VECT_VAR (vsrc_2, float, 16, 8), 7);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_laneq7_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VMULXQ_LANEQ (FP16)"
+  clean_results ();
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                        VECT_VAR (vsrc_2, float, 16, 8), 0);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq0_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                        VECT_VAR (vsrc_2, float, 16, 8), 1);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq1_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                        VECT_VAR (vsrc_2, float, 16, 8), 2);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq2_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                        VECT_VAR (vsrc_2, float, 16, 8), 3);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq3_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                        VECT_VAR (vsrc_2, float, 16, 8), 4);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq4_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                        VECT_VAR (vsrc_2, float, 16, 8), 5);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq5_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                        VECT_VAR (vsrc_2, float, 16, 8), 6);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq6_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_laneq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                        VECT_VAR (vsrc_2, float, 16, 8), 7);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_laneq7_static, "");
+}
+
+int
+main (void)
+{
+  exec_vmulx_lane_f16 ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_n_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_n_f16_1.c
new file mode 100644
index 0000000..140647b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmulx_n_f16_1.c
@@ -0,0 +1,177 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */
+/* { dg-add-options arm_v8_2a_fp16_neon } */
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include <arm_neon.h>
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+#define FP16_C(a) ((__fp16) a)
+#define A FP16_C (13.4)
+#define B FP16_C (__builtin_inff ())
+#define C FP16_C (-34.8)
+#define D FP16_C (-__builtin_inff ())
+#define E FP16_C (-0.0)
+#define F FP16_C (19.1)
+#define G FP16_C (-4.8)
+#define H FP16_C (0.0)
+
+float16_t elemE = E;
+float16_t elemF = F;
+float16_t elemG = G;
+float16_t elemH = H;
+
+#define I FP16_C (0.7)
+#define J FP16_C (-78)
+#define K FP16_C (11.23)
+#define L FP16_C (98)
+#define M FP16_C (87.1)
+#define N FP16_C (-8)
+#define O FP16_C (-1.1)
+#define P FP16_C (-9.7)
+
+/* Expected results for vmulx_n. */
+VECT_VAR_DECL (expected0_static, hfloat, 16, 4) []
+  = { 0x8000 /* A * E. */,
+      0xC000 /* FP16_C (-2.0f). */,
+      0x0000 /* C * E. */,
+      0x4000 /* FP16_C (2.0f). */ };
+
+VECT_VAR_DECL (expected1_static, hfloat, 16, 4) []
+  = { 0x5BFF /* A * F. */,
+      0x7C00 /* B * F. */,
+      0xE131 /* C * F. */,
+      0xFC00 /* D * F. */ };
+
+VECT_VAR_DECL (expected2_static, hfloat, 16, 4) []
+  = { 0xD405 /* A * G. */,
+      0xFC00 /* B * G. */,
+      0x5939 /* C * G. */,
+      0x7C00 /* D * G. */ };
+
+VECT_VAR_DECL (expected3_static, hfloat, 16, 4) []
+  = { 0x0000 /* A * H. */,
+      0x4000 /* FP16_C (2.0f). */,
+      0x8000 /* C * H. */,
+      0xC000 /* FP16_C (-2.0f). */ };
+
+VECT_VAR_DECL (expected0_static, hfloat, 16, 8) []
+  = { 0x8000 /* A * E. */,
+      0xC000 /* FP16_C (-2.0f). */,
+      0x0000 /* C * E. */,
+      0x4000 /* FP16_C (2.0f). */,
+      0x8000 /* I * E. */,
+      0x0000 /* J * E. */,
+      0x8000 /* K * E. */,
+      0x8000 /* L * E. */ };
+
+VECT_VAR_DECL (expected1_static, hfloat, 16, 8) []
+  = { 0x5BFF /* A * F. */,
+      0x7C00 /* B * F. */,
+      0xE131 /* C * F. */,
+      0xFC00 /* D * F. */,
+      0x4AAF /* I * F. */,
+      0xE5D1 /* J * F. */,
+      0x5AB3 /* K * F. */,
+      0x674F /* L * F. */ };
+
+VECT_VAR_DECL (expected2_static, hfloat, 16, 8) []
+  = { 0xD405 /* A * G. */,
+      0xFC00 /* B * G. */,
+      0x5939 /* C * G. */,
+      0x7C00 /* D * G. */,
+      0xC2B9 /* I * G. */,
+      0x5DDA /* J * G. */,
+      0xD2BD /* K * G. */,
+      0xDF5A /* L * G. */ };
+
+VECT_VAR_DECL (expected3_static, hfloat, 16, 8) []
+  = { 0x0000 /* A * H. */,
+      0x4000 /* FP16_C (2.0f). */,
+      0x8000 /* C * H. */,
+      0xC000 /* FP16_C (-2.0f). */,
+      0x0000 /* I * H. */,
+      0x8000 /* J * H. */,
+      0x0000 /* K * H. */,
+      0x0000 /* L * H. */ };
+
+void exec_vmulx_n_f16 (void)
+{
+#undef TEST_MSG
+#define TEST_MSG "VMULX_N (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE (vsrc_1, float, 16, 4);
+  VECT_VAR_DECL (buf_src_1, float, 16, 4) [] = {A, B, C, D};
+  VLOAD (vsrc_1, buf_src_1, , float, f, 16, 4);
+  DECL_VARIABLE (vector_res, float, 16, 4)
+    = vmulx_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), elemE);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected0_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), elemF);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected1_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), elemG);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected2_static, "");
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vmulx_n_f16 (VECT_VAR (vsrc_1, float, 16, 4), elemH);
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected3_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VMULXQ_N (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE (vsrc_1, float, 16, 8);
+  VECT_VAR_DECL (buf_src_1, float, 16, 8) [] = {A, B, C, D, I, J, K, L};
+  VLOAD (vsrc_1, buf_src_1, q, float, f, 16, 8);
+  DECL_VARIABLE (vector_res, float, 16, 8)
+    = vmulxq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), elemE);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected0_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), elemF);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected1_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), elemG);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected2_static, "");
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vmulxq_n_f16 (VECT_VAR (vsrc_1, float, 16, 8), elemH);
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected3_static, "");
+}
+
+int
+main (void)
+{
+  exec_vmulx_n_f16 ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpminmaxnm_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpminmaxnm_f16_1.c
new file mode 100644
index 0000000..c8df677
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpminmaxnm_f16_1.c
@@ -0,0 +1,114 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */
+/* { dg-add-options arm_v8_2a_fp16_neon } */
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include <arm_neon.h>
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+#define FP16_C(a) ((__fp16) a)
+#define A FP16_C (123.4)
+#define B FP16_C (__builtin_nanf ("")) /* NaN */
+#define C FP16_C (-34.8)
+#define D FP16_C (1024)
+#define E FP16_C (663.1)
+#define F FP16_C (169.1)
+#define G FP16_C (-4.8)
+#define H FP16_C (-__builtin_nanf ("")) /* NaN */
+
+#define I FP16_C (0.7)
+#define J FP16_C (-78)
+#define K FP16_C (101.23)
+#define L FP16_C (-1098)
+#define M FP16_C (870.1)
+#define N FP16_C (-8781)
+#define O FP16_C (__builtin_inff ()) /* +Inf */
+#define P FP16_C (-__builtin_inff ()) /* -Inf */
+
+
+/* Expected results for vpminnm. */
+VECT_VAR_DECL (expected_min_static, hfloat, 16, 4) []
+  = { 0x57B6 /* A. */, 0xD05A /* C. */, 0x5949 /* F. */, 0xC4CD /* G. */ };
+
+VECT_VAR_DECL (expected_min_static, hfloat, 16, 8) []
+  = { 0x57B6 /* A. */, 0xD05A /* C. */, 0xD4E0 /* J. */, 0xE44A /* L. */,
+      0x5949 /* F. */, 0xC4CD /* G. */, 0xF04A /* N. */, 0xFC00 /* P. */ };
+
+/* Expected results for vpmaxnm. */
+VECT_VAR_DECL (expected_max_static, hfloat, 16, 4) []
+  = { 0x57B6 /* A. */, 0x6400 /* D. */, 0x612E /* E. */, 0xC4CD /* G. */ };
+
+VECT_VAR_DECL (expected_max_static, hfloat, 16, 8) []
+  = { 0x57B6 /* A. */, 0x6400 /* D. */, 0x399A /* I. */, 0x5654 /* K. */,
+      0x612E /* E. */, 0xC4CD /* G. */, 0x62CC /* M. */, 0x7C00 /* O. */ };
+
+void exec_vpminmaxnm_f16 (void)
+{
+#undef TEST_MSG
+#define TEST_MSG "VPMINNM (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc_1, float, 16, 4);
+  DECL_VARIABLE(vsrc_2, float, 16, 4);
+  VECT_VAR_DECL (buf_src_1, float, 16, 4) [] = {A, B, C, D};
+  VECT_VAR_DECL (buf_src_2, float, 16, 4) [] = {E, F, G, H};
+  VLOAD (vsrc_1, buf_src_1, , float, f, 16, 4);
+  VLOAD (vsrc_2, buf_src_2, , float, f, 16, 4);
+  DECL_VARIABLE (vector_res, float, 16, 4)
+    = vpminnm_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                   VECT_VAR (vsrc_2, float, 16, 4));
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_min_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VPMINNMQ (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc_1, float, 16, 8);
+  DECL_VARIABLE(vsrc_2, float, 16, 8);
+  VECT_VAR_DECL (buf_src_1, float, 16, 8) [] = {A, B, C, D, I, J, K, L};
+  VECT_VAR_DECL (buf_src_2, float, 16, 8) [] = {E, F, G, H, M, N, O, P};
+  VLOAD (vsrc_1, buf_src_1, q, float, f, 16, 8);
+  VLOAD (vsrc_2, buf_src_2, q, float, f, 16, 8);
+  DECL_VARIABLE (vector_res, float, 16, 8)
+    = vpminnmq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                    VECT_VAR (vsrc_2, float, 16, 8));
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_min_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VPMAXNM (FP16)"
+  clean_results ();
+
+  VECT_VAR (vector_res, float, 16, 4)
+    = vpmaxnm_f16 (VECT_VAR (vsrc_1, float, 16, 4),
+                   VECT_VAR (vsrc_2, float, 16, 4));
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_max_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VPMAXNMQ (FP16)"
+  clean_results ();
+
+  VECT_VAR (vector_res, float, 16, 8)
+    = vpmaxnmq_f16 (VECT_VAR (vsrc_1, float, 16, 8),
+                    VECT_VAR (vsrc_2, float, 16, 8));
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_max_static, "");
+}
+
+int
+main (void)
+{
+  exec_vpminmaxnm_f16 ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrndi_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrndi_f16_1.c
new file mode 100644
index 0000000..7a4620b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrndi_f16_1.c
@@ -0,0 +1,71 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */
+/* { dg-add-options arm_v8_2a_fp16_neon } */
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include <arm_neon.h>
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+#define FP16_C(a) ((__fp16) a)
+#define A FP16_C (123.4)
+#define RNDI_A 0x57B0 /* FP16_C (123). */
+#define B FP16_C (-567.5)
+#define RNDI_B 0xE070 /* FP16_C (-568). */
+#define C FP16_C (-34.8)
+#define RNDI_C 0xD060 /* FP16_C (-35). */
+#define D FP16_C (1024)
+#define RNDI_D 0x6400 /* FP16_C (1024). */
+#define E FP16_C (663.1)
+#define RNDI_E 0x612E /* FP16_C (663). */
+#define F FP16_C (169.1)
+#define RNDI_F 0x5948 /* FP16_C (169). */
+#define G FP16_C (-4.8)
+#define RNDI_G 0xC500 /* FP16_C (-5). */
+#define H FP16_C (77.5)
+#define RNDI_H 0x54E0 /* FP16_C (78). */
+
+/* Expected results for vrndi. */
+VECT_VAR_DECL (expected_static, hfloat, 16, 4) []
+  = { RNDI_A, RNDI_B, RNDI_C, RNDI_D };
+
+VECT_VAR_DECL (expected_static, hfloat, 16, 8) []
+  = { RNDI_A, RNDI_B, RNDI_C, RNDI_D, RNDI_E, RNDI_F, RNDI_G, RNDI_H };
+
+void exec_vrndi_f16 (void)
+{
+#undef TEST_MSG
+#define TEST_MSG "VRNDI (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc, float, 16, 4);
+  VECT_VAR_DECL (buf_src, float, 16, 4) [] = {A, B, C, D};
+  VLOAD (vsrc, buf_src, , float, f, 16, 4);
+  DECL_VARIABLE (vector_res, float, 16, 4)
+    = vrndi_f16 (VECT_VAR (vsrc, float, 16, 4));
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VRNDIQ (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc, float, 16, 8);
+  VECT_VAR_DECL (buf_src, float, 16, 8) [] = {A, B, C, D, E, F, G, H};
+  VLOAD (vsrc, buf_src, q, float, f, 16, 8);
+  DECL_VARIABLE (vector_res, float, 16, 8)
+    = vrndiq_f16 (VECT_VAR (vsrc, float, 16, 8));
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_static, "");
+}
+
+int
+main (void)
+{
+  exec_vrndi_f16 ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vsqrt_f16_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vsqrt_f16_1.c
new file mode 100644
index 0000000..82249a7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vsqrt_f16_1.c
@@ -0,0 +1,72 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_2a_fp16_neon_hw } */
+/* { dg-add-options arm_v8_2a_fp16_neon } */
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include <arm_neon.h>
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+#define FP16_C(a) ((__fp16) a)
+#define A FP16_C (123.4)
+#define B FP16_C (567.8)
+#define C FP16_C (34.8)
+#define D FP16_C (1024)
+#define E FP16_C (663.1)
+#define F FP16_C (144.0)
+#define G FP16_C (4.8)
+#define H FP16_C (77)
+
+#define SQRT_A 0x498E /* FP16_C (__builtin_sqrtf (123.4)). */
+#define SQRT_B 0x4DF5 /* FP16_C (__builtin_sqrtf (567.8)). */
+#define SQRT_C 0x45E6 /* FP16_C (__builtin_sqrtf (34.8)). */
+#define SQRT_D 0x5000 /* FP16_C (__builtin_sqrtf (1024)). */
+#define SQRT_E 0x4E70 /* FP16_C (__builtin_sqrtf (663.1)). */
+#define SQRT_F 0x4A00 /* FP16_C (__builtin_sqrtf (144.0)). */
+#define SQRT_G 0x4062 /* FP16_C (__builtin_sqrtf (4.8)). */
+#define SQRT_H 0x4863 /* FP16_C (__builtin_sqrtf (77)). */
+
+/* Expected results for vsqrt. */
+VECT_VAR_DECL (expected_static, hfloat, 16, 4) []
+  = { SQRT_A, SQRT_B, SQRT_C, SQRT_D };
+
+VECT_VAR_DECL (expected_static, hfloat, 16, 8) []
+  = { SQRT_A, SQRT_B, SQRT_C, SQRT_D, SQRT_E, SQRT_F, SQRT_G, SQRT_H };
+
+void exec_vsqrt_f16 (void)
+{
+#undef TEST_MSG
+#define TEST_MSG "VSQRT (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc, float, 16, 4);
+  VECT_VAR_DECL (buf_src, float, 16, 4) [] = {A, B, C, D};
+  VLOAD (vsrc, buf_src, , float, f, 16, 4);
+  DECL_VARIABLE (vector_res, float, 16, 4)
+    = vsqrt_f16 (VECT_VAR (vsrc, float, 16, 4));
+  vst1_f16 (VECT_VAR (result, float, 16, 4),
+            VECT_VAR (vector_res, float, 16, 4));
+
+  CHECK_FP (TEST_MSG, float, 16, 4, PRIx16, expected_static, "");
+
+#undef TEST_MSG
+#define TEST_MSG "VSQRTQ (FP16)"
+  clean_results ();
+
+  DECL_VARIABLE(vsrc, float, 16, 8);
+  VECT_VAR_DECL (buf_src, float, 16, 8) [] = {A, B, C, D, E, F, G, H};
+  VLOAD (vsrc, buf_src, q, float, f, 16, 8);
+  DECL_VARIABLE (vector_res, float, 16, 8)
+    = vsqrtq_f16 (VECT_VAR (vsrc, float, 16, 8));
+  vst1q_f16 (VECT_VAR (result, float, 16, 8),
+             VECT_VAR (vector_res, float, 16, 8));
+
+  CHECK_FP (TEST_MSG, float, 16, 8, PRIx16, expected_static, "");
+}
+
+int
+main (void)
+{
+  exec_vsqrt_f16 ();
+  return 0;
+}
-- 
2.5.0