From patchwork Tue Nov 29 09:50:42 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 700371 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3tSf2G2q2vz9vDp for ; Tue, 29 Nov 2016 20:51:16 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="JP4aIamz"; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:references:in-reply-to :content-type:mime-version; q=dns; s=default; b=kZSfjSVmUFMbo8BI Dp71ox26Ch4dg69t6oSq9izB3GYemrDUWgVIuMzFl1hfkF3dUh4wIBZWVV2H59tA oQM9OVBCIL+iL6+1B/I/th9GrJ/U/B1shoTghIQf9Xk0dLfESocI11eapLVqvxwx 8f/+XVFHWbJkdVA8edzszwvEci8= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:references:in-reply-to :content-type:mime-version; s=default; bh=tUw/Q+rw8+0i3aGrvVNq1C s7d+w=; b=JP4aIamzu1bFf4nsIq8gGXSBWoi+TZKnqGd2rRRnOUrVZ15gA1on4o 4J7P54xKdnL9HEcb/9WiE996hvvsSITIiWCiOPRsJuKIYboAMgnuO3NmOOr3+5FP 7tS6XBXej1PlH4SmGyjLbCKuLEhX8xG+s//p6n6laR25aq4UoDaCg= Received: (qmail 40416 invoked by alias); 29 Nov 2016 09:50:58 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 40329 invoked by uid 89); 29 Nov 2016 09:50:57 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=AWL, BAYES_00, KAM_ASCII_DIVIDERS, KAM_LOTSOFHASH, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS autolearn=no version=3.3.2 spammy=prix, 1989, (unknown), 3917 X-HELO: EUR01-VE1-obe.outbound.protection.outlook.com Received: from mail-ve1eur01on0047.outbound.protection.outlook.com (HELO EUR01-VE1-obe.outbound.protection.outlook.com) (104.47.1.47) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 29 Nov 2016 09:50:47 +0000 Received: from VI1PR0801MB2031.eurprd08.prod.outlook.com (10.173.74.140) by VI1PR0801MB2093.eurprd08.prod.outlook.com (10.173.75.9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.747.13; Tue, 29 Nov 2016 09:50:42 +0000 Received: from VI1PR0801MB2031.eurprd08.prod.outlook.com ([10.173.74.140]) by VI1PR0801MB2031.eurprd08.prod.outlook.com ([10.173.74.140]) with mapi id 15.01.0747.015; Tue, 29 Nov 2016 09:50:42 +0000 From: Tamar Christina To: Christophe Lyon CC: GCC Patches , "christophe.lyon@st.com" , Marcus Shawcroft , Richard Earnshaw , James Greenhalgh , Kyrylo Tkachov , nd Subject: Re: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC Date: Tue, 29 Nov 2016 09:50:42 +0000 Message-ID: References: , In-Reply-To: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Tamar.Christina@arm.com; x-ms-exchange-messagesentrepresentingtype: 1 x-ms-office365-filtering-correlation-id: b4aec05e-728c-441e-8c07-08d4183d2f11 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(22001); SRVR:VI1PR0801MB2093; x-microsoft-exchange-diagnostics: 1; VI1PR0801MB2093; 7:sGzBnOdj+iEm8aYCnw6TiJiWRv4IyQe8va4V584xbtdj5443kYdmU7wXeZ3P95DcCk8Cu6WF1VdMilEANiq/5WYO332SoCogUcnsQrCV3y4qvcrGb8Uf9FK88uta+UomyEYvMphlRxqEGd+ZGjVepZHj1OevXeuDRIGSDvqSZl0dROJ5bqwyf5ch15s9iaxM2MErq/wIkCOBcFinCQtXsrIACQO7iTdXBGQrAiKepkDiwKVUrCea/Djt7makDvDGPd545aY3vClJ14scaPN5OakOHAK6jYz4H6Uw2G1Fm1fkjAeOgED2aVsdH3FnYgWrWckvcxrTqrvB7k/UgzGIG9Ydp8CjJ5P1LRo6SGXZw2eSSOODFHrptDykv/U/nCmeU+wQvK3wTEy0yEUJhjw+MC1qBovcOIyDQklvw3yXGfYdYxmrJOu7mTabtFphcSilayHVF6Q5238YVPp7Z5TluA== nodisclaimer: True x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:; x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(102415395)(6060326)(6040361)(6045199)(601004)(2401047)(5005006)(8121501046)(10201501046)(3002001)(6055026)(6041248)(6061324)(20161123564025)(20161123562025)(20161123560025)(20161123555025)(6072148); SRVR:VI1PR0801MB2093; BCL:0; PCL:0; RULEID:; SRVR:VI1PR0801MB2093; x-forefront-prvs: 01415BB535 x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(6009001)(7916002)(377454003)(199003)(53754006)(51914003)(189002)(77096006)(76576001)(39450400002)(102836003)(33656002)(50986999)(92566002)(39410400001)(189998001)(6116002)(39400400001)(97736004)(99936001)(81166006)(66066001)(3846002)(81156014)(8936002)(101416001)(93886004)(39380400001)(6916009)(2906002)(2950100002)(110136003)(68736007)(7696004)(76176999)(54356999)(106356001)(74316002)(7736002)(9686002)(6506003)(106116001)(305945005)(105586002)(7846002)(229853002)(5660300001)(38730400001)(4326007)(3280700002)(2900100001)(122556002)(3660700001)(86362001)(8676002); DIR:OUT; SFP:1101; SCL:1; SRVR:VI1PR0801MB2093; H:VI1PR0801MB2031.eurprd08.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords; A:1; MX:1; LANG:en; received-spf: None (protection.outlook.com: arm.com does not designate permitted sender hosts) spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM MIME-Version: 1.0 X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-originalarrivaltime: 29 Nov 2016 09:50:42.4620 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR0801MB2093 X-IsSubscribed: yes Hi All, The new patch contains the proper types for the intrinsics that should be returning uint64x1 and has the rest of the comments by Christophe in them. Kind Regards, Tamar diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h index 462141586b3db7c5256c74b08fa0449210634226..beaf6ac31d5c5affe3702a505ad0df8679229e32 100644 --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h @@ -32,6 +32,13 @@ extern size_t strlen(const char *); VECT_VAR(expected, int, 16, 4) -> expected_int16x4 VECT_VAR_DECL(expected, int, 16, 4) -> int16x4_t expected_int16x4 */ +/* Some instructions don't exist on ARM. + Use this macro to guard against them. */ +#ifdef __aarch64__ +#define AARCH64_ONLY(X) X +#else +#define AARCH64_ONLY(X) +#endif #define xSTR(X) #X #define STR(X) xSTR(X) @@ -92,6 +99,13 @@ extern size_t strlen(const char *); fprintf(stderr, "CHECKED %s %s\n", STR(VECT_TYPE(T, W, N)), MSG); \ } +#if defined (__ARM_FEATURE_CRYPTO) +#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT) \ + CHECK(MSG,T,W,N,FMT,EXPECTED,COMMENT) +#else +#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT) +#endif + /* Floating-point variant. */ #define CHECK_FP(MSG,T,W,N,FMT,EXPECTED,COMMENT) \ { \ @@ -184,6 +198,9 @@ extern ARRAY(expected, uint, 32, 2); extern ARRAY(expected, uint, 64, 1); extern ARRAY(expected, poly, 8, 8); extern ARRAY(expected, poly, 16, 4); +#if defined (__ARM_FEATURE_CRYPTO) +extern ARRAY(expected, poly, 64, 1); +#endif extern ARRAY(expected, hfloat, 16, 4); extern ARRAY(expected, hfloat, 32, 2); extern ARRAY(expected, hfloat, 64, 1); @@ -197,6 +214,9 @@ extern ARRAY(expected, uint, 32, 4); extern ARRAY(expected, uint, 64, 2); extern ARRAY(expected, poly, 8, 16); extern ARRAY(expected, poly, 16, 8); +#if defined (__ARM_FEATURE_CRYPTO) +extern ARRAY(expected, poly, 64, 2); +#endif extern ARRAY(expected, hfloat, 16, 8); extern ARRAY(expected, hfloat, 32, 4); extern ARRAY(expected, hfloat, 64, 2); @@ -213,6 +233,7 @@ extern ARRAY(expected, hfloat, 64, 2); CHECK(test_name, uint, 64, 1, PRIx64, EXPECTED, comment); \ CHECK(test_name, poly, 8, 8, PRIx8, EXPECTED, comment); \ CHECK(test_name, poly, 16, 4, PRIx16, EXPECTED, comment); \ + CHECK_CRYPTO(test_name, poly, 64, 1, PRIx64, EXPECTED, comment); \ CHECK_FP(test_name, float, 32, 2, PRIx32, EXPECTED, comment); \ \ CHECK(test_name, int, 8, 16, PRIx8, EXPECTED, comment); \ @@ -225,6 +246,7 @@ extern ARRAY(expected, hfloat, 64, 2); CHECK(test_name, uint, 64, 2, PRIx64, EXPECTED, comment); \ CHECK(test_name, poly, 8, 16, PRIx8, EXPECTED, comment); \ CHECK(test_name, poly, 16, 8, PRIx16, EXPECTED, comment); \ + CHECK_CRYPTO(test_name, poly, 64, 2, PRIx64, EXPECTED, comment); \ CHECK_FP(test_name, float, 32, 4, PRIx32, EXPECTED, comment); \ } \ @@ -398,6 +420,9 @@ static void clean_results (void) CLEAN(result, uint, 64, 1); CLEAN(result, poly, 8, 8); CLEAN(result, poly, 16, 4); +#if defined (__ARM_FEATURE_CRYPTO) + CLEAN(result, poly, 64, 1); +#endif #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE) CLEAN(result, float, 16, 4); #endif @@ -413,6 +438,9 @@ static void clean_results (void) CLEAN(result, uint, 64, 2); CLEAN(result, poly, 8, 16); CLEAN(result, poly, 16, 8); +#if defined (__ARM_FEATURE_CRYPTO) + CLEAN(result, poly, 64, 2); +#endif #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE) CLEAN(result, float, 16, 8); #endif @@ -438,6 +466,13 @@ static void clean_results (void) #define DECL_VARIABLE(VAR, T1, W, N) \ VECT_TYPE(T1, W, N) VECT_VAR(VAR, T1, W, N) +#if defined (__ARM_FEATURE_CRYPTO) +#define DECL_VARIABLE_CRYPTO(VAR, T1, W, N) \ + DECL_VARIABLE(VAR, T1, W, N) +#else +#define DECL_VARIABLE_CRYPTO(VAR, T1, W, N) +#endif + /* Declare only 64 bits signed variants. */ #define DECL_VARIABLE_64BITS_SIGNED_VARIANTS(VAR) \ DECL_VARIABLE(VAR, int, 8, 8); \ @@ -473,6 +508,7 @@ static void clean_results (void) DECL_VARIABLE_64BITS_UNSIGNED_VARIANTS(VAR); \ DECL_VARIABLE(VAR, poly, 8, 8); \ DECL_VARIABLE(VAR, poly, 16, 4); \ + DECL_VARIABLE_CRYPTO(VAR, poly, 64, 1); \ DECL_VARIABLE(VAR, float, 16, 4); \ DECL_VARIABLE(VAR, float, 32, 2) #else @@ -481,6 +517,7 @@ static void clean_results (void) DECL_VARIABLE_64BITS_UNSIGNED_VARIANTS(VAR); \ DECL_VARIABLE(VAR, poly, 8, 8); \ DECL_VARIABLE(VAR, poly, 16, 4); \ + DECL_VARIABLE_CRYPTO(VAR, poly, 64, 1); \ DECL_VARIABLE(VAR, float, 32, 2) #endif @@ -491,6 +528,7 @@ static void clean_results (void) DECL_VARIABLE_128BITS_UNSIGNED_VARIANTS(VAR); \ DECL_VARIABLE(VAR, poly, 8, 16); \ DECL_VARIABLE(VAR, poly, 16, 8); \ + DECL_VARIABLE_CRYPTO(VAR, poly, 64, 2); \ DECL_VARIABLE(VAR, float, 16, 8); \ DECL_VARIABLE(VAR, float, 32, 4) #else @@ -499,6 +537,7 @@ static void clean_results (void) DECL_VARIABLE_128BITS_UNSIGNED_VARIANTS(VAR); \ DECL_VARIABLE(VAR, poly, 8, 16); \ DECL_VARIABLE(VAR, poly, 16, 8); \ + DECL_VARIABLE_CRYPTO(VAR, poly, 64, 2); \ DECL_VARIABLE(VAR, float, 32, 4) #endif /* Declare all variants. */ @@ -531,6 +570,13 @@ static void clean_results (void) /* Helpers to call macros with 1 constant and 5 variable arguments. */ +#if defined (__ARM_FEATURE_CRYPTO) +#define MACRO_CRYPTO(MACRO, VAR1, VAR2, T1, T2, T3, W, N) \ + MACRO(VAR1, VAR2, T1, T2, T3, W, N) +#else +#define MACRO_CRYPTO(MACRO, VAR1, VAR2, T1, T2, T3, W, N) +#endif + #define TEST_MACRO_64BITS_SIGNED_VARIANTS_1_5(MACRO, VAR) \ MACRO(VAR, , int, s, 8, 8); \ MACRO(VAR, , int, s, 16, 4); \ @@ -601,13 +647,15 @@ static void clean_results (void) TEST_MACRO_64BITS_SIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2); \ TEST_MACRO_64BITS_UNSIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2); \ MACRO(VAR1, VAR2, , poly, p, 8, 8); \ - MACRO(VAR1, VAR2, , poly, p, 16, 4) + MACRO(VAR1, VAR2, , poly, p, 16, 4); \ + MACRO_CRYPTO(MACRO, VAR1, VAR2, , poly, p, 64, 1) #define TEST_MACRO_128BITS_VARIANTS_2_5(MACRO, VAR1, VAR2) \ TEST_MACRO_128BITS_SIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2); \ TEST_MACRO_128BITS_UNSIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2); \ MACRO(VAR1, VAR2, q, poly, p, 8, 16); \ - MACRO(VAR1, VAR2, q, poly, p, 16, 8) + MACRO(VAR1, VAR2, q, poly, p, 16, 8); \ + MACRO_CRYPTO(MACRO, VAR1, VAR2, q, poly, p, 64, 2) #define TEST_MACRO_ALL_VARIANTS_2_5(MACRO, VAR1, VAR2) \ TEST_MACRO_64BITS_VARIANTS_2_5(MACRO, VAR1, VAR2); \ diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c index 519cffb0125079022e7ba876c1ca657d9e37cac2..8907b38cde90b44a8f1501f72b2c4e812cba5707 100644 --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c @@ -1,8 +1,9 @@ /* This file contains tests for all the *p64 intrinsics, except for vreinterpret which have their own testcase. */ -/* { dg-require-effective-target arm_crypto_ok } */ +/* { dg-require-effective-target arm_crypto_ok { target { arm*-*-* } } } */ /* { dg-add-options arm_crypto } */ +/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } }*/ #include #include "arm-neon-ref.h" @@ -38,6 +39,17 @@ VECT_VAR_DECL(vdup_n_expected2,poly,64,1) [] = { 0xfffffffffffffff2 }; VECT_VAR_DECL(vdup_n_expected2,poly,64,2) [] = { 0xfffffffffffffff2, 0xfffffffffffffff2 }; +/* Expected results: vmov_n. */ +VECT_VAR_DECL(vmov_n_expected0,poly,64,1) [] = { 0xfffffffffffffff0 }; +VECT_VAR_DECL(vmov_n_expected0,poly,64,2) [] = { 0xfffffffffffffff0, + 0xfffffffffffffff0 }; +VECT_VAR_DECL(vmov_n_expected1,poly,64,1) [] = { 0xfffffffffffffff1 }; +VECT_VAR_DECL(vmov_n_expected1,poly,64,2) [] = { 0xfffffffffffffff1, + 0xfffffffffffffff1 }; +VECT_VAR_DECL(vmov_n_expected2,poly,64,1) [] = { 0xfffffffffffffff2 }; +VECT_VAR_DECL(vmov_n_expected2,poly,64,2) [] = { 0xfffffffffffffff2, + 0xfffffffffffffff2 }; + /* Expected results: vext. */ VECT_VAR_DECL(vext_expected,poly,64,1) [] = { 0xfffffffffffffff0 }; VECT_VAR_DECL(vext_expected,poly,64,2) [] = { 0xfffffffffffffff1, 0x88 }; @@ -45,6 +57,9 @@ VECT_VAR_DECL(vext_expected,poly,64,2) [] = { 0xfffffffffffffff1, 0x88 }; /* Expected results: vget_low. */ VECT_VAR_DECL(vget_low_expected,poly,64,1) [] = { 0xfffffffffffffff0 }; +/* Expected results: vget_high. */ +VECT_VAR_DECL(vget_high_expected,poly,64,1) [] = { 0xfffffffffffffff1 }; + /* Expected results: vld1. */ VECT_VAR_DECL(vld1_expected,poly,64,1) [] = { 0xfffffffffffffff0 }; VECT_VAR_DECL(vld1_expected,poly,64,2) [] = { 0xfffffffffffffff0, @@ -109,6 +124,39 @@ VECT_VAR_DECL(vst1_lane_expected,poly,64,1) [] = { 0xfffffffffffffff0 }; VECT_VAR_DECL(vst1_lane_expected,poly,64,2) [] = { 0xfffffffffffffff0, 0x3333333333333333 }; +/* Expected results: vldX_lane. */ +VECT_VAR_DECL(expected_vld_st2_0,poly,64,1) [] = { 0xfffffffffffffff0 }; +VECT_VAR_DECL(expected_vld_st2_0,poly,64,2) [] = { 0xfffffffffffffff0, + 0xfffffffffffffff1 }; +VECT_VAR_DECL(expected_vld_st2_1,poly,64,1) [] = { 0xfffffffffffffff1 }; +VECT_VAR_DECL(expected_vld_st2_1,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa, + 0xaaaaaaaaaaaaaaaa }; +VECT_VAR_DECL(expected_vld_st3_0,poly,64,1) [] = { 0xfffffffffffffff0 }; +VECT_VAR_DECL(expected_vld_st3_0,poly,64,2) [] = { 0xfffffffffffffff0, + 0xfffffffffffffff1 }; +VECT_VAR_DECL(expected_vld_st3_1,poly,64,1) [] = { 0xfffffffffffffff1 }; +VECT_VAR_DECL(expected_vld_st3_1,poly,64,2) [] = { 0xfffffffffffffff2, + 0xaaaaaaaaaaaaaaaa }; +VECT_VAR_DECL(expected_vld_st3_2,poly,64,1) [] = { 0xfffffffffffffff2 }; +VECT_VAR_DECL(expected_vld_st3_2,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa, + 0xaaaaaaaaaaaaaaaa }; +VECT_VAR_DECL(expected_vld_st4_0,poly,64,1) [] = { 0xfffffffffffffff0 }; +VECT_VAR_DECL(expected_vld_st4_0,poly,64,2) [] = { 0xfffffffffffffff0, + 0xfffffffffffffff1 }; +VECT_VAR_DECL(expected_vld_st4_1,poly,64,1) [] = { 0xfffffffffffffff1 }; +VECT_VAR_DECL(expected_vld_st4_1,poly,64,2) [] = { 0xfffffffffffffff2, + 0xfffffffffffffff3 }; +VECT_VAR_DECL(expected_vld_st4_2,poly,64,1) [] = { 0xfffffffffffffff2 }; +VECT_VAR_DECL(expected_vld_st4_2,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa, + 0xaaaaaaaaaaaaaaaa }; +VECT_VAR_DECL(expected_vld_st4_3,poly,64,1) [] = { 0xfffffffffffffff3 }; +VECT_VAR_DECL(expected_vld_st4_3,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa, + 0xaaaaaaaaaaaaaaaa }; + +/* Expected results: vget_lane. */ +VECT_VAR_DECL(vget_lane_expected,poly,64,1) = 0xfffffffffffffff0; +VECT_VAR_DECL(vget_lane_expected,poly,64,2) = 0xfffffffffffffff0; + int main (void) { int i; @@ -341,6 +389,26 @@ int main (void) CHECK(TEST_MSG, poly, 64, 1, PRIx64, vget_low_expected, ""); + /* vget_high_p64 tests. */ +#undef TEST_MSG +#define TEST_MSG "VGET_HIGH" + +#define TEST_VGET_HIGH(T1, T2, W, N, N2) \ + VECT_VAR(vget_high_vector64, T1, W, N) = \ + vget_high_##T2##W(VECT_VAR(vget_high_vector128, T1, W, N2)); \ + vst1_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vget_high_vector64, T1, W, N)) + + DECL_VARIABLE(vget_high_vector64, poly, 64, 1); + DECL_VARIABLE(vget_high_vector128, poly, 64, 2); + + CLEAN(result, poly, 64, 1); + + VLOAD(vget_high_vector128, buffer, q, poly, p, 64, 2); + + TEST_VGET_HIGH(poly, p, 64, 1, 2); + + CHECK(TEST_MSG, poly, 64, 1, PRIx64, vget_high_expected, ""); + /* vld1_p64 tests. */ #undef TEST_MSG #define TEST_MSG "VLD1/VLD1Q" @@ -645,7 +713,7 @@ int main (void) VECT_VAR(vst1_lane_vector, T1, W, N) = \ vld1##Q##_##T2##W(VECT_VAR(buffer, T1, W, N)); \ vst1##Q##_lane_##T2##W(VECT_VAR(result, T1, W, N), \ - VECT_VAR(vst1_lane_vector, T1, W, N), L) + VECT_VAR(vst1_lane_vector, T1, W, N), L); DECL_VARIABLE(vst1_lane_vector, poly, 64, 1); DECL_VARIABLE(vst1_lane_vector, poly, 64, 2); @@ -659,5 +727,298 @@ int main (void) CHECK(TEST_MSG, poly, 64, 1, PRIx64, vst1_lane_expected, ""); CHECK(TEST_MSG, poly, 64, 2, PRIx64, vst1_lane_expected, ""); +#ifdef __aarch64__ + + /* vmov_n_p64 tests. */ +#undef TEST_MSG +#define TEST_MSG "VMOV/VMOVQ" + +#define TEST_VMOV(Q, T1, T2, W, N) \ + VECT_VAR(vmov_n_vector, T1, W, N) = \ + vmov##Q##_n_##T2##W(VECT_VAR(buffer_dup, T1, W, N)[i]); \ + vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vmov_n_vector, T1, W, N)) + + DECL_VARIABLE(vmov_n_vector, poly, 64, 1); + DECL_VARIABLE(vmov_n_vector, poly, 64, 2); + + /* Try to read different places from the input buffer. */ + for (i=0; i< 3; i++) { + CLEAN(result, poly, 64, 1); + CLEAN(result, poly, 64, 2); + + TEST_VMOV(, poly, p, 64, 1); + TEST_VMOV(q, poly, p, 64, 2); + + switch (i) { + case 0: + CHECK(TEST_MSG, poly, 64, 1, PRIx64, vmov_n_expected0, ""); + CHECK(TEST_MSG, poly, 64, 2, PRIx64, vmov_n_expected0, ""); + break; + case 1: + CHECK(TEST_MSG, poly, 64, 1, PRIx64, vmov_n_expected1, ""); + CHECK(TEST_MSG, poly, 64, 2, PRIx64, vmov_n_expected1, ""); + break; + case 2: + CHECK(TEST_MSG, poly, 64, 1, PRIx64, vmov_n_expected2, ""); + CHECK(TEST_MSG, poly, 64, 2, PRIx64, vmov_n_expected2, ""); + break; + default: + abort(); + } + } + + /* vget_lane_p64 tests. */ +#undef TEST_MSG +#define TEST_MSG "VGET_LANE/VGETQ_LANE" + +#define TEST_VGET_LANE(Q, T1, T2, W, N, L) \ + VECT_VAR(vget_lane_vector, T1, W, N) = vget##Q##_lane_##T2##W(VECT_VAR(vector, T1, W, N), L); \ + if (VECT_VAR(vget_lane_vector, T1, W, N) != VECT_VAR(vget_lane_expected, T1, W, N)) { \ + fprintf(stderr, \ + "ERROR in %s (%s line %d in result '%s') at type %s " \ + "got 0x%" PRIx##W " != 0x%" PRIx##W "\n", \ + TEST_MSG, __FILE__, __LINE__, \ + STR(VECT_VAR(vget_lane_expected, T1, W, N)), \ + STR(VECT_NAME(T1, W, N)), \ + VECT_VAR(vget_lane_vector, T1, W, N), \ + VECT_VAR(vget_lane_expected, T1, W, N)); \ + abort (); \ + } + + /* Initialize input values. */ + DECL_VARIABLE(vector, poly, 64, 1); + DECL_VARIABLE(vector, poly, 64, 2); + + VLOAD(vector, buffer, , poly, p, 64, 1); + VLOAD(vector, buffer, q, poly, p, 64, 2); + + VECT_VAR_DECL(vget_lane_vector, poly, 64, 1); + VECT_VAR_DECL(vget_lane_vector, poly, 64, 2); + + TEST_VGET_LANE( , poly, p, 64, 1, 0); + TEST_VGET_LANE(q, poly, p, 64, 2, 0); + + /* vldx_lane_p64 tests. */ +#undef TEST_MSG +#define TEST_MSG "VLDX_LANE/VLDXQ_LANE" + +VECT_VAR_DECL_INIT(buffer_vld2_lane, poly, 64, 2); +VECT_VAR_DECL_INIT(buffer_vld3_lane, poly, 64, 3); +VECT_VAR_DECL_INIT(buffer_vld4_lane, poly, 64, 4); + + /* In this case, input variables are arrays of vectors. */ +#define DECL_VLD_STX_LANE(T1, W, N, X) \ + VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector, T1, W, N, X); \ + VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector_src, T1, W, N, X); \ + VECT_VAR_DECL(result_bis_##X, T1, W, N)[X * N] + + /* We need to use a temporary result buffer (result_bis), because + the one used for other tests is not large enough. A subset of the + result data is moved from result_bis to result, and it is this + subset which is used to check the actual behavior. The next + macro enables to move another chunk of data from result_bis to + result. */ + /* We also use another extra input buffer (buffer_src), which we + fill with 0xAA, and which it used to load a vector from which we + read a given lane. */ + +#define TEST_VLDX_LANE(Q, T1, T2, W, N, X, L) \ + memset (VECT_VAR(buffer_src, T1, W, N), 0xAA, \ + sizeof(VECT_VAR(buffer_src, T1, W, N))); \ + \ + VECT_ARRAY_VAR(vector_src, T1, W, N, X) = \ + vld##X##Q##_##T2##W(VECT_VAR(buffer_src, T1, W, N)); \ + \ + VECT_ARRAY_VAR(vector, T1, W, N, X) = \ + /* Use dedicated init buffer, of size. X */ \ + vld##X##Q##_lane_##T2##W(VECT_VAR(buffer_vld##X##_lane, T1, W, X), \ + VECT_ARRAY_VAR(vector_src, T1, W, N, X), \ + L); \ + vst##X##Q##_##T2##W(VECT_VAR(result_bis_##X, T1, W, N), \ + VECT_ARRAY_VAR(vector, T1, W, N, X)); \ + memcpy(VECT_VAR(result, T1, W, N), VECT_VAR(result_bis_##X, T1, W, N), \ + sizeof(VECT_VAR(result, T1, W, N))) + + /* Overwrite "result" with the contents of "result_bis"[Y]. */ +#undef TEST_EXTRA_CHUNK +#define TEST_EXTRA_CHUNK(T1, W, N, X, Y) \ + memcpy(VECT_VAR(result, T1, W, N), \ + &(VECT_VAR(result_bis_##X, T1, W, N)[Y*N]), \ + sizeof(VECT_VAR(result, T1, W, N))); + + /* Add some padding to try to catch out of bound accesses. */ +#define ARRAY1(V, T, W, N) VECT_VAR_DECL(V,T,W,N)[1]={42} +#define DUMMY_ARRAY(V, T, W, N, L) \ + VECT_VAR_DECL(V,T,W,N)[N*L]={0}; \ + ARRAY1(V##_pad,T,W,N) + +#define DECL_ALL_VLD_STX_LANE(X) \ + DECL_VLD_STX_LANE(poly, 64, 1, X); \ + DECL_VLD_STX_LANE(poly, 64, 2, X); + +#define TEST_ALL_VLDX_LANE(X) \ + TEST_VLDX_LANE(, poly, p, 64, 1, X, 0); \ + TEST_VLDX_LANE(q, poly, p, 64, 2, X, 0); + +#define TEST_ALL_EXTRA_CHUNKS(X,Y) \ + TEST_EXTRA_CHUNK(poly, 64, 1, X, Y) \ + TEST_EXTRA_CHUNK(poly, 64, 2, X, Y) + +#define CHECK_RESULTS_VLD_STX_LANE(test_name,EXPECTED,comment) \ + CHECK(test_name, poly, 64, 1, PRIx64, EXPECTED, comment); \ + CHECK(test_name, poly, 64, 2, PRIx64, EXPECTED, comment); + + /* Declare the temporary buffers / variables. */ + DECL_ALL_VLD_STX_LANE(2); + DECL_ALL_VLD_STX_LANE(3); + DECL_ALL_VLD_STX_LANE(4); + + DUMMY_ARRAY(buffer_src, poly, 64, 1, 4); + DUMMY_ARRAY(buffer_src, poly, 64, 2, 4); + + /* Check vld2_lane/vld2q_lane. */ + clean_results (); +#undef TEST_MSG +#define TEST_MSG "VLD2_LANE/VLD2Q_LANE" + TEST_ALL_VLDX_LANE(2); + CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st2_0, " chunk 0"); + + TEST_ALL_EXTRA_CHUNKS(2, 1); + CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st2_1, " chunk 1"); + + /* Check vld3_lane/vld3q_lane. */ + clean_results (); +#undef TEST_MSG +#define TEST_MSG "VLD3_LANE/VLD3Q_LANE" + TEST_ALL_VLDX_LANE(3); + CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st3_0, " chunk 0"); + + TEST_ALL_EXTRA_CHUNKS(3, 1); + CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st3_1, " chunk 1"); + + TEST_ALL_EXTRA_CHUNKS(3, 2); + CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st3_2, " chunk 2"); + + /* Check vld4_lane/vld4q_lane. */ + clean_results (); +#undef TEST_MSG +#define TEST_MSG "VLD4_LANE/VLD4Q_LANE" + TEST_ALL_VLDX_LANE(4); + CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_0, " chunk 0"); + + TEST_ALL_EXTRA_CHUNKS(4, 1); + CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_1, " chunk 1"); + TEST_ALL_EXTRA_CHUNKS(4, 2); + + CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_2, " chunk 2"); + + TEST_ALL_EXTRA_CHUNKS(4, 3); + CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_3, " chunk 3"); + + /* In this case, input variables are arrays of vectors. */ +#define DECL_VSTX_LANE(T1, W, N, X) \ + VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector, T1, W, N, X); \ + VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector_src, T1, W, N, X); \ + VECT_VAR_DECL(result_bis_##X, T1, W, N)[X * N] + + /* We need to use a temporary result buffer (result_bis), because + the one used for other tests is not large enough. A subset of the + result data is moved from result_bis to result, and it is this + subset which is used to check the actual behavior. The next + macro enables to move another chunk of data from result_bis to + result. */ + /* We also use another extra input buffer (buffer_src), which we + fill with 0xAA, and which it used to load a vector from which we + read a given lane. */ +#define TEST_VSTX_LANE(Q, T1, T2, W, N, X, L) \ + memset (VECT_VAR(buffer_src, T1, W, N), 0xAA, \ + sizeof(VECT_VAR(buffer_src, T1, W, N))); \ + memset (VECT_VAR(result_bis_##X, T1, W, N), 0, \ + sizeof(VECT_VAR(result_bis_##X, T1, W, N))); \ + \ + VECT_ARRAY_VAR(vector_src, T1, W, N, X) = \ + vld##X##Q##_##T2##W(VECT_VAR(buffer_src, T1, W, N)); \ + \ + VECT_ARRAY_VAR(vector, T1, W, N, X) = \ + /* Use dedicated init buffer, of size X. */ \ + vld##X##Q##_lane_##T2##W(VECT_VAR(buffer_vld##X##_lane, T1, W, X), \ + VECT_ARRAY_VAR(vector_src, T1, W, N, X), \ + L); \ + vst##X##Q##_lane_##T2##W(VECT_VAR(result_bis_##X, T1, W, N), \ + VECT_ARRAY_VAR(vector, T1, W, N, X), \ + L); \ + memcpy(VECT_VAR(result, T1, W, N), VECT_VAR(result_bis_##X, T1, W, N), \ + sizeof(VECT_VAR(result, T1, W, N))); + +#define TEST_ALL_VSTX_LANE(X) \ + TEST_VSTX_LANE(, poly, p, 64, 1, X, 0); \ + TEST_VSTX_LANE(q, poly, p, 64, 2, X, 0); + + /* Check vst2_lane/vst2q_lane. */ + clean_results (); +#undef TEST_MSG +#define TEST_MSG "VST2_LANE/VST2Q_LANE" + TEST_ALL_VSTX_LANE(2); + +#define CMT " (chunk 0)" + CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st2_0, CMT); + + TEST_ALL_EXTRA_CHUNKS(2, 1); +#undef CMT +#define CMT " chunk 1" + CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st2_1, CMT); + + /* Check vst3_lane/vst3q_lane. */ + clean_results (); +#undef TEST_MSG +#define TEST_MSG "VST3_LANE/VST3Q_LANE" + TEST_ALL_VSTX_LANE(3); + +#undef CMT +#define CMT " (chunk 0)" + CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st3_0, CMT); + + TEST_ALL_EXTRA_CHUNKS(3, 1); + +#undef CMT +#define CMT " (chunk 1)" + CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st3_1, CMT); + + TEST_ALL_EXTRA_CHUNKS(3, 2); + +#undef CMT +#define CMT " (chunk 2)" + CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st3_2, CMT); + + /* Check vst4_lane/vst4q_lane. */ + clean_results (); +#undef TEST_MSG +#define TEST_MSG "VST4_LANE/VST4Q_LANE" + TEST_ALL_VSTX_LANE(4); + +#undef CMT +#define CMT " (chunk 0)" + CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_0, CMT); + + TEST_ALL_EXTRA_CHUNKS(4, 1); + +#undef CMT +#define CMT " (chunk 1)" + CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_1, CMT); + + TEST_ALL_EXTRA_CHUNKS(4, 2); + +#undef CMT +#define CMT " (chunk 2)" + CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_2, CMT); + + TEST_ALL_EXTRA_CHUNKS(4, 3); + +#undef CMT +#define CMT " (chunk 3)" + CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_3, CMT); + +#endif /* __aarch64__. */ + return 0; } diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c index 808641524c47b2c245ee2f10e74a784a7bccefc9..f192d4dda514287c8417e7fc922bc580b209b163 100644 --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c @@ -1,7 +1,8 @@ /* This file contains tests for the vreinterpret *p128 intrinsics. */ -/* { dg-require-effective-target arm_crypto_ok } */ +/* { dg-require-effective-target arm_crypto_ok { target { arm*-*-* } } } */ /* { dg-add-options arm_crypto } */ +/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } }*/ #include #include "arm-neon-ref.h" @@ -78,9 +79,7 @@ VECT_VAR_DECL(vreint_expected_q_f16_p128,hfloat,16,8) [] = { 0xfff0, 0xffff, int main (void) { DECL_VARIABLE_128BITS_VARIANTS(vreint_vector); - DECL_VARIABLE(vreint_vector, poly, 64, 2); DECL_VARIABLE_128BITS_VARIANTS(vreint_vector_res); - DECL_VARIABLE(vreint_vector_res, poly, 64, 2); clean_results (); diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c index 1d8cf9aa69f0b5b0717e98de613e3c350d6395d4..c915fd2fea6b4d8770c9a4aab88caad391105d89 100644 --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c @@ -1,7 +1,8 @@ /* This file contains tests for the vreinterpret *p64 intrinsics. */ -/* { dg-require-effective-target arm_crypto_ok } */ +/* { dg-require-effective-target arm_crypto_ok { target { arm*-*-* } } } */ /* { dg-add-options arm_crypto } */ +/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } }*/ #include #include "arm-neon-ref.h" @@ -121,11 +122,7 @@ int main (void) CHECK_FP(TEST_MSG, T1, W, N, PRIx##W, EXPECTED, ""); DECL_VARIABLE_ALL_VARIANTS(vreint_vector); - DECL_VARIABLE(vreint_vector, poly, 64, 1); - DECL_VARIABLE(vreint_vector, poly, 64, 2); DECL_VARIABLE_ALL_VARIANTS(vreint_vector_res); - DECL_VARIABLE(vreint_vector_res, poly, 64, 1); - DECL_VARIABLE(vreint_vector_res, poly, 64, 2); clean_results ();