From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Subject: [committed][AArch64] Add scatter stores for partial SVE modes
Date: Sat, 16 Nov 2019 11:31:28 +0000

This patch adds support for scatter stores of partial vectors, where
the vector base or offset elements can be wider than the elements
being stored.

Tested on aarch64-linux-gnu and applied as r278347.
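To illustrate the kind of loop this enables (a sketch modelled on the
updated scatter_store_1.c test below, not part of the patch itself; the
function name is made up for the example), a store of 8-bit data through
32-bit indices can now use a partial-vector scatter store:

#include <stdint.h>

/* With -O2 -ftree-vectorize for SVE, a loop like this can now be
   vectorized using a scatter store of byte elements held in 32-bit
   containers, e.g. "st1b z0.s, p0, [x0, z1.s, sxtw]", rather than
   staying scalar.  */
void
scatter_store_u8 (uint8_t *restrict dest, uint8_t *restrict src,
                  int32_t *indices, int n)
{
  for (int i = 0; i < n; ++i)
    dest[indices[i]] = src[i] + 1;
}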
Richard

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64-sve.md (scatter_store): Extend to...
	(scatter_store): ...this.
	(mask_scatter_store): Extend to...
	(mask_scatter_store): ...this.
	(mask_scatter_store): Extend to...
	(mask_scatter_store): ...this.
	(*mask_scatter_store_xtw_unpacked): New pattern.
	(*mask_scatter_store_sxtw): Extend to...
	(*mask_scatter_store_sxtw): ...this.
	(*mask_scatter_store_uxtw): Extend to...
	(*mask_scatter_store_uxtw): ...this.

gcc/testsuite/
	* gcc.target/aarch64/sve/scatter_store_1.c (TEST_LOOP): Start at 0.
	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
	* gcc.target/aarch64/sve/scatter_store_2.c: Update accordingly.
	* gcc.target/aarch64/sve/scatter_store_3.c (TEST_LOOP): Start at 0.
	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
	* gcc.target/aarch64/sve/scatter_store_4.c: Update accordingly.
	* gcc.target/aarch64/sve/scatter_store_5.c (TEST_LOOP): Start at 0.
	(TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements.
	* gcc.target/aarch64/sve/scatter_store_8.c: New test.
	* gcc.target/aarch64/sve/scatter_store_9.c: Likewise.

Index: gcc/config/aarch64/aarch64-sve.md
===================================================================
--- gcc/config/aarch64/aarch64-sve.md	2019-11-16 11:26:06.895163107 +0000
+++ gcc/config/aarch64/aarch64-sve.md	2019-11-16 11:28:29.386158694 +0000
@@ -2135,15 +2135,15 @@ (define_insn "@aarch64_stnt1"
 ;; -------------------------------------------------------------------------
 ;; Unpredicated scatter stores.
-(define_expand "scatter_store"
+(define_expand "scatter_store"
   [(set (mem:BLK (scratch))
 	(unspec:BLK
 	  [(match_dup 5)
 	   (match_operand:DI 0 "aarch64_sve_gather_offset_")
-	   (match_operand: 1 "register_operand")
+	   (match_operand: 1 "register_operand")
 	   (match_operand:DI 2 "const_int_operand")
 	   (match_operand:DI 3 "aarch64_gather_scale_operand_")
-	   (match_operand:SVE_FULL_SD 4 "register_operand")]
+	   (match_operand:SVE_24 4 "register_operand")]
 	  UNSPEC_ST1_SCATTER))]
   "TARGET_SVE"
   {
@@ -2153,48 +2153,74 @@ (define_expand "scatter_store"
+(define_insn "mask_scatter_store"
   [(set (mem:BLK (scratch))
 	(unspec:BLK
 	  [(match_operand:VNx4BI 5 "register_operand" "Upl, Upl, Upl, Upl, Upl, Upl")
-	   (match_operand:DI 0 "aarch64_sve_gather_offset_w" "Z, vgw, rk, rk, rk, rk")
+	   (match_operand:DI 0 "aarch64_sve_gather_offset_" "Z, vgw, rk, rk, rk, rk")
 	   (match_operand:VNx4SI 1 "register_operand" "w, w, w, w, w, w")
 	   (match_operand:DI 2 "const_int_operand" "Ui1, Ui1, Z, Ui1, Z, Ui1")
-	   (match_operand:DI 3 "aarch64_gather_scale_operand_w" "Ui1, Ui1, Ui1, Ui1, i, i")
-	   (match_operand:SVE_FULL_S 4 "register_operand" "w, w, w, w, w, w")]
+	   (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, Ui1, Ui1, Ui1, i, i")
+	   (match_operand:SVE_4 4 "register_operand" "w, w, w, w, w, w")]
 	  UNSPEC_ST1_SCATTER))]
   "TARGET_SVE"
   "@
-   st1w\t%4.s, %5, [%1.s]
-   st1w\t%4.s, %5, [%1.s, #%0]
-   st1w\t%4.s, %5, [%0, %1.s, sxtw]
-   st1w\t%4.s, %5, [%0, %1.s, uxtw]
-   st1w\t%4.s, %5, [%0, %1.s, sxtw %p3]
-   st1w\t%4.s, %5, [%0, %1.s, uxtw %p3]"
+   st1\t%4.s, %5, [%1.s]
+   st1\t%4.s, %5, [%1.s, #%0]
+   st1\t%4.s, %5, [%0, %1.s, sxtw]
+   st1\t%4.s, %5, [%0, %1.s, uxtw]
+   st1\t%4.s, %5, [%0, %1.s, sxtw %p3]
+   st1\t%4.s, %5, [%0, %1.s, uxtw %p3]"
 )
 ;; Predicated scatter stores for 64-bit elements.  The value of operand 2
 ;; doesn't matter in this case.
-(define_insn "mask_scatter_store"
+(define_insn "mask_scatter_store"
   [(set (mem:BLK (scratch))
 	(unspec:BLK
 	  [(match_operand:VNx2BI 5 "register_operand" "Upl, Upl, Upl, Upl")
-	   (match_operand:DI 0 "aarch64_sve_gather_offset_d" "Z, vgd, rk, rk")
+	   (match_operand:DI 0 "aarch64_sve_gather_offset_" "Z, vgd, rk, rk")
 	   (match_operand:VNx2DI 1 "register_operand" "w, w, w, w")
 	   (match_operand:DI 2 "const_int_operand")
-	   (match_operand:DI 3 "aarch64_gather_scale_operand_d" "Ui1, Ui1, Ui1, i")
-	   (match_operand:SVE_FULL_D 4 "register_operand" "w, w, w, w")]
+	   (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, Ui1, Ui1, i")
+	   (match_operand:SVE_2 4 "register_operand" "w, w, w, w")]
 	  UNSPEC_ST1_SCATTER))]
   "TARGET_SVE"
   "@
-   st1d\t%4.d, %5, [%1.d]
-   st1d\t%4.d, %5, [%1.d, #%0]
-   st1d\t%4.d, %5, [%0, %1.d]
-   st1d\t%4.d, %5, [%0, %1.d, lsl %p3]"
+   st1\t%4.d, %5, [%1.d]
+   st1\t%4.d, %5, [%1.d, #%0]
+   st1\t%4.d, %5, [%0, %1.d]
+   st1\t%4.d, %5, [%0, %1.d, lsl %p3]"
 )
-;; Likewise, but with the offset being sign-extended from 32 bits.
-(define_insn_and_rewrite "*mask_scatter_store_sxtw"
+;; Likewise, but with the offset being extended from 32 bits.
+(define_insn_and_rewrite "*mask_scatter_store_xtw_unpacked"
+  [(set (mem:BLK (scratch))
+	(unspec:BLK
+	  [(match_operand:VNx2BI 5 "register_operand" "Upl, Upl")
+	   (match_operand:DI 0 "register_operand" "rk, rk")
+	   (unspec:VNx2DI
+	     [(match_operand 6)
+	      (ANY_EXTEND:VNx2DI
+		(match_operand:VNx2SI 1 "register_operand" "w, w"))]
+	     UNSPEC_PRED_X)
+	   (match_operand:DI 2 "const_int_operand")
+	   (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, i")
+	   (match_operand:SVE_2 4 "register_operand" "w, w")]
+	  UNSPEC_ST1_SCATTER))]
+  "TARGET_SVE"
+  "@
+   st1\t%4.d, %5, [%0, %1.d, xtw]
+   st1\t%4.d, %5, [%0, %1.d, xtw %p3]"
+  "&& !CONSTANT_P (operands[6])"
+  {
+    operands[6] = CONSTM1_RTX (mode);
+  }
+)
+
+;; Likewise, but with the offset being truncated to 32 bits and then
+;; sign-extended.
+(define_insn_and_rewrite "*mask_scatter_store_sxtw"
   [(set (mem:BLK (scratch))
 	(unspec:BLK
 	  [(match_operand:VNx2BI 5 "register_operand" "Upl, Upl")
@@ -2206,21 +2232,22 @@ (define_insn_and_rewrite "*mask_scatter_
 		  (match_operand:VNx2DI 1 "register_operand" "w, w")))]
 	     UNSPEC_PRED_X)
 	   (match_operand:DI 2 "const_int_operand")
-	   (match_operand:DI 3 "aarch64_gather_scale_operand_d" "Ui1, i")
-	   (match_operand:SVE_FULL_D 4 "register_operand" "w, w")]
+	   (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, i")
+	   (match_operand:SVE_2 4 "register_operand" "w, w")]
 	  UNSPEC_ST1_SCATTER))]
   "TARGET_SVE"
   "@
-   st1d\t%4.d, %5, [%0, %1.d, sxtw]
-   st1d\t%4.d, %5, [%0, %1.d, sxtw %p3]"
-  "&& !rtx_equal_p (operands[5], operands[6])"
+   st1\t%4.d, %5, [%0, %1.d, sxtw]
+   st1\t%4.d, %5, [%0, %1.d, sxtw %p3]"
+  "&& !CONSTANT_P (operands[6])"
   {
-    operands[6] = copy_rtx (operands[5]);
+    operands[6] = CONSTM1_RTX (mode);
   }
 )
-;; Likewise, but with the offset being zero-extended from 32 bits.
-(define_insn "*mask_scatter_store_uxtw"
+;; Likewise, but with the offset being truncated to 32 bits and then
+;; zero-extended.
+(define_insn "*mask_scatter_store_uxtw"
   [(set (mem:BLK (scratch))
 	(unspec:BLK
 	  [(match_operand:VNx2BI 5 "register_operand" "Upl, Upl")
@@ -2229,13 +2256,13 @@ (define_insn "*mask_scatter_store<
 	     (match_operand:VNx2DI 1 "register_operand" "w, w")
 	     (match_operand:VNx2DI 6 "aarch64_sve_uxtw_immediate"))
 	   (match_operand:DI 2 "const_int_operand")
-	   (match_operand:DI 3 "aarch64_gather_scale_operand_d" "Ui1, i")
-	   (match_operand:SVE_FULL_D 4 "register_operand" "w, w")]
+	   (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, i")
+	   (match_operand:SVE_2 4 "register_operand" "w, w")]
 	  UNSPEC_ST1_SCATTER))]
   "TARGET_SVE"
   "@
-   st1d\t%4.d, %5, [%0, %1.d, uxtw]
-   st1d\t%4.d, %5, [%0, %1.d, uxtw %p3]"
+   st1\t%4.d, %5, [%0, %1.d, uxtw]
+   st1\t%4.d, %5, [%0, %1.d, uxtw %p3]"
 )
 ;; -------------------------------------------------------------------------
Index: gcc/testsuite/gcc.target/aarch64/sve/scatter_store_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/scatter_store_1.c	2019-03-08 18:14:29.792994691 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/scatter_store_1.c	2019-11-16 11:28:29.386158694 +0000
@@ -13,11 +13,15 @@ #define TEST_LOOP(DATA_TYPE, BITS) \
   f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
 		 INDEX##BITS *indices, int n) \
   { \
-    for (int i = 9; i < n; ++i) \
+    for (int i = 0; i < n; ++i) \
       dest[indices[i]] = src[i] + 1; \
   }
 #define TEST_ALL(T) \
+  T (int8_t, 32) \
+  T (uint8_t, 32) \
+  T (int16_t, 32) \
+  T (uint16_t, 32) \
   T (int32_t, 32) \
   T (uint32_t, 32) \
   T (float, 32) \
@@ -27,5 +31,7 @@ #define TEST_ALL(T) \
 TEST_ALL (TEST_LOOP)
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, sxtw\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, sxtw 1\]\n} 2 } } */
 /* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, sxtw 2\]\n} 3 } } */
 /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x[0-9]+, z[0-9]+.d, lsl 3\]\n} 3 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/scatter_store_2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/scatter_store_2.c	2019-03-08 18:14:29.764994797 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/scatter_store_2.c	2019-11-16 11:28:29.386158694 +0000
@@ -6,5 +6,7 @@ #define INDEX64 uint64_t
 #include "scatter_store_1.c"
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, uxtw\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, uxtw 1\]\n} 2 } } */
 /* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, uxtw 2\]\n} 3 } } */
 /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x[0-9]+, z[0-9]+.d, lsl 3\]\n} 3 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/scatter_store_3.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/scatter_store_3.c	2019-03-08 18:14:29.768994780 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/scatter_store_3.c	2019-11-16 11:28:29.386158694 +0000
@@ -8,17 +8,20 @@
 #define INDEX32 int32_t
 #define INDEX64 int64_t
 #endif
-/* Invoked 18 times for each data size.  */
 #define TEST_LOOP(DATA_TYPE, BITS) \
   void __attribute__ ((noinline, noclone)) \
   f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
 		 INDEX##BITS *indices, int n) \
   { \
-    for (int i = 9; i < n; ++i) \
+    for (int i = 0; i < n; ++i) \
       *(DATA_TYPE *) ((char *) dest + indices[i]) = src[i] + 1; \
   }
 #define TEST_ALL(T) \
+  T (int8_t, 32) \
+  T (uint8_t, 32) \
+  T (int16_t, 32) \
+  T (uint16_t, 32) \
   T (int32_t, 32) \
   T (uint32_t, 32) \
   T (float, 32) \
@@ -28,5 +31,7 @@ #define TEST_ALL(T) \
 TEST_ALL (TEST_LOOP)
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, sxtw\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, sxtw\]\n} 2 } } */
 /* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, sxtw\]\n} 3 } } */
 /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x[0-9]+, z[0-9]+.d\]\n} 3 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/scatter_store_4.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/scatter_store_4.c	2019-03-08 18:14:29.772994767 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/scatter_store_4.c	2019-11-16 11:28:29.386158694 +0000
@@ -6,5 +6,7 @@ #define INDEX64 uint64_t
 #include "scatter_store_3.c"
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, uxtw\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, uxtw\]\n} 2 } } */
 /* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, uxtw\]\n} 3 } } */
 /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x[0-9]+, z[0-9]+.d\]\n} 3 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/scatter_store_5.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/scatter_store_5.c	2019-03-08 18:14:29.776994751 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/scatter_store_5.c	2019-11-16 11:28:29.386158694 +0000
@@ -3,21 +3,29 @@
 #include 
-/* Invoked 18 times for each data size.  */
 #define TEST_LOOP(DATA_TYPE) \
   void __attribute__ ((noinline, noclone)) \
   f_##DATA_TYPE (DATA_TYPE *restrict *dest, DATA_TYPE *restrict src, \
 		 int n) \
   { \
-    for (int i = 9; i < n; ++i) \
+    for (int i = 0; i < n; ++i) \
       *dest[i] = src[i] + 1; \
   }
 #define TEST_ALL(T) \
+  T (int8_t) \
+  T (uint8_t) \
+  T (int16_t) \
+  T (uint16_t) \
+  T (int32_t) \
+  T (uint32_t) \
   T (int64_t) \
   T (uint64_t) \
   T (double)
 TEST_ALL (TEST_LOOP)
+/* We assume this isn't profitable for bytes.  */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.d, p[0-7], \[z[0-9]+.d\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.d, p[0-7], \[z[0-9]+.d\]\n} 2 } } */
 /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[z[0-9]+.d\]\n} 3 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/scatter_store_8.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/scatter_store_8.c	2019-11-16 11:28:29.386158694 +0000
@@ -0,0 +1,46 @@
+/* { dg-do assemble { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fwrapv --save-temps" } */
+
+#include 
+
+#ifndef INDEX32
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS) \
+  void __attribute__ ((noinline, noclone)) \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
+		 INDEX##BITS *indices, INDEX##BITS mask, int n) \
+  { \
+    for (int i = 0; i < n; ++i) \
+      dest[(INDEX##BITS) (indices[i] + mask)] = src[i]; \
+  }
+
+#define TEST_ALL(T) \
+  T (int8_t, 16) \
+  T (uint8_t, 16) \
+  T (int16_t, 16) \
+  T (uint16_t, 16) \
+  T (_Float16, 16) \
+  T (int32_t, 16) \
+  T (uint32_t, 16) \
+  T (float, 16) \
+  T (int64_t, 32) \
+  T (uint64_t, 32) \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, sxtw\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, sxtw 1\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, sxtw 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x[0-9]+, z[0-9]+.d, sxtw 3\]\n} 3 } } */
+
+/* { dg-final { scan-assembler-times {\tsxt.\tz} 8 } } */
+/* { dg-final { scan-assembler-times {\tsxth\tz[0-9]+\.s,} 8 } } */
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.s,} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.s,} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s,} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 3 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/scatter_store_9.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/scatter_store_9.c	2019-11-16 11:28:29.386158694 +0000
@@ -0,0 +1,20 @@
+/* { dg-do assemble { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fwrapv --save-temps" } */
+
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+
+#include "scatter_store_8.c"
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, uxtw\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, uxtw 1\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x[0-9]+, z[0-9]+.s, uxtw 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x[0-9]+, z[0-9]+.d, uxtw 3\]\n} 3 } } */
+
+/* { dg-final { scan-assembler-times {\tuxt.\tz} 8 } } */
+/* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.s,} 8 } } */
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.s,} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.s,} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s,} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 3 } } */