From patchwork Wed Aug 14 12:27:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1147009 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-506933-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="C4ZLvGvC"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 467pkX5SSrz9sN1 for ; Wed, 14 Aug 2019 22:27:30 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:mime-version:content-type; q=dns; s=default; b=L8L6awJbrh2waL+pidL+KAiD1Tzlfs0E4y6zsN9P/j+cr59rU0 ZXdsqDJxI/+ZQvO8KY02AnLKeg4n6p4niVymrA+X/rIVzHfZ5QEE8PGj5PTENTIf lCxNwp6oqA6JWGpUVZXdu2PB7A414qIMxhWhl/h4jercolySKvbl34o4o= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:mime-version:content-type; s= default; bh=MIdBwDMw9p7qnShanT3cyZXIaQk=; b=C4ZLvGvCzV9W0ZSpeWto MjtU5mH8ddJDourHI44xFls2QxJpFl+dScHYwFjLS10584QKN4V40T5Wcixo3lD1 D3VPpqD9fW00+AgpD7cMjdEz2UAl13zUHYOYDNedshisLeROAW3UIwfqdtmjQzMi Wn8yX8LUoe5w7FV3mDpeeXY= Received: (qmail 12975 invoked by alias); 14 Aug 2019 12:27:21 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 12967 invoked by uid 89); 14 Aug 2019 12:27:20 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-8.4 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS autolearn=ham version=3.3.1 spammy= X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.110.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 14 Aug 2019 12:27:10 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B7B2828; Wed, 14 Aug 2019 05:27:08 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.99.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 0AB403F706; Wed, 14 Aug 2019 05:27:07 -0700 (PDT) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, Prathamesh Kulkarni , richard.sandiford@arm.com Cc: Prathamesh Kulkarni Subject: Add IFN_COND functions for shifting Date: Wed, 14 Aug 2019 13:27:06 +0100 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-IsSubscribed: yes This patch adds support for IFN_COND shifts left and shifts right. This is mostly mechanical, but since we try to handle conditional operations in the same way as unconditional operations in match.pd, we need to support IFN_COND shifts by scalars as well as vectors. E.g.: IFN_COND_SHL (cond, a, { 1, 1, ... }, fallback) and: IFN_COND_SHL (cond, a, 1, fallback) are the same operation, with: (for shiftrotate (lrotate rrotate lshift rshift) ... /* Prefer vector1 << scalar to vector1 << vector2 if vector2 is uniform. */ (for vec (VECTOR_CST CONSTRUCTOR) (simplify (shiftrotate @0 vec@1) (with { tree tem = uniform_vector_p (@1); } (if (tem) (shiftrotate @0 { tem; })))))) preferring the latter. The patch copes with this by extending create_convert_operand_from to handle scalar-to-vector conversions. Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf and x86_64-linux-gnu. OK for the generic bits? Richard 2019-08-14 Richard Sandiford Prathamesh Kulkarni gcc/ * internal-fn.def (IFN_COND_SHL, IFN_COND_SHR): New internal functions. * internal-fn.c (FOR_EACH_CODE_MAPPING): Handle shifts. * match.pd (UNCOND_BINARY, COND_BINARY): Likewise. * optabs.def (cond_ashl_optab, cond_ashr_optab, cond_lshr_optab): New optabs. * optabs.h (create_convert_operand_from): Expand comment. * optabs.c (maybe_legitimize_operand): Allow implicit broadcasts when mapping scalar rtxes to vector operands. * config/aarch64/iterators.md (SVE_INT_BINARY): Add ashift, ashiftrt and lshiftrt. (sve_int_op, sve_int_op_rev, sve_pred_int_rhs2_operand): Handle them. * config/aarch64/aarch64-sve.md (*cond__2_const) (*cond__any_const): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_shift_1.c: New test. * gcc.target/aarch64/sve/cond_shift_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_4.c: Likewise. * gcc.target/aarch64/sve/cond_shift_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9_run.c: Likewise. Index: gcc/internal-fn.def =================================================================== --- gcc/internal-fn.def 2019-06-18 09:35:54.921869466 +0100 +++ gcc/internal-fn.def 2019-08-14 13:22:08.625843346 +0100 @@ -167,6 +167,10 @@ DEF_INTERNAL_OPTAB_FN (COND_IOR, ECF_CON cond_ior, cond_binary) DEF_INTERNAL_OPTAB_FN (COND_XOR, ECF_CONST | ECF_NOTHROW, cond_xor, cond_binary) +DEF_INTERNAL_OPTAB_FN (COND_SHL, ECF_CONST | ECF_NOTHROW, + cond_ashl, cond_binary) +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_SHR, ECF_CONST | ECF_NOTHROW, first, + cond_ashr, cond_lshr, cond_binary) DEF_INTERNAL_OPTAB_FN (COND_FMA, ECF_CONST, cond_fma, cond_ternary) DEF_INTERNAL_OPTAB_FN (COND_FMS, ECF_CONST, cond_fms, cond_ternary) Index: gcc/internal-fn.c =================================================================== --- gcc/internal-fn.c 2019-07-10 19:41:21.623936245 +0100 +++ gcc/internal-fn.c 2019-08-14 13:22:08.625843346 +0100 @@ -3286,7 +3286,9 @@ #define FOR_EACH_CODE_MAPPING(T) \ T (MAX_EXPR, IFN_COND_MAX) \ T (BIT_AND_EXPR, IFN_COND_AND) \ T (BIT_IOR_EXPR, IFN_COND_IOR) \ - T (BIT_XOR_EXPR, IFN_COND_XOR) + T (BIT_XOR_EXPR, IFN_COND_XOR) \ + T (LSHIFT_EXPR, IFN_COND_SHL) \ + T (RSHIFT_EXPR, IFN_COND_SHR) /* Return a function that only performs CODE when a certain condition is met and that uses a given fallback value otherwise. For example, if CODE is Index: gcc/match.pd =================================================================== --- gcc/match.pd 2019-07-29 09:39:48.690173827 +0100 +++ gcc/match.pd 2019-08-14 13:22:08.625843346 +0100 @@ -83,12 +83,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) plus minus mult trunc_div trunc_mod rdiv min max - bit_and bit_ior bit_xor) + bit_and bit_ior bit_xor + lshift rshift) (define_operator_list COND_BINARY IFN_COND_ADD IFN_COND_SUB IFN_COND_MUL IFN_COND_DIV IFN_COND_MOD IFN_COND_RDIV IFN_COND_MIN IFN_COND_MAX - IFN_COND_AND IFN_COND_IOR IFN_COND_XOR) + IFN_COND_AND IFN_COND_IOR IFN_COND_XOR + IFN_COND_SHL IFN_COND_SHR) /* Same for ternary operations. */ (define_operator_list UNCOND_TERNARY Index: gcc/optabs.def =================================================================== --- gcc/optabs.def 2019-07-03 20:50:42.826346402 +0100 +++ gcc/optabs.def 2019-08-14 13:22:08.625843346 +0100 @@ -230,6 +230,9 @@ OPTAB_D (cond_umod_optab, "cond_umod$a") OPTAB_D (cond_and_optab, "cond_and$a") OPTAB_D (cond_ior_optab, "cond_ior$a") OPTAB_D (cond_xor_optab, "cond_xor$a") +OPTAB_D (cond_ashl_optab, "cond_ashl$a") +OPTAB_D (cond_ashr_optab, "cond_ashr$a") +OPTAB_D (cond_lshr_optab, "cond_lshr$a") OPTAB_D (cond_smin_optab, "cond_smin$a") OPTAB_D (cond_smax_optab, "cond_smax$a") OPTAB_D (cond_umin_optab, "cond_umin$a") Index: gcc/optabs.h =================================================================== --- gcc/optabs.h 2019-07-10 19:41:21.607936374 +0100 +++ gcc/optabs.h 2019-08-14 13:22:08.625843346 +0100 @@ -129,7 +129,11 @@ create_convert_operand_to (class expand_ /* Make OP describe an input operand that should have the same value as VALUE, after any mode conversion that the backend might request. If VALUE is a CONST_INT, it should be treated as having mode MODE. - UNSIGNED_P says whether VALUE is unsigned. */ + UNSIGNED_P says whether VALUE is unsigned. + + The conversion of VALUE can include a combination of numerical + conversion (as for convert_modes) and duplicating a scalar to fill + a vector (if VALUE is a scalar but the operand is a vector). */ static inline void create_convert_operand_from (class expand_operand *op, rtx value, Index: gcc/optabs.c =================================================================== --- gcc/optabs.c 2019-07-29 09:39:47.782181127 +0100 +++ gcc/optabs.c 2019-08-14 13:22:08.625843346 +0100 @@ -7212,7 +7212,7 @@ maybe_legitimize_operand_same_code (enum maybe_legitimize_operand (enum insn_code icode, unsigned int opno, class expand_operand *op) { - machine_mode mode, imode; + machine_mode mode, imode, tmode; mode = op->mode; switch (op->type) @@ -7259,9 +7259,17 @@ maybe_legitimize_operand (enum insn_code gcc_assert (mode != VOIDmode); imode = insn_data[(int) icode].operand[opno].mode; + tmode = (VECTOR_MODE_P (imode) && !VECTOR_MODE_P (mode) + ? GET_MODE_INNER (imode) : imode); + if (tmode != VOIDmode && tmode != mode) + { + op->value = convert_modes (tmode, mode, op->value, op->unsigned_p); + mode = tmode; + } if (imode != VOIDmode && imode != mode) { - op->value = convert_modes (imode, mode, op->value, op->unsigned_p); + gcc_assert (VECTOR_MODE_P (imode) && !VECTOR_MODE_P (mode)); + op->value = expand_vector_broadcast (imode, op->value); mode = imode; } goto input; Index: gcc/config/aarch64/iterators.md =================================================================== --- gcc/config/aarch64/iterators.md 2019-08-14 12:00:23.769690070 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 13:22:08.621843377 +0100 @@ -1280,6 +1280,7 @@ (define_code_iterator SVE_INT_UNARY [abs ;; SVE integer binary operations. (define_code_iterator SVE_INT_BINARY [plus minus mult smax umax smin umin + ashift ashiftrt lshiftrt and ior xor]) ;; SVE integer binary division operations. @@ -1475,6 +1476,9 @@ (define_code_attr sve_int_op [(plus "add (smax "smax") (umin "umin") (umax "umax") + (ashift "lsl") + (ashiftrt "asr") + (lshiftrt "lsr") (and "and") (ior "orr") (xor "eor") @@ -1484,17 +1488,20 @@ (define_code_attr sve_int_op [(plus "add (popcount "cnt")]) (define_code_attr sve_int_op_rev [(plus "add") - (minus "subr") - (mult "mul") - (div "sdivr") - (udiv "udivr") - (smin "smin") - (smax "smax") - (umin "umin") - (umax "umax") - (and "and") - (ior "orr") - (xor "eor")]) + (minus "subr") + (mult "mul") + (div "sdivr") + (udiv "udivr") + (smin "smin") + (smax "smax") + (umin "umin") + (umax "umax") + (ashift "lslr") + (ashiftrt "asrr") + (lshiftrt "lsrr") + (and "and") + (ior "orr") + (xor "eor")]) ;; The floating-point SVE instruction that implements an rtx code. (define_code_attr sve_fp_op [(plus "fadd") @@ -1535,6 +1542,9 @@ (define_code_attr sve_pred_int_rhs2_oper (umax "register_operand") (smin "register_operand") (umin "register_operand") + (ashift "aarch64_sve_lshift_operand") + (ashiftrt "aarch64_sve_rshift_operand") + (lshiftrt "aarch64_sve_rshift_operand") (and "aarch64_sve_pred_and_operand") (ior "register_operand") (xor "register_operand")]) Index: gcc/config/aarch64/aarch64-sve.md =================================================================== --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 12:04:06.304063627 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 13:22:08.621843377 +0100 @@ -1772,7 +1772,10 @@ (define_insn "*one_cmpl3" ;; Includes: ;; - ADD (merging form only) ;; - AND (merging form only) +;; - ASR (merging form only) ;; - EOR (merging form only) +;; - LSL (merging form only) +;; - LSR (merging form only) ;; - MUL ;; - ORR (merging form only) ;; - SMAX @@ -2405,6 +2408,49 @@ (define_insn "*post_ra_v3" "\t%0., %1., #%2" ) +;; Predicated integer shift, merging with the first input. +(define_insn "*cond__2_const" + [(set (match_operand:SVE_I 0 "register_operand" "=w, ?&w") + (unspec:SVE_I + [(match_operand: 1 "register_operand" "Upl, Upl") + (ASHIFT:SVE_I + (match_operand:SVE_I 2 "register_operand" "0, w") + (match_operand:SVE_I 3 "aarch64_simd_shift_imm")) + (match_dup 2)] + UNSPEC_SEL))] + "TARGET_SVE" + "@ + \t%0., %1/m, %0., #%3 + movprfx\t%0, %2\;\t%0., %1/m, %0., #%3" + [(set_attr "movprfx" "*,yes")] +) + +;; Predicated integer shift, merging with an independent value. +(define_insn_and_rewrite "*cond__any_const" + [(set (match_operand:SVE_I 0 "register_operand" "=w, &w, ?&w") + (unspec:SVE_I + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") + (ASHIFT:SVE_I + (match_operand:SVE_I 2 "register_operand" "w, w, w") + (match_operand:SVE_I 3 "aarch64_simd_shift_imm")) + (match_operand:SVE_I 4 "aarch64_simd_reg_or_zero" "Dz, 0, w")] + UNSPEC_SEL))] + "TARGET_SVE && !rtx_equal_p (operands[2], operands[4])" + "@ + movprfx\t%0., %1/z, %2.\;\t%0., %1/m, %0., #%3 + movprfx\t%0., %1/m, %2.\;\t%0., %1/m, %0., #%3 + #" + "&& reload_completed + && register_operand (operands[4], mode) + && !rtx_equal_p (operands[0], operands[4])" + { + emit_insn (gen_vcond_mask_ (operands[0], operands[2], + operands[4], operands[1])); + operands[4] = operands[2] = operands[0]; + } + [(set_attr "movprfx" "yes")] +) + ;; ------------------------------------------------------------------------- ;; ---- [FP] General binary arithmetic corresponding to rtx codes ;; ------------------------------------------------------------------------- Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_1.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_1.c 2019-08-14 13:22:08.625843346 +0100 @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define DEF_LOOP(TYPE, NAME, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict b, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + r[i] = a[i] > 20 ? b[i] OP 3 : b[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, shl, <<) \ + T (TYPE, shr, >>) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int8_t) \ + TEST_TYPE (T, uint8_t) \ + TEST_TYPE (T, int16_t) \ + TEST_TYPE (T, uint16_t) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) \ + TEST_TYPE (T, int64_t) \ + TEST_TYPE (T, uint64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.b, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.h, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.d, p[0-7]/m,} 2 } } */ + +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_1_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_1_run.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,27 @@ +/* { dg-do run { target { aarch64_sve_hw } } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_shift_1.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, OP) \ + { \ + TYPE r[N], a[N], b[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + a[i] = (i & 1 ? i : 3 * i); \ + b[i] = (i >> 4) << (i & 15); \ + asm volatile ("" ::: "memory"); \ + } \ + test_##TYPE##_##NAME (r, a, b, N); \ + for (int i = 0; i < N; ++i) \ + if (r[i] != (TYPE) (a[i] > 20 ? b[i] OP 3 : b[i])) \ + __builtin_abort (); \ + } + +int main () +{ + TEST_ALL (TEST_LOOP) + return 0; +} Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_2.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_2.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,52 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define DEF_LOOP(TYPE, NAME, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict b, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + r[i] = a[i] > 20 ? b[i] OP 3 : a[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, shl, <<) \ + T (TYPE, shr, >>) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int8_t) \ + TEST_TYPE (T, uint8_t) \ + TEST_TYPE (T, int16_t) \ + TEST_TYPE (T, uint16_t) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) \ + TEST_TYPE (T, int64_t) \ + TEST_TYPE (T, uint64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.b, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.h, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.d, p[0-7]/m,} 2 } } */ + +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b\n} 4 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 4 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 4 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 4 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_2_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_2_run.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,27 @@ +/* { dg-do run { target { aarch64_sve_hw } } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_shift_2.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, OP) \ + { \ + TYPE r[N], a[N], b[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + a[i] = (i & 1 ? i : 3 * i); \ + b[i] = (i >> 4) << (i & 15); \ + asm volatile ("" ::: "memory"); \ + } \ + test_##TYPE##_##NAME (r, a, b, N); \ + for (int i = 0; i < N; ++i) \ + if (r[i] != (TYPE) (a[i] > 20 ? b[i] OP 3 : a[i])) \ + __builtin_abort (); \ + } + +int main () +{ + TEST_ALL (TEST_LOOP) + return 0; +} Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_3.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_3.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define DEF_LOOP(TYPE, NAME, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict b, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + r[i] = a[i] > 20 ? b[i] OP 3 : 72; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, shl, <<) \ + T (TYPE, shr, >>) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int8_t) \ + TEST_TYPE (T, uint8_t) \ + TEST_TYPE (T, int16_t) \ + TEST_TYPE (T, uint16_t) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) \ + TEST_TYPE (T, int64_t) \ + TEST_TYPE (T, uint64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.b, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.h, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.d, p[0-7]/m,} 2 } } */ + +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-times {\tsel\t} 16 } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_3_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_3_run.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,27 @@ +/* { dg-do run { target { aarch64_sve_hw } } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_shift_3.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, OP) \ + { \ + TYPE r[N], a[N], b[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + a[i] = (i & 1 ? i : 3 * i); \ + b[i] = (i >> 4) << (i & 15); \ + asm volatile ("" ::: "memory"); \ + } \ + test_##TYPE##_##NAME (r, a, b, N); \ + for (int i = 0; i < N; ++i) \ + if (r[i] != (TYPE) (a[i] > 20 ? b[i] OP 3 : 72)) \ + __builtin_abort (); \ + } + +int main () +{ + TEST_ALL (TEST_LOOP) + return 0; +} Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_4.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_4.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,52 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define DEF_LOOP(TYPE, NAME, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict b, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + r[i] = a[i] > 20 ? b[i] OP 3 : 0; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, shl, <<) \ + T (TYPE, shr, >>) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int8_t) \ + TEST_TYPE (T, uint8_t) \ + TEST_TYPE (T, int16_t) \ + TEST_TYPE (T, uint16_t) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) \ + TEST_TYPE (T, int64_t) \ + TEST_TYPE (T, uint64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.b, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.h, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.d, p[0-7]/m,} 2 } } */ + +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.b, p[0-7]/z, z[0-9]+\.b\n} 4 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.h, p[0-7]/z, z[0-9]+\.h\n} 4 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 4 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 4 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_4_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_4_run.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,27 @@ +/* { dg-do run { target { aarch64_sve_hw } } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_shift_4.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, OP) \ + { \ + TYPE r[N], a[N], b[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + a[i] = (i & 1 ? i : 3 * i); \ + b[i] = (i >> 4) << (i & 15); \ + asm volatile ("" ::: "memory"); \ + } \ + test_##TYPE##_##NAME (r, a, b, N); \ + for (int i = 0; i < N; ++i) \ + if (r[i] != (TYPE) (a[i] > 20 ? b[i] OP 3 : 0)) \ + __builtin_abort (); \ + } + +int main () +{ + TEST_ALL (TEST_LOOP) + return 0; +} Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_5.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_5.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define DEF_LOOP(TYPE, NAME, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict b, TYPE *__restrict c, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + r[i] = a[i] > 20 ? b[i] OP c[i] : b[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, shl, <<) \ + T (TYPE, shr, >>) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) \ + TEST_TYPE (T, int64_t) \ + TEST_TYPE (T, uint64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.d, p[0-7]/m,} 2 } } */ + +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_5_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_5_run.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,28 @@ +/* { dg-do run { target { aarch64_sve_hw } } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_shift_5.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, OP) \ + { \ + TYPE r[N], a[N], b[N], c[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + a[i] = (i & 1 ? i : 3 * i); \ + b[i] = (i >> 4) << (i & 15); \ + c[i] = ~i & 7; \ + asm volatile ("" ::: "memory"); \ + } \ + test_##TYPE##_##NAME (r, a, b, c, N); \ + for (int i = 0; i < N; ++i) \ + if (r[i] != (TYPE) (a[i] > 20 ? b[i] OP c[i] : b[i])) \ + __builtin_abort (); \ + } + +int main () +{ + TEST_ALL (TEST_LOOP) + return 0; +} Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_6.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_6.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,33 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define DEF_LOOP(TYPE, NAME, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict b, TYPE *__restrict c, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + r[i] = a[i] > 20 ? b[i] OP c[i] : c[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, shl, <<) \ + T (TYPE, shr, >>) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tlslr\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ + +/* { dg-final { scan-assembler-times {\tasrr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tlsrr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_6_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_6_run.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,28 @@ +/* { dg-do run { target { aarch64_sve_hw } } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_shift_6.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, OP) \ + { \ + TYPE r[N], a[N], b[N], c[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + a[i] = (i & 1 ? i : 3 * i); \ + b[i] = (i >> 4) << (i & 15); \ + c[i] = ~i & 7; \ + asm volatile ("" ::: "memory"); \ + } \ + test_##TYPE##_##NAME (r, a, b, c, N); \ + for (int i = 0; i < N; ++i) \ + if (r[i] != (TYPE) (a[i] > 20 ? b[i] OP c[i] : c[i])) \ + __builtin_abort (); \ + } + +int main () +{ + TEST_ALL (TEST_LOOP) + return 0; +} Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_7.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_7.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,40 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define DEF_LOOP(TYPE, NAME, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict b, TYPE *__restrict c, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + r[i] = a[i] > 20 ? b[i] OP c[i] : a[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, shl, <<) \ + T (TYPE, shr, >>) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) \ + TEST_TYPE (T, int64_t) \ + TEST_TYPE (T, uint64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.d, p[0-7]/m,} 2 } } */ + +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 4 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 4 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_7_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_7_run.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,28 @@ +/* { dg-do run { target { aarch64_sve_hw } } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_shift_7.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, OP) \ + { \ + TYPE r[N], a[N], b[N], c[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + a[i] = (i & 1 ? i : 3 * i); \ + b[i] = (i >> 4) << (i & 15); \ + c[i] = ~i & 7; \ + asm volatile ("" ::: "memory"); \ + } \ + test_##TYPE##_##NAME (r, a, b, c, N); \ + for (int i = 0; i < N; ++i) \ + if (r[i] != (TYPE) (a[i] > 20 ? b[i] OP c[i] : a[i])) \ + __builtin_abort (); \ + } + +int main () +{ + TEST_ALL (TEST_LOOP) + return 0; +} Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_8.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_8.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define DEF_LOOP(TYPE, NAME, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict b, TYPE *__restrict c, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + r[i] = a[i] > 20 ? b[i] OP c[i] : 91; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, shl, <<) \ + T (TYPE, shr, >>) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) \ + TEST_TYPE (T, int64_t) \ + TEST_TYPE (T, uint64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.d, p[0-7]/m,} 2 } } */ + +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-times {\tsel\t} 8 } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_8_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_8_run.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,28 @@ +/* { dg-do run { target { aarch64_sve_hw } } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_shift_8.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, OP) \ + { \ + TYPE r[N], a[N], b[N], c[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + a[i] = (i & 1 ? i : 3 * i); \ + b[i] = (i >> 4) << (i & 15); \ + c[i] = ~i & 7; \ + asm volatile ("" ::: "memory"); \ + } \ + test_##TYPE##_##NAME (r, a, b, c, N); \ + for (int i = 0; i < N; ++i) \ + if (r[i] != (TYPE) (a[i] > 20 ? b[i] OP c[i] : 91)) \ + __builtin_abort (); \ + } + +int main () +{ + TEST_ALL (TEST_LOOP) + return 0; +} Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_9.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_9.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,40 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define DEF_LOOP(TYPE, NAME, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict b, TYPE *__restrict c, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + r[i] = a[i] > 20 ? b[i] OP c[i] : 0; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, shl, <<) \ + T (TYPE, shr, >>) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) \ + TEST_TYPE (T, int64_t) \ + TEST_TYPE (T, uint64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tlslr?\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tlslr?\tz[0-9]+\.d, p[0-7]/m,} 2 } } */ + +/* { dg-final { scan-assembler-times {\tasrr?\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tasrr?\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tlsrr?\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tlsrr?\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 4 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 4 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/cond_shift_9_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_shift_9_run.c 2019-08-14 13:22:08.629843319 +0100 @@ -0,0 +1,28 @@ +/* { dg-do run { target { aarch64_sve_hw } } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_shift_9.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, OP) \ + { \ + TYPE r[N], a[N], b[N], c[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + a[i] = (i & 1 ? i : 3 * i); \ + b[i] = (i >> 4) << (i & 15); \ + c[i] = ~i & 7; \ + asm volatile ("" ::: "memory"); \ + } \ + test_##TYPE##_##NAME (r, a, b, c, N); \ + for (int i = 0; i < N; ++i) \ + if (r[i] != (TYPE) (a[i] > 20 ? b[i] OP c[i] : 0)) \ + __builtin_abort (); \ + } + +int main () +{ + TEST_ALL (TEST_LOOP) + return 0; +}