From patchwork Wed Aug 14 09:11:10 2019
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1146859
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Subject: [committed][AArch64] Add support for SVE [SU]{MAX,MIN} immediate
Date: Wed, 14 Aug 2019 10:11:10 +0100

This patch adds support for the immediate forms of SVE SMAX, SMIN, UMAX
and UMIN.  SMAX and SMIN take the same range as MUL, so the patch
basically just moves and generalises the existing MUL patterns.

Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf.
Applied as r274439.

Richard


2019-08-14  Richard Sandiford

gcc/
	* config/aarch64/constraints.md (vsb): New constraint.
	(vsm): Generalize description.
	* config/aarch64/iterators.md (SVE_INT_BINARY_IMM): New code
	iterator.
	(sve_imm_con): Handle smax, smin, umax and umin.
	(sve_imm_prefix): New code attribute.
	* config/aarch64/predicates.md (aarch64_sve_vsb_immediate)
	(aarch64_sve_vsb_operand): New predicates.
	(aarch64_sve_mul_immediate): Rename to...
	(aarch64_sve_vsm_immediate): ...this.
	(aarch64_sve_mul_operand): Rename to...
	(aarch64_sve_vsm_operand): ...this.
	* config/aarch64/aarch64-sve.md (mul3): Generalize to...
	(3): ...this.
	(*mul3, *post_ra_mul3): Generalize to...
	(*3)
	(*post_ra_3): ...these and add movprfx support for the immediate
	alternatives.
	(3, *3): Delete in favor of the above.
	(*3): Fix incorrect predicate for operand 3.

gcc/testsuite/
	* gcc.target/aarch64/sve/smax_1.c: New test.
	* gcc.target/aarch64/sve/smin_1.c: Likewise.
	* gcc.target/aarch64/sve/umax_1.c: Likewise.
	* gcc.target/aarch64/sve/umin_1.c: Likewise.

Index: gcc/config/aarch64/constraints.md
===================================================================
--- gcc/config/aarch64/constraints.md	2019-08-13 11:39:54.753376024 +0100
+++ gcc/config/aarch64/constraints.md	2019-08-14 10:08:03.446774020 +0100
@@ -388,6 +388,12 @@ (define_constraint "vsa"
    arithmetic instructions."
  (match_operand 0 "aarch64_sve_arith_immediate"))
 
+(define_constraint "vsb"
+  "@internal
+   A constraint that matches an immediate operand valid for SVE UMAX
+   and UMIN operations."
+ (match_operand 0 "aarch64_sve_vsb_immediate"))
+
 (define_constraint "vsc"
   "@internal
    A constraint that matches a signed immediate operand valid for SVE
@@ -420,9 +426,9 @@ (define_constraint "vsl"
 
 (define_constraint "vsm"
   "@internal
-   A constraint that matches an immediate operand valid for SVE MUL
-   operations."
- (match_operand 0 "aarch64_sve_mul_immediate"))
+   A constraint that matches an immediate operand valid for SVE MUL,
+   SMAX and SMIN operations."
+ (match_operand 0 "aarch64_sve_vsm_immediate")) (define_constraint "vsA" "@internal Index: gcc/config/aarch64/iterators.md =================================================================== --- gcc/config/aarch64/iterators.md 2019-08-14 10:02:44.165119259 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 10:08:03.446774020 +0100 @@ -1285,6 +1285,9 @@ (define_code_iterator SVE_INT_BINARY [pl ;; SVE integer binary division operations. (define_code_iterator SVE_INT_BINARY_SD [div udiv]) +;; SVE integer binary operations that have an immediate form. +(define_code_iterator SVE_INT_BINARY_IMM [mult smax smin umax umin]) + ;; SVE floating-point operations with an unpredicated all-register form. (define_code_iterator SVE_UNPRED_FP_BINARY [plus minus mult]) @@ -1499,7 +1502,12 @@ (define_code_attr sve_fp_op [(plus "fadd (mult "fmul")]) ;; The SVE immediate constraint to use for an rtl code. -(define_code_attr sve_imm_con [(eq "vsc") +(define_code_attr sve_imm_con [(mult "vsm") + (smax "vsm") + (smin "vsm") + (umax "vsb") + (umin "vsb") + (eq "vsc") (ne "vsc") (lt "vsc") (ge "vsc") @@ -1510,6 +1518,13 @@ (define_code_attr sve_imm_con [(eq "vsc" (geu "vsd") (gtu "vsd")]) +;; The prefix letter to use when printing an immediate operand. +(define_code_attr sve_imm_prefix [(mult "") + (smax "") + (smin "") + (umax "D") + (umin "D")]) + ;; ------------------------------------------------------------------- ;; Int Iterators. ;; ------------------------------------------------------------------- Index: gcc/config/aarch64/predicates.md =================================================================== --- gcc/config/aarch64/predicates.md 2019-08-14 10:06:06.331634340 +0100 +++ gcc/config/aarch64/predicates.md 2019-08-14 10:08:03.446774020 +0100 @@ -615,7 +615,15 @@ (define_predicate "aarch64_sve_logical_i (and (match_code "const,const_vector") (match_test "aarch64_sve_bitmask_immediate_p (op)"))) -(define_predicate "aarch64_sve_mul_immediate" +;; Used for SVE UMAX and UMIN. 
+(define_predicate "aarch64_sve_vsb_immediate" + (and (match_code "const_vector") + (match_test "GET_MODE_INNER (GET_MODE (op)) == QImode + ? aarch64_const_vec_all_same_in_range_p (op, -128, 127) + : aarch64_const_vec_all_same_in_range_p (op, 0, 255)"))) + +;; Used for SVE MUL, SMAX and SMIN. +(define_predicate "aarch64_sve_vsm_immediate" (and (match_code "const,const_vector") (match_test "aarch64_const_vec_all_same_in_range_p (op, -128, 127)"))) @@ -668,9 +676,13 @@ (define_predicate "aarch64_sve_rshift_op (ior (match_operand 0 "register_operand") (match_operand 0 "aarch64_simd_rshift_imm"))) -(define_predicate "aarch64_sve_mul_operand" +(define_predicate "aarch64_sve_vsb_operand" + (ior (match_operand 0 "register_operand") + (match_operand 0 "aarch64_sve_vsb_immediate"))) + +(define_predicate "aarch64_sve_vsm_operand" (ior (match_operand 0 "register_operand") - (match_operand 0 "aarch64_sve_mul_immediate"))) + (match_operand 0 "aarch64_sve_vsm_immediate"))) (define_predicate "aarch64_sve_cmp_vsc_operand" (ior (match_operand 0 "register_operand") Index: gcc/config/aarch64/aarch64-sve.md =================================================================== --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:06:06.327634370 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:08:03.446774020 +0100 @@ -64,13 +64,11 @@ ;; ---- [INT] Subtraction ;; ---- [INT] Take address ;; ---- [INT] Absolute difference -;; ---- [INT] Multiplication ;; ---- [INT] Highpart multiplication ;; ---- [INT] Division ;; ---- [INT] Binary logical operations ;; ---- [INT] Binary logical operations (inverted second input) ;; ---- [INT] Shifts -;; ---- [INT] Maximum and minimum ;; ---- [FP] General binary arithmetic corresponding to rtx codes ;; ---- [FP] General binary arithmetic corresponding to unspecs ;; ---- [FP] Addition @@ -1622,19 +1620,77 @@ (define_insn "*one_cmpl3" ;; ------------------------------------------------------------------------- ;; ---- [INT] General binary arithmetic 
corresponding to rtx codes ;; ------------------------------------------------------------------------- -;; Includes merging patterns for: -;; - ADD -;; - AND -;; - EOR +;; Includes: +;; - ADD (merging form only) +;; - AND (merging form only) +;; - EOR (merging form only) ;; - MUL -;; - ORR +;; - ORR (merging form only) ;; - SMAX ;; - SMIN -;; - SUB +;; - SUB (merging form only) ;; - UMAX ;; - UMIN ;; ------------------------------------------------------------------------- +;; Unpredicated integer binary operations that have an immediate form. +(define_expand "3" + [(set (match_operand:SVE_I 0 "register_operand") + (unspec:SVE_I + [(match_dup 3) + (SVE_INT_BINARY_IMM:SVE_I + (match_operand:SVE_I 1 "register_operand") + (match_operand:SVE_I 2 "aarch64_sve__operand"))] + UNSPEC_PRED_X))] + "TARGET_SVE" + { + operands[3] = aarch64_ptrue_reg (mode); + } +) + +;; Integer binary operations that have an immediate form, predicated +;; with a PTRUE. We don't actually need the predicate for the first +;; and third alternatives, but using Upa or X isn't likely to gain much +;; and would make the instruction seem less uniform to the register +;; allocator. +(define_insn_and_split "*3" + [(set (match_operand:SVE_I 0 "register_operand" "=w, w, ?&w, ?&w") + (unspec:SVE_I + [(match_operand: 1 "register_operand" "Upl, Upl, Upl, Upl") + (SVE_INT_BINARY_IMM:SVE_I + (match_operand:SVE_I 2 "register_operand" "%0, 0, w, w") + (match_operand:SVE_I 3 "aarch64_sve__operand" ", w, , w"))] + UNSPEC_PRED_X))] + "TARGET_SVE" + "@ + # + \t%0., %1/m, %0., %3. + # + movprfx\t%0, %2\;\t%0., %1/m, %0., %3." + ; Split the unpredicated form after reload, so that we don't have + ; the unnecessary PTRUE. + "&& reload_completed + && !register_operand (operands[3], mode)" + [(set (match_dup 0) (SVE_INT_BINARY_IMM:SVE_I (match_dup 2) (match_dup 3)))] + "" + [(set_attr "movprfx" "*,*,yes,yes")] +) + +;; Unpredicated binary operations with a constant (post-RA only). 
+;; These are generated by splitting a predicated instruction whose +;; predicate is unused. +(define_insn "*post_ra_3" + [(set (match_operand:SVE_I 0 "register_operand" "=w, ?&w") + (SVE_INT_BINARY_IMM:SVE_I + (match_operand:SVE_I 1 "register_operand" "0, w") + (match_operand:SVE_I 2 "aarch64_sve__immediate")))] + "TARGET_SVE && reload_completed" + "@ + \t%0., %0., #%2 + movprfx\t%0, %1\;\t%0., %0., #%2" + [(set_attr "movprfx" "*,yes")] +) + ;; Predicated integer operations with merging. (define_expand "cond_" [(set (match_operand:SVE_I 0 "register_operand") @@ -1866,68 +1922,6 @@ (define_insn "aarch64_abd_3" ) ;; ------------------------------------------------------------------------- -;; ---- [INT] Multiplication -;; ------------------------------------------------------------------------- -;; Includes: -;; - MUL -;; ------------------------------------------------------------------------- - -;; Unpredicated multiplication. -(define_expand "mul3" - [(set (match_operand:SVE_I 0 "register_operand") - (unspec:SVE_I - [(match_dup 3) - (mult:SVE_I - (match_operand:SVE_I 1 "register_operand") - (match_operand:SVE_I 2 "aarch64_sve_mul_operand"))] - UNSPEC_PRED_X))] - "TARGET_SVE" - { - operands[3] = aarch64_ptrue_reg (mode); - } -) - -;; Multiplication predicated with a PTRUE. We don't actually need the -;; predicate for the first alternative, but using Upa or X isn't likely -;; to gain much and would make the instruction seem less uniform to the -;; register allocator. -(define_insn_and_split "*mul3" - [(set (match_operand:SVE_I 0 "register_operand" "=w, w, ?&w") - (unspec:SVE_I - [(match_operand: 1 "register_operand" "Upl, Upl, Upl") - (mult:SVE_I - (match_operand:SVE_I 2 "register_operand" "%0, 0, w") - (match_operand:SVE_I 3 "aarch64_sve_mul_operand" "vsm, w, w"))] - UNSPEC_PRED_X))] - "TARGET_SVE" - "@ - # - mul\t%0., %1/m, %0., %3. - movprfx\t%0, %2\;mul\t%0., %1/m, %0., %3." 
- ; Split the unpredicated form after reload, so that we don't have - ; the unnecessary PTRUE. - "&& reload_completed - && !register_operand (operands[3], mode)" - [(set (match_dup 0) (mult:SVE_I (match_dup 2) (match_dup 3)))] - "" - [(set_attr "movprfx" "*,*,yes")] -) - -;; Unpredicated multiplications by a constant (post-RA only). -;; These are generated by splitting a predicated instruction whose -;; predicate is unused. -(define_insn "*post_ra_mul3" - [(set (match_operand:SVE_I 0 "register_operand" "=w") - (mult:SVE_I - (match_operand:SVE_I 1 "register_operand" "0") - (match_operand:SVE_I 2 "aarch64_sve_mul_immediate")))] - "TARGET_SVE && reload_completed" - "mul\t%0., %0., #%2" -) - -;; Merging forms are handled through SVE_INT_BINARY. - -;; ------------------------------------------------------------------------- ;; ---- [INT] Highpart multiplication ;; ------------------------------------------------------------------------- ;; Includes: @@ -1998,7 +1992,7 @@ (define_insn "*3" [(match_operand: 1 "register_operand" "Upl, Upl, Upl") (SVE_INT_BINARY_SD:SVE_SDI (match_operand:SVE_SDI 2 "register_operand" "0, w, w") - (match_operand:SVE_SDI 3 "aarch64_sve_mul_operand" "w, 0, w"))] + (match_operand:SVE_SDI 3 "register_operand" "w, 0, w"))] UNSPEC_PRED_X))] "TARGET_SVE" "@ @@ -2219,47 +2213,6 @@ (define_insn "*post_ra_v3" ) ;; ------------------------------------------------------------------------- -;; ---- [INT] Maximum and minimum -;; ------------------------------------------------------------------------- -;; Includes: -;; - SMAX -;; - SMIN -;; - UMAX -;; - UMIN -;; ------------------------------------------------------------------------- - -;; Unpredicated integer MAX/MIN. 
-(define_expand "3" - [(set (match_operand:SVE_I 0 "register_operand") - (unspec:SVE_I - [(match_dup 3) - (MAXMIN:SVE_I (match_operand:SVE_I 1 "register_operand") - (match_operand:SVE_I 2 "register_operand"))] - UNSPEC_PRED_X))] - "TARGET_SVE" - { - operands[3] = aarch64_ptrue_reg (mode); - } -) - -;; Integer MAX/MIN predicated with a PTRUE. -(define_insn "*3" - [(set (match_operand:SVE_I 0 "register_operand" "=w, ?&w") - (unspec:SVE_I - [(match_operand: 1 "register_operand" "Upl, Upl") - (MAXMIN:SVE_I (match_operand:SVE_I 2 "register_operand" "%0, w") - (match_operand:SVE_I 3 "register_operand" "w, w"))] - UNSPEC_PRED_X))] - "TARGET_SVE" - "@ - \t%0., %1/m, %0., %3. - movprfx\t%0, %2\;\t%0., %1/m, %0., %3." - [(set_attr "movprfx" "*,yes")] -) - -;; Merging forms are handled through SVE_INT_BINARY. - -;; ------------------------------------------------------------------------- ;; ---- [FP] General binary arithmetic corresponding to rtx codes ;; ------------------------------------------------------------------------- ;; Includes post-RA forms of: Index: gcc/testsuite/gcc.target/aarch64/sve/smax_1.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/smax_1.c 2019-08-14 10:08:03.446774020 +0100 @@ -0,0 +1,71 @@ +/* { dg-do assemble { target aarch64_asm_sve_ok } } */ +/* { dg-options "-O3 --save-temps" } */ + +#include + +#define DO_REGREG_OPS(TYPE) \ +void varith_##TYPE##_reg (TYPE *dst, TYPE *src, int count) \ +{ \ + for (int i = 0; i < count; ++i) \ + dst[i] = dst[i] > src[i] ? dst[i] : src[i]; \ +} + +#define DO_IMMEDIATE_OPS(VALUE, TYPE, NAME) \ +void varithimm_##NAME##_##TYPE (TYPE *dst, int count) \ +{ \ + for (int i = 0; i < count; ++i) \ + dst[i] = dst[i] > (TYPE) VALUE ? 
dst[i] : (TYPE) VALUE; \ +} + +#define DO_ARITH_OPS(TYPE) \ + DO_REGREG_OPS (TYPE); \ + DO_IMMEDIATE_OPS (0, TYPE, 0); \ + DO_IMMEDIATE_OPS (86, TYPE, 86); \ + DO_IMMEDIATE_OPS (109, TYPE, 109); \ + DO_IMMEDIATE_OPS (141, TYPE, 141); \ + DO_IMMEDIATE_OPS (-1, TYPE, minus1); \ + DO_IMMEDIATE_OPS (-110, TYPE, minus110); \ + DO_IMMEDIATE_OPS (-141, TYPE, minus141); + +DO_ARITH_OPS (int8_t) +DO_ARITH_OPS (int16_t) +DO_ARITH_OPS (int32_t) +DO_ARITH_OPS (int64_t) + +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, z[0-9]+\.b, #0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, z[0-9]+\.b, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, z[0-9]+\.b, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, z[0-9]+\.b, #115\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmax\tz[0-9]+\.b, z[0-9]+\.b, #141\n} } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, z[0-9]+\.b, #-1\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, z[0-9]+\.b, #-110\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, z[0-9]+\.b, #-115\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmax\tz[0-9]+\.b, z[0-9]+\.b, #-141\n} } } */ + +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, z[0-9]+\.h, #0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, z[0-9]+\.h, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, z[0-9]+\.h, #109\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmax\tz[0-9]+\.h, z[0-9]+\.h, #141\n} } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, z[0-9]+\.h, #-1\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, z[0-9]+\.h, #-110\n} 1 } } */ +/* { dg-final { 
scan-assembler-not {\tsmax\tz[0-9]+\.h, z[0-9]+\.h, #-141\n} } } */ + +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, z[0-9]+\.s, #0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, z[0-9]+\.s, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, z[0-9]+\.s, #109\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmax\tz[0-9]+\.s, z[0-9]+\.s, #141\n} } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, z[0-9]+\.s, #-1\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, z[0-9]+\.s, #-110\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmax\tz[0-9]+\.s, z[0-9]+\.s, #-141\n} } } */ + +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.d, z[0-9]+\.d, #0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.d, z[0-9]+\.d, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.d, z[0-9]+\.d, #109\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmax\tz[0-9]+\.d, z[0-9]+\.d, #141\n} } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.d, z[0-9]+\.d, #-1\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.d, z[0-9]+\.d, #-110\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmax\tz[0-9]+\.d, z[0-9]+\.d, #-141\n} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/smin_1.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/smin_1.c 2019-08-14 10:08:03.446774020 +0100 @@ -0,0 +1,71 @@ +/* { dg-do assemble { target aarch64_asm_sve_ok } } */ +/* { dg-options "-O3 --save-temps" } */ + +#include + +#define DO_REGREG_OPS(TYPE) \ +void varith_##TYPE##_reg (TYPE *dst, TYPE *src, int count) \ +{ \ + for (int i = 0; i < 
count; ++i) \ + dst[i] = dst[i] < src[i] ? dst[i] : src[i]; \ +} + +#define DO_IMMEDIATE_OPS(VALUE, TYPE, NAME) \ +void varithimm_##NAME##_##TYPE (TYPE *dst, int count) \ +{ \ + for (int i = 0; i < count; ++i) \ + dst[i] = dst[i] < (TYPE) VALUE ? dst[i] : (TYPE) VALUE; \ +} + +#define DO_ARITH_OPS(TYPE) \ + DO_REGREG_OPS (TYPE); \ + DO_IMMEDIATE_OPS (0, TYPE, 0); \ + DO_IMMEDIATE_OPS (86, TYPE, 86); \ + DO_IMMEDIATE_OPS (109, TYPE, 109); \ + DO_IMMEDIATE_OPS (141, TYPE, 141); \ + DO_IMMEDIATE_OPS (-1, TYPE, minus1); \ + DO_IMMEDIATE_OPS (-110, TYPE, minus110); \ + DO_IMMEDIATE_OPS (-141, TYPE, minus141); + +DO_ARITH_OPS (int8_t) +DO_ARITH_OPS (int16_t) +DO_ARITH_OPS (int32_t) +DO_ARITH_OPS (int64_t) + +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, z[0-9]+\.b, #0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, z[0-9]+\.b, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, z[0-9]+\.b, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, z[0-9]+\.b, #115\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmin\tz[0-9]+\.b, z[0-9]+\.b, #141\n} } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, z[0-9]+\.b, #-1\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, z[0-9]+\.b, #-110\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, z[0-9]+\.b, #-115\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmin\tz[0-9]+\.b, z[0-9]+\.b, #-141\n} } } */ + +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.h, z[0-9]+\.h, #0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.h, z[0-9]+\.h, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.h, z[0-9]+\.h, #109\n} 1 } } */ +/* { dg-final { 
scan-assembler-not {\tsmin\tz[0-9]+\.h, z[0-9]+\.h, #141\n} } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.h, z[0-9]+\.h, #-1\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.h, z[0-9]+\.h, #-110\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmin\tz[0-9]+\.h, z[0-9]+\.h, #-141\n} } } */ + +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.s, z[0-9]+\.s, #0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.s, z[0-9]+\.s, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.s, z[0-9]+\.s, #109\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmin\tz[0-9]+\.s, z[0-9]+\.s, #141\n} } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.s, z[0-9]+\.s, #-1\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.s, z[0-9]+\.s, #-110\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmin\tz[0-9]+\.s, z[0-9]+\.s, #-141\n} } } */ + +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.d, z[0-9]+\.d, #0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.d, z[0-9]+\.d, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.d, z[0-9]+\.d, #109\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmin\tz[0-9]+\.d, z[0-9]+\.d, #141\n} } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.d, z[0-9]+\.d, #-1\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.d, z[0-9]+\.d, #-110\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tsmin\tz[0-9]+\.d, z[0-9]+\.d, #-141\n} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/umax_1.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/umax_1.c 2019-08-14 
10:08:03.446774020 +0100 @@ -0,0 +1,65 @@ +/* { dg-do assemble { target aarch64_asm_sve_ok } } */ +/* { dg-options "-O3 --save-temps" } */ + +#include + +#define DO_REGREG_OPS(TYPE) \ +void varith_##TYPE##_reg (TYPE *dst, TYPE *src, int count) \ +{ \ + for (int i = 0; i < count; ++i) \ + dst[i] = dst[i] > src[i] ? dst[i] : src[i]; \ +} + +#define DO_IMMEDIATE_OPS(VALUE, TYPE) \ +void varithimm_##VALUE##_##TYPE (TYPE *dst, int count) \ +{ \ + for (int i = 0; i < count; ++i) \ + dst[i] = dst[i] > (TYPE) VALUE ? dst[i] : (TYPE) VALUE; \ +} + +#define DO_ARITH_OPS(TYPE) \ + DO_REGREG_OPS (TYPE); \ + DO_IMMEDIATE_OPS (2, TYPE); \ + DO_IMMEDIATE_OPS (86, TYPE); \ + DO_IMMEDIATE_OPS (109, TYPE); \ + DO_IMMEDIATE_OPS (141, TYPE); \ + DO_IMMEDIATE_OPS (229, TYPE); \ + DO_IMMEDIATE_OPS (255, TYPE); \ + DO_IMMEDIATE_OPS (256, TYPE); + +DO_ARITH_OPS (uint8_t) +DO_ARITH_OPS (uint16_t) +DO_ARITH_OPS (uint32_t) +DO_ARITH_OPS (uint64_t) + +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.b, z[0-9]+\.b, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.b, z[0-9]+\.b, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.b, z[0-9]+\.b, #141\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.b, z[0-9]+\.b, #229\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tumax\tz[0-9]+\.b, z[0-9]+\.b, #255\n} } } */ +/* { dg-final { scan-assembler-not {\tumax\tz[0-9]+\.b, z[0-9]+\.b, #256\n} } } */ + +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.h, z[0-9]+\.h, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.h, z[0-9]+\.h, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.h, z[0-9]+\.h, #141\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.h, z[0-9]+\.h, 
#229\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.h, z[0-9]+\.h, #255\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tumax\tz[0-9]+\.h, z[0-9]+\.h, #256\n} } } */ + +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.s, z[0-9]+\.s, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.s, z[0-9]+\.s, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.s, z[0-9]+\.s, #141\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.s, z[0-9]+\.s, #229\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.s, z[0-9]+\.s, #255\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tumax\tz[0-9]+\.s, z[0-9]+\.s, #256\n} } } */ + +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.d, z[0-9]+\.d, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.d, z[0-9]+\.d, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.d, z[0-9]+\.d, #141\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.d, z[0-9]+\.d, #229\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.d, z[0-9]+\.d, #255\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tumax\tz[0-9]+\.d, z[0-9]+\.d, #256\n} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/umin_1.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/umin_1.c 2019-08-14 10:08:03.446774020 +0100 @@ -0,0 +1,65 @@ +/* { dg-do assemble { target aarch64_asm_sve_ok } } */ +/* { dg-options "-O3 --save-temps" } */ + +#include + +#define DO_REGREG_OPS(TYPE) \ +void varith_##TYPE##_reg (TYPE *dst, TYPE *src, int count) \ +{ \ + for (int i = 0; i < count; ++i) \ + dst[i] = dst[i] < src[i] ? 
dst[i] : src[i]; \ +} + +#define DO_IMMEDIATE_OPS(VALUE, TYPE) \ +void varithimm_##VALUE##_##TYPE (TYPE *dst, int count) \ +{ \ + for (int i = 0; i < count; ++i) \ + dst[i] = dst[i] < (TYPE) VALUE ? dst[i] : (TYPE) VALUE; \ +} + +#define DO_ARITH_OPS(TYPE) \ + DO_REGREG_OPS (TYPE); \ + DO_IMMEDIATE_OPS (2, TYPE); \ + DO_IMMEDIATE_OPS (86, TYPE); \ + DO_IMMEDIATE_OPS (109, TYPE); \ + DO_IMMEDIATE_OPS (141, TYPE); \ + DO_IMMEDIATE_OPS (229, TYPE); \ + DO_IMMEDIATE_OPS (255, TYPE); \ + DO_IMMEDIATE_OPS (256, TYPE); + +DO_ARITH_OPS (uint8_t) +DO_ARITH_OPS (uint16_t) +DO_ARITH_OPS (uint32_t) +DO_ARITH_OPS (uint64_t) + +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.b, z[0-9]+\.b, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.b, z[0-9]+\.b, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.b, z[0-9]+\.b, #141\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.b, z[0-9]+\.b, #229\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tumin\tz[0-9]+\.b, z[0-9]+\.b, #255\n} } } */ +/* { dg-final { scan-assembler-not {\tumin\tz[0-9]+\.b, z[0-9]+\.b, #256\n} } } */ + +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.h, z[0-9]+\.h, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.h, z[0-9]+\.h, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.h, z[0-9]+\.h, #141\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.h, z[0-9]+\.h, #229\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.h, z[0-9]+\.h, #255\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tumin\tz[0-9]+\.h, z[0-9]+\.h, #256\n} } } */ + +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */ +/* { dg-final { 
scan-assembler-times {\tumin\tz[0-9]+\.s, z[0-9]+\.s, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.s, z[0-9]+\.s, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.s, z[0-9]+\.s, #141\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.s, z[0-9]+\.s, #229\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.s, z[0-9]+\.s, #255\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tumin\tz[0-9]+\.s, z[0-9]+\.s, #256\n} } } */ + +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.d, z[0-9]+\.d, #86\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.d, z[0-9]+\.d, #109\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.d, z[0-9]+\.d, #141\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.d, z[0-9]+\.d, #229\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.d, z[0-9]+\.d, #255\n} 1 } } */ +/* { dg-final { scan-assembler-not {\tumin\tz[0-9]+\.d, z[0-9]+\.d, #256\n} } } */
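
As a reading aid for the ranges used above, here is a minimal sketch in plain C of the two immediate range checks that the new predicates perform. It is not part of the patch: the function names and the elem_bits parameter are hypothetical, and the sketch works on a single scalar value rather than on an rtx constant vector. The assumption that byte-element constants reach the check in sign-extended form is inferred from the -128..127 range the "vsb" predicate uses for QImode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not in the patch: the range accepted by the
   "vsm" constraint (MUL, SMAX and SMIN), a signed 8-bit immediate.  */
static bool
vsm_immediate_ok (int64_t value)
{
  return value >= -128 && value <= 127;
}

/* Hypothetical helper, not in the patch: the range accepted by the
   "vsb" constraint (UMAX and UMIN).  ELEM_BITS is the element width.
   For byte elements the predicate checks -128..127, which is assumed
   here to cover every byte value in sign-extended form; wider elements
   use 0..255.  */
static bool
vsb_immediate_ok (int64_t value, unsigned elem_bits)
{
  if (elem_bits == 8)
    return value >= -128 && value <= 127;
  return value >= 0 && value <= 255;
}

/* Check a few of the constants used in the new tests.  Returns 1 if
   every check passes (the asserts abort otherwise).  */
int
sve_imm_examples_ok (void)
{
  assert (vsm_immediate_ok (109));        /* smax ..., #109 is expected */
  assert (vsm_immediate_ok (-110));       /* smax ..., #-110 is expected */
  assert (!vsm_immediate_ok (141));       /* no immediate form for .h/.s/.d */
  assert (!vsm_immediate_ok (-141));
  assert (vsb_immediate_ok (255, 16));    /* umax z.h, z.h, #255 is expected */
  assert (!vsb_immediate_ok (256, 16));   /* #256 must not appear */
  assert (vsb_immediate_ok (-115, 8));    /* (uint8_t) 141, sign-extended */
  return 1;
}
```

Calling sve_imm_examples_ok () from a test driver exercises the same boundaries as the scan-assembler directives: 141 and -141 fall outside the signed range for the wider SMAX/SMIN element sizes, and 256 falls outside the unsigned range for UMAX/UMIN.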