From patchwork Mon Jun 16 14:26:24 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyrylo Tkachov X-Patchwork-Id: 360148 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 702E314007B for ; Tue, 17 Jun 2014 00:26:48 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; q=dns; s=default; b=OuGPqoTP/9nTdqkfdziY9AbmhBkPju2QJUidKeEMaJw Ho0NWtsHaFEDcGxZMzzmfpW93fdjaM9fdhU266hfq8Be0lrNrSVAuqQge3sTxg7E ErrYNA59/4v95ApLirmkxFsETvTVdLdlfMrO9IhLVT9PRoIZW9bLnSS2MbhHo5o0 = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; s=default; bh=IOIynPxgqCpdXcFfd8y1XwxrAJw=; b=ejXXeXQK2IaqHIJUi bZKlxjwmrSBRWVHy84KBEKGGZ3fvflV/lj8zYEjjajXL96C9NSBJgGl4HxxPgK7l IJ8lvJfCxKDD2kQhDPcVXR9D5gXedciCBf7XXxHd5S9tj5pzNsUr1c/NfOHLC3lI /F4feb3F8hGKyP/ntgsMAXkBKY= Received: (qmail 27674 invoked by alias); 16 Jun 2014 14:26:38 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 27647 invoked by uid 89); 16 Jun 2014 14:26:36 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: service87.mimecast.com Received: from service87.mimecast.com (HELO service87.mimecast.com) (91.220.42.44) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 16 Jun 2014 14:26:30 +0000 Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Mon, 16 Jun 2014 15:26:26 +0100 Received: from [10.1.208.24] ([10.1.255.212]) by cam-owa2.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Mon, 16 Jun 2014 15:26:18 +0100 Message-ID: <539EFE90.7030809@arm.com> Date: Mon, 16 Jun 2014 15:26:24 +0100 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 MIME-Version: 1.0 To: GCC Patches CC: Marcus Shawcroft , Richard Earnshaw Subject: [PATCH][AArch64] Fix some saturating math NEON intrinsics types X-MC-Unique: 114061615262600801 X-IsSubscribed: yes Hi all, I noticed that a few saturating math intrinsics in arm_neon.h for aarch64 have the wrong types, i.e. not what's mandated by the ACLE spec. This patch fixes that by adjusting the types of the builtin functions that those intrinsics map to (and in the process cleaning up the VCON iterator) and adding tests for the affected intrinsics. I realise it's quite big, but the changes are mostly uniform. Bootstrapped and tested aarch64-none-linux-gnu. Ok for trunk? Thanks, Kyrill 2014-06-16 Kyrylo Tkachov * config/aarch64/iterators.md (VCOND): Handle SI and HI modes. Update comments. (VCONQ): Make comment more helpful. (VCON): Delete. * config/aarch64/aarch64-simd.md (aarch64_sqdmulh_lane): Use VCOND for operands 2. Update lane checking and flipping logic. (aarch64_sqrdmulh_lane): Likewise. (aarch64_sqdmulh_lane_internal): Likewise. (aarch64_sqdmull2): Remove VCON, use VQ_HSI mode iterator. (aarch64_sqdmll_lane_internal, VD_HSI): Change mode attribute of operand 3 to VCOND. (aarch64_sqdmll_lane_internal, SD_HSI): Likewise. (aarch64_sqdmll2_lane_internal): Likewise. (aarch64_sqdmull_lane_internal, VD_HSI): Likewise. (aarch64_sqdmull_lane_internal, SD_HSI): Likewise. (aarch64_sqdmull2_lane_internal): Likewise. (aarch64_sqdmll_laneq_internal, VD_HSI: New define_insn. (aarch64_sqdmll_laneq_internal, SD_HSI): Likewise. (aarch64_sqdmll2_laneq_internal): Likewise. (aarch64_sqdmull_laneq_internal, VD_HSI): Likewise. (aarch64_sqdmull_laneq_internal, SD_HSI): Likewise. (aarch64_sqdmull2_laneq_internal): Likewise. (aarch64_sqdmlal_lane): Change mode attribute of penultimate operand to VCOND. Update lane flipping and bounds checking logic. (aarch64_sqdmlal2_lane): Likewise. (aarch64_sqdmlsl_lane): Likewise. (aarch64_sqdmull_lane): Likewise. (aarch64_sqdmull2_lane): Likewise. (aarch64_sqdmlal_laneq): Replace VCON usage with VCONQ. Emit aarch64_sqdmlal_laneq_internal insn. (aarch64_sqdmlal2_laneq): Emit aarch64_sqdmlal2_laneq_internal insn. Replace VCON with VCONQ. (aarch64_sqdmlsl2_lane): Replace VCON with VCONQ. (aarch64_sqdmlsl2_laneq): Likewise. (aarch64_sqdmull_laneq): Emit aarch64_sqdmull_laneq_internal insn. Replace VCON with VCONQ. (aarch64_sqdmull2_laneq): Emit aarch64_sqdmull2_laneq_internal insn. (aarch64_sqdmlsl_laneq): Replace VCON usage with VCONQ. * config/aarch64/arm_neon.h (vqdmlal_high_lane_s16): Change type of 3rd argument to int16x4_t. (vqdmlalh_lane_s16): Likewise. (vqdmlslh_lane_s16): Likewise. (vqdmull_high_lane_s16): Likewise. (vqdmullh_lane_s16): Change type of 2nd argument to int16x4_t. (vqdmlal_lane_s16): Don't create temporary int16x8_t value. (vqdmlsl_lane_s16): Likewise. (vqdmull_lane_s16): Don't create temporary int16x8_t value. (vqdmlal_high_lane_s32): Change type 3rd argument to int32x2_t. (vqdmlals_lane_s32): Likewise. (vqdmlsls_lane_s32): Likewise. (vqdmull_high_lane_s32): Change type 2nd argument to int32x2_t. (vqdmulls_lane_s32): Likewise. (vqdmlal_lane_s32): Don't create temporary int32x4_t value. (vqdmlsl_lane_s32): Likewise. (vqdmull_lane_s32): Don't create temporary int32x4_t value. (vqdmulhh_lane_s16): Change type of second argument to int16x4_t. (vqrdmulhh_lane_s16): Likewise. (vqdmlsl_high_lane_s16): Likewise. (vqdmulhs_lane_s32): Change type of second argument to int32x2_t. (vqdmlsl_high_lane_s32): Likewise. (vqrdmulhs_lane_s32): Likewise. 2014-06-16 Kyrylo Tkachov * gcc.target/aarch64/simd/vqdmulhh_lane_s16.c: New test. * gcc.target/aarch64/simd/vqdmulhs_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqrdmulhh_lane_s16.c: Likewise. * gcc.target/aarch64/simd/vqrdmulhs_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlal_high_lane_s16.c: New test. * gcc.target/aarch64/simd/vqdmlal_high_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlal_high_laneq_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmlal_high_laneq_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlal_lane_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmlal_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlal_laneq_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmlal_laneq_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlalh_lane_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmlals_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlsl_high_lane_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmlsl_high_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlsl_high_laneq_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmlsl_high_laneq_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlsl_lane_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmlsl_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlsl_laneq_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmlslh_lane_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmlsls_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmulh_laneq_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmulh_laneq_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmulhq_laneq_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmulhq_laneq_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmull_high_lane_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmull_high_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmull_high_laneq_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmull_high_laneq_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmull_lane_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmull_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmull_laneq_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmull_laneq_s32.c: Likewise. * gcc.target/aarch64/simd/vqdmullh_lane_s16.c: Likewise. * gcc.target/aarch64/simd/vqdmulls_lane_s32.c: Likewise. * gcc.target/aarch64/simd/vqrdmulh_laneq_s16.c: Likewise. * gcc.target/aarch64/simd/vqrdmulh_laneq_s32.c: Likewise. * gcc.target/aarch64/simd/vqrdmulhq_laneq_s16.c: Likewise. * gcc.target/aarch64/simd/vqrdmulhq_laneq_s32.c: Likewise. * gcc.target/aarch64/vector_intrinsics.c: Simplify arm_neon.h include. (test_vqdmlal_high_lane_s16): Fix parameter type. (test_vqdmlal_high_lane_s32): Likewise. (test_vqdmull_high_lane_s16): Likewise. (test_vqdmull_high_lane_s32): Likewise. (test_vqdmlsl_high_lane_s32): Likewise. (test_vqdmlsl_high_lane_s16): Likewise. * gcc.target/aarch64/scalar_intrinsics.c (test_vqdmlalh_lane_s16): Fix argument type. (test_vqdmlals_lane_s32): Likewise. (test_vqdmlslh_lane_s16): Likewise. (test_vqdmlsls_lane_s32): Likewise. (test_vqdmulhh_lane_s16): Likewise. (test_vqdmulhs_lane_s32): Likewise. (test_vqdmullh_lane_s16): Likewise. (test_vqdmulls_lane_s32): Likewise. (test_vqrdmulhh_lane_s16): Likewise. (test_vqrdmulhs_lane_s32): Likewise. diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 42bfd3e..7fd7094 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -2752,12 +2752,12 @@ (define_expand "aarch64_sqdmulh_lane" [(match_operand:SD_HSI 0 "register_operand" "") (match_operand:SD_HSI 1 "register_operand" "") - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") (match_operand:SI 3 "immediate_operand" "")] "TARGET_SIMD" { - aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); - operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); + aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); + operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); emit_insn (gen_aarch64_sqdmulh_lane_internal (operands[0], operands[1], operands[2], @@ -2769,12 +2769,12 @@ (define_expand "aarch64_sqrdmulh_lane" [(match_operand:SD_HSI 0 "register_operand" "") (match_operand:SD_HSI 1 "register_operand" "") - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") (match_operand:SI 3 "immediate_operand" "")] "TARGET_SIMD" { - aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); - operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); + aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); + operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); emit_insn (gen_aarch64_sqrdmulh_lane_internal (operands[0], operands[1], operands[2], @@ -2788,12 +2788,12 @@ (unspec:SD_HSI [(match_operand:SD_HSI 1 "register_operand" "w") (vec_select: - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") (parallel [(match_operand:SI 3 "immediate_operand" "i")]))] VQDMULH))] "TARGET_SIMD" "* - operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); + operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); return \"sqdmulh\\t%0, %1, %2.[%3]\";" [(set_attr "type" "neon_sat_mul__scalar")] ) @@ -2829,7 +2829,31 @@ (sign_extend: (vec_duplicate:VD_HSI (vec_select: - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") + (parallel [(match_operand:SI 4 "immediate_operand" "i")]))) + )) + (const_int 1))))] + "TARGET_SIMD" + { + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + return + "sqdmll\\t%0, %2, %3.[%4]"; + } + [(set_attr "type" "neon_sat_mla__scalar_long")] +) + +(define_insn "aarch64_sqdmll_laneq_internal" + [(set (match_operand: 0 "register_operand" "=w") + (SBINQOPS: + (match_operand: 1 "register_operand" "0") + (ss_ashift: + (mult: + (sign_extend: + (match_operand:VD_HSI 2 "register_operand" "w")) + (sign_extend: + (vec_duplicate:VD_HSI + (vec_select: + (match_operand: 3 "register_operand" "") (parallel [(match_operand:SI 4 "immediate_operand" "i")]))) )) (const_int 1))))] @@ -2852,7 +2876,30 @@ (match_operand:SD_HSI 2 "register_operand" "w")) (sign_extend: (vec_select: - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") + (parallel [(match_operand:SI 4 "immediate_operand" "i")]))) + ) + (const_int 1))))] + "TARGET_SIMD" + { + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + return + "sqdmll\\t%0, %2, %3.[%4]"; + } + [(set_attr "type" "neon_sat_mla__scalar_long")] +) + +(define_insn "aarch64_sqdmll_laneq_internal" + [(set (match_operand: 0 "register_operand" "=w") + (SBINQOPS: + (match_operand: 1 "register_operand" "0") + (ss_ashift: + (mult: + (sign_extend: + (match_operand:SD_HSI 2 "register_operand" "w")) + (sign_extend: + (vec_select: + (match_operand: 3 "register_operand" "") (parallel [(match_operand:SI 4 "immediate_operand" "i")]))) ) (const_int 1))))] @@ -2869,12 +2916,12 @@ [(match_operand: 0 "register_operand" "=w") (match_operand: 1 "register_operand" "0") (match_operand:VSD_HSI 2 "register_operand" "w") - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") (match_operand:SI 4 "immediate_operand" "i")] "TARGET_SIMD" { - aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode) / 2); - operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); emit_insn (gen_aarch64_sqdmlal_lane_internal (operands[0], operands[1], operands[2], operands[3], operands[4])); @@ -2885,13 +2932,13 @@ [(match_operand: 0 "register_operand" "=w") (match_operand: 1 "register_operand" "0") (match_operand:VSD_HSI 2 "register_operand" "w") - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") (match_operand:SI 4 "immediate_operand" "i")] "TARGET_SIMD" { - aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); - operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); - emit_insn (gen_aarch64_sqdmlal_lane_internal (operands[0], operands[1], + aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + emit_insn (gen_aarch64_sqdmlal_laneq_internal (operands[0], operands[1], operands[2], operands[3], operands[4])); DONE; @@ -2901,12 +2948,12 @@ [(match_operand: 0 "register_operand" "=w") (match_operand: 1 "register_operand" "0") (match_operand:VSD_HSI 2 "register_operand" "w") - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") (match_operand:SI 4 "immediate_operand" "i")] "TARGET_SIMD" { - aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode) / 2); - operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); emit_insn (gen_aarch64_sqdmlsl_lane_internal (operands[0], operands[1], operands[2], operands[3], operands[4])); @@ -2917,13 +2964,13 @@ [(match_operand: 0 "register_operand" "=w") (match_operand: 1 "register_operand" "0") (match_operand:VSD_HSI 2 "register_operand" "w") - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") (match_operand:SI 4 "immediate_operand" "i")] "TARGET_SIMD" { - aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); - operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); - emit_insn (gen_aarch64_sqdmlsl_lane_internal (operands[0], operands[1], + aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + emit_insn (gen_aarch64_sqdmlsl_laneq_internal (operands[0], operands[1], operands[2], operands[3], operands[4])); DONE; @@ -3011,7 +3058,33 @@ (sign_extend: (vec_duplicate: (vec_select: - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") + (parallel [(match_operand:SI 4 "immediate_operand" "i")]) + )))) + (const_int 1))))] + "TARGET_SIMD" + { + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + return + "sqdmll2\\t%0, %2, %3.[%4]"; + } + [(set_attr "type" "neon_sat_mla__scalar_long")] +) + +(define_insn "aarch64_sqdmll2_laneq_internal" + [(set (match_operand: 0 "register_operand" "=w") + (SBINQOPS: + (match_operand: 1 "register_operand" "0") + (ss_ashift: + (mult: + (sign_extend: + (vec_select: + (match_operand:VQ_HSI 2 "register_operand" "w") + (match_operand:VQ_HSI 5 "vect_par_cnst_hi_half" ""))) + (sign_extend: + (vec_duplicate: + (vec_select: + (match_operand: 3 "register_operand" "") (parallel [(match_operand:SI 4 "immediate_operand" "i")]) )))) (const_int 1))))] @@ -3028,13 +3101,13 @@ [(match_operand: 0 "register_operand" "=w") (match_operand: 1 "register_operand" "w") (match_operand:VQ_HSI 2 "register_operand" "w") - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") (match_operand:SI 4 "immediate_operand" "i")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode) / 2); - operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); emit_insn (gen_aarch64_sqdmlal2_lane_internal (operands[0], operands[1], operands[2], operands[3], operands[4], p)); @@ -3045,14 +3118,14 @@ [(match_operand: 0 "register_operand" "=w") (match_operand: 1 "register_operand" "w") (match_operand:VQ_HSI 2 "register_operand" "w") - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") (match_operand:SI 4 "immediate_operand" "i")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); - operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); - emit_insn (gen_aarch64_sqdmlal2_lane_internal (operands[0], operands[1], + aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + emit_insn (gen_aarch64_sqdmlal2_laneq_internal (operands[0], operands[1], operands[2], operands[3], operands[4], p)); DONE; @@ -3062,13 +3135,13 @@ [(match_operand: 0 "register_operand" "=w") (match_operand: 1 "register_operand" "w") (match_operand:VQ_HSI 2 "register_operand" "w") - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") (match_operand:SI 4 "immediate_operand" "i")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode) / 2); - operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); emit_insn (gen_aarch64_sqdmlsl2_lane_internal (operands[0], operands[1], operands[2], operands[3], operands[4], p)); @@ -3079,14 +3152,14 @@ [(match_operand: 0 "register_operand" "=w") (match_operand: 1 "register_operand" "w") (match_operand:VQ_HSI 2 "register_operand" "w") - (match_operand: 3 "register_operand" "") + (match_operand: 3 "register_operand" "") (match_operand:SI 4 "immediate_operand" "i")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); - operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); - emit_insn (gen_aarch64_sqdmlsl2_lane_internal (operands[0], operands[1], + aarch64_simd_lane_bounds (operands[4], 0, GET_MODE_NUNITS (mode)); + operands[4] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[4]))); + emit_insn (gen_aarch64_sqdmlsl2_laneq_internal (operands[0], operands[1], operands[2], operands[3], operands[4], p)); DONE; @@ -3166,7 +3239,28 @@ (sign_extend: (vec_duplicate:VD_HSI (vec_select: - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") + (parallel [(match_operand:SI 3 "immediate_operand" "i")]))) + )) + (const_int 1)))] + "TARGET_SIMD" + { + operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); + return "sqdmull\\t%0, %1, %2.[%3]"; + } + [(set_attr "type" "neon_sat_mul__scalar_long")] +) + +(define_insn "aarch64_sqdmull_laneq_internal" + [(set (match_operand: 0 "register_operand" "=w") + (ss_ashift: + (mult: + (sign_extend: + (match_operand:VD_HSI 1 "register_operand" "w")) + (sign_extend: + (vec_duplicate:VD_HSI + (vec_select: + (match_operand: 2 "register_operand" "") (parallel [(match_operand:SI 3 "immediate_operand" "i")]))) )) (const_int 1)))] @@ -3186,7 +3280,27 @@ (match_operand:SD_HSI 1 "register_operand" "w")) (sign_extend: (vec_select: - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") + (parallel [(match_operand:SI 3 "immediate_operand" "i")])) + )) + (const_int 1)))] + "TARGET_SIMD" + { + operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); + return "sqdmull\\t%0, %1, %2.[%3]"; + } + [(set_attr "type" "neon_sat_mul__scalar_long")] +) + +(define_insn "aarch64_sqdmull_laneq_internal" + [(set (match_operand: 0 "register_operand" "=w") + (ss_ashift: + (mult: + (sign_extend: + (match_operand:SD_HSI 1 "register_operand" "w")) + (sign_extend: + (vec_select: + (match_operand: 2 "register_operand" "") (parallel [(match_operand:SI 3 "immediate_operand" "i")])) )) (const_int 1)))] @@ -3201,12 +3315,12 @@ (define_expand "aarch64_sqdmull_lane" [(match_operand: 0 "register_operand" "=w") (match_operand:VSD_HSI 1 "register_operand" "w") - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") (match_operand:SI 3 "immediate_operand" "i")] "TARGET_SIMD" { - aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode) / 2); - operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); + aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); + operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); emit_insn (gen_aarch64_sqdmull_lane_internal (operands[0], operands[1], operands[2], operands[3])); DONE; @@ -3215,13 +3329,13 @@ (define_expand "aarch64_sqdmull_laneq" [(match_operand: 0 "register_operand" "=w") (match_operand:VD_HSI 1 "register_operand" "w") - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") (match_operand:SI 3 "immediate_operand" "i")] "TARGET_SIMD" { - aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); + aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); - emit_insn (gen_aarch64_sqdmull_lane_internal + emit_insn (gen_aarch64_sqdmull_laneq_internal (operands[0], operands[1], operands[2], operands[3])); DONE; }) @@ -3270,7 +3384,7 @@ (define_expand "aarch64_sqdmull2" [(match_operand: 0 "register_operand" "=w") (match_operand:VQ_HSI 1 "register_operand" "w") - (match_operand: 2 "register_operand" "w")] + (match_operand:VQ_HSI 2 "register_operand" "w")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); @@ -3292,7 +3406,30 @@ (sign_extend: (vec_duplicate: (vec_select: - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") + (parallel [(match_operand:SI 3 "immediate_operand" "i")]))) + )) + (const_int 1)))] + "TARGET_SIMD" + { + operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); + return "sqdmull2\\t%0, %1, %2.[%3]"; + } + [(set_attr "type" "neon_sat_mul__scalar_long")] +) + +(define_insn "aarch64_sqdmull2_laneq_internal" + [(set (match_operand: 0 "register_operand" "=w") + (ss_ashift: + (mult: + (sign_extend: + (vec_select: + (match_operand:VQ_HSI 1 "register_operand" "w") + (match_operand:VQ_HSI 4 "vect_par_cnst_hi_half" ""))) + (sign_extend: + (vec_duplicate: + (vec_select: + (match_operand: 2 "register_operand" "") (parallel [(match_operand:SI 3 "immediate_operand" "i")]))) )) (const_int 1)))] @@ -3307,13 +3444,13 @@ (define_expand "aarch64_sqdmull2_lane" [(match_operand: 0 "register_operand" "=w") (match_operand:VQ_HSI 1 "register_operand" "w") - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") (match_operand:SI 3 "immediate_operand" "i")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode) / 2); - operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); + aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); + operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); emit_insn (gen_aarch64_sqdmull2_lane_internal (operands[0], operands[1], operands[2], operands[3], p)); @@ -3323,14 +3460,14 @@ (define_expand "aarch64_sqdmull2_laneq" [(match_operand: 0 "register_operand" "=w") (match_operand:VQ_HSI 1 "register_operand" "w") - (match_operand: 2 "register_operand" "") + (match_operand: 2 "register_operand" "") (match_operand:SI 3 "immediate_operand" "i")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); - operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); - emit_insn (gen_aarch64_sqdmull2_lane_internal (operands[0], operands[1], + aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode)); + operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3]))); + emit_insn (gen_aarch64_sqdmull2_laneq_internal (operands[0], operands[1], operands[2], operands[3], p)); DONE; diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index b1b78f9..3ed8a98 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -19219,7 +19219,7 @@ vqdmlal_high_s16 (int32x4_t __a, int16x8_t __b, int16x8_t __c) } __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) -vqdmlal_high_lane_s16 (int32x4_t __a, int16x8_t __b, int16x8_t __c, +vqdmlal_high_lane_s16 (int32x4_t __a, int16x8_t __b, int16x4_t __c, int const __d) { return __builtin_aarch64_sqdmlal2_lanev8hi (__a, __b, __c, __d); @@ -19241,8 +19241,7 @@ vqdmlal_high_n_s16 (int32x4_t __a, int16x8_t __b, int16_t __c) __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) vqdmlal_lane_s16 (int32x4_t __a, int16x4_t __b, int16x4_t __c, int const __d) { - int16x8_t __tmp = vcombine_s16 (__c, vcreate_s16 (__AARCH64_INT64_C (0))); - return __builtin_aarch64_sqdmlal_lanev4hi (__a, __b, __tmp, __d); + return __builtin_aarch64_sqdmlal_lanev4hi (__a, __b, __c, __d); } __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) @@ -19270,7 +19269,7 @@ vqdmlal_high_s32 (int64x2_t __a, int32x4_t __b, int32x4_t __c) } __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) -vqdmlal_high_lane_s32 (int64x2_t __a, int32x4_t __b, int32x4_t __c, +vqdmlal_high_lane_s32 (int64x2_t __a, int32x4_t __b, int32x2_t __c, int const __d) { return __builtin_aarch64_sqdmlal2_lanev4si (__a, __b, __c, __d); @@ -19292,8 +19291,7 @@ vqdmlal_high_n_s32 (int64x2_t __a, int32x4_t __b, int32_t __c) __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) vqdmlal_lane_s32 (int64x2_t __a, int32x2_t __b, int32x2_t __c, int const __d) { - int32x4_t __tmp = vcombine_s32 (__c, vcreate_s32 (__AARCH64_INT64_C (0))); - return __builtin_aarch64_sqdmlal_lanev2si (__a, __b, __tmp, __d); + return __builtin_aarch64_sqdmlal_lanev2si (__a, __b, __c, __d); } __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) @@ -19315,7 +19313,7 @@ vqdmlalh_s16 (int32x1_t __a, int16x1_t __b, int16x1_t __c) } __extension__ static __inline int32x1_t __attribute__ ((__always_inline__)) -vqdmlalh_lane_s16 (int32x1_t __a, int16x1_t __b, int16x8_t __c, const int __d) +vqdmlalh_lane_s16 (int32x1_t __a, int16x1_t __b, int16x4_t __c, const int __d) { return __builtin_aarch64_sqdmlal_lanehi (__a, __b, __c, __d); } @@ -19327,7 +19325,7 @@ vqdmlals_s32 (int64x1_t __a, int32x1_t __b, int32x1_t __c) } __extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) -vqdmlals_lane_s32 (int64x1_t __a, int32x1_t __b, int32x4_t __c, const int __d) +vqdmlals_lane_s32 (int64x1_t __a, int32x1_t __b, int32x2_t __c, const int __d) { return __builtin_aarch64_sqdmlal_lanesi (__a, __b, __c, __d); } @@ -19347,7 +19345,7 @@ vqdmlsl_high_s16 (int32x4_t __a, int16x8_t __b, int16x8_t __c) } __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) -vqdmlsl_high_lane_s16 (int32x4_t __a, int16x8_t __b, int16x8_t __c, +vqdmlsl_high_lane_s16 (int32x4_t __a, int16x8_t __b, int16x4_t __c, int const __d) { return __builtin_aarch64_sqdmlsl2_lanev8hi (__a, __b, __c, __d); @@ -19369,8 +19367,7 @@ vqdmlsl_high_n_s16 (int32x4_t __a, int16x8_t __b, int16_t __c) __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) vqdmlsl_lane_s16 (int32x4_t __a, int16x4_t __b, int16x4_t __c, int const __d) { - int16x8_t __tmp = vcombine_s16 (__c, vcreate_s16 (__AARCH64_INT64_C (0))); - return __builtin_aarch64_sqdmlsl_lanev4hi (__a, __b, __tmp, __d); + return __builtin_aarch64_sqdmlsl_lanev4hi (__a, __b, __c, __d); } __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) @@ -19398,7 +19395,7 @@ vqdmlsl_high_s32 (int64x2_t __a, int32x4_t __b, int32x4_t __c) } __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) -vqdmlsl_high_lane_s32 (int64x2_t __a, int32x4_t __b, int32x4_t __c, +vqdmlsl_high_lane_s32 (int64x2_t __a, int32x4_t __b, int32x2_t __c, int const __d) { return __builtin_aarch64_sqdmlsl2_lanev4si (__a, __b, __c, __d); @@ -19420,8 +19417,7 @@ vqdmlsl_high_n_s32 (int64x2_t __a, int32x4_t __b, int32_t __c) __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) vqdmlsl_lane_s32 (int64x2_t __a, int32x2_t __b, int32x2_t __c, int const __d) { - int32x4_t __tmp = vcombine_s32 (__c, vcreate_s32 (__AARCH64_INT64_C (0))); - return __builtin_aarch64_sqdmlsl_lanev2si (__a, __b, __tmp, __d); + return __builtin_aarch64_sqdmlsl_lanev2si (__a, __b, __c, __d); } __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) @@ -19443,7 +19439,7 @@ vqdmlslh_s16 (int32x1_t __a, int16x1_t __b, int16x1_t __c) } __extension__ static __inline int32x1_t __attribute__ ((__always_inline__)) -vqdmlslh_lane_s16 (int32x1_t __a, int16x1_t __b, int16x8_t __c, const int __d) +vqdmlslh_lane_s16 (int32x1_t __a, int16x1_t __b, int16x4_t __c, const int __d) { return __builtin_aarch64_sqdmlsl_lanehi (__a, __b, __c, __d); } @@ -19455,7 +19451,7 @@ vqdmlsls_s32 (int64x1_t __a, int32x1_t __b, int32x1_t __c) } __extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) -vqdmlsls_lane_s32 (int64x1_t __a, int32x1_t __b, int32x4_t __c, const int __d) +vqdmlsls_lane_s32 (int64x1_t __a, int32x1_t __b, int32x2_t __c, const int __d) { return __builtin_aarch64_sqdmlsl_lanesi (__a, __b, __c, __d); } @@ -19493,7 +19489,7 @@ vqdmulhh_s16 (int16x1_t __a, int16x1_t __b) } __extension__ static __inline int16x1_t __attribute__ ((__always_inline__)) -vqdmulhh_lane_s16 (int16x1_t __a, int16x8_t __b, const int __c) +vqdmulhh_lane_s16 (int16x1_t __a, int16x4_t __b, const int __c) { return __builtin_aarch64_sqdmulh_lanehi (__a, __b, __c); } @@ -19505,7 +19501,7 @@ vqdmulhs_s32 (int32x1_t __a, int32x1_t __b) } __extension__ static __inline int32x1_t __attribute__ ((__always_inline__)) -vqdmulhs_lane_s32 (int32x1_t __a, int32x4_t __b, const int __c) +vqdmulhs_lane_s32 (int32x1_t __a, int32x2_t __b, const int __c) { return __builtin_aarch64_sqdmulh_lanesi (__a, __b, __c); } @@ -19525,7 +19521,7 @@ vqdmull_high_s16 (int16x8_t __a, int16x8_t __b) } __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) -vqdmull_high_lane_s16 (int16x8_t __a, int16x8_t __b, int const __c) +vqdmull_high_lane_s16 (int16x8_t __a, int16x4_t __b, int const __c) { return __builtin_aarch64_sqdmull2_lanev8hi (__a, __b,__c); } @@ -19545,8 +19541,7 @@ vqdmull_high_n_s16 (int16x8_t __a, int16_t __b) __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) vqdmull_lane_s16 (int16x4_t __a, int16x4_t __b, int const __c) { - int16x8_t __tmp = vcombine_s16 (__b, vcreate_s16 (__AARCH64_INT64_C (0))); - return __builtin_aarch64_sqdmull_lanev4hi (__a, __tmp, __c); + return __builtin_aarch64_sqdmull_lanev4hi (__a, __b, __c); } __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) @@ -19574,7 +19569,7 @@ vqdmull_high_s32 (int32x4_t __a, int32x4_t __b) } __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) -vqdmull_high_lane_s32 (int32x4_t __a, int32x4_t __b, int const __c) +vqdmull_high_lane_s32 (int32x4_t __a, int32x2_t __b, int const __c) { return __builtin_aarch64_sqdmull2_lanev4si (__a, __b, __c); } @@ -19594,8 +19589,7 @@ vqdmull_high_n_s32 (int32x4_t __a, int32_t __b) __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) vqdmull_lane_s32 (int32x2_t __a, int32x2_t __b, int const __c) { - int32x4_t __tmp = vcombine_s32 (__b, vcreate_s32 (__AARCH64_INT64_C (0))); - return __builtin_aarch64_sqdmull_lanev2si (__a, __tmp, __c); + return __builtin_aarch64_sqdmull_lanev2si (__a, __b, __c); } __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) @@ -19617,7 +19611,7 @@ vqdmullh_s16 (int16x1_t __a, int16x1_t __b) } __extension__ static __inline int32x1_t __attribute__ ((__always_inline__)) -vqdmullh_lane_s16 (int16x1_t __a, int16x8_t __b, const int __c) +vqdmullh_lane_s16 (int16x1_t __a, int16x4_t __b, const int __c) { return __builtin_aarch64_sqdmull_lanehi (__a, __b, __c); } @@ -19629,7 +19623,7 @@ vqdmulls_s32 (int32x1_t __a, int32x1_t __b) } __extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) -vqdmulls_lane_s32 (int32x1_t __a, int32x4_t __b, const int __c) +vqdmulls_lane_s32 (int32x1_t __a, int32x2_t __b, const int __c) { return __builtin_aarch64_sqdmull_lanesi (__a, __b, __c); } @@ -19811,7 +19805,7 @@ vqrdmulhh_s16 (int16x1_t __a, int16x1_t __b) } __extension__ static __inline int16x1_t __attribute__ ((__always_inline__)) -vqrdmulhh_lane_s16 (int16x1_t __a, int16x8_t __b, const int __c) +vqrdmulhh_lane_s16 (int16x1_t __a, int16x4_t __b, const int __c) { return __builtin_aarch64_sqrdmulh_lanehi (__a, __b, __c); } @@ -19823,7 +19817,7 @@ vqrdmulhs_s32 (int32x1_t __a, int32x1_t __b) } __extension__ static __inline int32x1_t __attribute__ ((__always_inline__)) -vqrdmulhs_lane_s32 (int32x1_t __a, int32x4_t __b, const int __c) +vqrdmulhs_lane_s32 (int32x1_t __a, int32x2_t __b, const int __c) { return __builtin_aarch64_sqrdmulh_lanesi (__a, __b, __c); } diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index bf7b683..5c304bf 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -410,14 +410,15 @@ (SI "SI") (HI "HI") (QI "QI")]) -;; Define container mode for lane selection. -(define_mode_attr VCOND [(V4HI "V4HI") (V8HI "V4HI") +;; 64-bit container modes the inner or scalar source mode. +(define_mode_attr VCOND [(HI "V4HI") (SI "V2SI") + (V4HI "V4HI") (V8HI "V4HI") (V2SI "V2SI") (V4SI "V2SI") (DI "DI") (V2DI "DI") (V2SF "V2SF") (V4SF "V2SF") (V2DF "DF")]) -;; Define container mode for lane selection. +;; 128-bit container modes the inner or scalar source mode. (define_mode_attr VCONQ [(V8QI "V16QI") (V16QI "V16QI") (V4HI "V8HI") (V8HI "V8HI") (V2SI "V4SI") (V4SI "V4SI") @@ -426,15 +427,6 @@ (V2DF "V2DF") (SI "V4SI") (HI "V8HI") (QI "V16QI")]) -;; Define container mode for lane selection. -(define_mode_attr VCON [(V8QI "V16QI") (V16QI "V16QI") - (V4HI "V8HI") (V8HI "V8HI") - (V2SI "V4SI") (V4SI "V4SI") - (DI "V2DI") (V2DI "V2DI") - (V2SF "V4SF") (V4SF "V4SF") - (V2DF "V2DF") (SI "V4SI") - (HI "V8HI") (QI "V16QI")]) - ;; Half modes of all vector modes. (define_mode_attr VHALF [(V8QI "V4QI") (V16QI "V8QI") (V4HI "V2HI") (V8HI "V4HI") diff --git a/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c b/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c index aa041cc..782f6d1 100644 --- a/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c +++ b/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c @@ -387,7 +387,7 @@ test_vqdmlalh_s16 (int32x1_t a, int16x1_t b, int16x1_t c) /* { dg-final { scan-assembler-times "\\tsqdmlal\\ts\[0-9\]+, h\[0-9\]+, v" 1 } } */ int32x1_t -test_vqdmlalh_lane_s16 (int32x1_t a, int16x1_t b, int16x8_t c) +test_vqdmlalh_lane_s16 (int32x1_t a, int16x1_t b, int16x4_t c) { return vqdmlalh_lane_s16 (a, b, c, 3); } @@ -403,7 +403,7 @@ test_vqdmlals_s32 (int64x1_t a, int32x1_t b, int32x1_t c) /* { dg-final { scan-assembler-times "\\tsqdmlal\\td\[0-9\]+, s\[0-9\]+, v" 1 } } */ int64x1_t -test_vqdmlals_lane_s32 (int64x1_t a, int32x1_t b, int32x4_t c) +test_vqdmlals_lane_s32 (int64x1_t a, int32x1_t b, int32x2_t c) { return vqdmlals_lane_s32 (a, b, c, 1); } @@ -419,7 +419,7 @@ test_vqdmlslh_s16 (int32x1_t a, int16x1_t b, int16x1_t c) /* { dg-final { scan-assembler-times "\\tsqdmlsl\\ts\[0-9\]+, h\[0-9\]+, v" 1 } } */ int32x1_t -test_vqdmlslh_lane_s16 (int32x1_t a, int16x1_t b, int16x8_t c) +test_vqdmlslh_lane_s16 (int32x1_t a, int16x1_t b, int16x4_t c) { return vqdmlslh_lane_s16 (a, b, c, 3); } @@ -435,7 +435,7 @@ test_vqdmlsls_s32 (int64x1_t a, int32x1_t b, int32x1_t c) /* { dg-final { scan-assembler-times "\\tsqdmlsl\\td\[0-9\]+, s\[0-9\]+, v" 1 } } */ int64x1_t -test_vqdmlsls_lane_s32 (int64x1_t a, int32x1_t b, int32x4_t c) +test_vqdmlsls_lane_s32 (int64x1_t a, int32x1_t b, int32x2_t c) { return vqdmlsls_lane_s32 (a, b, c, 1); } @@ -451,7 +451,7 @@ test_vqdmulhh_s16 (int16x1_t a, int16x1_t b) /* { dg-final { scan-assembler-times "\\tsqdmulh\\th\[0-9\]+, h\[0-9\]+, v" 1 } } */ int16x1_t -test_vqdmulhh_lane_s16 (int16x1_t a, int16x8_t b) +test_vqdmulhh_lane_s16 (int16x1_t a, int16x4_t b) { return vqdmulhh_lane_s16 (a, b, 3); } @@ -467,9 +467,9 @@ test_vqdmulhs_s32 (int32x1_t a, int32x1_t b) /* { dg-final { scan-assembler-times "\\tsqdmulh\\ts\[0-9\]+, s\[0-9\]+, v" 1 } } */ int32x1_t -test_vqdmulhs_lane_s32 (int32x1_t a, int32x4_t b) +test_vqdmulhs_lane_s32 (int32x1_t a, int32x2_t b) { - return vqdmulhs_lane_s32 (a, b, 3); + return vqdmulhs_lane_s32 (a, b, 1); } /* { dg-final { scan-assembler-times "\\tsqdmull\\ts\[0-9\]+, h\[0-9\]+, h\[0-9\]+" 1 } } */ @@ -483,7 +483,7 @@ test_vqdmullh_s16 (int16x1_t a, int16x1_t b) /* { dg-final { scan-assembler-times "\\tsqdmull\\ts\[0-9\]+, h\[0-9\]+, v" 1 } } */ int32x1_t -test_vqdmullh_lane_s16 (int16x1_t a, int16x8_t b) +test_vqdmullh_lane_s16 (int16x1_t a, int16x4_t b) { return vqdmullh_lane_s16 (a, b, 3); } @@ -499,7 +499,7 @@ test_vqdmulls_s32 (int32x1_t a, int32x1_t b) /* { dg-final { scan-assembler-times "\\tsqdmull\\td\[0-9\]+, s\[0-9\]+, v" 1 } } */ int64x1_t -test_vqdmulls_lane_s32 (int32x1_t a, int32x4_t b) +test_vqdmulls_lane_s32 (int32x1_t a, int32x2_t b) { return vqdmulls_lane_s32 (a, b, 1); } @@ -515,9 +515,9 @@ test_vqrdmulhh_s16 (int16x1_t a, int16x1_t b) /* { dg-final { scan-assembler-times "\\tsqrdmulh\\th\[0-9\]+, h\[0-9\]+, v" 1 } } */ int16x1_t -test_vqrdmulhh_lane_s16 (int16x1_t a, int16x8_t b) +test_vqrdmulhh_lane_s16 (int16x1_t a, int16x4_t b) { - return vqrdmulhh_lane_s16 (a, b, 6); + return vqrdmulhh_lane_s16 (a, b, 3); } /* { dg-final { scan-assembler-times "\\tsqrdmulh\\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" 1 } } */ @@ -531,9 +531,9 @@ test_vqrdmulhs_s32 (int32x1_t a, int32x1_t b) /* { dg-final { scan-assembler-times "\\tsqrdmulh\\ts\[0-9\]+, s\[0-9\]+, v" 1 } } */ int32x1_t -test_vqrdmulhs_lane_s32 (int32x1_t a, int32x4_t b) +test_vqrdmulhs_lane_s32 (int32x1_t a, int32x2_t b) { - return vqrdmulhs_lane_s32 (a, b, 2); + return vqrdmulhs_lane_s32 (a, b, 1); } /* { dg-final { scan-assembler-times "\\tsuqadd\\tb\[0-9\]+" 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_lane_s16.c new file mode 100644 index 0000000..5ab189e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_lane_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmlal_high_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmlal_high_lane_s16 (int32x4_t a, int16x8_t b, int16x4_t c) +{ + return vqdmlal_high_lane_s16 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal2\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_lane_s32.c new file mode 100644 index 0000000..ad39d81 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_lane_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlal_high_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmlal_high_lane_s32 (int64x2_t a, int32x4_t b, int32x2_t c) +{ + return vqdmlal_high_lane_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal2\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_laneq_s16.c new file mode 100644 index 0000000..5cb2e4a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_laneq_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmlal_high_laneq_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmlal_high_laneq_s16 (int32x4_t a, int16x8_t b, int16x8_t c) +{ + return vqdmlal_high_laneq_s16 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal2\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_laneq_s32.c new file mode 100644 index 0000000..981e5f1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_high_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlal_high_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmlal_high_laneq_s32 (int64x2_t a, int32x4_t b, int32x4_t c) +{ + return vqdmlal_high_laneq_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal2\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_lane_s16.c new file mode 100644 index 0000000..33ea6f9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_lane_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmlal_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmlal_lane_s16 (int32x4_t a, int16x4_t b, int16x4_t c) +{ + return vqdmlal_lane_s16 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.4\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_lane_s32.c new file mode 100644 index 0000000..e2590b4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_lane_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlal_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmlal_lane_s32 (int64x2_t a, int32x2_t b, int32x2_t c) +{ + return vqdmlal_lane_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_laneq_s16.c new file mode 100644 index 0000000..fdc8c8d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_laneq_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmlal_laneq_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmlal_laneq_s16 (int32x4_t a, int16x4_t b, int16x8_t c) +{ + return vqdmlal_laneq_s16 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.4\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_laneq_s32.c new file mode 100644 index 0000000..c16a846 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlal_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlal_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmlal_laneq_s32 (int64x2_t a, int32x2_t b, int32x4_t c) +{ + return vqdmlal_laneq_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlalh_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlalh_lane_s16.c new file mode 100644 index 0000000..954b69d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlalh_lane_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmlalh_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x1_t +t_vqdmlalh_lane_s16 (int32x1_t a, int16x1_t b, int16x4_t c) +{ + return vqdmlalh_lane_s16 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal\[ \t\]+\[sS\]\[0-9\]+, ?\[hH\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlals_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlals_lane_s32.c new file mode 100644 index 0000000..e7a6b6a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlals_lane_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlals_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x1_t +t_vqdmlals_lane_s32 (int64x1_t a, int32x1_t b, int32x2_t c) +{ + return vqdmlals_lane_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlal\[ \t\]+\[dD\]\[0-9\]+, ?\[sS\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_lane_s16.c new file mode 100644 index 0000000..b17e51a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_lane_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmlsl_high_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmlsl_high_lane_s16 (int32x4_t a, int16x8_t b, int16x4_t c) +{ + return vqdmlsl_high_lane_s16 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlsl2\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_lane_s32.c new file mode 100644 index 0000000..ba399b4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_lane_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlsl_high_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmlsl_high_lane_s32 (int64x2_t a, int32x4_t b, int32x2_t c) +{ + return vqdmlsl_high_lane_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlsl2\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_laneq_s16.c new file mode 100644 index 0000000..eab4732 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_laneq_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmlsl_high_laneq_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmlsl_high_laneq_s16 (int32x4_t a, int16x8_t b, int16x8_t c) +{ + return vqdmlsl_high_laneq_s16 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlsl2\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_laneq_s32.c new file mode 100644 index 0000000..b926b1a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_high_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlsl_high_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmlsl_high_laneq_s32 (int64x2_t a, int32x4_t b, int32x4_t c) +{ + return vqdmlsl_high_laneq_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlsl2\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_lane_s16.c new file mode 100644 index 0000000..9ca903c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_lane_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmlsl_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmlsl_lane_s16 (int32x4_t a, int16x4_t b, int16x4_t c) +{ + return vqdmlsl_lane_s16 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlsl\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.4\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_lane_s32.c new file mode 100644 index 0000000..5c13fe4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_lane_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlsl_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmlsl_lane_s32 (int64x2_t a, int32x2_t b, int32x2_t c) +{ + return vqdmlsl_lane_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlsl\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_laneq_s32.c new file mode 100644 index 0000000..4538995 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsl_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlsl_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmlsl_lane_s32 (int64x2_t a, int32x2_t b, int32x4_t c) +{ + return vqdmlsl_laneq_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlsl\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlslh_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlslh_lane_s16.c new file mode 100644 index 0000000..ccebe54 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlslh_lane_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmlslh_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x1_t +t_vqdmlslh_lane_s16 (int32x1_t a, int16x1_t b, int16x4_t c) +{ + return vqdmlslh_lane_s16 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlsl\[ \t\]+\[sS\]\[0-9\]+, ?\[hH\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsls_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsls_lane_s32.c new file mode 100644 index 0000000..a72aacf --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmlsls_lane_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmlsls_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x1_t +t_vqdmlsls_lane_s32 (int64x1_t a, int32x1_t b, int32x2_t c) +{ + return vqdmlsls_lane_s32 (a, b, c, 0); +} + +/* { dg-final { scan-assembler-times "sqdmlsl\[ \t\]+\[dD\]\[0-9\]+, ?\[sS\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmulh_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulh_laneq_s16.c new file mode 100644 index 0000000..23255d5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulh_laneq_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmulh_laneq_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int16x4_t +t_vqdmulh_laneq_s16 (int16x4_t a, int16x8_t b) +{ + return vqdmulh_laneq_s16 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmulh\[ \t\]+\[vV\]\[0-9\]+\.4\[hH\], ?\[vV\]\[0-9\]+\.4\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmulh_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulh_laneq_s32.c new file mode 100644 index 0000000..2aac35d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulh_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmulh_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x2_t +t_vqdmulh_laneq_s32 (int32x2_t a, int32x4_t b) +{ + return vqdmulh_laneq_s32 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmulh\[ \t\]+\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhh_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhh_lane_s16.c new file mode 100644 index 0000000..4779e36 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhh_lane_s16.c @@ -0,0 +1,36 @@ +/* Test the vqdmulhh_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do run } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" +#include + +extern void abort (void); + +int +main (void) +{ + int16_t arg1; + int16x4_t arg2; + int16_t result; + int16_t actual; + int16_t expected; + + arg1 = -32768; + arg2 = vcreate_s16 (0x0000ffff2489e398ULL); + actual = vqdmulhh_lane_s16 (arg1, arg2, 2); + expected = 1; + + if (expected != actual) + { + fprintf (stderr, "Expected: %xd, got %xd\n", expected, actual); + abort (); + } + + return 0; +} + + +/* { dg-final { scan-assembler-times "sqdmulh\[ \t\]+\[hH\]\[0-9\]+, ?\[hH\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[hH\]\\\[2\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhq_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhq_laneq_s16.c new file mode 100644 index 0000000..ff654b5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhq_laneq_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmulhq_laneq_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int16x8_t +t_vqdmulhq_laneq_s16 (int16x8_t a, int16x8_t b) +{ + return vqdmulhq_laneq_s16 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmulh\[ \t\]+\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhq_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhq_laneq_s32.c new file mode 100644 index 0000000..bc88245 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhq_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmulhq_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmulhq_laneq_s32 (int32x4_t a, int32x4_t b) +{ + return vqdmulhq_laneq_s32 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmulh\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhs_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhs_lane_s32.c new file mode 100644 index 0000000..9c27f5f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulhs_lane_s32.c @@ -0,0 +1,34 @@ +/* Test the vqdmulhs_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do run } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" +#include + +extern void abort (void); + +int +main (void) +{ + int32_t arg1; + int32x2_t arg2; + int32_t result; + int32_t actual; + int32_t expected; + + arg1 = 57336; + arg2 = vcreate_s32 (0x55897fff7fff0000ULL); + actual = vqdmulhs_lane_s32 (arg1, arg2, 0); + expected = 57334; + + if (expected != actual) + { + fprintf (stderr, "Expected: %xd, got %xd\n", expected, actual); + abort (); + } + + return 0; +} +/* { dg-final { scan-assembler-times "sqdmulh\[ \t\]+\[sS\]\[0-9\]+, ?\[sS\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_lane_s16.c new file mode 100644 index 0000000..0bfad16 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_lane_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmull_high_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmull_high_lane_s16 (int16x8_t a, int16x4_t b) +{ + return vqdmull_high_lane_s16 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull2\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_lane_s32.c new file mode 100644 index 0000000..94227ad --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_lane_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmull_high_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmull_high_lane_s32 (int32x4_t a, int32x2_t b) +{ + return vqdmull_high_lane_s32 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull2\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_laneq_s16.c new file mode 100644 index 0000000..393ad98 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_laneq_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmull_high_laneq_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmull_high_laneq_s16 (int16x8_t a, int16x8_t b) +{ + return vqdmull_high_laneq_s16 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull2\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_laneq_s32.c new file mode 100644 index 0000000..0a3f48f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_high_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmull_high_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmull_high_laneq_s32 (int32x4_t a, int32x4_t b) +{ + return vqdmull_high_laneq_s32 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull2\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_lane_s16.c new file mode 100644 index 0000000..39f7262 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_lane_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmull_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmull_lane_s16 (int16x4_t a, int16x4_t b) +{ + return vqdmull_lane_s16 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.4\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_lane_s32.c new file mode 100644 index 0000000..8ae7d7e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_lane_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmull_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmull_lane_s32 (int32x2_t a, int32x2_t b) +{ + return vqdmull_lane_s32 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_laneq_s16.c new file mode 100644 index 0000000..3d87352 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_laneq_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmull_laneq_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqdmull_laneq_s16 (int16x4_t a, int16x8_t b) +{ + return vqdmull_laneq_s16 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.4\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_laneq_s32.c new file mode 100644 index 0000000..bc35d14 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmull_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmull_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x2_t +t_vqdmull_laneq_s32 (int32x2_t a, int32x4_t b) +{ + return vqdmull_laneq_s32 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull\[ \t\]+\[vV\]\[0-9\]+\.2\[dD\], ?\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmullh_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmullh_lane_s16.c new file mode 100644 index 0000000..94dd0c6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmullh_lane_s16.c @@ -0,0 +1,15 @@ +/* Test the vqdmullh_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x1_t +t_vqdmullh_lane_s16 (int16x1_t a, int16x4_t b) +{ + return vqdmullh_lane_s16 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull\[ \t\]+\[sS\]\[0-9\]+, ?\[hH\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmulls_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulls_lane_s32.c new file mode 100644 index 0000000..9ac7ee7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulls_lane_s32.c @@ -0,0 +1,15 @@ +/* Test the vqdmulls_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int64x1_t +t_vqdmulls_lane_s32 (int32x1_t a, int32x2_t b) +{ + return vqdmulls_lane_s32 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqdmull\[ \t\]+\[dD\]\[0-9\]+, ?\[sS\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulh_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulh_laneq_s16.c new file mode 100644 index 0000000..7d03619 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulh_laneq_s16.c @@ -0,0 +1,15 @@ +/* Test the vqrdmulh_laneq_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int16x4_t +t_vqrdmulh_laneq_s16 (int16x4_t a, int16x8_t b) +{ + return vqrdmulh_laneq_s16 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqrdmulh\[ \t\]+\[vV\]\[0-9\]+\.4\[hH\], ?\[vV\]\[0-9\]+\.4\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulh_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulh_laneq_s32.c new file mode 100644 index 0000000..ac5da9c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulh_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqrdmulh_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x2_t +t_vqrdmulh_laneq_s32 (int32x2_t a, int32x4_t b) +{ + return vqrdmulh_laneq_s32 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqrdmulh\[ \t\]+\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.2\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhh_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhh_lane_s16.c new file mode 100644 index 0000000..afa3e36 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhh_lane_s16.c @@ -0,0 +1,35 @@ +/* Test the vqrdmulhh_lane_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do run } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" +#include + +extern void abort (void); + +int +main (void) +{ + int16_t arg1; + int16x4_t arg2; + int16_t result; + int16_t actual; + int16_t expected; + + arg1 = -32768; + arg2 = vcreate_s16 (0xd78e000005d78000ULL); + actual = vqrdmulhh_lane_s16 (arg1, arg2, 3); + expected = 10354; + + if (expected != actual) + { + fprintf (stderr, "Expected: %xd, got %xd\n", expected, actual); + abort (); + } + + return 0; +} + +/* { dg-final { scan-assembler-times "sqrdmulh\[ \t\]+\[hH\]\[0-9\]+, ?\[hH\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[hH\]\\\[3\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhq_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhq_laneq_s16.c new file mode 100644 index 0000000..ec5434b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhq_laneq_s16.c @@ -0,0 +1,15 @@ +/* Test the vqrdmulhq_laneq_s16 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int16x8_t +t_vqrdmulhq_laneq_s16 (int16x8_t a, int16x8_t b) +{ + return vqrdmulhq_laneq_s16 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqrdmulh\[ \t\]+\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.8\[hH\], ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhq_laneq_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhq_laneq_s32.c new file mode 100644 index 0000000..b2013f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhq_laneq_s32.c @@ -0,0 +1,15 @@ +/* Test the vqrdmulhq_laneq_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do compile } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" + +int32x4_t +t_vqrdmulhq_laneq_s32 (int32x4_t a, int32x4_t b) +{ + return vqrdmulhq_laneq_s32 (a, b, 0); +} + +/* { dg-final { scan-assembler-times "sqrdmulh\[ \t\]+\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.4\[sS\], ?\[vV\]\[0-9\]+\.\[sS\]\\\[0\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhs_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhs_lane_s32.c new file mode 100644 index 0000000..83d2ba2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vqrdmulhs_lane_s32.c @@ -0,0 +1,35 @@ +/* Test the vqrdmulhs_lane_s32 AArch64 SIMD intrinsic. */ + +/* { dg-do run } */ +/* { dg-options "-save-temps -O3 -fno-inline" } */ + +#include "arm_neon.h" +#include + +extern void abort (void); + +int +main (void) +{ + int32_t arg1; + int32x2_t arg2; + int32_t result; + int32_t actual; + int32_t expected; + + arg1 = -2099281921; + arg2 = vcreate_s32 (0x000080007fff0000ULL); + actual = vqrdmulhs_lane_s32 (arg1, arg2, 1); + expected = -32033; + + if (expected != actual) + { + fprintf (stderr, "Expected: %xd, got %xd\n", expected, actual); + abort (); + } + + return 0; +} + +/* { dg-final { scan-assembler-times "sqrdmulh\[ \t\]+\[sS\]\[0-9\]+, ?\[sS\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[sS\]\\\[1\\\]\n" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vector_intrinsics.c b/gcc/testsuite/gcc.target/aarch64/vector_intrinsics.c index affb8a8..52b0496 100644 --- a/gcc/testsuite/gcc.target/aarch64/vector_intrinsics.c +++ b/gcc/testsuite/gcc.target/aarch64/vector_intrinsics.c @@ -1,7 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2" } */ -#include "../../../config/aarch64/arm_neon.h" +#include "arm_neon.h" /* { dg-final { scan-assembler-times "\\tfmax\\tv\[0-9\]+\.2s, v\[0-9\].2s, v\[0-9\].2s" 1 } } */ @@ -305,7 +305,7 @@ test_vqdmlal_high_s16 (int32x4_t __a, int16x8_t __b, int16x8_t __c) /* { dg-final { scan-assembler-times "\\tsqdmlal2\\tv\[0-9\]+\.4s, v\[0-9\]+\.8h, v\[0-9\]+\.h" 3 } } */ int32x4_t -test_vqdmlal_high_lane_s16 (int32x4_t a, int16x8_t b, int16x8_t c) +test_vqdmlal_high_lane_s16 (int32x4_t a, int16x8_t b, int16x4_t c) { return vqdmlal_high_lane_s16 (a, b, c, 3); } @@ -361,7 +361,7 @@ test_vqdmlal_high_s32 (int64x2_t __a, int32x4_t __b, int32x4_t __c) /* { dg-final { scan-assembler-times "\\tsqdmlal2\\tv\[0-9\]+\.2d, v\[0-9\]+\.4s, v\[0-9\]+\.s" 3 } } */ int64x2_t -test_vqdmlal_high_lane_s32 (int64x2_t __a, int32x4_t __b, int32x4_t __c) +test_vqdmlal_high_lane_s32 (int64x2_t __a, int32x4_t __b, int32x2_t __c) { return vqdmlal_high_lane_s32 (__a, __b, __c, 1); } @@ -417,7 +417,7 @@ test_vqdmlsl_high_s16 (int32x4_t __a, int16x8_t __b, int16x8_t __c) /* { dg-final { scan-assembler-times "\\tsqdmlsl2\\tv\[0-9\]+\.4s, v\[0-9\]+\.8h, v\[0-9\]+\.h" 3 } } */ int32x4_t -test_vqdmlsl_high_lane_s16 (int32x4_t a, int16x8_t b, int16x8_t c) +test_vqdmlsl_high_lane_s16 (int32x4_t a, int16x8_t b, int16x4_t c) { return vqdmlsl_high_lane_s16 (a, b, c, 3); } @@ -473,7 +473,7 @@ test_vqdmlsl_high_s32 (int64x2_t __a, int32x4_t __b, int32x4_t __c) /* { dg-final { scan-assembler-times "\\tsqdmlsl2\\tv\[0-9\]+\.2d, v\[0-9\]+\.4s, v\[0-9\]+\.s" 3 } } */ int64x2_t -test_vqdmlsl_high_lane_s32 (int64x2_t __a, int32x4_t __b, int32x4_t __c) +test_vqdmlsl_high_lane_s32 (int64x2_t __a, int32x4_t __b, int32x2_t __c) { return vqdmlsl_high_lane_s32 (__a, __b, __c, 1); } @@ -529,7 +529,7 @@ test_vqdmull_high_s16 (int16x8_t __a, int16x8_t __b) /* { dg-final { scan-assembler-times "\\tsqdmull2\\tv\[0-9\]+\.4s, v\[0-9\]+\.8h, v\[0-9\]+\.h" 3 } } */ int32x4_t -test_vqdmull_high_lane_s16 (int16x8_t a, int16x8_t b) +test_vqdmull_high_lane_s16 (int16x8_t a, int16x4_t b) { return vqdmull_high_lane_s16 (a, b, 3); } @@ -585,7 +585,7 @@ test_vqdmull_high_s32 (int32x4_t __a, int32x4_t __b) /* { dg-final { scan-assembler-times "\\tsqdmull2\\tv\[0-9\]+\.2d, v\[0-9\]+\.4s, v\[0-9\]+\.s" 3 } } */ int64x2_t -test_vqdmull_high_lane_s32 (int32x4_t __a, int32x4_t __b) +test_vqdmull_high_lane_s32 (int32x4_t __a, int32x2_t __b) { return vqdmull_high_lane_s32 (__a, __b, 1); }