From patchwork Thu Nov 26 16:10:36 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wahab X-Patchwork-Id: 549167 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 8E5CD1402C4 for ; Fri, 27 Nov 2015 03:11:04 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=iRl5ThYM; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=KJ+O8vTV229Vn2h+1 6NirjfnGJF759ZmxGGP+cDIAF1jqSGBSRdXVb2aZgkvkYJONM6YOmUXsm1rsPBsN yQr2SUrVR4IP5oO2/zmNWZtRlnfUZ8P/fs5DJksySeGCTpve4PybtKplyTU4aRfY eOSGnuWbA3LbYJTqQRhhbM9aC8= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=Aks7THcNc8iY/mRAcWj9zzM nF2s=; b=iRl5ThYMgTtE9gTlcKj10TUCUvyJEBQ2a+Tl2cA4EcOTzNBGgRmgomN 9U2ztUfc4e2DeEPE5z21l3h8OBwzoNXOCi+Ey53ksyop8Leobh5DW3m3+W/iUf8r PvzcMBJfn/HXFmGkBztvOHXy9XZnSrXyWyEU4cmb7Qv3woEU3ZWI= Received: (qmail 125566 invoked by alias); 26 Nov 2015 16:10:46 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 125476 invoked by uid 89); 26 Nov 2015 16:10:45 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.1 required=5.0 tests=AWL, BAYES_00, KAM_LAZY_DOMAIN_SECURITY, RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 26 Nov 2015 16:10:39 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5923949 for ; Thu, 26 Nov 2015 08:10:20 -0800 (PST) Received: from e108033-lin.cambridge.arm.com (e108033-lin.cambridge.arm.com [10.2.206.36]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 090983F308 for ; Thu, 26 Nov 2015 08:10:37 -0800 (PST) Subject: Re: [PATCH 7/7][ARM] Add ACLE intrinsics vqrdmlah_lane and vqrdmlsh_lane To: gcc-patches@gcc.gnu.org References: <56572B79.9000406@foss.arm.com> <56572DA8.9030804@foss.arm.com> From: Matthew Wahab Message-ID: <56572EFC.2030407@foss.arm.com> Date: Thu, 26 Nov 2015 16:10:36 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <56572DA8.9030804@foss.arm.com> X-IsSubscribed: yes Attached the missing patch. Matthew On 26/11/15 16:04, Matthew Wahab wrote: > Hello, > > This patch adds the ACLE intrinsics for the instructions introduced in > ARMv8.1. It adds the vqrmdlah_lane and vqrdmlsh_lane forms of the > instrinsics to the arm_neon.h header, together with the ARM builtins > used to implement them. The intrinsics are available when > -march=armv8.1-a is enabled together with appropriate fpu options. > > Tested the series for arm-none-eabi with cross-compiled check-gcc on an > ARMv8.1 emulator. Also tested arm-none-linux-gnueabihf with native > bootstrap and make check. > > Ok for trunk? > Matthew > > gcc/ > 2015-11-26 Matthew Wahab > > * config/arm/arm_neon.h (vqrdmlahq_lane_s16): New. > (vqrdmlahq_lane_s32): New. > (vqrdmlah_lane_s16): New. > (vqrdmlah_lane_s32): New. > (vqrdmlshq_lane_s16): New. > (vqrdmlshq_lane_s32): New. > (vqrdmlsh_lane_s16): New. > (vqrdmlsh_lane_s32): New. > * config/arm/arm_neon_builtins.def: Add "vqrdmlah_lane" and > "vqrdmlsh_lane". > From cdfee6be49e52056de8999fbc33a432f2cc7254f Mon Sep 17 00:00:00 2001 From: Matthew Wahab Date: Tue, 1 Sep 2015 16:22:34 +0100 Subject: [PATCH 7/7] [ARM] Add neon intrinsics vqrdmlah_lane, vqrdmlsh_lane. Change-Id: Ia0ab4bbe683af2d019d18a34302a7b9798193a79 --- gcc/config/arm/arm_neon.h | 50 ++++++++++++++++++++++++++++++++++++ gcc/config/arm/arm_neon_builtins.def | 2 ++ 2 files changed, 52 insertions(+) diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h index b617f80..ed50253 100644 --- a/gcc/config/arm/arm_neon.h +++ b/gcc/config/arm/arm_neon.h @@ -7096,6 +7096,56 @@ vqrdmulh_lane_s32 (int32x2_t __a, int32x2_t __b, const int __c) return (int32x2_t)__builtin_neon_vqrdmulh_lanev2si (__a, __b, __c); } +#ifdef __ARM_FEATURE_QRDMX +__extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) +vqrdmlahq_lane_s16 (int16x8_t __a, int16x8_t __b, int16x4_t __c, const int __d) +{ + return (int16x8_t)__builtin_neon_vqrdmlah_lanev8hi (__a, __b, __c, __d); +} + +__extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) +vqrdmlahq_lane_s32 (int32x4_t __a, int32x4_t __b, int32x2_t __c, const int __d) +{ + return (int32x4_t)__builtin_neon_vqrdmlah_lanev4si (__a, __b, __c, __d); +} + +__extension__ static __inline int16x4_t __attribute__ ((__always_inline__)) +vqrdmlah_lane_s16 (int16x4_t __a, int16x4_t __b, int16x4_t __c, const int __d) +{ + return (int16x4_t)__builtin_neon_vqrdmlah_lanev4hi (__a, __b, __c, __d); +} + +__extension__ static __inline int32x2_t __attribute__ ((__always_inline__)) +vqrdmlah_lane_s32 (int32x2_t __a, int32x2_t __b, int32x2_t __c, const int __d) +{ + return (int32x2_t)__builtin_neon_vqrdmlah_lanev2si (__a, __b, __c, __d); +} + +__extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) +vqrdmlshq_lane_s16 (int16x8_t __a, int16x8_t __b, int16x4_t __c, const int __d) +{ + return (int16x8_t)__builtin_neon_vqrdmlsh_lanev8hi (__a, __b, __c, __d); +} + +__extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) +vqrdmlshq_lane_s32 (int32x4_t __a, int32x4_t __b, int32x2_t __c, const int __d) +{ + return (int32x4_t)__builtin_neon_vqrdmlsh_lanev4si (__a, __b, __c, __d); +} + +__extension__ static __inline int16x4_t __attribute__ ((__always_inline__)) +vqrdmlsh_lane_s16 (int16x4_t __a, int16x4_t __b, int16x4_t __c, const int __d) +{ + return (int16x4_t)__builtin_neon_vqrdmlsh_lanev4hi (__a, __b, __c, __d); +} + +__extension__ static __inline int32x2_t __attribute__ ((__always_inline__)) +vqrdmlsh_lane_s32 (int32x2_t __a, int32x2_t __b, int32x2_t __c, const int __d) +{ + return (int32x2_t)__builtin_neon_vqrdmlsh_lanev2si (__a, __b, __c, __d); +} +#endif + __extension__ static __inline int16x4_t __attribute__ ((__always_inline__)) vmul_n_s16 (int16x4_t __a, int16_t __b) { diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def index 8d5c0ca..1fdb2a8 100644 --- a/gcc/config/arm/arm_neon_builtins.def +++ b/gcc/config/arm/arm_neon_builtins.def @@ -60,6 +60,8 @@ VAR4 (BINOP, vqdmulh_n, v4hi, v2si, v8hi, v4si) VAR4 (BINOP, vqrdmulh_n, v4hi, v2si, v8hi, v4si) VAR4 (SETLANE, vqdmulh_lane, v4hi, v2si, v8hi, v4si) VAR4 (SETLANE, vqrdmulh_lane, v4hi, v2si, v8hi, v4si) +VAR4 (MAC_LANE, vqrdmlah_lane, v4hi, v2si, v8hi, v4si) +VAR4 (MAC_LANE, vqrdmlsh_lane, v4hi, v2si, v8hi, v4si) VAR2 (BINOP, vqdmull, v4hi, v2si) VAR8 (BINOP, vshls, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) VAR8 (BINOP, vshlu, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) -- 2.1.4