From patchwork Thu Nov 29 14:27:36 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyrylo Tkachov X-Patchwork-Id: 202757 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 947B42C0087 for ; Fri, 30 Nov 2012 01:28:15 +1100 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1354804095; h=Comment: DomainKey-Signature:Received:Received:Received:Received:Received: From:To:Cc:Subject:Date:Message-ID:MIME-Version:Content-Type: Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:Sender:Delivered-To; bh=5a/T5SJ+GKN1WTLVtA39 jApXd+g=; b=DjnmLv7vigKWv5s29ayjt69vvci8grvDYTvBn1yCDjrqB3IPVp1l CfQyLngtDfVbs+T0MqnMqJvRLW/Iiy17hGW7XzpJLcSlNr3gQV9i6uSDH+HI+qmY vcL//K8Wbc1vOdWBlWRCT+XNs9UehMZpYp4PHi6g9okqfRD1rCBX3Sc= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Received:From:To:Cc:Subject:Date:Message-ID:MIME-Version:X-MC-Unique:Content-Type:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=nUUyNx6RaNDKua2PVR7muyy+72t9miq+9RLR4kk/R5fJ1tiRC5STpuC6eIx0Fy 2j+8fdPOJ7W3D4tPtTh8rW8xKmhU+E/CaJikHkSn9GMfYjVzplJoObauFo50sc6a z3oY2VqwxKf8Cu7ASvLALgOgQHBMTVKR/1apWSiWKEtlA=; Received: (qmail 27600 invoked by alias); 29 Nov 2012 14:27:57 -0000 Received: (qmail 27581 invoked by uid 22791); 29 Nov 2012 14:27:55 -0000 X-SWARE-Spam-Status: No, hits=-0.9 required=5.0 tests=AWL, BAYES_00, KHOP_RCVD_UNTRUST, MSGID_MULTIPLE_AT, RCVD_IN_DNSWL_LOW, TW_DQ, TW_QD, TW_VR X-Spam-Check-By: sourceware.org Received: from service87.mimecast.com (HELO service87.mimecast.com) (91.220.42.44) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 29 Nov 2012 14:27:43 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Thu, 29 Nov 2012 14:27:41 +0000 Received: from e106372vm ([10.1.255.212]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.0); Thu, 29 Nov 2012 14:27:40 +0000 From: "Kyrylo Tkachov" To: Cc: "'Ramana Radhakrishnan'" , "Richard Earnshaw" Subject: [PATCH][ARM][3/3] AArch32 NEON vrint builtins and intrinsics Date: Thu, 29 Nov 2012 14:27:36 -0000 Message-ID: <006b01cdce3d$ae0ff260$0a2fd720$@tkachov@arm.com> MIME-Version: 1.0 X-MC-Unique: 112112914274104101 X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hi all, This patch adds the intrinsics support for the vrnd intrinsics that are implemented by the vrint instructions. The .ml scripts contain the new information and should used to regenerate the arm_neon.h header file, tests and documentation. In particular: * config/arm/arm_neon.h should be regenerated using config/arm/neon-gen.ml. * doc/arm-neon-intrinsics.texi should be regenerated using config/arm/neon-docgen.ml. * The tests in testsuite/gcc.target/arm/neon/ should be generated using config/arm/neon-testgen.ml. All three of these scripts should be linked against the compiled neon.ml file i.e: $ ocamlc -c neon.ml $ ocamlc -o neon-gen neon.cmo neon-gen.ml The following intrinsics are defined: vrnd_f32 (float32x2_t a) (generating a vrintz instruction) vrndq_f32 (float32x4_t a) (generating a vrintz instruction) vrnda_f32 (float32x2_t a) (generating a vrinta instruction) vrndqa_f32 (float32x4_t a) (generating a vrinta instruction) vrndm_f32 (float32x2_t a) (generating a vrintm instruction) vrndqm_f32 (float32x4_t a) (generating a vrintm instruction) vrndn_f32 (float32x2_t a) (generating a vrintn instruction) vrndqn_f32 (float32x4_t a) (generating a vrintn instruction) vrndp_f32 (float32x2_t a) (generating a vrintp instruction) vrndqp_f32 (float32x4_t a) (generating a vrintp instruction) Note that AArch32 NEON does not support double precision floats, so we don't have _f64 versions. Tested on arm-none-eabi. New tests pass, no regressions (once the effective target checks patch is added). Ok for trunk? Thanks, Kyrill 2012-11-29 Kyrylo Tkachov * config/arm/neon.ml (opcode): Add Vrintn, Vrinta, Vrintp, Vrintm, Vrintz to type. (type features): Add Requires_arch type constructor. (ops): Define Vrintn, Vrinta, Vrintp, Vrintm, Vrintz features. * config/arm/neon-docgen.ml (intrinsic_groups): Define Vrintn, Vrinta, Vrintp, Vrintm, Vrintz, Vrintx. * config/arm/neon-testgen.ml (effective_target): Define check for Requires_arch 8. * config/arm/neon-gen.ml (print_feature_test_start): Handle Requires_arch. (print_feature_test_end): Likewise. * doc/arm-neon-intrinsics.texi: Regenerate. * config/arm/arm_neon.h: Regenerate. gcc/testsuite/ChangeLog 2012-11-29 Kyrylo Tkachov * gcc.target/arm/neon/vrndaf32.c: New test. * gcc.target/arm/neon/vrndqaf32.c: Likewise. * gcc.target/arm/neon/vrndf32.c: Likewise. * gcc.target/arm/neon/vrndqf32.c: Likewise. * gcc.target/arm/neon/vrndmf32.c: Likewise. * gcc.target/arm/neon/vrndqmf32.c: Likewise. * gcc.target/arm/neon/vrndnf32.c: Likewise. * gcc.target/arm/neon/vrndqnf32.c: Likewise. * gcc.target/arm/neon/vrndpf32.c: Likewise. * gcc.target/arm/neon/vrndqpf32.c: Likewise. diff --git a/gcc/config/arm/neon-docgen.ml b/gcc/config/arm/neon-docgen.ml index 043b1e0..228de16 100644 --- a/gcc/config/arm/neon-docgen.ml +++ b/gcc/config/arm/neon-docgen.ml @@ -105,6 +105,11 @@ let intrinsic_groups = "Multiply-subtract", single_opcode Vmls; "Fused-multiply-accumulate", single_opcode Vfma; "Fused-multiply-subtract", single_opcode Vfms; + "Round to integral (to nearest, ties to even)", single_opcode Vrintn; + "Round to integral (to nearest, ties away from zero)", single_opcode Vrinta; + "Round to integral (towards +Inf)", single_opcode Vrintp; + "Round to integral (towards -Inf)", single_opcode Vrintm; + "Round to integral (towards 0)", single_opcode Vrintz; "Subtraction", single_opcode Vsub; "Comparison (equal-to)", single_opcode Vceq; "Comparison (greater-than-or-equal-to)", single_opcode Vcge; diff --git a/gcc/config/arm/neon-gen.ml b/gcc/config/arm/neon-gen.ml index 6c4e272..c5f0583 100644 --- a/gcc/config/arm/neon-gen.ml +++ b/gcc/config/arm/neon-gen.ml @@ -290,17 +290,21 @@ let print_feature_test_start features = try match List.find (fun feature -> match feature with Requires_feature _ -> true + | Requires_arch _ -> true | _ -> false) features with Requires_feature feature -> Format.printf "#ifdef __ARM_FEATURE_%s@\n" feature + | Requires_arch arch -> + Format.printf "#if __ARM_ARCH >= %d@\n" arch | _ -> assert false with Not_found -> assert true let print_feature_test_end features = let feature = List.exists (function Requires_feature x -> true - | _ -> false) features in + | Requires_arch x -> true + | _ -> false) features in if feature then Format.printf "#endif@\n" diff --git a/gcc/config/arm/neon-testgen.ml b/gcc/config/arm/neon-testgen.ml index 4645f39..f6c8d9a 100644 --- a/gcc/config/arm/neon-testgen.ml +++ b/gcc/config/arm/neon-testgen.ml @@ -162,9 +162,11 @@ let effective_target features = try match List.find (fun feature -> match feature with Requires_feature _ -> true + | Requires_arch _ -> true | _ -> false) features with Requires_feature "FMA" -> "arm_neonv2" + | Requires_arch 8 -> "arm_v8_neon" | _ -> assert false with Not_found -> "arm_neon" diff --git a/gcc/config/arm/neon.ml b/gcc/config/arm/neon.ml index 101f8f6..c968f6d 100644 --- a/gcc/config/arm/neon.ml +++ b/gcc/config/arm/neon.ml @@ -152,6 +152,11 @@ type opcode = | Vqdmulh_n | Vqdmulh_lane (* Unary ops. *) + | Vrintn + | Vrinta + | Vrintp + | Vrintm + | Vrintz | Vabs | Vneg | Vcls @@ -279,6 +285,7 @@ type features = | Fixed_core_reg (* Mark that the intrinsic requires __ARM_FEATURE_string to be defined. *) | Requires_feature of string + | Requires_arch of int exception MixedMode of elts * elts @@ -812,6 +819,27 @@ let ops = Vfms, [Requires_feature "FMA"], All (3, Dreg), "vfms", elts_same_io, [F32]; Vfms, [Requires_feature "FMA"], All (3, Qreg), "vfmsQ", elts_same_io, [F32]; + (* Round to integral. *) + Vrintn, [Builtin_name "vrintn"; Requires_arch 8], Use_operands [| Dreg; Dreg |], + "vrndn", elts_same_1, [F32]; + Vrintn, [Builtin_name "vrintn"; Requires_arch 8], Use_operands [| Qreg; Qreg |], + "vrndqn", elts_same_1, [F32]; + Vrinta, [Builtin_name "vrinta"; Requires_arch 8], Use_operands [| Dreg; Dreg |], + "vrnda", elts_same_1, [F32]; + Vrinta, [Builtin_name "vrinta"; Requires_arch 8], Use_operands [| Qreg; Qreg |], + "vrndqa", elts_same_1, [F32]; + Vrintp, [Builtin_name "vrintp"; Requires_arch 8], Use_operands [| Dreg; Dreg |], + "vrndp", elts_same_1, [F32]; + Vrintp, [Builtin_name "vrintp"; Requires_arch 8], Use_operands [| Qreg; Qreg |], + "vrndqp", elts_same_1, [F32]; + Vrintm, [Builtin_name "vrintm"; Requires_arch 8], Use_operands [| Dreg; Dreg |], + "vrndm", elts_same_1, [F32]; + Vrintm, [Builtin_name "vrintm"; Requires_arch 8], Use_operands [| Qreg; Qreg |], + "vrndqm", elts_same_1, [F32]; + Vrintz, [Builtin_name "vrintz"; Requires_arch 8], Use_operands [| Dreg; Dreg |], + "vrnd", elts_same_1, [F32]; + Vrintz, [Builtin_name "vrintz"; Requires_arch 8], Use_operands [| Qreg; Qreg |], + "vrndq", elts_same_1, [F32]; (* Subtraction. *) Vsub, [], All (3, Dreg), "vsub", sign_invar_2, F32 :: su_8_32; Vsub, [No_op], All (3, Dreg), "vsub", sign_invar_2, [S64; U64];