From patchwork Mon Jan  7 13:11:43 2013
X-Patchwork-Submitter: Tejas Belagod
X-Patchwork-Id: 209916
Message-ID: <50EAC98F.2050302@arm.com>
Date: Mon, 07 Jan 2013 13:11:43 +0000
From: Tejas Belagod
To: "gcc-patches@gcc.gnu.org"
CC: Marcus Shawcroft
Subject: [Patch, AArch64-4.7] Implement vec_init.

Hi,

The attached patch implements vec_init for AArch64. It has been tested
on aarch64-none-elf with no regressions.

OK for aarch64-4.7-branch?

Thanks,
Tejas Belagod
ARM.

2013-01-07  Tejas Belagod

gcc/
	* config/aarch64/aarch64-simd.md (vec_init<mode>): New.
	* config/aarch64/aarch64-protos.h (aarch64_expand_vector_init):
	Declare.
	* config/aarch64/aarch64.c (aarch64_simd_dup_constant,
	aarch64_simd_make_constant, aarch64_expand_vector_init): New.
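For readers unfamiliar with the hook, a purely illustrative note (not part of
the patch): the middle end falls back to the vec_init standard pattern when it
has to build a vector value out of individual elements. A minimal C fragment,
using GCC's generic vector extensions, that is expected to reach the new
expander on AArch64 might look like the following (the type and function names
are made up for the example):

  /* Illustration only -- not part of the patch.  The brace initializer
     below is the kind of element-wise construction that vec_init<mode>
     ultimately has to expand.  */
  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  build_v4si (int a, int b, int c, int d)
  {
    v4si v = { a, b, c, d };
    return v;
  }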
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index c09330a..4ecbf5f 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -138,6 +138,7 @@ HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
 bool aarch64_bitmask_imm (HOST_WIDE_INT val, enum machine_mode);
 bool aarch64_const_double_zero_rtx_p (rtx);
 bool aarch64_constant_address_p (rtx);
+void aarch64_expand_vector_init (rtx, rtx);
 bool aarch64_function_arg_regno_p (unsigned);
 bool aarch64_gen_movmemqi (rtx *);
 bool aarch64_is_extend_from_extract (enum machine_mode, rtx, rtx);
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index febf71d..c630808 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3527,3 +3527,14 @@
   DONE;
 })
 
+;; Standard pattern name vec_init<mode>.
+
+(define_expand "vec_init<mode>"
+  [(match_operand:VALL 0 "register_operand" "")
+   (match_operand 1 "" "")]
+  "TARGET_SIMD"
+{
+  aarch64_expand_vector_init (operands[0], operands[1]);
+  DONE;
+})
+
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9b7bed1..f38df32 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6423,6 +6423,166 @@ aarch64_simd_vector_alignment_reachable (const_tree type, bool is_packed)
   return true;
 }
 
+/* If VALS is a vector constant that can be loaded into a register
+   using DUP, generate instructions to do so and return an RTX to
+   assign to the register.  Otherwise return NULL_RTX.  */
+static rtx
+aarch64_simd_dup_constant (rtx vals)
+{
+  enum machine_mode mode = GET_MODE (vals);
+  enum machine_mode inner_mode = GET_MODE_INNER (mode);
+  int n_elts = GET_MODE_NUNITS (mode);
+  bool all_same = true;
+  rtx x;
+  int i;
+
+  if (GET_CODE (vals) != CONST_VECTOR)
+    return NULL_RTX;
+
+  for (i = 1; i < n_elts; ++i)
+    {
+      x = CONST_VECTOR_ELT (vals, i);
+      if (!rtx_equal_p (x, CONST_VECTOR_ELT (vals, 0)))
+        all_same = false;
+    }
+
+  if (!all_same)
+    return NULL_RTX;
+
+  /* We can load this constant by using DUP and a constant in a
+     single ARM register.  This will be cheaper than a vector
+     load.  */
+  x = copy_to_mode_reg (inner_mode, CONST_VECTOR_ELT (vals, 0));
+  return gen_rtx_VEC_DUPLICATE (mode, x);
+}
+
+
+/* Generate code to load VALS, which is a PARALLEL containing only
+   constants (for vec_init) or CONST_VECTOR, efficiently into a
+   register.  Returns an RTX to copy into the register, or NULL_RTX
+   for a PARALLEL that can not be converted into a CONST_VECTOR.  */
+rtx
+aarch64_simd_make_constant (rtx vals)
+{
+  enum machine_mode mode = GET_MODE (vals);
+  rtx const_dup;
+  rtx const_vec = NULL_RTX;
+  int n_elts = GET_MODE_NUNITS (mode);
+  int n_const = 0;
+  int i;
+
+  if (GET_CODE (vals) == CONST_VECTOR)
+    const_vec = vals;
+  else if (GET_CODE (vals) == PARALLEL)
+    {
+      /* A CONST_VECTOR must contain only CONST_INTs and
+         CONST_DOUBLEs, but CONSTANT_P allows more (e.g. SYMBOL_REF).
+         Only store valid constants in a CONST_VECTOR.  */
+      for (i = 0; i < n_elts; ++i)
+        {
+          rtx x = XVECEXP (vals, 0, i);
+          if (CONST_INT_P (x) || CONST_DOUBLE_P (x))
+            n_const++;
+        }
+      if (n_const == n_elts)
+        const_vec = gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0));
+    }
+  else
+    gcc_unreachable ();
+
+  if (const_vec != NULL_RTX
+      && aarch64_simd_immediate_valid_for_move (const_vec, mode, NULL, NULL,
+                                                NULL, NULL, NULL))
+    /* Load using MOVI/MVNI.  */
+    return const_vec;
+  else if ((const_dup = aarch64_simd_dup_constant (vals)) != NULL_RTX)
+    /* Loaded using DUP.  */
+    return const_dup;
+  else if (const_vec != NULL_RTX)
+    /* Load from constant pool.  We can not take advantage of single-cycle
+       LD1 because we need a PC-relative addressing mode.  */
+    return const_vec;
+  else
+    /* A PARALLEL containing something not valid inside CONST_VECTOR.
+       We can not construct an initializer.  */
+    return NULL_RTX;
+}
+
+void
+aarch64_expand_vector_init (rtx target, rtx vals)
+{
+  enum machine_mode mode = GET_MODE (target);
+  enum machine_mode inner_mode = GET_MODE_INNER (mode);
+  int n_elts = GET_MODE_NUNITS (mode);
+  int n_var = 0, one_var = -1;
+  bool all_same = true;
+  rtx x, mem;
+  int i;
+
+  x = XVECEXP (vals, 0, 0);
+  if (!CONST_INT_P (x) && !CONST_DOUBLE_P (x))
+    n_var = 1, one_var = 0;
+
+  for (i = 1; i < n_elts; ++i)
+    {
+      x = XVECEXP (vals, 0, i);
+      if (!CONST_INT_P (x) && !CONST_DOUBLE_P (x))
+        ++n_var, one_var = i;
+
+      if (!rtx_equal_p (x, XVECEXP (vals, 0, 0)))
+        all_same = false;
+    }
+
+  if (n_var == 0)
+    {
+      rtx constant = aarch64_simd_make_constant (vals);
+      if (constant != NULL_RTX)
+        {
+          emit_move_insn (target, constant);
+          return;
+        }
+    }
+
+  /* Splat a single non-constant element if we can.  */
+  if (all_same)
+    {
+      x = copy_to_mode_reg (inner_mode, XVECEXP (vals, 0, 0));
+      aarch64_emit_move (target, gen_rtx_VEC_DUPLICATE (mode, x));
+      return;
+    }
+
+  /* One field is non-constant.  Load constant then overwrite varying
+     field.  This is more efficient than using the stack.  */
+  if (n_var == 1)
+    {
+      rtx copy = copy_rtx (vals);
+      rtx index = GEN_INT (one_var);
+      enum insn_code icode;
+
+      /* Load constant part of vector, substitute neighboring value for
+         varying element.  */
+      XVECEXP (copy, 0, one_var) = XVECEXP (vals, 0, one_var ^ 1);
+      aarch64_expand_vector_init (target, copy);
+
+      /* Insert variable.  */
+      x = copy_to_mode_reg (inner_mode, XVECEXP (vals, 0, one_var));
+      icode = optab_handler (vec_set_optab, mode);
+      gcc_assert (icode != CODE_FOR_nothing);
+      emit_insn (GEN_FCN (icode) (target, x, index));
+      return;
+    }
+
+  /* Construct the vector in memory one field at a time
+     and load the whole vector.  */
+  mem = assign_stack_temp (mode, GET_MODE_SIZE (mode), 0);
+  for (i = 0; i < n_elts; i++)
+    emit_move_insn (adjust_address_nv (mem, inner_mode,
+                                       i * GET_MODE_SIZE (inner_mode)),
+                    XVECEXP (vals, 0, i));
+  emit_move_insn (target, mem);
+
+}
+
 static unsigned HOST_WIDE_INT
 aarch64_shift_truncation_mask (enum machine_mode mode)
 {
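As a rough guide to the strategies distinguished in aarch64_expand_vector_init
above, here is an illustrative (not part of the patch, and the exact
instructions chosen depend on the rest of the backend) set of initializer
shapes and the lowering the new code is expected to pick for them:

  /* Illustration only.  Expected lowering per the expander logic above.  */
  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  splat (int a)
  {
    /* all_same with a variable element: DUP from a general register.  */
    return (v4si) { a, a, a, a };
  }

  v4si
  all_const (void)
  {
    /* n_var == 0: MOVI/MVNI when the immediate is encodable, otherwise
       a literal-pool load of the CONST_VECTOR.  */
    return (v4si) { 1, 1, 1, 1 };
  }

  v4si
  one_var (int a)
  {
    /* n_var == 1: load the constant part first, then overwrite the
       varying lane through the vec_set pattern (INS).  */
    return (v4si) { a, 2, 3, 4 };
  }

Anything more general than these cases falls through to the final branch,
which builds the vector in a stack temporary one element at a time and then
loads the whole vector.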