From patchwork Sun Oct 2 08:30:50 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Ira Rosen
X-Patchwork-Id: 117283
Date: Sun, 2 Oct 2011 11:30:50 +0300
Subject: Re: [patch] Support vectorization of widening shifts
From: Ira Rosen
To: Ramana Radhakrishnan
Cc: gcc-patches@gcc.gnu.org, Patch Tracking

On 29 September 2011 17:30, Ramana Radhakrishnan wrote:
> On 19 September 2011 08:54, Ira Rosen wrote:
>
>>
>> Bootstrapped on powerpc64-suse-linux, tested on powerpc64-suse-linux
>> and arm-linux-gnueabi
>> OK for mainline?
>
> Sorry I missed this patch. Is there any reason why we need unspecs in
> this case ?
> Can't this be represented by subregs and zero/ sign
> extensions in RTL without the UNSPECs ?

Like this:

; because the ordering of vector elements in Q registers is different from what
; the semantics of the instructions require.

?

Thanks,
Ira

>
> cheers
> Ramana
>
>>
>> Thanks,
>> Ira
>>
>> ChangeLog:
>>
>>        * doc/md.texi (vec_widen_ushiftl_hi, vec_widen_ushiftl_lo,
>>        vec_widen_sshiftl_hi, vec_widen_sshiftl_lo): Document.
>>        * tree-pretty-print.c (dump_generic_node): Handle WIDEN_SHIFT_LEFT_EXPR,
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        (op_code_prio): Likewise.
>>        (op_symbol_code): Handle WIDEN_SHIFT_LEFT_EXPR.
>>        * optabs.c (optab_for_tree_code): Handle
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        (init-optabs): Initialize optab codes for vec_widen_u/sshiftl_hi/lo.
>>        * optabs.h (enum optab_index): Add OTI_vec_widen_u/sshiftl_hi/lo.
>>        * genopinit.c (optabs): Initialize the new optabs.
>>        * expr.c (expand_expr_real_2): Handle
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        * gimple-pretty-print.c (dump_binary_rhs): Likewise.
>>        * tree-vectorizer.h (NUM_PATTERNS): Increase to 6.
>>        * tree.def (WIDEN_SHIFT_LEFT_EXPR, VEC_WIDEN_SHIFT_LEFT_HI_EXPR,
>>        VEC_WIDEN_SHIFT_LEFT_LO_EXPR): New.
>>        * cfgexpand.c (expand_debug_expr): Handle new tree codes.
>>        * tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
>>        vect_recog_widen_shift_pattern.
>>        (vect_handle_widen_mult_by_const): Rename...
>>        (vect_handle_widen_op_by_const): ...to this.  Handle shifts.
>>        Add a new argument, update documentation.
>>        (vect_recog_widen_mult_pattern): Assume that only second
>>        operand can be constant.  Update call to
>>        vect_handle_widen_op_by_const.
>>        (vect_operation_fits_smaller_type): Add the already existing
>>        def stmt to the list of pattern statements.
>>        (vect_recog_widen_shift_pattern): New.
>>        * tree-vect-stmts.c (vectorizable_type_promotion): Handle
>>        widening shifts.
>>        (supportable_widening_operation): Likewise.
>>        * tree-inline.c (estimate_operator_cost): Handle new tree codes.
>>        * tree-vect-generic.c (expand_vector_operations_1): Likewise.
>>        * tree-cfg.c (verify_gimple_assign_binary): Likewise.
>>        * config/arm/neon.md (neon_vec_shiftl_lo_): New.
>>        (vec_widen_shiftl_lo_, neon_vec_shiftl_hi_,
>>        vec_widen_shiftl_hi_, neon_vec_shift_left_):
>>        Likewise.
>>        * tree-vect-slp.c (vect_build_slp_tree): Require same shift operand
>>        for widening shift.
>>
>> testsuite/ChangeLog:
>>
>>       * gcc.dg/vect/vect-widen-shift-s16.c: New.
>>       * gcc.dg/vect/vect-widen-shift-s8.c: New.
>>       * gcc.dg/vect/vect-widen-shift-u16.c: New.
>>       * gcc.dg/vect/vect-widen-shift-u8.c: New.
>>
>

Index: config/arm/neon.md
===================================================================
--- config/arm/neon.md (revision 178942)
+++ config/arm/neon.md (working copy)
@@ -5550,6 +5550,46 @@
 }
 )

+(define_insn "neon_vec_shiftl_"
+ [(set (match_operand: 0 "register_operand" "=w")
+       (SE: (match_operand:VW 1 "register_operand" "w")))
+  (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON"
+{
+  /* The boundaries are: 0 < imm <= size.  */
+  neon_const_bounds (operands[2], 0, neon_element_bits (mode) + 1);
+  return "vshll. %q0, %P1, %2";
+}
+ [(set_attr "neon_type" "neon_shift_1")]
+)
+
+(define_expand "vec_widen_shiftl_lo_"
+ [(match_operand: 0 "register_operand" "")
+  (SE: (match_operand:VU 1 "register_operand" ""))
+  (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON && !BYTES_BIG_ENDIAN"
+ {
+  emit_insn (gen_neon_vec_shiftl_ (operands[0],
+             simplify_gen_subreg (mode, operands[1], mode, 0),
+             operands[2]));
+  DONE;
+ }
+)
+
+(define_expand "vec_widen_shiftl_hi_"
+ [(match_operand: 0 "register_operand" "")
+  (SE: (match_operand:VU 1 "register_operand" ""))
+  (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON && !BYTES_BIG_ENDIAN"
+ {
+  emit_insn (gen_neon_vec_shiftl_ (operands[0],
+             simplify_gen_subreg (mode, operands[1], mode,
+                                  GET_MODE_SIZE (mode)),
+             operands[2]));
+  DONE;
+ }
+)
+
 ;; Vectorize for non-neon-quad case
 (define_insn "neon_unpack_"
  [(set (match_operand: 0 "register_operand" "=w")
@@ -5626,6 +5666,34 @@
 }
 )

+(define_expand "vec_widen_shiftl_hi_"
+ [(match_operand: 0 "register_operand" "")
+  (SE: (match_operand:VDI 1 "register_operand" ""))
+  (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON"
+ {
+  rtx tmpreg = gen_reg_rtx (mode);
+  emit_insn (gen_neon_vec_shiftl_ (tmpreg, operands[1], operands[2]));
+  emit_insn (gen_neon_vget_high (operands[0], tmpreg));
+
+  DONE;
+ }
+)
+
+(define_expand "vec_widen_shiftl_lo_"
+ [(match_operand: 0 "register_operand" "")
+  (SE: (match_operand:VDI 1 "register_operand" ""))
+  (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON"
+ {
+  rtx tmpreg = gen_reg_rtx (mode);
+  emit_insn (gen_neon_vec_shiftl_ (tmpreg, operands[1], operands[2]));
+  emit_insn (gen_neon_vget_low (operands[0], tmpreg));
+
+  DONE;
+ }
+)
+
 ; FIXME: These instruction patterns can't be used safely in big-endian mode