From patchwork Sun Oct  2 08:30:50 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Ira Rosen <ira.rosen@linaro.org>
X-Patchwork-Id: 117283
Return-Path: 
 <gcc-patches-return-303236-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	by ozlabs.org (Postfix) with SMTP id EA151B6F67
	for <incoming@patchwork.ozlabs.org>;
	Sun,  2 Oct 2011 19:31:21 +1100 (EST)
Received: (qmail 29999 invoked by alias); 2 Oct 2011 08:31:13 -0000
Received: (qmail 29983 invoked by uid 22791); 2 Oct 2011 08:31:07 -0000
X-SWARE-Spam-Status: No, hits=-2.4 required=5.0	tests=AWL, BAYES_00,
	RCVD_IN_DNSWL_LOW, TW_TM
X-Spam-Check-By: sourceware.org
Received: from mail-yw0-f47.google.com (HELO mail-yw0-f47.google.com)
	(209.85.213.47) by sourceware.org (qpsmtpd/0.43rc1) with
	ESMTP; Sun, 02 Oct 2011 08:30:51 +0000
Received: by ywf7 with SMTP id 7so3084398ywf.20 for
	<gcc-patches@gcc.gnu.org>; Sun, 02 Oct 2011 01:30:50 -0700 (PDT)
Received: by 10.236.133.145 with SMTP id q17mr10034829yhi.58.1317544250487;
	Sun, 02 Oct 2011 01:30:50 -0700 (PDT)
Received: by 10.147.99.14 with HTTP; Sun, 2 Oct 2011 01:30:50 -0700 (PDT)
In-Reply-To: 
 <CACUk7=WOuecU95M061T7d2arOhRKVrvHKhf8qu=qDVQ0BXtZHA@mail.gmail.com>
References: 
 <CAKSNEw512ANrc1U=9MwOfSHV-x5Jou2gad09pNz7Jvi_tt1gHA@mail.gmail.com>
	<CACUk7=WOuecU95M061T7d2arOhRKVrvHKhf8qu=qDVQ0BXtZHA@mail.gmail.com>
Date: Sun, 2 Oct 2011 11:30:50 +0300
Message-ID: 
 <CAKSNEw4m5FirSqekmAyRi5SWqPidqz2kb2=X1r1WQrjwVEjStw@mail.gmail.com>
Subject: Re: [patch] Support vectorization of widening shifts
From: Ira Rosen <ira.rosen@linaro.org>
To: Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
Cc: gcc-patches@gcc.gnu.org, Patch Tracking <patches@linaro.org>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

On 29 September 2011 17:30, Ramana Radhakrishnan
<ramana.radhakrishnan@linaro.org> wrote:
> On 19 September 2011 08:54, Ira Rosen <ira.rosen@linaro.org> wrote:
>
>>
>> Bootstrapped on powerpc64-suse-linux, tested on powerpc64-suse-linux
>> and arm-linux-gnueabi
>> OK for mainline?
>
> Sorry I missed this patch. Is there any reason why we need unspecs in
> this case? Can't this be represented by subregs and zero/sign
> extensions in RTL without the UNSPECs?

Like this:

 ; because the ordering of vector elements in Q registers is different from what
 ; the semantics of the instructions require.

?

Thanks,
Ira


>
> cheers
> Ramana
>
>>
>> Thanks,
>> Ira
>>
>> ChangeLog:
>>
>>        * doc/md.texi (vec_widen_ushiftl_hi, vec_widen_ushiftl_lo,
>>        vec_widen_sshiftl_hi, vec_widen_sshiftl_lo): Document.
>>        * tree-pretty-print.c (dump_generic_node): Handle WIDEN_SHIFT_LEFT_EXPR,
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        (op_code_prio): Likewise.
>>        (op_symbol_code): Handle WIDEN_SHIFT_LEFT_EXPR.
>>        * optabs.c (optab_for_tree_code): Handle
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        (init_optabs): Initialize optab codes for vec_widen_u/sshiftl_hi/lo.
>>        * optabs.h (enum optab_index): Add OTI_vec_widen_u/sshiftl_hi/lo.
>>        * genopinit.c (optabs): Initialize the new optabs.
>>        * expr.c (expand_expr_real_2): Handle
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        * gimple-pretty-print.c (dump_binary_rhs): Likewise.
>>        * tree-vectorizer.h (NUM_PATTERNS): Increase to 6.
>>        * tree.def (WIDEN_SHIFT_LEFT_EXPR, VEC_WIDEN_SHIFT_LEFT_HI_EXPR,
>>        VEC_WIDEN_SHIFT_LEFT_LO_EXPR): New.
>>        * cfgexpand.c (expand_debug_expr): Handle new tree codes.
>>        * tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
>>        vect_recog_widen_shift_pattern.
>>        (vect_handle_widen_mult_by_const): Rename...
>>        (vect_handle_widen_op_by_const): ...to this.  Handle shifts.
>>        Add a new argument, update documentation.
>>        (vect_recog_widen_mult_pattern): Assume that only second
>>        operand can be constant.  Update call to
>>        vect_handle_widen_op_by_const.
>>        (vect_operation_fits_smaller_type): Add the already existing
>>        def stmt to the list of pattern statements.
>>        (vect_recog_widen_shift_pattern): New.
>>        * tree-vect-stmts.c (vectorizable_type_promotion): Handle
>>        widening shifts.
>>        (supportable_widening_operation): Likewise.
>>        * tree-inline.c (estimate_operator_cost): Handle new tree codes.
>>        * tree-vect-generic.c (expand_vector_operations_1): Likewise.
>>        * tree-cfg.c (verify_gimple_assign_binary): Likewise.
>>        * config/arm/neon.md (neon_vec_<US>shiftl_lo_<mode>): New.
>>        (vec_widen_<US>shiftl_lo_<mode>, neon_vec_<US>shiftl_hi_<mode>,
>>        vec_widen_<US>shiftl_hi_<mode>, neon_vec_<US>shift_left_<mode>):
>>        Likewise.
>>        * tree-vect-slp.c (vect_build_slp_tree): Require same shift operand
>>        for widening shift.
>>
>> testsuite/ChangeLog:
>>
>>       * gcc.dg/vect/vect-widen-shift-s16.c: New.
>>       * gcc.dg/vect/vect-widen-shift-s8.c: New.
>>       * gcc.dg/vect/vect-widen-shift-u16.c: New.
>>       * gcc.dg/vect/vect-widen-shift-u8.c: New.
>>
>
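
For readers following the thread, the element-wise behaviour that the
vec_widen_<US>shiftl_lo/hi expanders are meant to provide can be modelled in
plain C. This is only an illustrative sketch and is not part of the patch:
the function names are hypothetical, it covers just the unsigned 8-bit to
16-bit case, and it assumes little-endian element numbering (the "lo" half is
elements 0..7 of a 16-element vector), which matches the !BYTES_BIG_ENDIAN
condition on the Q-register expanders below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model: widen each 8-bit element to 16 bits, then
   shift left by a constant.  The assert rejects shift counts larger than
   the element size (8 bits), the upper limit of the immediate range that
   the insn pattern bounds-checks.  */
static void
widen_shift_left_u8 (const uint8_t *in, uint16_t *out, size_t n,
                     unsigned shift)
{
  assert (shift <= 8);
  for (size_t i = 0; i < n; i++)
    out[i] = (uint16_t) ((uint16_t) in[i] << shift);
}

/* Model of the "lo" variant: the first eight elements of the input.  */
static void
model_widen_ushiftl_lo (const uint8_t in[16], uint16_t out[8],
                        unsigned shift)
{
  widen_shift_left_u8 (in, out, 8, shift);
}

/* Model of the "hi" variant: the last eight elements of the input.  */
static void
model_widen_ushiftl_hi (const uint8_t in[16], uint16_t out[8],
                        unsigned shift)
{
  widen_shift_left_u8 (in + 8, out, 8, shift);
}
```

Because the widening happens before the shift, no bits are lost even when the
shift count equals the source element size.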

Index: config/arm/neon.md
===================================================================
--- config/arm/neon.md  (revision 178942)
+++ config/arm/neon.md  (working copy)
@@ -5550,6 +5550,46 @@
  }
 )

+(define_insn "neon_vec_<US>shiftl_<mode>"
+ [(set (match_operand:<V_widen> 0 "register_operand" "=w")
+       (SE:<V_widen> (match_operand:VW 1 "register_operand" "w")))
+       (match_operand:SI 2 "immediate_operand" "i")]
+  "TARGET_NEON"
+{
+  /* The boundaries are: 0 < imm <= size.  */
+  neon_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode) + 1);
+  return "vshll.<US><V_sz_elem> %q0, %P1, %2";
+}
+  [(set_attr "neon_type" "neon_shift_1")]
+)
+
+(define_expand "vec_widen_<US>shiftl_lo_<mode>"
+  [(match_operand:<V_unpack> 0 "register_operand" "")
+   (SE:<V_unpack> (match_operand:VU 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON && !BYTES_BIG_ENDIAN"
+ {
+  emit_insn (gen_neon_vec_<US>shiftl_<V_half> (operands[0],
+               simplify_gen_subreg (<V_HALF>mode, operands[1], <MODE>mode, 0),
+               operands[2]));
+   DONE;
+ }
+)
+
+(define_expand "vec_widen_<US>shiftl_hi_<mode>"
+  [(match_operand:<V_unpack> 0 "register_operand" "")
+   (SE:<V_unpack> (match_operand:VU 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON && !BYTES_BIG_ENDIAN"
+ {
+  emit_insn (gen_neon_vec_<US>shiftl_<V_half> (operands[0],
+                simplify_gen_subreg (<V_HALF>mode, operands[1], <MODE>mode,
+                                    GET_MODE_SIZE (<V_HALF>mode)),
+                operands[2]));
+   DONE;
+ }
+)
+
 ;; Vectorize for non-neon-quad case
 (define_insn "neon_unpack<US>_<mode>"
  [(set (match_operand:<V_widen> 0 "register_operand" "=w")
@@ -5626,6 +5666,34 @@
  }
 )

+(define_expand "vec_widen_<US>shiftl_hi_<mode>"
+ [(match_operand:<V_double_width> 0 "register_operand" "")
+   (SE:<V_double_width> (match_operand:VDI 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON"
+ {
+   rtx tmpreg = gen_reg_rtx (<V_widen>mode);
+   emit_insn (gen_neon_vec_<US>shiftl_<mode> (tmpreg, operands[1], operands[2]));
+   emit_insn (gen_neon_vget_high<V_widen_l> (operands[0], tmpreg));
+
+   DONE;
+ }
+)
+
+(define_expand "vec_widen_<US>shiftl_lo_<mode>"
+  [(match_operand:<V_double_width> 0 "register_operand" "")
+   (SE:<V_double_width> (match_operand:VDI 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON"
+ {
+   rtx tmpreg = gen_reg_rtx (<V_widen>mode);
+   emit_insn (gen_neon_vec_<US>shiftl_<mode> (tmpreg, operands[1], operands[2]));
+   emit_insn (gen_neon_vget_low<V_widen_l> (operands[0], tmpreg));
+
+   DONE;
+ }
+)
+
 ; FIXME: These instruction patterns can't be used safely in big-endian mode