From patchwork Wed May 16 13:51:06 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 159676 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 302B7B6FDE for ; Wed, 16 May 2012 23:51:47 +1000 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1337781108; h=Comment: DomainKey-Signature:Received:Received:Received:Received:Received: Received:Received:Message-ID:Date:From:User-Agent:MIME-Version: To:Cc:Subject:References:In-Reply-To:Content-Type: Content-Transfer-Encoding:Mailing-List:Precedence:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:Sender: Delivered-To; bh=FRQF0uSl+4s8g9NsfaB7ongy+j8=; b=gEYeGynaBWci506 /t+fC6Yx8PMM+i2Au1vz0RnjZ/Ln8keFkp7DQVROLZkzrbQge7pdat6zZ24+q4VK SK4qXr/d4/otqQLVjza3rzNBPY+4AmccKr+uDq+ZOVXVQiTWmOuqpqlvskOH0cKb HKbAVd32I53PFBb6IM2HU+eTFi4Q= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Received:Received:Received:Message-ID:Date:From:User-Agent:MIME-Version:To:Cc:Subject:References:In-Reply-To:Content-Type:Content-Transfer-Encoding:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=Ljz+NfLo9N9veAfITvQVLWhfzGwqRtyzjvbAoG0/Bw/IG9Pc+dkmBDdXm9mvFE IawC682ka0gOLFnM9jOUWD3LZODz7va8FrUUHFqeeM4+LK46bEwsjl8hxvUVFf+m M/5czbimAbCcu/aoJqNpQe9N7aOMZv/bmh1xlPJ7Bt7k0=; Received: (qmail 19670 invoked by alias); 16 May 2012 13:51:41 -0000 Received: (qmail 19654 invoked by uid 22791); 16 May 2012 13:51:38 -0000 X-SWARE-Spam-Status: No, hits=-4.0 required=5.0 tests=AWL, BAYES_00, KHOP_RCVD_UNTRUST, KHOP_THREADED, RCVD_IN_HOSTKARMA_W, RCVD_IN_HOSTKARMA_WL X-Spam-Check-By: sourceware.org Received: from eu1sys200aog102.obsmtp.com (HELO eu1sys200aog102.obsmtp.com) (207.126.144.113) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 16 May 2012 13:51:25 +0000 Received: from beta.dmz-eu.st.com ([164.129.1.35]) (using TLSv1) by eu1sys200aob102.postini.com ([207.126.147.11]) with SMTP ID DSNKT7Ow2kUKOoXnF3i5GasQSARl8P5mV6a3@postini.com; Wed, 16 May 2012 13:51:24 UTC Received: from zeta.dmz-eu.st.com (zeta.dmz-eu.st.com [164.129.230.9]) by beta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 11F9F4A8; Wed, 16 May 2012 13:51:07 +0000 (GMT) Received: from Webmail-eu.st.com (safex1hubcas1.st.com [10.75.90.14]) by zeta.dmz-eu.st.com (STMicroelectronics) with ESMTP id A11232C19; Wed, 16 May 2012 13:51:07 +0000 (GMT) Received: from [164.129.122.162] (164.129.122.162) by webmail-eu.st.com (10.75.90.13) with Microsoft SMTP Server (TLS) id 8.3.192.1; Wed, 16 May 2012 15:51:07 +0200 Message-ID: <4FB3B0CA.1090605@st.com> Date: Wed, 16 May 2012 15:51:06 +0200 From: Christophe Lyon User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:12.0) Gecko/20120420 Thunderbird/12.0 MIME-Version: 1.0 To: Ramana Radhakrishnan Cc: "gcc-patches@gcc.gnu.org" Subject: Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin References: <4FAA445A.8080605@st.com> <4FABDF5F.6070105@st.com> In-Reply-To: X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org On 11.05.2012 16:48, Ramana Radhakrishnan wrote: > I would change the iterator from VQX to VQ in the pattern above (you > can also simplify the setting of neon_type in that case as well as > change that to be a vec_duplicate as below and get rid of any > lingering definitions of UNSPEC_VLD1_DUP if they exist), define a > separate pattern that expressed this as a define_insn_and_split as > below. > > (define_insn_and_split "neon_vld1_dupv2di" > [(set (match_operand:V2DI 0 "s_register_operand" "=w") > (vec_duplicate:V2DI (match_operand:DI 1 "neon_struct_operand" "Um")))] > "TARGET_NEON" > "#" > "&& reload_completed" > [(const_int 0)] > { > rtx tmprtx = gen_lowpart (DImode, operands[0]); > emit_insn (gen_neon_vld1_dupdi (tmprtx, operands[1])); > emit_move_insn (gen_highpart (DImode, operands[0]), tmprtx ); > DONE; > } > (set_attr "length" "8") > (set_attr "neon_type" ") > ) > > Do you want to try this and see what you get ? Thanks for this example and suggestion, it does work. > I'd rather have an extra regression test in gcc.target/arm that was a run time test. for e.g. take a look at gcc.target/arm/neon-vadds64.c . Here is an updated patch: 2012-05-16 Christophe Lyon * gcc/config/arm/neon.md (neon_vld1_dup): Restrict to VQ operands. (neon_vld1_dupv2di): New, fixes vld1q_dup_s64. * gcc/testsuite/gcc.target/arm/neon-vld1_dupQ.c: New test. Index: gcc/testsuite/gcc.target/arm/neon-vld1_dupQ.c =================================================================== --- gcc.orig/gcc/testsuite/gcc.target/arm/neon-vld1_dupQ.c (revision 0) +++ gcc.new/gcc/testsuite/gcc.target/arm/neon-vld1_dupQ.c (revision 0) @@ -0,0 +1,24 @@ +/* Test the `vld1q_s64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + int64x1_t input[2] = {(int64x1_t)0x0123456776543210LL, + (int64x1_t)0x89abcdeffedcba90LL}; + int64x1_t output[2] = {0, 0}; + int64x2_t var = vld1q_dup_s64(input); + + vst1q_s64(output, var); + if (output[0] != (int64x1_t)0x0123456776543210LL) + abort(); + if (output[1] != (int64x1_t)0x0123456776543210LL) + abort(); + return 0; +} Index: gcc/config/arm/neon.md =================================================================== --- gcc.orig/gcc/config/arm/neon.md (revision 2659) +++ gcc.new/gcc/config/arm/neon.md (working copy) @@ -4195,20 +4195,32 @@ ) (define_insn "neon_vld1_dup" - [(set (match_operand:VQX 0 "s_register_operand" "=w") - (unspec:VQX [(match_operand: 1 "neon_struct_operand" "Um")] + [(set (match_operand:VQ 0 "s_register_operand" "=w") + (unspec:VQ [(match_operand: 1 "neon_struct_operand" "Um")] UNSPEC_VLD1_DUP))] "TARGET_NEON" { - if (GET_MODE_NUNITS (mode) > 2) return "vld1.\t{%e0[], %f0[]}, %A1"; - else - return "vld1.\t%h0, %A1"; } [(set (attr "neon_type") - (if_then_else (gt (const_string "") (const_string "1")) - (const_string "neon_vld2_2_regs_vld1_vld2_all_lanes") - (const_string "neon_vld1_1_2_regs")))] + (const_string "neon_vld2_2_regs_vld1_vld2_all_lanes"))] +) + +(define_insn_and_split "neon_vld1_dupv2di" + [(set (match_operand:V2DI 0 "s_register_operand" "=w") + (vec_duplicate:V2DI (match_operand:DI 1 "neon_struct_operand" "Um")))] + "TARGET_NEON" + "#" + "&& reload_completed" + [(const_int 0)] + { + rtx tmprtx = gen_lowpart (DImode, operands[0]); + emit_insn (gen_neon_vld1_dupdi (tmprtx, operands[1])); + emit_move_insn (gen_highpart (DImode, operands[0]), tmprtx ); + DONE; + } + [(set_attr "length" "8") + (set (attr "neon_type") (const_string "neon_vld2_2_regs_vld1_vld2_all_lanes"))] ) (define_expand "vec_store_lanes"