From patchwork Fri Aug 26 14:57:54 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ramana Radhakrishnan
X-Patchwork-Id: 111792
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
 by ozlabs.org (Postfix) with SMTP id 0AF7EB6F72
 for ; Sat, 27 Aug 2011 00:58:15 +1000 (EST)
Received: (qmail 29163 invoked by alias); 26 Aug 2011 14:58:13 -0000
Received: (qmail 29154 invoked by uid 22791); 26 Aug 2011 14:58:12 -0000
X-SWARE-Spam-Status: No, hits=-2.5 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW
X-Spam-Check-By: sourceware.org
Received: from mail-ew0-f47.google.com (HELO mail-ew0-f47.google.com) (209.85.215.47)
 by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 26 Aug 2011 14:57:57 +0000
Received: by ewy5 with SMTP id 5so1700377ewy.20
 for ; Fri, 26 Aug 2011 07:57:56 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.42.154.136 with SMTP id q8mr1267657icw.109.1314370674135;
 Fri, 26 Aug 2011 07:57:54 -0700 (PDT)
Received: by 10.231.31.4 with HTTP; Fri, 26 Aug 2011 07:57:54 -0700 (PDT)
In-Reply-To:
References:
Date: Fri, 26 Aug 2011 15:57:54 +0100
Message-ID:
Subject: Re: [Patch ARM] Fix vec_pack_trunc pattern for vectorize_with_neon_quad.
From: Ramana Radhakrishnan
To: gcc-patches
Cc: Patch Tracking, Ira Rosen
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

On 16 August 2011 15:20, Ramana Radhakrishnan wrote:
> Hi,
>
> While looking at a failure with regrename and
> mvectorize-with-neon-quad I noticed that the early-clobber in this
> vec_pack_trunc pattern is superfluous given that we can use
> reg_overlap_mentioned_p to decide in which order we want to emit these
> 2 instructions. While it works around the problem in regrename.c I
> still think that the behaviour in regrename is a bit suspicious and
> needs some more investigation.
>

RichardS finally fixed the problem in data-flow, so we should be
able to turn on vectorize_with_quad anyway.

Here's the patch that I originally meant to commit as a workaround.
However, I think it's better to split this further in the case where
the 2 source registers are equal, because otherwise you are pointlessly
creating a stall in the Neon pipe while waiting for the vmovn result
to arrive. Hence I'm not committing this patch.

Tests finished OK btw for this patch.

cheers
Ramana

index 24dd941..2c60c5f 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -5631,14 +5631,29 @@
 ; the semantics of the instructions require.
 
 (define_insn "vec_pack_trunc_"
- [(set (match_operand: 0 "register_operand" "=&w")
+ [(set (match_operand: 0 "register_operand" "=w")
        (vec_concat:
                (truncate:
                        (match_operand:VN 1 "register_operand" "w"))
                (truncate:
                        (match_operand:VN 2 "register_operand" "w"))))]
  "TARGET_NEON && !BYTES_BIG_ENDIAN"
- "vmovn.i\t%e0, %q1\;vmovn.i\t%f0, %q2"
+ {
+   /* If operand1 and operand2 are identical, then the second
+      narrowing operation isn't needed as the values obtained
+      in both parts of the destination q register are identical.
+      This precludes the need for an early clobber in the destination
+      operand.  */
+   if (rtx_equal_p (operands[1], operands[2]))
+     return "vmovn.i\\t%e0, %q1\;vmov.i\\t%f0, %e0";
+   else
+     {
+       if (reg_overlap_mentioned_p (operands[0], operands[2]))
+         return "vmovn.i\\t%f0, %q2\;vmovn.i\\t%e0, %q1";
+       else
+         return "vmovn.i\\t%e0, %q1\;vmovn.i\\t%f0, %q2";
+     }
+ }
  [(set_attr "neon_type" "neon_shift_1")
   (set_attr "length" "8")]
 )
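
For readers who want to convince themselves that the ordering logic in
the patch above makes the early clobber unnecessary, here is a rough
standalone C sketch. It is not GCC code: regfile, vmovn32, set_q32,
pack_trunc and pack_trunc_naive are hypothetical names invented for
this illustration, register overlap is modelled simply as equal Q
register numbers, and only the 32-bit to 16-bit element size is
modelled. It assumes that %e0 and %f0 name the low and high halves of
the destination Q register.

```c
#include <stdint.h>
#include <string.h>

/* Toy model (assumption): 16 Q registers, each held as eight 16-bit lanes. */
typedef struct { uint16_t q[16][8]; } regfile;

/* Hypothetical stand-in for a 32-bit vmovn: narrow the four 32-bit lanes
   of Q register qm into one half of Q register qd (half 0 models %e,
   half 1 models %f).  The whole source is read before anything is written. */
static void vmovn32(regfile *rf, int qd, int half, int qm)
{
    uint32_t src[4];
    memcpy(src, rf->q[qm], sizeof src);
    for (int i = 0; i < 4; i++)
        rf->q[qd][4 * half + i] = (uint16_t) src[i];
}

/* Helper for the illustration: load four 32-bit lanes into Q register q. */
static void set_q32(regfile *rf, int q,
                    uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t v[4] = { a, b, c, d };
    memcpy(rf->q[q], v, sizeof v);
}

/* Mirrors the three-way choice in the patch. */
static void pack_trunc(regfile *rf, int q0, int q1, int q2)
{
    if (q1 == q2) {
        /* Same source twice: narrow once, then copy low half to high half. */
        vmovn32(rf, q0, 0, q1);
        memcpy(&rf->q[q0][4], &rf->q[q0][0], 4 * sizeof(uint16_t));
    } else if (q0 == q2) {
        /* Destination overlaps operand 2: consume operand 2 first. */
        vmovn32(rf, q0, 1, q2);
        vmovn32(rf, q0, 0, q1);
    } else {
        vmovn32(rf, q0, 0, q1);
        vmovn32(rf, q0, 1, q2);
    }
}

/* Fixed order with no early clobber: wrong when q0 == q2, because the
   first narrowing overwrites the low half of operand 2 before it is read. */
static void pack_trunc_naive(regfile *rf, int q0, int q1, int q2)
{
    vmovn32(rf, q0, 0, q1);
    vmovn32(rf, q0, 1, q2);
}
```

Calling pack_trunc with the destination equal to operand 2 gives the
expected packed result, while pack_trunc_naive corrupts the high half
in that same case. That failure is what the "=&w" constraint guarded
against before the instruction order was chosen at output time.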