From patchwork Tue Nov 19 16:43:19 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Earnshaw
X-Patchwork-Id: 292491
Message-ID: <528B9527.2050800@arm.com>
Date: Tue, 19 Nov 2013 16:43:19 +0000
From: Richard Earnshaw
To: Eric Botcazou, gcc-patches
Subject: [patch] regcprop fix for PR rtl-optimization/54300

PR 54300 is a problem in regcprop where the compiler sees

	(parallel [(set (x) (y))
	           (set (y) (x))])
	  (REG_UNUSED (y))

as a single-set insn (since the other output, y, is not used) and
replaces a later use of x with a use of y.  However, it fails to take
into account that y has been clobbered in the insn itself.

I considered changing single_set() to not return this case, but then
decided that would potentially cause missed optimization opportunities
in passes like combine, which do know how to deal with cases like this.

The fix consists of two parts:
a) Spotting the unused sets and ensuring that their values are killed in
   the value chains.
b) Disabling the simple-move optimization when we've killed something
   in a).

The test is unfortunately ARM-specific; I'm not aware of any generic
code that triggers this.

gcc/

	PR rtl-optimization/54300
	* regcprop.c (copyprop_hardreg_forward_1): Ensure any unused
	outputs in a single-set are killed from the value chains.

gcc/testsuite:

	PR rtl-optimization/54300
	* gcc.target/arm/pr54300.C: New test.

Bootstrapped/tested on x86_64 and tested on arm-eabi.

R.

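To make the failure mode described above concrete, here is a minimal toy model in C. It is only a sketch and is not GCC code: `value_data`, `kill_reg`, `record_copy`, `toy_insn`, and `value_seen_by_later_use` are hypothetical names invented for illustration, and the model assumes the dead output receives the old value of the destination (the swap case from the PR). It tracks a single "same value as" link per register, which is far simpler than the real value chains in regcprop.c.

```c
#include <assert.h>
#include <stdbool.h>

#define NREGS 4
#define NO_REG (-1)

/* Toy stand-in for value chains: same_as[r] names another register
   believed to hold the same value as r, or NO_REG.  (Hypothetical.)  */
struct value_data { int same_as[NREGS]; };

/* Forget everything known about register R.  */
static void kill_reg(struct value_data *vd, int r)
{
    assert(r >= 0 && r < NREGS);
    vd->same_as[r] = NO_REG;
    for (int i = 0; i < NREGS; i++)
        if (vd->same_as[i] == r)
            vd->same_as[i] = NO_REG;
}

/* Record a plain move DEST := SRC.  */
static void record_copy(struct value_data *vd, int dest, int src)
{
    kill_reg(vd, dest);
    vd->same_as[dest] = src;
}

/* One live set DEST := SRC, plus an optional dead output UNUSED
   (modelling a REG_UNUSED note), or NO_REG if there is none.  */
struct toy_insn { int dest, src, unused; };

/* Update the tracking state for INSN.  With APPLY_FIX, dead outputs
   are killed first, and the plain-move shortcut is skipped when the
   dead output is the source of the live set.  */
static void process_insn(struct value_data *vd, const struct toy_insn *insn,
                         bool apply_fix)
{
    bool plain_move = true;
    if (apply_fix && insn->unused != NO_REG) {
        kill_reg(vd, insn->unused);
        if (insn->unused == insn->src)
            plain_move = false;
    }
    if (plain_move)
        record_copy(vd, insn->dest, insn->src);
    else
        kill_reg(vd, insn->dest);
}

/* Run the swap insn from the PR and return the value that a later use
   of x reads after copy propagation has (possibly) rewritten it.  */
int value_seen_by_later_use(bool apply_fix)
{
    enum { X = 0, Y = 1 };
    int regs[NREGS] = { 10, 20, 0, 0 };
    struct value_data vd;
    for (int i = 0; i < NREGS; i++)
        vd.same_as[i] = NO_REG;

    struct toy_insn swap = { X, Y, Y };

    /* Execute with parallel semantics: read both inputs, then write.  */
    int old_src = regs[swap.src], old_dest = regs[swap.dest];
    regs[swap.dest] = old_src;
    regs[swap.unused] = old_dest;

    process_insn(&vd, &swap, apply_fix);

    /* A later use of x is rewritten to its recorded equivalent, if any.  */
    int use = (vd.same_as[X] != NO_REG) ? vd.same_as[X] : X;
    return regs[use];
}
```

In this model x should hold 20 after the swap. Without the fix the tracker believes x and y are equal, so the later use is redirected to y and reads the stale 10; with the fix no equivalence is recorded and the use reads 20 from x.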
Index: regcprop.c
===================================================================
--- regcprop.c	(revision 204974)
+++ regcprop.c	(working copy)
@@ -747,6 +747,7 @@ copyprop_hardreg_forward_1 (basic_block
       int n_ops, i, alt, predicated;
       bool is_asm, any_replacements;
       rtx set;
+      rtx link;
       bool replaced[MAX_RECOG_OPERANDS];
       bool changed = false;
       struct kill_set_value_data ksvd;
@@ -815,6 +816,23 @@ copyprop_hardreg_forward_1 (basic_block
           if (recog_op_alt[i][alt].earlyclobber)
             kill_value (recog_data.operand[i], vd);
 
+      /* If we have dead sets in the insn, then we need to note these as we
+         would clobbers.  */
+      for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
+        {
+          if (REG_NOTE_KIND (link) == REG_UNUSED)
+            {
+              kill_value (XEXP (link, 0), vd);
+              /* Furthermore, if the insn looked like a single-set,
+                 but the dead store kills the source value of that
+                 set, then we can no longer use the plain move
+                 special case below.  */
+              if (set
+                  && reg_overlap_mentioned_p (XEXP (link, 0), SET_SRC (set)))
+                set = NULL;
+            }
+        }
+
       /* Special-case plain move instructions, since we may well
          be able to do the move from a different register class.  */
       if (set && REG_P (SET_SRC (set)))
Index: testsuite/gcc.target/arm/pr54300.C
===================================================================
--- testsuite/gcc.target/arm/pr54300.C	(revision 0)
+++ testsuite/gcc.target/arm/pr54300.C	(revision 0)
@@ -0,0 +1,61 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+#include <stdlib.h>
+
+struct __attribute__ ((aligned(8))) _v16u8_ {
+  uint8x16_t val;
+  _v16u8_( const int16x8_t &src) { val = vreinterpretq_u8_s16(src); }
+  operator int16x8_t () const { return vreinterpretq_s16_u8(val); }
+};
+typedef struct _v16u8_ v16u8;
+
+struct __attribute__ ((aligned(4))) _v8u8_ {
+  uint8x8_t val;
+  _v8u8_( const uint8x8_t &src) { val = src; }
+  operator int16x4_t () const { return vreinterpret_s16_u8(val); }
+};
+typedef struct _v8u8_ v8u8;
+
+typedef v16u8                v8i16;
+typedef int32x4_t            v4i32;
+typedef const short          cv1i16;
+typedef const unsigned char  cv1u8;
+typedef const v8i16          cv8i16;
+
+static inline __attribute__((always_inline)) v8u8 zero_64(){ return vdup_n_u8( 0 ); }
+
+static inline __attribute__((always_inline)) v8i16 loadlo_8i16( cv8i16* p ){
+  return vcombine_s16( vld1_s16( (cv1i16 *)p ), zero_64() );
+}
+static inline __attribute__((always_inline)) v8i16 _loadlo_8i16( cv8i16* p, int offset ){
+  return loadlo_8i16( (cv8i16*)(&((cv1u8*)p)[offset]) );
+}
+
+void __attribute__((noinline))
+test(unsigned short *_Inp, int32_t *_Out,
+     unsigned int s1v, unsigned int dv0,
+     unsigned int smask_v)
+{
+  int32x4_t c = vdupq_n_s32(0);
+
+  for(unsigned int sv=0 ; sv!=dv0 ; sv=(sv+s1v)&smask_v )
+    {
+      int32x4_t s;
+      s = vmovl_s16( vget_low_s16( _loadlo_8i16( (cv8i16*) _Inp, sv ) ) );
+      c = vaddq_s32( c, s );
+    }
+  vst1q_s32( _Out, c );
+}
+
+int main()
+{
+  unsigned short a[4] = {1, 2, 3, 4};
+  int32_t b[4] = {0, 0, 0, 0};
+  test(a, b, 1, 1, ~0);
+  if (b[0] != 1 || b[1] != 2 || b[2] != 3 || b[3] != 4)
+    abort();
+}