From patchwork Wed Jan 9 17:03:57 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vladimir Makarov
X-Patchwork-Id: 210776
Message-ID: <50EDA2FD.3060802@redhat.com>
Date: Wed, 09 Jan 2013 12:03:57 -0500
From: Vladimir Makarov
To: GCC Patches
Subject: patch to fix PR55829
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

The following patch fixes

   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55829

The patch was successfully bootstrapped on x86-64.

Committed as rev. 195057.

2013-01-09  Vladimir Makarov

        PR rtl-optimization/pr55829
        * lra-constraints.c (match_reload): Add code for absent output.
        (curr_insn_transform): Add code for reloads of matched inputs
        without output.

2013-01-09  Vladimir Makarov

        PR rtl-optimization/pr55829
        * gcc.target/i386/pr55829.c: New.

Index: lra-constraints.c
===================================================================
--- lra-constraints.c (revision 195054)
+++ lra-constraints.c (working copy)
@@ -658,8 +658,9 @@ narrow_reload_pseudo_class (rtx reg, enu
 
 /* Generate reloads for matching OUT and INS (array of input operand
    numbers with end marker -1) with reg class GOAL_CLASS.  Add input
-   and output reloads correspondingly to the lists *BEFORE and
-   *AFTER.  */
+   and output reloads correspondingly to the lists *BEFORE and *AFTER.
+   OUT might be negative.  In this case we generate input reloads for
+   matched input operands INS.  */
 static void
 match_reload (signed char out, signed char *ins, enum reg_class goal_class,
               rtx *before, rtx *after)
@@ -668,10 +669,10 @@ match_reload (signed char out, signed ch
   rtx new_in_reg, new_out_reg, reg, clobber;
   enum machine_mode inmode, outmode;
   rtx in_rtx = *curr_id->operand_loc[ins[0]];
-  rtx out_rtx = *curr_id->operand_loc[out];
+  rtx out_rtx = out < 0 ? in_rtx : *curr_id->operand_loc[out];
 
-  outmode = curr_operand_mode[out];
   inmode = curr_operand_mode[ins[0]];
+  outmode = out < 0 ? inmode : curr_operand_mode[out];
   push_to_sequence (*before);
   if (inmode != outmode)
     {
@@ -746,14 +747,13 @@ match_reload (signed char out, signed ch
         = lra_create_new_reg_with_unique_value (outmode, out_rtx,
                                                 goal_class, "");
     }
-  /* In and out operand can be got from transformations before
-     processing insn constraints.  One example of such transformations
-     is subreg reloading (see function simplify_operand_subreg).  The
-     new pseudos created by the transformations might have inaccurate
+  /* In operand can be got from transformations before processing insn
+     constraints.  One example of such transformations is subreg
+     reloading (see function simplify_operand_subreg).  The new
+     pseudos created by the transformations might have inaccurate
      class (ALL_REGS) and we should make their classes more accurate.  */
   narrow_reload_pseudo_class (in_rtx, goal_class);
-  narrow_reload_pseudo_class (out_rtx, goal_class);
   lra_emit_move (copy_rtx (new_in_reg), in_rtx);
   *before = get_insns ();
   end_sequence ();
@@ -765,6 +765,10 @@ match_reload (signed char out, signed ch
         *curr_id->operand_loc[in] = new_in_reg;
     }
   lra_update_dups (curr_id, ins);
+  if (out < 0)
+    return;
+  /* See a comment for the input operand above.  */
+  narrow_reload_pseudo_class (out_rtx, goal_class);
   if (find_reg_note (curr_insn, REG_UNUSED, out_rtx) == NULL_RTX)
     {
       start_sequence ();
@@ -2597,6 +2601,7 @@ curr_insn_transform (void)
   int n_alternatives;
   int commutative;
   signed char goal_alt_matched[MAX_RECOG_OPERANDS][MAX_RECOG_OPERANDS];
+  signed char match_inputs[MAX_RECOG_OPERANDS + 1];
   rtx before, after;
   bool alt_p = false;
   /* Flag that the insn has been changed through a transformation.  */
@@ -3052,17 +3057,28 @@ curr_insn_transform (void)
               && (curr_static_id->operand[goal_alt_matched[i][0]].type
                   == OP_OUT))
             {
-              signed char arr[2];
-
-              arr[0] = i;
-              arr[1] = -1;
-              match_reload (goal_alt_matched[i][0], arr,
+              /* generate reloads for input and matched outputs.  */
+              match_inputs[0] = i;
+              match_inputs[1] = -1;
+              match_reload (goal_alt_matched[i][0], match_inputs,
                             goal_alt[i], &before, &after);
             }
           else if (curr_static_id->operand[i].type == OP_OUT
                    && (curr_static_id->operand[goal_alt_matched[i][0]].type
                        == OP_IN))
+            /* Generate reloads for output and matched inputs.  */
             match_reload (i, goal_alt_matched[i], goal_alt[i], &before, &after);
+          else if (curr_static_id->operand[i].type == OP_IN
+                   && (curr_static_id->operand[goal_alt_matched[i][0]].type
+                       == OP_IN))
+            {
+              /* Generate reloads for matched inputs.  */
+              match_inputs[0] = i;
+              for (j = 0; (k = goal_alt_matched[i][j]) >= 0; j++)
+                match_inputs[j + 1] = k;
+              match_inputs[j + 1] = -1;
+              match_reload (-1, match_inputs, goal_alt[i], &before, &after);
+            }
           else
             /* We must generate code in any case when function
                process_alt_operands decides that it is possible.  */
Index: testsuite/gcc.target/i386/pr55829.c
===================================================================
--- testsuite/gcc.target/i386/pr55829.c (revision 0)
+++ testsuite/gcc.target/i386/pr55829.c (working copy)
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse3 -fno-expensive-optimizations" } */
+
+typedef double __m128d __attribute__ ((__vector_size__ (16)));
+
+extern double p1[];
+extern double p2[];
+extern double ck[];
+extern int n;
+
+__attribute__((__noinline__, __noclone__)) int chk_pd (double *v1, double *v2)
+{
+  return v2[n] != v1[n];
+}
+
+static inline void sse3_test_movddup_reg_subsume_ldsd (double *i1, double *r)
+{
+  __m128d t1 = (__m128d){*i1, 0};
+  __m128d t2 = __builtin_ia32_shufpd (t1, t1, 0);
+  __builtin_ia32_storeupd (r, t2);
+}
+
+int sse3_test (void)
+{
+  int i = 0;
+  int fail = 0;
+  for (; i < 80; i += 1)
+    {
+      ck[0] = p1[0];
+      fail += chk_pd (ck, p2);
+      sse3_test_movddup_reg_subsume_ldsd (p1, p2);
+    }
+  return fail;
+}
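
For readers who want to see the new matched-inputs case in isolation, here is a minimal standalone sketch of how the -1-terminated operand list is assembled before the match_reload (-1, ...) call. It is not part of the patch: build_match_inputs and SKETCH_MAX_OPERANDS are hypothetical names, and the value 30 is only an assumed stand-in for the target-defined MAX_RECOG_OPERANDS.

```c
/* Hypothetical stand-in for GCC's MAX_RECOG_OPERANDS (assumption: the
   real value is target-defined and is not taken from the patch).  */
#define SKETCH_MAX_OPERANDS 30

/* Hypothetical helper mirroring the loop added to curr_insn_transform:
   store input operand I followed by every operand in MATCHED (a
   -1-terminated list, like goal_alt_matched[i]) into MATCH_INPUTS and
   append the -1 end marker.  MATCH_INPUTS must have room for
   SKETCH_MAX_OPERANDS + 1 elements.  Returns the number of operand
   numbers stored, not counting the end marker.  */
static int
build_match_inputs (signed char i, const signed char *matched,
                    signed char *match_inputs)
{
  int j, k;

  match_inputs[0] = i;
  for (j = 0; (k = matched[j]) >= 0; j++)
    match_inputs[j + 1] = k;
  match_inputs[j + 1] = -1;
  return j + 1;
}
```

The buffer needs one element more than the operand count because the list holds operand I itself, up to the remaining matched operands, and the end marker; that is presumably why the patch declares match_inputs with MAX_RECOG_OPERANDS + 1 elements.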