From patchwork Fri Mar 11 18:34:10 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Bergner X-Patchwork-Id: 86446 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id E3218B6F7A for ; Sat, 12 Mar 2011 05:34:29 +1100 (EST) Received: (qmail 502 invoked by alias); 11 Mar 2011 18:34:27 -0000 Received: (qmail 32094 invoked by uid 22791); 11 Mar 2011 18:34:25 -0000 X-SWARE-Spam-Status: No, hits=-1.3 required=5.0 tests=AWL, BAYES_00, SPF_SOFTFAIL, TW_XL, T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from e39.co.us.ibm.com (HELO e39.co.us.ibm.com) (32.97.110.160) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 11 Mar 2011 18:34:15 +0000 Received: from d03relay02.boulder.ibm.com (d03relay02.boulder.ibm.com [9.17.195.227]) by e39.co.us.ibm.com (8.14.4/8.13.1) with ESMTP id p2BILLsY001481 for ; Fri, 11 Mar 2011 11:21:21 -0700 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay02.boulder.ibm.com (8.13.8/8.13.8/NCO v9.1) with ESMTP id p2BIYD8i093304 for ; Fri, 11 Mar 2011 11:34:13 -0700 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p2BIYDKE003667 for ; Fri, 11 Mar 2011 11:34:13 -0700 Received: from [192.168.1.101] (vorma.rchland.ibm.com [9.10.86.174]) by d03av02.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id p2BIYBGD003425; Fri, 11 Mar 2011 11:34:11 -0700 Subject: [PATCH, rs6000] Fix PR48053, ICEs in SPEC benchmarks From: Peter Bergner To: gcc-patches Cc: David Edelsohn , Michael Meissner , Pat Haugen Date: Fri, 11 Mar 2011 12:34:10 -0600 Message-ID: <1299868450.2663.159.camel@otta> Mime-Version: 1.0 X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org This patch fixes the two related bugs in PR48053. The problem here deals with loading constants into VSX registers. The first bug occurs when we try and load up a full constant into the VSX register. We end up calling easy_vector_constant_msb which is supported only for V4SI and V4SF modes. The fix here was to modify the easy_vector_constant_msb predicate to reject V2DI and V2DF modes. The second bug was when we attempt to load a VSX register with a constant for one half of the register and a variable for the other half. This caused an ICE in int_mode_for_mode. I fixed this by first forcing the correct scalar mode in rs6000_expand_vector_init() and then adding special support in the movdi_internal* patterns to allow setting of VSX registers to zero. These fixed both test cases in PR48053 and Pat Haugen verified that they fixed the SPEC benchmarks the test cases were reduced from, without introducing any new SPEC regressions. This patch passed bootstrapp and regtesting on powerpc64-linux (test suite run in both 32-bit and 64-bit modes). Ok for mainline? Peter gcc/ PR target/48053 * config/rs6000/predicates.md (easy_vector_constant_add_self, easy_vector_constant_msb): Do not handle V2DImode and V2DFmode. * config/rs6000/rs6000.c (const_vector_elt_as_int): Add assert that mode is not V2DImode or V2DFmode. (vspltis_constant): Do not handle V2DImode and V2DFmode. (rs6000_expand_vector_init): Replace copy_to_reg with copy_to_mode_reg. * config/rs6000/rs6000.md (movdi_internal32): Allow setting VSX registers to 0. (movdi_internal64): Likewise. gcc/testsuite/ PR target/48053 * gcc/testsuite/gcc.target/powerpc/pr48053-1.c: New test. * gcc/testsuite/gcc.target/powerpc/pr48053-2.c: Likewise. Index: gcc/config/rs6000/predicates.md =================================================================== --- gcc/config/rs6000/predicates.md (revision 170880) +++ gcc/config/rs6000/predicates.md (working copy) @@ -371,7 +371,10 @@ (define_predicate "easy_vector_constant_ (and (match_test "TARGET_ALTIVEC") (match_test "easy_altivec_constant (op, mode)"))) { - HOST_WIDE_INT val = const_vector_elt_as_int (op, GET_MODE_NUNITS (mode) - 1); + HOST_WIDE_INT val; + if (mode == V2DImode || mode == V2DFmode) + return 0; + val = const_vector_elt_as_int (op, GET_MODE_NUNITS (mode) - 1); val = ((val & 0xff) ^ 0x80) - 0x80; return EASY_VECTOR_15_ADD_SELF (val); }) @@ -382,7 +385,10 @@ (define_predicate "easy_vector_constant_ (and (match_test "TARGET_ALTIVEC") (match_test "easy_altivec_constant (op, mode)"))) { - HOST_WIDE_INT val = const_vector_elt_as_int (op, GET_MODE_NUNITS (mode) - 1); + HOST_WIDE_INT val; + if (mode == V2DImode || mode == V2DFmode) + return 0; + val = const_vector_elt_as_int (op, GET_MODE_NUNITS (mode) - 1); return EASY_VECTOR_MSB (val, GET_MODE_INNER (mode)); }) Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 170880) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -4855,7 +4855,13 @@ num_insns_constant (rtx op, enum machine HOST_WIDE_INT const_vector_elt_as_int (rtx op, unsigned int elt) { - rtx tmp = CONST_VECTOR_ELT (op, elt); + rtx tmp; + + /* We can't handle V2DImode and V2DFmode vector constants here yet. */ + gcc_assert (GET_MODE (op) != V2DImode + && GET_MODE (op) != V2DFmode); + + tmp = CONST_VECTOR_ELT (op, elt); if (GET_MODE (op) == V4SFmode || GET_MODE (op) == V2SFmode) tmp = gen_lowpart (SImode, tmp); @@ -4876,13 +4882,24 @@ vspltis_constant (rtx op, unsigned step, enum machine_mode inner = GET_MODE_INNER (mode); unsigned i; - unsigned nunits = GET_MODE_NUNITS (mode); - unsigned bitsize = GET_MODE_BITSIZE (inner); - unsigned mask = GET_MODE_MASK (inner); - - HOST_WIDE_INT val = const_vector_elt_as_int (op, nunits - 1); - HOST_WIDE_INT splat_val = val; - HOST_WIDE_INT msb_val = val > 0 ? 0 : -1; + unsigned nunits; + unsigned bitsize; + unsigned mask; + + HOST_WIDE_INT val; + HOST_WIDE_INT splat_val; + HOST_WIDE_INT msb_val; + + if (mode == V2DImode || mode == V2DFmode) + return false; + + nunits = GET_MODE_NUNITS (mode); + bitsize = GET_MODE_BITSIZE (inner); + mask = GET_MODE_MASK (inner); + + val = const_vector_elt_as_int (op, nunits - 1); + splat_val = val; + msb_val = val > 0 ? 0 : -1; /* Construct the value to be splatted, if possible. If not, return 0. */ for (i = 2; i <= copies; i *= 2) @@ -5314,12 +5331,18 @@ rs6000_expand_vector_init (rtx target, r } else { - rtx op0 = copy_to_reg (XVECEXP (vals, 0, 0)); - rtx op1 = copy_to_reg (XVECEXP (vals, 0, 1)); if (mode == V2DFmode) - emit_insn (gen_vsx_concat_v2df (target, op0, op1)); + { + rtx op0 = copy_to_mode_reg (DFmode, XVECEXP (vals, 0, 0)); + rtx op1 = copy_to_mode_reg (DFmode, XVECEXP (vals, 0, 1)); + emit_insn (gen_vsx_concat_v2df (target, op0, op1)); + } else - emit_insn (gen_vsx_concat_v2di (target, op0, op1)); + { + rtx op0 = copy_to_mode_reg (DImode, XVECEXP (vals, 0, 0)); + rtx op1 = copy_to_mode_reg (DImode, XVECEXP (vals, 0, 1)); + emit_insn (gen_vsx_concat_v2di (target, op0, op1)); + } } return; } Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 170880) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -10052,8 +10052,8 @@ (define_expand "abstf2_internal" ; List r->r after r->"o<>", otherwise reload will try to reload a ; non-offsettable address by using r->r which won't make progress. (define_insn "*movdi_internal32" - [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=o<>,r,r,*d,*d,m,r") - (match_operand:DI 1 "input_operand" "r,r,m,d,m,d,IJKnGHF"))] + [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=o<>,r,r,*d,*d,m,r,?wa") + (match_operand:DI 1 "input_operand" "r,r,m,d,m,d,IJKnGHF,O"))] "! TARGET_POWERPC64 && (gpc_reg_operand (operands[0], DImode) || gpc_reg_operand (operands[1], DImode))" @@ -10064,8 +10064,9 @@ (define_insn "*movdi_internal32" fmr %0,%1 lfd%U1%X1 %0,%1 stfd%U0%X0 %1,%0 - #" - [(set_attr "type" "load,*,store,fp,fpload,fpstore,*")]) + # + xxlxor %x0,%x0,%x0" + [(set_attr "type" "load,*,store,fp,fpload,fpstore,*,vecsimple")]) (define_split [(set (match_operand:DI 0 "gpc_reg_operand" "") @@ -10122,8 +10123,8 @@ (define_insn "*movdi_mfpgpr" (set_attr "length" "4,4,4,4,4,20,4,4,4,4,4,4,4,4,4")]) (define_insn "*movdi_internal64" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,*d,*d,m,r,*h,*h") - (match_operand:DI 1 "input_operand" "r,m,r,I,L,nF,R,d,m,d,*h,r,0"))] + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,*d,*d,m,r,*h,*h,?wa") + (match_operand:DI 1 "input_operand" "r,m,r,I,L,nF,R,d,m,d,*h,r,0,O"))] "TARGET_POWERPC64 && (!TARGET_MFPGPR || !TARGET_HARD_FLOAT || !TARGET_FPRS) && (gpc_reg_operand (operands[0], DImode) || gpc_reg_operand (operands[1], DImode))" @@ -10140,9 +10141,10 @@ (define_insn "*movdi_internal64" stfd%U0%X0 %1,%0 mf%1 %0 mt%0 %1 - {cror 0,0,0|nop}" - [(set_attr "type" "*,load,store,*,*,*,*,fp,fpload,fpstore,mfjmpr,mtjmpr,*") - (set_attr "length" "4,4,4,4,4,20,4,4,4,4,4,4,4")]) + {cror 0,0,0|nop} + xxlxor %x0,%x0,%x0" + [(set_attr "type" "*,load,store,*,*,*,*,fp,fpload,fpstore,mfjmpr,mtjmpr,*,vecsimple") + (set_attr "length" "4,4,4,4,4,20,4,4,4,4,4,4,4,4")]) ;; immediate value valid for a single instruction hiding in a const_double (define_insn "" Index: gcc/testsuite/gcc.target/powerpc/pr48053-1.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr48053-1.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr48053-1.c (revision 0) @@ -0,0 +1,30 @@ +/* Test for ICE arising from VSX code generation. */ +/* { dg-do compile } */ +/* { dg-options "-O3 -mcpu=power7 -funroll-loops" } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ + +int sourcenode; +int ARCHelems; +int *source_elms; +void +foo (int argc, char **argv) +{ + int i, j; + int cor[4]; + double Ke[12][12], Me[12], Ce[12], Mexv[12], Cexv[12], v[12]; + for (i = 0; i < ARCHelems; i++) + { + for (j = 0; j < 12; j++) + Me[j] = 0.0; + if (cor[j] == sourcenode) + vv12x12 (Me, v, Mexv); + vv12x12 (Ce, v, Cexv); + if (source_elms[i] == 3) + for (j = 0; j < 12; j++) + { + v[j] = -v[j]; + Mexv[j] = -Mexv[j]; + Cexv[j] = -Cexv[j]; + } + } +} Index: gcc/testsuite/gcc.target/powerpc/pr48053-2.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr48053-2.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr48053-2.c (revision 0) @@ -0,0 +1,38 @@ +/* Test for ICE arising from VSX code generation. */ +/* { dg-do compile } */ +/* { dg-options "-O3 -mcpu=power7" } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ + +struct timeval +{ + long tv_sec; + long tv_usec; +}; + +extern char *bar (struct timeval *); +int *error; + +void +foo (void *ptr) +{ + struct timeval tm; + long n1, n2; + + if (!ptr) + { + *error = 1; + n1 = -1; + n2 = -1; + } + else + { + n1 = 0; + n2 = *error; + } + + tm.tv_sec = n1; + tm.tv_usec = n2; + + if (*error) + bar (&tm); +}