From patchwork Fri Nov 11 08:39:35 2011
X-Patchwork-Submitter: David Miller
X-Patchwork-Id: 125105
Date: Fri, 11 Nov 2011 03:39:35 -0500 (EST)
Message-Id: <20111111.033935.675991847858154567.davem@davemloft.net>
To: gcc-patches@gcc.gnu.org
CC: ebotcazou@adacore.com
Subject: [PATCH] Revert sparc vec_init improvements as they cause 64-bit regressions.
From: David Miller

Eric, I tried my best to get the new code working properly on 64-bit
and I just couldn't figure out a reasonable way to do so.

Therefore I simply reverted the changes. I'll come back to this at
some point in the future.

One thing that really irks me is how pseudos can only be subreg'd on
UNITS_PER_WORD boundaries.
That's the real reason this stuff doesn't work and it's nearly
impossible to subreg 32-bit values that end up in float regs on sparc
when compiling 64-bit.

We have REGMODE_NATURAL_SIZE which basically describes the
subreg'ability of the hard registers that a pseudo in a given mode
will end up using. I started playing around with using
REGMODE_NATURAL_SIZE in place of UNITS_PER_WORD in the subreg code but
it got way out of the scope of fixing this regression.

Anyway, committed to trunk and all the 64-bit failures should be gone.

gcc/

        Revert
        2011-11-05  David S. Miller

---
 gcc/ChangeLog             |    5 +
 gcc/config/sparc/sparc.c  |  440 ++++++++++----------------------------------
 gcc/config/sparc/sparc.md |   54 ------
 3 files changed, 105 insertions(+), 394 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5f19470..cf4e66b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2011-11-11  David S. Miller
+
+        Revert
+        2011-11-05  David S. Miller
+
 2011-11-11  Jakub Jelinek

         * opts-common.c (generate_canonical_option): Free opt_text
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 1f2a27a..55759a0 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -11285,357 +11285,88 @@ output_v8plus_mult (rtx insn, rtx *operands, const char *opcode)
     }
 }

-/* Subroutine of sparc_expand_vector_init.  Emit code to initialize TARGET to
-   the N_ELTS values for individual fields contained in LOCS by means of VIS2
-   BSHUFFLE insn.  MODE and INNER_MODE are the modes describing TARGET.  */
+/* Subroutine of sparc_expand_vector_init.  Emit code to initialize
+   all fields of TARGET to ELT by means of VIS2 BSHUFFLE insn.  MODE
+   and INNER_MODE are the modes describing TARGET.  */

 static void
-vector_init_bshuffle (rtx target, rtx *locs, int n_elts,
-                      enum machine_mode mode,
+vector_init_bshuffle (rtx target, rtx elt, enum machine_mode mode,
                       enum machine_mode inner_mode)
 {
-  rtx mid_target, r0_high, r0_low, r1_high, r1_low;
-  enum machine_mode partial_mode;
-  int bmask, i, idxs[8];
+  rtx t1, final_insn;
+  int bmask;

-  partial_mode = (mode == V4HImode
-                  ? V2HImode
-                  : (mode == V8QImode
-                     ? V4QImode : mode));
+  t1 = gen_reg_rtx (mode);

-  r0_high = r0_low = NULL_RTX;
-  r1_high = r1_low = NULL_RTX;
+  elt = convert_modes (SImode, inner_mode, elt, true);
+  emit_move_insn (gen_lowpart(SImode, t1), elt);

-  /* Move the pieces into place, as needed, and calculate the nibble
-     indexes for the bmask calculation.  After we execute this loop the
-     locs[] array is no longer needed.  Therefore, to simplify things,
-     we set entries that have been processed already to NULL_RTX.  */
-
-  for (i = 0; i < n_elts; i++)
-    {
-      int j;
-
-      if (locs[i] == NULL_RTX)
-        continue;
-
-      if (!r0_low)
-        {
-          r0_low = locs[i];
-          idxs[i] = 0x7;
-        }
-      else if (!r1_low)
-        {
-          r1_low = locs[i];
-          idxs[i] = 0xf;
-        }
-      else if (!r0_high)
-        {
-          r0_high = gen_highpart (partial_mode, r0_low);
-          emit_move_insn (r0_high, gen_lowpart (partial_mode, locs[i]));
-          idxs[i] = 0x3;
-        }
-      else if (!r1_high)
-        {
-          r1_high = gen_highpart (partial_mode, r1_low);
-          emit_move_insn (r1_high, gen_lowpart (partial_mode, locs[i]));
-          idxs[i] = 0xb;
-        }
-      else
-        gcc_unreachable ();
-
-      for (j = i + 1; j < n_elts; j++)
-        {
-          if (locs[j] == locs[i])
-            {
-              locs[j] = NULL_RTX;
-              idxs[j] = idxs[i];
-            }
-        }
-      locs[i] = NULL_RTX;
-    }
-
-  bmask = 0;
-  for (i = 0; i < n_elts; i++)
-    {
-      int v = idxs[i];
-
-      switch (GET_MODE_SIZE (inner_mode))
-        {
-        case 2:
-          bmask <<= 8;
-          bmask |= (((v - 1) << 4) | v);
-          break;
-
-        case 1:
-          bmask <<= 4;
-          bmask |= v;
-          break;
-
-        default:
-          gcc_unreachable ();
-        }
-    }
-
-  emit_insn (gen_bmasksi_vis (gen_reg_rtx (SImode), CONST0_RTX (SImode),
-                              force_reg (SImode, GEN_INT (bmask))));
-
-  mid_target = target;
-  if (GET_MODE_SIZE (mode) == 4)
-    {
-      mid_target = gen_reg_rtx (mode == V2HImode
-                                ? V4HImode : V8QImode);
-    }
-
-  if (!r1_low)
-    r1_low = r0_low;
-
-  switch (GET_MODE (mid_target))
+  switch (mode)
     {
+    case V2SImode:
+      final_insn = gen_bshufflev2si_vis (target, t1, t1);
+      bmask = 0x45674567;
+      break;
     case V4HImode:
-      emit_insn (gen_bshufflev4hi_vis (mid_target, r0_low, r1_low));
+      final_insn = gen_bshufflev4hi_vis (target, t1, t1);
+      bmask = 0x67676767;
       break;
     case V8QImode:
-      emit_insn (gen_bshufflev8qi_vis (mid_target, r0_low, r1_low));
+      final_insn = gen_bshufflev8qi_vis (target, t1, t1);
+      bmask = 0x77777777;
       break;
     default:
       gcc_unreachable ();
     }

-  if (mid_target != target)
-    emit_move_insn (target, gen_lowpart (partial_mode, mid_target));
-}
-
-/* Subroutine of sparc_expand_vector_init.  Emit code to initialize TARGET to
-   values for individual fields VALS by means of simple word moves if this is
-   possible.  MODE and INNER_MODE are the modes describing TARGET.  Return true
-   on success.  */
-
-static bool
-vector_init_move_words (rtx target, rtx vals, enum machine_mode mode,
-                        enum machine_mode inner_mode)
-{
-  switch (mode)
-    {
-    case V1SImode:
-    case V1DImode:
-      emit_move_insn (gen_lowpart (inner_mode, target),
-                      gen_lowpart (inner_mode, XVECEXP (vals, 0, 0)));
-      return true;
-
-    case V2SImode:
-      emit_move_insn (gen_highpart (SImode, target), XVECEXP (vals, 0, 0));
-      emit_move_insn (gen_lowpart (SImode, target), XVECEXP (vals, 0, 1));
-      return true;
-
-    default:
-      break;
-    }
-  return false;
+  emit_insn (gen_bmasksi_vis (gen_reg_rtx (SImode), CONST0_RTX (SImode),
+                              force_reg (SImode, GEN_INT (bmask))));
+  emit_insn (final_insn);
 }

-/* Subroutine of sparc_expand_vector_init.  Move the N_ELTS elements in VALS
-   into registers compatible with MODE and INNER_MODE.  Store the RTX for
-   these regs into the corresponding array entry of LOCS.  */
-
 static void
-vector_init_prepare_elts (rtx *locs, rtx vals, int n_elts,
-                          enum machine_mode mode,
-                          enum machine_mode inner_mode)
+vector_init_fpmerge (rtx target, rtx elt, enum machine_mode inner_mode)
 {
-  enum machine_mode loc_mode;
-  int i;
+  rtx t1, t2, t3, t3_low;

-  switch (mode)
-    {
-    case V2HImode:
-      loc_mode = V4HImode;
-      break;
+  t1 = gen_reg_rtx (V4QImode);
+  elt = convert_modes (SImode, inner_mode, elt, true);
+  emit_move_insn (gen_lowpart (SImode, t1), elt);

-    case V4QImode:
-      loc_mode = V8QImode;
-      break;
+  t2 = gen_reg_rtx (V4QImode);
+  emit_move_insn (t2, t1);

-    case V4HImode:
-    case V8QImode:
-      loc_mode = mode;
-      break;
-
-    default:
-      gcc_unreachable ();
-    }
+  t3 = gen_reg_rtx (V8QImode);
+  t3_low = gen_lowpart (V4QImode, t3);

-  gcc_assert (GET_MODE_SIZE (inner_mode) <= 4);
-  for (i = 0; i < n_elts; i++)
-    {
-      rtx dst, elt = XVECEXP (vals, 0, i);
-      int j;
-
-      /* Did we see this already?  If so just record it's location.  */
-      dst = NULL_RTX;
-      for (j = 0; j < i; j++)
-        {
-          if (XVECEXP (vals, 0, j) == elt)
-            {
-              dst = locs[j];
-              break;
-            }
-        }
-
-      if (! dst)
-        {
-          enum rtx_code code = GET_CODE (elt);
+  emit_insn (gen_fpmerge_vis (t3, t1, t2));
+  emit_move_insn (t1, t3_low);
+  emit_move_insn (t2, t3_low);

-          dst = gen_reg_rtx (loc_mode);
+  emit_insn (gen_fpmerge_vis (t3, t1, t2));
+  emit_move_insn (t1, t3_low);
+  emit_move_insn (t2, t3_low);

-          /* We use different strategies based upon whether the element
-             is in memory or in a register.  When we start in a register
-             and we're VIS3 capable, it's always cheaper to use the VIS3
-             int-->fp register moves since we avoid having to use stack
-             memory.  */
-          if ((TARGET_VIS3 && (code == REG || code == SUBREG))
-              || (CONSTANT_P (elt)
-                  && (const_zero_operand (elt, inner_mode)
-                      || const_all_ones_operand (elt, inner_mode))))
-            {
-              elt = convert_modes (SImode, inner_mode, elt, true);
-
-              emit_clobber (dst);
-              emit_move_insn (gen_lowpart (SImode, dst), elt);
-            }
-          else
-            {
-              rtx m = elt;
-
-              if (CONSTANT_P (elt))
-                {
-                  m = force_const_mem (inner_mode, elt);
-                }
-              else if (code != MEM)
-                {
-                  rtx stk
-                    = assign_stack_temp (inner_mode, GET_MODE_SIZE(inner_mode),
-                                         0);
-                  emit_move_insn (stk, elt);
-                  m = stk;
-                }
-
-              switch (loc_mode)
-                {
-                case V4HImode:
-                  emit_insn (gen_zero_extend_v4hi_vis (dst, m));
-                  break;
-                case V8QImode:
-                  emit_insn (gen_zero_extend_v8qi_vis (dst, m));
-                  break;
-                default:
-                  gcc_unreachable ();
-                }
-            }
-        }
-      locs[i] = dst;
-    }
+  emit_insn (gen_fpmerge_vis (gen_lowpart (V8QImode, target), t1, t2));
 }

-/* Subroutine of sparc_expand_vector_init.  Emit code to initialize TARGET to
-   the N_ELTS values for individual fields contained in LOCS by means of VIS2
-   instructions, among which N_UNIQUE are unique.  MODE and INNER_MODE are the
-   modes describing TARGET.  */
-
 static void
-sparc_expand_vector_init_vis2 (rtx target, rtx *locs, int n_elts, int n_unique,
-                               enum machine_mode mode,
-                               enum machine_mode inner_mode)
+vector_init_faligndata (rtx target, rtx elt, enum machine_mode inner_mode)
 {
-  if (n_unique <= 4)
-    {
-      vector_init_bshuffle (target, locs, n_elts, mode, inner_mode);
-    }
-  else
-    {
-      int i;
+  rtx t1 = gen_reg_rtx (V4HImode);

-      gcc_assert (mode == V8QImode);
+  elt = convert_modes (SImode, inner_mode, elt, true);

-      emit_insn (gen_alignaddrsi_vis (gen_reg_rtx (SImode),
-                                      force_reg (SImode, GEN_INT (7)),
-                                      CONST0_RTX (SImode)));
-      i = n_elts - 1;
-      emit_insn (gen_faligndatav8qi_vis (target, locs[i], locs[i]));
-      while (--i >= 0)
-        emit_insn (gen_faligndatav8qi_vis (target, locs[i], target));
-    }
-}
-
-/* Subroutine of sparc_expand_vector_init.  Emit code to initialize TARGET to
-   the N_ELTS values for individual fields contained in LOCS by means of VIS1
-   instructions, among which N_UNIQUE are unique.  MODE is TARGET's mode.  */
-
-static void
-sparc_expand_vector_init_vis1 (rtx target, rtx *locs, int n_elts, int n_unique,
-                               enum machine_mode mode)
-{
-  enum machine_mode full_mode = mode;
-  rtx (*emitter)(rtx, rtx, rtx);
-  int alignaddr_val, i;
-  rtx tmp = target;
-
-  if (n_unique == 1 && mode == V8QImode)
-    {
-      rtx t2, t2_low, t1;
-
-      t1 = gen_reg_rtx (V4QImode);
-      emit_move_insn (t1, gen_lowpart (V4QImode, locs[0]));
-
-      t2 = gen_reg_rtx (V8QImode);
-      t2_low = gen_lowpart (V4QImode, t2);
-
-      /* xxxxxxAA --> xxxxxxxxxxxxAAAA
-         xxxxAAAA --> xxxxxxxxAAAAAAAA
-         AAAAAAAA --> AAAAAAAAAAAAAAAA */
-      emit_insn (gen_fpmerge_vis (t2, t1, t1));
-      emit_move_insn (t1, t2_low);
-      emit_insn (gen_fpmerge_vis (t2, t1, t1));
-      emit_move_insn (t1, t2_low);
-      emit_insn (gen_fpmerge_vis (target, t1, t1));
-      return;
-    }
-
-  switch (mode)
-    {
-    case V2HImode:
-      full_mode = V4HImode;
-      /* FALLTHRU */
-    case V4HImode:
-      emitter = gen_faligndatav4hi_vis;
-      alignaddr_val = 6;
-      break;
-
-    case V4QImode:
-      full_mode = V8QImode;
-      /* FALLTHRU */
-    case V8QImode:
-      emitter = gen_faligndatav8qi_vis;
-      alignaddr_val = 7;
-      break;
-
-    default:
-      gcc_unreachable ();
-    }
-
-  if (full_mode != mode)
-    tmp = gen_reg_rtx (full_mode);
+  emit_move_insn (gen_lowpart (SImode, t1), elt);

   emit_insn (gen_alignaddrsi_vis (gen_reg_rtx (SImode),
-                                  force_reg (SImode, GEN_INT (alignaddr_val)),
+                                  force_reg (SImode, GEN_INT (6)),
                                   CONST0_RTX (SImode)));

-  i = n_elts - 1;
-  emit_insn (emitter (tmp, locs[i], locs[i]));
-  while (--i >= 0)
-    emit_insn (emitter (tmp, locs[i], tmp));
-
-  if (tmp != target)
-    emit_move_insn (target, gen_highpart (mode, tmp));
+  emit_insn (gen_faligndatav4hi_vis (target, t1, target));
+  emit_insn (gen_faligndatav4hi_vis (target, t1, target));
+  emit_insn (gen_faligndatav4hi_vis (target, t1, target));
+  emit_insn (gen_faligndatav4hi_vis (target, t1, target));
 }

 /* Emit code to initialize TARGET to values for individual fields VALS.  */
@@ -11646,30 +11377,19 @@ sparc_expand_vector_init (rtx target, rtx vals)
   enum machine_mode mode = GET_MODE (target);
   enum machine_mode inner_mode = GET_MODE_INNER (mode);
   int n_elts = GET_MODE_NUNITS (mode);
-  int i, n_var = 0, n_unique = 0;
-  rtx locs[8];
-
-  gcc_assert (n_elts <= 8);
+  int i, n_var = 0;
+  bool all_same;
+  rtx mem;

+  all_same = true;
   for (i = 0; i < n_elts; i++)
     {
       rtx x = XVECEXP (vals, 0, i);
-      bool found = false;
-      int j;
-
       if (!CONSTANT_P (x))
         n_var++;

-      for (j = 0; j < i; j++)
-        {
-          if (rtx_equal_p (x, XVECEXP (vals, 0, j)))
-            {
-              found = true;
-              break;
-            }
-        }
-      if (!found)
-        n_unique++;
+      if (i > 0 && !rtx_equal_p (x, XVECEXP (vals, 0, 0)))
+        all_same = false;
     }

   if (n_var == 0)
@@ -11678,16 +11398,56 @@ sparc_expand_vector_init (rtx target, rtx vals)
       return;
     }

-  if (vector_init_move_words (target, vals, mode, inner_mode))
-    return;
+  if (GET_MODE_SIZE (inner_mode) == GET_MODE_SIZE (mode))
+    {
+      if (GET_MODE_SIZE (inner_mode) == 4)
+        {
+          emit_move_insn (gen_lowpart (SImode, target),
+                          gen_lowpart (SImode, XVECEXP (vals, 0, 0)));
+          return;
+        }
+      else if (GET_MODE_SIZE (inner_mode) == 8)
+        {
+          emit_move_insn (gen_lowpart (DImode, target),
+                          gen_lowpart (DImode, XVECEXP (vals, 0, 0)));
+          return;
+        }
+    }
+  else if (GET_MODE_SIZE (inner_mode) == GET_MODE_SIZE (word_mode)
+           && GET_MODE_SIZE (mode) == 2 * GET_MODE_SIZE (word_mode))
+    {
+      emit_move_insn (gen_highpart (word_mode, target),
+                      gen_lowpart (word_mode, XVECEXP (vals, 0, 0)));
+      emit_move_insn (gen_lowpart (word_mode, target),
+                      gen_lowpart (word_mode, XVECEXP (vals, 0, 1)));
+      return;
+    }

-  vector_init_prepare_elts (locs, vals, n_elts, mode, inner_mode);
+  if (all_same && GET_MODE_SIZE (mode) == 8)
+    {
+      if (TARGET_VIS2)
+        {
+          vector_init_bshuffle (target, XVECEXP (vals, 0, 0), mode, inner_mode);
+          return;
+        }
+      if (mode == V8QImode)
+        {
+          vector_init_fpmerge (target, XVECEXP (vals, 0, 0), inner_mode);
+          return;
+        }
+      if (mode == V4HImode)
+        {
+          vector_init_faligndata (target, XVECEXP (vals, 0, 0), inner_mode);
+          return;
+        }
+    }

-  if (TARGET_VIS2)
-    sparc_expand_vector_init_vis2 (target, locs, n_elts, n_unique,
-                                   mode, inner_mode);
-  else
-    sparc_expand_vector_init_vis1 (target, locs, n_elts, n_unique, mode);
+  mem = assign_stack_temp (mode, GET_MODE_SIZE (mode), 0);
+  for (i = 0; i < n_elts; i++)
+    emit_move_insn (adjust_address_nv (mem, inner_mode,
+                                       i * GET_MODE_SIZE (inner_mode)),
+                    XVECEXP (vals, 0, i));
+  emit_move_insn (target, mem);
 }

 /* Implement TARGET_SECONDARY_RELOAD.  */
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index bf750b2..c059dc5 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -7830,60 +7830,6 @@
   DONE;
 })

-(define_expand "zero_extend_v8qi_vis"
-  [(set (match_operand:V8QI 0 "register_operand" "")
-        (vec_merge:V8QI
-          (vec_duplicate:V8QI
-            (match_operand:QI 1 "memory_operand" ""))
-          (match_dup 2)
-          (const_int 254)))]
-  "TARGET_VIS"
-{
-  if (! REG_P (XEXP (operands[1], 0)))
-    {
-      rtx addr = force_reg (Pmode, XEXP (operands[1], 0));
-      operands[1] = replace_equiv_address (operands[1], addr);
-    }
-  operands[2] = CONST0_RTX (V8QImode);
-})
-
-(define_expand "zero_extend_v4hi_vis"
-  [(set (match_operand:V4HI 0 "register_operand" "")
-        (vec_merge:V4HI
-          (vec_duplicate:V4HI
-            (match_operand:HI 1 "memory_operand" ""))
-          (match_dup 2)
-          (const_int 14)))]
-  "TARGET_VIS"
-{
-  if (! REG_P (XEXP (operands[1], 0)))
-    {
-      rtx addr = force_reg (Pmode, XEXP (operands[1], 0));
-      operands[1] = replace_equiv_address (operands[1], addr);
-    }
-  operands[2] = CONST0_RTX (V4HImode);
-})
-
-(define_insn "*zero_extend_v8qi__insn"
-  [(set (match_operand:V8QI 0 "register_operand" "=e")
-        (vec_merge:V8QI
-          (vec_duplicate:V8QI
-            (mem:QI (match_operand:P 1 "register_operand" "r")))
-          (match_operand:V8QI 2 "const_zero_operand" "Y")
-          (const_int 254)))]
-  "TARGET_VIS"
-  "ldda\t[%1] 0xd0, %0")
-
-(define_insn "*zero_extend_v4hi__insn"
-  [(set (match_operand:V4HI 0 "register_operand" "=e")
-        (vec_merge:V4HI
-          (vec_duplicate:V4HI
-            (mem:HI (match_operand:P 1 "register_operand" "r")))
-          (match_operand:V4HI 2 "const_zero_operand" "Y")
-          (const_int 14)))]
-  "TARGET_VIS"
-  "ldda\t[%1] 0xd2, %0")
-
 (define_expand "vec_init"
   [(match_operand:VMALL 0 "register_operand" "")
    (match_operand:VMALL 1 "" "")]
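
As a side note for anyone reviewing the restored vector_init_bshuffle, the three bmask constants (0x45674567, 0x67676767, 0x77777777) can be sanity-checked on the host with a small model of BSHUFFLE. This is only a sketch and not GCC code: bshuffle_model and splat_model are hypothetical helpers. It assumes the VIS2 semantics in which rs1:rs2 form a 16-byte value numbered from the most significant byte, and each 4-bit nibble of GSR.mask, most significant nibble first, selects the source byte for the corresponding destination byte. It also assumes the element has already been zero-extended into the low 32 bits of t1, as the convert_modes/gen_lowpart pair in the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of VIS2 BSHUFFLE (not part of GCC).
   RS1 supplies source bytes 0..7 and RS2 bytes 8..15, both numbered
   from the most significant byte.  Nibble i of MASK, counted from
   the most significant nibble, selects the source byte copied into
   destination byte i.  */
static uint64_t
bshuffle_model (uint64_t rs1, uint64_t rs2, uint32_t mask)
{
  uint8_t src[16];
  uint64_t rd = 0;
  int i;

  for (i = 0; i < 8; i++)
    {
      src[i] = (uint8_t) (rs1 >> (56 - 8 * i));
      src[8 + i] = (uint8_t) (rs2 >> (56 - 8 * i));
    }

  for (i = 0; i < 8; i++)
    {
      unsigned int sel = (mask >> (28 - 4 * i)) & 0xf;

      rd = (rd << 8) | src[sel];
    }
  return rd;
}

/* Model of the all-same path: ELT sits in the low 32 bits of t1
   (the upper half is taken as zero here), and t1 is used for both
   BSHUFFLE sources, mirroring gen_bshuffle*_vis (target, t1, t1).  */
static uint64_t
splat_model (uint32_t elt, uint32_t bmask)
{
  uint64_t t1 = elt;

  return bshuffle_model (t1, t1, bmask);
}
```

Under those assumptions, splat_model (0xab, 0x77777777) replicates byte 7 into all eight lanes, 0x67676767 replicates the low halfword, and 0x45674567 replicates the low word, which is what the V8QI, V4HI and V2SI cases rely on.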