From patchwork Tue Oct 23 03:39:23 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Miller
X-Patchwork-Id: 193322
Date: Mon, 22 Oct 2012 23:39:23 -0400 (EDT)
Message-Id: <20121022.233923.1683656545305450956.davem@davemloft.net>
To: gcc-patches@gcc.gnu.org
CC: ebotcazou@adacore.com, ro@cebitec.uni-bielefeld.de
Subject: [PATCH v3] Add support for sparc compare-and-branch
From: David Miller

Differences from v2:

1) If another control transfer comes right after a cbcond we take an
   enormous performance penalty, some 20 cycles or more. The
   documentation specifically warns about this, so emit a nop when we
   encounter this scenario.

2) Add a heuristic to avoid using cbcond if we know at RTL emit time
   that we're going to compare against a constant that does not fit
   in the tiny 5-bit signed immediate field.

3) Use cbcond for unconditional jumps too.

Regstrapped on sparc-unknown-linux-gnu w/--with-cpu=niagara4.

Eric and Rainer, I think that functionally this patch is fully ready
to go into the tree, except for the Solaris aspects, which I do not
have the means to work on. Have either of you made any progress in
this area?

Thanks!

gcc/

2012-10-12  David S. Miller

	* configure.ac: Add check for assembler SPARC4 instruction
	support.
	* configure: Rebuild.
	* config.in: Add HAVE_AS_SPARC4 section.
	* config/sparc/sparc.opt (mcbcond): New option.
	* doc/invoke.texi: Document it.
	* config/sparc/constraints.md: New constraint 'A' for 5-bit signed
	immediates.
	* doc/md.texi: Document it.
	* config/sparc/predicates.md (arith5_operand): New predicate.
	* config/sparc/sparc.c (dump_target_flag_bits): Handle MASK_CBCOND.
	(sparc_option_override): Likewise.
	(emit_cbcond_insn): New function.
	(emit_conditional_branch_insn): Call it.
	(emit_cbcond_nop): New function.
	(output_ubranch): Use cbcond, remove label arg.
	(output_cbcond): New function.
	* config/sparc/sparc-protos.h (output_ubranch): Update.
	(output_cbcond): Declare it.
	(emit_cbcond_nop): Likewise.
	* config/sparc/sparc.md (type attribute): New types 'cbcond' and
	'uncond_cbcond'.
	(emit_cbcond_nop): New attribute.
	(length attribute): Handle cbcond and uncond_cbcond.
	(in_call_delay attribute): Reject cbcond and uncond_cbcond.
	(in_branch_delay attribute): Likewise.
	(in_uncond_branch_delay attribute): Likewise.
	(in_annul_branch_delay attribute): Likewise.
	(*cbcond_sp32, *cbcond_sp64): New insn patterns.
	(jump): Rewrite into an expander.
	(*jump_ubranch, *jump_cbcond): New patterns.
	* config/sparc/niagara4.md: Match 'cbcond' and 'uncond_cbcond' in
	'n4_cti'.
	* config/sparc/sparc.h (AS_NIAGARA4_FLAG): New macro, use it when
	target default is niagara4.
	(SPARC_SIMM5_P): Define.
* config/sparc/sol2.h (AS_SPARC64_FLAG): Adjust. (AS_SPARC32_FLAG): Define. (ASM_CPU32_DEFAULT_SPEC, ASM_CPU64_DEFAULT_SPEC): Use AS_NIAGARA4_FLAG as needed. diff --git a/gcc/config.in b/gcc/config.in index b13805d..791d14a 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -266,6 +266,12 @@ #endif +/* Define if your assembler supports SPARC4 instructions. */ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_SPARC4 +#endif + + /* Define if your assembler supports fprnd. */ #ifndef USED_FOR_TARGET #undef HAVE_AS_FPRND diff --git a/gcc/config/sparc/constraints.md b/gcc/config/sparc/constraints.md index 472490f..8862ea1 100644 --- a/gcc/config/sparc/constraints.md +++ b/gcc/config/sparc/constraints.md @@ -18,7 +18,7 @@ ;; . ;;; Unused letters: -;;; AB +;;; B ;;; a jkl q tuv xyz @@ -62,6 +62,11 @@ ;; Integer constant constraints +(define_constraint "A" + "Signed 5-bit integer constant" + (and (match_code "const_int") + (match_test "SPARC_SIMM5_P (ival)"))) + (define_constraint "H" "Valid operand of double arithmetic operation" (and (match_code "const_double") diff --git a/gcc/config/sparc/niagara4.md b/gcc/config/sparc/niagara4.md index 272c8ff..61ca801 100644 --- a/gcc/config/sparc/niagara4.md +++ b/gcc/config/sparc/niagara4.md @@ -56,7 +56,7 @@ (define_insn_reservation "n4_cti" 2 (and (eq_attr "cpu" "niagara4") - (eq_attr "type" "branch,call,sibcall,call_no_delay_slot,uncond_branch,return")) + (eq_attr "type" "cbcond,uncond_cbcond,branch,call,sibcall,call_no_delay_slot,uncond_branch,return")) "n4_slot1, nothing") (define_insn_reservation "n4_fp" 11 diff --git a/gcc/config/sparc/predicates.md b/gcc/config/sparc/predicates.md index 326524b..b64e109 100644 --- a/gcc/config/sparc/predicates.md +++ b/gcc/config/sparc/predicates.md @@ -391,6 +391,14 @@ (ior (match_operand 0 "register_operand") (match_operand 0 "uns_small_int_operand"))) +;; Return true if OP is a register, or is a CONST_INT that can fit in a +;; signed 5-bit immediate field. 
This is an acceptable second operand for +;; the cbcond instructions. +(define_predicate "arith5_operand" + (ior (match_operand 0 "register_operand") + (and (match_code "const_int") + (match_test "SPARC_SIMM5_P (INTVAL (op))")))) + ;; Predicates for miscellaneous instructions. diff --git a/gcc/config/sparc/sol2.h b/gcc/config/sparc/sol2.h index ba2ec35..68cc592 100644 --- a/gcc/config/sparc/sol2.h +++ b/gcc/config/sparc/sol2.h @@ -58,8 +58,10 @@ along with GCC; see the file COPYING3. If not see other assemblers will accept. */ #ifndef USE_GAS -#define AS_SPARC64_FLAG "-xarch=v9" +#define AS_SPARC32_FLAG "-m32 -xarch=v9" +#define AS_SPARC64_FLAG "-m64 -xarch=v9" #else +#define AS_SPARC32_FLAG "-TSO -32 -Av9" #define AS_SPARC64_FLAG "-TSO -64 -Av9" #endif @@ -136,9 +138,9 @@ along with GCC; see the file COPYING3. If not see #undef CPP_CPU64_DEFAULT_SPEC #define CPP_CPU64_DEFAULT_SPEC "" #undef ASM_CPU32_DEFAULT_SPEC -#define ASM_CPU32_DEFAULT_SPEC "-xarch=v8plusb" +#define ASM_CPU32_DEFAULT_SPEC AS_SPARC32_FLAG AS_NIAGARA4_FLAG #undef ASM_CPU64_DEFAULT_SPEC -#define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG "b" +#define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG AS_NIAGARA4_FLAG #undef ASM_CPU_DEFAULT_SPEC #define ASM_CPU_DEFAULT_SPEC ASM_CPU32_DEFAULT_SPEC #endif @@ -241,7 +243,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); %{mcpu=niagara:" DEF_ARCH32_SPEC("-xarch=v8plusb") DEF_ARCH64_SPEC(AS_SPARC64_FLAG "b") "} \ %{mcpu=niagara2:" DEF_ARCH32_SPEC("-xarch=v8plusb") DEF_ARCH64_SPEC(AS_SPARC64_FLAG "b") "} \ %{mcpu=niagara3:" DEF_ARCH32_SPEC("-xarch=v8plus" AS_NIAGARA3_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA3_FLAG) "} \ -%{mcpu=niagara4:" DEF_ARCH32_SPEC("-xarch=v8plus" AS_NIAGARA3_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA3_FLAG) "} \ +%{mcpu=niagara4:" DEF_ARCH32_SPEC(AS_SPARC32_FLAG AS_NIAGARA4_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA4_FLAG) "} \ 
%{!mcpu=niagara4:%{!mcpu=niagara3:%{!mcpu=niagara2:%{!mcpu=niagara:%{!mcpu=ultrasparc3:%{!mcpu=ultrasparc:%{!mcpu=v9:%{mcpu*:" DEF_ARCH32_SPEC("-xarch=v8") DEF_ARCH64_SPEC(AS_SPARC64_FLAG) "}}}}}}}} \ %{!mcpu*:%(asm_cpu_default)} \ " diff --git a/gcc/config/sparc/sparc-protos.h b/gcc/config/sparc/sparc-protos.h index 97f6233..d5b2b1f 100644 --- a/gcc/config/sparc/sparc-protos.h +++ b/gcc/config/sparc/sparc-protos.h @@ -71,7 +71,7 @@ extern void sparc_emit_set_symbolic_const64 (rtx, rtx, rtx); extern int sparc_splitdi_legitimate (rtx, rtx); extern int sparc_split_regreg_legitimate (rtx, rtx); extern int sparc_absnegfloat_split_legitimate (rtx, rtx); -extern const char *output_ubranch (rtx, int, rtx); +extern const char *output_ubranch (rtx, rtx); extern const char *output_cbranch (rtx, rtx, int, int, int, rtx); extern const char *output_return (rtx); extern const char *output_sibcall (rtx, rtx); @@ -79,10 +79,12 @@ extern const char *output_v8plus_shift (rtx, rtx *, const char *); extern const char *output_v8plus_mult (rtx, rtx *, const char *); extern const char *output_v9branch (rtx, rtx, int, int, int, int, rtx); extern const char *output_probe_stack_range (rtx, rtx); +extern const char *output_cbcond (rtx, rtx, rtx); extern bool emit_scc_insn (rtx []); extern void emit_conditional_branch_insn (rtx []); extern int mems_ok_for_ldd_peep (rtx, rtx, rtx); extern int empty_delay_slot (rtx); +extern int emit_cbcond_nop (rtx); extern int eligible_for_return_delay (rtx); extern int eligible_for_sibcall_delay (rtx); extern int tls_call_delay (rtx); diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c index 8849c03..202f064 100644 --- a/gcc/config/sparc/sparc.c +++ b/gcc/config/sparc/sparc.c @@ -840,6 +840,8 @@ dump_target_flag_bits (const int flags) fprintf (stderr, "VIS2 "); if (flags & MASK_VIS3) fprintf (stderr, "VIS3 "); + if (flags & MASK_CBCOND) + fprintf (stderr, "CBCOND "); if (flags & MASK_DEPRECATED_V8_INSNS) fprintf (stderr, "DEPRECATED_V8_INSNS "); 
if (flags & MASK_SPARCLET) @@ -946,7 +948,7 @@ sparc_option_override (void) MASK_V9|MASK_POPC|MASK_VIS2|MASK_VIS3|MASK_FMAF }, /* UltraSPARC T4 */ { "niagara4", MASK_ISA, - MASK_V9|MASK_POPC|MASK_VIS2|MASK_VIS3|MASK_FMAF }, + MASK_V9|MASK_POPC|MASK_VIS2|MASK_VIS3|MASK_FMAF|MASK_CBCOND }, }; const struct cpu_table *cpu; unsigned int i; @@ -1073,6 +1075,9 @@ sparc_option_override (void) #ifndef HAVE_AS_FMAF_HPC_VIS3 & ~(MASK_FMAF | MASK_VIS3) #endif +#ifndef HAVE_AS_SPARC4 + & ~MASK_CBCOND +#endif ); /* If -mfpu or -mno-fpu was explicitly used, don't override with @@ -1088,7 +1093,12 @@ sparc_option_override (void) if (TARGET_VIS3) target_flags |= MASK_VIS2 | MASK_VIS; - /* Don't allow -mvis, -mvis2, -mvis3, or -mfmaf if FPU is disabled. */ + /* -mcbcond implies -mvis3, -mvis2 and -mvis */ + if (TARGET_CBCOND) + target_flags |= MASK_VIS3 | MASK_VIS2 | MASK_VIS; + + /* Don't allow -mvis, -mvis2, -mvis3, or -mfmaf if FPU is + disabled. */ if (! TARGET_FPU) target_flags &= ~(MASK_VIS | MASK_VIS2 | MASK_VIS3 | MASK_FMAF); @@ -2660,6 +2670,24 @@ emit_v9_brxx_insn (enum rtx_code code, rtx op0, rtx label) pc_rtx))); } +/* Emit a conditional jump insn for the UA2011 architecture using + comparison code CODE and jump target LABEL. This function exists + to take advantage of the UA2011 Compare and Branch insns. 
*/ + +static void +emit_cbcond_insn (enum rtx_code code, rtx op0, rtx op1, rtx label) +{ + rtx if_then_else; + + if_then_else = gen_rtx_IF_THEN_ELSE (VOIDmode, + gen_rtx_fmt_ee(code, GET_MODE(op0), + op0, op1), + gen_rtx_LABEL_REF (VOIDmode, label), + pc_rtx); + + emit_jump_insn (gen_rtx_SET (VOIDmode, pc_rtx, if_then_else)); +} + void emit_conditional_branch_insn (rtx operands[]) { @@ -2674,6 +2702,20 @@ emit_conditional_branch_insn (rtx operands[]) operands[2] = XEXP (operands[0], 1); } + /* If we can tell early on that the comparison is against a constant + that won't fit in the 5-bit signed immediate field of a cbcond, + use one of the other v9 conditional branch sequences. */ + if (TARGET_CBCOND + && GET_CODE (operands[1]) == REG + && (GET_MODE (operands[1]) == SImode + || (TARGET_ARCH64 && GET_MODE (operands[1]) == DImode)) + && (GET_CODE (operands[2]) != CONST_INT + || SPARC_SIMM5_P (INTVAL (operands[2])))) + { + emit_cbcond_insn (GET_CODE (operands[0]), operands[1], operands[2], operands[3]); + return; + } + if (TARGET_ARCH64 && operands[2] == const0_rtx && GET_CODE (operands[1]) == REG && GET_MODE (operands[1]) == DImode) @@ -3014,6 +3056,44 @@ empty_delay_slot (rtx insn) return 1; } +/* Return nonzero if we should emit a nop after a cbcond instruction. + The cbcond instruction does not have a delay slot, however there is + a severe performance penalty if a control transfer appears right + after a cbcond. Therefore we emit a nop when we detect this + situation. */ + +int +emit_cbcond_nop (rtx insn) +{ + rtx next = next_real_insn (insn); + + if (!next) + return 1; + + if (GET_CODE (next) == INSN + && GET_CODE (PATTERN (next)) == SEQUENCE) + next = XVECEXP (PATTERN (next), 0, 0); + else if (GET_CODE (next) == CALL_INSN + && GET_CODE (PATTERN (next)) == PARALLEL) + { + rtx delay = XVECEXP (PATTERN (next), 0, 1); + + if (GET_CODE (delay) == RETURN) + { + /* It's a sibling call. 
Do not emit the nop if we're going + to emit something other than the jump itself as the first + instruction of the sibcall sequence. */ + if (sparc_leaf_function_p || TARGET_FLAT) + return 0; + } + } + + if (NONJUMP_INSN_P (next)) + return 0; + + return 1; +} + /* Return nonzero if TRIAL can go into the call delay slot. */ int @@ -7102,19 +7182,49 @@ sparc_preferred_simd_mode (enum machine_mode mode) DEST is the destination insn (i.e. the label), INSN is the source. */ const char * -output_ubranch (rtx dest, int label, rtx insn) +output_ubranch (rtx dest, rtx insn) { static char string[64]; bool v9_form = false; + int delta; char *p; - if (TARGET_V9 && INSN_ADDRESSES_SET_P ()) + /* Even if we are trying to use cbcond for this, evaluate + whether we can use V9 branches as our backup plan. */ + + delta = 5000000; + if (INSN_ADDRESSES_SET_P ()) + delta = (INSN_ADDRESSES (INSN_UID (dest)) + - INSN_ADDRESSES (INSN_UID (insn))); + + /* Leave some instructions for "slop". */ + if (TARGET_V9 && delta >= -260000 && delta < 260000) + v9_form = true; + + if (TARGET_CBCOND) { - int delta = (INSN_ADDRESSES (INSN_UID (dest)) - - INSN_ADDRESSES (INSN_UID (insn))); - /* Leave some instructions for "slop". 
*/ - if (delta >= -260000 && delta < 260000) - v9_form = true; + bool emit_nop = emit_cbcond_nop (insn); + bool far = false; + const char *rval; + + if (delta < -500 || delta > 500) + far = true; + + if (far) + { + if (v9_form) + rval = "ba,a,pt\t%%xcc, %l0"; + else + rval = "b,a\t%l0"; + } + else + { + if (emit_nop) + rval = "cwbe\t%%g0, %%g0, %l0\n\tnop"; + else + rval = "cwbe\t%%g0, %%g0, %l0"; + } + return rval; } if (v9_form) @@ -7125,7 +7235,7 @@ output_ubranch (rtx dest, int label, rtx insn) p = strchr (string, '\0'); *p++ = '%'; *p++ = 'l'; - *p++ = '0' + label; + *p++ = '0'; *p++ = '%'; *p++ = '('; *p = '\0'; @@ -7604,6 +7714,183 @@ sparc_emit_fixunsdi (rtx *operands, enum machine_mode mode) emit_label (donelab); } +/* Return the string to output a compare and branch instruction to DEST. + DEST is the destination insn (i.e. the label), INSN is the source, + and OP is the conditional expression. */ + +const char * +output_cbcond (rtx op, rtx dest, rtx insn) +{ + enum machine_mode mode = GET_MODE (XEXP (op, 0)); + enum rtx_code code = GET_CODE (op); + static char string[64]; + int far, emit_nop, len; + char *p; + + /* Compare and Branch is limited to +-2KB. If it is too far away, + change + + cxbne X, Y, .LC30 + + to + + cxbe X, Y, .+12 + ba,pt xcc, .LC30 + nop */ + + len = get_attr_length (insn); + + far = len == 3; + emit_nop = len == 2; + + if (far) + code = reverse_condition (code); + + p = string; + + *p++ = 'c'; + *p++ = mode == SImode ? 
'w' : 'x'; + *p++ = 'b'; + + switch (code) + { + case NE: + *p++ = 'n'; + *p++ = 'e'; + break; + + case EQ: + *p++ = 'e'; + break; + + case GE: + if (mode == CC_NOOVmode || mode == CCX_NOOVmode) + { + *p++ = 'p'; + *p++ = 'o'; + *p++ = 's'; + } + else + { + *p++ = 'g'; + *p++ = 'e'; + } + break; + + case GT: + *p++ = 'g'; + break; + + case LE: + *p++ = 'l'; + *p++ = 'e'; + break; + + case LT: + if (mode == CC_NOOVmode || mode == CCX_NOOVmode) + { + *p++ = 'n'; + *p++ = 'e'; + *p++ = 'g'; + } + else + *p++ = 'l'; + break; + + case GEU: + *p++ = 'c'; + *p++ = 'c'; + break; + + case GTU: + *p++ = 'g'; + *p++ = 'u'; + break; + + case LEU: + *p++ = 'l'; + *p++ = 'e'; + *p++ = 'u'; + break; + + case LTU: + *p++ = 'c'; + *p++ = 's'; + break; + + default: + gcc_unreachable (); + } + + *p++ = '\t'; + *p++ = '%'; + *p++ = '1'; + *p++ = ','; + *p++ = ' '; + *p++ = '%'; + *p++ = '2'; + *p++ = ','; + *p++ = ' '; + + if (far) + { + int veryfar = 1, delta; + + if (INSN_ADDRESSES_SET_P ()) + { + delta = (INSN_ADDRESSES (INSN_UID (dest)) + - INSN_ADDRESSES (INSN_UID (insn))); + /* Leave some instructions for "slop". */ + if (delta >= -260000 && delta < 260000) + veryfar = 0; + } + *p++ = '.'; + *p++ = '+'; + *p++ = '1'; + *p++ = '2'; + *p++ = '\n'; + *p++ = '\t'; + if (veryfar) + { + *p++ = 'b'; + *p++ = '\t'; + } + else + { + *p++ = 'b'; + *p++ = 'a'; + *p++ = ','; + *p++ = 'p'; + *p++ = 't'; + *p++ = '\t'; + *p++ = '%'; + *p++ = '%'; + *p++ = 'x'; + *p++ = 'c'; + *p++ = 'c'; + *p++ = ','; + *p++ = ' '; + } + } + + *p++ = '%'; + *p++ = 'l'; + *p++ = '3'; + + if (far || emit_nop) + { + *p++ = '\n'; + *p++ = '\t'; + *p++ = 'n'; + *p++ = 'o'; + *p++ = 'p'; + } + + *p = '\0'; + + return string; +} + /* Return the string to output a conditional branch to LABEL, testing register REG. LABEL is the operand number of the label; REG is the operand number of the reg. OP is the conditional expression. 
The mode diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h index 8f86100..374919f 100644 --- a/gcc/config/sparc/sparc.h +++ b/gcc/config/sparc/sparc.h @@ -195,7 +195,7 @@ extern enum cmodel sparc_cmodel; #endif #if TARGET_CPU_DEFAULT == TARGET_CPU_niagara4 #define CPP_CPU64_DEFAULT_SPEC "-D__sparc_v9__" -#define ASM_CPU64_DEFAULT_SPEC "-Av9" AS_NIAGARA3_FLAG +#define ASM_CPU64_DEFAULT_SPEC AS_NIAGARA4_FLAG #endif #else @@ -337,7 +337,7 @@ extern enum cmodel sparc_cmodel; %{mcpu=niagara:%{!mv8plus:-Av9b}} \ %{mcpu=niagara2:%{!mv8plus:-Av9b}} \ %{mcpu=niagara3:%{!mv8plus:-Av9" AS_NIAGARA3_FLAG "}} \ -%{mcpu=niagara4:%{!mv8plus:-Av9" AS_NIAGARA3_FLAG "}} \ +%{mcpu=niagara4:%{!mv8plus:" AS_NIAGARA4_FLAG "}} \ %{!mcpu*:%(asm_cpu_default)} \ " @@ -1006,7 +1006,8 @@ extern char leaf_reg_remap[]; /* Local macro to handle the two v9 classes of FP regs. */ #define FP_REG_CLASS_P(CLASS) ((CLASS) == FP_REGS || (CLASS) == EXTRA_FP_REGS) -/* Predicates for 10-bit, 11-bit and 13-bit signed constants. */ +/* Predicates for 5-bit, 10-bit, 11-bit and 13-bit signed constants. */ +#define SPARC_SIMM5_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x10 < 0x20) #define SPARC_SIMM10_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x200 < 0x400) #define SPARC_SIMM11_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x400 < 0x800) #define SPARC_SIMM13_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x1000 < 0x2000) @@ -1746,6 +1747,12 @@ extern int sparc_indent_opcode; #define AS_NIAGARA3_FLAG "d" #endif +#ifndef HAVE_AS_SPARC4 +#define AS_NIAGARA4_FLAG " -xarch=v9b" +#else +#define AS_NIAGARA4_FLAG " -xarch=sparc4" +#endif + /* We use gcc _mcount for profiling. 
*/ #define NO_PROFILE_COUNTERS 0 diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md index f604f46..bdc8a8d 100644 --- a/gcc/config/sparc/sparc.md +++ b/gcc/config/sparc/sparc.md @@ -257,6 +257,7 @@ "ialu,compare,shift, load,sload,store, uncond_branch,branch,call,sibcall,call_no_delay_slot,return, + cbcond,uncond_cbcond, imul,idiv, fpload,fpstore, fp,fpmove, @@ -275,6 +276,12 @@ (symbol_ref "(empty_delay_slot (insn) ? EMPTY_DELAY_SLOT_TRUE : EMPTY_DELAY_SLOT_FALSE)")) +;; True if we are making use of compare-and-branch instructions. +;; True if we should emit a nop after a cbcond instruction +(define_attr "emit_cbcond_nop" "false,true" + (symbol_ref "(emit_cbcond_nop (insn) + ? EMIT_CBCOND_NOP_TRUE : EMIT_CBCOND_NOP_FALSE)")) + (define_attr "branch_type" "none,icc,fcc,reg" (const_string "none")) @@ -377,6 +384,30 @@ (if_then_else (eq_attr "empty_delay_slot" "true") (const_int 4) (const_int 3)))) + (eq_attr "type" "cbcond") + (if_then_else (lt (pc) (match_dup 3)) + (if_then_else (lt (minus (match_dup 3) (pc)) (const_int 500)) + (if_then_else (eq_attr "emit_cbcond_nop" "true") + (const_int 2) + (const_int 1)) + (const_int 3)) + (if_then_else (lt (minus (pc) (match_dup 3)) (const_int 500)) + (if_then_else (eq_attr "emit_cbcond_nop" "true") + (const_int 2) + (const_int 1)) + (const_int 3))) + (eq_attr "type" "uncond_cbcond") + (if_then_else (lt (pc) (match_dup 0)) + (if_then_else (lt (minus (match_dup 0) (pc)) (const_int 500)) + (if_then_else (eq_attr "emit_cbcond_nop" "true") + (const_int 2) + (const_int 1)) + (const_int 1)) + (if_then_else (lt (minus (pc) (match_dup 0)) (const_int 500)) + (if_then_else (eq_attr "emit_cbcond_nop" "true") + (const_int 2) + (const_int 1)) + (const_int 1))) ] (const_int 1))) ;; FP precision. @@ -397,7 +428,7 @@ ? 
TLS_CALL_DELAY_TRUE : TLS_CALL_DELAY_FALSE)")) (define_attr "in_call_delay" "false,true" - (cond [(eq_attr "type" "uncond_branch,branch,call,sibcall,call_no_delay_slot,multi") + (cond [(eq_attr "type" "uncond_branch,branch,cbcond,uncond_cbcond,call,sibcall,call_no_delay_slot,multi") (const_string "false") (eq_attr "type" "load,fpload,store,fpstore") (if_then_else (eq_attr "length" "1") @@ -431,19 +462,19 @@ ;; because it prevents us from moving back the final store of inner loops. (define_attr "in_branch_delay" "false,true" - (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,sibcall,call_no_delay_slot,multi") + (if_then_else (and (eq_attr "type" "!uncond_branch,branch,cbcond,uncond_cbcond,call,sibcall,call_no_delay_slot,multi") (eq_attr "length" "1")) (const_string "true") (const_string "false"))) (define_attr "in_uncond_branch_delay" "false,true" - (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,sibcall,call_no_delay_slot,multi") + (if_then_else (and (eq_attr "type" "!uncond_branch,branch,cbcond,uncond_cbcond,call,sibcall,call_no_delay_slot,multi") (eq_attr "length" "1")) (const_string "true") (const_string "false"))) (define_attr "in_annul_branch_delay" "false,true" - (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,sibcall,call_no_delay_slot,multi") + (if_then_else (and (eq_attr "type" "!uncond_branch,branch,cbcond,uncond_cbcond,call,sibcall,call_no_delay_slot,multi") (eq_attr "length" "1")) (const_string "true") (const_string "false"))) @@ -1313,6 +1344,32 @@ ;; SPARC V9-specific jump insns. None of these are guaranteed to be ;; in the architecture. 
+(define_insn "*cbcond_sp32" + [(set (pc) + (if_then_else (match_operator 0 "noov_compare_operator" + [(match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "arith5_operand" "rA")]) + (label_ref (match_operand 3 "" "")) + (pc)))] + "TARGET_CBCOND" +{ + return output_cbcond (operands[0], operands[3], insn); +} + [(set_attr "type" "cbcond")]) + +(define_insn "*cbcond_sp64" + [(set (pc) + (if_then_else (match_operator 0 "noov_compare_operator" + [(match_operand:DI 1 "register_operand" "r") + (match_operand:DI 2 "arith5_operand" "rA")]) + (label_ref (match_operand 3 "" "")) + (pc)))] + "TARGET_ARCH64 && TARGET_CBCOND" +{ + return output_cbcond (operands[0], operands[3], insn); +} + [(set_attr "type" "cbcond")]) + ;; There are no 32 bit brreg insns. ;; XXX @@ -6076,12 +6133,22 @@ ;; Unconditional and other jump instructions. -(define_insn "jump" +(define_expand "jump" [(set (pc) (label_ref (match_operand 0 "" "")))] - "" - "* return output_ubranch (operands[0], 0, insn);" + "") + +(define_insn "*jump_ubranch" + [(set (pc) (label_ref (match_operand 0 "" "")))] + "!TARGET_CBCOND" + "* return output_ubranch (operands[0], insn);" [(set_attr "type" "uncond_branch")]) +(define_insn "*jump_cbcond" + [(set (pc) (label_ref (match_operand 0 "" "")))] + "TARGET_CBCOND" + "* return output_ubranch (operands[0], insn);" + [(set_attr "type" "uncond_cbcond")]) + (define_expand "tablejump" [(parallel [(set (pc) (match_operand 0 "register_operand" "r")) (use (label_ref (match_operand 1 "" "")))])] diff --git a/gcc/config/sparc/sparc.opt b/gcc/config/sparc/sparc.opt index 58ba6b7..241cb07 100644 --- a/gcc/config/sparc/sparc.opt +++ b/gcc/config/sparc/sparc.opt @@ -73,6 +73,10 @@ mvis3 Target Report Mask(VIS3) Use UltraSPARC Visual Instruction Set version 3.0 extensions +mcbcond +Target Report Mask(CBCOND) +Use UltraSPARC Compare-and-Branch extensions + mfmaf Target Report Mask(FMAF) Use UltraSPARC Fused Multiply-Add extensions diff --git a/gcc/configure b/gcc/configure index 
a223c60..3fc0088 100755 --- a/gcc/configure +++ b/gcc/configure @@ -24090,6 +24090,48 @@ if test $gcc_cv_as_sparc_fmaf = yes; then $as_echo "#define HAVE_AS_FMAF_HPC_VIS3 1" >>confdefs.h fi + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for SPARC4 instructions" >&5 +$as_echo_n "checking assembler for SPARC4 instructions... " >&6; } +if test "${gcc_cv_as_sparc_sparc4+set}" = set; then : + $as_echo_n "(cached) " >&6 +else + gcc_cv_as_sparc_sparc4=no + if test x$gcc_cv_as != x; then + $as_echo '.text + .register %g2, #scratch + .register %g3, #scratch + .align 4 + cxbe %g2, %g3, 1f +1: cwbneg %g2, %g3, 1f +1: sha1 + md5 + aes_kexpand0 %f4, %f6, %f8 + des_round %f38, %f40, %f42, %f44 + camellia_f %f54, %f56, %f58, %f60 + kasumi_fi_xor %f46, %f48, %f50, %f52' > conftest.s + if { ac_try='$gcc_cv_as $gcc_cv_as_flags -xarch=sparc4 -o conftest.o conftest.s >&5' + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 + (eval $ac_try) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 + test $ac_status = 0; }; } + then + gcc_cv_as_sparc_sparc4=yes + else + echo "configure: failed program was" >&5 + cat conftest.s >&5 + fi + rm -f conftest.o conftest.s + fi +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_sparc_sparc4" >&5 +$as_echo "$gcc_cv_as_sparc_sparc4" >&6; } +if test $gcc_cv_as_sparc_sparc4 = yes; then + +$as_echo "#define HAVE_AS_SPARC4 1" >>confdefs.h + +fi ;; i[34567]86-*-* | x86_64-*-*) diff --git a/gcc/configure.ac b/gcc/configure.ac index 17e1d86..95007a2 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -3501,6 +3501,24 @@ foo: fnaddd %f10, %f12, %f14],, [AC_DEFINE(HAVE_AS_FMAF_HPC_VIS3, 1, [Define if your assembler supports FMAF, HPC, and VIS 3.0 instructions.])]) + + gcc_GAS_CHECK_FEATURE([SPARC4 instructions], + gcc_cv_as_sparc_sparc4,, + [-xarch=sparc4], + [.text + .register %g2, #scratch + .register %g3, #scratch + .align 4 + cxbe %g2, %g3, 1f +1: cwbneg %g2, %g3, 1f +1: sha1 + md5 + aes_kexpand0 %f4, %f6, %f8 + des_round %f38, %f40, %f42, %f44 + camellia_f %f54, %f56, %f58, %f60 + kasumi_fi_xor %f46, %f48, %f50, %f52],, + [AC_DEFINE(HAVE_AS_SPARC4, 1, + [Define if your assembler supports SPARC4 instructions.])]) ;; changequote(,)dnl diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index f8c9230..fc2addc 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -918,6 +918,7 @@ See RS/6000 and PowerPC Options. -munaligned-doubles -mno-unaligned-doubles @gol -mv8plus -mno-v8plus -mvis -mno-vis @gol -mvis2 -mno-vis2 -mvis3 -mno-vis3 @gol +-mcbcond -mno-cbcond @gol -mfmaf -mno-fmaf -mpopc -mno-popc @gol -mfix-at697f} @@ -18878,6 +18879,16 @@ default is @option{-mvis3} when targeting a cpu that supports such instructions, such as niagara-3 and later. Setting @option{-mvis3} also sets @option{-mvis2} and @option{-mvis}. 
+@item -mcbcond +@itemx -mno-cbcond +@opindex mcbcond +@opindex mno-cbcond +With @option{-mcbcond}, GCC generates code that takes advantage of +compare-and-branch instructions, as defined in the Sparc Architecture 2011. +The default is @option{-mcbcond} when targeting a cpu that supports such +instructions, such as niagara-4 and later. Setting @option{-mcbcond} also +sets @option{-mvis3}, @option{-mvis2}, and @option{-mvis}. + @item -mpopc @itemx -mno-popc @opindex mpopc diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 32866d5..250cb1c 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -3157,6 +3157,9 @@ when the Visual Instruction Set is available. @item h 64-bit global or out register for the SPARC-V8+ architecture. +@item A +Signed 5-bit constant + @item D A vector constant
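
A note on the range check used by the new 'A' constraint and the arith5_operand predicate: SPARC_SIMM5_P biases the value by 0x10 and does a single unsigned compare against 0x20, which should accept exactly the signed 5-bit range -16..15. The sketch below is not part of the patch. It restates the macro outside of GCC to make that range explicit, under the assumption that a 64-bit integer is an adequate stand-in for HOST_WIDE_INT. The names host_wide_int, SIMM5_P, fits_simm5 and simm5_self_check are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 64-bit stand-in for GCC's HOST_WIDE_INT, whose real
   definition is chosen at configure time.  */
typedef int64_t host_wide_int;

/* Same shape as SPARC_SIMM5_P in the patch: add a bias of 16, then do
   one unsigned comparison.  Values below -16 wrap around to very large
   unsigned numbers and therefore fail the test.  */
#define SIMM5_P(X) ((uint64_t) (X) + 0x10 < 0x20)

/* Hypothetical helper: the plain two-sided range test that the biased
   comparison is meant to be equivalent to.  */
bool
fits_simm5 (host_wide_int x)
{
  return x >= -16 && x <= 15;
}

/* Hypothetical self-check: confirm that both formulations agree on a
   window of values around the boundaries.  */
void
simm5_self_check (void)
{
  for (host_wide_int x = -64; x <= 64; x++)
    assert (SIMM5_P (x) == fits_simm5 (x));
}
```

With this reading, a comparison such as "x == 15" can use the immediate form of cbcond, while "x == 16" has to go through a register or one of the other V9 branch sequences, which is the case the heuristic in emit_conditional_branch_insn filters out early.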