From patchwork Wed Oct 10 08:00:16 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Claudiu Zissulescu Ianculescu
X-Patchwork-Id: 981747
From: Claudiu Zissulescu
To: gcc-patches@gcc.gnu.org
Cc: andrew.burgess@embecosm.com, fbedard@synopsys.com, claziss@synopsys.com
Subject: [PATCH 6/6] [ARC] Handle store cacheline hazard.
Date: Wed, 10 Oct 2018 11:00:16 +0300
Message-Id: <20181010080016.12317-7-claziss@gmail.com>
In-Reply-To: <20181010080016.12317-1-claziss@gmail.com>
References: <20181010080016.12317-1-claziss@gmail.com>
X-IsSubscribed: yes

Handle the store cacheline hazard on A700 CPUs by inserting two NOP_S
between ST ST LD or their logical equivalent (like ST ST NOP_S NOP_S
J_L.D LD).

gcc/
2016-08-01  Claudiu Zissulescu

        * config/arc/arc-arch.h (ARC_TUNE_ARC7XX): New tune value.
        * config/arc/arc.c (arc_active_insn): New function.
        (check_store_cacheline_hazard): Likewise.
        (workaround_arc_anomaly): Use check_store_cacheline_hazard.
        (arc_override_options): Disable delay slot scheduler for older A7.
        (arc_store_addr_hazard_p): New implementation, old one renamed to ...
        (arc_store_addr_hazard_internal_p): Renamed.
        (arc_reorg): Don't combine into brcc instructions which are part
        of hardware hazard solution.
        * config/arc/arc.md (attr tune): Consider new arc7xx tune value.
        (tune_arc700): Likewise.
        * config/arc/arc.opt (arc7xx): New tune value.
        * config/arc/arc700.md: Improve A7 scheduler.
---
 gcc/config/arc/arc-arch.h |   1 +
 gcc/config/arc/arc.c      | 142 ++++++++++++++++++++++++++++++++------
 gcc/config/arc/arc.md     |   8 ++-
 gcc/config/arc/arc.opt    |   3 +
 gcc/config/arc/arc700.md  |  18 +----
 5 files changed, 132 insertions(+), 40 deletions(-)

diff --git a/gcc/config/arc/arc-arch.h b/gcc/config/arc/arc-arch.h
index 859af0684b8..ad540607e55 100644
--- a/gcc/config/arc/arc-arch.h
+++ b/gcc/config/arc/arc-arch.h
@@ -71,6 +71,7 @@ enum arc_tune_attr
   {
     ARC_TUNE_NONE,
     ARC_TUNE_ARC600,
+    ARC_TUNE_ARC7XX,
     ARC_TUNE_ARC700_4_2_STD,
     ARC_TUNE_ARC700_4_2_XMAC,
     ARC_TUNE_CORE_3,
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index ab7735d6b38..90454928379 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -1308,6 +1308,10 @@ arc_override_options (void)
   if (TARGET_LONG_CALLS_SET)
     target_flags &= ~MASK_MILLICODE_THUNK_SET;

+  /* A7 has an issue with delay slots.  */
+  if (TARGET_ARC700 && (arc_tune != ARC_TUNE_ARC7XX))
+    flag_delayed_branch = 0;
+
   /* These need to be done at start up.  It's convenient to do them here.  */
   arc_init ();
 }
@@ -7529,11 +7533,91 @@ arc_invalid_within_doloop (const rtx_insn *insn)
   return NULL;
 }

+static rtx_insn *
+arc_active_insn (rtx_insn *insn)
+{
+  rtx_insn *nxt = next_active_insn (insn);
+
+  if (nxt && GET_CODE (PATTERN (nxt)) == ASM_INPUT)
+    nxt = next_active_insn (nxt);
+  return nxt;
+}
+
+/* Search for a sequence made out of two stores and a given number of
+   loads, insert a nop if required.  */
+
+static void
+check_store_cacheline_hazard (void)
+{
+  rtx_insn *insn, *succ0, *insn1;
+  bool found = false;
+
+  for (insn = get_insns (); insn; insn = arc_active_insn (insn))
+    {
+      succ0 = arc_active_insn (insn);
+
+      if (!succ0)
+        return;
+
+      if (!single_set (insn) || !single_set (succ0))
+        continue;
+
+      if ((get_attr_type (insn) != TYPE_STORE)
+          || (get_attr_type (succ0) != TYPE_STORE))
+        continue;
+
+      /* Found at least two consecutive stores.  Go to the end of the
+         store sequence.  */
+      for (insn1 = succ0; insn1; insn1 = arc_active_insn (insn1))
+        if (!single_set (insn1) || get_attr_type (insn1) != TYPE_STORE)
+          break;
+
+      /* Now, check the next two instructions for the following cases:
+         1. next instruction is a LD => insert 2 nops between store
+         sequence and load.
+         2. next-next instruction is a LD => insert 1 nop after the store
+         sequence.  */
+      if (insn1 && single_set (insn1)
+          && (get_attr_type (insn1) == TYPE_LOAD))
+        {
+          found = true;
+          emit_insn_before (gen_nopv (), insn1);
+          emit_insn_before (gen_nopv (), insn1);
+        }
+      else
+        {
+          if (insn1 && (get_attr_type (insn1) == TYPE_COMPARE))
+            {
+              /* REG_SAVE_NOTE is used by the Haifa scheduler; we are in
+                 reorg, so it is safe to reuse it to keep the
+                 current compare insn from being part of a BRcc
+                 optimization.  */
+              add_reg_note (insn1, REG_SAVE_NOTE, GEN_INT (3));
+            }
+          insn1 = arc_active_insn (insn1);
+          if (insn1 && single_set (insn1)
+              && (get_attr_type (insn1) == TYPE_LOAD))
+            {
+              found = true;
+              emit_insn_before (gen_nopv (), insn1);
+            }
+        }
+
+      insn = insn1;
+      if (found)
+        {
+          /* warning (0, "Potential lockup sequence found, patching");  */
+          found = false;
+        }
+    }
+}
+
 /* Return true if a load instruction (CONSUMER) uses the same address as
    a store instruction (PRODUCER).  This function is used to avoid st/ld
    address hazard in ARC700 cores.  */
-bool
-arc_store_addr_hazard_p (rtx_insn* producer, rtx_insn* consumer)
+
+static bool
+arc_store_addr_hazard_internal_p (rtx_insn* producer, rtx_insn* consumer)
 {
   rtx in_set, out_set;
   rtx out_addr, in_addr;
@@ -7581,6 +7665,14 @@ arc_store_addr_hazard_p (rtx_insn* producer, rtx_insn* consumer)
   return false;
 }

+bool
+arc_store_addr_hazard_p (rtx_insn* producer, rtx_insn* consumer)
+{
+  if (TARGET_ARC700 && (arc_tune != ARC_TUNE_ARC7XX))
+    return true;
+  return arc_store_addr_hazard_internal_p (producer, consumer);
+}
+
 /* The same functionality as arc_hazard.  It is called in machine
    reorg before any other optimization.  Hence, the NOP size is taken
    into account when doing branch shortening.  */
@@ -7589,6 +7681,7 @@ static void
 workaround_arc_anomaly (void)
 {
   rtx_insn *insn, *succ0;
+  rtx_insn *succ1;

   /* For any architecture: call arc_hazard here.  */
   for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
@@ -7600,27 +7693,30 @@ workaround_arc_anomaly (void)
         }
     }

-  if (TARGET_ARC700)
-    {
-      rtx_insn *succ1;
+  if (!TARGET_ARC700)
+    return;

-      for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
-        {
-          succ0 = next_real_insn (insn);
-          if (arc_store_addr_hazard_p (insn, succ0))
-            {
-              emit_insn_after (gen_nopv (), insn);
-              emit_insn_after (gen_nopv (), insn);
-              continue;
-            }
+  /* Old A7 cores suffer from a cache hazard, so we need to insert two
+     nops between any sequence of stores and a load.  */
+  if (arc_tune != ARC_TUNE_ARC7XX)
+    check_store_cacheline_hazard ();

-          /* Avoid adding nops if the instruction between the ST and LD is
-             a call or jump.  */
-          succ1 = next_real_insn (succ0);
-          if (succ0 && !JUMP_P (succ0) && !CALL_P (succ0)
-              && arc_store_addr_hazard_p (insn, succ1))
-            emit_insn_after (gen_nopv (), insn);
+  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
+    {
+      succ0 = next_real_insn (insn);
+      if (arc_store_addr_hazard_internal_p (insn, succ0))
+        {
+          emit_insn_after (gen_nopv (), insn);
+          emit_insn_after (gen_nopv (), insn);
+          continue;
         }
+
+      /* Avoid adding nops if the instruction between the ST and LD is
+         a call or jump.  */
+      succ1 = next_real_insn (succ0);
+      if (succ0 && !JUMP_P (succ0) && !CALL_P (succ0)
+          && arc_store_addr_hazard_internal_p (insn, succ1))
+        emit_insn_after (gen_nopv (), insn);
     }
 }

@@ -8291,11 +8387,15 @@ arc_reorg (void)
               if (!link_insn)
                 continue;
               else
-                /* Check if this is a data dependency.  */
                 {
+                  /* Check if this is a data dependency.  */
                   rtx op, cc_clob_rtx, op0, op1, brcc_insn, note;
                   rtx cmp0, cmp1;

+                  /* Make sure we can use it for brcc insns.  */
+                  if (find_reg_note (link_insn, REG_SAVE_NOTE, GEN_INT (3)))
+                    continue;
+
                   /* Ok this is the set cc. copy args here.  */
                   op = XEXP (pc_target, 0);

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index fb8a1c9ee09..caf7deda505 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -600,11 +600,13 @@
 ;; somehow modify them to become inelegible for delay slots if a decision
 ;; is made that makes conditional execution required.

-(define_attr "tune" "none,arc600,arc700_4_2_std,arc700_4_2_xmac, core_3, \
-archs4x, archs4xd, archs4xd_slow"
+(define_attr "tune" "none,arc600,arc7xx,arc700_4_2_std,arc700_4_2_xmac, \
+core_3, archs4x, archs4xd, archs4xd_slow"
   (const
    (cond [(symbol_ref "arc_tune == TUNE_ARC600")
           (const_string "arc600")
+          (symbol_ref "arc_tune == ARC_TUNE_ARC7XX")
+          (const_string "arc7xx")
           (symbol_ref "arc_tune == TUNE_ARC700_4_2_STD")
           (const_string "arc700_4_2_std")
           (symbol_ref "arc_tune == TUNE_ARC700_4_2_XMAC")
@@ -619,7 +621,7 @@ archs4x, archs4xd, archs4xd_slow"
          (const_string "none"))))

 (define_attr "tune_arc700" "false,true"
-  (if_then_else (eq_attr "tune" "arc700_4_2_std, arc700_4_2_xmac")
+  (if_then_else (eq_attr "tune" "arc7xx, arc700_4_2_std, arc700_4_2_xmac")
                 (const_string "true")
                 (const_string "false")))

diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt
index 93e18af1d27..bcffb2720ba 100644
--- a/gcc/config/arc/arc.opt
+++ b/gcc/config/arc/arc.opt
@@ -262,6 +262,9 @@ Enum(arc_tune_attr) String(arc600) Value(ARC_TUNE_ARC600)
 EnumValue
 Enum(arc_tune_attr) String(arc601) Value(ARC_TUNE_ARC600)

+EnumValue
+Enum(arc_tune_attr) String(arc7xx) Value(ARC_TUNE_ARC7XX)
+
 EnumValue
 Enum(arc_tune_attr) String(arc700) Value(ARC_TUNE_ARC700_4_2_STD)

diff --git a/gcc/config/arc/arc700.md b/gcc/config/arc/arc700.md
index a0f9f74a9f2..cbb868d8dcd 100644
--- a/gcc/config/arc/arc700.md
+++ b/gcc/config/arc/arc700.md
@@ -145,28 +145,14 @@
 ; no functional unit runs when blockage is reserved
 (exclusion_set "blockage" "core, multiplier")

-(define_insn_reservation "data_load_DI" 4
-  (and (eq_attr "tune_arc700" "true")
-       (eq_attr "type" "load")
-       (match_operand:DI 0 "" ""))
-  "issue+dmp, issue+dmp, dmp_write_port, dmp_write_port")
-
 (define_insn_reservation "data_load" 3
   (and (eq_attr "tune_arc700" "true")
-       (eq_attr "type" "load")
-       (not (match_operand:DI 0 "" "")))
+       (eq_attr "type" "load"))
   "issue+dmp, nothing, dmp_write_port")

-(define_insn_reservation "data_store_DI" 2
-  (and (eq_attr "tune_arc700" "true")
-       (eq_attr "type" "store")
-       (match_operand:DI 0 "" ""))
-  "issue+dmp_write_port, issue+dmp_write_port")
-
 (define_insn_reservation "data_store" 1
   (and (eq_attr "tune_arc700" "true")
-       (eq_attr "type" "store")
-       (not (match_operand:DI 0 "" "")))
+       (eq_attr "type" "store"))
   "issue+dmp_write_port")

 (define_bypass 3 "data_store" "data_load" "arc_store_addr_hazard_p")
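
For readers who want to see the padding rule of check_store_cacheline_hazard in isolation, here is a rough standalone sketch. It is not part of the patch and does not use any GCC internals: the enum insn_kind values and the function pad_store_hazards are hypothetical names made up for illustration, and the instruction stream is modelled as a plain array instead of an RTL insn chain. It only mirrors the two cases handled above (a run of two or more stores followed directly by a load gets two nops, a run followed by one other instruction and then a load gets one nop before the load). The ASM_INPUT skipping, the single_set checks and the REG_SAVE_NOTE marking are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction kinds standing in for the insn "type"
   attribute; these are not GCC names.  */
enum insn_kind { K_STORE, K_LOAD, K_OTHER, K_NOP };

/* Copy the N instructions of IN to OUT, padding store/load hazards with
   K_NOP.  OUT must have room for 3 * N entries.  Returns the number of
   entries written.  */
static size_t
pad_store_hazards (const enum insn_kind *in, size_t n, enum insn_kind *out)
{
  size_t i = 0, o = 0;

  assert (in != NULL && out != NULL);
  while (i < n)
    {
      size_t j;

      /* Anything that is not the start of a run of two or more stores
         is copied through unchanged.  */
      if (i + 1 >= n || in[i] != K_STORE || in[i + 1] != K_STORE)
        {
          out[o++] = in[i++];
          continue;
        }

      /* Copy the whole run of stores; J stops at the first non-store.  */
      for (j = i; j < n && in[j] == K_STORE; j++)
        out[o++] = in[j];

      if (j < n && in[j] == K_LOAD)
        {
          /* ST ST LD: two nops between the stores and the load.  */
          out[o++] = K_NOP;
          out[o++] = K_NOP;
        }
      else if (j + 1 < n && in[j + 1] == K_LOAD)
        {
          /* ST ST X LD: keep X, then add one nop before the load.  */
          out[o++] = in[j++];
          out[o++] = K_NOP;
        }

      /* Resume at the first instruction not yet copied.  */
      i = j;
    }
  return o;
}
```

With this model, ST ST LD becomes ST ST NOP NOP LD and ST ST X LD becomes ST ST X NOP LD, while a single store followed by a load is left alone. The resume point after a hazard is simplified compared to the loop in the patch.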