From patchwork Fri Dec 11 08:11:49 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mike Stump
X-Patchwork-Id: 555586
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 0E9A71402D8
 for ; Fri, 11 Dec 2015 19:13:26 +1100 (AEDT)
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected)
 header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=eQMq2jAT;
 dkim-atps=neutral
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender:from
 :content-type:content-transfer-encoding:subject:message-id:date
 :to:mime-version; q=dns; s=default; b=tw+7d4ISc4nRzeFOSZ++zTE9wE
 2nShdCVZ1b5egB068SjJUyZnDs252d8GrUxxItsPD6ODtdksu/YlEFHVTb9eb7Lp
 gAsLmdP06vHnC1+9P+FQONTfEEyYK5zUtfL7yFoA0zyije3HrXSWy18u/2mCjn82
 LaGdNBpkeKFgPrcNg=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender:from
 :content-type:content-transfer-encoding:subject:message-id:date
 :to:mime-version; s=default; bh=Wox79kj/l4vqV7JfwcD0ayYY+uk=; b=
 eQMq2jATLxuIMQvxU2wU9PhEn9zaEnpWqFHWZM5uZq8NNwcTh9TCztQHVsUdzjBg
 /nmqlqtzxLC644YxRvM7/yFc771rZKSr3bKNQ0RqxQTwaRsf0n259jRRo3aeR+UI
 JBoBAKvqJ7nA17NHKVTmcpvZWF+6tqBj2QcVw7prxb4=
Received: (qmail 37006 invoked by alias); 11 Dec 2015 08:13:19 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 36991 invoked by uid 89); 11 Dec 2015 08:13:18 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=0.3 required=5.0 tests=BAYES_05, FREEMAIL_FROM,
 KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_NONE, SPF_PASS, T_RP_MATCHES_RCVD
 autolearn=no version=3.3.2
X-HELO: resqmta-po-07v.sys.comcast.net
Received: from resqmta-po-07v.sys.comcast.net
 (HELO resqmta-po-07v.sys.comcast.net) (96.114.154.166)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a)
 with (AES128-SHA encrypted) ESMTPS; Fri, 11 Dec 2015 08:13:16 +0000
Received: from resomta-po-11v.sys.comcast.net ([96.114.154.235])
 by resqmta-po-07v.sys.comcast.net with comcast
 id s8DE1r00654zqzk018DEq9; Fri, 11 Dec 2015 08:13:14 +0000
Received: from [IPv6:2001:558:6045:a4:40c6:7199:cd03:b02d]
 ([IPv6:2001:558:6045:a4:40c6:7199:cd03:b02d])
 by resomta-po-11v.sys.comcast.net with comcast
 id s8DD1r0052ztT3H018DEf7; Fri, 11 Dec 2015 08:13:14 +0000
From: Mike Stump
Subject: fix scheduling antideps
Message-Id: <0E3A13C8-561C-4587-AEAE-A067F9DD356D@comcast.net>
Date: Fri, 11 Dec 2015 00:11:49 -0800
To: GCC Patches
Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\))
X-IsSubscribed: yes

This patch allows a target to increase the cost of anti-deps to better
reflect the actual cost on the machine. This gets me 5% more
performance on an important inner loop by exposing the actual cost of
long dep chains that have lots of anti-deps in them. By scheduling the
longer chain first, we have more opportunities to fill in the holes
with content from the other, less critical chains.

I’m unsure if all machines should have a cost of 1, or just some
machines. I suspect that OOO can hide the dep chains well enough so
that the value 0 is more appropriate.

Ok?

Index: defaults.h
===================================================================
--- defaults.h	(revision 231539)
+++ defaults.h	(working copy)
@@ -1486,6 +1486,10 @@
 #define TARGET_VTABLE_USES_DESCRIPTORS 0
 #endif
 
+#ifndef TARGET_ANTI_DEP_COST
+#define TARGET_ANTI_DEP_COST 0
+#endif
+
 #endif /* GCC_INSN_FLAGS_H */
 
 #endif /* ! GCC_DEFAULTS_H */
Index: doc/tm.texi
===================================================================
--- doc/tm.texi	(revision 231539)
+++ doc/tm.texi	(working copy)
@@ -6970,6 +6970,13 @@
 the hook implementation for how different fusion types are supported.
 @end deftypefn
 
+@defmac TARGET_ANTI_DEP_COST
+The cost in cycles for an anti-dependency. Defaults to 0. On non-OOO
+multi-issue machines that can't issue instructions that have
+overlapping registers in the same cycle, a value of 1 will better
+reflect the actual cost of the instruction sequence.
+@end defmac
+
 @node Sections
 @section Dividing the Output into Sections (Texts, Data, @dots{})
 @c the above section title is WAY too long. maybe cut the part between
Index: doc/tm.texi.in
===================================================================
--- doc/tm.texi.in	(revision 231539)
+++ doc/tm.texi.in	(working copy)
@@ -4852,6 +4852,13 @@
 
 @hook TARGET_SCHED_FUSION_PRIORITY
 
+@defmac TARGET_ANTI_DEP_COST
+The cost in cycles for an anti-dependency. Defaults to 0. On non-OOO
+multi-issue machines that can't issue instructions that have
+overlapping registers in the same cycle, a value of 1 will better
+reflect the actual cost of the instruction sequence.
+@end defmac
+
 @node Sections
 @section Dividing the Output into Sections (Texts, Data, @dots{})
 @c the above section title is WAY too long. maybe cut the part between
Index: haifa-sched.c
===================================================================
--- haifa-sched.c	(revision 231539)
+++ haifa-sched.c	(working copy)
@@ -1470,7 +1470,7 @@
       if (INSN_CODE (insn) >= 0)
         {
           if (dep_type == REG_DEP_ANTI)
-            cost = 0;
+            cost = TARGET_ANTI_DEP_COST;
           else if (dep_type == REG_DEP_OUTPUT)
             {
               cost = (insn_default_latency (insn)
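
To illustrate why a nonzero anti-dep cost can change which chain the
scheduler treats as critical, here is a minimal stand-alone sketch in C.
It is not GCC code: toy_dep, toy_dep_cost and toy_chain_length are
hypothetical names invented for this example, and a chain's length is
modelled simply as the sum of its edge costs, which is much cruder than
what haifa-sched.c actually computes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; none of these names exist in GCC.  */
enum toy_dep_kind { TOY_DEP_TRUE, TOY_DEP_ANTI };

struct toy_dep
{
  enum toy_dep_kind kind;
  int latency;   /* Producer latency in cycles, used for true deps.  */
};

/* Cost of one dependence edge.  ANTI_COST plays the role that
   TARGET_ANTI_DEP_COST plays in the patch: 0 keeps the old behaviour,
   1 charges one cycle per anti-dependency.  */
static int
toy_dep_cost (const struct toy_dep *dep, int anti_cost)
{
  return dep->kind == TOY_DEP_ANTI ? anti_cost : dep->latency;
}

/* Length of a straight-line chain, modelled as the sum of the costs
   of its N edges.  */
static int
toy_chain_length (const struct toy_dep *deps, size_t n, int anti_cost)
{
  int total = 0;

  assert (deps != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    total += toy_dep_cost (&deps[i], anti_cost);
  return total;
}
```

Under these assumptions, a chain of four anti-deps followed by one
single-cycle true dep has length 1 when the anti-dep cost is 0 and
length 5 when it is 1, while a chain of three single-cycle true deps
has length 3 either way. With a cost of 0 the second chain looks
longer; with a cost of 1 the first chain is the one to schedule first.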