From patchwork Thu Nov 8 12:10:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyrill Tkachov X-Patchwork-Id: 994822 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-489347-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=foss.arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="NSnyUaO3"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 42rMZ10BcZz9sBN for ; Thu, 8 Nov 2018 23:10:48 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; q=dns; s=default; b=eiE0oAi8dZHxfMsyUfPDgq+1s8s7mFRtLuqB3Nf9HCZ WelUMMi1qgfoq+fjQDYSm0pK7SbK0I15KOivKC8Y0WlEgMPAwTnkHsPf+6UoeJEG nbWe3JTmCjt3bvlkYxbZ1FR3+uwwtWCdcmi7Z6Ayn3Lsq35VKvA5r9/CKnzeptbM = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; s=default; bh=kNCBfvY7fdr6hynFc58xrrIKDmM=; b=NSnyUaO3jBJP4njQy 6SbsohG8TBuSyytHtdY3+Uf7AwNXFg4n6gdjzF7vPOksKfTrcWAEZ7vV3hGs7PiG +Mb1FKb0dpyRfpevG2cQZfpzXe9j4cf2TrQ85vUs8oTrwsJlV9p8SY66k/9Z7h+r oL3WNn1vlgcAcYA5NlCkarqOl4= Received: (qmail 25986 invoked by alias); 8 Nov 2018 12:10:41 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk 
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 25968 invoked by uid 89); 8 Nov 2018 12:10:39 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY autolearn=ham version=3.3.2 spammy=greatest, benchmarking, longest, satisfied
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 08 Nov 2018 12:10:37 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 79D42EBD; Thu, 8 Nov 2018 04:10:35 -0800 (PST)
Received: from [10.2.207.77] (e100706-lin.cambridge.arm.com [10.2.207.77]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 003353F5BD; Thu, 8 Nov 2018 04:10:34 -0800 (PST)
Message-ID: <5BE427B9.9020607@foss.arm.com>
Date: Thu, 08 Nov 2018 12:10:33 +0000
From: Kyrill Tkachov
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: "gcc-patches@gcc.gnu.org"
CC: Richard Sandiford
Subject: Tweak ALAP calculation in SCHED_PRESSURE_MODEL

This patch fixes a mismatch between the way that SCHED_PRESSURE_MODEL
calculates the alap and depth values and the way it uses them in
model_order_p.  A comment in model_order_p says:

  /* Combine the length of the longest path of satisfied true dependencies
     that leads to each instruction (depth) with the length of the longest
     path of any dependencies that leads from the instruction (alap).
     Prefer instructions with the greatest combined length.  If the combined
     lengths are equal, prefer instructions with the greatest depth.

     The idea is that, if we have a set S of "equal" instructions that each
     have ALAP value X, and we pick one such instruction I, any true-dependent
     successors of I that have ALAP value X - 1 should be preferred over S.
     This encourages the schedule to be "narrow" rather than "wide".
     However, if I is a low-priority instruction that we decided to
     schedule because of its model_classify_pressure, and if there
     is a set of higher-priority instructions T, the aforementioned
     successors of I should not have the edge over T.  */

The expectation was that scheduling an instruction X would give the
highest-priority successor instructions Y a greater priority than X had:
Y.depth would be X.depth + 1 and Y.alap would be X.alap - 1, giving an
equal combined height, but with the greater depth winning as a tie-breaker.
But this doesn't work if the alap value was ultimately determined by an
anti-dependence, because the true-dependent successors of X then have an
alap value lower than X.alap - 1.

This is particularly bad when --param max-pending-list-length kicks in,
since we then start adding fake anti-dependencies in order to keep the
list length down.  These fake dependencies tend to be on the critical path.

The attached patch avoids that by making the alap calculation add 1 only
for true dependencies.  This shouldn't be too bad, since we use
INSN_PRIORITY as the final tie-breaker and that does take
anti-dependencies into account.

This reduces the number of spills in the hot function from 436.cactusADM
by 14% on aarch64 at -O3, and also reduces the number of instructions in
general.  SPEC2017 shows a minor improvement on Cortex-A72 (about 0.1%
overall).

Thanks to Wilco for the benchmarking.

Bootstrapped and tested on aarch64-none-linux-gnu.

Is this ok for trunk?

Thanks,
Kyrill

2018-11-08  Richard Sandiford

gcc/
        * haifa-sched.c (model_analyze_insns): Only add 1 to the consumer's
        ALAP if the dependence is a true dependence.
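As a rough illustration of the effect described above (not the scheduler's
actual data structures), the following standalone C sketch computes alap
values on a toy dependence graph under both rules.  All of the toy_* names
and the four-instruction graph are hypothetical: X has a true dependence to
Y and an anti-dependence to A, and A has an anti-dependence to B.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types for illustration; these are not GCC's own.  */
enum toy_dep_type { TOY_DEP_TRUE, TOY_DEP_ANTI };

struct toy_dep
{
  int pro;                  /* Index of the producer instruction.  */
  int con;                  /* Index of the consumer instruction.  */
  enum toy_dep_type type;
};

/* Compute ALAP[i] for N instructions numbered in program order, so that
   every dependence runs from a lower index to a higher one.  With
   TRUE_DEPS_ONLY false, every dependence adds 1 (the old rule).  With it
   true, only true dependences add 1 (the rule in this patch).  */
void
toy_compute_alap (int n, const struct toy_dep *deps, size_t n_deps,
                  bool true_deps_only, unsigned int *alap)
{
  for (int i = n - 1; i >= 0; i--)
    {
      alap[i] = 0;
      for (size_t d = 0; d < n_deps; d++)
        {
          if (deps[d].pro != i)
            continue;
          /* Consumers come later, so their ALAP is already final.  */
          assert (deps[d].con > i && deps[d].con < n);
          unsigned int step
            = true_deps_only ? (deps[d].type == TOY_DEP_TRUE) : 1;
          unsigned int min_alap = alap[deps[d].con] + step;
          if (alap[i] < min_alap)
            alap[i] = min_alap;
        }
    }
}

/* Toy graph: 0 = X, 1 = Y, 2 = A, 3 = B.  */
void
toy_example_alap (bool true_deps_only, unsigned int alap[4])
{
  static const struct toy_dep deps[] = {
    { 0, 1, TOY_DEP_TRUE },   /* X -> Y, true dependence.  */
    { 0, 2, TOY_DEP_ANTI },   /* X -> A, anti-dependence.  */
    { 2, 3, TOY_DEP_ANTI }    /* A -> B, anti-dependence.  */
  };
  toy_compute_alap (4, deps, sizeof deps / sizeof deps[0],
                    true_deps_only, alap);
}
```

Under the old rule X.alap is 2 (set by the anti-dependence chain) while
Y.alap is 0, so after X is scheduled Y has depth + alap = 1 against X's 2
and does not inherit X's priority.  Under the new rule X.alap is 1, so Y
matches X's combined length of 1 and wins the tie on depth, which is the
behaviour the model_order_p comment expects.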
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 1fdc9df9fb26f23758ec8326cec91eecc4c917c1..01825de440c2e818eceab5ab7411b20b05ee54f1 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -3504,8 +3504,10 @@ model_analyze_insns (void)
       FOR_EACH_DEP (iter, SD_LIST_FORW, sd_it, dep)
         {
           con = MODEL_INSN_INFO (DEP_CON (dep));
-          if (con->insn && insn->alap < con->alap + 1)
-            insn->alap = con->alap + 1;
+          unsigned int min_alap
+            = con->alap + (DEP_TYPE (dep) == REG_DEP_TRUE);
+          if (con->insn && insn->alap < min_alap)
+            insn->alap = min_alap;
         }
 
       insn->old_queue = QUEUE_INDEX (iter);