From patchwork Wed Jan 6 12:36:19 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Segher Boessenkool X-Patchwork-Id: 563929 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 9E2871402C0 for ; Wed, 6 Jan 2016 23:36:33 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=Zt6cmMEJ; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; q=dns; s=default; b=yaCr07p5kuiYvsSKt 7OJVhWL9ix5KSEN6vEATkvJGX8XoVhK05u26W6/2izVNA/edeKPp5ixE7iyJcMmW evbCxHKrquHIcpWV8NfuvszG3I3z1EPDyd0z/oqtctEtVkXJ+sNtkYS9F/nG02MM ecCjiZMuML6cPBgiS1qnwnSXII= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; s=default; bh=3nOWuVe3zbtxjK0ZD4jhmF/ FRfY=; b=Zt6cmMEJ5bxjrq302lYDjsB+QsxrS1x0pRThYnheSg5CMfCqC42liqd qov+3fap1TwczfbZr4bVwKsAxHxv44js3yzmIZwRUC3a0ri7PH8ptsyGCfYGQs8/ BbFUWbPLPfIjoqBLGGin14t8U0n9NsT2dhPNaSxuQt/GBLNfmxp0= Received: (qmail 35955 invoked by alias); 6 Jan 2016 12:36:26 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 35937 invoked by uid 89); 6 Jan 2016 12:36:25 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 spammy=scheduling, peek, putting, PRO X-HELO: gate.crashing.org Received: from gate.crashing.org (HELO gate.crashing.org) (63.228.1.57) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Wed, 06 Jan 2016 12:36:24 +0000 Received: from gate.crashing.org (localhost.localdomain [127.0.0.1]) by gate.crashing.org (8.14.1/8.13.8) with ESMTP id u06CaJ1f013085; Wed, 6 Jan 2016 06:36:20 -0600 Received: (from segher@localhost) by gate.crashing.org (8.14.1/8.14.1/Submit) id u06CaJ7q013083; Wed, 6 Jan 2016 06:36:19 -0600 Date: Wed, 6 Jan 2016 06:36:19 -0600 From: Segher Boessenkool To: Bernd Schmidt Cc: gcc-patches@gcc.gnu.org, jakub@redhat.com Subject: Re: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909 Message-ID: <20160106123618.GA12044@gate.crashing.org> References: <56735F29.7090807@redhat.com> <20151220162733.GA30241@gate.crashing.org> <568A808A.8070804@redhat.com> Mime-Version: 1.0 Content-Disposition: inline In-Reply-To: <568A808A.8070804@redhat.com> User-Agent: Mutt/1.4.2.3i X-IsSubscribed: yes On Mon, Jan 04, 2016 at 03:24:10PM +0100, Bernd Schmidt wrote: > >>This code is getting really quite confusing, > >>and at the least I think we > >>need more documentation of what exactly vec is supposed to contain at > >>the entry to the inner while loop here. > > > >Same as in the other loop: vec is a stack of blocks that still need to > >be looked at. I can duplicate the comment if you want? > > No, I think more is needed. Thanks for the review. New patch attached. > The inner loop looks like it should be > emptying the vec, but this is not true if we break out of it, and your > patch now even adds an explicit push. I added a big comment explaining the algorithm, and changed the push back to do a peek instead. > It also looks like it wants to use > the bb_tmp bitmap to cache results for future iterations of the outer > loop, but I'm not convinced this is actually correct. I can't follow > this behaviour anymore without clear a description of intent. See the new comment. > Also, it might be clearer to not modify "pro" in this loop - use a > "cand" variable, and modify "pro" instead of last_ok, getting rid of the > latter. I tried that, and it doesn't make things clearer in my opinion. Also tried assigning "pro = pre" earlier, to make it more similar to the previous loop; also not an improvement. The patch also removes a superfluous (and confusing) bitmap set as well as a bitmap test. How's this? Segher Subject: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909 If a candidate PRE cannot get the prologue because a block BB is reachable from it, but PRE does not dominate BB, we try again with the dominators of PRE. That "try again" needs to again consider BB though, we aren't done with it. This fixes this problem. Tested on the 68909 testcase, and bootstrapped and regression checked on powerpc64-linux. Is this okay for trunk? 2016-01-06 Segher Boessenkool PR rtl-optimization/67778 PR rtl-optimization/68634 PR rtl-optimization/68909 * shrink-wrap.c (try_shrink_wrapping): Add comment. Don't pop block from the stack until done with it. Remove a superfluous bitmap set. Remove a superfluous bitmap test. --- gcc/shrink-wrap.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index e51bd36..84abd6b 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -750,9 +750,21 @@ try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with, /* If we can move PRO back without having to duplicate more blocks, do so. We do this because putting the prologue earlier is better for scheduling. + We can move back to a block PRE if every path from PRE will eventually need a prologue, that is, PRO is a post-dominator of PRE. PRE needs - to dominate every block reachable from itself. */ + to dominate every block reachable from itself. We keep in BB_TMP a + bitmap of the blocks reachable from PRE that we already found, and in + VEC a stack of those we still need to consider. + + Any block reachable from PRE is also reachable from all predecessors + of PRE, so if we find we need to move PRE back further we can leave + everything not considered so far on the stack. Any block dominated + by PRE is also dominated by all other dominators of PRE, so anything + found good for some PRE does not need to be reconsidered later. + + We don't need to update BB_WITH because none of the new blocks found + can jump to a block that does not need the prologue. */ if (pro != entry) { @@ -775,18 +787,15 @@ try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with, bool ok = true; while (!vec.is_empty ()) { - basic_block bb = vec.pop (); - bitmap_set_bit (bb_tmp, pre->index); - - if (!dominated_by_p (CDI_DOMINATORS, bb, pre)) + if (!dominated_by_p (CDI_DOMINATORS, vec.last (), pre)) { ok = false; break; } + basic_block bb = vec.pop (); FOR_EACH_EDGE (e, ei, bb->succs) - if (!bitmap_bit_p (bb_with, e->dest->index) - && bitmap_set_bit (bb_tmp, e->dest->index)) + if (bitmap_set_bit (bb_tmp, e->dest->index)) vec.quick_push (e->dest); }