From: Segher Boessenkool
To: gcc-patches@gcc.gnu.org
Cc: jakub@redhat.com, bschmidt@redhat.com, Segher Boessenkool
Subject: [PATCH v4] Fix shrink-wrapping bug (PR67778, PR68634)
Date: Wed, 9 Dec 2015 20:47:40 +0000
X-Patchwork-Submitter: Segher Boessenkool
X-Patchwork-Id: 554803

After shrink-wrapping has found the "tightest fit" for where to place
the prologue, it tries to move it earlier (so that frame saves are run
earlier) -- but without copying any more basic blocks.

Unfortunately a candidate block we select can be inside a loop, and we
will still allow it (because the loop always exits via our previously
chosen block).  We can do that just fine if we make a duplicate of the
block, but we do not want to here.  So we need to detect this situation.

We can place the prologue at a previous block PRE only if PRE dominates
every block reachable from it, because then we will never need to
duplicate that block (it will always be executed with the prologue).

v4: Fixed all the stupid mistakes you noticed.  Also, the previous
version stopped looking when the previous try didn't work out.  This
version doesn't: it is simpler, more in line with the rest of the
algorithm, potentially useful, and doesn't really cost more.

Tested on the two testcases from the PRs.  Also regression checked
on powerpc64-linux.

Is this okay for trunk?


Segher


2015-12-09  Segher Boessenkool

	PR rtl-optimization/67778
	PR rtl-optimization/68634
	* shrink-wrap.c (try_shrink_wrapping): Add a comment about why we want
	to put the prologue earlier.  When determining if an earlier block is
	suitable, make sure it dominates every block reachable from it.
---
 gcc/shrink-wrap.c | 40 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
index 3a1df84..f65b0c3 100644
--- a/gcc/shrink-wrap.c
+++ b/gcc/shrink-wrap.c
@@ -744,36 +744,66 @@ try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with,
             vec.quick_push (e->dest);
     }
 
-  vec.release ();
-
   if (dump_file)
     fprintf (dump_file, "Avoiding non-duplicatable blocks, PRO is now %d\n",
              pro->index);
 
   /* If we can move PRO back without having to duplicate more blocks, do so.
+     We do this because putting the prologue earlier is better for scheduling.
      We can move back to a block PRE if every path from PRE will eventually
-     need a prologue, that is, PRO is a post-dominator of PRE.  */
+     need a prologue, that is, PRO is a post-dominator of PRE.  PRE needs
+     to dominate every block reachable from itself.  */
 
   if (pro != entry)
     {
       calculate_dominance_info (CDI_POST_DOMINATORS);
 
+      bitmap bb_tmp = BITMAP_ALLOC (NULL);
+      bitmap_copy (bb_tmp, bb_with);
       basic_block last_ok = pro;
+      vec.truncate (0);
+
       while (pro != entry)
         {
           basic_block pre = get_immediate_dominator (CDI_DOMINATORS, pro);
           if (!dominated_by_p (CDI_POST_DOMINATORS, pre, pro))
             break;
 
+          if (bitmap_set_bit (bb_tmp, pre->index))
+            vec.quick_push (pre);
+
+          bool ok = true;
+          while (!vec.is_empty ())
+            {
+              basic_block bb = vec.pop ();
+              bitmap_set_bit (bb_tmp, pre->index);
+
+              if (!dominated_by_p (CDI_DOMINATORS, bb, pre))
+                {
+                  ok = false;
+                  break;
+                }
+
+              FOR_EACH_EDGE (e, ei, bb->succs)
+                if (!bitmap_bit_p (bb_with, e->dest->index)
+                    && bitmap_set_bit (bb_tmp, e->dest->index))
+                  vec.quick_push (e->dest);
+            }
+
+          if (ok && can_get_prologue (pre, prologue_clobbered))
+            last_ok = pre;
+
           pro = pre;
-          if (can_get_prologue (pro, prologue_clobbered))
-            last_ok = pro;
         }
+
       pro = last_ok;
 
+      BITMAP_FREE (bb_tmp);
       free_dominance_info (CDI_POST_DOMINATORS);
     }
 
+  vec.release ();
+
   if (dump_file)
     fprintf (dump_file, "Bumping back to anticipatable blocks, PRO is now %d\n",
              pro->index);
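
The condition from the message ("PRE dominates every block reachable
from it") can be tried out on a toy CFG.  What follows is only a
minimal standalone sketch, not GCC code: Cfg, reachable, dominates,
dominates_all_reachable and make_loop_cfg are hypothetical names made up
for illustration, dominance is computed by brute force rather than with
GCC's dominance info, and the sketch does not skip blocks already marked
in bb_with the way the worklist in the patch does.

```cpp
#include <cassert>
#include <vector>

// Toy CFG: succs[b] lists the successors of block b.  Hypothetical type,
// unrelated to GCC's basic_block.
using Cfg = std::vector<std::vector<int>>;

// Blocks reachable from START without passing through AVOID (-1: none).
static std::vector<bool>
reachable (const Cfg &succs, int start, int avoid)
{
  std::vector<bool> seen (succs.size (), false);
  std::vector<int> work;
  if (start != avoid)
    {
      seen[start] = true;
      work.push_back (start);
    }
  while (!work.empty ())
    {
      int bb = work.back ();
      work.pop_back ();
      for (int dest : succs[bb])
        if (dest != avoid && !seen[dest])
          {
            seen[dest] = true;
            work.push_back (dest);
          }
    }
  return seen;
}

// A dominates B if B cannot be reached from ENTRY once A is removed.
// Assumes B is reachable from ENTRY in the full graph.
static bool
dominates (const Cfg &succs, int entry, int a, int b)
{
  return a == b || !reachable (succs, entry, a)[b];
}

// True if PRE dominates every block reachable from PRE.
static bool
dominates_all_reachable (const Cfg &succs, int entry, int pre)
{
  std::vector<bool> from_pre = reachable (succs, pre, -1);
  for (int bb = 0; bb < (int) succs.size (); bb++)
    if (from_pre[bb] && !dominates (succs, entry, pre, bb))
      return false;
  return true;
}

// 0 -> 1 -> 2 -> 3 with a back edge 2 -> 1, so blocks 1 and 2 form a loop
// that always exits to block 3.
static Cfg
make_loop_cfg ()
{
  return { {1}, {2}, {1, 3}, {} };
}
```

With block 0 as the entry and block 3 playing the role of PRO, block 2
is the immediate dominator of 3 and is post-dominated by it, yet
dominates_all_reachable (g, 0, 2) is false: block 1 is reachable from 2
through the back edge but is not dominated by 2.  The loop header,
block 1, passes the check.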