From patchwork Fri Apr 26 14:27:52 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom de Vries X-Patchwork-Id: 239883 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id 165302C00D2 for ; Sat, 27 Apr 2013 00:28:11 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; q=dns; s=default; b=IU6/eYldk2qrVaMbk 7g3WKX25H0+UYpLbY7Uf76wwwKpIV6+8l2QF4uuWaOC6ZCqBvAI1R6YyaMb3kdO+ VsS1ichByG68bjxvlbWyXWE1vJq0A0WNxxoCHk062WrqXJWOu0VxNgVjdDjgUqQn u1UI5jNBcBEySHK0l1SLSKcT74= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; s=default; bh=nB7hgil+DESuiAY3MaJvzJI veg8=; b=jMdugtEZrfzLkEvRVUMEF1lco3dBJQoOkLYduMFyBgE4XBNuwZwL318 HOP2ASkxLo8EspRhn8eVm+NsWfP/uBJOkSU5II/us5h+XW2gVNYcQJdAr8qPpFN2 VpWE2PYN/t0grrwTVa8+aAAoz7DiQnYzcPPe+2cvum3OBQzyNeuY= Received: (qmail 32034 invoked by alias); 26 Apr 2013 14:28:02 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 32019 invoked by uid 89); 26 Apr 2013 14:28:02 -0000 X-Spam-SWARE-Status: No, score=-3.1 required=5.0 tests=AWL, BAYES_50, KHOP_RCVD_UNTRUST, KHOP_THREADED, RCVD_IN_HOSTKARMA_W, RCVD_IN_HOSTKARMA_WL autolearn=ham version=3.3.1 Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Fri, 26 Apr 2013 14:28:01 +0000 Received: from svr-orw-fem-01.mgc.mentorg.com ([147.34.98.93]) by relay1.mentorg.com with esmtp id 1UVjcz-0004ZD-95 from Tom_deVries@mentor.com ; Fri, 26 Apr 2013 07:27:57 -0700 Received: from SVR-IES-FEM-01.mgc.mentorg.com ([137.202.0.104]) by svr-orw-fem-01.mgc.mentorg.com over TLS secured channel with Microsoft SMTPSVC(6.0.3790.4675); Fri, 26 Apr 2013 07:27:56 -0700 Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com (137.202.0.104) with Microsoft SMTP Server id 14.2.247.3; Fri, 26 Apr 2013 15:27:55 +0100 Message-ID: <517A8EE8.8010805@mentor.com> Date: Fri, 26 Apr 2013 16:27:52 +0200 From: Tom de Vries User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130329 Thunderbird/17.0.5 MIME-Version: 1.0 To: Richard Biener CC: , Jakub Jelinek Subject: Re: [PATCH] Preserve loops from CFG build until after RTL loop opts References: In-Reply-To: X-Virus-Found: No On 25/04/13 16:19, Richard Biener wrote: > and compared to the previous patch changed the tree-ssa-tailmerge.c > part to deal with merging of loop latch and loop preheader (even > if that's a really bad idea) to not regress gcc.dg/pr50763.c. > Any suggestion on how to improve that part welcome. > * tree-ssa-tail-merge.c: Include cfgloop.h. > (replace_block_by): When merging loop latches mark loops for fixup. > Index: trunk/gcc/tree-ssa-tail-merge.c > =================================================================== > *** trunk.orig/gcc/tree-ssa-tail-merge.c 2013-04-25 11:31:14.000000000 +0200 > --- trunk/gcc/tree-ssa-tail-merge.c 2013-04-25 12:39:00.236390580 +0200 > *************** along with GCC; see the file COPYING3. > *** 197,202 **** > --- 197,203 ---- > #include "gimple-pretty-print.h" > #include "tree-ssa-sccvn.h" > #include "tree-dump.h" > + #include "cfgloop.h" > > /* ??? This currently runs as part of tree-ssa-pre. Why is this not > a stand-alone GIMPLE pass? */ > *************** replace_block_by (basic_block bb1, basic > *** 1459,1464 **** > --- 1460,1476 ---- > /* Mark the basic block as deleted. */ > mark_basic_block_deleted (bb1); > > + /* ??? If we merge the loop preheader with the loop latch we are creating > + additional entries into the loop, eventually rotating it. > + Mark loops for fixup in this case. > + ??? This is a completely unwanted transform and will wreck most > + loops at this point - but with just not considering loop latches as > + merge candidates we fail to commonize the two loops in gcc.dg/pr50763.c. > + A better fix to avoid that regression is needed. */ > + if (current_loops > + && bb2->loop_father->latch == bb2) > + loops_state_set (LOOPS_NEED_FIXUP); > + > /* Redirect the incoming edges of bb1 to bb2. */ > for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i) > { Richard, I'm not sure if I get your comment about the two loops in pr50763.c. There is just one loop, both before and after tail-merge. BEFORE: 2(e) / \ * * 3 9 \ / * * 4 / \ * * 5 10 | | * * +--*7 8(x) | / \ * \ * * / 6 11 TAIL-MERGE: 1. merges empty blocks 10 and 11 2. merges empty block 5 and 6 3. merges block 4 and 7, which are empty except for testing the conditional, The transformations in steps 2 and 3 affect the loop. AFTER: 2(e) / \ * * 3 9 \ / * * +-*4,7 | / \ \ * * 5,6 10,11 \ * 8(x) Although step 2 and 3 reduce the amount of BBs, which could make sense for compile-for-size, I wonder whether this transformation works in general. Step 3 only works if the same conditional is tested, which means an eternal loop. Step 2 works if the loop pre-header and the loop latch are empty. This will be the case quite often since loop_optimizer_init is called with LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES before pass_pre. OTOH, loops will typically have a non-virtual phi, with different values coming from the loop pre-header and the loop latch, which prevents the optimization. So I think this is really a cornercase, and we should disregard it if that makes things simpler. Rather than fixing up the loop structure, we could prevent tail-merge in these cases. The current fix tests for current_loops == NULL, and I'm not sure that can still happen there, given that we have PROP_loops. It's not evident to me that the test bb2->loop_father->latch == bb2 is sufficient. Before calling tail_merge_optimize, we call loop_optimizer_finalize in which we assert that LOOPS_MAY_HAVE_MULTIPLE_LATCHES from there on, so in theory we might miss some latches. But I guess that pre (having started out with simple latches) maintains simple latches throughout, and that tail-merge does the same. Tentative patch attached. I'll try build & test. [ Btw, it would be nice if restricting the optimization also means that we can simplify dominator handling in the pass. ] Thanks, - Tom 2013-04-26 Tom de Vries * tree-ssa-tail-merge.c (find_same_succ_bb): Skip loop latch bbs. (replace_block_by): Don't set LOOPS_NEED_FIXUP. * gcc.dg/pr50763.c: Update test. diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c index f2ab7444..e49c3e7 100644 --- a/gcc/tree-ssa-tail-merge.c +++ b/gcc/tree-ssa-tail-merge.c @@ -689,7 +689,8 @@ find_same_succ_bb (basic_block bb, same_succ *same_p) edge_iterator ei; edge e; - if (bb == NULL) + if (bb == NULL + || bb->loop_father->latch == bb) return; bitmap_set_bit (same->bbs, bb->index); FOR_EACH_EDGE (e, ei, bb->succs) @@ -1460,17 +1461,6 @@ replace_block_by (basic_block bb1, basic_block bb2) /* Mark the basic block as deleted. */ mark_basic_block_deleted (bb1); - /* ??? If we merge the loop preheader with the loop latch we are creating - additional entries into the loop, eventually rotating it. - Mark loops for fixup in this case. - ??? This is a completely unwanted transform and will wreck most - loops at this point - but with just not considering loop latches as - merge candidates we fail to commonize the two loops in gcc.dg/pr50763.c. - A better fix to avoid that regression is needed. */ - if (current_loops - && bb2->loop_father->latch == bb2) - loops_state_set (LOOPS_NEED_FIXUP); - /* Redirect the incoming edges of bb1 to bb2. */ for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i) { diff --git a/gcc/testsuite/gcc.dg/pr50763.c b/gcc/testsuite/gcc.dg/pr50763.c index 60025e3..695b61c 100644 --- a/gcc/testsuite/gcc.dg/pr50763.c +++ b/gcc/testsuite/gcc.dg/pr50763.c @@ -12,5 +12,5 @@ foo (int c, int d) while (c == d); } -/* { dg-final { scan-tree-dump-times "== 33" 1 "pre"} } */ +/* { dg-final { scan-tree-dump-times "== 33" 2 "pre"} } */ /* { dg-final { cleanup-tree-dump "pre" } } */