From patchwork Fri Apr 26 14:27:52 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom de Vries <Tom_deVries@mentor.com>
X-Patchwork-Id: 239883
Return-Path: 
 <gcc-patches-return-340634-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified))
	by ozlabs.org (Postfix) with ESMTPS id 165302C00D2
	for <incoming@patchwork.ozlabs.org>;
	Sat, 27 Apr 2013 00:28:11 +1000 (EST)
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:message-id:date:from:mime-version:to:cc:subject:references
	:in-reply-to:content-type; q=dns; s=default; b=IU6/eYldk2qrVaMbk
	7g3WKX25H0+UYpLbY7Uf76wwwKpIV6+8l2QF4uuWaOC6ZCqBvAI1R6YyaMb3kdO+
	VsS1ichByG68bjxvlbWyXWE1vJq0A0WNxxoCHk062WrqXJWOu0VxNgVjdDjgUqQn
	u1UI5jNBcBEySHK0l1SLSKcT74=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:message-id:date:from:mime-version:to:cc:subject:references
	:in-reply-to:content-type; s=default; bh=nB7hgil+DESuiAY3MaJvzJI
	veg8=; b=jMdugtEZrfzLkEvRVUMEF1lco3dBJQoOkLYduMFyBgE4XBNuwZwL318
	HOP2ASkxLo8EspRhn8eVm+NsWfP/uBJOkSU5II/us5h+XW2gVNYcQJdAr8qPpFN2
	VpWE2PYN/t0grrwTVa8+aAAoz7DiQnYzcPPe+2cvum3OBQzyNeuY=
Received: (qmail 32034 invoked by alias); 26 Apr 2013 14:28:02 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <mailto:gcc-patches-unsubscribe-##L=##H@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 32019 invoked by uid 89); 26 Apr 2013 14:28:02 -0000
X-Spam-SWARE-Status: No, score=-3.1 required=5.0 tests=AWL, BAYES_50,
	KHOP_RCVD_UNTRUST, KHOP_THREADED, RCVD_IN_HOSTKARMA_W,
	RCVD_IN_HOSTKARMA_WL autolearn=ham version=3.3.1
Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131)
	by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with
	ESMTP; Fri, 26 Apr 2013 14:28:01 +0000
Received: from svr-orw-fem-01.mgc.mentorg.com ([147.34.98.93])	by
	relay1.mentorg.com with esmtp id 1UVjcz-0004ZD-95 from
	Tom_deVries@mentor.com ; Fri, 26 Apr 2013 07:27:57 -0700
Received: from SVR-IES-FEM-01.mgc.mentorg.com ([137.202.0.104]) by
	svr-orw-fem-01.mgc.mentorg.com over TLS secured channel with
	Microsoft SMTPSVC(6.0.3790.4675); Fri, 26 Apr 2013 07:27:56 -0700
Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com
	(137.202.0.104) with Microsoft SMTP Server id 14.2.247.3;
	Fri, 26 Apr 2013 15:27:55 +0100
Message-ID: <517A8EE8.8010805@mentor.com>
Date: Fri, 26 Apr 2013 16:27:52 +0200
From: Tom de Vries <Tom_deVries@mentor.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
	rv:17.0) Gecko/20130329 Thunderbird/17.0.5
MIME-Version: 1.0
To: Richard Biener <rguenther@suse.de>
CC: <gcc-patches@gcc.gnu.org>, Jakub Jelinek <jakub@redhat.com>
Subject: Re: [PATCH] Preserve loops from CFG build until after RTL loop opts
References: <alpine.LNX.2.00.1304251607460.7969@zhemvz.fhfr.qr>
In-Reply-To: <alpine.LNX.2.00.1304251607460.7969@zhemvz.fhfr.qr>
X-Virus-Found: No

On 25/04/13 16:19, Richard Biener wrote:
<SNIP>
> and compared to the previous patch changed the tree-ssa-tailmerge.c
> part to deal with merging of loop latch and loop preheader (even
> if that's a really bad idea) to not regress gcc.dg/pr50763.c.
> Any suggestion on how to improve that part welcome.
<SNIP>
> 	* tree-ssa-tail-merge.c: Include cfgloop.h.
> 	(replace_block_by): When merging loop latches mark loops for fixup.
<SNIP>
> Index: trunk/gcc/tree-ssa-tail-merge.c
> ===================================================================
> *** trunk.orig/gcc/tree-ssa-tail-merge.c	2013-04-25 11:31:14.000000000 +0200
> --- trunk/gcc/tree-ssa-tail-merge.c	2013-04-25 12:39:00.236390580 +0200
> *************** along with GCC; see the file COPYING3.
> *** 197,202 ****
> --- 197,203 ----
>   #include "gimple-pretty-print.h"
>   #include "tree-ssa-sccvn.h"
>   #include "tree-dump.h"
> + #include "cfgloop.h"
>   
>   /* ??? This currently runs as part of tree-ssa-pre.  Why is this not
>      a stand-alone GIMPLE pass?  */
> *************** replace_block_by (basic_block bb1, basic
> *** 1459,1464 ****
> --- 1460,1476 ----
>     /* Mark the basic block as deleted.  */
>     mark_basic_block_deleted (bb1);
>   
> +   /* ???  If we merge the loop preheader with the loop latch we are creating
> +      additional entries into the loop, eventually rotating it.
> +      Mark loops for fixup in this case.
> +      ???  This is a completely unwanted transform and will wreck most
> +      loops at this point - but with just not considering loop latches as
> +      merge candidates we fail to commonize the two loops in gcc.dg/pr50763.c.
> +      A better fix to avoid that regression is needed.  */
> +   if (current_loops
> +       && bb2->loop_father->latch == bb2)
> +     loops_state_set (LOOPS_NEED_FIXUP);
> + 
>     /* Redirect the incoming edges of bb1 to bb2.  */
>     for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i)
>       {
<SNIP>

Richard,

I'm not sure if I get your comment about the two loops in pr50763.c. There is
just one loop, both before and after tail-merge.

BEFORE:
        2(e)
       / \
      *   *
     3     9
      \   /
       * *
        4
       / \
      *   *
     5     10
     |     |
     *     *
 +--*7     8(x)
 |  / \    *
 \ *   *  /
  6     11

TAIL-MERGE:
1. merges empty blocks 10 and 11
2. merges empty block 5 and 6
3. merges block 4 and 7, which are empty except for testing the conditional,
The transformations in steps 2 and 3 affect the loop.

AFTER:
        2(e)
       / \
      *   *
     3     9
      \   /
       * *
    +-*4,7
    |  / \
    \ *   *
     5,6   10,11
                \
                 *
                  8(x)

Although step 2 and 3 reduce the amount of BBs, which could make sense for
compile-for-size, I wonder whether this transformation works in general.

Step 3 only works if the same conditional is tested, which means an eternal loop.

Step 2 works if the loop pre-header and the loop latch are empty. This will be
the case quite often since loop_optimizer_init is called with
LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES before pass_pre. OTOH, loops
will typically have a non-virtual phi, with different values coming from the
loop pre-header and the loop latch, which prevents the optimization.

So I think this is really a cornercase, and we should disregard it if that makes
things simpler.

Rather than fixing up the loop structure, we could prevent tail-merge in these
cases.

The current fix tests for current_loops == NULL, and I'm not sure that can still
happen there, given that we have PROP_loops.

It's not evident to me that the test bb2->loop_father->latch == bb2 is
sufficient. Before calling tail_merge_optimize, we call loop_optimizer_finalize
in which we assert that LOOPS_MAY_HAVE_MULTIPLE_LATCHES from there on, so in
theory we might miss some latches.

But I guess that pre (having started out with simple latches) maintains simple
latches throughout, and that tail-merge does the same.

Tentative patch attached. I'll try build & test.

[ Btw, it would be nice if restricting the optimization also means that we can
  simplify dominator handling in the pass. ]

Thanks,
- Tom

2013-04-26  Tom de Vries  <tom@codesourcery.com>

	* tree-ssa-tail-merge.c (find_same_succ_bb): Skip loop latch bbs.
	(replace_block_by): Don't set LOOPS_NEED_FIXUP.

	* gcc.dg/pr50763.c: Update test.
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index f2ab7444..e49c3e7 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -689,7 +689,8 @@ find_same_succ_bb (basic_block bb, same_succ *same_p)
   edge_iterator ei;
   edge e;
 
-  if (bb == NULL)
+  if (bb == NULL
+      || bb->loop_father->latch == bb)
     return;
   bitmap_set_bit (same->bbs, bb->index);
   FOR_EACH_EDGE (e, ei, bb->succs)
@@ -1460,17 +1461,6 @@ replace_block_by (basic_block bb1, basic_block bb2)
   /* Mark the basic block as deleted.  */
   mark_basic_block_deleted (bb1);
 
-  /* ???  If we merge the loop preheader with the loop latch we are creating
-     additional entries into the loop, eventually rotating it.
-     Mark loops for fixup in this case.
-     ???  This is a completely unwanted transform and will wreck most
-     loops at this point - but with just not considering loop latches as
-     merge candidates we fail to commonize the two loops in gcc.dg/pr50763.c.
-     A better fix to avoid that regression is needed.  */
-  if (current_loops
-      && bb2->loop_father->latch == bb2)
-    loops_state_set (LOOPS_NEED_FIXUP);
-
   /* Redirect the incoming edges of bb1 to bb2.  */
   for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i)
     {
diff --git a/gcc/testsuite/gcc.dg/pr50763.c b/gcc/testsuite/gcc.dg/pr50763.c
index 60025e3..695b61c 100644
--- a/gcc/testsuite/gcc.dg/pr50763.c
+++ b/gcc/testsuite/gcc.dg/pr50763.c
@@ -12,5 +12,5 @@ foo (int c, int d)
   while (c == d);
 }
 
-/* { dg-final { scan-tree-dump-times "== 33" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-times "== 33" 2 "pre"} } */
 /* { dg-final { cleanup-tree-dump "pre" } } */