From patchwork Fri May 27 16:30:44 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Law X-Patchwork-Id: 627288 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3rGWjX6jXbz9t3t for ; Sat, 28 May 2016 02:31:12 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=rBmu5rRm; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:message-id:date:mime-version:content-type; q=dns; s= default; b=raxZkKBFMwaH5vXOBW7A4MrywNRN9lTvpMOUF42v9nLrFBzqqNwZu wXvzBYpqKzoPQIqnD8umNwFUm+WNwTk7had7uinac8e2pgIUWzeCFwwG7KaWp6eK UfNApHr43KmFxOke8bamqE5m6tLPj5e3Ldwf8tZnGHgrfXNZouT61I= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:message-id:date:mime-version:content-type; s= default; bh=YpUno4K1AO0Jx8zSXlj6sfNPs6Y=; b=rBmu5rRm28A+9HdcTbd0 ugp2UWpD74Wl4bikwcjOIvdAHgTQrYaNsfx/TlCldeNOSrtn7N/RbOTPTng92ohZ oYkPB9aK8mAyf4VmB4CMEkLlUVQ4DqPv1gSrm0n8MzXU+nqyED65D4lX251Mn1yc r9y0Y53MyQctqWsMbf2/ni4= Received: (qmail 21423 invoked by alias); 27 May 2016 16:31:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 21410 invoked by uid 89); 27 May 2016 16:31:03 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.6 required=5.0 tests=BAYES_50, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=pass_dce, NEXT_PASS, next_pass, tree_common X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Fri, 27 May 2016 16:30:51 +0000 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id AF6B57D0C2 for ; Fri, 27 May 2016 16:30:49 +0000 (UTC) Received: from localhost.localdomain (ovpn-116-172.phx2.redhat.com [10.3.116.172]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u4RGUnr4029579 for ; Fri, 27 May 2016 12:30:49 -0400 From: Jeff Law To: gcc-patches Subject: Moving backwards/FSM threader into its own pass Message-ID: <40ef7284-eb9d-29c7-3af0-05d5de81e602@redhat.com> Date: Fri, 27 May 2016 10:30:44 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.0 MIME-Version: 1.0 X-IsSubscribed: yes It's been my plan since finally wrapping my head around Bodik's thesis to revamp how we handle jump threading to use some of the principles from his thesis. In particular, the back substitution and simplification model feels like the right long term direction. Sebastian's FSM threader was the first step on that path (gcc-5). Exploiting that threader for more than just FSM loops was the next big step (gcc-6). This patch takes the next step -- dis-entangling that new jump threading code from the old threading code and VRP/DOM. The key thing to realize here is that the backwards (FSM) jump threader does not inherently need the DOM tables nor the ASSERT_EXPRs from VRP to do its job. ie, it can and should run completely independently of DOM/VRP (though one day it may exploit range information that a prior VRP pass has computed). By moving the backwards threader into its own pass, we can run it prior to DOM/VRP, which allow DOM/VRP to work on a simpler CFG with larger extended basic blocks. The removal of unexecutable paths before VRP also has the nice effect of also eliminating false positive warnings for some work Aldy is doing around out-of-bound array index warnings. We can remove all the calls to the backwards threader from the old style threader. The way the FSM bits wired into the old threader caused redundant path evaluations. That can be easily fixed with the FSM bits in their own pass. The net is a 25% reduction in paths examined by the FSM threader. Finally, we ultimately end up threading more jumps. I don't have the #s handy anymore, but when I ran this through my tester there was a clear decrease in the number of runtime jumps. So what are the downsides. With the threader in its own pass, we end up getting more calls into the CFG & SSA verification routines in a checking-enabled compiler. So the compile-time improvement is lost for a checking-enabled compiler. The backward threader does not combine identical jump threading paths with different starting edges into a single jump threading path with multiple entry points. This is primarily a codesize issue, but can have a secondary effect on performance. I know how to fix this and it's on the list for gcc-7 along with further cleanups. Bootstrapped and regression tested on x86_64 linux. Installing on the trunk momentarily. Jeff commit 35bd646a4834a68a49af9ccb5873362a0fc742ae Author: Jeff Law Date: Fri May 27 10:23:54 2016 -0600 * tree-ssa-threadedge.c: Remove include of tree-ssa-threadbackward.h. (thread_across_edge): Remove calls to find_jump_threads_backwards. * passes.def: Add jump threading passes before DOM/VRP. * tree-ssa-threadbackward.c (find_jump_threads_backwards): Change argument to a basic block from an edge. Remove tests which are handled elsewhere. (pass_data_thread_jumps, class pass_thread_jumps): New. (pass_thread_jumps::gate, pass_thread_jumps::execute): New. (make_pass_thread_jumps): Likewise. * tree-pass.h (make_pass_thread_jumps): Declare. * gcc.dg/tree-ssa/pr21417.c: Update expected output. * gcc.dg/tree-ssa/pr66752-3.c: Likewise. * gcc.dg/tree-ssa/pr68198.c: Likewise. * gcc.dg/tree-ssa/pr69196-1.c: Likewise. * gcc.dg/tree-ssa/pr69270-3.c: Likewise. * gcc.dg/tree-ssa/ssa-dom-thread-2b.c: Likewise. * gcc.dg/tree-ssa/ssa-dom-thread-2g.c: Likewise. * gcc.dg/tree-ssa/ssa-dom-thread-2h.c: Likewise. * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Likewise. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Likewise. * gcc.dg/tree-ssa/ssa-dom-thread-12.c: Likewise. * gcc.dg/tree-ssa/ssa-dom-thread-13.c: Likewise. * gcc.dg/tree-ssa/vrp56.c: Likewise. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index f04d26d..40bac96 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,16 @@ +2016-05-26 Jeff Law + + * tree-ssa-threadedge.c: Remove include of tree-ssa-threadbackward.h. + (thread_across_edge): Remove calls to find_jump_threads_backwards. + * passes.def: Add jump threading passes before DOM/VRP. + * tree-ssa-threadbackward.c (find_jump_threads_backwards): Change + argument to a basic block from an edge. Remove tests which are + handled elsewhere. + (pass_data_thread_jumps, class pass_thread_jumps): New. + (pass_thread_jumps::gate, pass_thread_jumps::execute): New. + (make_pass_thread_jumps): Likewise. + * tree-pass.h (make_pass_thread_jumps): Declare. + 2016-05-27 Eric Botcazou * config/visium/visium-protos.h (split_double_move): Rename into... diff --git a/gcc/passes.def b/gcc/passes.def index 993ed28..3647e90 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -199,6 +199,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_return_slot); NEXT_PASS (pass_fre); NEXT_PASS (pass_merge_phi); + NEXT_PASS (pass_thread_jumps); NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */); NEXT_PASS (pass_chkp_opt); NEXT_PASS (pass_dce); @@ -218,6 +219,7 @@ along with GCC; see the file COPYING3. If not see propagations have already run, but before some more dead code is removed, and this place fits nicely. Remember this when trying to move or duplicate pass_dominator somewhere earlier. */ + NEXT_PASS (pass_thread_jumps); NEXT_PASS (pass_dominator, true /* may_peel_loop_headers_p */); /* At this point the majority of const/copy propagations are exposed. Go ahead and identify paths that should never @@ -305,8 +307,10 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_strength_reduction); NEXT_PASS (pass_split_paths); NEXT_PASS (pass_tracer); + NEXT_PASS (pass_thread_jumps); NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */); NEXT_PASS (pass_strlen); + NEXT_PASS (pass_thread_jumps); NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */); /* The only const/copy propagation opportunities left after DOM and VRP should be due to degenerate PHI nodes. So rather than diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 00c2a99..5ded2d4 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,19 @@ +2016-05-26 Jeff Law + + * gcc.dg/tree-ssa/pr21417.c: Update expected output. + * gcc.dg/tree-ssa/pr66752-3.c: Likewise. + * gcc.dg/tree-ssa/pr68198.c: Likewise. + * gcc.dg/tree-ssa/pr69196-1.c: Likewise. + * gcc.dg/tree-ssa/pr69270-3.c: Likewise. + * gcc.dg/tree-ssa/ssa-dom-thread-2b.c: Likewise. + * gcc.dg/tree-ssa/ssa-dom-thread-2g.c: Likewise. + * gcc.dg/tree-ssa/ssa-dom-thread-2h.c: Likewise. + * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Likewise. + * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Likewise. + * gcc.dg/tree-ssa/ssa-dom-thread-12.c: Likewise. + * gcc.dg/tree-ssa/ssa-dom-thread-13.c: Likewise. + * gcc.dg/tree-ssa/vrp56.c: Likewise. + 2016-05-27 Ville Voutilainen PR c++/69855 diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c index c865ee3..4845119 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-dom3-details" } */ +/* { dg-options "-O2 -fdump-tree-thread4-details" } */ struct tree_common { @@ -49,5 +49,5 @@ L23: /* We should thread the backedge to the top of the loop; ie we only execute the if (expr->common.code != 142) test once per loop iteration. */ -/* { dg-final { scan-tree-dump-times "FSM jump thread" 1 "dom3" } } */ +/* { dg-final { scan-tree-dump-times "FSM jump thread" 1 "thread4" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c index 2949cbb..896c8bf 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp1-details -fdump-tree-optimized" } */ +/* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-dce2" } */ extern int status, pt; extern int count; @@ -32,9 +32,10 @@ foo (int N, int c, int b, int *a) pt--; } -/* There are 3 FSM jump threading opportunities, all of which will be +/* There are 4 FSM jump threading opportunities, all of which will be realized, which will eliminate testing of FLAG, completely. */ -/* { dg-final { scan-tree-dump-times "Registering FSM" 3 "vrp1"} } */ +/* { dg-final { scan-tree-dump-times "Registering FSM" 4 "thread1"} } */ -/* There should be no assignments or references to FLAG. */ -/* { dg-final { scan-tree-dump-not "flag" "optimized"} } */ +/* There should be no assignments or references to FLAG, verify they're + eliminated as early as possible. */ +/* { dg-final { scan-tree-dump-not "if .flag" "dce2"} } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr68198.c b/gcc/testsuite/gcc.dg/tree-ssa/pr68198.c index 227ffeb..d678dc8 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr68198.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr68198.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp1-details" } */ +/* { dg-options "-O2 -fdump-tree-thread1-details" } */ extern void abort (void); @@ -39,5 +39,5 @@ c_finish_omp_clauses (tree clauses) /* There are 3 FSM jump threading opportunities, two of which will get filtered out. */ -/* { dg-final { scan-tree-dump-times "Registering FSM" 1 "vrp1"} } */ -/* { dg-final { scan-tree-dump-times "FSM Thread through multiway branch without threading a multiway branch" 2 "vrp1"} } */ +/* { dg-final { scan-tree-dump-times "Registering FSM" 1 "thread1"} } */ +/* { dg-final { scan-tree-dump-times "FSM Thread through multiway branch without threading a multiway branch" 2 "thread1"} } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69196-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr69196-1.c index 415ac2e..5842e28 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr69196-1.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69196-1.c @@ -1,7 +1,7 @@ /* { dg-do compile { target sparc*-*-* x86_64-*-* } } */ -/* { dg-options "-O2 -fdump-tree-vrp1-details" } */ +/* { dg-options "-O2 -fdump-tree-thread1-details" } */ -/* { dg-final { scan-tree-dump "FSM did not thread around loop and would copy too many statements" "vrp1" } } */ +/* { dg-final { scan-tree-dump "FSM did not thread around loop and would copy too many statements" "thread1" } } */ typedef __builtin_va_list __gnuc_va_list; diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c index 89735f6..d546ac6 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c @@ -3,7 +3,7 @@ /* We're looking for a constant argument a PHI node. There should only be one if we unpropagate correctly. */ -/* { dg-final { scan-tree-dump-times ", 1" 1 "uncprop1"} } */ +/* { dg-final { scan-tree-dump-times ", 1" 4 "uncprop1"} } */ typedef long unsigned int size_t; typedef union gimple_statement_d *gimple; diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c index 0d40254..eb66136 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp1-stats -fdump-tree-dom2-stats" } */ +/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-dom2-stats" } */ void foo(); void bla(); @@ -26,4 +26,4 @@ void thread_latch_through_header (void) case. And we want to thread through the header as well. These are both caught by threading in DOM. */ /* { dg-final { scan-tree-dump-not "Jumps threaded" "dom2"} } */ -/* { dg-final { scan-tree-dump-times "Jumps threaded: 2" 1 "vrp1"} } */ +/* { dg-final { scan-tree-dump-times "Jumps threaded: 1" 1 "thread1"} } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2g.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2g.c index 6d1ff5d..ede03e0 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2g.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2g.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp1-stats -fdump-tree-dom2-stats" } */ +/* { dg-options "-O2 -fdump-tree-dom2-details" } */ void foo(); void bla(); @@ -22,5 +22,8 @@ void dont_thread_1 (void) } while (i++ < 100); } -/* { dg-final { scan-tree-dump "Jumps threaded: 2" "vrp1"} } */ -/* { dg-final { scan-tree-dump "Jumps threaded: 1" "dom2"} } */ +/* This one can only be threaded if both paths to the + conditional inside the loop are threaded at the same + time. Else we potentially end up with irreducible + loops. */ +/* { dg-final { scan-tree-dump-not "IRREDUCIBLE_LOOP" "dom2" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2h.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2h.c index 61705e1..043918d 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2h.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2h.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp1-stats -fdump-tree-dom2-stats" } */ +/* { dg-options "-O2 -fdump-tree-dom2-details" } */ void foo(); void bla(); @@ -25,5 +25,4 @@ void dont_thread_2 (int first) /* Peeling off the first iteration would make threading through the loop latch safe, but we don't do that currently. */ -/* { dg-final { scan-tree-dump "Jumps threaded: 1" "vrp1"} } */ -/* { dg-final { scan-tree-dump "Jumps threaded: 1" "dom2"} } */ +/* { dg-final { scan-tree-dump-not "IRREDUCIBLE_LOOP" "dom2" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c index ab6b6eb..5ec4687 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-dom2-details" } */ -/* { dg-final { scan-tree-dump-times "FSM" 6 "dom2" } } */ +/* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-thread2-details" } */ +/* { dg-final { scan-tree-dump-times "FSM" 3 "thread1" } } */ +/* { dg-final { scan-tree-dump-times "FSM" 4 "thread2" } } */ int sum0, sum1, sum2, sum3; int foo (char *s, char **ret) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c index a7a737b..7cd1451 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c @@ -1,7 +1,8 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp1-stats -fdump-tree-dom2-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats" } */ -/* { dg-final { scan-tree-dump "Jumps threaded: 19" "vrp1" } } */ -/* { dg-final { scan-tree-dump "Jumps threaded: 12" "dom2" } } */ +/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats" } */ +/* { dg-final { scan-tree-dump "Jumps threaded: 16" "thread1" } } */ +/* { dg-final { scan-tree-dump "Jumps threaded: 11" "thread2" } } */ +/* { dg-final { scan-tree-dump "Jumps threaded: 3" "thread3" } } */ /* { dg-final { scan-tree-dump-not "Jumps threaded" "dom3" } } */ /* { dg-final { scan-tree-dump-not "Jumps threaded" "vrp2" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c index f250d22..fa6da7b 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c @@ -1,6 +1,8 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-dom2-details" } */ -/* { dg-final { scan-tree-dump "FSM" "dom2" } } */ +/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details" } */ +/* { dg-final { scan-tree-dump "FSM" "thread2" } } */ +/* { dg-final { scan-tree-dump "FSM" "thread3" } } */ +/* { dg-final { scan-tree-dump "FSM" "thread4" } } */ typedef struct bitmap_head_def *bitmap; typedef const struct bitmap_head_def *const_bitmap; diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-13.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-13.c index 99d45f5..f5f338b 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-13.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-13.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp1-details" } */ -/* { dg-final { scan-tree-dump "FSM" "vrp1" } } */ +/* { dg-options "-O2 -fdump-tree-thread1-details" } */ +/* { dg-final { scan-tree-dump "FSM" "thread1" } } */ typedef struct rtx_def *rtx; typedef const struct rtx_def *const_rtx; diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp56.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp56.c index 481e888..3a64b01 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/vrp56.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp56.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp1-details" } */ +/* { dg-options "-O2 -fdump-tree-thread1-stats" } */ typedef struct basic_block_def *basic_block; struct basic_block_def; struct edge_def; @@ -38,5 +38,5 @@ cleanup_empty_eh (basic_block bb) foo (); } } -/* { dg-final { scan-tree-dump-times "Threaded" 1 "vrp1"} } */ +/* { dg-final { scan-tree-dump "Jumps threaded: 1" "thread1"} } */ diff --git a/gcc/timevar.def b/gcc/timevar.def index 170ee9a..362aa6e 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -156,6 +156,7 @@ DEFTIMEVAR (TV_TREE_SSA_OTHER , "tree SSA other") DEFTIMEVAR (TV_TREE_SSA_INCREMENTAL , "tree SSA incremental") DEFTIMEVAR (TV_TREE_OPS , "tree operand scan") DEFTIMEVAR (TV_TREE_SSA_DOMINATOR_OPTS , "dominator optimization") +DEFTIMEVAR (TV_TREE_SSA_THREAD_JUMPS , "backwards jump threading") DEFTIMEVAR (TV_TREE_SRA , "tree SRA") DEFTIMEVAR (TV_ISOLATE_ERRONEOUS_PATHS , "isolate eroneous paths") DEFTIMEVAR (TV_TREE_CCP , "tree CCP") diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 66e103a..36299a6 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -398,6 +398,7 @@ extern gimple_opt_pass *make_pass_dce (gcc::context *ctxt); extern gimple_opt_pass *make_pass_cd_dce (gcc::context *ctxt); extern gimple_opt_pass *make_pass_call_cdce (gcc::context *ctxt); extern gimple_opt_pass *make_pass_merge_phi (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_thread_jumps (gcc::context *ctxt); extern gimple_opt_pass *make_pass_split_crit_edges (gcc::context *ctxt); extern gimple_opt_pass *make_pass_laddress (gcc::context *ctxt); extern gimple_opt_pass *make_pass_pre (gcc::context *ctxt); diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 636b67d..57f1f5c 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -33,6 +33,8 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa-loop.h" #include "cfganal.h" #include "tree-pass.h" +#include "gimple-ssa.h" +#include "tree-phinodes.h" static int max_threaded_paths; @@ -596,15 +598,13 @@ fsm_find_control_statement_thread_paths (tree name, finding a path where NAME is a constant, we can thread the path. */ void -find_jump_threads_backwards (edge e) +find_jump_threads_backwards (basic_block bb) { if (!flag_expensive_optimizations - || optimize_function_for_size_p (cfun) - || e->dest->loop_father != e->src->loop_father - || loop_depth (e->dest->loop_father) == 0) + || optimize_function_for_size_p (cfun)) return; - gimple *stmt = get_gimple_control_stmt (e->dest); + gimple *stmt = get_gimple_control_stmt (bb); if (!stmt) return; @@ -628,7 +628,7 @@ find_jump_threads_backwards (edge e) vec *bb_path; vec_alloc (bb_path, 10); - vec_safe_push (bb_path, e->dest); + vec_safe_push (bb_path, bb); hash_set *visited_bbs = new hash_set; max_threaded_paths = PARAM_VALUE (PARAM_MAX_FSM_THREAD_PATHS); @@ -637,3 +637,60 @@ find_jump_threads_backwards (edge e) delete visited_bbs; vec_free (bb_path); } + +namespace { + +const pass_data pass_data_thread_jumps = +{ + GIMPLE_PASS, + "thread", + OPTGROUP_NONE, + TV_TREE_SSA_THREAD_JUMPS, + ( PROP_cfg | PROP_ssa ), + 0, + 0, + 0, + ( TODO_cleanup_cfg | TODO_update_ssa ), +}; + +class pass_thread_jumps : public gimple_opt_pass +{ +public: + pass_thread_jumps (gcc::context *ctxt) + : gimple_opt_pass (pass_data_thread_jumps, ctxt) + {} + + opt_pass * clone (void) { return new pass_thread_jumps (m_ctxt); } + virtual bool gate (function *); + virtual unsigned int execute (function *); +}; + +bool +pass_thread_jumps::gate (function *fun ATTRIBUTE_UNUSED) +{ + return (flag_expensive_optimizations + && ! optimize_function_for_size_p (cfun)); +} + + +unsigned int +pass_thread_jumps::execute (function *fun) +{ + /* Try to thread each block with more than one successor. */ + basic_block bb; + FOR_EACH_BB_FN (bb, fun) + { + if (EDGE_COUNT (bb->succs) > 1) + find_jump_threads_backwards (bb); + } + thread_through_all_blocks (true); + return 0; +} + +} + +gimple_opt_pass * +make_pass_thread_jumps (gcc::context *ctxt) +{ + return new pass_thread_jumps (ctxt); +} diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index eca3812..5fd5b98 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -34,7 +34,6 @@ along with GCC; see the file COPYING3. If not see #include "params.h" #include "tree-ssa-scopedtables.h" #include "tree-ssa-threadedge.h" -#include "tree-ssa-threadbackward.h" #include "tree-ssa-dom.h" #include "gimple-fold.h" @@ -1183,8 +1182,6 @@ thread_across_edge (gcond *dummy_cond, path->release (); delete path; - find_jump_threads_backwards (e); - /* A negative status indicates the target block was deemed too big to duplicate. Just quit now rather than trying to use the block as a joiner in a jump threading path. @@ -1231,10 +1228,7 @@ thread_across_edge (gcond *dummy_cond, { if ((e->flags & EDGE_DFS_BACK) != 0 || (taken_edge->flags & EDGE_DFS_BACK) != 0) - { - find_jump_threads_backwards (taken_edge); - continue; - } + continue; /* Push a fresh marker so we can unwind the equivalences created for each of E->dest's successors. */ @@ -1282,10 +1276,7 @@ thread_across_edge (gcond *dummy_cond, register_jump_thread (path); } else - { - find_jump_threads_backwards (path->last ()->e); - delete_jump_thread_path (path); - } + delete_jump_thread_path (path); /* And unwind the equivalence table. */ if (avail_exprs_stack)