From patchwork Thu Sep 5 20:30:04 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Law X-Patchwork-Id: 272945 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "www.sourceware.org", Issuer "StartCom Class 1 Primary Intermediate Server CA" (not verified)) by ozlabs.org (Postfix) with ESMTPS id 61D6A2C00F9 for ; Fri, 6 Sep 2013 06:42:35 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; q= dns; s=default; b=r4f1wSmc4zS4sgykZbHxhyb1ru/ZwRnhGlybubgsKYNknq iTurzKPXUBsU/GbR8pIok4edtGBp0hSC1u8pOkM3FmuizFG32miwIMhn5ZdDvq2Q kuuYnK+IUZ0KkpYmuumzNNzoq+nXjIp5ecdOOCdQtBNZhLXnp8+tzrYma3nnU= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; s= default; bh=Zb3vslIvOI6Cxix0R+CnT9bktHw=; b=eeGvCTw803Ky4u7wWhgL obGL7AUTKHdG/MH7+7IaZC5OebTfuPiEz8JbJQcXXPY7HL3RL/b6oIkfObjZTC8M gp8dWawN2kuJt1pIkzHa+VfzUmCneeDmhKxrv0Fyel+3j53gfoh/z/VGtWwf8FHk KE1RmxgGHHCAC+TI47Od5o4= Received: (qmail 2088 invoked by alias); 5 Sep 2013 20:42:28 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 2079 invoked by uid 89); 5 Sep 2013 20:42:28 -0000 Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 05 Sep 2013 20:42:28 +0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-4.8 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r85KgO3t002617 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 5 Sep 2013 16:42:25 -0400 Received: from stumpy.slc.redhat.com (ovpn-113-28.phx2.redhat.com [10.3.113.28]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r85KU4rM028927 for ; Thu, 5 Sep 2013 16:30:04 -0400 Message-ID: <5228E9CC.8030207@redhat.com> Date: Thu, 05 Sep 2013 14:30:04 -0600 From: Jeff Law User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130805 Thunderbird/17.0.8 MIME-Version: 1.0 To: gcc-patches Subject: Simplify jump threading slightly X-IsSubscribed: yes The basic structure we're moving towards for the general form of the FSA optimization will look something like: while (something changed) { changed = thread_through_empty_blocks () if (!threaded_through_block_with_side_effects) changed |= thread_through_block_with_wide_effects (); if (!threaded_through_joiner_block) changed |= thread_through_joiner_block (); } The idea being that the updating code will know how to handle updating the CFG and SSA graph when copying precisely one block with side effects and one joiner block, potentially in either order with an unlimited number of blocks which just transfer control without side effects thrown into the mix as well. So basically we're going to be classifying blocks into 3 categories. 1. Those with no side effects other than transferring control and we can statically determine the successor. Note these should require no copying as we can just wire up the edges in the desired fashion. 2. Blocks with side effects where we can statically determine the successor. These require duplication so we can preserve the side effects. Obviously the duplicate will transfer control to the desired destination unconditionally. 3. Joiner blocks. These blocks aren't threadable, but have a successor which is threadable. We have to copy the joiner block and redirect the edge leading to the threadable successor either to the ultimate destination (if the successor is of type #1) or to a duplicate of the successor if it is of type #2. We're going to annotate each block in the path with its type so that the updating code can "easily" determine how to perform the update. To this end I'm reorganizing the code to detect the thread path into subroutines to detect each of the three types of blocks. This is the first (and obviously easiest) of those changes. In the specific case of the FSA optimization, the key path(s) to thread have the form #3 -> #1 -> #2. Where #1 is the loop back edge. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Installed onto the trunk. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 91f41b1..ce0e39d 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2013-09-05 Jeff Law + + * tree-ssa-threadedge.c (thread_around_empty_blocks): Renamed + from thread_around_empty_block. Record threading path into PATH. + Recurse if threading through the initial block is successful. + (thread_across_edge): Corresponding changes to slightly simplify. + 2013-09-05 James Greenhalgh * config/aarch64/aarch64.md diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index dddcfce..afdd0af 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -738,46 +738,68 @@ propagate_threaded_block_debug_into (basic_block dest, basic_block src) fewvars.release (); } -/* TAKEN_EDGE represents the an edge taken as a result of jump threading. - See if we can thread around TAKEN_EDGE->dest as well. If so, return - the edge out of TAKEN_EDGE->dest that we can statically compute will be - traversed. - - We are much more restrictive as to the contents of TAKEN_EDGE->dest - as the path isolation code in tree-ssa-threadupdate.c isn't prepared - to handle copying intermediate blocks on a threaded path. - - Long term a more consistent and structured approach to path isolation - would be a huge help. */ -static edge -thread_around_empty_block (edge taken_edge, - gimple dummy_cond, - bool handle_dominating_asserts, - tree (*simplify) (gimple, gimple), - bitmap visited) +/* See if TAKEN_EDGE->dest is a threadable block with no side effecs (ie, it + need not be duplicated as part of the CFG/SSA updating process). + + If it is threadable, add it to PATH and VISITED and recurse, ultimately + returning TRUE from the toplevel call. Otherwise do nothing and + return false. + + DUMMY_COND, HANDLE_DOMINATING_ASSERTS and SIMPLIFY are used to + try and simplify the condition at the end of TAKEN_EDGE->dest. */ +static bool +thread_around_empty_blocks (edge taken_edge, + gimple dummy_cond, + bool handle_dominating_asserts, + tree (*simplify) (gimple, gimple), + bitmap visited, + vec *path) { basic_block bb = taken_edge->dest; gimple_stmt_iterator gsi; gimple stmt; tree cond; - /* This block can have no PHI nodes. This is overly conservative. */ + /* The key property of these blocks is that they need not be duplicated + when threading. Thus they can not have visible side effects such + as PHI nodes. */ if (!gsi_end_p (gsi_start_phis (bb))) return NULL; /* Skip over DEBUG statements at the start of the block. */ gsi = gsi_start_nondebug_bb (bb); + /* If the block has no statements, but does have a single successor, then + it's just a forwarding block and we can thread through it trivially. */ if (gsi_end_p (gsi)) - return NULL; + { + if (single_succ_p (bb)) + { + taken_edge = single_succ_edge (bb); + if ((taken_edge->flags & EDGE_DFS_BACK) == 0 + && !bitmap_bit_p (visited, taken_edge->dest->index)) + { + bitmap_set_bit (visited, taken_edge->dest->index); + path->safe_push (taken_edge); + thread_around_empty_blocks (taken_edge, + dummy_cond, + handle_dominating_asserts, + simplify, + visited, + path); + return true; + } + } + return false; + } - /* This block can have no statements other than its control altering - statement. This is overly conservative. */ + /* The only real statements this block can have are a control + flow altering statement. Anything else stops the thread. */ stmt = gsi_stmt (gsi); if (gimple_code (stmt) != GIMPLE_COND && gimple_code (stmt) != GIMPLE_GOTO && gimple_code (stmt) != GIMPLE_SWITCH) - return NULL; + return false; /* Extract and simplify the condition. */ cond = simplify_control_stmt_condition (taken_edge, stmt, dummy_cond, @@ -788,15 +810,22 @@ thread_around_empty_block (edge taken_edge, path. */ if (cond && is_gimple_min_invariant (cond)) { - edge taken_edge = find_taken_edge (bb, cond); + taken_edge = find_taken_edge (bb, cond); if (bitmap_bit_p (visited, taken_edge->dest->index)) - return NULL; + return false; bitmap_set_bit (visited, taken_edge->dest->index); - return taken_edge; + path->safe_push (taken_edge); + thread_around_empty_blocks (taken_edge, + dummy_cond, + handle_dominating_asserts, + simplify, + visited, + path); + return true; } - return NULL; + return false; } /* E1 and E2 are edges into the same basic block. Return TRUE if the @@ -896,51 +925,40 @@ thread_across_edge (gimple dummy_cond, edge taken_edge = find_taken_edge (e->dest, cond); basic_block dest = (taken_edge ? taken_edge->dest : NULL); bitmap visited; - edge e2; - if (dest == e->dest) + /* DEST could be NULL for a computed jump to an absolute + address. */ + if (dest == NULL || dest == e->dest) goto fail; vec path = vNULL; path.safe_push (e); path.safe_push (taken_edge); - /* DEST could be null for a computed jump to an absolute - address. If DEST is not null, then see if we can thread - through it as well, this helps capture secondary effects - of threading without having to re-run DOM or VRP. */ - if (dest - && ((e->flags & EDGE_DFS_BACK) == 0 - || ! cond_arg_set_in_bb (taken_edge, e->dest))) + /* See if we can thread through DEST as well, this helps capture + secondary effects of threading without having to re-run DOM or + VRP. */ + if ((e->flags & EDGE_DFS_BACK) == 0 + || ! cond_arg_set_in_bb (taken_edge, e->dest)) { /* We don't want to thread back to a block we have already visited. This may be overly conservative. */ visited = BITMAP_ALLOC (NULL); bitmap_set_bit (visited, dest->index); bitmap_set_bit (visited, e->dest->index); - do - { - e2 = thread_around_empty_block (taken_edge, - dummy_cond, - handle_dominating_asserts, - simplify, - visited); - if (e2) - { - taken_edge = e2; - path.safe_push (e2); - } - } - while (e2); + thread_around_empty_blocks (taken_edge, + dummy_cond, + handle_dominating_asserts, + simplify, + visited, + &path); BITMAP_FREE (visited); } remove_temporary_equivalences (stack); - if (taken_edge) - { - propagate_threaded_block_debug_into (taken_edge->dest, e->dest); - register_jump_thread (path, false); - } + propagate_threaded_block_debug_into (path[path.length () - 1]->dest, + e->dest); + register_jump_thread (path, false); path.release (); return; } @@ -958,7 +976,7 @@ thread_across_edge (gimple dummy_cond, This is a stopgap until we have a more structured approach to path isolation. */ { - edge e2, e3, taken_edge; + edge taken_edge; edge_iterator ei; bool found = false; bitmap visited = BITMAP_ALLOC (NULL); @@ -976,28 +994,14 @@ thread_across_edge (gimple dummy_cond, of E->dest. */ path.safe_push (e); path.safe_push (taken_edge); - found = false; - e3 = taken_edge; - do - { - if ((e->flags & EDGE_DFS_BACK) == 0 - || ! cond_arg_set_in_bb (e3, e->dest)) - e2 = thread_around_empty_block (e3, + if ((e->flags & EDGE_DFS_BACK) == 0 + || ! cond_arg_set_in_bb (path[path.length () - 1], e->dest)) + found = thread_around_empty_blocks (taken_edge, dummy_cond, handle_dominating_asserts, simplify, - visited); - else - e2 = NULL; - - if (e2) - { - path.safe_push (e2); - e3 = e2; - found = true; - } - } - while (e2); + visited, + &path); /* If we were able to thread through a successor of E->dest, then record the jump threading opportunity. */ @@ -1008,10 +1012,10 @@ thread_across_edge (gimple dummy_cond, (E2->src) to the final target (E3->dest), then make sure that the PHI args associated with the edges E2 and E3 are the same. */ - tmp = find_edge (taken_edge->src, e3->dest); - if (!tmp || phi_args_equal_on_edges (tmp, e3)) + tmp = find_edge (taken_edge->src, path[path.length () - 1]->dest); + if (!tmp || phi_args_equal_on_edges (tmp, path[path.length () - 1])) { - propagate_threaded_block_debug_into (e3->dest, + propagate_threaded_block_debug_into (path[path.length () - 1]->dest, taken_edge->dest); register_jump_thread (path, true); }