From patchwork Wed Jul 12 23:30:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 1807020 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=is0aBkIc; dkim-atps=neutral Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4R1YsC0yYxz20Ph for ; Thu, 13 Jul 2023 09:31:11 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4DFC23857011 for ; Wed, 12 Jul 2023 23:31:08 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4DFC23857011 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689204668; bh=mScY39AEhHaAWpzadeaWFVGozuYr8mwbvhu3Wyy5WNQ=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=is0aBkIcT1mWPb1vhYPU7e4n9t8ehPXqC/rd5DbHXfqOe4sevMGcobcCkIxHLaeI2 ukhqOhMxrAECI9nnXJ6a0HXTcYNZZ4+QyQ+dRpm8p/3AQcFDO5hgibg79JtXjHHZ9Y KSnKPb1LvJrmPZch9sT2sB0bpCAnxitnMtwcZb+U= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id DADE13857023 for ; Wed, 12 Jul 2023 23:30:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DADE13857023 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id EF691281D4F; Thu, 13 Jul 2023 01:30:20 +0200 (CEST) Date: Thu, 13 Jul 2023 01:30:20 +0200 To: gcc-patches@gcc.gnu.org Subject: Loop-ch improvements, part 2 Message-ID: MIME-Version: 1.0 Content-Disposition: inline X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_NUMSUBJECT, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jan Hubicka via Gcc-patches From: Jan Hubicka Reply-To: Jan Hubicka Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" Hi, as discussed this patch moves profile updating to tree-ssa-loop-ch.cc since it is now quite ch specific. There are no functional changes. Boostrapped/regtesed x86_64-linux, comitted. gcc/ChangeLog: * tree-cfg.cc (gimple_duplicate_sese_region): Rename to ... (gimple_duplicate_seme_region): ... this; break out profile updating code to ... * tree-ssa-loop-ch.cc (update_profile_after_ch): ... here. (ch_base::copy_headers): Update. * tree-cfg.h (gimple_duplicate_sese_region): Rename to ... (gimple_duplicate_seme_region): ... this. diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index 7dad7b4ac72..7ccc2a5a5a7 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -6662,25 +6662,19 @@ add_phi_args_after_copy (basic_block *region_copy, unsigned n_region, The function returns false if it is unable to copy the region, true otherwise. - ELIMINATED_EDGE is an edge that is known to be removed in the dupicated - region. ORIG_ELIMINATED_EDGES, if non-NULL is set of edges known to be - removed from the original region. */ + It is callers responsibility to update profile. */ bool -gimple_duplicate_sese_region (edge entry, edge exit, +gimple_duplicate_seme_region (edge entry, edge exit, basic_block *region, unsigned n_region, basic_block *region_copy, - bool update_dominance, - edge eliminated_edge, - hash_set *orig_eliminated_edges) + bool update_dominance) { unsigned i; bool free_region_copy = false, copying_header = false; class loop *loop = entry->dest->loop_father; edge exit_copy; edge redirected; - profile_count total_count = profile_count::uninitialized (); - profile_count entry_count = profile_count::uninitialized (); if (!can_copy_bbs_p (region, n_region)) return false; @@ -6733,144 +6727,10 @@ gimple_duplicate_sese_region (edge entry, edge exit, inside. */ auto_vec doms; if (update_dominance) - { - doms = get_dominated_by_region (CDI_DOMINATORS, region, n_region); - } - - if (entry->dest->count.initialized_p ()) - { - total_count = entry->dest->count; - entry_count = entry->count (); - /* Fix up corner cases, to avoid division by zero or creation of negative - frequencies. */ - if (entry_count > total_count) - entry_count = total_count; - } + doms = get_dominated_by_region (CDI_DOMINATORS, region, n_region); copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop, split_edge_bb_loc (entry), update_dominance); - if (total_count.initialized_p () && entry_count.initialized_p ()) - { - if (!eliminated_edge - && (!orig_eliminated_edges || orig_eliminated_edges->is_empty ())) - { - scale_bbs_frequencies_profile_count (region, n_region, - total_count - entry_count, - total_count); - scale_bbs_frequencies_profile_count (region_copy, n_region, - entry_count, total_count); - } - else - { - /* We only support only case where eliminated_edge is one and it - exists first BB. We also assume that the duplicated region is - acyclic. So we expect the following: - - // region_copy_start entry will be scaled to entry_count - if (cond1) <- this condition will become false - and we update probabilities - goto loop_exit; - if (cond2) <- this condition is loop invariant - goto loop_exit; - goto loop_header <- this will be redirected to loop. - // region_copy_end - loop: - - // region start - loop_header: - if (cond1) <- we need to update probabbility here - goto loop_exit; - if (cond2) <- and determine scaling factor here. - moreover cond2 is now always true - goto loop_exit; - else - goto loop; - // region end - - Adding support for more exits can be done similarly, - but only consumer so far is tree-ssa-loop-ch and it uses only this - to handle the common case of peeling headers which have - conditionals known to be always true upon entry. */ - gcc_checking_assert (copying_header); - for (unsigned int i = 0; i < n_region; i++) - { - edge exit_e, exit_e_copy, e, e_copy; - if (EDGE_COUNT (region[i]->succs) == 1) - { - region_copy[i]->count = entry_count; - region[i]->count -= entry_count; - continue; - } - - gcc_checking_assert (EDGE_COUNT (region[i]->succs) == 2); - if (loop_exit_edge_p (region[0]->loop_father, - EDGE_SUCC (region[i], 0))) - { - exit_e = EDGE_SUCC (region[i], 0); - exit_e_copy = EDGE_SUCC (region_copy[i], 0); - e = EDGE_SUCC (region[i], 1); - e_copy = EDGE_SUCC (region_copy[i], 1); - } - else - { - exit_e = EDGE_SUCC (region[i], 1); - exit_e_copy = EDGE_SUCC (region_copy[i], 1); - e = EDGE_SUCC (region[i], 0); - e_copy = EDGE_SUCC (region_copy[i], 0); - } - gcc_assert (i == n_region - 1 - || (e->dest == region[i + 1] - && e_copy->dest == region_copy[i + 1])); - region_copy[i]->count = entry_count; - profile_count exit_e_count = exit_e->count (); - if (eliminated_edge == exit_e) - { - /* Update profile and the conditional. - CFG update is done by caller. */ - e_copy->probability = profile_probability::always (); - exit_e_copy->probability = profile_probability::never (); - gcond *cond_stmt - = as_a (*gsi_last_bb (region_copy[i])); - if (e_copy->flags & EDGE_TRUE_VALUE) - gimple_cond_make_true (cond_stmt); - else - gimple_cond_make_false (cond_stmt); - update_stmt (cond_stmt); - /* Header copying is a special case of jump threading, so use - common code to update loop body exit condition. */ - update_bb_profile_for_threading (region[i], entry_count, e); - eliminated_edge = NULL; - } - else - region[i]->count -= region_copy[i]->count; - if (orig_eliminated_edges->contains (exit_e)) - { - orig_eliminated_edges->remove (exit_e); - /* All exits will happen in exit_e_copy which is out of the - loop, so increase probability accordingly. - If the edge is eliminated_edge we already corrected - profile above. */ - if (entry_count.nonzero_p () && eliminated_edge != exit_e) - set_edge_probability_and_rescale_others - (exit_e_copy, exit_e_count.probability_in - (entry_count)); - /* Eliminate in-loop conditional. */ - e->probability = profile_probability::always (); - exit_e->probability = profile_probability::never (); - gcond *cond_stmt = as_a (*gsi_last_bb (region[i])); - if (e->flags & EDGE_TRUE_VALUE) - gimple_cond_make_true (cond_stmt); - else - gimple_cond_make_false (cond_stmt); - update_stmt (cond_stmt); - } - entry_count = e_copy->count (); - } - /* Be sure that we seen all edges we are supposed to update. */ - gcc_checking_assert (!eliminated_edge - && orig_eliminated_edges->is_empty ()); - } - } if (copying_header) { diff --git a/gcc/tree-cfg.h b/gcc/tree-cfg.h index a7cc37f3b97..89f06500909 100644 --- a/gcc/tree-cfg.h +++ b/gcc/tree-cfg.h @@ -69,9 +69,8 @@ extern tree gimple_block_label (basic_block); extern void add_phi_args_after_copy_bb (basic_block); extern void add_phi_args_after_copy (basic_block *, unsigned, edge); extern basic_block split_edge_bb_loc (edge); -extern bool gimple_duplicate_sese_region (edge, edge, basic_block *, unsigned, - basic_block *, bool, edge, - hash_set *); +extern bool gimple_duplicate_seme_region (edge, edge, basic_block *, unsigned, + basic_block *, bool); extern bool gimple_duplicate_sese_tail (edge, edge, basic_block *, unsigned, basic_block *); extern void gather_blocks_in_sese_region (basic_block entry, basic_block exit, diff --git a/gcc/tree-ssa-loop-ch.cc b/gcc/tree-ssa-loop-ch.cc index cae6266be46..24e7fbc805a 100644 --- a/gcc/tree-ssa-loop-ch.cc +++ b/gcc/tree-ssa-loop-ch.cc @@ -347,6 +347,128 @@ do_while_loop_p (class loop *loop) return true; } +/* Update profile after header copying of LOOP. + REGION is the original (in loop) sequence, REGION_COPY is the + duplicated header (now outside of loop). N_REGION is number of + bbs duplicated. + ELIMINATED_EDGE is edge to be removed from duplicated sequence. + INVARIANT_EXITS are edges in the loop body to be elimianted + since they are loop invariants + + So We expect the following: + + // region_copy_start entry will be scaled to entry_count + if (cond1) <- this condition will become false + and we update probabilities + goto loop_exit; + if (cond2) <- this condition is loop invariant + goto loop_exit; + goto loop_header <- this will be redirected to loop. + // region_copy_end + loop: + + // region start + loop_header: + if (cond1) <- we need to update probabbility here + goto loop_exit; + if (cond2) <- and determine scaling factor here. + moreover cond2 is now always true + goto loop_exit; + else + goto loop; + // region end + + Adding support for more exits can be done similarly, + but only consumer so far is tree-ssa-loop-ch and it uses only this + to handle the common case of peeling headers which have + conditionals known to be always true upon entry. */ + +static void +update_profile_after_ch (class loop *loop, + basic_block *region, basic_block *region_copy, + unsigned n_region, edge eliminated_edge, + hash_set *invariant_exits, + profile_count entry_count) +{ + for (unsigned int i = 0; i < n_region; i++) + { + edge exit_e, exit_e_copy, e, e_copy; + if (EDGE_COUNT (region[i]->succs) == 1) + { + region_copy[i]->count = entry_count; + region[i]->count -= entry_count; + continue; + } + + gcc_checking_assert (EDGE_COUNT (region[i]->succs) == 2); + if (loop_exit_edge_p (loop, + EDGE_SUCC (region[i], 0))) + { + exit_e = EDGE_SUCC (region[i], 0); + exit_e_copy = EDGE_SUCC (region_copy[i], 0); + e = EDGE_SUCC (region[i], 1); + e_copy = EDGE_SUCC (region_copy[i], 1); + } + else + { + exit_e = EDGE_SUCC (region[i], 1); + exit_e_copy = EDGE_SUCC (region_copy[i], 1); + e = EDGE_SUCC (region[i], 0); + e_copy = EDGE_SUCC (region_copy[i], 0); + } + gcc_assert (i == n_region - 1 + || (e->dest == region[i + 1] + && e_copy->dest == region_copy[i + 1])); + region_copy[i]->count = entry_count; + profile_count exit_e_count = exit_e->count (); + if (eliminated_edge == exit_e) + { + /* Update profile and the conditional. + CFG update is done by caller. */ + e_copy->probability = profile_probability::always (); + exit_e_copy->probability = profile_probability::never (); + gcond *cond_stmt + = as_a (*gsi_last_bb (region_copy[i])); + if (e_copy->flags & EDGE_TRUE_VALUE) + gimple_cond_make_true (cond_stmt); + else + gimple_cond_make_false (cond_stmt); + update_stmt (cond_stmt); + /* Header copying is a special case of jump threading, so use + common code to update loop body exit condition. */ + update_bb_profile_for_threading (region[i], entry_count, e); + eliminated_edge = NULL; + } + else + region[i]->count -= region_copy[i]->count; + if (invariant_exits->contains (exit_e)) + { + invariant_exits->remove (exit_e); + /* All exits will happen in exit_e_copy which is out of the + loop, so increase probability accordingly. + If the edge is eliminated_edge we already corrected + profile above. */ + if (entry_count.nonzero_p () && eliminated_edge != exit_e) + set_edge_probability_and_rescale_others + (exit_e_copy, exit_e_count.probability_in + (entry_count)); + /* Eliminate in-loop conditional. */ + e->probability = profile_probability::always (); + exit_e->probability = profile_probability::never (); + gcond *cond_stmt = as_a (*gsi_last_bb (region[i])); + if (e->flags & EDGE_TRUE_VALUE) + gimple_cond_make_true (cond_stmt); + else + gimple_cond_make_false (cond_stmt); + update_stmt (cond_stmt); + } + entry_count = e_copy->count (); + } + /* Be sure that we seen all edges we are supposed to update. */ + gcc_checking_assert (!eliminated_edge + && invariant_exits->is_empty ()); +} + namespace { /* Common superclass for both header-copying phases. */ @@ -593,14 +715,17 @@ ch_base::copy_headers (function *fun) entry = loop_preheader_edge (loop); propagate_threaded_block_debug_into (exit->dest, entry->dest); - if (!gimple_duplicate_sese_region (entry, exit, bbs, n_bbs, copied_bbs, - true, candidate.static_exit, - &invariant_exits)) + if (!gimple_duplicate_seme_region (entry, exit, bbs, n_bbs, copied_bbs, + true)) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Duplication failed.\n"); continue; } + if (loop->header->count.initialized_p ()) + update_profile_after_ch (loop, bbs, copied_bbs, n_bbs, + candidate.static_exit, &invariant_exits, + entry_count); copied.safe_push (std::make_pair (entry, loop)); /* If the loop has the form "for (i = j; i < j + 10; i++)" then