From patchwork Tue Jan 1 12:51:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 1019759 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-493235-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=ucw.cz Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="rA4QjgYb"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 43TYw74xMqz9sCh for ; Tue, 1 Jan 2019 23:51:34 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; q=dns; s= default; b=kZ7qHDBkevZsd00uMdsgH+YoQojks7KfZFoqg0uRuU8XMkEZP6fCv v12p/Yrjp0E+hvIoRu90x45dyPzGLu071udOoja36yHHyZA7wDRcZ/6iXumCV2vG Iacg/jhBq4eD+FjwFYAv216m3ofgF6+7kdvf2pk/0+w77Yddju4b60= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; s= default; bh=PDiSaDb0VW1Qlo4I9qdMcPMRKTU=; b=rA4QjgYbpZa+NTousFZU xh5A5b9Sio7UR6QRyzVIJQkL7iCYNiIY+bMLDnhABY9d4FOOb/u9Qh7iu5KpPOfV YV/1aaat/3NfmWDsryCxBVT0HBapCINWCWcMMHQAT85atZxsW9KQkB+rS79l6eTZ nmnb1wYJy8jLCFG3dDFsSAM= Received: (qmail 66036 invoked by alias); 1 Jan 2019 12:51:26 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 66023 invoked by uid 89); 1 Jan 2019 12:51:26 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-15.1 required=5.0 tests=BAYES_00, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, KAM_LAZY_DOMAIN_SECURITY autolearn=ham version=3.3.2 spammy=shape, gsi, labels, entrance X-HELO: nikam.ms.mff.cuni.cz Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 01 Jan 2019 12:51:20 +0000 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 381B4281E5B; Tue, 1 Jan 2019 13:51:18 +0100 (CET) Date: Tue, 1 Jan 2019 13:51:18 +0100 From: Jan Hubicka To: gcc-patches@gcc.gnu.org Subject: Fix profiling of thunks Message-ID: <20190101125118.b3hczdik73wcgiu4@kam.mff.cuni.cz> MIME-Version: 1.0 Content-Disposition: inline User-Agent: NeoMutt/20170113 (1.7.2) Hi, this patch fixes second problem demonstrated by testcase Jakub reduced from firefox where we inline agressively into thunks. This is because we have no profile and inliner priorities are not really comparable. I will need to look into that case, but this patch adds the missing profile. This is also important to get indirect call profiling right. Profiled-bootstrapped/regtested x86_64-linux, comitted. Honza * coverage.c (get_coverage_counts): Use current_function_decl. * profile.c (read_thunk_profile): New function. (branch_prob): Add THUNK parameter. * tree-profile.c (tree_profiling): Handle thunks. * value-prof.c (init_node_map): Handle thunks. * value-prof.h (branch_prob): Upate prototype. (read_thunk_profile): Declare. * g++.dg/tree-prof/devirt.C: Update testcase. Index: coverage.c =================================================================== --- coverage.c (revision 267488) +++ coverage.c (working copy) @@ -329,7 +329,7 @@ get_coverage_counts (unsigned counter, u else { gcc_assert (coverage_node_map_initialized_p ()); - elt.ident = cgraph_node::get (cfun->decl)->profile_id; + elt.ident = cgraph_node::get (current_function_decl)->profile_id; } elt.ctr = counter; entry = counts_hash->find (&elt); Index: profile.c =================================================================== --- profile.c (revision 267488) +++ profile.c (working copy) @@ -963,6 +963,25 @@ compare_freqs (const void *p1, const voi return e2->dest->index - e1->dest->index; } +/* Only read execution count for thunks. */ + +void +read_thunk_profile (struct cgraph_node *node) +{ + tree old = current_function_decl; + current_function_decl = node->decl; + gcov_type *counts = get_coverage_counts (GCOV_COUNTER_ARCS, 0, 0, 1); + if (counts) + { + node->callees->count = node->count + = profile_count::from_gcov_type (counts[0]); + free (counts); + } + current_function_decl = old; + return; +} + + /* Instrument and/or analyze program behavior based on program the CFG. This function creates a representation of the control flow graph (of @@ -983,7 +1002,7 @@ compare_freqs (const void *p1, const voi Main entry point of this file. */ void -branch_prob (void) +branch_prob (bool thunk) { basic_block bb; unsigned i; @@ -1000,118 +1019,121 @@ branch_prob (void) hash_set streamed_locations; - /* We can't handle cyclic regions constructed using abnormal edges. - To avoid these we replace every source of abnormal edge by a fake - edge from entry node and every destination by fake edge to exit. - This keeps graph acyclic and our calculation exact for all normal - edges except for exit and entrance ones. - - We also add fake exit edges for each call and asm statement in the - basic, since it may not return. */ - - FOR_EACH_BB_FN (bb, cfun) + if (!thunk) { - int need_exit_edge = 0, need_entry_edge = 0; - int have_exit_edge = 0, have_entry_edge = 0; - edge e; - edge_iterator ei; + /* We can't handle cyclic regions constructed using abnormal edges. + To avoid these we replace every source of abnormal edge by a fake + edge from entry node and every destination by fake edge to exit. + This keeps graph acyclic and our calculation exact for all normal + edges except for exit and entrance ones. - /* Functions returning multiple times are not handled by extra edges. - Instead we simply allow negative counts on edges from exit to the - block past call and corresponding probabilities. We can't go - with the extra edges because that would result in flowgraph that - needs to have fake edges outside the spanning tree. */ + We also add fake exit edges for each call and asm statement in the + basic, since it may not return. */ - FOR_EACH_EDGE (e, ei, bb->succs) + FOR_EACH_BB_FN (bb, cfun) { - gimple_stmt_iterator gsi; - gimple *last = NULL; + int need_exit_edge = 0, need_entry_edge = 0; + int have_exit_edge = 0, have_entry_edge = 0; + edge e; + edge_iterator ei; + + /* Functions returning multiple times are not handled by extra edges. + Instead we simply allow negative counts on edges from exit to the + block past call and corresponding probabilities. We can't go + with the extra edges because that would result in flowgraph that + needs to have fake edges outside the spanning tree. */ - /* It may happen that there are compiler generated statements - without a locus at all. Go through the basic block from the - last to the first statement looking for a locus. */ - for (gsi = gsi_last_nondebug_bb (bb); - !gsi_end_p (gsi); - gsi_prev_nondebug (&gsi)) + FOR_EACH_EDGE (e, ei, bb->succs) { - last = gsi_stmt (gsi); - if (!RESERVED_LOCATION_P (gimple_location (last))) - break; - } + gimple_stmt_iterator gsi; + gimple *last = NULL; - /* Edge with goto locus might get wrong coverage info unless - it is the only edge out of BB. - Don't do that when the locuses match, so - if (blah) goto something; - is not computed twice. */ - if (last - && gimple_has_location (last) - && !RESERVED_LOCATION_P (e->goto_locus) - && !single_succ_p (bb) - && (LOCATION_FILE (e->goto_locus) - != LOCATION_FILE (gimple_location (last)) - || (LOCATION_LINE (e->goto_locus) - != LOCATION_LINE (gimple_location (last))))) + /* It may happen that there are compiler generated statements + without a locus at all. Go through the basic block from the + last to the first statement looking for a locus. */ + for (gsi = gsi_last_nondebug_bb (bb); + !gsi_end_p (gsi); + gsi_prev_nondebug (&gsi)) + { + last = gsi_stmt (gsi); + if (!RESERVED_LOCATION_P (gimple_location (last))) + break; + } + + /* Edge with goto locus might get wrong coverage info unless + it is the only edge out of BB. + Don't do that when the locuses match, so + if (blah) goto something; + is not computed twice. */ + if (last + && gimple_has_location (last) + && !RESERVED_LOCATION_P (e->goto_locus) + && !single_succ_p (bb) + && (LOCATION_FILE (e->goto_locus) + != LOCATION_FILE (gimple_location (last)) + || (LOCATION_LINE (e->goto_locus) + != LOCATION_LINE (gimple_location (last))))) + { + basic_block new_bb = split_edge (e); + edge ne = single_succ_edge (new_bb); + ne->goto_locus = e->goto_locus; + } + if ((e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL)) + && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)) + need_exit_edge = 1; + if (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)) + have_exit_edge = 1; + } + FOR_EACH_EDGE (e, ei, bb->preds) { - basic_block new_bb = split_edge (e); - edge ne = single_succ_edge (new_bb); - ne->goto_locus = e->goto_locus; + if ((e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL)) + && e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)) + need_entry_edge = 1; + if (e->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)) + have_entry_edge = 1; } - if ((e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL)) - && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)) - need_exit_edge = 1; - if (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)) - have_exit_edge = 1; - } - FOR_EACH_EDGE (e, ei, bb->preds) - { - if ((e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL)) - && e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)) - need_entry_edge = 1; - if (e->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)) - have_entry_edge = 1; - } - if (need_exit_edge && !have_exit_edge) - { - if (dump_file) - fprintf (dump_file, "Adding fake exit edge to bb %i\n", - bb->index); - make_edge (bb, EXIT_BLOCK_PTR_FOR_FN (cfun), EDGE_FAKE); - } - if (need_entry_edge && !have_entry_edge) - { - if (dump_file) - fprintf (dump_file, "Adding fake entry edge to bb %i\n", - bb->index); - make_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun), bb, EDGE_FAKE); - /* Avoid bbs that have both fake entry edge and also some - exit edge. One of those edges wouldn't be added to the - spanning tree, but we can't instrument any of them. */ - if (have_exit_edge || need_exit_edge) + if (need_exit_edge && !have_exit_edge) + { + if (dump_file) + fprintf (dump_file, "Adding fake exit edge to bb %i\n", + bb->index); + make_edge (bb, EXIT_BLOCK_PTR_FOR_FN (cfun), EDGE_FAKE); + } + if (need_entry_edge && !have_entry_edge) { - gimple_stmt_iterator gsi; - gimple *first; - - gsi = gsi_start_nondebug_after_labels_bb (bb); - gcc_checking_assert (!gsi_end_p (gsi)); - first = gsi_stmt (gsi); - /* Don't split the bbs containing __builtin_setjmp_receiver - or ABNORMAL_DISPATCHER calls. These are very - special and don't expect anything to be inserted before - them. */ - if (is_gimple_call (first) - && (gimple_call_builtin_p (first, BUILT_IN_SETJMP_RECEIVER) - || (gimple_call_flags (first) & ECF_RETURNS_TWICE) - || (gimple_call_internal_p (first) - && (gimple_call_internal_fn (first) - == IFN_ABNORMAL_DISPATCHER)))) - continue; - if (dump_file) - fprintf (dump_file, "Splitting bb %i after labels\n", + fprintf (dump_file, "Adding fake entry edge to bb %i\n", bb->index); - split_block_after_labels (bb); + make_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun), bb, EDGE_FAKE); + /* Avoid bbs that have both fake entry edge and also some + exit edge. One of those edges wouldn't be added to the + spanning tree, but we can't instrument any of them. */ + if (have_exit_edge || need_exit_edge) + { + gimple_stmt_iterator gsi; + gimple *first; + + gsi = gsi_start_nondebug_after_labels_bb (bb); + gcc_checking_assert (!gsi_end_p (gsi)); + first = gsi_stmt (gsi); + /* Don't split the bbs containing __builtin_setjmp_receiver + or ABNORMAL_DISPATCHER calls. These are very + special and don't expect anything to be inserted before + them. */ + if (is_gimple_call (first) + && (gimple_call_builtin_p (first, BUILT_IN_SETJMP_RECEIVER) + || (gimple_call_flags (first) & ECF_RETURNS_TWICE) + || (gimple_call_internal_p (first) + && (gimple_call_internal_fn (first) + == IFN_ABNORMAL_DISPATCHER)))) + continue; + + if (dump_file) + fprintf (dump_file, "Splitting bb %i after labels\n", + bb->index); + split_block_after_labels (bb); + } } } } @@ -1143,7 +1165,18 @@ branch_prob (void) on the spanning tree. We insert as many abnormal and critical edges as possible to minimize number of edge splits necessary. */ - find_spanning_tree (el); + if (!thunk) + find_spanning_tree (el); + else + { + edge e; + edge_iterator ei; + /* Keep only edge from entry block to be instrumented. */ + FOR_EACH_BB_FN (bb, cfun) + FOR_EACH_EDGE (e, ei, bb->succs) + EDGE_INFO (e)->ignore = true; + } + /* Fake edges that are not on the tree will not be instrumented, so mark them ignored. */ @@ -1183,8 +1216,17 @@ branch_prob (void) the checksum in only once place, since it depends on the shape of the control flow which can change during various transformations. */ - cfg_checksum = coverage_compute_cfg_checksum (cfun); - lineno_checksum = coverage_compute_lineno_checksum (); + if (thunk) + { + /* At stream in time we do not have CFG, so we can not do checksums. */ + cfg_checksum = 0; + lineno_checksum = 0; + } + else + { + cfg_checksum = coverage_compute_cfg_checksum (cfun); + lineno_checksum = coverage_compute_lineno_checksum (); + } /* Write the data from which gcov can reconstruct the basic block graph and function line numbers (the gcno file). */ Index: testsuite/g++.dg/tree-prof/devirt.C =================================================================== --- testsuite/g++.dg/tree-prof/devirt.C (revision 267488) +++ testsuite/g++.dg/tree-prof/devirt.C (working copy) @@ -119,5 +119,5 @@ main () __builtin_abort (); } -/* { dg-final-use-not-autofdo { scan-tree-dump-times "folding virtual function call to virtual unsigned int mozPersonalDictionary::_ZThn16" 3 "dom3" } } */ -/* { dg-final-use-not-autofdo { scan-tree-dump-times "folding virtual function call to virtual unsigned int mozPersonalDictionary::AddRef" 3 "dom3" } } */ +/* { dg-final-use-not-autofdo { scan-tree-dump-times "folding virtual function call to virtual unsigned int mozPersonalDictionary::_ZThn16" 1 "dom3" } } */ +/* { dg-final-use-not-autofdo { scan-tree-dump-times "folding virtual function call to virtual unsigned int mozPersonalDictionary::AddRef" 1 "dom3" } } */ Index: tree-profile.c =================================================================== --- tree-profile.c (revision 267488) +++ tree-profile.c (working copy) @@ -739,7 +739,8 @@ tree_profiling (void) FOR_EACH_DEFINED_FUNCTION (node) { - if (!gimple_has_body_p (node->decl)) + bool thunk = false; + if (!gimple_has_body_p (node->decl) && !node->thunk.thunk_p) continue; /* Don't profile functions produced for builtin stuff. */ @@ -760,22 +761,43 @@ tree_profiling (void) if (!include_source_file_for_profile (file)) continue; + if (node->thunk.thunk_p) + { + /* We can not expand variadic thunks to Gimple. */ + if (stdarg_p (TREE_TYPE (node->decl))) + continue; + thunk = true; + /* When generate profile, expand thunk to gimple so it can be + instrumented same way as other functions. */ + if (profile_arc_flag) + node->expand_thunk (false, true); + /* Read cgraph profile but keep function as thunk at profile-use + time. */ + else + { + read_thunk_profile (node); + continue; + } + } + push_cfun (DECL_STRUCT_FUNCTION (node->decl)); if (dump_file) dump_function_header (dump_file, cfun->decl, dump_flags); /* Local pure-const may imply need to fixup the cfg. */ - if (execute_fixup_cfg () & TODO_cleanup_cfg) + if (gimple_has_body_p (node->decl) + && (execute_fixup_cfg () & TODO_cleanup_cfg)) cleanup_tree_cfg (); - branch_prob (); + branch_prob (thunk); if (! flag_branch_probabilities && flag_profile_values) gimple_gen_ic_func_profiler (); if (flag_branch_probabilities + && !thunk && flag_profile_values && flag_value_profile_transformations) gimple_value_profile_transformations (); Index: value-prof.c =================================================================== --- value-prof.c (revision 267488) +++ value-prof.c (working copy) @@ -1188,7 +1188,7 @@ init_node_map (bool local) cgraph_node_map = new hash_map; FOR_EACH_DEFINED_FUNCTION (n) - if (n->has_gimple_body_p ()) + if (n->has_gimple_body_p () || n->thunk.thunk_p) { cgraph_node **val; if (local) Index: value-prof.h =================================================================== --- value-prof.h (revision 267488) +++ value-prof.h (working copy) @@ -112,7 +112,8 @@ extern struct cgraph_node* find_func_by_ /* In profile.c. */ extern void init_branch_prob (void); -extern void branch_prob (void); +extern void branch_prob (bool); +extern void read_thunk_profile (struct cgraph_node *); extern void end_branch_prob (void); #endif /* GCC_VALUE_PROF_H */