From patchwork Sat Feb 12 13:43:06 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Oliva X-Patchwork-Id: 82906 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id EAECBB7123 for ; Sun, 13 Feb 2011 00:44:14 +1100 (EST) Received: (qmail 15799 invoked by alias); 12 Feb 2011 13:44:11 -0000 Received: (qmail 15776 invoked by uid 22791); 12 Feb 2011 13:44:04 -0000 X-SWARE-Spam-Status: No, hits=-4.7 required=5.0 tests=AWL, BAYES_40, RCVD_IN_DNSWL_HI, SPF_HELO_PASS, T_RP_MATCHES_RCVD, T_TVD_MIME_NO_HEADERS X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sat, 12 Feb 2011 13:43:50 +0000 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p1CDhmHA008903 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sat, 12 Feb 2011 08:43:48 -0500 Received: from freie.oliva.athome.lsd.ic.unicamp.br (ovpn01.gateway.prod.ext.phx2.redhat.com [10.5.9.1]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id p1CDhjAA030011 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 12 Feb 2011 08:43:47 -0500 Received: from livre.localdomain (livre-to-gw.oliva.athome.lsd.ic.unicamp.br [172.31.160.19]) by freie.oliva.athome.lsd.ic.unicamp.br (8.14.4/8.14.4) with ESMTP id p1CDhiTx011229; Sat, 12 Feb 2011 11:43:44 -0200 Received: from livre.localdomain (aoliva@localhost.localdomain [127.0.0.1]) by livre.localdomain (8.14.3/8.14.3/Debian-5+lenny1) with ESMTP id p1CDhATX029170; Sat, 12 Feb 2011 11:43:11 -0200 Received: (from aoliva@localhost) by livre.localdomain (8.14.3/8.14.3/Submit) id p1CDh8Sc029168; Sat, 12 Feb 2011 11:43:08 -0200 From: Alexandre Oliva To: Richard Guenther Cc: gcc-patches@gcc.gnu.org, Jan Hubicka Subject: Re: [PR debug/47106] account used vars only once References: Date: Sat, 12 Feb 2011 11:43:06 -0200 In-Reply-To: (Richard Guenther's message of "Mon, 7 Feb 2011 14:39:45 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux) MIME-Version: 1.0 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org On Feb 7, 2011, Richard Guenther wrote: > The referenced_var_lookup interface change is ok. After much hacking and testing, I have a patchset that works (no regressions) and that I'm happy with. I've broken it up into preparation patches with interface changes, the aactual change, one independent bug fix, some additional improvements that I regard as slightly risky, but that have survived multiple bootstrap cycles (-O1 and -O3 in addition to -O2, on i686 and x86_64), and some additional patches I used for testing and for verifying that the estimated stack sizes are unchanged from the current method, when debug info is not enabled (when it is, variables retained in lexical blocks for debug information only used to inflate the estimate, but now they no longer do) On Feb 7, 2011, Jan Hubicka wrote: >> we still tried to estimate the stack size of functions that hadn't >> been put in SSA form and had referenced_vars computed. >> From where we did so? SRA, indeed. On Feb 8, 2011, Richard Guenther wrote: > Yes, we can have cycles and no referenced vars in the callee. But > we won't inline that callee (as it isn't in SSA form), so we can just > as well skip looking at its estimated stack frame size. Excellent! On Feb 8, 2011, Jan Hubicka wrote: > Functions checking inline parameters (check_inline_limits etc.) are done only > after in_ssa_p check is done, so we should be safe here. Yup. > (with exception of disregard inline limits, you can move that check > later but any value of that flag is OK since we won't do anything with > the call) I hadn't realized what this note was about before now. I hit the problem of processing functions in the wrong order, breaking pr42632.c, which is why I introduced a new pass, compute_inlinable, at the point the initial compute_inline_params was. I tried not calling compute_inlinable from within compute_inline_parameters, but I found out inlinable changed, or had to be computed separately, or something like that. What I didn't try was to replace the explicit call of compute_inline_parameters in cgraph_process_new_functions with compute_inlinable. From the failures I observed, it now occurs to me that this might be what I was missing. Anyhow, this patch works, but if you'd rather I make this other change, I will. Anyhow, here is the patch set. Interface changes: The patch that fixes the problem, using referenced_vars to estimate stack sizes. This patch fixes a latent bug: when estimating function sizes/times, we use functions that access current_function_decl, without always having set it up to the function being estimated. These two patches remove unnecessary setting up of cfun and current_function_decl. Are the above ok to install? All regstrapped on x86_64-linux-gnu and i686-pc-linux-gnu, and also bootstrapped with -O1 and -O3, besides the tests described below. This FTR patch makes sure the previous two patches are safe, at least inasmuchas they ensure we don't unconditionally dereferenced cfun or current_function_decl within these functions. It wasn't enough, however, to detect the problem in function size/time estimation. That required comparing the output of the compiler with and without the patches that dropped cfun overriding. There was only one difference in stage3-like builds, in ada/trans.o, and that was fixed with the patch that set current_function_decl while estimating function body sizes and times. Finally, this FTR patch was used to compare the stack size estimations of the current (trunk) approach and the one introduced with this patchset. No messages printed in stage2 host for a bootstrap-debug build (no -g), indicating no differences; plenty printed during stage3 (with -g), always lower counts. for gcc/ChangeLog from Alexandre Oliva PR debug/47106 PR debug/47402 * cfgexpand.c (account_used_vars_for_block): Remove. (estimated_stack_frame_size): Use referenced vars. * tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced that were referenced in the original function. Test src_fn rather than cfun. Drop redundant get_var_ann. (setup_one_parameter): Drop redundant get_var_ann. (declare_return_variable): Likewise. (copy_decl_for_dup_finish): Mark VAR_DECLs referenced in src_fn. (copy_arguments_for_versioning): Drop redundant get_var_ann. * tree-pass.h (pass_inlinable): Declare. * passes.c (init_optimization_passes): Move first pass_inline_parameters after pass_referenced_vars. Add pass_inlinable where it was before. * ipa-inline.c (compute_inlinable): New, split out of... (compute_inline_parameters): ... this. Quit if referenced_vars are not available. (compute_inlinable_for_current, pass_inlinable): New. (pass_inline_parameters): Require PROP_referenced_vars. * cgraphunit.c (cgraph_process_new_functions): Don't run compute_inline_parameters explicitly. for gcc/testsuite/ChangeLog from Alexandre Oliva PR debug/47106 PR debug/47402 * g++.dg/debug/pr47106.C: New. Index: gcc/cfgexpand.c =================================================================== --- gcc/cfgexpand.c.orig 2011-02-11 13:42:37.430636412 -0200 +++ gcc/cfgexpand.c 2011-02-11 13:42:40.067581306 -0200 @@ -1311,30 +1311,6 @@ create_stack_guard (void) crtl->stack_protect_guard = guard; } -/* A subroutine of expand_used_vars. Walk down through the BLOCK tree - expanding variables. Those variables that can be put into registers - are allocated pseudos; those that can't are put on the stack. - - TOPLEVEL is true if this is the outermost BLOCK. */ - -static HOST_WIDE_INT -account_used_vars_for_block (tree block, bool toplevel) -{ - tree t; - HOST_WIDE_INT size = 0; - - /* Expand all variables at this level. */ - for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t)) - if (var_ann (t) && is_used_p (t)) - size += expand_one_var (t, toplevel, false); - - /* Expand all variables at containing levels. */ - for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t)) - size += account_used_vars_for_block (t, false); - - return size; -} - /* Prepare for expanding variables. */ static void init_vars_expansion (void) @@ -1379,23 +1355,17 @@ estimated_stack_frame_size (struct cgrap { HOST_WIDE_INT size = 0; size_t i; - tree var, outer_block = DECL_INITIAL (current_function_decl); - unsigned ix; + tree var; tree old_cur_fun_decl = current_function_decl; + referenced_var_iterator rvi; + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); current_function_decl = node->decl; - push_cfun (DECL_STRUCT_FUNCTION (node->decl)); + push_cfun (fn); - init_vars_expansion (); - - FOR_EACH_LOCAL_DECL (cfun, ix, var) - { - /* TREE_USED marks local variables that do not appear in lexical - blocks. We don't want to expand those that do twice. */ - if (TREE_USED (var)) - size += expand_one_var (var, true, false); - } - size += account_used_vars_for_block (outer_block, true); + gcc_checking_assert (gimple_referenced_vars (fn)); + FOR_EACH_REFERENCED_VAR (fn, var, rvi) + size += expand_one_var (var, true, false); if (stack_vars_num > 0) { Index: gcc/tree-inline.c =================================================================== --- gcc/tree-inline.c.orig 2011-02-11 13:42:38.454614859 -0200 +++ gcc/tree-inline.c 2011-02-11 13:42:40.122579711 -0200 @@ -312,13 +312,17 @@ remap_decl (tree decl, copy_body_data *i walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL); } - if (cfun && gimple_in_ssa_p (cfun) - && (TREE_CODE (t) == VAR_DECL - || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL)) - { - get_var_ann (t); - add_referenced_var (t); - } + if ((TREE_CODE (t) == VAR_DECL + || TREE_CODE (t) == RESULT_DECL + || TREE_CODE (t) == PARM_DECL) + && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn) + && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn)) + /* We don't want to mark as referenced VAR_DECLs that were + not marked as such in the src function. */ + && (TREE_CODE (decl) != VAR_DECL + || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn), + DECL_UID (decl)))) + add_referenced_var (t); return t; } @@ -2547,10 +2551,7 @@ setup_one_parameter (copy_body_data *id, /* We're actually using the newly-created var. */ if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL) - { - get_var_ann (var); - add_referenced_var (var); - } + add_referenced_var (var); /* Declare this new variable. */ DECL_CHAIN (var) = *vars; @@ -2857,10 +2858,7 @@ declare_return_variable (copy_body_data var = copy_result_decl_to_var (result, id); if (gimple_in_ssa_p (cfun)) - { - get_var_ann (var); - add_referenced_var (var); - } + add_referenced_var (var); DECL_SEEN_IN_BIND_EXPR_P (var) = 1; @@ -2896,10 +2894,7 @@ declare_return_variable (copy_body_data { tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr"); if (gimple_in_ssa_p (id->src_cfun)) - { - get_var_ann (temp); - add_referenced_var (temp); - } + add_referenced_var (temp); insert_decl_map (id, result, temp); /* When RESULT_DECL is in SSA form, we need to use it's default_def SSA_NAME. */ @@ -4753,6 +4748,14 @@ copy_decl_for_dup_finish (copy_body_data new function. */ DECL_CONTEXT (copy) = id->dst_fn; + if (TREE_CODE (decl) == VAR_DECL + /* C++ clones functions during parsing, before + referenced_vars. */ + && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn)) + && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn), + DECL_UID (decl))) + add_referenced_var (copy); + return copy; } @@ -4864,7 +4867,6 @@ copy_arguments_for_versioning (tree orig as temporary variable later in function, the uses will be replaced by local variable. */ tree var = copy_decl_to_var (arg, id); - get_var_ann (var); add_referenced_var (var); insert_decl_map (id, arg, var); /* Declare this new variable. */ Index: gcc/passes.c =================================================================== --- gcc/passes.c.orig 2011-02-11 13:42:38.255618853 -0200 +++ gcc/passes.c 2011-02-11 13:42:40.146579235 -0200 @@ -729,7 +729,7 @@ init_optimization_passes (void) NEXT_PASS (pass_build_cfg); NEXT_PASS (pass_warn_function_return); NEXT_PASS (pass_build_cgraph_edges); - NEXT_PASS (pass_inline_parameters); + NEXT_PASS (pass_inlinable); *p = NULL; /* Interprocedural optimization passes. */ @@ -747,12 +747,8 @@ init_optimization_passes (void) NEXT_PASS (pass_build_ssa); NEXT_PASS (pass_lower_vector); NEXT_PASS (pass_early_warn_uninitialized); - /* Note that it is not strictly necessary to schedule an early - inline pass here. However, some test cases (e.g., - g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern - inline functions to be inlined even at -O0. This does not - happen during the first early inline pass. */ NEXT_PASS (pass_rebuild_cgraph_edges); + NEXT_PASS (pass_inline_parameters); NEXT_PASS (pass_early_inline); NEXT_PASS (pass_all_early_optimizations); { Index: gcc/ipa-inline.c =================================================================== --- gcc/ipa-inline.c.orig 2011-02-11 13:42:37.859627265 -0200 +++ gcc/ipa-inline.c 2011-02-11 13:42:55.655254908 -0200 @@ -1974,21 +1974,10 @@ estimate_function_body_sizes (struct cgr inline_summary (node)->size_inlining_benefit = size_inlining_benefit; } -/* Compute parameters of functions used by inliner. */ -void -compute_inline_parameters (struct cgraph_node *node) +/* Determine whether node is inlinable. */ +static void +compute_inlinable (struct cgraph_node *node) { - HOST_WIDE_INT self_stack_size; - - gcc_assert (!node->global.inlined_to); - - /* Estimate the stack size for the function if we're optimizing. */ - self_stack_size = optimize ? estimated_stack_frame_size (node) : 0; - inline_summary (node)->estimated_self_stack_size = self_stack_size; - node->global.estimated_stack_size = self_stack_size; - node->global.stack_frame_offset = 0; - - /* Can this function be inlined at all? */ node->local.inlinable = tree_inlinable_function_p (node->decl); /* Inlinable functions always can change signature. */ @@ -1998,7 +1987,7 @@ compute_inline_parameters (struct cgraph { struct cgraph_edge *e; - /* Functions calling builtlin_apply can not change signature. */ + /* Functions calling builtin_apply can not change signature. */ for (e = node->callees; e; e = e->next_callee) if (DECL_BUILT_IN (e->callee->decl) && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL @@ -2006,9 +1995,33 @@ compute_inline_parameters (struct cgraph break; node->local.can_change_signature = !e; } + if (node->local.inlinable && !node->local.disregard_inline_limits) node->local.disregard_inline_limits = DECL_DISREGARD_INLINE_LIMITS (node->decl); +} + +/* Compute parameters of functions used by inliner. */ +void +compute_inline_parameters (struct cgraph_node *node) +{ + HOST_WIDE_INT self_stack_size; + + gcc_assert (!node->global.inlined_to); + + compute_inlinable (node); + + /* A function won't be considered for inlining before we put it in + SSA form and compute referenced_vars. */ + if (!gimple_referenced_vars (DECL_STRUCT_FUNCTION (node->decl))) + return; + + /* Estimate the stack size for the function if we're optimizing. */ + self_stack_size = optimize ? estimated_stack_frame_size (node) : 0; + inline_summary (node)->estimated_self_stack_size = self_stack_size; + node->global.estimated_stack_size = self_stack_size; + node->global.stack_frame_offset = 0; + estimate_function_body_sizes (node); /* Inlining characteristics are maintained by the cgraph_mark_inline. */ node->global.time = inline_summary (node)->self_time; @@ -2016,6 +2029,14 @@ compute_inline_parameters (struct cgraph } +/* Determine whether current_function_decl is inlinable. */ +static unsigned int +compute_inlinable_for_current (void) +{ + compute_inlinable (cgraph_node (current_function_decl)); + return 0; +} + /* Compute parameters of functions used by inliner using current_function_decl. */ static unsigned int @@ -2025,6 +2046,25 @@ compute_inline_parameters_for_current (v return 0; } +struct gimple_opt_pass pass_inlinable = +{ + { + GIMPLE_PASS, + "inlinable", /* name */ + NULL, /* gate */ + compute_inlinable_for_current, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + TV_INLINE_HEURISTICS, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0 /* todo_flags_finish */ + } +}; + struct gimple_opt_pass pass_inline_parameters = { { @@ -2036,7 +2076,7 @@ struct gimple_opt_pass pass_inline_param NULL, /* next */ 0, /* static_pass_number */ TV_INLINE_HEURISTICS, /* tv_id */ - 0, /* properties_required */ + PROP_referenced_vars, /* properties_required */ 0, /* properties_provided */ 0, /* properties_destroyed */ 0, /* todo_flags_start */ Index: gcc/cgraphunit.c =================================================================== --- gcc/cgraphunit.c.orig 2011-02-11 13:42:38.057623032 -0200 +++ gcc/cgraphunit.c 2011-02-11 13:42:40.211577872 -0200 @@ -246,7 +246,6 @@ cgraph_process_new_functions (void) cgraph_analyze_function (node); push_cfun (DECL_STRUCT_FUNCTION (fndecl)); current_function_decl = fndecl; - compute_inline_parameters (node); if ((cgraph_state == CGRAPH_STATE_IPA_SSA && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl))) /* When not optimizing, be sure we run early local passes anyway Index: gcc/testsuite/g++.dg/debug/pr47106.C =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ gcc/testsuite/g++.dg/debug/pr47106.C 2011-02-11 13:42:40.236577347 -0200 @@ -0,0 +1,37 @@ +// { dg-do compile } +// { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" } + +void end (int, int) __attribute__ ((__noreturn__)); + +struct S +{ + int i; + S *s; +}; + +inline bool f (S *s) +{ + if (!s->s) + end (0, 0); + return s->s == s; +} + +inline bool +baz (S s1, S) +{ + while (f (&s1)); +} + +inline bool +bar (S s1, S s2, S) +{ + baz (s1, s2); +} + +S getS (); + +bool +foo () +{ + bar (getS (), getS (), getS ()); +} Index: gcc/tree-pass.h =================================================================== --- gcc/tree-pass.h.orig 2011-02-11 13:42:37.641631793 -0200 +++ gcc/tree-pass.h 2011-02-11 13:42:40.249577075 -0200 @@ -572,6 +572,7 @@ extern struct rtl_opt_pass pass_final; extern struct rtl_opt_pass pass_rtl_seqabstr; extern struct gimple_opt_pass pass_release_ssa_names; extern struct gimple_opt_pass pass_early_inline; +extern struct gimple_opt_pass pass_inlinable; extern struct gimple_opt_pass pass_inline_parameters; extern struct gimple_opt_pass pass_all_early_optimizations; extern struct gimple_opt_pass pass_update_address_taken;