From patchwork Wed Jul 7 11:10:26 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: [tree-sra] Fix to set up correct context for call to
 compute_inline_parameter (PR44768)
X-Patchwork-Submitter: Ramana Radhakrishnan
X-Patchwork-Id: 58101
Message-Id: <1278501026.25686.23.camel@e102325-lin.cambridge.arm.com>
To: Richard Guenther
Cc: gcc-patches@gcc.gnu.org, mjambor@suse.cz
Date: Wed, 07 Jul 2010 12:10:26 +0100
From: Ramana Radhakrishnan
List-Id:

On Wed, 2010-07-07 at 11:33 +0200, Richard Guenther wrote:
> Switching cfun is expensive. Why and where does
> compute_inline_parameters end up using cfun? We should fix
> that instead.

compute_inline_parameters ends up depending on cfun /
current_function_decl because it calls estimated_stack_frame_size,
which in turn calls a backend hook that uses current_function_decl,
as can be seen in the audit trail:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44768#c5

Based on your idea after our IRC chat, does this look any better?
I verified by manual inspection of the generated code that this
produces the correct code.

Ok after bootstrapping on arm-linux-gnueabi and regression testing?

cheers
Ramana

2010-07-07  Ramana Radhakrishnan

	PR bootstrap/44768
	* cfgexpand.c (estimated_stack_frame_size): Make self-contained
	with respect to current_function_decl. Pass decl of the function.
	* tree-inline.h (estimated_stack_frame_size): Adjust prototype.
	* ipa-inline.c (compute_inline_parameters): Pass decl to
	estimated_stack_frame_size.

Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 161901)
+++ ipa-inline.c	(working copy)
@@ -2019,7 +2019,7 @@ compute_inline_parameters (struct cgraph
 
   /* Estimate the stack size for the function.  But not at -O0
      because estimated_stack_frame_size is a quadratic problem.  */
-  self_stack_size = optimize ? estimated_stack_frame_size () : 0;
+  self_stack_size = optimize ? estimated_stack_frame_size (node->decl) : 0;
   inline_summary (node)->estimated_self_stack_size = self_stack_size;
   node->global.estimated_stack_size = self_stack_size;
   node->global.stack_frame_offset = 0;
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 161901)
+++ cfgexpand.c	(working copy)
@@ -1252,8 +1252,8 @@ fini_vars_expansion (void)
   stack_vars_alloc = stack_vars_num = 0;
 }
 
-/* Make a fair guess for the size of the stack frame of the current
-   function.  This doesn't have to be exact, the result is only used
+/* Make a fair guess for the size of the stack frame of the decl
+   passed.  This doesn't have to be exact, the result is only used
    in the inline heuristics.  So we don't want to run the full stack
    var packing algorithm (which is quadratic in the number of stack
    vars).  Instead, we calculate the total size of all stack vars.
@@ -1261,12 +1261,15 @@ fini_vars_expansion (void)
    vars doesn't happen very often.  */
 
 HOST_WIDE_INT
-estimated_stack_frame_size (void)
+estimated_stack_frame_size (tree decl)
 {
   HOST_WIDE_INT size = 0;
   size_t i;
   tree var, outer_block = DECL_INITIAL (current_function_decl);
   unsigned ix;
+  tree old_cur_fun_decl = current_function_decl;
+  current_function_decl = decl;
+  push_cfun (DECL_STRUCT_FUNCTION (decl));
 
   init_vars_expansion ();
 
@@ -1287,7 +1290,8 @@ estimated_stack_frame_size (void)
       size += account_stack_vars ();
       fini_vars_expansion ();
     }
-
+  pop_cfun ();
+  current_function_decl = old_cur_fun_decl;
   return size;
 }
 
Index: tree-inline.h
===================================================================
--- tree-inline.h	(revision 161901)
+++ tree-inline.h	(working copy)
@@ -185,6 +185,6 @@ extern tree remap_decl (tree decl, copy_
 extern tree remap_type (tree type, copy_body_data *id);
 extern gimple_seq copy_gimple_seq_and_replace_locals (gimple_seq seq);
 
-extern HOST_WIDE_INT estimated_stack_frame_size (void);
+extern HOST_WIDE_INT estimated_stack_frame_size (tree);
 
 #endif /* GCC_TREE_INLINE_H */
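
For readers who do not have the GCC sources to hand, the save / set /
restore idea behind the cfgexpand.c change can be sketched in isolation.
This is only a toy model under stated assumptions: struct toy_decl,
toy_current_decl, toy_backend_frame_align and toy_estimated_frame_size
are hypothetical stand-ins invented for illustration and are not GCC's
types or functions, and rounding the size up to an alignment is merely a
placeholder for whatever the real backend hook computes.

```c
/* Toy model only: every name here is hypothetical, not GCC's real API.  */
#include <assert.h>
#include <stddef.h>

struct toy_decl
{
  const char *name;
  long locals_size;	/* Total size of the local variables, in bytes.  */
  long align;		/* Alignment the toy "backend" wants.  */
};

/* Global context, standing in for current_function_decl.  */
struct toy_decl *toy_current_decl = NULL;

/* Stand-in for a backend hook that reads the global context instead
   of taking the function as an argument.  */
static long
toy_backend_frame_align (void)
{
  assert (toy_current_decl != NULL);
  return toy_current_decl->align;
}

/* Estimate the frame size of DECL, whatever function happens to be
   current on entry.  The global is saved, pointed at DECL while the
   hook runs, and restored before returning.  */
long
toy_estimated_frame_size (struct toy_decl *decl)
{
  struct toy_decl *saved = toy_current_decl;
  long align, size;

  assert (decl != NULL);
  toy_current_decl = decl;
  align = toy_backend_frame_align ();
  /* Round the locals up to a multiple of the alignment.  */
  size = (decl->locals_size + align - 1) / align * align;
  toy_current_decl = saved;
  return size;
}
```

With toy_current_decl pointing at one function, calling
toy_estimated_frame_size on a different one still consults the hook in
the callee's context and leaves the caller's context untouched
afterwards, which is the property the patch is after for
estimated_stack_frame_size.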