diff mbox

[PR,debug/47106] account used vars only once

Message ID or39ntpdw5.fsf@livre.localdomain
State New
Headers show

Commit Message

Alexandre Oliva Feb. 12, 2011, 1:43 p.m. UTC
On Feb  7, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> The referenced_var_lookup interface change is ok.

After much hacking and testing, I have a patchset that works (no
regressions) and that I'm happy with.

I've broken it up into preparation patches with interface changes, the
aactual change, one independent bug fix, some additional improvements
that I regard as slightly risky, but that have survived multiple
bootstrap cycles (-O1 and -O3 in addition to -O2, on i686 and x86_64),
and some additional patches I used for testing and for verifying that
the estimated stack sizes are unchanged from the current method, when
debug info is not enabled (when it is, variables retained in lexical
blocks for debug information only used to inflate the estimate, but now
they no longer do)

On Feb  7, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:

>> we still tried to estimate the stack size of functions that hadn't
>> been put in SSA form and had referenced_vars computed.

>> From where we did so?

SRA, indeed.

On Feb  8, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> Yes, we can have cycles and no referenced vars in the callee.  But
> we won't inline that callee (as it isn't in SSA form), so we can just
> as well skip looking at its estimated stack frame size.

Excellent!

On Feb  8, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:

> Functions checking inline parameters (check_inline_limits etc.) are done only
> after in_ssa_p check is done, so we should be safe here.

Yup.

> (with exception of disregard inline limits, you can move that check
> later but any value of that flag is OK since we won't do anything with
> the call)

I hadn't realized what this note was about before now.  I hit the
problem of processing functions in the wrong order, breaking pr42632.c,
which is why I introduced a new pass, compute_inlinable, at the point
the initial compute_inline_params was.  I tried not calling
compute_inlinable from within compute_inline_parameters, but I found out
inlinable changed, or had to be computed separately, or something like
that.  What I didn't try was to replace the explicit call of
compute_inline_parameters in cgraph_process_new_functions with
compute_inlinable.  From the failures I observed, it now occurs to me
that this might be what I was missing.  Anyhow, this patch works, but if
you'd rather I make this other change, I will.


Anyhow, here is the patch set.

Interface changes:
The patch that fixes the problem, using referenced_vars to estimate
stack sizes.
This patch fixes a latent bug: when estimating function sizes/times, we
use functions that access current_function_decl, without always having
set it up to the function being estimated.
These two patches remove unnecessary setting up of cfun and
current_function_decl.
Are the above ok to install?  All regstrapped on x86_64-linux-gnu and
i686-pc-linux-gnu, and also bootstrapped with -O1 and -O3, besides the
tests described below.


This FTR patch makes sure the previous two patches are safe, at least
inasmuchas they ensure we don't unconditionally dereferenced cfun or
current_function_decl within these functions.  It wasn't enough,
however, to detect the problem in function size/time estimation.  That
required comparing the output of the compiler with and without the
patches that dropped cfun overriding.  There was only one difference in
stage3-like builds, in ada/trans.o, and that was fixed with the patch
that set current_function_decl while estimating function body sizes and
times.
Finally, this FTR patch was used to compare the stack size estimations
of the current (trunk) approach and the one introduced with this
patchset.  No messages printed in stage2 host for a bootstrap-debug
build (no -g), indicating no differences; plenty printed during stage3
(with -g), always lower counts.

Comments

Richard Biener Feb. 14, 2011, 12:33 p.m. UTC | #1
On Sat, Feb 12, 2011 at 2:43 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Feb  7, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>> The referenced_var_lookup interface change is ok.
>
> After much hacking and testing, I have a patchset that works (no
> regressions) and that I'm happy with.
>
> I've broken it up into preparation patches with interface changes, the
> aactual change, one independent bug fix, some additional improvements
> that I regard as slightly risky, but that have survived multiple
> bootstrap cycles (-O1 and -O3 in addition to -O2, on i686 and x86_64),
> and some additional patches I used for testing and for verifying that
> the estimated stack sizes are unchanged from the current method, when
> debug info is not enabled (when it is, variables retained in lexical
> blocks for debug information only used to inflate the estimate, but now
> they no longer do)
>
> On Feb  7, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:
>
>>> we still tried to estimate the stack size of functions that hadn't
>>> been put in SSA form and had referenced_vars computed.
>
>>> From where we did so?
>
> SRA, indeed.
>
> On Feb  8, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>> Yes, we can have cycles and no referenced vars in the callee.  But
>> we won't inline that callee (as it isn't in SSA form), so we can just
>> as well skip looking at its estimated stack frame size.
>
> Excellent!
>
> On Feb  8, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:
>
>> Functions checking inline parameters (check_inline_limits etc.) are done only
>> after in_ssa_p check is done, so we should be safe here.
>
> Yup.
>
>> (with exception of disregard inline limits, you can move that check
>> later but any value of that flag is OK since we won't do anything with
>> the call)
>
> I hadn't realized what this note was about before now.  I hit the
> problem of processing functions in the wrong order, breaking pr42632.c,
> which is why I introduced a new pass, compute_inlinable, at the point
> the initial compute_inline_params was.  I tried not calling
> compute_inlinable from within compute_inline_parameters, but I found out
> inlinable changed, or had to be computed separately, or something like
> that.  What I didn't try was to replace the explicit call of
> compute_inline_parameters in cgraph_process_new_functions with
> compute_inlinable.  From the failures I observed, it now occurs to me
> that this might be what I was missing.  Anyhow, this patch works, but if
> you'd rather I make this other change, I will.
>
>
> Anyhow, here is the patch set.
>
> Interface changes:
>
>
>
> The patch that fixes the problem, using referenced_vars to estimate
> stack sizes.
>
>
>
> This patch fixes a latent bug: when estimating function sizes/times, we
> use functions that access current_function_decl, without always having
> set it up to the function being estimated.
>
>
>
> These two patches remove unnecessary setting up of cfun and
> current_function_decl.
>
>
>
>
> Are the above ok to install?  All regstrapped on x86_64-linux-gnu and
> i686-pc-linux-gnu, and also bootstrapped with -O1 and -O3, besides the
> tests described below.
>
>
> This FTR patch makes sure the previous two patches are safe, at least
> inasmuchas they ensure we don't unconditionally dereferenced cfun or
> current_function_decl within these functions.  It wasn't enough,
> however, to detect the problem in function size/time estimation.  That
> required comparing the output of the compiler with and without the
> patches that dropped cfun overriding.  There was only one difference in
> stage3-like builds, in ada/trans.o, and that was fixed with the patch
> that set current_function_decl while estimating function body sizes and
> times.
>
>
>
> Finally, this FTR patch was used to compare the stack size estimations
> of the current (trunk) approach and the one introduced with this
> patchset.  No messages printed in stage2 host for a bootstrap-debug
> build (no -g), indicating no differences; plenty printed during stage3
> (with -g), always lower counts.

Thanks for splitting the patch up.

referenced-var-lookup-fn-arg.patch, for-each-referenced-var-fn-arg.patch,
estimated-stack-frame-size-cgraph-node-arg.patch and
compute-inline-params-ret-void.patch are obvious (and ok), you can apply
them independently now.

I'm skipping ipa-inline-stack-growth-referenced-vars-pr47106.patch for detailed
review here.

--

estimate-function-body-sizes-set-cfundecl.patch contains

+  /* ??? estimate_num_insns may indirectly call
+     decl_address_invariant_p which tests current_function_decl.  We
+     should probably somehow arrange for decl_address_ip_invariant_p
+     to be called instead.  Leaving current_function_decl NULL would
+     probably have the same effect, but setting current_function_decl
+     we get a more accurate estimate.  */
+  current_function_decl = node->decl;

which does look like a hack.  Looking at decl_address_invariant_p it
_seems_ that the test
          || DECL_CONTEXT (op) == current_function_decl
          || decl_function_context (op) == current_function_decl)
could simply be
   DECL_CONTEXT (op) && TREE_CODE (DECL_CONTEX (op)) == FUNCTION_DECL

that said, I'm not sure about the necessity of this patch at this stage,
I'd like to defer it (or rather a better solution) to 4.7 if possible.
(you might want to open a bug so we don't forget)

Does dont-push-cfun-in-estimate-stack-frame-size.patch depend on this
patch?  I guess dont-push-cfun-in-estimate-stack-frame-size.patch depends on
ipa-inline-stack-growth-referenced-vars-pr47106.patch at least?
Similar dont-push-cfun-in-analyze-function.patch?

ipa-inline-stack-growth-referenced-vars-pr47106.patch was attached twice,
so I'm not sure about inter-dependencies.  If the do-not-set-cfun patches
pass testing ontop of the obvious cleanup patches they are ok as well.

--

Now onto ipa-inline-stack-growth-referenced-vars-pr47106.patch

it looks ok apart from the new pass_inlinable stuff, that at least sounds
odd and I'd want Honza to have a look here.  The testcase you say
the patch breaks otherwise redefines a _static_ inline function, so
I'd say it's invalid anyway.  I also wonder why computing
->inlineable should have any effect on that testcase ...

Well, if this remaining pass was all that would need to be applied I'd
have a quick look now ... ;)

So, can we get the approved pre-requesties applied?

Thanks for working on this odd part of GCC!
Richard.
Jan Hubicka Feb. 17, 2011, 1:45 a.m. UTC | #2
> Now onto ipa-inline-stack-growth-referenced-vars-pr47106.patch
> 
> it looks ok apart from the new pass_inlinable stuff, that at least sounds
> odd and I'd want Honza to have a look here.  The testcase you say

It is because early inliner depends on toporder to properly place always_inline functions, right?
I guess it is sane solution.  We might try to avoid recomputing inlinable in inline_parameters
done just before early_inline pass as we have nothing that should change inlinablity.
But in longer run we might want to add some cleanups before early inlining there, so I guess it is
easier to not worry here.

> the patch breaks otherwise redefines a _static_ inline function, so
> I'd say it's invalid anyway.  I also wonder why computing
> ->inlineable should have any effect on that testcase ...

I don't see the testcase redefining anything, it seems to me that toporder is now getting wrong.
I will try to look into this tomorrow.

If this is solved, the patch is OK.
Honza
Jan Hubicka Feb. 17, 2011, 1:54 a.m. UTC | #3
> > Now onto ipa-inline-stack-growth-referenced-vars-pr47106.patch
> > 
> > it looks ok apart from the new pass_inlinable stuff, that at least sounds
> > odd and I'd want Honza to have a look here.  The testcase you say
> 
> It is because early inliner depends on toporder to properly place always_inline functions, right?
> I guess it is sane solution.  We might try to avoid recomputing inlinable in inline_parameters
> done just before early_inline pass as we have nothing that should change inlinablity.
> But in longer run we might want to add some cleanups before early inlining there, so I guess it is
> easier to not worry here.
Also because we only care abou disregard_inline_limits that is computed based on the DECL flag,
perhaps we can move 
node->local.disregard_inline_limits
      = DECL_DISREGARD_INLINE_LIMITS (node->decl);
into cgraph_analyze_function and do not introduce the new pass. 
We would end up setting the flag on uninlinable always inlines, but those are bogus anyway.

Honza
Jan Hubicka Feb. 17, 2011, 8:35 a.m. UTC | #4
Hi,
this is variant of patch I am testing now.
Quick testing on the problematic PR shows no problems and we get around w/o
the extra pass.

Honza

Index: cgraphunit.c
===================================================================
*** cgraphunit.c	(revision 170240)
--- cgraphunit.c	(working copy)
*************** cgraph_analyze_function (struct cgraph_n
*** 783,788 ****
--- 783,793 ----
  
    assign_assembler_name_if_neeeded (node->decl);
  
+   /* disregard_inline_limits affects topological order of the early optimization,
+      so we need to compute it ahead of rest of inline parameters.  */
+   node->local.disregard_inline_limits
+     = DECL_DISREGARD_INLINE_LIMITS (node->decl);
+ 
    /* Make sure to gimplify bodies only once.  During analyzing a
       function we lower it, which will require gimplified nested
       functions, so we can end up here with an already gimplified
Index: testsuite/g++.dg/debug/pr47106.C
===================================================================
*** testsuite/g++.dg/debug/pr47106.C	(revision 0)
--- testsuite/g++.dg/debug/pr47106.C	(revision 0)
***************
*** 0 ****
--- 1,37 ----
+ // { dg-do compile }
+ // { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+ 
+ void end (int, int) __attribute__ ((__noreturn__));
+ 
+ struct S
+ {
+   int i;
+   S *s;
+ };
+ 
+ inline bool f (S *s)
+ {
+   if (!s->s)
+     end (0, 0);
+   return s->s == s;
+ }
+ 
+ inline bool
+ baz (S s1, S)
+ {
+   while (f (&s1));
+ }
+ 
+ inline bool
+ bar (S s1, S s2, S)
+ {
+   baz (s1, s2);
+ }
+ 
+ S getS ();
+ 
+ bool
+ foo ()
+ {
+   bar (getS (), getS (), getS ());
+ }
Index: ipa-inline.c
===================================================================
*** ipa-inline.c	(revision 170240)
--- ipa-inline.c	(working copy)
*************** compute_inline_parameters (struct cgraph
*** 2006,2014 ****
  	  break;
        node->local.can_change_signature = !e;
      }
-   if (node->local.inlinable && !node->local.disregard_inline_limits)
-     node->local.disregard_inline_limits
-       = DECL_DISREGARD_INLINE_LIMITS (node->decl);
    estimate_function_body_sizes (node);
    /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
    node->global.time = inline_summary (node)->self_time;
--- 2006,2011 ----
Index: cfgexpand.c
===================================================================
*** cfgexpand.c	(revision 170240)
--- cfgexpand.c	(working copy)
*************** create_stack_guard (void)
*** 1311,1340 ****
    crtl->stack_protect_guard = guard;
  }
  
- /* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
-    expanding variables.  Those variables that can be put into registers
-    are allocated pseudos; those that can't are put on the stack.
- 
-    TOPLEVEL is true if this is the outermost BLOCK.  */
- 
- static HOST_WIDE_INT
- account_used_vars_for_block (tree block, bool toplevel)
- {
-   tree t;
-   HOST_WIDE_INT size = 0;
- 
-   /* Expand all variables at this level.  */
-   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-     if (var_ann (t) && is_used_p (t))
-       size += expand_one_var (t, toplevel, false);
- 
-   /* Expand all variables at containing levels.  */
-   for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
-     size += account_used_vars_for_block (t, false);
- 
-   return size;
- }
- 
  /* Prepare for expanding variables.  */
  static void
  init_vars_expansion (void)
--- 1311,1316 ----
*************** estimated_stack_frame_size (struct cgrap
*** 1379,1401 ****
  {
    HOST_WIDE_INT size = 0;
    size_t i;
!   tree var, outer_block = DECL_INITIAL (current_function_decl);
!   unsigned ix;
    tree old_cur_fun_decl = current_function_decl;
  
    current_function_decl = node->decl;
!   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
  
!   init_vars_expansion ();
! 
!   FOR_EACH_LOCAL_DECL (cfun, ix, var)
!     {
!       /* TREE_USED marks local variables that do not appear in lexical
! 	 blocks.  We don't want to expand those that do twice.  */
!       if (TREE_USED (var))
!         size += expand_one_var (var, true, false);
!     }
!   size += account_used_vars_for_block (outer_block, true);
  
    if (stack_vars_num > 0)
      {
--- 1355,1371 ----
  {
    HOST_WIDE_INT size = 0;
    size_t i;
!   tree var;
    tree old_cur_fun_decl = current_function_decl;
+   referenced_var_iterator rvi;
+   struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
  
    current_function_decl = node->decl;
!   push_cfun (fn);
  
!   gcc_checking_assert (gimple_referenced_vars (fn));
!   FOR_EACH_REFERENCED_VAR (fn, var, rvi)
!     size += expand_one_var (var, true, false);
  
    if (stack_vars_num > 0)
      {
Index: tree-inline.c
===================================================================
*** tree-inline.c	(revision 170240)
--- tree-inline.c	(working copy)
*************** remap_decl (tree decl, copy_body_data *i
*** 312,324 ****
  	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
  	}
  
!       if (cfun && gimple_in_ssa_p (cfun)
! 	  && (TREE_CODE (t) == VAR_DECL
! 	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
! 	{
! 	  get_var_ann (t);
! 	  add_referenced_var (t);
! 	}
        return t;
      }
  
--- 312,328 ----
  	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
  	}
  
!       if ((TREE_CODE (t) == VAR_DECL
! 	   || TREE_CODE (t) == RESULT_DECL
! 	   || TREE_CODE (t) == PARM_DECL)
! 	  && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn)
! 	  && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
! 	  /* We don't want to mark as referenced VAR_DECLs that were
! 	     not marked as such in the src function.  */
! 	  && (TREE_CODE (decl) != VAR_DECL
! 	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
! 					DECL_UID (decl))))
! 	add_referenced_var (t);
        return t;
      }
  
*************** setup_one_parameter (copy_body_data *id,
*** 2547,2556 ****
  
    /* We're actually using the newly-created var.  */
    if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
!     {
!       get_var_ann (var);
!       add_referenced_var (var);
!     }
  
    /* Declare this new variable.  */
    DECL_CHAIN (var) = *vars;
--- 2551,2557 ----
  
    /* We're actually using the newly-created var.  */
    if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
!     add_referenced_var (var);
  
    /* Declare this new variable.  */
    DECL_CHAIN (var) = *vars;
*************** declare_return_variable (copy_body_data 
*** 2857,2866 ****
  
    var = copy_result_decl_to_var (result, id);
    if (gimple_in_ssa_p (cfun))
!     {
!       get_var_ann (var);
!       add_referenced_var (var);
!     }
  
    DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
  
--- 2858,2864 ----
  
    var = copy_result_decl_to_var (result, id);
    if (gimple_in_ssa_p (cfun))
!     add_referenced_var (var);
  
    DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
  
*************** declare_return_variable (copy_body_data 
*** 2896,2905 ****
      {
        tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
        if (gimple_in_ssa_p (id->src_cfun))
! 	{
! 	  get_var_ann (temp);
! 	  add_referenced_var (temp);
! 	}
        insert_decl_map (id, result, temp);
        /* When RESULT_DECL is in SSA form, we need to use it's default_def
  	 SSA_NAME.  */
--- 2894,2900 ----
      {
        tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
        if (gimple_in_ssa_p (id->src_cfun))
! 	add_referenced_var (temp);
        insert_decl_map (id, result, temp);
        /* When RESULT_DECL is in SSA form, we need to use it's default_def
  	 SSA_NAME.  */
*************** copy_decl_for_dup_finish (copy_body_data
*** 4753,4758 ****
--- 4748,4761 ----
         new function.  */
      DECL_CONTEXT (copy) = id->dst_fn;
  
+   if (TREE_CODE (decl) == VAR_DECL
+       /* C++ clones functions during parsing, before
+ 	 referenced_vars.  */
+       && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+       && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+ 				DECL_UID (decl)))
+     add_referenced_var (copy);
+ 
    return copy;
  }
  
*************** copy_arguments_for_versioning (tree orig
*** 4864,4870 ****
  	   as temporary variable later in function, the uses will be
  	   replaced by local variable.  */
  	tree var = copy_decl_to_var (arg, id);
- 	get_var_ann (var);
  	add_referenced_var (var);
  	insert_decl_map (id, arg, var);
          /* Declare this new variable.  */
--- 4867,4872 ----
Index: passes.c
===================================================================
*** passes.c	(revision 170240)
--- passes.c	(working copy)
*************** init_optimization_passes (void)
*** 729,735 ****
    NEXT_PASS (pass_build_cfg);
    NEXT_PASS (pass_warn_function_return);
    NEXT_PASS (pass_build_cgraph_edges);
-   NEXT_PASS (pass_inline_parameters);
    *p = NULL;
  
    /* Interprocedural optimization passes.  */
--- 729,734 ----
*************** init_optimization_passes (void)
*** 747,758 ****
        NEXT_PASS (pass_build_ssa);
        NEXT_PASS (pass_lower_vector);
        NEXT_PASS (pass_early_warn_uninitialized);
-       /* Note that it is not strictly necessary to schedule an early
- 	 inline pass here.  However, some test cases (e.g.,
- 	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
- 	 inline functions to be inlined even at -O0.  This does not
- 	 happen during the first early inline pass.  */
        NEXT_PASS (pass_rebuild_cgraph_edges);
        NEXT_PASS (pass_early_inline);
        NEXT_PASS (pass_all_early_optimizations);
  	{
--- 746,753 ----
        NEXT_PASS (pass_build_ssa);
        NEXT_PASS (pass_lower_vector);
        NEXT_PASS (pass_early_warn_uninitialized);
        NEXT_PASS (pass_rebuild_cgraph_edges);
+       NEXT_PASS (pass_inline_parameters);
        NEXT_PASS (pass_early_inline);
        NEXT_PASS (pass_all_early_optimizations);
  	{
Jan Hubicka Feb. 17, 2011, 4:21 p.m. UTC | #5
Hi,
this is variant of patch I comitted after re-testing.
The change relative to previous version I sent is new tree-sra hunk avoiding
call of compute_inline_parameters on functions that are not in SSA yet and
cgraph_process_new_functions calling it in functions that arrive in SSA form
and thus are not early optimized.

Honza

Index: ChangeLog
===================================================================
*** ChangeLog	(revision 170248)
--- ChangeLog	(working copy)
***************
*** 1,3 ****
--- 1,29 ----
+ 2011-02-17  Alexandre Oliva  <aoliva@redhat.com>
+ 	    Jan Hubicka  <jh@suse.cz>
+ 
+ 	PR debug/47106
+ 	PR debug/47402
+ 	* cfgexpand.c (account_used_vars_for_block): Remove.
+ 	(estimated_stack_frame_size): Use referenced vars.
+ 	* tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
+ 	that were referenced in the original function.  Test src_fn
+ 	rather than cfun.  Drop redundant get_var_ann.
+ 	(setup_one_parameter): Drop redundant get_var_ann.
+ 	(declare_return_variable): Likewise.
+ 	(copy_decl_for_dup_finish): Mark VAR_DECLs referenced in src_fn.
+ 	(copy_arguments_for_versioning): Drop redundant get_var_ann.
+ 	* ipa-inline.c (compute_inline_parameters): Do not compute
+ 	disregard_inline_limits here.
+ 	are not available.
+ 	(compute_inlinable_for_current, pass_inlinable): New.
+ 	(pass_inline_parameters): Require PROP_referenced_vars.
+ 	* cgraphunit.c (cgraph_process_new_functions): Don't run
+ 	compute_inline_parameters explicitly unless function is in
+ 	SSA form.
+ 	(cgraph_analyze_function): Set .disregard_inline_limits.
+ 	* tree-sra.c (convert_callers): Compute inliner parameters
+ 	only for functions already in SSA form.
+ 
  2011-02-17  Joseph Myers  <joseph@codesourcery.com>
  
  	* config/sparc/sparc.h (CPP_ENDIAN_SPEC): Don't handle
Index: cgraphunit.c
===================================================================
*** cgraphunit.c	(revision 170240)
--- cgraphunit.c	(working copy)
*************** cgraph_process_new_functions (void)
*** 246,258 ****
  	    cgraph_analyze_function (node);
  	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
  	  current_function_decl = fndecl;
- 	  compute_inline_parameters (node);
  	  if ((cgraph_state == CGRAPH_STATE_IPA_SSA
  	      && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
  	      /* When not optimizing, be sure we run early local passes anyway
  		 to expand OMP.  */
  	      || !optimize)
  	    execute_pass_list (pass_early_local_passes.pass.sub);
  	  free_dominance_info (CDI_POST_DOMINATORS);
  	  free_dominance_info (CDI_DOMINATORS);
  	  pop_cfun ();
--- 246,259 ----
  	    cgraph_analyze_function (node);
  	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
  	  current_function_decl = fndecl;
  	  if ((cgraph_state == CGRAPH_STATE_IPA_SSA
  	      && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
  	      /* When not optimizing, be sure we run early local passes anyway
  		 to expand OMP.  */
  	      || !optimize)
  	    execute_pass_list (pass_early_local_passes.pass.sub);
+ 	  else
+ 	    compute_inline_parameters (node);
  	  free_dominance_info (CDI_POST_DOMINATORS);
  	  free_dominance_info (CDI_DOMINATORS);
  	  pop_cfun ();
*************** cgraph_analyze_function (struct cgraph_n
*** 783,788 ****
--- 784,794 ----
  
    assign_assembler_name_if_neeeded (node->decl);
  
+   /* disregard_inline_limits affects topological order of the early optimization,
+      so we need to compute it ahead of rest of inline parameters.  */
+   node->local.disregard_inline_limits
+     = DECL_DISREGARD_INLINE_LIMITS (node->decl);
+ 
    /* Make sure to gimplify bodies only once.  During analyzing a
       function we lower it, which will require gimplified nested
       functions, so we can end up here with an already gimplified
Index: testsuite/ChangeLog
===================================================================
*** testsuite/ChangeLog	(revision 170248)
--- testsuite/ChangeLog	(working copy)
***************
*** 1,3 ****
--- 1,10 ----
+ 2011-02-17  Alexandre Oliva  <aoliva@redhat.com>
+ 	    Jan Hubicka  <jh@suse.cz>
+ 
+ 	PR debug/47106
+ 	PR debug/47402
+ 	* g++.dg/debug/pr47106.C: New.
+ 
  2011-02-17  Uros Bizjak  <ubizjak@gmail.com>
  
  	PR target/43653
Index: testsuite/g++.dg/debug/pr47106.C
===================================================================
*** testsuite/g++.dg/debug/pr47106.C	(revision 0)
--- testsuite/g++.dg/debug/pr47106.C	(revision 0)
***************
*** 0 ****
--- 1,37 ----
+ // { dg-do compile }
+ // { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+ 
+ void end (int, int) __attribute__ ((__noreturn__));
+ 
+ struct S
+ {
+   int i;
+   S *s;
+ };
+ 
+ inline bool f (S *s)
+ {
+   if (!s->s)
+     end (0, 0);
+   return s->s == s;
+ }
+ 
+ inline bool
+ baz (S s1, S)
+ {
+   while (f (&s1));
+ }
+ 
+ inline bool
+ bar (S s1, S s2, S)
+ {
+   baz (s1, s2);
+ }
+ 
+ S getS ();
+ 
+ bool
+ foo ()
+ {
+   bar (getS (), getS (), getS ());
+ }
Index: ipa-inline.c
===================================================================
*** ipa-inline.c	(revision 170240)
--- ipa-inline.c	(working copy)
*************** compute_inline_parameters (struct cgraph
*** 2006,2014 ****
  	  break;
        node->local.can_change_signature = !e;
      }
-   if (node->local.inlinable && !node->local.disregard_inline_limits)
-     node->local.disregard_inline_limits
-       = DECL_DISREGARD_INLINE_LIMITS (node->decl);
    estimate_function_body_sizes (node);
    /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
    node->global.time = inline_summary (node)->self_time;
--- 2006,2011 ----
Index: cfgexpand.c
===================================================================
*** cfgexpand.c	(revision 170240)
--- cfgexpand.c	(working copy)
*************** create_stack_guard (void)
*** 1311,1340 ****
    crtl->stack_protect_guard = guard;
  }
  
- /* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
-    expanding variables.  Those variables that can be put into registers
-    are allocated pseudos; those that can't are put on the stack.
- 
-    TOPLEVEL is true if this is the outermost BLOCK.  */
- 
- static HOST_WIDE_INT
- account_used_vars_for_block (tree block, bool toplevel)
- {
-   tree t;
-   HOST_WIDE_INT size = 0;
- 
-   /* Expand all variables at this level.  */
-   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-     if (var_ann (t) && is_used_p (t))
-       size += expand_one_var (t, toplevel, false);
- 
-   /* Expand all variables at containing levels.  */
-   for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
-     size += account_used_vars_for_block (t, false);
- 
-   return size;
- }
- 
  /* Prepare for expanding variables.  */
  static void
  init_vars_expansion (void)
--- 1311,1316 ----
*************** estimated_stack_frame_size (struct cgrap
*** 1379,1401 ****
  {
    HOST_WIDE_INT size = 0;
    size_t i;
!   tree var, outer_block = DECL_INITIAL (current_function_decl);
!   unsigned ix;
    tree old_cur_fun_decl = current_function_decl;
  
    current_function_decl = node->decl;
!   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
  
!   init_vars_expansion ();
! 
!   FOR_EACH_LOCAL_DECL (cfun, ix, var)
!     {
!       /* TREE_USED marks local variables that do not appear in lexical
! 	 blocks.  We don't want to expand those that do twice.  */
!       if (TREE_USED (var))
!         size += expand_one_var (var, true, false);
!     }
!   size += account_used_vars_for_block (outer_block, true);
  
    if (stack_vars_num > 0)
      {
--- 1355,1371 ----
  {
    HOST_WIDE_INT size = 0;
    size_t i;
!   tree var;
    tree old_cur_fun_decl = current_function_decl;
+   referenced_var_iterator rvi;
+   struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
  
    current_function_decl = node->decl;
!   push_cfun (fn);
  
!   gcc_checking_assert (gimple_referenced_vars (fn));
!   FOR_EACH_REFERENCED_VAR (fn, var, rvi)
!     size += expand_one_var (var, true, false);
  
    if (stack_vars_num > 0)
      {
Index: tree-sra.c
===================================================================
*** tree-sra.c	(revision 170240)
--- tree-sra.c	(working copy)
*************** convert_callers (struct cgraph_node *nod
*** 4362,4368 ****
      }
  
    for (cs = node->callers; cs; cs = cs->next_caller)
!     if (bitmap_set_bit (recomputed_callers, cs->caller->uid))
        compute_inline_parameters (cs->caller);
    BITMAP_FREE (recomputed_callers);
  
--- 4362,4369 ----
      }
  
    for (cs = node->callers; cs; cs = cs->next_caller)
!     if (bitmap_set_bit (recomputed_callers, cs->caller->uid)
! 	&& gimple_in_ssa_p (DECL_STRUCT_FUNCTION (cs->caller->decl)))
        compute_inline_parameters (cs->caller);
    BITMAP_FREE (recomputed_callers);
  
Index: tree-inline.c
===================================================================
*** tree-inline.c	(revision 170240)
--- tree-inline.c	(working copy)
*************** remap_decl (tree decl, copy_body_data *i
*** 312,324 ****
  	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
  	}
  
!       if (cfun && gimple_in_ssa_p (cfun)
! 	  && (TREE_CODE (t) == VAR_DECL
! 	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
! 	{
! 	  get_var_ann (t);
! 	  add_referenced_var (t);
! 	}
        return t;
      }
  
--- 312,328 ----
  	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
  	}
  
!       if ((TREE_CODE (t) == VAR_DECL
! 	   || TREE_CODE (t) == RESULT_DECL
! 	   || TREE_CODE (t) == PARM_DECL)
! 	  && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn)
! 	  && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
! 	  /* We don't want to mark as referenced VAR_DECLs that were
! 	     not marked as such in the src function.  */
! 	  && (TREE_CODE (decl) != VAR_DECL
! 	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
! 					DECL_UID (decl))))
! 	add_referenced_var (t);
        return t;
      }
  
*************** setup_one_parameter (copy_body_data *id,
*** 2547,2556 ****
  
    /* We're actually using the newly-created var.  */
    if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
!     {
!       get_var_ann (var);
!       add_referenced_var (var);
!     }
  
    /* Declare this new variable.  */
    DECL_CHAIN (var) = *vars;
--- 2551,2557 ----
  
    /* We're actually using the newly-created var.  */
    if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
!     add_referenced_var (var);
  
    /* Declare this new variable.  */
    DECL_CHAIN (var) = *vars;
*************** declare_return_variable (copy_body_data 
*** 2857,2866 ****
  
    var = copy_result_decl_to_var (result, id);
    if (gimple_in_ssa_p (cfun))
!     {
!       get_var_ann (var);
!       add_referenced_var (var);
!     }
  
    DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
  
--- 2858,2864 ----
  
    var = copy_result_decl_to_var (result, id);
    if (gimple_in_ssa_p (cfun))
!     add_referenced_var (var);
  
    DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
  
*************** declare_return_variable (copy_body_data 
*** 2896,2905 ****
      {
        tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
        if (gimple_in_ssa_p (id->src_cfun))
! 	{
! 	  get_var_ann (temp);
! 	  add_referenced_var (temp);
! 	}
        insert_decl_map (id, result, temp);
        /* When RESULT_DECL is in SSA form, we need to use it's default_def
  	 SSA_NAME.  */
--- 2894,2900 ----
      {
        tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
        if (gimple_in_ssa_p (id->src_cfun))
! 	add_referenced_var (temp);
        insert_decl_map (id, result, temp);
        /* When RESULT_DECL is in SSA form, we need to use it's default_def
  	 SSA_NAME.  */
*************** copy_decl_for_dup_finish (copy_body_data
*** 4753,4758 ****
--- 4748,4761 ----
         new function.  */
      DECL_CONTEXT (copy) = id->dst_fn;
  
+   if (TREE_CODE (decl) == VAR_DECL
+       /* C++ clones functions during parsing, before
+ 	 referenced_vars.  */
+       && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+       && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+ 				DECL_UID (decl)))
+     add_referenced_var (copy);
+ 
    return copy;
  }
  
*************** copy_arguments_for_versioning (tree orig
*** 4864,4870 ****
  	   as temporary variable later in function, the uses will be
  	   replaced by local variable.  */
  	tree var = copy_decl_to_var (arg, id);
- 	get_var_ann (var);
  	add_referenced_var (var);
  	insert_decl_map (id, arg, var);
          /* Declare this new variable.  */
--- 4867,4872 ----
Index: passes.c
===================================================================
*** passes.c	(revision 170240)
--- passes.c	(working copy)
*************** init_optimization_passes (void)
*** 729,735 ****
    NEXT_PASS (pass_build_cfg);
    NEXT_PASS (pass_warn_function_return);
    NEXT_PASS (pass_build_cgraph_edges);
-   NEXT_PASS (pass_inline_parameters);
    *p = NULL;
  
    /* Interprocedural optimization passes.  */
--- 729,734 ----
*************** init_optimization_passes (void)
*** 747,758 ****
        NEXT_PASS (pass_build_ssa);
        NEXT_PASS (pass_lower_vector);
        NEXT_PASS (pass_early_warn_uninitialized);
-       /* Note that it is not strictly necessary to schedule an early
- 	 inline pass here.  However, some test cases (e.g.,
- 	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
- 	 inline functions to be inlined even at -O0.  This does not
- 	 happen during the first early inline pass.  */
        NEXT_PASS (pass_rebuild_cgraph_edges);
        NEXT_PASS (pass_early_inline);
        NEXT_PASS (pass_all_early_optimizations);
  	{
--- 746,753 ----
        NEXT_PASS (pass_build_ssa);
        NEXT_PASS (pass_lower_vector);
        NEXT_PASS (pass_early_warn_uninitialized);
        NEXT_PASS (pass_rebuild_cgraph_edges);
+       NEXT_PASS (pass_inline_parameters);
        NEXT_PASS (pass_early_inline);
        NEXT_PASS (pass_all_early_optimizations);
  	{
H.J. Lu Feb. 17, 2011, 6:43 p.m. UTC | #6
On Thu, Feb 17, 2011 at 8:21 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> this is variant of patch I comitted after re-testing.
> The change relative to previous version I sent is new tree-sra hunk avoiding
> call of compute_inline_parameters on functions that are not in SSA yet and
> cgraph_process_new_functions calling it in functions that arrive in SSA form
> and thus are not early optimized.
>
> Honza
>
> Index: ChangeLog
> ===================================================================
> *** ChangeLog   (revision 170248)
> --- ChangeLog   (working copy)
> ***************
> *** 1,3 ****
> --- 1,29 ----
> + 2011-02-17  Alexandre Oliva  <aoliva@redhat.com>
> +           Jan Hubicka  <jh@suse.cz>
> +
> +       PR debug/47106
> +       PR debug/47402
> +       * cfgexpand.c (account_used_vars_for_block): Remove.
> +       (estimated_stack_frame_size): Use referenced vars.
> +       * tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
> +       that were referenced in the original function.  Test src_fn
> +       rather than cfun.  Drop redundant get_var_ann.
> +       (setup_one_parameter): Drop redundant get_var_ann.
> +       (declare_return_variable): Likewise.
> +       (copy_decl_for_dup_finish): Mark VAR_DECLs referenced in src_fn.
> +       (copy_arguments_for_versioning): Drop redundant get_var_ann.
> +       * ipa-inline.c (compute_inline_parameters): Do not compute
> +       disregard_inline_limits here.
> +       are not available.
> +       (compute_inlinable_for_current, pass_inlinable): New.
> +       (pass_inline_parameters): Require PROP_referenced_vars.
> +       * cgraphunit.c (cgraph_process_new_functions): Don't run
> +       compute_inline_parameters explicitly unless function is in
> +       SSA form.
> +       (cgraph_analyze_function): Set .disregard_inline_limits.
> +       * tree-sra.c (convert_callers): Compute inliner parameters
> +       only for functions already in SSA form.
> +

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47788
Jack Howarth Feb. 18, 2011, 3:27 a.m. UTC | #7
On Thu, Feb 17, 2011 at 09:35:56AM +0100, Jan Hubicka wrote:
> Hi,
> this is variant of patch I am testing now.
> Quick testing on the problematic PR shows no problems and we get around w/o
> the extra pass.
> 
> Honza

Honza,
    It seems that this patch may have introduced the regressions....

FAIL: gcc.dg/lto/20081115 c_lto_20081115_0.o-c_lto_20081115_2.o execute -O0 -flto -flto-partition=1to1
FAIL: gcc.dg/lto/20081118 c_lto_20081118_0.o-c_lto_20081118_2.o execute -O0 -flto -flto-partition=1to1
FAIL: gcc.dg/lto/20081201-1 c_lto_20081201-1_0.o-c_lto_20081201-1_2.o execute -O0 -flto -flto-partition=1to1

looking at http://gcc.gnu.org/ml/gcc-testresults/2011-02/msg01897.html compared to
http://gcc.gnu.org/ml/gcc-testresults/2011-02/msg01911.html. I also see this on x86_64-apple-darwin10.
             Jack

> 
> Index: cgraphunit.c
> ===================================================================
> *** cgraphunit.c	(revision 170240)
> --- cgraphunit.c	(working copy)
> *************** cgraph_analyze_function (struct cgraph_n
> *** 783,788 ****
> --- 783,793 ----
>   
>     assign_assembler_name_if_neeeded (node->decl);
>   
> +   /* disregard_inline_limits affects topological order of the early optimization,
> +      so we need to compute it ahead of rest of inline parameters.  */
> +   node->local.disregard_inline_limits
> +     = DECL_DISREGARD_INLINE_LIMITS (node->decl);
> + 
>     /* Make sure to gimplify bodies only once.  During analyzing a
>        function we lower it, which will require gimplified nested
>        functions, so we can end up here with an already gimplified
> Index: testsuite/g++.dg/debug/pr47106.C
> ===================================================================
> *** testsuite/g++.dg/debug/pr47106.C	(revision 0)
> --- testsuite/g++.dg/debug/pr47106.C	(revision 0)
> ***************
> *** 0 ****
> --- 1,37 ----
> + // { dg-do compile }
> + // { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
> + 
> + void end (int, int) __attribute__ ((__noreturn__));
> + 
> + struct S
> + {
> +   int i;
> +   S *s;
> + };
> + 
> + inline bool f (S *s)
> + {
> +   if (!s->s)
> +     end (0, 0);
> +   return s->s == s;
> + }
> + 
> + inline bool
> + baz (S s1, S)
> + {
> +   while (f (&s1));
> + }
> + 
> + inline bool
> + bar (S s1, S s2, S)
> + {
> +   baz (s1, s2);
> + }
> + 
> + S getS ();
> + 
> + bool
> + foo ()
> + {
> +   bar (getS (), getS (), getS ());
> + }
> Index: ipa-inline.c
> ===================================================================
> *** ipa-inline.c	(revision 170240)
> --- ipa-inline.c	(working copy)
> *************** compute_inline_parameters (struct cgraph
> *** 2006,2014 ****
>   	  break;
>         node->local.can_change_signature = !e;
>       }
> -   if (node->local.inlinable && !node->local.disregard_inline_limits)
> -     node->local.disregard_inline_limits
> -       = DECL_DISREGARD_INLINE_LIMITS (node->decl);
>     estimate_function_body_sizes (node);
>     /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
>     node->global.time = inline_summary (node)->self_time;
> --- 2006,2011 ----
> Index: cfgexpand.c
> ===================================================================
> *** cfgexpand.c	(revision 170240)
> --- cfgexpand.c	(working copy)
> *************** create_stack_guard (void)
> *** 1311,1340 ****
>     crtl->stack_protect_guard = guard;
>   }
>   
> - /* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
> -    expanding variables.  Those variables that can be put into registers
> -    are allocated pseudos; those that can't are put on the stack.
> - 
> -    TOPLEVEL is true if this is the outermost BLOCK.  */
> - 
> - static HOST_WIDE_INT
> - account_used_vars_for_block (tree block, bool toplevel)
> - {
> -   tree t;
> -   HOST_WIDE_INT size = 0;
> - 
> -   /* Expand all variables at this level.  */
> -   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
> -     if (var_ann (t) && is_used_p (t))
> -       size += expand_one_var (t, toplevel, false);
> - 
> -   /* Expand all variables at containing levels.  */
> -   for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
> -     size += account_used_vars_for_block (t, false);
> - 
> -   return size;
> - }
> - 
>   /* Prepare for expanding variables.  */
>   static void
>   init_vars_expansion (void)
> --- 1311,1316 ----
> *************** estimated_stack_frame_size (struct cgrap
> *** 1379,1401 ****
>   {
>     HOST_WIDE_INT size = 0;
>     size_t i;
> !   tree var, outer_block = DECL_INITIAL (current_function_decl);
> !   unsigned ix;
>     tree old_cur_fun_decl = current_function_decl;
>   
>     current_function_decl = node->decl;
> !   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
>   
> !   init_vars_expansion ();
> ! 
> !   FOR_EACH_LOCAL_DECL (cfun, ix, var)
> !     {
> !       /* TREE_USED marks local variables that do not appear in lexical
> ! 	 blocks.  We don't want to expand those that do twice.  */
> !       if (TREE_USED (var))
> !         size += expand_one_var (var, true, false);
> !     }
> !   size += account_used_vars_for_block (outer_block, true);
>   
>     if (stack_vars_num > 0)
>       {
> --- 1355,1371 ----
>   {
>     HOST_WIDE_INT size = 0;
>     size_t i;
> !   tree var;
>     tree old_cur_fun_decl = current_function_decl;
> +   referenced_var_iterator rvi;
> +   struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
>   
>     current_function_decl = node->decl;
> !   push_cfun (fn);
>   
> !   gcc_checking_assert (gimple_referenced_vars (fn));
> !   FOR_EACH_REFERENCED_VAR (fn, var, rvi)
> !     size += expand_one_var (var, true, false);
>   
>     if (stack_vars_num > 0)
>       {
> Index: tree-inline.c
> ===================================================================
> *** tree-inline.c	(revision 170240)
> --- tree-inline.c	(working copy)
> *************** remap_decl (tree decl, copy_body_data *i
> *** 312,324 ****
>   	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
>   	}
>   
> !       if (cfun && gimple_in_ssa_p (cfun)
> ! 	  && (TREE_CODE (t) == VAR_DECL
> ! 	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
> ! 	{
> ! 	  get_var_ann (t);
> ! 	  add_referenced_var (t);
> ! 	}
>         return t;
>       }
>   
> --- 312,328 ----
>   	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
>   	}
>   
> !       if ((TREE_CODE (t) == VAR_DECL
> ! 	   || TREE_CODE (t) == RESULT_DECL
> ! 	   || TREE_CODE (t) == PARM_DECL)
> ! 	  && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn)
> ! 	  && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
> ! 	  /* We don't want to mark as referenced VAR_DECLs that were
> ! 	     not marked as such in the src function.  */
> ! 	  && (TREE_CODE (decl) != VAR_DECL
> ! 	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
> ! 					DECL_UID (decl))))
> ! 	add_referenced_var (t);
>         return t;
>       }
>   
> *************** setup_one_parameter (copy_body_data *id,
> *** 2547,2556 ****
>   
>     /* We're actually using the newly-created var.  */
>     if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
> !     {
> !       get_var_ann (var);
> !       add_referenced_var (var);
> !     }
>   
>     /* Declare this new variable.  */
>     DECL_CHAIN (var) = *vars;
> --- 2551,2557 ----
>   
>     /* We're actually using the newly-created var.  */
>     if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
> !     add_referenced_var (var);
>   
>     /* Declare this new variable.  */
>     DECL_CHAIN (var) = *vars;
> *************** declare_return_variable (copy_body_data 
> *** 2857,2866 ****
>   
>     var = copy_result_decl_to_var (result, id);
>     if (gimple_in_ssa_p (cfun))
> !     {
> !       get_var_ann (var);
> !       add_referenced_var (var);
> !     }
>   
>     DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
>   
> --- 2858,2864 ----
>   
>     var = copy_result_decl_to_var (result, id);
>     if (gimple_in_ssa_p (cfun))
> !     add_referenced_var (var);
>   
>     DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
>   
> *************** declare_return_variable (copy_body_data 
> *** 2896,2905 ****
>       {
>         tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
>         if (gimple_in_ssa_p (id->src_cfun))
> ! 	{
> ! 	  get_var_ann (temp);
> ! 	  add_referenced_var (temp);
> ! 	}
>         insert_decl_map (id, result, temp);
>         /* When RESULT_DECL is in SSA form, we need to use it's default_def
>   	 SSA_NAME.  */
> --- 2894,2900 ----
>       {
>         tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
>         if (gimple_in_ssa_p (id->src_cfun))
> ! 	add_referenced_var (temp);
>         insert_decl_map (id, result, temp);
>         /* When RESULT_DECL is in SSA form, we need to use it's default_def
>   	 SSA_NAME.  */
> *************** copy_decl_for_dup_finish (copy_body_data
> *** 4753,4758 ****
> --- 4748,4761 ----
>          new function.  */
>       DECL_CONTEXT (copy) = id->dst_fn;
>   
> +   if (TREE_CODE (decl) == VAR_DECL
> +       /* C++ clones functions during parsing, before
> + 	 referenced_vars.  */
> +       && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
> +       && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
> + 				DECL_UID (decl)))
> +     add_referenced_var (copy);
> + 
>     return copy;
>   }
>   
> *************** copy_arguments_for_versioning (tree orig
> *** 4864,4870 ****
>   	   as temporary variable later in function, the uses will be
>   	   replaced by local variable.  */
>   	tree var = copy_decl_to_var (arg, id);
> - 	get_var_ann (var);
>   	add_referenced_var (var);
>   	insert_decl_map (id, arg, var);
>           /* Declare this new variable.  */
> --- 4867,4872 ----
> Index: passes.c
> ===================================================================
> *** passes.c	(revision 170240)
> --- passes.c	(working copy)
> *************** init_optimization_passes (void)
> *** 729,735 ****
>     NEXT_PASS (pass_build_cfg);
>     NEXT_PASS (pass_warn_function_return);
>     NEXT_PASS (pass_build_cgraph_edges);
> -   NEXT_PASS (pass_inline_parameters);
>     *p = NULL;
>   
>     /* Interprocedural optimization passes.  */
> --- 729,734 ----
> *************** init_optimization_passes (void)
> *** 747,758 ****
>         NEXT_PASS (pass_build_ssa);
>         NEXT_PASS (pass_lower_vector);
>         NEXT_PASS (pass_early_warn_uninitialized);
> -       /* Note that it is not strictly necessary to schedule an early
> - 	 inline pass here.  However, some test cases (e.g.,
> - 	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
> - 	 inline functions to be inlined even at -O0.  This does not
> - 	 happen during the first early inline pass.  */
>         NEXT_PASS (pass_rebuild_cgraph_edges);
>         NEXT_PASS (pass_early_inline);
>         NEXT_PASS (pass_all_early_optimizations);
>   	{
> --- 746,753 ----
>         NEXT_PASS (pass_build_ssa);
>         NEXT_PASS (pass_lower_vector);
>         NEXT_PASS (pass_early_warn_uninitialized);
>         NEXT_PASS (pass_rebuild_cgraph_edges);
> +       NEXT_PASS (pass_inline_parameters);
>         NEXT_PASS (pass_early_inline);
>         NEXT_PASS (pass_all_early_optimizations);
>   	{
Jakub Jelinek Feb. 18, 2011, 8:21 a.m. UTC | #8
On Thu, Feb 17, 2011 at 10:27:23PM -0500, Jack Howarth wrote:
> On Thu, Feb 17, 2011 at 09:35:56AM +0100, Jan Hubicka wrote:
>     It seems that this patch may have introduced the regressions....
> 
> FAIL: gcc.dg/lto/20081115 c_lto_20081115_0.o-c_lto_20081115_2.o execute -O0 -flto -flto-partition=1to1
> FAIL: gcc.dg/lto/20081118 c_lto_20081118_0.o-c_lto_20081118_2.o execute -O0 -flto -flto-partition=1to1
> FAIL: gcc.dg/lto/20081201-1 c_lto_20081201-1_0.o-c_lto_20081201-1_2.o execute -O0 -flto -flto-partition=1to1
> 
> looking at http://gcc.gnu.org/ml/gcc-testresults/2011-02/msg01897.html compared to
> http://gcc.gnu.org/ml/gcc-testresults/2011-02/msg01911.html. I also see this on x86_64-apple-darwin10.

That's http://gcc.gnu.org/PR47788, already reported.

	Jakub
diff mbox

Patch

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* cfgexpand.c (account_used_vars_for_block): Remove.
	(estimated_stack_frame_size): Use referenced vars.
	* tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
	that were referenced in the original function.  Test src_fn
	rather than cfun.  Drop redundant get_var_ann.
	(setup_one_parameter): Drop redundant get_var_ann.
	(declare_return_variable): Likewise.
	(copy_decl_for_dup_finish): Mark VAR_DECLs referenced in src_fn.
	(copy_arguments_for_versioning): Drop redundant get_var_ann.
	* tree-pass.h (pass_inlinable): Declare.
	* passes.c (init_optimization_passes): Move first
	pass_inline_parameters after pass_referenced_vars.  Add
	pass_inlinable where it was before.
	* ipa-inline.c (compute_inlinable): New, split out of...
	(compute_inline_parameters): ... this.  Quit if referenced_vars
	are not available.
	(compute_inlinable_for_current, pass_inlinable): New.
	(pass_inline_parameters): Require PROP_referenced_vars.
	* cgraphunit.c (cgraph_process_new_functions): Don't run
	compute_inline_parameters explicitly.

for  gcc/testsuite/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* g++.dg/debug/pr47106.C: New.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-11 13:42:37.430636412 -0200
+++ gcc/cfgexpand.c	2011-02-11 13:42:40.067581306 -0200
@@ -1311,30 +1311,6 @@  create_stack_guard (void)
   crtl->stack_protect_guard = guard;
 }
 
-/* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
-   expanding variables.  Those variables that can be put into registers
-   are allocated pseudos; those that can't are put on the stack.
-
-   TOPLEVEL is true if this is the outermost BLOCK.  */
-
-static HOST_WIDE_INT
-account_used_vars_for_block (tree block, bool toplevel)
-{
-  tree t;
-  HOST_WIDE_INT size = 0;
-
-  /* Expand all variables at this level.  */
-  for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (var_ann (t) && is_used_p (t))
-      size += expand_one_var (t, toplevel, false);
-
-  /* Expand all variables at containing levels.  */
-  for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
-    size += account_used_vars_for_block (t, false);
-
-  return size;
-}
-
 /* Prepare for expanding variables.  */
 static void
 init_vars_expansion (void)
@@ -1379,23 +1355,17 @@  estimated_stack_frame_size (struct cgrap
 {
   HOST_WIDE_INT size = 0;
   size_t i;
-  tree var, outer_block = DECL_INITIAL (current_function_decl);
-  unsigned ix;
+  tree var;
   tree old_cur_fun_decl = current_function_decl;
+  referenced_var_iterator rvi;
+  struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
 
   current_function_decl = node->decl;
-  push_cfun (DECL_STRUCT_FUNCTION (node->decl));
+  push_cfun (fn);
 
-  init_vars_expansion ();
-
-  FOR_EACH_LOCAL_DECL (cfun, ix, var)
-    {
-      /* TREE_USED marks local variables that do not appear in lexical
-	 blocks.  We don't want to expand those that do twice.  */
-      if (TREE_USED (var))
-        size += expand_one_var (var, true, false);
-    }
-  size += account_used_vars_for_block (outer_block, true);
+  gcc_checking_assert (gimple_referenced_vars (fn));
+  FOR_EACH_REFERENCED_VAR (fn, var, rvi)
+    size += expand_one_var (var, true, false);
 
   if (stack_vars_num > 0)
     {
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-02-11 13:42:38.454614859 -0200
+++ gcc/tree-inline.c	2011-02-11 13:42:40.122579711 -0200
@@ -312,13 +312,17 @@  remap_decl (tree decl, copy_body_data *i
 	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
 	}
 
-      if (cfun && gimple_in_ssa_p (cfun)
-	  && (TREE_CODE (t) == VAR_DECL
-	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
-	{
-	  get_var_ann (t);
-	  add_referenced_var (t);
-	}
+      if ((TREE_CODE (t) == VAR_DECL
+	   || TREE_CODE (t) == RESULT_DECL
+	   || TREE_CODE (t) == PARM_DECL)
+	  && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn)
+	  && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+	  /* We don't want to mark as referenced VAR_DECLs that were
+	     not marked as such in the src function.  */
+	  && (TREE_CODE (decl) != VAR_DECL
+	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+					DECL_UID (decl))))
+	add_referenced_var (t);
       return t;
     }
 
@@ -2547,10 +2551,7 @@  setup_one_parameter (copy_body_data *id,
 
   /* We're actually using the newly-created var.  */
   if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
-    {
-      get_var_ann (var);
-      add_referenced_var (var);
-    }
+    add_referenced_var (var);
 
   /* Declare this new variable.  */
   DECL_CHAIN (var) = *vars;
@@ -2857,10 +2858,7 @@  declare_return_variable (copy_body_data 
 
   var = copy_result_decl_to_var (result, id);
   if (gimple_in_ssa_p (cfun))
-    {
-      get_var_ann (var);
-      add_referenced_var (var);
-    }
+    add_referenced_var (var);
 
   DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
 
@@ -2896,10 +2894,7 @@  declare_return_variable (copy_body_data 
     {
       tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
       if (gimple_in_ssa_p (id->src_cfun))
-	{
-	  get_var_ann (temp);
-	  add_referenced_var (temp);
-	}
+	add_referenced_var (temp);
       insert_decl_map (id, result, temp);
       /* When RESULT_DECL is in SSA form, we need to use it's default_def
 	 SSA_NAME.  */
@@ -4753,6 +4748,14 @@  copy_decl_for_dup_finish (copy_body_data
        new function.  */
     DECL_CONTEXT (copy) = id->dst_fn;
 
+  if (TREE_CODE (decl) == VAR_DECL
+      /* C++ clones functions during parsing, before
+	 referenced_vars.  */
+      && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+      && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+				DECL_UID (decl)))
+    add_referenced_var (copy);
+
   return copy;
 }
 
@@ -4864,7 +4867,6 @@  copy_arguments_for_versioning (tree orig
 	   as temporary variable later in function, the uses will be
 	   replaced by local variable.  */
 	tree var = copy_decl_to_var (arg, id);
-	get_var_ann (var);
 	add_referenced_var (var);
 	insert_decl_map (id, arg, var);
         /* Declare this new variable.  */
Index: gcc/passes.c
===================================================================
--- gcc/passes.c.orig	2011-02-11 13:42:38.255618853 -0200
+++ gcc/passes.c	2011-02-11 13:42:40.146579235 -0200
@@ -729,7 +729,7 @@  init_optimization_passes (void)
   NEXT_PASS (pass_build_cfg);
   NEXT_PASS (pass_warn_function_return);
   NEXT_PASS (pass_build_cgraph_edges);
-  NEXT_PASS (pass_inline_parameters);
+  NEXT_PASS (pass_inlinable);
   *p = NULL;
 
   /* Interprocedural optimization passes.  */
@@ -747,12 +747,8 @@  init_optimization_passes (void)
       NEXT_PASS (pass_build_ssa);
       NEXT_PASS (pass_lower_vector);
       NEXT_PASS (pass_early_warn_uninitialized);
-      /* Note that it is not strictly necessary to schedule an early
-	 inline pass here.  However, some test cases (e.g.,
-	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
-	 inline functions to be inlined even at -O0.  This does not
-	 happen during the first early inline pass.  */
       NEXT_PASS (pass_rebuild_cgraph_edges);
+      NEXT_PASS (pass_inline_parameters);
       NEXT_PASS (pass_early_inline);
       NEXT_PASS (pass_all_early_optimizations);
 	{
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-11 13:42:37.859627265 -0200
+++ gcc/ipa-inline.c	2011-02-11 13:42:55.655254908 -0200
@@ -1974,21 +1974,10 @@  estimate_function_body_sizes (struct cgr
   inline_summary (node)->size_inlining_benefit = size_inlining_benefit;
 }
 
-/* Compute parameters of functions used by inliner.  */
-void
-compute_inline_parameters (struct cgraph_node *node)
+/* Determine whether node is inlinable.  */
+static void
+compute_inlinable (struct cgraph_node *node)
 {
-  HOST_WIDE_INT self_stack_size;
-
-  gcc_assert (!node->global.inlined_to);
-
-  /* Estimate the stack size for the function if we're optimizing.  */
-  self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
-  inline_summary (node)->estimated_self_stack_size = self_stack_size;
-  node->global.estimated_stack_size = self_stack_size;
-  node->global.stack_frame_offset = 0;
-
-  /* Can this function be inlined at all?  */
   node->local.inlinable = tree_inlinable_function_p (node->decl);
 
   /* Inlinable functions always can change signature.  */
@@ -1998,7 +1987,7 @@  compute_inline_parameters (struct cgraph
     {
       struct cgraph_edge *e;
 
-      /* Functions calling builtlin_apply can not change signature.  */
+      /* Functions calling builtin_apply can not change signature.  */
       for (e = node->callees; e; e = e->next_callee)
 	if (DECL_BUILT_IN (e->callee->decl)
 	    && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
@@ -2006,9 +1995,33 @@  compute_inline_parameters (struct cgraph
 	  break;
       node->local.can_change_signature = !e;
     }
+
   if (node->local.inlinable && !node->local.disregard_inline_limits)
     node->local.disregard_inline_limits
       = DECL_DISREGARD_INLINE_LIMITS (node->decl);
+}
+
+/* Compute parameters of functions used by inliner.  */
+void
+compute_inline_parameters (struct cgraph_node *node)
+{
+  HOST_WIDE_INT self_stack_size;
+
+  gcc_assert (!node->global.inlined_to);
+
+  compute_inlinable (node);
+
+  /* A function won't be considered for inlining before we put it in
+     SSA form and compute referenced_vars.  */
+  if (!gimple_referenced_vars (DECL_STRUCT_FUNCTION (node->decl)))
+    return;
+
+  /* Estimate the stack size for the function if we're optimizing.  */
+  self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
+  inline_summary (node)->estimated_self_stack_size = self_stack_size;
+  node->global.estimated_stack_size = self_stack_size;
+  node->global.stack_frame_offset = 0;
+
   estimate_function_body_sizes (node);
   /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
   node->global.time = inline_summary (node)->self_time;
@@ -2016,6 +2029,14 @@  compute_inline_parameters (struct cgraph
 }
 
 
+/* Determine whether current_function_decl is inlinable.  */
+static unsigned int
+compute_inlinable_for_current (void)
+{
+  compute_inlinable (cgraph_node (current_function_decl));
+  return 0;
+}
+
 /* Compute parameters of functions used by inliner using
    current_function_decl.  */
 static unsigned int
@@ -2025,6 +2046,25 @@  compute_inline_parameters_for_current (v
   return 0;
 }
 
+struct gimple_opt_pass pass_inlinable =
+{
+ {
+  GIMPLE_PASS,
+  "inlinable",				/* name */
+  NULL,					/* gate */
+  compute_inlinable_for_current,	/* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  TV_INLINE_HEURISTICS,			/* tv_id */
+  0,					/* properties_required */
+  0,					/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  0					/* todo_flags_finish */
+ }
+};
+
 struct gimple_opt_pass pass_inline_parameters =
 {
  {
@@ -2036,7 +2076,7 @@  struct gimple_opt_pass pass_inline_param
   NULL,					/* next */
   0,					/* static_pass_number */
   TV_INLINE_HEURISTICS,			/* tv_id */
-  0,	                                /* properties_required */
+  PROP_referenced_vars,	                /* properties_required */
   0,					/* properties_provided */
   0,					/* properties_destroyed */
   0,					/* todo_flags_start */
Index: gcc/cgraphunit.c
===================================================================
--- gcc/cgraphunit.c.orig	2011-02-11 13:42:38.057623032 -0200
+++ gcc/cgraphunit.c	2011-02-11 13:42:40.211577872 -0200
@@ -246,7 +246,6 @@  cgraph_process_new_functions (void)
 	    cgraph_analyze_function (node);
 	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
 	  current_function_decl = fndecl;
-	  compute_inline_parameters (node);
 	  if ((cgraph_state == CGRAPH_STATE_IPA_SSA
 	      && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
 	      /* When not optimizing, be sure we run early local passes anyway
Index: gcc/testsuite/g++.dg/debug/pr47106.C
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/testsuite/g++.dg/debug/pr47106.C	2011-02-11 13:42:40.236577347 -0200
@@ -0,0 +1,37 @@ 
+// { dg-do compile }
+// { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+
+void end (int, int) __attribute__ ((__noreturn__));
+
+struct S
+{
+  int i;
+  S *s;
+};
+
+inline bool f (S *s)
+{
+  if (!s->s)
+    end (0, 0);
+  return s->s == s;
+}
+
+inline bool
+baz (S s1, S)
+{
+  while (f (&s1));
+}
+
+inline bool
+bar (S s1, S s2, S)
+{
+  baz (s1, s2);
+}
+
+S getS ();
+
+bool
+foo ()
+{
+  bar (getS (), getS (), getS ());
+}
Index: gcc/tree-pass.h
===================================================================
--- gcc/tree-pass.h.orig	2011-02-11 13:42:37.641631793 -0200
+++ gcc/tree-pass.h	2011-02-11 13:42:40.249577075 -0200
@@ -572,6 +572,7 @@  extern struct rtl_opt_pass pass_final;
 extern struct rtl_opt_pass pass_rtl_seqabstr;
 extern struct gimple_opt_pass pass_release_ssa_names;
 extern struct gimple_opt_pass pass_early_inline;
+extern struct gimple_opt_pass pass_inlinable;
 extern struct gimple_opt_pass pass_inline_parameters;
 extern struct gimple_opt_pass pass_all_early_optimizations;
 extern struct gimple_opt_pass pass_update_address_taken;