Patchwork [PR,debug/47106] account used vars only once

login
register
mail settings
Submitter Alexandre Oliva
Date Jan. 6, 2011, 10:11 a.m.
Message ID <or8vyy2vgb.fsf@livre.localdomain>
Download mbox | patch
Permalink /patch/77699/
State New
Headers show

Comments

Alexandre Oliva - Jan. 6, 2011, 10:11 a.m.
This -fcompare-debug failure occurs because we remove variables from the
lexical blocks in the compilation without debug info, but we account the
stack space they won't use in the compilation with debug info, which
leads to different inlining decisions when stack-growth limits are set.

It turns out that the use of TREE_USED in the stack frame size
estimation functions seems to be misguided: we initially set it for all
local decls that don't appear in lexical blocks (not shown in patch),
then account the size of all LOCAL_DECLs that have it set (which means
accounting only those that don't appear in lexical blocks), set it on
all LOCAL_DECLs, and then account the size of decls in lexical blocks if
they have TREE_USED (which means accounting only those that are in
LOCAL_DECLs, not accounting those that aren't).

I fixed it so that we leave TREE_USED alone after initialization: we
account variables that aren't in lexical blocks while looping over
LOCAL_DECLs, and then account all variables in lexical blocks, as long
as they are actually used, which avoids accounting declarations retained
only for purposes of debug information generation.

I suppose restoring the assignment and test of TREE_USED would also
work, but AFAICT it would just waste CPU cycles.

This was regstrapped on x86_{32,64}-linux-gnu.  I also bootstrapped it
with BOOT_CFLAGS='-O2 -g -fpartial-inlining -flto -fconserve-stack', and
regstrapping succeeded except for gfortran.dg/realloc_on_assign_2.exe at
all torture levels: stage3 f951 fails to compile it, although a
non-bootstrap f951 does, on both x86_{32,64}.  From that I concluded
that stage[23]s are miscompiled with these BOOT_CFLAGS above, even
before my patch, but I haven't debugged or investigated that any
further.

Ok to install?
Richard Guenther - Jan. 10, 2011, 3:22 p.m.
On Thu, Jan 6, 2011 at 11:11 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> This -fcompare-debug failure occurs because we remove variables from the
> lexical blocks in the compilation without debug info, but we account the
> stack space they won't use in the compilation with debug info, which
> leads to different inlining decisions when stack-growth limits are set.
>
> It turns out that the use of TREE_USED in the stack frame size
> estimation functions seems to be misguided: we initially set it for all
> local decls that don't appear in lexical blocks (not shown in patch),
> then account the size of all LOCAL_DECLs that have it set (which means
> accounting only those that don't appear in lexical blocks), set it on
> all LOCAL_DECLs, and then account the size of decls in lexical blocks if
> they have TREE_USED (which means accounting only those that are in
> LOCAL_DECLs, not accounting those that aren't).
>
> I fixed it so that we leave TREE_USED alone after initialization: we
> account variables that aren't in lexical blocks while looping over
> LOCAL_DECLs, and then account all variables in lexical blocks, as long
> as they are actually used, which avoids accounting declarations retained
> only for purposes of debug information generation.
>
> I suppose restoring the assignment and test of TREE_USED would also
> work, but AFAICT it would just waste CPU cycles.
>
> This was regstrapped on x86_{32,64}-linux-gnu.  I also bootstrapped it
> with BOOT_CFLAGS='-O2 -g -fpartial-inlining -flto -fconserve-stack', and
> regstrapping succeeded except for gfortran.dg/realloc_on_assign_2.exe at
> all torture levels: stage3 f951 fails to compile it, although a
> non-bootstrap f951 does, on both x86_{32,64}.  From that I concluded
> that stage[23]s are miscompiled with these BOOT_CFLAGS above, even
> before my patch, but I haven't debugged or investigated that any
> further.
>
> Ok to install?

It looks ok when just looking at the documentation of the var_ann
used filed, but I cannot find the code that computes it during
out-of-SSA, instead do we rely on remove-unused-locals computing it?
Are we sure that we at least
run local vars removal once before we estimated stack frame
size the first time - I see pass_inline_parameters before the
first run?  All variables are initially created with used
set to false (which is a non-conservative default, likely false
even).  I guess I'm more happy with the patch if you switch the
initialization in create_var_ann to mark the variable used.

I can't make sense of the TREE_USED setting in
estimate_stack_frame_size in the first place - maybe if
the loop in account_used_vars_for_block would be
testing !TREE_USED ..., but then only using the
var-anns used flag in the for-each-local-decl loop will
have the same problem with debug-compare?

Thus, I can't see how the duplicate-accounting avoiding
works (even before your patch), and I smell an inconsistency
when only partially using var-anns used flags.

Maybe Honza can explain how the code works ...?

Thanks,
Richard.

>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer
>
>
Jan Hubicka - Jan. 10, 2011, 3:33 p.m.
> It looks ok when just looking at the documentation of the var_ann
> used filed, but I cannot find the code that computes it during
> out-of-SSA, instead do we rely on remove-unused-locals computing it?

AFAIK remove-unused-locals was supposed to clear it and it was supposed
to be run after passes removing references to locals, so it should be run
before we get into inlinine parameters.
> Are we sure that we at least
> run local vars removal once before we estimated stack frame
> size the first time - I see pass_inline_parameters before the
> first run?  All variables are initially created with used
> set to false (which is a non-conservative default, likely false
> even).  I guess I'm more happy with the patch if you switch the
> initialization in create_var_ann to mark the variable used.
> 
> I can't make sense of the TREE_USED setting in
> estimate_stack_frame_size in the first place - maybe if
> the loop in account_used_vars_for_block would be
> testing !TREE_USED ..., but then only using the
> var-anns used flag in the for-each-local-decl loop will
> have the same problem with debug-compare?
> 
> Thus, I can't see how the duplicate-accounting avoiding
> works (even before your patch), and I smell an inconsistency
> when only partially using var-anns used flags.
> 
> Maybe Honza can explain how the code works ...?
Nope, I think this is Steven's code, since the code originally just executed
the Rth's stack packing path and got its result.

Honza

> 
> Thanks,
> Richard.
> 
> >
> >
> > --
> > Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> > You must be the change you wish to see in the world. -- Gandhi
> > Be Free! -- http://FSFLA.org/   FSF Latin America board member
> > Free Software Evangelist      Red Hat Brazil Compiler Engineer
> >
> >
Richard Guenther - Jan. 10, 2011, 4:30 p.m.
2011/1/10 Jan Hubicka <hubicka@ucw.cz>:
>> It looks ok when just looking at the documentation of the var_ann
>> used filed, but I cannot find the code that computes it during
>> out-of-SSA, instead do we rely on remove-unused-locals computing it?
>
> AFAIK remove-unused-locals was supposed to clear it and it was supposed
> to be run after passes removing references to locals, so it should be run
> before we get into inlinine parameters.

inline-parameters is run in lowering passes, so for cycles we'll end
up estimating stack size to zero?  Or can we drop the early
inline-parameters pass run and rely on that after early opts?

>> Are we sure that we at least
>> run local vars removal once before we estimated stack frame
>> size the first time - I see pass_inline_parameters before the
>> first run?  All variables are initially created with used
>> set to false (which is a non-conservative default, likely false
>> even).  I guess I'm more happy with the patch if you switch the
>> initialization in create_var_ann to mark the variable used.
>>
>> I can't make sense of the TREE_USED setting in
>> estimate_stack_frame_size in the first place - maybe if
>> the loop in account_used_vars_for_block would be
>> testing !TREE_USED ..., but then only using the
>> var-anns used flag in the for-each-local-decl loop will
>> have the same problem with debug-compare?
>>
>> Thus, I can't see how the duplicate-accounting avoiding
>> works (even before your patch), and I smell an inconsistency
>> when only partially using var-anns used flags.
>>
>> Maybe Honza can explain how the code works ...?
> Nope, I think this is Steven's code, since the code originally just executed
> the Rth's stack packing path and got its result.

Hm.

Richard.
Jan Hubicka - Jan. 10, 2011, 4:52 p.m.
> 2011/1/10 Jan Hubicka <hubicka@ucw.cz>:
> >> It looks ok when just looking at the documentation of the var_ann
> >> used filed, but I cannot find the code that computes it during
> >> out-of-SSA, instead do we rely on remove-unused-locals computing it?
> >
> > AFAIK remove-unused-locals was supposed to clear it and it was supposed
> > to be run after passes removing references to locals, so it should be run
> > before we get into inlinine parameters.
> 
> inline-parameters is run in lowering passes, so for cycles we'll end
> up estimating stack size to zero?  Or can we drop the early
> inline-parameters pass run and rely on that after early opts?

well, the first inline-parameters is run for early inlining to work.
Early inlining needs both estimates of functon it is inlining into and functions
it is inlininig.  First is computed by the first invocation of inline-parameters,
the second is computed by the inline-parameters at the end of early optimizations.


I think we can simply move the first inline_parameters just befre early inlining.

Honza
> 
> >> Are we sure that we at least
> >> run local vars removal once before we estimated stack frame
> >> size the first time - I see pass_inline_parameters before the
> >> first run?  All variables are initially created with used
> >> set to false (which is a non-conservative default, likely false
> >> even).  I guess I'm more happy with the patch if you switch the
> >> initialization in create_var_ann to mark the variable used.
> >>
> >> I can't make sense of the TREE_USED setting in
> >> estimate_stack_frame_size in the first place - maybe if
> >> the loop in account_used_vars_for_block would be
> >> testing !TREE_USED ..., but then only using the
> >> var-anns used flag in the for-each-local-decl loop will
> >> have the same problem with debug-compare?
> >>
> >> Thus, I can't see how the duplicate-accounting avoiding
> >> works (even before your patch), and I smell an inconsistency
> >> when only partially using var-anns used flags.
> >>
> >> Maybe Honza can explain how the code works ...?
> > Nope, I think this is Steven's code, since the code originally just executed
> > the Rth's stack packing path and got its result.
> 
> Hm.
> 
> Richard.

Patch

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	* cfgexpand.c (account_used_vars_for_block): Only account vars
	that are annotated as used.
	(estimated_stack_frame_size): Don't set TREE_USED.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2010-12-30 18:05:43.930553884 -0200
+++ gcc/cfgexpand.c	2011-01-04 02:39:50.745382341 -0200
@@ -1325,7 +1325,7 @@  account_used_vars_for_block (tree block,
 
   /* Expand all variables at this level.  */
   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (TREE_USED (t))
+    if (var_ann (t) && var_ann (t)->used)
       size += expand_one_var (t, toplevel, false);
 
   /* Expand all variables at containing levels.  */
@@ -1389,9 +1389,10 @@  estimated_stack_frame_size (tree decl)
 
   FOR_EACH_LOCAL_DECL (cfun, ix, var)
     {
+      /* TREE_USED marks local variables that do not appear in lexical
+	 blocks.  We don't want to expand those that do twice.  */
       if (TREE_USED (var))
         size += expand_one_var (var, true, false);
-      TREE_USED (var) = 1;
     }
   size += account_used_vars_for_block (outer_block, true);