diff mbox

Fix late dwarf generated early from optimized out globals

Message ID 5d717e7f-9585-555a-3a33-664492018c8e@fgznet.ch
State New
Headers show

Commit Message

Andreas Tobler Jan. 5, 2017, 9:35 p.m. UTC
On 05.01.17 13:05, Richard Biener wrote:
> On Wed, 4 Jan 2017, Andreas Tobler wrote:
>
>> On 04.01.17 10:21, Richard Biener wrote:
>>> On Wed, 28 Dec 2016, Andreas Tobler wrote:
>>>
>>>> On 28.12.16 19:24, Richard Biener wrote:
>>>>> On December 27, 2016 11:17:00 PM GMT+01:00, Andreas Tobler
>>>>> <andreast-list@fgznet.ch> wrote:
>>>>>> On 16.09.16 13:30, Richard Biener wrote:
>>>>>>> On Thu, 15 Sep 2016, Richard Biener wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> This addresses sth I needed to address with the early LTO debug
>>>>>> patches
>>>>>>>> (you might now figure I'm piecemail merging stuff from that
>>>>>>>> patch).
>>>>>>>>
>>>>>>>> When the cgraph code optimizes out a global we call the
>>>>>> late_global_decl
>>>>>>>> debug hook to eventually add a DW_AT_const_value to its DIE (we
>>>>>> don't
>>>>>>>> really expect a location as that will be invalid after optimizing
>>>>>> out
>>>>>>>> and will be pruned).
>>>>>>>>
>>>>>>>> With the early LTO debug patches I have introduced a
>>>>>> early_dwarf_finished
>>>>>>>> flag (mainly for consistency checking) and I figured I can use
>>>>>>>> that
>>>>>> to
>>>>>>>> detect the call to the late hook during the early phase and
>>>>>>>> provide
>>>>>>>> the following cleaned up variant of avoiding to create locations
>>>>>> that
>>>>>>>> require later pruning (which doesn't work with emitting the early
>>>>>> DIEs).
>>>>>>>>
>>>>>>>> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>>>>>>>>
>>>>>>>> I verified it does the correct thing for a unit like
>>>>>>>>
>>>>>>>> static const int i = 2;
>>>>>>>>
>>>>>>>> (but ISTR we do have at least one testcase in the testsuite as
>>>>>> well).
>>>>>>>>
>>>>>>>> Will commit if testing finishes successfully.
>>>>>>>
>>>>>>> Ok, so it showed issues when merging that back to early-LTO-debug.
>>>>>>> Turns out in LTO we never call early_finish and thus
>>>>>> early_dwarf_finished
>>>>>>> was never set.  Also dwarf2out_late_global_decl itself is a better
>>>>>>> place to constrain generating locations.
>>>>>>>
>>>>>>> The following variant is in very late stage of testing.
>>>>>>>
>>>>>>> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>>>>>>> LTO bootstrap on x86_64-unknown-linux-gnu in stage3.  LTO bootstrap
>>>>>>> with early-LTO-debug in stage3, bootstraped with early-LTO-debug,
>>>>>>> testing in progress.
>>>>>>
>>>>>> Any chance to backport this commit (r240228) to 6.x?
>>>>>> It fixes a bootstrap comparison issue on aarch64-*-freebsd*.
>>>>>> The aarch64-*-freebsd* port is not yet merged to 6.x and 5.4.x due to
>>>>>> the bootstrap comparison failure I faced.
>>>>>
>>>>> Did you analyze the bootstrap miscompare?  I suspect the patch merely
>>>>> papers
>>>>> over the problem.
>>>>
>>>> gcc/contrib/compare-debug -p prev-gcc/ipa-icf.o gcc/ipa-icf.o
>>>> prev-gcc/ipa-icf.o.stripped. gcc/ipa-icf.o.stripped. differ: char 52841,
>>>> line
>>>> 253
>>>>
>>>>
>>>> The objdump -dSx diff on the non stripped object looked always more or
>>>> less
>>>> the same, a rodata offset which was different.
>>>>
>>>> -                       1448: R_AARCH64_ADD_ABS_LO12_NC .rodata+0x1d8
>>>> +                       1448: R_AARCH64_ADD_ABS_LO12_NC .rodata+0x410
>>>
>>> Hmm, sounds like a constant pool entry was created by -g at a different
>>> time (and thus offset) from regular compilation.  So yes, the patch
>>> in question should have the affect to "fix" this.
>>>
>>> Note that I later changed the fix with
>>>
>>> 2016-10-20  Richard Biener  <rguenther@suse.de>
>>>
>>>         * cgraphunit.c (analyze_functions): Set node->definition to
>>>         false to signal symbol removal to debug_hooks->late_global_decl.
>>>         * ipa.c (symbol_table::remove_unreachable_nodes): When not in
>>>         WPA signal symbol removal to the debuginfo machinery.
>>>         * dwarf2out.c (dwarf2out_late_global_decl): Instead of
>>>         using early_finised to guard the we're called for symbol
>>>         removal case look at the symtabs definition flag.
>>>         (gen_variable_die): Remove redundant check.
>>>
>>> (the dwarf2out_late_global_decl and analyze_functions part are
>>> relevant).
>>>
>>> That should be the fix to backport (avoiding the early_dwarf_finished
>>> part).
>>
>> Ok, I bootstrapped with the attached snippet (had to tweak a bit to apply
>> cleanly). w/o the previous mentioned fix (r240228). Is this what you had in
>> mind or do you think some parts of the other fix (r240228) is also needed?
>
> Not sure if you need the dwarf2out.c part you included but you clearly
> missed the dwarf2out_late_global_decl part doing
>
>           /* We get called via the symtab code invoking late_global_decl
>              for symbols that are optimized out.  Do not add locations
>              for those.  */
>           varpool_node *node = varpool_node::get (decl);
>           if (! node || ! node->definition)
>             tree_add_const_value_attribute_for_decl (die, decl);
>           else
>             add_location_or_const_value_attribute (die, decl, false);
>
> I wouldn't include the ipa.c part unless required (it adds additional
> debug for optimized out decls).

Ok, attached the part I bootstrapped successfully on amd64-*-freebsd12 
and aarch64-*-freebsd12. From the amd64 run you'll find some test 
results at the usual place. The aarch64 run takes some more time.

I hope I got it right this time :)
What do you think?

Thanks,
Andreas

Comments

Richard Biener Jan. 9, 2017, 10:53 a.m. UTC | #1
On Thu, 5 Jan 2017, Andreas Tobler wrote:

> On 05.01.17 13:05, Richard Biener wrote:
> > On Wed, 4 Jan 2017, Andreas Tobler wrote:
> > 
> > > On 04.01.17 10:21, Richard Biener wrote:
> > > > On Wed, 28 Dec 2016, Andreas Tobler wrote:
> > > > 
> > > > > On 28.12.16 19:24, Richard Biener wrote:
> > > > > > On December 27, 2016 11:17:00 PM GMT+01:00, Andreas Tobler
> > > > > > <andreast-list@fgznet.ch> wrote:
> > > > > > > On 16.09.16 13:30, Richard Biener wrote:
> > > > > > > > On Thu, 15 Sep 2016, Richard Biener wrote:
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > This addresses sth I needed to address with the early LTO
> > > > > > > > > debug
> > > > > > > patches
> > > > > > > > > (you might now figure I'm piecemail merging stuff from that
> > > > > > > > > patch).
> > > > > > > > > 
> > > > > > > > > When the cgraph code optimizes out a global we call the
> > > > > > > late_global_decl
> > > > > > > > > debug hook to eventually add a DW_AT_const_value to its DIE
> > > > > > > > > (we
> > > > > > > don't
> > > > > > > > > really expect a location as that will be invalid after
> > > > > > > > > optimizing
> > > > > > > out
> > > > > > > > > and will be pruned).
> > > > > > > > > 
> > > > > > > > > With the early LTO debug patches I have introduced a
> > > > > > > early_dwarf_finished
> > > > > > > > > flag (mainly for consistency checking) and I figured I can use
> > > > > > > > > that
> > > > > > > to
> > > > > > > > > detect the call to the late hook during the early phase and
> > > > > > > > > provide
> > > > > > > > > the following cleaned up variant of avoiding to create
> > > > > > > > > locations
> > > > > > > that
> > > > > > > > > require later pruning (which doesn't work with emitting the
> > > > > > > > > early
> > > > > > > DIEs).
> > > > > > > > > 
> > > > > > > > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > > > > > > > > 
> > > > > > > > > I verified it does the correct thing for a unit like
> > > > > > > > > 
> > > > > > > > > static const int i = 2;
> > > > > > > > > 
> > > > > > > > > (but ISTR we do have at least one testcase in the testsuite as
> > > > > > > well).
> > > > > > > > > 
> > > > > > > > > Will commit if testing finishes successfully.
> > > > > > > > 
> > > > > > > > Ok, so it showed issues when merging that back to
> > > > > > > > early-LTO-debug.
> > > > > > > > Turns out in LTO we never call early_finish and thus
> > > > > > > early_dwarf_finished
> > > > > > > > was never set.  Also dwarf2out_late_global_decl itself is a
> > > > > > > > better
> > > > > > > > place to constrain generating locations.
> > > > > > > > 
> > > > > > > > The following variant is in very late stage of testing.
> > > > > > > > 
> > > > > > > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > > > > > > > LTO bootstrap on x86_64-unknown-linux-gnu in stage3.  LTO
> > > > > > > > bootstrap
> > > > > > > > with early-LTO-debug in stage3, bootstraped with
> > > > > > > > early-LTO-debug,
> > > > > > > > testing in progress.
> > > > > > > 
> > > > > > > Any chance to backport this commit (r240228) to 6.x?
> > > > > > > It fixes a bootstrap comparison issue on aarch64-*-freebsd*.
> > > > > > > The aarch64-*-freebsd* port is not yet merged to 6.x and 5.4.x due
> > > > > > > to
> > > > > > > the bootstrap comparison failure I faced.
> > > > > > 
> > > > > > Did you analyze the bootstrap miscompare?  I suspect the patch
> > > > > > merely
> > > > > > papers
> > > > > > over the problem.
> > > > > 
> > > > > gcc/contrib/compare-debug -p prev-gcc/ipa-icf.o gcc/ipa-icf.o
> > > > > prev-gcc/ipa-icf.o.stripped. gcc/ipa-icf.o.stripped. differ: char
> > > > > 52841,
> > > > > line
> > > > > 253
> > > > > 
> > > > > 
> > > > > The objdump -dSx diff on the non stripped object looked always more or
> > > > > less
> > > > > the same, a rodata offset which was different.
> > > > > 
> > > > > -                       1448: R_AARCH64_ADD_ABS_LO12_NC .rodata+0x1d8
> > > > > +                       1448: R_AARCH64_ADD_ABS_LO12_NC .rodata+0x410
> > > > 
> > > > Hmm, sounds like a constant pool entry was created by -g at a different
> > > > time (and thus offset) from regular compilation.  So yes, the patch
> > > > in question should have the affect to "fix" this.
> > > > 
> > > > Note that I later changed the fix with
> > > > 
> > > > 2016-10-20  Richard Biener  <rguenther@suse.de>
> > > > 
> > > >         * cgraphunit.c (analyze_functions): Set node->definition to
> > > >         false to signal symbol removal to debug_hooks->late_global_decl.
> > > >         * ipa.c (symbol_table::remove_unreachable_nodes): When not in
> > > >         WPA signal symbol removal to the debuginfo machinery.
> > > >         * dwarf2out.c (dwarf2out_late_global_decl): Instead of
> > > >         using early_finised to guard the we're called for symbol
> > > >         removal case look at the symtabs definition flag.
> > > >         (gen_variable_die): Remove redundant check.
> > > > 
> > > > (the dwarf2out_late_global_decl and analyze_functions part are
> > > > relevant).
> > > > 
> > > > That should be the fix to backport (avoiding the early_dwarf_finished
> > > > part).
> > > 
> > > Ok, I bootstrapped with the attached snippet (had to tweak a bit to apply
> > > cleanly). w/o the previous mentioned fix (r240228). Is this what you had
> > > in
> > > mind or do you think some parts of the other fix (r240228) is also needed?
> > 
> > Not sure if you need the dwarf2out.c part you included but you clearly
> > missed the dwarf2out_late_global_decl part doing
> > 
> >           /* We get called via the symtab code invoking late_global_decl
> >              for symbols that are optimized out.  Do not add locations
> >              for those.  */
> >           varpool_node *node = varpool_node::get (decl);
> >           if (! node || ! node->definition)
> >             tree_add_const_value_attribute_for_decl (die, decl);
> >           else
> >             add_location_or_const_value_attribute (die, decl, false);
> > 
> > I wouldn't include the ipa.c part unless required (it adds additional
> > debug for optimized out decls).
> 
> Ok, attached the part I bootstrapped successfully on amd64-*-freebsd12 and
> aarch64-*-freebsd12. From the amd64 run you'll find some test results at the
> usual place. The aarch64 run takes some more time.
> 
> I hope I got it right this time :)
> What do you think?

Looks good to me with the added comment to dwarf2out_late_global_decl
exchanged to the one on trunk.

Thanks,
Richard.
Jakub Jelinek Jan. 9, 2017, 11:25 a.m. UTC | #2
On Mon, Jan 09, 2017 at 11:53:38AM +0100, Richard Biener wrote:
> > Ok, attached the part I bootstrapped successfully on amd64-*-freebsd12 and
> > aarch64-*-freebsd12. From the amd64 run you'll find some test results at the
> > usual place. The aarch64 run takes some more time.
> > 
> > I hope I got it right this time :)
> > What do you think?
> 
> Looks good to me with the added comment to dwarf2out_late_global_decl
> exchanged to the one on trunk.

The formatting is completely wrong.
Lines indented e.g. by 7 spaces (or tab + 1/3 space(s)),
/* comment inside of { block starting in the same column as {
(should be 2 columns to the right), && ! not aligned below VAR_P,
or indenting by 3 columns instead of 2.

	Jakub
diff mbox

Patch

Index: dwarf2out.c
===================================================================
--- dwarf2out.c	(revision 244100)
+++ dwarf2out.c	(working copy)
@@ -23752,7 +23752,17 @@ 
     {
       dw_die_ref die = lookup_decl_die (decl);
       if (die)
-	add_location_or_const_value_attribute (die, decl, false);
+       {
+       /* We get called during the early debug phase via the symtab
+	  code invoking late_global_decl for symbols that are optimized
+	  out.  When the early phase is not finished, do not add
+	  locations.  */
+	 varpool_node *node = varpool_node::get (decl);
+	 if (! node || ! node->definition)
+	   tree_add_const_value_attribute_for_decl (die, decl);
+	 else
+	   add_location_or_const_value_attribute (die, decl, false);
+       }
     }
 }
 
Index: cgraphunit.c
===================================================================
--- cgraphunit.c	(revision 244100)
+++ cgraphunit.c	(working copy)
@@ -1193,8 +1193,16 @@ 
 	     at looking at optimized away DECLs, since
 	     late_global_decl will subsequently be called from the
 	     contents of the now pruned symbol table.  */
-	  if (!decl_function_context (node->decl))
-	    (*debug_hooks->late_global_decl) (node->decl);
+	  if (VAR_P (node->decl)
+		&& !decl_function_context (node->decl))
+	     {
+		/* We are reclaiming totally unreachable code and variables
+		so they effectively appear as readonly.  Show that to
+		the debug machinery.  */
+		TREE_READONLY (node->decl) = 1;
+		node->definition = false;
+		(*debug_hooks->late_global_decl) (node->decl);
+	     }
 
 	  node->remove ();
 	  continue;