[debug-early] C++ clones and limbo DIEs
diff mbox

Message ID 54C296B5.4050506@redhat.com
State New
Headers show

Commit Message

Aldy Hernandez Jan. 23, 2015, 6:45 p.m. UTC
Phew... ok, I'm a little stuck here with the interaction between 
dwarf2out and LTO, and I'm hoping y'all can shed some light.  Please 
bear with me through the verbosity, it gets clearer (I hope) near the end.

On 01/16/2015 12:45 PM, Jason Merrill wrote:
> On 01/16/2015 12:50 PM, Aldy Hernandez wrote:
>>> Can you remove the first flush and just do it in the second place?
>>
>> If I only flush the limbo list in the second place, that's basically
>> what mainline does, albeit abstracted into a function.  I thought the
>> whole point was to get rid of the limbo list, or at least keep it from
>> being a structure that has to go through LTO streaming.
>
> It would expect it to be before free_lang_data and LTO streaming.

The reason this wouldn't make a difference is because, as it stands, 
dwarf for the clones are not generated until final.c:

   if (!DECL_IGNORED_P (current_function_decl))
     debug_hooks->function_decl (current_function_decl);

which happens after free_lang_data.

However, we can generate early dwarf for the clones right from the 
parser, and things _mostly_ work:

And I say _mostly_ work, because now local statics will have the 
DECL_ABSTRACT_P bit set, making LTO think that things live in another 
partition, and are no longer trivially needed.  For instance:

struct TYPE { TYPE (int);  } ;

TYPE::TYPE (int some_argument)
{
   static int static_p = 5;
}

With the above patch, early dwarf generation will call gen_decl_die, 
which eventually does:

       /* If we're emitting a clone, emit info for the abstract 
instance.  */
       if (origin || DECL_ORIGIN (decl) != decl)
	dwarf2out_abstract_function (origin
				     ? DECL_ORIGIN (origin)
				     : DECL_ABSTRACT_ORIGIN (decl));

Where decl and DECL_ABSTRACT_ORIGIN(decl) are:
(gdb) p decl
$87 = <function_decl 0x7ffff030f0d8 __base_ctor >
(gdb) p decl.decl_common.abstract_origin
$88 = <function_decl 0x7ffff02fbe58 TYPE>

Now dwarf2out_abstract_function() will set DECL_ABSTRACT_P for all the 
children of the abstract origin:

   was_abstract = DECL_ABSTRACT_P (decl);
   set_decl_abstract_flags (decl, 1);
   dwarf2out_decl (decl);
   if (! was_abstract)
     set_decl_abstract_flags (decl, 0);

Unfortunately, this sets DECL_ABSTRACT_P for the "static_p" above, and 
refuses to unset it after the call to dwarf2out_decl.

Through some miraculous gymnastics, LTO streams out symbols without the 
"analyzed" bit set if things are not "in partition", which happens 
because symtab_node::get_partitioning_class() returns SYMBOL_EXTERNAL 
(ahem, not in partition) when the DECL_ABSTRACT_P bit is set.

So... dwarf2out_abstract_function() sets the DECL_ABSTRACT_P bit, and 
LTO thinks that local statics in a clone are in another partition, which 
causes the analyzed bit to be dropped on the floor, which causes 
symbol_table::remove_unreferenced_decls() to no longer think that local 
statics are trivially needed.  And if the local static is no longer 
needed, varpool will not output its definition in the assembly file.

I am very uncomfortable with a call to dwarf2out_abstract_function() 
changing the behavior of the LTO streamer (and the optimizers), but it 
looks like twiddling the DECL_ABSTRACT_P flag (and refusing to reset it) 
is by design.

I am all ears on solutions on either the LTO or the dwarf side.

Thanks.
Aldy

Comments

Richard Biener Jan. 26, 2015, 8:53 a.m. UTC | #1
On Fri, Jan 23, 2015 at 7:45 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> Phew... ok, I'm a little stuck here with the interaction between dwarf2out
> and LTO, and I'm hoping y'all can shed some light.  Please bear with me
> through the verbosity, it gets clearer (I hope) near the end.
>
>
> On 01/16/2015 12:45 PM, Jason Merrill wrote:
>>
>> On 01/16/2015 12:50 PM, Aldy Hernandez wrote:
>>>>
>>>> Can you remove the first flush and just do it in the second place?
>>>
>>>
>>> If I only flush the limbo list in the second place, that's basically
>>> what mainline does, albeit abstracted into a function.  I thought the
>>> whole point was to get rid of the limbo list, or at least keep it from
>>> being a structure that has to go through LTO streaming.
>>
>>
>> It would expect it to be before free_lang_data and LTO streaming.
>
>
> The reason this wouldn't make a difference is because, as it stands, dwarf
> for the clones are not generated until final.c:
>
>   if (!DECL_IGNORED_P (current_function_decl))
>     debug_hooks->function_decl (current_function_decl);
>
> which happens after free_lang_data.
>
> However, we can generate early dwarf for the clones right from the parser,
> and things _mostly_ work:

I think that is the correct thing to do.

> diff --git a/gcc/cp/optimize.c b/gcc/cp/optimize.c
> index 62e32d2..5539244 100644
> --- a/gcc/cp/optimize.c
> +++ b/gcc/cp/optimize.c
> @@ -539,6 +539,11 @@ maybe_clone_body (tree fn)
>        /* Start processing the function.  */
>        start_preparsed_function (clone, NULL_TREE, SF_PRE_PARSED);
>
> +      /* Generate early dwarf for the clone now that we have a body
> +        for it.  */
> +       debug_hooks->early_global_decl (clone);
> +
>        /* Tell cgraph if both ctors or both dtors are known to have
>          the same body.  */
>        if (can_alias
>
> And I say _mostly_ work, because now local statics will have the
> DECL_ABSTRACT_P bit set, making LTO think that things live in another
> partition, and are no longer trivially needed.  For instance:
>
> struct TYPE { TYPE (int);  } ;
>
> TYPE::TYPE (int some_argument)
> {
>   static int static_p = 5;
> }
>
> With the above patch, early dwarf generation will call gen_decl_die, which
> eventually does:
>
>       /* If we're emitting a clone, emit info for the abstract instance.  */
>       if (origin || DECL_ORIGIN (decl) != decl)
>         dwarf2out_abstract_function (origin
>                                      ? DECL_ORIGIN (origin)
>                                      : DECL_ABSTRACT_ORIGIN (decl));
>
> Where decl and DECL_ABSTRACT_ORIGIN(decl) are:
> (gdb) p decl
> $87 = <function_decl 0x7ffff030f0d8 __base_ctor >
> (gdb) p decl.decl_common.abstract_origin
> $88 = <function_decl 0x7ffff02fbe58 TYPE>
>
> Now dwarf2out_abstract_function() will set DECL_ABSTRACT_P for all the
> children of the abstract origin:
>
>   was_abstract = DECL_ABSTRACT_P (decl);
>   set_decl_abstract_flags (decl, 1);
>   dwarf2out_decl (decl);
>   if (! was_abstract)
>     set_decl_abstract_flags (decl, 0);
>
> Unfortunately, this sets DECL_ABSTRACT_P for the "static_p" above, and
> refuses to unset it after the call to dwarf2out_decl.

I think the C++ FE does sth odd here by having an abstract origin that
is not abstract (If I understand you correctly here).  That is, the dwarf2out.c
code above should be able to do

  gcc_assert (was_abstract);

instead of setting abstract flags here.  Otherwise the abstract origin
wasn't and abstract origin.

Or I am completely missing something - which means LTO is wrong
to interpret DECL_ABSTRACT as it does:

/* Nonzero for a given ..._DECL node means that this node represents an
   "abstract instance" of the given declaration (e.g. in the original
   declaration of an inline function).  When generating symbolic debugging
   information, we mustn't try to generate any address information for nodes
   marked as "abstract instances" because we don't actually generate
   any code or allocate any data space for such instances.  */
#define DECL_ABSTRACT_P(NODE) \
  (DECL_COMMON_CHECK (NODE)->decl_common.abstract_flag)

The above suggests LTO is correct in not outputting the decl.

So it is either the C++ FE that is wrong in declaring this an abstract origin
or dwarf2out.c is wrong in simply making all abstract origins (and its chilren)
abstract because being an abstract origin doesn't require being abstract.

Jason?

> Through some miraculous gymnastics, LTO streams out symbols without the
> "analyzed" bit set if things are not "in partition", which happens because
> symtab_node::get_partitioning_class() returns SYMBOL_EXTERNAL (ahem, not in
> partition) when the DECL_ABSTRACT_P bit is set.
>
> So... dwarf2out_abstract_function() sets the DECL_ABSTRACT_P bit, and LTO
> thinks that local statics in a clone are in another partition, which causes
> the analyzed bit to be dropped on the floor, which causes
> symbol_table::remove_unreferenced_decls() to no longer think that local
> statics are trivially needed.  And if the local static is no longer needed,
> varpool will not output its definition in the assembly file.
>
> I am very uncomfortable with a call to dwarf2out_abstract_function()
> changing the behavior of the LTO streamer (and the optimizers), but it looks
> like twiddling the DECL_ABSTRACT_P flag (and refusing to reset it) is by
> design.
>
> I am all ears on solutions on either the LTO or the dwarf side.
>
> Thanks.
> Aldy
Jason Merrill Jan. 27, 2015, 8:51 p.m. UTC | #2
On 01/23/2015 01:45 PM, Aldy Hernandez wrote:
>> It would expect [the flush] to be before free_lang_data and LTO streaming.
>
> The reason this wouldn't make a difference is because, as it stands,
> dwarf for the clones are not generated until final.c:
>
>    if (!DECL_IGNORED_P (current_function_decl))
>      debug_hooks->function_decl (current_function_decl);
>
> which happens after free_lang_data.

I agree that the current code doesn't have this effect, but we're 
talking about changing things, right? :)

> Unfortunately, this sets DECL_ABSTRACT_P for the "static_p" above, and
> refuses to unset it after the call to dwarf2out_decl.

Well, that sounds like a bug.  Why isn't it being unset?  Is it because 
DECL_ABSTRACT_P was already set for the function, so we don't call 
set_decl_abstract_flags (decl, 0)?  Perhaps a solution to that would be 
to avoid calling set_decl_abstract_flags (decl, 1) if the function is 
already marked as abstract.  Or to teach set_decl_abstract_flags not to 
mess with static local variables.

Jason
Aldy Hernandez Jan. 28, 2015, 6:28 p.m. UTC | #3
On 01/27/2015 12:51 PM, Jason Merrill wrote:
> On 01/23/2015 01:45 PM, Aldy Hernandez wrote:
>>> It would expect [the flush] to be before free_lang_data and LTO
>>> streaming.
>>
>> The reason this wouldn't make a difference is because, as it stands,
>> dwarf for the clones are not generated until final.c:
>>
>>    if (!DECL_IGNORED_P (current_function_decl))
>>      debug_hooks->function_decl (current_function_decl);
>>
>> which happens after free_lang_data.
>
> I agree that the current code doesn't have this effect, but we're
> talking about changing things, right? :)

Oh, my bad.  I completely misread your original comment.  Yes, yes... we 
can move the flushing of the limbo list to free_lang_data.  See attached 
patch.

>
>> Unfortunately, this sets DECL_ABSTRACT_P for the "static_p" above, and
>> refuses to unset it after the call to dwarf2out_decl.
>
> Well, that sounds like a bug.  Why isn't it being unset?  Is it because
> DECL_ABSTRACT_P was already set for the function, so we don't call
> set_decl_abstract_flags (decl, 0)?  Perhaps a solution to that would be
> to avoid calling set_decl_abstract_flags (decl, 1) if the function is
> already marked as abstract.  Or to teach set_decl_abstract_flags not to
> mess with static local variables.

Its' being unset because DECL_ABSTRACT_P was already set.  I propose we 
avoid calling set_decl_abstract_flags() for this scenario, as you 
suggest, and if this causes any problems, look into your second approach.

The attached patch gets rid of our limbo dependency past LTO streaming 
by flushing it out early, and adding some sanity checks throughout.

Cilk required some special handholding.  It seems to be the only culprit 
of late FUNCTION_DECL creation.

Now that we call dwarf2out_decl() for clones early on, I had to adjust 
gen_subprogram_die() to make sure we're not duplicating work and 
creating a new DIE if we already have on with DW_AT_abstract_origin.

I also kept lookup_filename() uses from ICEing now that we hit this code 
path.  It shouldn't affect other code paths (since it was ICEing before 
;-)).

With the attached patch we take care of C++ clones early, and get rid of 
our dependence on the limbo DIE list into LTO land.  No guality regressions.

Are we all in agreement with this approach?

Thanks for everyone's help BTW.
Aldy

Patch
diff mbox

diff --git a/gcc/cp/optimize.c b/gcc/cp/optimize.c
index 62e32d2..5539244 100644
--- a/gcc/cp/optimize.c
+++ b/gcc/cp/optimize.c
@@ -539,6 +539,11 @@  maybe_clone_body (tree fn)
        /* Start processing the function.  */
        start_preparsed_function (clone, NULL_TREE, SF_PRE_PARSED);

+      /* Generate early dwarf for the clone now that we have a body
+	 for it.  */
+	debug_hooks->early_global_decl (clone);
+
        /* Tell cgraph if both ctors or both dtors are known to have
  	 the same body.  */
        if (can_alias