[10/10] debug-early merge: compiler proper
diff mbox

Message ID CAFiYyc16u2na9VeyMb5cVf4cH=caiBy6PRTJkX9DndzHcFxkBQ@mail.gmail.com
State New
Headers show

Commit Message

Richard Biener May 18, 2015, 10:56 a.m. UTC
On Fri, May 8, 2015 at 2:40 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
> As seen on TV.

+/* FIRST_TIME is set to TRUE for the first time we are called for a
+   translation unit from finalize_compilation_unit() or false
+   otherwise.  */
+
 static void
-analyze_functions (void)
+analyze_functions (bool first_time)
 {
...
+  if (first_time)
+    {
+      symtab_node *snode;
+      FOR_EACH_SYMBOL (snode)
+       check_global_declaration (snode->decl);
+    }
+

I think it is better to split analyze_functions (why does it have it's own copy
of unreachable node removal?) into analysis and unused-symbol removal
and have the
check_global_declaration call be in finalize_compilation_unit directly.  Honza?

@@ -1113,6 +1131,19 @@ analyze_functions (void)
        {
          if (symtab->dump_file)
            fprintf (symtab->dump_file, " %s", node->name ());
+
+         /* See if the debugger can use anything before the DECL
+            passes away.  Perhaps it can notice a DECL that is now a
+            constant and can tag the early DIE with an appropriate
+            attribute.
+
+            Otherwise, this is the last chance the debug_hooks have
+            at looking at optimized away DECLs, since
+            late_global_decl will subsequently be called from the
+            contents of the now pruned symbol table.  */
+         if (!decl_function_context (node->decl))
+           (*debug_hooks->late_global_decl) (node->decl);
+
          node->remove ();

so this applies to VAR_DECLs only - shouldn't this be in the
varpool_node::remove function then?  You can even register/deregister
a hook for this in finalize_compilation_unit.  That would IMHO be better.

All debug stuff apart from dwarf2out.c changes (I assume Jason reviews
them) are ok.


Ick - do we need this?  dwarf2out.c has a hashtable to map blocks to
DIEs (which you don't remove in turn).

Jason - did you intentionally not yet "approve" the dwarf2out.c
changes (like, are you expecting somebody else
to do a review)?

Thanks,
Richard.

Comments

Jan Hubicka May 18, 2015, 2:46 p.m. UTC | #1
> On Fri, May 8, 2015 at 2:40 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
> > As seen on TV.
> 
> +/* FIRST_TIME is set to TRUE for the first time we are called for a
> +   translation unit from finalize_compilation_unit() or false
> +   otherwise.  */
> +
>  static void
> -analyze_functions (void)
> +analyze_functions (bool first_time)
>  {
> ...
> +  if (first_time)
> +    {
> +      symtab_node *snode;
> +      FOR_EACH_SYMBOL (snode)
> +       check_global_declaration (snode->decl);
> +    }
> +
> 
> I think it is better to split analyze_functions (why does it have it's own copy
> of unreachable node removal?) into analysis and unused-symbol removal
> and have the
> check_global_declaration call be in finalize_compilation_unit directly.  Honza?

It is trying to avoid analyzing functions that are never used. Analyzing
include gimplification and other stuff that is not for free.
remove_unreachable_nodes works only on fully built cgraph.

Main reason why analyze_functions is intended to be called multiple times was
--combine support.  After each end of C source file you called
symbol_table::finalize_compilation_unit which did unreachable code removal and
saved some memory.

I am not sure symbol table is useful for debug info this way: we insert only
function definitions into symbol table and external declarations are inserted
lazilly.  That is why we still have the (convoluted) check_global_declarations
that walks the vectors provided by frontends.
> 
> @@ -1113,6 +1131,19 @@ analyze_functions (void)
>         {
>           if (symtab->dump_file)
>             fprintf (symtab->dump_file, " %s", node->name ());
> +
> +         /* See if the debugger can use anything before the DECL
> +            passes away.  Perhaps it can notice a DECL that is now a
> +            constant and can tag the early DIE with an appropriate
> +            attribute.
> +
> +            Otherwise, this is the last chance the debug_hooks have
> +            at looking at optimized away DECLs, since
> +            late_global_decl will subsequently be called from the
> +            contents of the now pruned symbol table.  */
> +         if (!decl_function_context (node->decl))
> +           (*debug_hooks->late_global_decl) (node->decl);
> +
>           node->remove ();
> 
> so this applies to VAR_DECLs only - shouldn't this be in the
> varpool_node::remove function then?  You can even register/deregister
> a hook for this in finalize_compilation_unit.  That would IMHO be better.

There is varpool_node::remove_initializer that is intended to throw away the
constructor.  I suppose late_global_decl can be called from there?
> @@ -1334,6 +1334,9 @@ struct GTY(()) tree_block {
>    tree abstract_origin;
>    tree fragment_origin;
>    tree fragment_chain;
> +
> +  /* Pointer to the DWARF lexical block.  */
> +  struct die_struct *die;
>  };
> 
>  struct GTY(()) tree_type_common {
> 
> Ick - do we need this?  dwarf2out.c has a hashtable to map blocks to
> DIEs (which you don't remove in turn).

Note that types also have tree_type_symtab that in turn contains die pointer.

Honza
> 
> Jason - did you intentionally not yet "approve" the dwarf2out.c
> changes (like, are you expecting somebody else
> to do a review)?
> 
> Thanks,
> Richard.
Richard Biener May 22, 2015, 11:23 a.m. UTC | #2
On Wed, May 20, 2015 at 5:50 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> On 05/18/2015 06:56 AM, Richard Biener wrote:
>
> BTW, thanks for the review.
>
>> On Fri, May 8, 2015 at 2:40 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>>
>>> As seen on TV.
>>
>>
>> +/* FIRST_TIME is set to TRUE for the first time we are called for a
>> +   translation unit from finalize_compilation_unit() or false
>> +   otherwise.  */
>> +
>>   static void
>> -analyze_functions (void)
>> +analyze_functions (bool first_time)
>>   {
>> ...
>> +  if (first_time)
>> +    {
>> +      symtab_node *snode;
>> +      FOR_EACH_SYMBOL (snode)
>> +       check_global_declaration (snode->decl);
>> +    }
>> +
>>
>> I think it is better to split analyze_functions (why does it have it's own
>> copy
>> of unreachable node removal?) into analysis and unused-symbol removal
>> and have the
>> check_global_declaration call be in finalize_compilation_unit directly.
>> Honza?
>
>
> Leaving things as is, as per Honza's comment ??
>
>>
>> @@ -1113,6 +1131,19 @@ analyze_functions (void)
>>          {
>>            if (symtab->dump_file)
>>              fprintf (symtab->dump_file, " %s", node->name ());
>> +
>> +         /* See if the debugger can use anything before the DECL
>> +            passes away.  Perhaps it can notice a DECL that is now a
>> +            constant and can tag the early DIE with an appropriate
>> +            attribute.
>> +
>> +            Otherwise, this is the last chance the debug_hooks have
>> +            at looking at optimized away DECLs, since
>> +            late_global_decl will subsequently be called from the
>> +            contents of the now pruned symbol table.  */
>> +         if (!decl_function_context (node->decl))
>> +           (*debug_hooks->late_global_decl) (node->decl);
>> +
>>            node->remove ();
>>
>> so this applies to VAR_DECLs only - shouldn't this be in the
>> varpool_node::remove function then?  You can even register/deregister
>> a hook for this in finalize_compilation_unit.  That would IMHO be better.
>
>
> The problem is that varpool_node::remove() may be called before we
> have finished parsing the DECL, thus before we call
> early_global_decl() on it.  So we would essentially be calling
> late_global_decl() on a DECL for which we haven't called
> early_global_decl().
>
> To complicate matters, we may call ::remove() before we finish parsing
> a decl.  In the C front-end, for instance, we call ::remove() from
> duplicate_decls(), before we even call rest_of_decl_compilation (where
> we call early_global_decl).
>
> Calling late_global_decl so early, before we have even finished
> parsing, seems wrong and obviously causes problems.  One example is
> dwarf2out can put the DECL into the deferred_asm_names list, only to
> have duplicate_decls() gcc_free it from under us.
>
>>
>> All debug stuff apart from dwarf2out.c changes (I assume Jason reviews
>> them) are ok.
>>
>> diff --git a/gcc/gengtype.c b/gcc/gengtype.c
>> index 02012d5..15b6dd2 100644
>> --- a/gcc/gengtype.c
>> +++ b/gcc/gengtype.c
>> @@ -4718,7 +4718,8 @@ write_roots (pair_p variables, bool emit_pch)
>>      this funcion will have to be adjusted to be more like
>>      output_mangled_typename.  */
>>
>> -static void
>> +/* ?? Why are we keeping this?  Is this actually used anywhere?  */
>> +static void ATTRIBUTE_UNUSED
>>   output_typename (outf_p of, const_type_p t)
>>   {
>>     switch (t->kind)
>>
>> Just remove the function.
>
>
> Done.
>
>>
>> The langhooks changes are ok.
>>
>> diff --git a/gcc/passes.c b/gcc/passes.c
>> index 04ff042..4dee8ad 100644
>> --- a/gcc/passes.c
>> +++ b/gcc/passes.c
>> @@ -293,6 +293,28 @@ rest_of_decl_compilation (tree decl,
>>     else if (TREE_CODE (decl) == VAR_DECL && !DECL_EXTERNAL (decl)
>>             && TREE_STATIC (decl))
>>       varpool_node::get_create (decl);
>> +
>> +  /* Generate early debug for global variables.  Any local variables will
>> +     be handled by either handling reachable functions from
>> +     finalize_compilation_unit (and by consequence, locally scoped
>> +     symbols), or by rest_of_type_compilation below.
>> +
>> +     Also, pick up function prototypes, which will be mostly ignored
>> +     by the different early_global_decl() hooks, but will at least be
>> +     used by Go's hijack of the debug_hooks to implement
>> +     -fdump-go-spec.  */
>> +  if (!flag_wpa
>> +      && !in_lto_p
>>
>> Just check !in_lto_p, !flag_wpa is redundant.
>
>
> Done.
>
>>
>> +      && !decl_function_context (decl)
>> +      && !current_function_decl
>>
>> Why that?  !decl_function_context should catch relevant cases?
>
>
> You'd think, huh?  The issue here are extern declarations appearing
> inside of a function.  For this case, decl_function_context is NULL,
> because the actual context is toplevel, but current_function_decl is
> set to the function where the extern declaration appears.
>
> For example:
>
> namespace S
> {
>   int
>   f()
>   {
>     {
>       int i = 42;
>       {
>         extern int i; // Local extern declaration.
>         return i;
>       }
>     }
>   }
> }
>
> I have added a big fat comment in the code now, since it's clearly not
> obvious why we need current_function_decl.
>
>>
>> +      && !decl_type_context (decl))
>> +    (*debug_hooks->early_global_decl) (decl);
>>
>> I'll note that nested functions and class methods are not getting
>> early_global_decl()ed here.  I suppose their containing function/type
>> is supposed to generate early dwarf by means of dwarf2out walking
>> over children.
>
>
> Indeed, and I had properly commented this:
>
>   /* Emit early debug for reachable functions, and by consequence,
>      locally scoped symbols.  */
>   struct cgraph_node *cnode;
>   FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (cnode)
>     (*debug_hooks->early_global_decl) (cnode->decl);
>
> I did, however, remove the !decl_function_context() restriction in this
> snippet from the previous iteration of this patch because of the nested
> function problem you mention.
>
> Doubly nested functions were fine because dwarf2out will walk the inner
> function, by virtue of it being a symbol local to the top level function,
> however it was missing triply nested functions because the 2nd nested
> functions was considered a declaration so its children (> 2 nested
> functions) were not walked.  I ran into this with Ada, and removed the
> !decl_function_context() while generating early dwarf. IIRC, the restriction
> was originally there to limit extra DIEs that may be generated, but since
> I'm already on the hook for cleaning up DIEs shortly after the merge, we're
> going to have deal with this anyhow.
>
> I also added a test for triply nested functions for C+debug-early, since Ada
> seemed to be the only front-end stressing >= 3 nested functions.  I will
> repost the test when I repost the testsuite changes.
>
>>
>> timevar changes are ok.
>>
>> the toplev changes are ok.
>>
>> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
>> index ad1bb23..2a9f417 100644
>> --- a/gcc/tree-core.h
>> +++ b/gcc/tree-core.h
>> @@ -1334,6 +1334,9 @@ struct GTY(()) tree_block {
>>     tree abstract_origin;
>>     tree fragment_origin;
>>     tree fragment_chain;
>> +
>> +  /* Pointer to the DWARF lexical block.  */
>> +  struct die_struct *die;
>>   };
>>
>>   struct GTY(()) tree_type_common {
>>
>> Ick - do we need this?  dwarf2out.c has a hashtable to map blocks to
>> DIEs (which you don't remove in turn).
>
>
> We need a way to reference the early created DIE from late debugging, and we
> can't use block_map because it gets cloberred across functions. It's
> currently being released in late debug (dwarf2out_function_decl),
> that's why you see it not set to NULL in dwarf2out_c_finalize.
>
> Also, it uses BLOCK_NUMBERs, which according to the documentation in
> tree.h, are not guaranteed to be unique across functions.
>
> As Honza mentioned, we're already using a DIE map in types through
> TYPE_SYMTAB_DIE.  See lookup_type_die() in dwarf2out.c.
>
> Could we leave this as is?

But why then not eliminate block_map in favor of using the new ->die member?
Having both looks very odd to me.

Can you cook up a patch for trunk adding that field to tree_block and removing
the block_map map in favor of sth like what we do for
lookup_type_die/equate_type_number_to_die
and TYPE_SYMTAB_DIE?

> How does this version, which has been committed to the debug-early branch,
> look?

No further look right now, but I assume it's fine apart from the
block_map issue (and dwarf2out.c of course).

Richard.

> Aldy
Jason Merrill May 27, 2015, 12:39 p.m. UTC | #3
On 05/20/2015 11:50 AM, Aldy Hernandez wrote:
> +     determine anscestry later.  */

ancestry

> +static bool early_dwarf_dumping;

Sorry for the late bikeshedding, but "dumping" suddently strikes me as 
odd, since there is no output as with other dumping in the compiler. 
Can we change that to "generation" or "building"?

> +	      /* Reuse DIE even with a differing context.
> +
> +		 This happens when called through
> +		 dwarf2out_abstract_function for formal parameter
> +		 packs.  */
> +	      gcc_assert (parm_die->die_parent->die_tag
> +			  == DW_TAG_GNU_formal_parameter_pack);

Does this mean we're generating a new DW_TAG_GNU_formal_parameter_pack 
in late debug even though we already generated one in early debug?  If 
so, why?

> -  /* It is possible to have both DECL_ABSTRACT_P and DECLARATION be true if we
> -     started to generate the abstract instance of an inline, decided to output
> -     its containing class, and proceeded to emit the declaration of the inline
> -     from the member list for the class.  If so, DECLARATION takes priority;
> -     we'll get back to the abstract instance when done with the class.  */
> -
> -  /* The class-scope declaration DIE must be the primary DIE.  */
> -  if (origin && declaration && class_or_namespace_scope_p (context_die))
> -    {
> -      origin = NULL;
> -      gcc_assert (!old_die);
> -    }

Can't this happen anymore?

> +      if ((is_cu_die (old_die->die_parent)
> +	   /* FIXME: Jason doesn't like this condition, but it fixes
> +	      the inconsistency/ICE with the following Fortran test:
> +
> +		 module some_m
> +		 contains
> +		    logical function funky (FLAG)
> +		      funky = .true.
> +		   end function
> +		 end module
> +
> +	      Another alternative is !is_cu_die (context_die).
> +	   */
> +	   || old_die->die_parent->die_tag == DW_TAG_module

I like it now.  :)
You can leave the rest of the comment.

> +  /* For non DECL_EXTERNALs, if range information is available, fill
> +     the DIE with it.  */
>    else if (!DECL_EXTERNAL (decl))
>      {
>        HOST_WIDE_INT cfa_fb_offset;
> +
>        struct function *fun = DECL_STRUCT_FUNCTION (decl);
>
> -      if (!old_die || !get_AT (old_die, DW_AT_inline))
> -	equate_decl_number_to_die (decl, subr_die);
> +      /* If we have no fun->fde, we have no range information.
> +	 Skip over and fill in range information in the second
> +	 dwarf pass.  */
> +      if (!fun->fde)
> +	goto no_fde_continue;

How about controlling this block with !early_dwarf so you don't need to 
deal with missing FDE?

>  	  if (generic_decl_parm
>  	      && lang_hooks.function_parameter_pack_p (generic_decl_parm))
> -	    gen_formal_parameter_pack_die (generic_decl_parm,
> -					   parm, subr_die,
> -					   &parm);
> +	    {
> +	      if (early_dwarf_dumping)
> +		gen_formal_parameter_pack_die (generic_decl_parm,
> +					       parm, subr_die,
> +					       &parm);
> +	      else if (parm)
> +		parm = DECL_CHAIN (parm);
> +	    }

Let's try only setting generic_decl when early_dwarf.

> +  /* Unless we have an existing non-declaration DIE, equate the new
> +     DIE.  */
> +  if (!old_die || is_declaration_die (old_die))
> +    equate_decl_number_to_die (decl, subr_die);
...
> +  if (decl && (DECL_ABSTRACT_P (decl) || declaration || old_die == NULL
> +	       /* If we make it to a specialization, we have already
> +		  handled the declaration by virtue of early dwarf.
> +		  If so, make a new assocation if available, so late
> +		  dwarf can find it.  */
> +	       || (specialization_p && early_dwarf_dumping)))
>      equate_decl_number_to_die (decl, var_die);

Why are the conditions so different?  Can we use the function condition 
for variables, too?

> +	  /* Do nothing.  This must have been early dumped and it
> +	     won't even need location information since it's a
> +	     DW_AT_inline function.  */
> +	  for (dw_die_ref c = context_die; c; c = c->die_parent)
> +	    if (c->die_tag == DW_TAG_inlined_subroutine
> +		|| c->die_tag == DW_TAG_subprogram)
> +	      {
> +		gcc_assert (get_AT (c, DW_AT_inline));
> +		break;
> +	      }

Maybe wrap this in #ifdef ENABLE_CHECKING.

> +	  /* Do the new DIE dance.  */
> +	  stmt_die = new_die (DW_TAG_lexical_block, context_die, stmt);
> +	  BLOCK_DIE (stmt) = stmt_die;
> +	}
> +    }
> +  else if (BLOCK_ABSTRACT_ORIGIN (stmt))
> +    {
> +      /* If this is an inlined instance, create a new lexical die for
> +	 anything below to attach DW_AT_abstract_origin to.  */
> +      stmt_die = new_die (DW_TAG_lexical_block, context_die, stmt);
> +    }
> +  else
> +    {
> +      if (!stmt_die)
> +	{
> +	  /* This is the first time we are creating something for this
> +	     block.  */
> +	  stmt_die = new_die (DW_TAG_lexical_block, context_die, stmt);
> +	  BLOCK_DIE (stmt) = stmt_die;
> +	}

Surely we don't need to repeat the new_die call three times; the first 
and last are both controlled by !stmt_die.  And don't we want to set 
BLOCK_DIE for the inlined case as well, so that we can find the DIE 
again in late debug?

> +  /* Fill in the size of variable-length fields in late dwarf.  */
> +  if (TREE_ASM_WRITTEN (type)
> +      && !early_dwarf_dumping)
> +    {
> +      tree member;
> +      for (member = TYPE_FIELDS (type); member; member = DECL_CHAIN (member))
> +	fill_variable_array_bounds (TREE_TYPE (member));
> +      return;
> +    }

Why is this happening in late dwarf?  I'm concerned that front-end 
information that is necessary to do this might be lost by that point.

> +      /* Variable-length types may be incomplete even if
> +	 TREE_ASM_WRITTEN.  For such types, fall through to
> +	 gen_array_type_die() and possibly fill in
> +	 DW_AT_{upper,lower}_bound attributes.  */
> +      if ((TREE_CODE (type) != ARRAY_TYPE
> +	   && TREE_CODE (type) != RECORD_TYPE
> +	   && TREE_CODE (type) != UNION_TYPE
> +	   && TREE_CODE (type) != QUAL_UNION_TYPE)
> +	  || (TYPE_SIZE (type)
> +	      && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST))

Similarly, why check for INTEGER_CST here?

> +      bool t = early_dwarf_dumping;
> +      early_dwarf_dumping = true;
> +      dwarf2out_decl (decl);
> +      early_dwarf_dumping = t;

Let's use a RAII (resource acquisition is initialization) pattern for 
this and dwarf2out_imported_module_or_decl and dwarf2out_early_global_decl:

struct set_early_dwarf {
  bool saved;
  set_early_dwarf(): saved(early_dwarf) { early_dwarf = true; }
  ~set_early_dwarf() { early_dwarf = saved; }
};

set_early_dwarf s;
dwarf2out_decl (decl);

> +      /* When generating LTO bytecode we can not generate new assembler
> +         names at this point and all important decls got theirs via
> +	 free-lang-data.  */
> +      if (((!flag_generate_lto && !flag_generate_offload)
> +	   || DECL_ASSEMBLER_NAME_SET_P (decl))
> +	  && DECL_ASSEMBLER_NAME (decl) != DECL_NAME (decl)

Doesn't early_finish happen before free_lang_data, so we should be fine?

Jason
Jason Merrill May 29, 2015, 3:51 a.m. UTC | #4
Looks good.

Jason
Richard Biener May 29, 2015, 11:04 a.m. UTC | #5
On Thu, May 28, 2015 at 10:35 PM, Jason Merrill <jason@redhat.com> wrote:
> On 05/28/2015 02:53 PM, Aldy Hernandez wrote:
>>
>> On 05/27/2015 08:39 AM, Jason Merrill wrote:
>>>
>>> On 05/20/2015 11:50 AM, Aldy Hernandez wrote:
>>
>>
>>>> +  /* Fill in the size of variable-length fields in late dwarf.  */
>>>> +  if (TREE_ASM_WRITTEN (type)
>>>> +      && !early_dwarf_dumping)
>>>> +    {
>>>> +      tree member;
>>>> +      for (member = TYPE_FIELDS (type); member; member = DECL_CHAIN
>>>> (member))
>>>> +    fill_variable_array_bounds (TREE_TYPE (member));
>>>> +      return;
>>>> +    }
>>>
>>>
>>> Why is this happening in late dwarf?  I'm concerned that front-end
>>> information that is necessary to do this might be lost by that point.
>>
>>
>> I thought only after the optimizations had run their course would we be
>> guaranteed to have accurate bound information.  At least, that's what my
>> experience showed.
>
>
> Hmm, I'm don't know why optimizations would change the representation of the
> array type.

Correct.  It works fine with the LTO prototype I did apart from gdb not really
getting the association between the early DIEs

 <0><166>: Abbrev Number: 1 (DW_TAG_compile_unit)
...
 <1><1e5>: Abbrev Number: 6 (DW_TAG_subprogram)
...
 <2><1fc>: Abbrev Number: 8 (DW_TAG_variable)
    <1fd>   DW_AT_type        : <0x1d7>
    <201>   DW_AT_artificial  : 1
    <201>   DW_AT_declaration : 1
 <2><201>: Abbrev Number: 3 (DW_TAG_variable)
    <202>   DW_AT_name        : a
    <204>   DW_AT_decl_file   : 1
    <205>   DW_AT_decl_line   : 15
    <206>   DW_AT_type        : <0x20b>
 <2><20a>: Abbrev Number: 0
 <1><20b>: Abbrev Number: 9 (DW_TAG_array_type)
    <20c>   DW_AT_type        : <0x21e>
    <210>   DW_AT_sibling     : <0x21e>
 <2><214>: Abbrev Number: 10 (DW_TAG_subrange_type)
    <215>   DW_AT_type        : <0x1d7>
    <219>   DW_AT_upper_bound : <0x1fc>

and the late DIEs annotating the above

 <0><24c>: Abbrev Number: 1 (DW_TAG_compile_unit)
...
 <1><30c>: Abbrev Number: 3 (DW_TAG_subprogram)
    <30d>   DW_AT_abstract_origin: <0x1e5>
...
 <2><32e>: Abbrev Number: 4 (DW_TAG_variable)
    <32f>   DW_AT_abstract_origin: <0x1fc>
    <333>   DW_AT_location    : 11 byte block: 75 1 8 20 24 8 20 26 31
1c 9f    (DW_OP_breg5 (rdi): 1; DW_OP_const1u: 32; DW_OP_shl;
DW_OP_const1u: 32; DW_OP_shra; DW_OP_lit1; DW_OP_minus;
DW_OP_stack_value)
 <2><33f>: Abbrev Number: 11 (DW_TAG_variable)
    <340>   DW_AT_abstract_origin: <0x201>
<2><344>: Abbrev Number: 0
 <1><345>: Abbrev Number: 12 (DW_TAG_subprogram)

so it fails to jump through hoops when looking up the type of 'a'
seeing the annotation
to the variable used in the array types upper bound... (probably
expected, I'd probably
need to pull down the type, and the subrange-type as well, even though I do not
annotate them).

Or we can teach gdb to jump through the hoops (basically collect all
additional info
that a subprogram DIE children provide to the subprogram abstract
origin children DIEs
somewhen in advance...)

It would be a shame if I really had to create "empty" references to
all the dependence
chain formed via types from the array variable to the array subrange
type upper bound.

Richard.

>>>> +      /* Variable-length types may be incomplete even if
>>>> +     TREE_ASM_WRITTEN.  For such types, fall through to
>>>> +     gen_array_type_die() and possibly fill in
>>>> +     DW_AT_{upper,lower}_bound attributes.  */
>>>> +      if ((TREE_CODE (type) != ARRAY_TYPE
>>>> +       && TREE_CODE (type) != RECORD_TYPE
>>>> +       && TREE_CODE (type) != UNION_TYPE
>>>> +       && TREE_CODE (type) != QUAL_UNION_TYPE)
>>>> +      || (TYPE_SIZE (type)
>>>> +          && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST))
>>>
>>>
>>> Similarly, why check for INTEGER_CST here?
>>
>>
>> The INTEGER_CST check was supposed to mean "we have bound information
>> already, no need to look further".
>>
>> I guess we could have a variable length bound that does not decay to a
>> constant.
>
>
> Right.  I would expect that to usually be the case with VLAs.
>
>> Perhaps I could check the presence of a cached DIE with a
>> type DIE containing a DW_TAG_subrange_type *and*
>> DW_AT_{lower,upper}_bound ??.  Basically I just want to add bound
>> information, if available and not already present.
>>
>> Suggestions?
>
>
> I'm still not sure why we can't just emit bound info in early dwarf. Can you
> be more specific about the optimization thing?
>
> Jason
>
Aldy Hernandez May 29, 2015, 6:42 p.m. UTC | #6
On 05/28/2015 04:35 PM, Jason Merrill wrote:
> On 05/28/2015 02:53 PM, Aldy Hernandez wrote:
>> On 05/27/2015 08:39 AM, Jason Merrill wrote:
>>> On 05/20/2015 11:50 AM, Aldy Hernandez wrote:
>>
>>>> +  /* Fill in the size of variable-length fields in late dwarf.  */
>>>> +  if (TREE_ASM_WRITTEN (type)
>>>> +      && !early_dwarf_dumping)
>>>> +    {
>>>> +      tree member;
>>>> +      for (member = TYPE_FIELDS (type); member; member = DECL_CHAIN
>>>> (member))
>>>> +    fill_variable_array_bounds (TREE_TYPE (member));
>>>> +      return;
>>>> +    }
>>>
>>> Why is this happening in late dwarf?  I'm concerned that front-end
>>> information that is necessary to do this might be lost by that point.
>>
>> I thought only after the optimizations had run their course would we be
>> guaranteed to have accurate bound information.  At least, that's what my
>> experience showed.
>
> Hmm, I'm don't know why optimizations would change the representation of
> the array type.
>
>>>> +      /* Variable-length types may be incomplete even if
>>>> +     TREE_ASM_WRITTEN.  For such types, fall through to
>>>> +     gen_array_type_die() and possibly fill in
>>>> +     DW_AT_{upper,lower}_bound attributes.  */
>>>> +      if ((TREE_CODE (type) != ARRAY_TYPE
>>>> +       && TREE_CODE (type) != RECORD_TYPE
>>>> +       && TREE_CODE (type) != UNION_TYPE
>>>> +       && TREE_CODE (type) != QUAL_UNION_TYPE)
>>>> +      || (TYPE_SIZE (type)
>>>> +          && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST))
>>>
>>> Similarly, why check for INTEGER_CST here?
>>
>> The INTEGER_CST check was supposed to mean "we have bound information
>> already, no need to look further".
>>
>> I guess we could have a variable length bound that does not decay to a
>> constant.
>
> Right.  I would expect that to usually be the case with VLAs.
>
>> Perhaps I could check the presence of a cached DIE with a
>> type DIE containing a DW_TAG_subrange_type *and*
>> DW_AT_{lower,upper}_bound ??.  Basically I just want to add bound
>> information, if available and not already present.
>>
>> Suggestions?
>
> I'm still not sure why we can't just emit bound info in early dwarf. Can
> you be more specific about the optimization thing?

Ok, I see what I was trying to do, albeit incorrectly.  Imagine this:

unsigned int i=555;

int main()
{
   unsigned int array[i];
   ...
}

For the VLA, I'd like to check if we have an array type with a missing 
DW_AT_{upper,lower}_bound late in the game, and fill it in.

During early dwarf we only have an uninitialized gimple register 
representing the bound, and loc_list_from_tree() cannot find the RTL 
with the final bound location.  Thus, we end up with a missing bound, 
which I propose to fill in late dwarf.

Obviously I was doing some nonsense with TYPE_SIZE != INTEGER_CST, when 
in reality I should probably check that TREE_CODE (type) == ARRAY_TYPE 
and that we are missing the bound late (by looking for DW_AT_*_bound in 
the cached DIE).

Is this acceptable, or where you thinking of some other scheme?

Thanks.
Aldy
Richard Biener May 29, 2015, 7:26 p.m. UTC | #7
On May 29, 2015 8:42:50 PM GMT+02:00, Aldy Hernandez <aldyh@redhat.com> wrote:
>On 05/28/2015 04:35 PM, Jason Merrill wrote:
>> On 05/28/2015 02:53 PM, Aldy Hernandez wrote:
>>> On 05/27/2015 08:39 AM, Jason Merrill wrote:
>>>> On 05/20/2015 11:50 AM, Aldy Hernandez wrote:
>>>
>>>>> +  /* Fill in the size of variable-length fields in late dwarf. 
>*/
>>>>> +  if (TREE_ASM_WRITTEN (type)
>>>>> +      && !early_dwarf_dumping)
>>>>> +    {
>>>>> +      tree member;
>>>>> +      for (member = TYPE_FIELDS (type); member; member =
>DECL_CHAIN
>>>>> (member))
>>>>> +    fill_variable_array_bounds (TREE_TYPE (member));
>>>>> +      return;
>>>>> +    }
>>>>
>>>> Why is this happening in late dwarf?  I'm concerned that front-end
>>>> information that is necessary to do this might be lost by that
>point.
>>>
>>> I thought only after the optimizations had run their course would we
>be
>>> guaranteed to have accurate bound information.  At least, that's
>what my
>>> experience showed.
>>
>> Hmm, I'm don't know why optimizations would change the representation
>of
>> the array type.
>>
>>>>> +      /* Variable-length types may be incomplete even if
>>>>> +     TREE_ASM_WRITTEN.  For such types, fall through to
>>>>> +     gen_array_type_die() and possibly fill in
>>>>> +     DW_AT_{upper,lower}_bound attributes.  */
>>>>> +      if ((TREE_CODE (type) != ARRAY_TYPE
>>>>> +       && TREE_CODE (type) != RECORD_TYPE
>>>>> +       && TREE_CODE (type) != UNION_TYPE
>>>>> +       && TREE_CODE (type) != QUAL_UNION_TYPE)
>>>>> +      || (TYPE_SIZE (type)
>>>>> +          && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST))
>>>>
>>>> Similarly, why check for INTEGER_CST here?
>>>
>>> The INTEGER_CST check was supposed to mean "we have bound
>information
>>> already, no need to look further".
>>>
>>> I guess we could have a variable length bound that does not decay to
>a
>>> constant.
>>
>> Right.  I would expect that to usually be the case with VLAs.
>>
>>> Perhaps I could check the presence of a cached DIE with a
>>> type DIE containing a DW_TAG_subrange_type *and*
>>> DW_AT_{lower,upper}_bound ??.  Basically I just want to add bound
>>> information, if available and not already present.
>>>
>>> Suggestions?
>>
>> I'm still not sure why we can't just emit bound info in early dwarf.
>Can
>> you be more specific about the optimization thing?
>
>Ok, I see what I was trying to do, albeit incorrectly.  Imagine this:
>
>unsigned int i=555;
>
>int main()
>{
>   unsigned int array[i];
>   ...
>}
>
>For the VLA, I'd like to check if we have an array type with a missing 
>DW_AT_{upper,lower}_bound late in the game, and fill it in.
>
>During early dwarf we only have an uninitialized gimple register 
>representing the bound, and loc_list_from_tree() cannot find the RTL 
>with the final bound location.  Thus, we end up with a missing bound, 
>which I propose to fill in late dwarf.
>
>Obviously I was doing some nonsense with TYPE_SIZE != INTEGER_CST, when
>
>in reality I should probably check that TREE_CODE (type) == ARRAY_TYPE 
>and that we are missing the bound late (by looking for DW_AT_*_bound in
>
>the cached DIE).
>
>Is this acceptable, or where you thinking of some other scheme?

ISTR I had to mark the gimple reg used for the bound as non-DECL_IGNORED for the LTO stuff.

Richard.

>Thanks.
>Aldy
Jason Merrill May 29, 2015, 7:32 p.m. UTC | #8
On 05/29/2015 02:42 PM, Aldy Hernandez wrote:
> unsigned int i=555;
>
> int main()
> {
>    unsigned int array[i];
>    ...
> }
>
> For the VLA, I'd like to check if we have an array type with a missing
> DW_AT_{upper,lower}_bound late in the game, and fill it in.
>
> During early dwarf we only have an uninitialized gimple register
> representing the bound

Ah, I see, from gimplify_type_sizes.

> and loc_list_from_tree() cannot find the RTL
> with the final bound location.  Thus, we end up with a missing bound,
> which I propose to fill in late dwarf.

OK, that makes sense.  I wonder if we should emit debug info for the 
gimple register; that would allow us to fill in the bound during early 
dwarf.

> Obviously I was doing some nonsense with TYPE_SIZE != INTEGER_CST, when
> in reality I should probably check that TREE_CODE (type) == ARRAY_TYPE
> and that we are missing the bound late (by looking for DW_AT_*_bound in
> the cached DIE).

But this sounds reasonable.

Jason
Jason Merrill May 29, 2015, 7:33 p.m. UTC | #9
On 05/29/2015 03:26 PM, Richard Biener wrote:
> ISTR I had to mark the gimple reg used for the bound as non-DECL_IGNORED for the LTO stuff.

Let's go with that, then.

Jason
Aldy Hernandez May 31, 2015, 12:52 a.m. UTC | #10
On 05/29/2015 03:33 PM, Jason Merrill wrote:
> On 05/29/2015 03:26 PM, Richard Biener wrote:
>> ISTR I had to mark the gimple reg used for the bound as
>> non-DECL_IGNORED for the LTO stuff.
>
> Let's go with that, then.

Well, I did play around with that option originally, but temporaries do 
not end up in the symbol table, so we won't see them to feed them to 
late_global_decl.

We'd have to save them on the side to make them survive until late and 
then feed them to late_global_decl separately (which I'm sure Richi will 
hate), or we could drill down through the array type/domain to find the 
gimple register (which now has an early DIE) and call late_global_decl 
on it.  However, this last option sounds like a variant of my original 
idea-- fill the bound location later, with the unfortunate side-effect 
of having an additional DIE (the gimple register DIE).

I guess we could iterate through all the gimple registers late that have 
DECL_IGNORED_P == NULL and call late_global_decl on them, but I dislike 
this as well.  Actually, all ideas involving generating DIEs for 
temporaries involve an additional DIE we wouldn't otherwise get.

Can I clean up my original idea instead?

Aldy
Jason Merrill May 31, 2015, 8:38 p.m. UTC | #11
On 05/30/2015 08:52 PM, Aldy Hernandez wrote:
> On 05/29/2015 03:33 PM, Jason Merrill wrote:
>> On 05/29/2015 03:26 PM, Richard Biener wrote:
>>> ISTR I had to mark the gimple reg used for the bound as
>>> non-DECL_IGNORED for the LTO stuff.
>>
>> Let's go with that, then.
>
> Well, I did play around with that option originally, but temporaries do
> not end up in the symbol table, so we won't see them to feed them to
> late_global_decl.

The temporary has function scope, so I don't see why that would be an issue.

Jason
Richard Biener June 1, 2015, 8:03 a.m. UTC | #12
On Sun, May 31, 2015 at 10:38 PM, Jason Merrill <jason@redhat.com> wrote:
> On 05/30/2015 08:52 PM, Aldy Hernandez wrote:
>>
>> On 05/29/2015 03:33 PM, Jason Merrill wrote:
>>>
>>> On 05/29/2015 03:26 PM, Richard Biener wrote:
>>>>
>>>> ISTR I had to mark the gimple reg used for the bound as
>>>> non-DECL_IGNORED for the LTO stuff.
>>>
>>>
>>> Let's go with that, then.
>>
>>
>> Well, I did play around with that option originally, but temporaries do
>> not end up in the symbol table, so we won't see them to feed them to
>> late_global_decl.
>
>
> The temporary has function scope, so I don't see why that would be an issue.

Yeah, we should walk it via the function_decl hook on the containing function.

Richard.

> Jason
>

Patch
diff mbox

diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index 02012d5..15b6dd2 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -4718,7 +4718,8 @@  write_roots (pair_p variables, bool emit_pch)
    this funcion will have to be adjusted to be more like
    output_mangled_typename.  */

-static void
+/* ?? Why are we keeping this?  Is this actually used anywhere?  */
+static void ATTRIBUTE_UNUSED
 output_typename (outf_p of, const_type_p t)
 {
   switch (t->kind)

Just remove the function.

The langhooks changes are ok.

diff --git a/gcc/passes.c b/gcc/passes.c
index 04ff042..4dee8ad 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -293,6 +293,28 @@  rest_of_decl_compilation (tree decl,
   else if (TREE_CODE (decl) == VAR_DECL && !DECL_EXTERNAL (decl)
           && TREE_STATIC (decl))
     varpool_node::get_create (decl);
+
+  /* Generate early debug for global variables.  Any local variables will
+     be handled by either handling reachable functions from
+     finalize_compilation_unit (and by consequence, locally scoped
+     symbols), or by rest_of_type_compilation below.
+
+     Also, pick up function prototypes, which will be mostly ignored
+     by the different early_global_decl() hooks, but will at least be
+     used by Go's hijack of the debug_hooks to implement
+     -fdump-go-spec.  */
+  if (!flag_wpa
+      && !in_lto_p

Just check !in_lto_p, !flag_wpa is redundant.

+      && !decl_function_context (decl)
+      && !current_function_decl

Why that?  !decl_function_context should catch relevant cases?

+      && !decl_type_context (decl))
+    (*debug_hooks->early_global_decl) (decl);

I'll note that nested functions and class methods are not getting
early_global_decl()ed here.  I suppose their containing function/type
is supposed to generate early dwarf by means of dwarf2out walking
over children.

timevar changes are ok.

the toplev changes are ok.

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index ad1bb23..2a9f417 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1334,6 +1334,9 @@  struct GTY(()) tree_block {
   tree abstract_origin;
   tree fragment_origin;
   tree fragment_chain;
+
+  /* Pointer to the DWARF lexical block.  */
+  struct die_struct *die;
 };

 struct GTY(()) tree_type_common {