diff mbox

[debug] Do not emit debug sections until they are required.

Message ID A3086084-F186-42E0-92EB-1359151FF2D7@sandoe-acoustics.co.uk
State New
Headers show

Commit Message

Iain Sandoe Nov. 2, 2010, 12:23 p.m. UTC
This is part 2 of a campaign to get rid of differences in section  
ordering between debug & non-debug cases.

This moves the three remaining switch_to_section()  from  
dwarf2out_init () and places them directly before the code that emits  
the respective sections.

In order to do this, macinfo data are saved in a vec, and emitted from  
dwarf2out_finish () rather than 'on the fly'.

Whilst the benefit is mainly for targets that emit sections in the  
order they are encountered...
...  I also expect it is of help to anyone who has occasion to compare  
debug/non-debug asm files.

bootstrapped on i686-darwin{8,9,10} and x86_64-unk-linux.
OK for trunk?
Iain

gcc:

	* dwarf2out.c (macinfo_entry): New struct.
	(output_comp_unit): Note that we will emit an info section.
	(output_pubnames): Only try to refer to the info section if it is  
emitted.
	(dwarf2out_start_source_file): Save data in a macinfo entry rather than
	emitting directly.
	(dwarf2out_end_source_file): Likewise.
	(dwarf2out_define): Likewise.
	(dwarf2out_undef): Likewise.
	(output_macinfo): New.
	(dwarf2out_init): Do not emit debug section switches here.
	(output_indirect_string): If any strings are emitted, then emit the  
section
	switch first, and once only.
	(dwarf2out_finish): First switch to debug_abbrev_section here.
	debug_line_section,  debug_macinfo_section, Likewise.

===

After this, there are tthree remaining issues (AFAICT);
1. frame_debug emitted before eh_frames (from toplev.c rather than  
dwarf2out - is there some special reason for this?)
2. darwin's indirection stubs are emitted from end_file () and they  
really need to be between the _eh frames and the debug.
3. the lto marker is emitted after everything else.
I had patches for these .. but they've got some bit-rot ..
.. this is not a mega-high-priority - but it would be good to shake  
down some of the darwin debug hassles for 4.6.

Comments

Richard Henderson Nov. 2, 2010, 8:58 p.m. UTC | #1
On 11/02/2010 05:23 AM, IainS wrote:
> -  dw2_asm_output_offset (DWARF_OFFSET_SIZE, debug_info_section_label,
> +  if (info_section_emitted)
> +    dw2_asm_output_offset (DWARF_OFFSET_SIZE, debug_info_section_label,
>  			 debug_info_section,
>  			 "Offset of Compilation Unit Info");
> +  else
> +    dw2_asm_output_data (4, 0, "No Debug Info Section");

This is the only hunk that concerns me.  I don't believe that this
is valid dwarf.  In particular, a non-empty pubnames implies that
we must have had an info section from which to pull the names.

Did you have a test case that caused you to make this change?
If not, ok with gcc_assert (info_section_emitted).


r~
Tom Tromey Nov. 2, 2010, 9:10 p.m. UTC | #2
>>>>> "IainS" == IainS  <developer@sandoe-acoustics.co.uk> writes:

IainS> In order to do this, macinfo data are saved in a vec, and emitted from
IainS> dwarf2out_finish () rather than 'on the fly'.

Does this impact gcc's memory use?
(And does it matter to anybody?)

Tom
Iain Sandoe Nov. 3, 2010, 1:33 p.m. UTC | #3
Hi Richard,

On 2 Nov 2010, at 20:58, Richard Henderson wrote:

> On 11/02/2010 05:23 AM, IainS wrote:
>> -  dw2_asm_output_offset (DWARF_OFFSET_SIZE,  
>> debug_info_section_label,
>> +  if (info_section_emitted)
>> +    dw2_asm_output_offset (DWARF_OFFSET_SIZE,  
>> debug_info_section_label,
>> 			 debug_info_section,
>> 			 "Offset of Compilation Unit Info");
>> +  else
>> +    dw2_asm_output_data (4, 0, "No Debug Info Section");
>
> This is the only hunk that concerns me.  I don't believe that this
> is valid dwarf.  In particular, a non-empty pubnames implies that
> we must have had an info section from which to pull the names.
>
> Did you have a test case that caused you to make this change?
> If not, ok with gcc_assert (info_section_emitted).

not a test-case -- but Java achieves it ...
... and with the gcc_assert (), fails to bootstrap on i686-darwin.

.. because I've removed the unconditional emitting of the section  
switch in dwarf2_init()
now the section is only emitted _if_ there's actually a switch to it....
(so without the assert we would simply fail later with a not-found- 
label in the assembler).

-

If it is unconditionally necessary to emit a (potentially empty)  
section - that can still easily be handled from dwarf2_finish() ...
..  is an empty info section valid?
(I noted, also, the comment about some targets not liking empty debug  
sections).

Iain

also, ISTR that maybe only darwin is using pubnames (so maybe no other  
platform would notice).
(I seem to recall some traffic about this on the list, but not exactly  
the when, why and outcome).
Iain Sandoe Nov. 3, 2010, 1:55 p.m. UTC | #4
On 2 Nov 2010, at 21:10, Tom Tromey wrote:

>>>>>> "IainS" == IainS  <developer@sandoe-acoustics.co.uk> writes:
>
> IainS> In order to do this, macinfo data are saved in a vec, and  
> emitted from
> IainS> dwarf2out_finish () rather than 'on the fly'.
>
> Does this impact gcc's memory use?

the size of the macro strings + a small overhead per macro.
(i.e. it should be dominated by the strings in most cases).

one would generally expect this to number in kb rather than Mb -  
although I'm sure we could come up with a scenario which would exceed  
this.

> (And does it matter to anybody?)

... given the discussion on irc last night about >4Gb objects, it  
seems unlikely.

It's judgement on whether the improvement in determinism for the  
section order and readability of the asm are worth the increase in  
footprint.

cheers
Iain
Richard Henderson Nov. 5, 2010, 11:41 p.m. UTC | #5
On 11/03/2010 06:33 AM, IainS wrote:
> not a test-case -- but Java achieves it ...
> ... and with the gcc_assert (), fails to bootstrap on i686-darwin.

Ok, I've forced pubnames on for linux and I've reproduced this.
I'll have a look at the real causes over the weekend.


r~
Richard Henderson Nov. 6, 2010, 12:08 a.m. UTC | #6
On 11/03/2010 06:33 AM, IainS wrote:
>> This is the only hunk that concerns me.  I don't believe that this
>> is valid dwarf.  In particular, a non-empty pubnames implies that
>> we must have had an info section from which to pull the names.
>>
>> Did you have a test case that caused you to make this change?
>> If not, ok with gcc_assert (info_section_emitted).
> 
> not a test-case -- but Java achieves it ...
> ... and with the gcc_assert (), fails to bootstrap on i686-darwin.

Ok, got it.

We generate pubtypes before -feliminate-unused-debug-types does its
thing.  So we're putting stuff in pubtypes that are no longer in the
compilation unit.  Which is a Bad Thing.  I was right about us 
emitting invalid dwarf2.

We should generate pubnames/pubtypes from the final debug_info tree,
rather than trying to emit them earlier.


r~
Iain Sandoe Nov. 6, 2010, 9:25 p.m. UTC | #7
Hi Richard,
On 6 Nov 2010, at 00:08, Richard Henderson wrote:

> On 11/03/2010 06:33 AM, IainS wrote:
>>> This is the only hunk that concerns me.  I don't believe that this
>>> is valid dwarf.  In particular, a non-empty pubnames implies that
>>> we must have had an info section from which to pull the names.
>>>
>>> Did you have a test case that caused you to make this change?
>>> If not, ok with gcc_assert (info_section_emitted).
>>
>> not a test-case -- but Java achieves it ...
>> ... and with the gcc_assert (), fails to bootstrap on i686-darwin.
>
> Ok, got it.

:-)

> We generate pubtypes before -feliminate-unused-debug-types does its
> thing.  So we're putting stuff in pubtypes that are no longer in the
> compilation unit.  Which is a Bad Thing.  I was right about us
> emitting invalid dwarf2.

it looks like we have possibly only been doing it when the table was  
empty...

> We should generate pubnames/pubtypes from the final debug_info tree,
> rather than trying to emit them earlier.

That looks quite involved (but maybe I miss something).

==

It seems that checking the the Vec has size > 0  is not enough for  
pubtypes (because of the possible elimination).

Would it be sufficient (although another pass through the table :( )  
to check that at least one type will be emitted before trying?

==

Also I note that the

   size_of_pubnames()
   output_pubnames ()

routines check p->die->die_offset != 0 as the condition.

I wonder if that is not supposed to be p->die->die_mark?

thanks
Iain
Richard Henderson Nov. 8, 2010, 4:58 p.m. UTC | #8
On 11/06/2010 02:25 PM, IainS wrote:
> That looks quite involved (but maybe I miss something).

It probably is a bit.  I havn't actually looked.

> Would it be sufficient (although another pass through the table :( )
> to check that at least one type will be emitted before trying?

No, pubtypes is supposed to be a summary of what's in the comp unit.

If we say that type X is in there and it isn't, then the consumer
of the pubtypes info is going to get confused.  To the point where
it might have been better not to emit the data at all, which ought
to force the consumer to scan all the comp units, which is the only
way it can currently get a correct answer.


r~
Iain Sandoe Nov. 8, 2010, 5:19 p.m. UTC | #9
On 8 Nov 2010, at 16:58, Richard Henderson wrote:

> On 11/06/2010 02:25 PM, IainS wrote:
>> Would it be sufficient (although another pass through the table :( )
>> to check that at least one type will be emitted before trying?
>
> No, pubtypes is supposed to be a summary of what's in the comp unit.
>
> If we say that type X is in there and it isn't, then the consumer
> of the pubtypes info is going to get confused.  To the point where
> it might have been better not to emit the data at all, which ought
> to force the consumer to scan all the comp units, which is the only
> way it can currently get a correct answer.


If the current code is working correctly (question about checking p- 
 >die->die_offset != 0 as the condition ***)
... then we should only emit un-pruned types.

Thus, we emit an empty pubtypes table at present if all the types have  
been pruned - I think that case is producing the invalid debug.

We could detect an 'empty' table (no unpruned types left) and elect  
not to emit it at all.

but maybe I'm still missing sth?

Iain

====

  ***

Also I note that the

  size_of_pubnames()
  output_pubnames ()

routines check p->die->die_offset != 0 as the condition.

I wonder if that is not supposed to be p->die->die_mark?
Richard Henderson Nov. 8, 2010, 5:49 p.m. UTC | #10
On 11/08/2010 09:19 AM, IainS wrote:
> If the current code is working correctly (question about checking
> p->die->die_offset != 0 as the condition ***) ... then we should only
> emit un-pruned types.

Hmm.  It would seem that I didn't follow the path far enough to see
us skipping eliminated pubtypes during output.  I was still focused
on the existence of the pubtypes section.

> Thus, we emit an empty pubtypes table at present if all the types
> have been pruned - I think that case is producing the invalid debug.

Yes.  In particular your line about using 0 instead of a real pointer
to the debug_info section.  There is no "null pointer".  If the debug_info
section doesn't exist, then the pub* sections can't exist either.

> We could detect an 'empty' table (no unpruned types left) and elect not to emit it at all.

That sounds plausible.

> Also I note that the
> 
>  size_of_pubnames()
>  output_pubnames ()
> 
> routines check p->die->die_offset != 0 as the condition.
> 
> I wonder if that is not supposed to be p->die->die_mark?

No, the marks are unset after processing.


r~
diff mbox

Patch

Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	(revision 166173)
+++ gcc/dwarf2out.c	(working copy)
@@ -5760,6 +5760,18 @@  struct GTY(()) dw_ranges_struct {
   int num;
 };
 
+/* A structure to hold a macinfo entry.  */
+
+typedef struct GTY(()) macinfo_struct {
+  unsigned HOST_WIDE_INT code;
+  unsigned HOST_WIDE_INT lineno;
+  const char *info;
+}
+macinfo_entry;
+
+DEF_VEC_O(macinfo_entry);
+DEF_VEC_ALLOC_O(macinfo_entry, gc);
+
 struct GTY(()) dw_ranges_by_label_struct {
   const char *begin;
   const char *end;
@@ -5989,6 +6001,10 @@  static GTY(()) unsigned separate_line_info_table_i
    line_info_table.  */
 #define LINE_INFO_TABLE_INCREMENT 1024
 
+/* A flag to tell pubnames/types export if there is an info section to
+   refer to.  */
+static bool info_section_emitted;
+
 /* A pointer to the base of a table that contains a list of publicly
    accessible names.  */
 static GTY (()) VEC (pubname_entry, gc) *  pubname_table;
@@ -5997,6 +6013,10 @@  static GTY (()) VEC (pubname_entry, gc) *  pubname
    accessible types.  */
 static GTY (()) VEC (pubname_entry, gc) * pubtype_table;
 
+/* A pointer to the base of a table that contains a list of macro
+   defines/undefines (and file start/end markers).  */
+static GTY (()) VEC (macinfo_entry, gc) * macinfo_table;
+
 /* Array of dies for which we should generate .debug_arange info.  */
 static GTY((length ("arange_table_allocated"))) dw_die_ref *arange_table;
 
@@ -11263,7 +11283,11 @@  output_comp_unit (dw_die_ref die, int output_if_em
       switch_to_section (get_section (secname, SECTION_DEBUG, NULL));
     }
   else
-    switch_to_section (debug_info_section);
+    {
+      switch_to_section (debug_info_section);
+      ASM_OUTPUT_LABEL (asm_out_file, debug_info_section_label);
+      info_section_emitted = true;
+    }
 
   /* Output debugging information.  */
   output_compilation_unit_header ();
@@ -11428,9 +11452,13 @@  output_pubnames (VEC (pubname_entry, gc) * names)
 			 "Length of Public Type Names Info");
   /* Version number for pubnames/pubtypes is still 2, even in DWARF3.  */
   dw2_asm_output_data (2, 2, "DWARF Version");
-  dw2_asm_output_offset (DWARF_OFFSET_SIZE, debug_info_section_label,
+  if (info_section_emitted)
+    dw2_asm_output_offset (DWARF_OFFSET_SIZE, debug_info_section_label,
 			 debug_info_section,
 			 "Offset of Compilation Unit Info");
+  else
+    dw2_asm_output_data (4, 0, "No Debug Info Section");
+    
   dw2_asm_output_data (DWARF_OFFSET_SIZE, next_die_offset,
 		       "Compilation Unit Length");
 
@@ -21666,14 +21694,11 @@  dwarf2out_start_source_file (unsigned int lineno,
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
-      int file_num = maybe_emit_file (lookup_filename (filename));
-
-      switch_to_section (debug_macinfo_section);
-      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
-      dw2_asm_output_data_uleb128 (lineno, "Included from line number %d",
-				   lineno);
-
-      dw2_asm_output_data_uleb128 (file_num, "file %s", filename);
+      macinfo_entry e;
+      e.code = DW_MACINFO_start_file;
+      e.lineno = lineno;
+      e.info = xstrdup (filename);
+      VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
@@ -21688,8 +21713,11 @@  dwarf2out_end_source_file (unsigned int lineno ATT
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
-      switch_to_section (debug_macinfo_section);
-      dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
+      macinfo_entry e;
+      e.code = DW_MACINFO_end_file;
+      e.lineno = lineno;
+      e.info = NULL;
+      VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
@@ -21703,10 +21731,11 @@  dwarf2out_define (unsigned int lineno ATTRIBUTE_UN
 {
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
-      switch_to_section (debug_macinfo_section);
-      dw2_asm_output_data (1, DW_MACINFO_define, "Define macro");
-      dw2_asm_output_data_uleb128 (lineno, "At line number %d", lineno);
-      dw2_asm_output_nstring (buffer, -1, "The macro");
+      macinfo_entry e;
+      e.code = DW_MACINFO_define;
+      e.lineno = lineno;
+      e.info = xstrdup (buffer);;
+      VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
@@ -21720,13 +21749,61 @@  dwarf2out_undef (unsigned int lineno ATTRIBUTE_UNU
 {
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
-      switch_to_section (debug_macinfo_section);
-      dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro");
-      dw2_asm_output_data_uleb128 (lineno, "At line number %d", lineno);
-      dw2_asm_output_nstring (buffer, -1, "The macro");
+      macinfo_entry e;
+      e.code = DW_MACINFO_undef;
+      e.lineno = lineno;
+      e.info = xstrdup (buffer);;
+      VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
+static void
+output_macinfo (void)
+{
+  unsigned i;
+  unsigned long length = VEC_length (macinfo_entry, macinfo_table);
+  macinfo_entry *ref;
+
+  if (! length)
+    return;
+
+  for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
+    {
+      switch (ref->code)
+	{
+	  case DW_MACINFO_start_file:
+	    {
+	      int file_num = maybe_emit_file (lookup_filename (ref->info));
+	      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
+	      dw2_asm_output_data_uleb128 
+			(ref->lineno, "Included from line number %lu", 
+			 			(unsigned long)ref->lineno);
+	      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+	    }
+	    break;
+	  case DW_MACINFO_end_file:
+	    dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
+	    break;
+	  case DW_MACINFO_define:
+	    dw2_asm_output_data (1, DW_MACINFO_define, "Define macro");
+	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
+			 			(unsigned long)ref->lineno);
+	    dw2_asm_output_nstring (ref->info, -1, "The macro");
+	    break;
+	  case DW_MACINFO_undef:
+	    dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro");
+	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
+			 			(unsigned long)ref->lineno);
+	    dw2_asm_output_nstring (ref->info, -1, "The macro");
+	    break;
+	  default:
+	   fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
+	     ASM_COMMENT_START, (unsigned long)ref->code);
+	  break;
+	}
+    }
+}
+
 /* Set up for Dwarf output at the start of compilation.  */
 
 static void
@@ -21815,20 +21892,11 @@  dwarf2out_init (const char *filename ATTRIBUTE_UNU
 			       DEBUG_LINE_SECTION_LABEL, 0);
   ASM_GENERATE_INTERNAL_LABEL (ranges_section_label,
 			       DEBUG_RANGES_SECTION_LABEL, 0);
-  switch_to_section (debug_abbrev_section);
-  ASM_OUTPUT_LABEL (asm_out_file, abbrev_section_label);
-  switch_to_section (debug_info_section);
-  ASM_OUTPUT_LABEL (asm_out_file, debug_info_section_label);
-  switch_to_section (debug_line_section);
-  ASM_OUTPUT_LABEL (asm_out_file, debug_line_section_label);
+  ASM_GENERATE_INTERNAL_LABEL (macinfo_section_label,
+			       DEBUG_MACINFO_SECTION_LABEL, 0);
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
-    {
-      switch_to_section (debug_macinfo_section);
-      ASM_GENERATE_INTERNAL_LABEL (macinfo_section_label,
-				   DEBUG_MACINFO_SECTION_LABEL, 0);
-      ASM_OUTPUT_LABEL (asm_out_file, macinfo_section_label);
-    }
+    macinfo_table = VEC_alloc (macinfo_entry, gc, 64);
 
   switch_to_section (text_section);
   ASM_OUTPUT_LABEL (asm_out_file, text_section_label);
@@ -21855,7 +21923,10 @@  dwarf2out_assembly_start (void)
 }
 
 /* A helper function for dwarf2out_finish called through
-   htab_traverse.  Emit one queued .debug_str string.  */
+   htab_traverse.  Emit one queued .debug_str string.  
+   We also emit the section start from here, on the basis that 
+   it need not be emitted unless there is at least one entry to
+   output.  */
 
 static int
 output_indirect_string (void **h, void *v ATTRIBUTE_UNUSED)
@@ -21864,7 +21935,12 @@  output_indirect_string (void **h, void *v ATTRIBUT
 
   if (node->label && node->refcount)
     {
-      switch_to_section (debug_str_section);
+      static bool done;
+      if (!done)
+	{
+	  switch_to_section (debug_str_section);
+	  done = true;
+	}
       ASM_OUTPUT_LABEL (asm_out_file, node->label);
       assemble_string (node->str, strlen (node->str) + 1);
     }
@@ -23042,11 +23118,12 @@  dwarf2out_finish (const char *filename)
   htab_delete (comdat_type_table);
 
   /* Output the main compilation unit if non-empty or if .debug_macinfo
-     has been emitted.  */
+     will be emitted.  */
   output_comp_unit (comp_unit_die (), debug_info_level >= DINFO_LEVEL_VERBOSE);
 
   /* Output the abbreviation table.  */
   switch_to_section (debug_abbrev_section);
+  ASM_OUTPUT_LABEL (asm_out_file, abbrev_section_label);
   output_abbrev_section ();
 
   /* Output location list section if necessary.  */
@@ -23111,16 +23188,18 @@  dwarf2out_finish (const char *filename)
      .debug_info section.  IRIX 6.5 `nm' will then complain when
      examining the file.  This is done late so that any filenames
      used by the debug_info section are marked as 'used'.  */
+  switch_to_section (debug_line_section);
+  ASM_OUTPUT_LABEL (asm_out_file, debug_line_section_label);
   if (! DWARF2_ASM_LINE_DEBUG_INFO)
-    {
-      switch_to_section (debug_line_section);
-      output_line_info ();
-    }
+    output_line_info ();
 
   /* Have to end the macro section.  */
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       switch_to_section (debug_macinfo_section);
+      ASM_OUTPUT_LABEL (asm_out_file, macinfo_section_label);
+      if (!VEC_empty (macinfo_entry, macinfo_table))
+        output_macinfo ();
       dw2_asm_output_data (1, 0, "End compilation unit");
     }