Patchwork dwarf2out: Use DW_FORM_ref_udata (.debug -= 7.53%)

login
register
mail settings
Submitter Jan Kratochvil
Date Oct. 21, 2011, 6:01 p.m.
Message ID <20111021180129.GA24486@host1.jankratochvil.net>
Download mbox | patch
Permalink /patch/121047/
State New
Headers show

Comments

Jan Kratochvil - Oct. 21, 2011, 6:01 p.m.
Hi,

this is a standalone patch to replace DW_FORM_ref4 by DW_FORM_ref_udata
(uleb128).  It saves for libstdc++.so.debug: 5254792 -> 4859136 = 7.53%
Tested with: -O2 -gdwarf-4 -fdebug-types-section

The references are already intra-CU ones, they are never relocated, they are
pre-computed by dwarf2out.c as plain numbers already.  IIRC I got this idea
after reading Google "Fission" but it does not deal with intra-CU refs.

The change does not apply to CFI references, those are unchanged, generated by
Jan Kratochvil - Oct. 21, 2011, 6:08 p.m.
On Fri, 21 Oct 2011 20:01:29 +0200, Jan Kratochvil wrote:
> No regressions on {x86_64,x86_64-m32,i686}-fedora16pre-linux-gnu
> (4.7.0 20111002).

A typo, only tested for x86_64-fedora16pre-linux-gnu, sorry.


Jan
Jakub Jelinek - Oct. 21, 2011, 7:11 p.m.
On Fri, Oct 21, 2011 at 08:01:29PM +0200, Jan Kratochvil wrote:
> I had a draft patch to use DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref_udata and
> DW_FORM_ref4 which should be slightly smaller than just DW_FORM_ref_udata.
> But it produced larger files than just DW_FORM_ref_udata.  Assuming it was due
> to multiplied abbrev definitions.  One would need to decide when if it is
> worth to create a new smaller-sized abbrev but that seems too complicated for
> such kind of optimization.  There exist other project(s) in development for
> DWARF optimizations as a post-processing tool, this patch is meant just as
> a very simple way to reduce the (possibly intermediate) DWARF files.

Well, you calculate the sizes multiple times anyway, so I don't see why you
during the size calculations you couldn't start with DW_FORM_ref_udata
as first guess and compute on the side also total sizes of those
DW_FORM_ref_udata bytes and use that number plus the guessed length
of the whole CU to decide if replacing all DW_FORM_ref_udata with
DW_FORM_ref{1,2,4} wouldn't be beneficial.  Small complication for that is
that when there are multiple .debug_info/.debug_types sections that share
the same .debug_abbrev, the decision needs to be done for all of them
together.

BTW, is your 

> +  /* The DIE sizes can increase, due to DW_FORM_ref_udata size increase
> +     dependent on increases of other DIE_OFFSETs.  */
> +  do
> +    {
> +      /* Initialize the beginning DIE offset - and calculate sizes/offsets.  */
> +      next_die_offset = init_die_offset;
> +      calc_die_sizes_change = false;
> +      calc_die_sizes (die);
> +    }
> +  while (calc_die_sizes_change);

loop guaranteed to terminate?  If the CU is either only growing or only
shrinking then it hopefully should, but it would be nice to assert that.
For references to DIEs with lower offsets you start with a roughly correct
guess, for references to DIEs with higher offsets you start with 0 and then
just keep growing?

	Jakub
Jan Kratochvil - Oct. 21, 2011, 7:49 p.m.
On Fri, 21 Oct 2011 21:11:16 +0200, Jakub Jelinek wrote:
> Well, you calculate the sizes multiple times anyway, so I don't see why you
> during the size calculations you couldn't start with DW_FORM_ref_udata
> as first guess and compute on the side also total sizes of those
> DW_FORM_ref_udata bytes and use that number plus the guessed length
> of the whole CU to decide if replacing all DW_FORM_ref_udata with
> DW_FORM_ref{1,2,4} wouldn't be beneficial.

The optimal sizes are:
	value less than 1 <<  8: DW_FORM_ref1
	value less than 1 << 16: DW_FORM_ref2
	value less than 1 << 21: DW_FORM_ref_udata
	value less than 1 << 32: DW_FORM_ref4

One would have to decide each size specifically if it isn't worth to use the
larger size for the few instances.  It could be done, just the gain is not
expected (not measured) big against DW_FORM_ref_udata.  The currently
suboptimal ranges are:
	(1 <<  7) <= value < (1 <<  8)
	(1 << 14) <= value < (1 << 16)
	(1 << 28) <= value

This suboptimal loss is at most 9316 bytes = 0.177% of .debug size but it will
be less as I do not calculate here with the abbrevs multiplication at all:
	readelf -wi libstdc++.so|perl -nle 'next if !/^.*<(0x[0-9a-f]+)>.*$/;$x=eval $1;$b++ if (1<<7)<=$x&&$x<(1<<8)||(1<<14)<=$x&&$x<(1<<16)||(1<<28)<=$x;END{print $b;}'
	9316

IMHO it is not worth it.

Also this problem is already present there.  The following code may enlarge the
output in specific cases due to duplication of the abbrev definitions:
  /* If the string is shorter or equal to the size of the reference, it is
     always better to put it inline.  */
     ^^^^^^ = not always
  if (len <= DWARF_OFFSET_SIZE || node->refcount == 0)
    return node->form = DW_FORM_string;

This patch has a size regression for DIE_OFFSETs larger than 1 << 28 = 256MB
when DW_FORM_ref_udata becomes 5 bytes while DW_FORM_ref4 was 4 bytes.
Still I do not think it needs to care about optimal output for a single CU of
size larger than 256MB.


> > +  /* The DIE sizes can increase, due to DW_FORM_ref_udata size increase
> > +     dependent on increases of other DIE_OFFSETs.  */
> > +  do
> > +    {
> > +      /* Initialize the beginning DIE offset - and calculate sizes/offsets.  */
> > +      next_die_offset = init_die_offset;
> > +      calc_die_sizes_change = false;
> > +      calc_die_sizes (die);
> > +    }
> > +  while (calc_die_sizes_change);
> 
> loop guaranteed to terminate?  If the CU is either only growing or only
> shrinking then it hopefully should, but it would be nice to assert that.
> For references to DIEs with lower offsets you start with a roughly correct
> guess, for references to DIEs with higher offsets you start with 0 and then
> just keep growing?

Yes.  The initial DIE_OFFSETs are 0, therefore their sizeof(uleb128) is 1.
As I assume
	VAL1 <= VAL2 implicates sizeof (uleb128 (VAL1)) <= sizeof (uleb128 (VAL2))

there must be valid on each calc_die_sizes call the updated assertion:
	gcc_assert ((unsigned long int) die->die_offset <= next_die_offset);

I do not see anything could shrink anywhere.


Thanks,
Jan

Patch

different code of dwarf2out.c.

I had a draft patch to use DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref_udata and
DW_FORM_ref4 which should be slightly smaller than just DW_FORM_ref_udata.
But it produced larger files than just DW_FORM_ref_udata.  Assuming it was due
to multiplied abbrev definitions.  One would need to decide when if it is
worth to create a new smaller-sized abbrev but that seems too complicated for
such kind of optimization.  There exist other project(s) in development for
DWARF optimizations as a post-processing tool, this patch is meant just as
a very simple way to reduce the (possibly intermediate) DWARF files.

It is incompatible with GNU binutils readelf, a pending simple fix is:
	[patch] Fix readelf for DW_FORM_ref_udata
	http://sourceware.org/ml/binutils/2011-10/msg00201.html

No regressions on {x86_64,x86_64-m32,i686}-fedora16pre-linux-gnu
(4.7.0 20111002).  No changes in readelf -w output on libstdc++.so.

My former DW_AT_sibling patch will be updated on top of this one:
	Re: [patch#2] dwarf2out: Drop the size + performance overhead of DW_AT_sibling
	http://gcc.gnu.org/ml/gcc-patches/2011-10/msg01869.html
These two optimizations are not incremental to each other as both benefit
primarily from common DW_AT_sibling.  That size decrease 3.49% will be less
after this patch; moreover when not all DW_AT_sibling attrs will be dropped.


Thanks,
Jan


gcc/
2011-10-21  Jan Kratochvil  <jan.kratochvil@redhat.com>

	* dwarf2out.c (DW_FORM_ref): Remove.
	(size_of_die) <dw_val_class_die_ref>: Use size_of_uleb128 for
	!AT_ref_external references.
	(calc_die_sizes_change): New variable.
	(calc_die_sizes): Permit die_offset incrementals.  Set
	CALC_DIE_SIZES_CHANGE accordingly.
	(value_format) <dw_val_class_die_ref>: Return DW_FORM_ref_udata for
	!AT_ref_external references.
	(output_die) <dw_val_class_die_ref>: Use dw2_asm_output_data_uleb128
	for !AT_ref_external references.
	(output_unit_prep): New function, call calc_die_sizes repeatedly based
	on CALC_DIE_SIZES_CHANGE.
	(output_comp_unit, output_comdat_type_unit): Move some code to
	output_unit_prep.

--- gcc/dwarf2out.c
+++ gcc/dwarf2out.c
@@ -226,7 +226,6 @@  static GTY(()) rtx current_unit_personality;
 
 /* Data and reference forms for relocatable data.  */
 #define DW_FORM_data (DWARF_OFFSET_SIZE == 8 ? DW_FORM_data8 : DW_FORM_data4)
-#define DW_FORM_ref (DWARF_OFFSET_SIZE == 8 ? DW_FORM_ref8 : DW_FORM_ref4)
 
 #ifndef DEBUG_FRAME_SECTION
 #define DEBUG_FRAME_SECTION	".debug_frame"
@@ -7700,7 +7699,7 @@  size_of_die (dw_die_ref die)
 		size += DWARF_OFFSET_SIZE;
 	    }
 	  else
-	    size += DWARF_OFFSET_SIZE;
+	    size += size_of_uleb128 (AT_ref (a)->die_offset);
 	  break;
 	case dw_val_class_fde_ref:
 	  size += DWARF_OFFSET_SIZE;
@@ -7735,6 +7734,10 @@  size_of_die (dw_die_ref die)
   return size;
 }
 
+/* Has calc_die_sizes changed any DIE_OFFSET?  */
+
+static bool calc_die_sizes_change;
+
 /* Size the debugging information associated with a given DIE.  Visits the
    DIE's children recursively.  Updates the global variable next_die_offset, on
    each time through.  Uses the current value of next_die_offset to update the
@@ -7745,9 +7748,12 @@  calc_die_sizes (dw_die_ref die)
 {
   dw_die_ref c;
 
-  gcc_assert (die->die_offset == 0
-	      || (unsigned long int) die->die_offset == next_die_offset);
-  die->die_offset = next_die_offset;
+  gcc_assert ((unsigned long int) die->die_offset <= next_die_offset);
+  if ((unsigned long int) die->die_offset != next_die_offset)
+    {
+      die->die_offset = next_die_offset;
+      calc_die_sizes_change = true;
+    }
   next_die_offset += size_of_die (die);
 
   FOR_EACH_CHILD (die, c, calc_die_sizes (c));
@@ -8018,7 +8024,7 @@  value_format (dw_attr_ref a)
       if (AT_ref_external (a))
 	return use_debug_types ? DW_FORM_ref_sig8 : DW_FORM_ref_addr;
       else
-	return DW_FORM_ref;
+	return DW_FORM_ref_udata;
     case dw_val_class_fde_ref:
       return DW_FORM_data;
     case dw_val_class_lbl_id:
@@ -8397,8 +8403,7 @@  output_die (dw_die_ref die)
 	  else
 	    {
 	      gcc_assert (AT_ref (a)->die_offset);
-	      dw2_asm_output_data (DWARF_OFFSET_SIZE, AT_ref (a)->die_offset,
-				   "%s", name);
+	      dw2_asm_output_data_uleb128 (AT_ref (a)->die_offset, "%s", name);
 	    }
 	  break;
 
@@ -8496,6 +8501,28 @@  output_compilation_unit_header (void)
   dw2_asm_output_data (1, DWARF2_ADDR_SIZE, "Pointer Size (in bytes)");
 }
 
+/* Perform initializations common for .debug_info and .debug_types sections.  */
+
+static void
+output_unit_prep (dw_die_ref die, unsigned long init_die_offset)
+{
+  /* First mark all the DIEs in this CU so we know which get local refs.  */
+  mark_dies (die);
+
+  build_abbrev_table (die);
+
+  /* The DIE sizes can increase, due to DW_FORM_ref_udata size increase
+     dependent on increases of other DIE_OFFSETs.  */
+  do
+    {
+      /* Initialize the beginning DIE offset - and calculate sizes/offsets.  */
+      next_die_offset = init_die_offset;
+      calc_die_sizes_change = false;
+      calc_die_sizes (die);
+    }
+  while (calc_die_sizes_change);
+}
+
 /* Output the compilation unit DIE and its children.  */
 
 static void
@@ -8511,15 +8538,8 @@  output_comp_unit (dw_die_ref die, int output_if_empty)
   /* Even if there are no children of this DIE, we must output the information
      about the compilation unit.  Otherwise, on an empty translation unit, we
      will generate a present, but empty, .debug_info section.  IRIX 6.5 `nm'
-     will then complain when examining the file.  First mark all the DIEs in
-     this CU so we know which get local refs.  */
-  mark_dies (die);
-
-  build_abbrev_table (die);
-
-  /* Initialize the beginning DIE offset - and calculate sizes/offsets.  */
-  next_die_offset = DWARF_COMPILE_UNIT_HEADER_SIZE;
-  calc_die_sizes (die);
+     will then complain when examining the file.  */
+  output_unit_prep (die, DWARF_COMPILE_UNIT_HEADER_SIZE);
 
   oldsym = die->die_id.die_symbol;
   if (oldsym)
@@ -8563,14 +8583,7 @@  output_comdat_type_unit (comdat_type_node *node)
   tree comdat_key;
 #endif
 
-  /* First mark all the DIEs in this CU so we know which get local refs.  */
-  mark_dies (node->root_die);
-
-  build_abbrev_table (node->root_die);
-
-  /* Initialize the beginning DIE offset - and calculate sizes/offsets.  */
-  next_die_offset = DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE;
-  calc_die_sizes (node->root_die);
+  output_unit_prep (node->root_die, DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE);
 
 #if defined (OBJECT_FORMAT_ELF)
   secname = ".debug_types";