diff mbox series

dwarf2out: Fix -gsplit-dwarf on riscv or other non-.uleb128 targets [PR99090]

Message ID 20210227102142.GD4020736@tucnak
State New
Headers show
Series dwarf2out: Fix -gsplit-dwarf on riscv or other non-.uleb128 targets [PR99090] | expand

Commit Message

Jakub Jelinek Feb. 27, 2021, 10:21 a.m. UTC
Hi!

As mentioned in the PR, riscv* only supports .uleb128 with constant
arguments, doesn't support difference of two labels because of aggressive
linker relaxations.  But I bet various other targets, especially those not
using GNU assembler, might suffer from the same problem.
As the FIXME comment in output_loc_list indicates, we ICE on
-gsplit-dwarf on those targets whenever we need .debug_loclists, because
we only emit DW_LLE_startx_length which requires working .uleb128 delta
of 2 code section labels.  We can't use DW_LLE_base_addressx
once followed by DW_LLE_offset_pair either because the latter suffers
from the same issue - need .uleb128 difference of code section labels
(and in that case not just for the second operand but also for the first).

So, this patch implements what the comment said and emits DW_LLE_startx_endx
instead, which wastes more space in .debug_addr, but will work.

Bootstrapped/regtested on x86_64-linux and i686-linux and as written in the
PR, Jim has tested it on riscv*linux.  Ok for trunk?

BTW, for HAVE_AS_LEB128 -gdwarf-5 -gsplit-dwarf, maybe we should consider
instead of always emitting DW_LLE_startx_length do all the optimizations
that we do for HAVE_AS_LEB128 -gdwarf-5, or at least a subset of them.
For !have_multiple_function_sections, we in that case emit just
DW_LLE_offset_pair (that can certainly be a win for small TUs, we wouldn't
need any .debug_addr entry in that case; on the other side, just using
DW_LLE_offset_pair can be harmful for very large TUs especially if the
loclist has many entries, emitting in that case a single DW_LLE_base_address
or for -gsplit-dwarf DW_LLE_base_addressx followed by DW_LLE_offset_pair
might be much smaller), and for have_multiple_function_sections figuring
out if DW_LLE_base_address followed by DW_LLE_offset_pair entries
or DW_LLE_start_length is bettter.  So perhaps a middle-ground for
-gsplit-dwarf would be to always do the have_multiple_function_sections
behavior, i.e. DW_LLE_base_addressx followed by DW_LLE_offset_pair vs.
DW_LLE_startx_length decisions based on the ranges and their counts.
And perhaps dwz could optimize afterwards, on linked binaries or shared
libraries it knows all the offsets and could figure out optimal DW_LLE_*.

2021-02-26  Jakub Jelinek  <jakub@redhat.com>

	PR debug/99090
	* dwarf2out.c (dw_loc_list_struct): Add end_entry member.
	(new_loc_list): Clear end_entry.
	(output_loc_list): Only use DW_LLE_startx_length for -gsplit-dwarf
	if HAVE_AS_LEB128, otherwise use DW_LLE_startx_endx.  Fix comment
	typo.
	(index_location_lists): For dwarf_version >= 5 without HAVE_AS_LEB128,
	initialize also end_entry.


	Jakub

Comments

Jeff Law March 2, 2021, 11:01 p.m. UTC | #1
On 2/27/21 3:21 AM, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> As mentioned in the PR, riscv* only supports .uleb128 with constant
> arguments, doesn't support difference of two labels because of aggressive
> linker relaxations.  But I bet various other targets, especially those not
> using GNU assembler, might suffer from the same problem.
Yea.  I'm going to be working on a target with similar constraints
soon.  The 32bit PA hpux port has similar issues, but can at least emit
suitable relocations to make this work.
> As the FIXME comment in output_loc_list indicates, we ICE on
> -gsplit-dwarf on those targets whenever we need .debug_loclists, because
> we only emit DW_LLE_startx_length which requires working .uleb128 delta
> of 2 code section labels.  We can't use DW_LLE_base_addressx
> once followed by DW_LLE_offset_pair either because the latter suffers
> from the same issue - need .uleb128 difference of code section labels
> (and in that case not just for the second operand but also for the first).
>
> So, this patch implements what the comment said and emits DW_LLE_startx_endx
> instead, which wastes more space in .debug_addr, but will work.


>
> Bootstrapped/regtested on x86_64-linux and i686-linux and as written in the
> PR, Jim has tested it on riscv*linux.  Ok for trunk?
>
> BTW, for HAVE_AS_LEB128 -gdwarf-5 -gsplit-dwarf, maybe we should consider
> instead of always emitting DW_LLE_startx_length do all the optimizations
> that we do for HAVE_AS_LEB128 -gdwarf-5, or at least a subset of them.
> For !have_multiple_function_sections, we in that case emit just
> DW_LLE_offset_pair (that can certainly be a win for small TUs, we wouldn't
> need any .debug_addr entry in that case; on the other side, just using
> DW_LLE_offset_pair can be harmful for very large TUs especially if the
> loclist has many entries, emitting in that case a single DW_LLE_base_address
> or for -gsplit-dwarf DW_LLE_base_addressx followed by DW_LLE_offset_pair
> might be much smaller), and for have_multiple_function_sections figuring
> out if DW_LLE_base_address followed by DW_LLE_offset_pair entries
> or DW_LLE_start_length is bettter.  So perhaps a middle-ground for
> -gsplit-dwarf would be to always do the have_multiple_function_sections
> behavior, i.e. DW_LLE_base_addressx followed by DW_LLE_offset_pair vs.
> DW_LLE_startx_length decisions based on the ranges and their counts.
> And perhaps dwz could optimize afterwards, on linked binaries or shared
> libraries it knows all the offsets and could figure out optimal DW_LLE_*.
>
> 2021-02-26  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR debug/99090
> 	* dwarf2out.c (dw_loc_list_struct): Add end_entry member.
> 	(new_loc_list): Clear end_entry.
> 	(output_loc_list): Only use DW_LLE_startx_length for -gsplit-dwarf
> 	if HAVE_AS_LEB128, otherwise use DW_LLE_startx_endx.  Fix comment
> 	typo.
> 	(index_location_lists): For dwarf_version >= 5 without HAVE_AS_LEB128,
> 	initialize also end_entry.
OK
jeff
diff mbox series

Patch

--- gcc/dwarf2out.c.jj	2021-02-10 07:54:25.210622383 +0100
+++ gcc/dwarf2out.c	2021-02-26 19:11:19.555409473 +0100
@@ -1317,6 +1317,7 @@  typedef struct GTY(()) dw_loc_list_struc
   const char *begin; /* Label and addr_entry for start of range */
   addr_table_entry *begin_entry;
   const char *end;  /* Label for end of range */
+  addr_table_entry *end_entry;
   char *ll_symbol; /* Label for beginning of location list.
 		      Only on head of list.  */
   char *vl_symbol; /* Label for beginning of view list.  Ditto.  */
@@ -10101,6 +10102,7 @@  new_loc_list (dw_loc_descr_ref expr, con
   retlist->begin = begin;
   retlist->begin_entry = NULL;
   retlist->end = end;
+  retlist->end_entry = NULL;
   retlist->expr = expr;
   retlist->section = section;
   retlist->vbegin = vbegin;
@@ -10327,10 +10329,10 @@  output_loc_list (dw_loc_list_ref list_he
 
       if (dwarf_version >= 5)
 	{
-	  if (dwarf_split_debug_info)
+	  if (dwarf_split_debug_info && HAVE_AS_LEB128)
 	    {
 	      dwarf2out_maybe_output_loclist_view_pair (curr);
-	      /* For -gsplit-dwarf, emit DW_LLE_starx_length, which has
+	      /* For -gsplit-dwarf, emit DW_LLE_startx_length, which has
 		 uleb128 index into .debug_addr and uleb128 length.  */
 	      dw2_asm_output_data (1, DW_LLE_startx_length,
 				   "DW_LLE_startx_length (%s)",
@@ -10338,13 +10340,26 @@  output_loc_list (dw_loc_list_ref list_he
 	      dw2_asm_output_data_uleb128 (curr->begin_entry->index,
 					   "Location list range start index "
 					   "(%s)", curr->begin);
-	      /* FIXME: This will ICE ifndef HAVE_AS_LEB128.
-		 For that case we probably need to emit DW_LLE_startx_endx,
-		 but we'd need 2 .debug_addr entries rather than just one.  */
 	      dw2_asm_output_delta_uleb128 (curr->end, curr->begin,
 					    "Location list length (%s)",
 					    list_head->ll_symbol);
 	    }
+	  else if (dwarf_split_debug_info)
+	    {
+	      dwarf2out_maybe_output_loclist_view_pair (curr);
+	      /* For -gsplit-dwarf without usable .uleb128 support, emit
+		 DW_LLE_startx_endx, which has two uleb128 indexes into
+		 .debug_addr.  */
+	      dw2_asm_output_data (1, DW_LLE_startx_endx,
+				   "DW_LLE_startx_endx (%s)",
+				   list_head->ll_symbol);
+	      dw2_asm_output_data_uleb128 (curr->begin_entry->index,
+					   "Location list range start index "
+					   "(%s)", curr->begin);
+	      dw2_asm_output_data_uleb128 (curr->end_entry->index,
+					   "Location list range end index "
+					   "(%s)", curr->end);
+	    }
 	  else if (!have_multiple_function_sections && HAVE_AS_LEB128)
 	    {
 	      dwarf2out_maybe_output_loclist_view_pair (curr);
@@ -31288,12 +31303,14 @@  index_location_lists (dw_die_ref die)
 	       to the hash table.  In the rare case of DWARF[234] >= 64KB
 	       location expression, we'll just waste unused address table entry
 	       for it.  */
-            if (curr->begin_entry != NULL
-                || skip_loc_list_entry (curr))
+	    if (curr->begin_entry != NULL || skip_loc_list_entry (curr))
               continue;
 
             curr->begin_entry
 	      = add_addr_table_entry (xstrdup (curr->begin), ate_kind_label);
+	    if (dwarf_version >= 5 && !HAVE_AS_LEB128)
+	      curr->end_entry
+		= add_addr_table_entry (xstrdup (curr->end), ate_kind_label);
           }
       }