diff mbox

Avoiding some garbage rtl from instantiate_virtual_regs

Message ID 874n0q155r.fsf@talisman.default
State New
Headers show

Commit Message

Richard Sandiford May 16, 2014, 8:47 a.m. UTC
It seems that in a typical -O0 compile the amount of rtl that
starts out as needed but becomes garbage is only slightly less than
half of the total amount created.  One of the big offenders is the
vregs pass, which creates new PLUSes when instanstiating a virtual
register + a constant and which creates new MEMs when instantiating
an address involving a virtual register.  This happens a lot in -O0
code because all variables live on the stack.

The instantiation walk is fundamentally in-place: every other part
of the pattern is modified without copying.  And rtl sharing rules
guarantee that we can do the same for PLUSes of registers and MEMs.

The patch does this by adding "inplace" arguments to plus_constant and
replace_equiv_address.  In a -O0 compile of an oldish fold-const.ii
(where no GC takes place) it reduces the amount of used GC memory
from 169M to 166M.  The average max RSS goes down by just over 1%.
Compile time seems to decrease slightly, but probably in the noise range.

There might be other callers than can use the new interfaces too.

Tested on x86_64-linux-gnu.  Also tested by comparing the asm output
for various parts of the testsuite before and after the patch.
The only changes were that some "sym+0"s becamse plain "syms"
(i.e. (plus X (const_int 0)) became X) because of the plus_constant
change.

OK to install?

Thanks,
Richard


gcc/
	* emit-rtl.h (replace_equiv_address, replace_equiv_address_nv): Add an
	inplace argument.  Store the new address in the original MEM when true.
	* emit-rtl.c (change_address_1): Likewise.
	(adjust_address_1, adjust_automodify_address_1, offset_address):
	Update accordingly.
	* rtl.h (plus_constant): Add an inplace argument.
	* explow.c (plus_constant): Likewise.  Try to reuse the original PLUS
	when true.  Avoid generating (plus X (const_int 0)).
	* function.c (instantiate_virtual_regs_in_rtx): Adjust the PLUS
	in-place.  Pass true to plus_constant.
	(instantiate_virtual_regs_in_insn): Pass true to replace_equiv_address.

Comments

Jeff Law May 16, 2014, 6:09 p.m. UTC | #1
On 05/16/14 02:47, Richard Sandiford wrote:
> It seems that in a typical -O0 compile the amount of rtl that
> starts out as needed but becomes garbage is only slightly less than
> half of the total amount created.  One of the big offenders is the
> vregs pass, which creates new PLUSes when instanstiating a virtual
> register + a constant and which creates new MEMs when instantiating
> an address involving a virtual register.  This happens a lot in -O0
> code because all variables live on the stack.
>
> The instantiation walk is fundamentally in-place: every other part
> of the pattern is modified without copying.  And rtl sharing rules
> guarantee that we can do the same for PLUSes of registers and MEMs.
>
> The patch does this by adding "inplace" arguments to plus_constant and
> replace_equiv_address.  In a -O0 compile of an oldish fold-const.ii
> (where no GC takes place) it reduces the amount of used GC memory
> from 169M to 166M.  The average max RSS goes down by just over 1%.
> Compile time seems to decrease slightly, but probably in the noise range.
>
> There might be other callers than can use the new interfaces too.
>
> Tested on x86_64-linux-gnu.  Also tested by comparing the asm output
> for various parts of the testsuite before and after the patch.
> The only changes were that some "sym+0"s becamse plain "syms"
> (i.e. (plus X (const_int 0)) became X) because of the plus_constant
> change.
>
> OK to install?
>
> Thanks,
> Richard
>
>
> gcc/
> 	* emit-rtl.h (replace_equiv_address, replace_equiv_address_nv): Add an
> 	inplace argument.  Store the new address in the original MEM when true.
> 	* emit-rtl.c (change_address_1): Likewise.
> 	(adjust_address_1, adjust_automodify_address_1, offset_address):
> 	Update accordingly.
> 	* rtl.h (plus_constant): Add an inplace argument.
> 	* explow.c (plus_constant): Likewise.  Try to reuse the original PLUS
> 	when true.  Avoid generating (plus X (const_int 0)).
> 	* function.c (instantiate_virtual_regs_in_rtx): Adjust the PLUS
> 	in-place.  Pass true to plus_constant.
> 	(instantiate_virtual_regs_in_insn): Pass true to replace_equiv_address.
>
> Index: gcc/emit-rtl.h
> ===================================================================
> --- gcc/emit-rtl.h	2014-05-15 11:27:06.000259353 +0100
> +++ gcc/emit-rtl.h	2014-05-16 09:11:42.479556294 +0100
> @@ -52,10 +52,10 @@ extern tree get_spill_slot_decl (bool);
>      ADDR.  The caller is asserting that the actual piece of memory pointed
>      to is the same, just the form of the address is being changed, such as
>      by putting something into a register.  */
> -extern rtx replace_equiv_address (rtx, rtx);
> +extern rtx replace_equiv_address (rtx, rtx, bool = false);
>
>   /* Likewise, but the reference is not required to be valid.  */
> -extern rtx replace_equiv_address_nv (rtx, rtx);
> +extern rtx replace_equiv_address_nv (rtx, rtx, bool = false);
Presumably the default value for the inplace argument is to avoid having 
to fixup all the call sites.

I guess that's OK.  Clearly it's a safe default and avoids a fair amount 
of unnecessary churn.


OK for the trunk.

THanks,
Jeff
Richard Sandiford May 17, 2014, 7:04 a.m. UTC | #2
Jeff Law <law@redhat.com> writes:
>> Index: gcc/emit-rtl.h
>> ===================================================================
>> --- gcc/emit-rtl.h	2014-05-15 11:27:06.000259353 +0100
>> +++ gcc/emit-rtl.h	2014-05-16 09:11:42.479556294 +0100
>> @@ -52,10 +52,10 @@ extern tree get_spill_slot_decl (bool);
>>      ADDR.  The caller is asserting that the actual piece of memory pointed
>>      to is the same, just the form of the address is being changed, such as
>>      by putting something into a register.  */
>> -extern rtx replace_equiv_address (rtx, rtx);
>> +extern rtx replace_equiv_address (rtx, rtx, bool = false);
>>
>>   /* Likewise, but the reference is not required to be valid.  */
>> -extern rtx replace_equiv_address_nv (rtx, rtx);
>> +extern rtx replace_equiv_address_nv (rtx, rtx, bool = false);
> Presumably the default value for the inplace argument is to avoid having 
> to fixup all the call sites.

Yeah, partly.

> I guess that's OK.  Clearly it's a safe default and avoids a fair amount 
> of unnecessary churn.

I suppose the argument against default arguments is that having mandatory
arguments forces every caller to think about whether to allow in-place
modification.  But like you say, in practice there are so many callers
to things like plus_constant that I'd have to conservative and add "false"
to calls where "true" would have been OK.  And then people might think
there's a specific reason why it has to be false.

> OK for the trunk.

Applied, thanks.

Richard
diff mbox

Patch

Index: gcc/emit-rtl.h
===================================================================
--- gcc/emit-rtl.h	2014-05-15 11:27:06.000259353 +0100
+++ gcc/emit-rtl.h	2014-05-16 09:11:42.479556294 +0100
@@ -52,10 +52,10 @@  extern tree get_spill_slot_decl (bool);
    ADDR.  The caller is asserting that the actual piece of memory pointed
    to is the same, just the form of the address is being changed, such as
    by putting something into a register.  */
-extern rtx replace_equiv_address (rtx, rtx);
+extern rtx replace_equiv_address (rtx, rtx, bool = false);
 
 /* Likewise, but the reference is not required to be valid.  */
-extern rtx replace_equiv_address_nv (rtx, rtx);
+extern rtx replace_equiv_address_nv (rtx, rtx, bool = false);
 
 extern rtx gen_blockage (void);
 extern rtvec gen_rtvec (int, ...);
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c	2014-05-16 09:09:18.446271662 +0100
+++ gcc/emit-rtl.c	2014-05-16 09:40:25.285714457 +0100
@@ -145,7 +145,6 @@  #define cur_insn_uid (crtl->emit.x_cur_i
 #define cur_debug_insn_uid (crtl->emit.x_cur_debug_insn_uid)
 #define first_label_num (crtl->emit.x_first_label_num)
 
-static rtx change_address_1 (rtx, enum machine_mode, rtx, int);
 static void set_used_decls (tree);
 static void mark_label_nuses (rtx);
 static hashval_t const_int_htab_hash (const void *);
@@ -2010,11 +2009,15 @@  clear_mem_size (rtx mem)
 /* Return a memory reference like MEMREF, but with its mode changed to MODE
    and its address changed to ADDR.  (VOIDmode means don't change the mode.
    NULL for ADDR means don't change the address.)  VALIDATE is nonzero if the
-   returned memory location is required to be valid.  The memory
-   attributes are not changed.  */
+   returned memory location is required to be valid.  INPLACE is true if any
+   changes can be made directly to MEMREF or false if MEMREF must be treated
+   as immutable.
+
+   The memory attributes are not changed.  */
 
 static rtx
-change_address_1 (rtx memref, enum machine_mode mode, rtx addr, int validate)
+change_address_1 (rtx memref, enum machine_mode mode, rtx addr, int validate,
+		  bool inplace)
 {
   addr_space_t as;
   rtx new_rtx;
@@ -2042,6 +2045,12 @@  change_address_1 (rtx memref, enum machi
   if (rtx_equal_p (addr, XEXP (memref, 0)) && mode == GET_MODE (memref))
     return memref;
 
+  if (inplace)
+    {
+      XEXP (memref, 0) = addr;
+      return memref;
+    }
+
   new_rtx = gen_rtx_MEM (mode, addr);
   MEM_COPY_ATTRIBUTES (new_rtx, memref);
   return new_rtx;
@@ -2053,7 +2062,7 @@  change_address_1 (rtx memref, enum machi
 rtx
 change_address (rtx memref, enum machine_mode mode, rtx addr)
 {
-  rtx new_rtx = change_address_1 (memref, mode, addr, 1);
+  rtx new_rtx = change_address_1 (memref, mode, addr, 1, false);
   enum machine_mode mmode = GET_MODE (new_rtx);
   struct mem_attrs attrs, *defattrs;
 
@@ -2166,7 +2175,7 @@  adjust_address_1 (rtx memref, enum machi
 	addr = plus_constant (address_mode, addr, offset);
     }
 
-  new_rtx = change_address_1 (memref, mode, addr, validate);
+  new_rtx = change_address_1 (memref, mode, addr, validate, false);
 
   /* If the address is a REG, change_address_1 rightfully returns memref,
      but this would destroy memref's MEM_ATTRS.  */
@@ -2236,7 +2245,7 @@  adjust_address_1 (rtx memref, enum machi
 adjust_automodify_address_1 (rtx memref, enum machine_mode mode, rtx addr,
 			     HOST_WIDE_INT offset, int validate)
 {
-  memref = change_address_1 (memref, VOIDmode, addr, validate);
+  memref = change_address_1 (memref, VOIDmode, addr, validate, false);
   return adjust_address_1 (memref, mode, offset, validate, 0, 0, 0);
 }
 
@@ -2272,7 +2281,7 @@  offset_address (rtx memref, rtx offset,
     }
 
   update_temp_slot_address (XEXP (memref, 0), new_rtx);
-  new_rtx = change_address_1 (memref, VOIDmode, new_rtx, 1);
+  new_rtx = change_address_1 (memref, VOIDmode, new_rtx, 1, false);
 
   /* If there are no changes, just return the original memory reference.  */
   if (new_rtx == memref)
@@ -2292,23 +2301,25 @@  offset_address (rtx memref, rtx offset,
 /* Return a memory reference like MEMREF, but with its address changed to
    ADDR.  The caller is asserting that the actual piece of memory pointed
    to is the same, just the form of the address is being changed, such as
-   by putting something into a register.  */
+   by putting something into a register.  INPLACE is true if any changes
+   can be made directly to MEMREF or false if MEMREF must be treated as
+   immutable.  */
 
 rtx
-replace_equiv_address (rtx memref, rtx addr)
+replace_equiv_address (rtx memref, rtx addr, bool inplace)
 {
   /* change_address_1 copies the memory attribute structure without change
      and that's exactly what we want here.  */
   update_temp_slot_address (XEXP (memref, 0), addr);
-  return change_address_1 (memref, VOIDmode, addr, 1);
+  return change_address_1 (memref, VOIDmode, addr, 1, inplace);
 }
 
 /* Likewise, but the reference is not required to be valid.  */
 
 rtx
-replace_equiv_address_nv (rtx memref, rtx addr)
+replace_equiv_address_nv (rtx memref, rtx addr, bool inplace)
 {
-  return change_address_1 (memref, VOIDmode, addr, 0);
+  return change_address_1 (memref, VOIDmode, addr, 0, inplace);
 }
 
 /* Return a memory reference like MEMREF, but with its mode widened to
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	2014-05-16 09:09:18.446271662 +0100
+++ gcc/rtl.h	2014-05-16 09:36:52.309554940 +0100
@@ -1952,7 +1952,7 @@  #define USE_STORE_PRE_DECREMENT(MODE)
 
 /* In explow.c */
 extern HOST_WIDE_INT trunc_int_for_mode	(HOST_WIDE_INT, enum machine_mode);
-extern rtx plus_constant (enum machine_mode, rtx, HOST_WIDE_INT);
+extern rtx plus_constant (enum machine_mode, rtx, HOST_WIDE_INT, bool = false);
 
 /* In rtl.c */
 extern rtx rtx_alloc_stat (RTX_CODE MEM_STAT_DECL);
Index: gcc/explow.c
===================================================================
--- gcc/explow.c	2014-05-15 11:27:06.000259353 +0100
+++ gcc/explow.c	2014-05-16 09:42:08.459715080 +0100
@@ -74,10 +74,12 @@  trunc_int_for_mode (HOST_WIDE_INT c, enu
 }
 
 /* Return an rtx for the sum of X and the integer C, given that X has
-   mode MODE.  */
+   mode MODE.  INPLACE is true if X can be modified inplace or false
+   if it must be treated as immutable.  */
 
 rtx
-plus_constant (enum machine_mode mode, rtx x, HOST_WIDE_INT c)
+plus_constant (enum machine_mode mode, rtx x, HOST_WIDE_INT c,
+	       bool inplace)
 {
   RTX_CODE code;
   rtx y;
@@ -116,6 +118,8 @@  plus_constant (enum machine_mode mode, r
     case CONST:
       /* If adding to something entirely constant, set a flag
 	 so that we can add a CONST around the result.  */
+      if (inplace && shared_const_p (x))
+	inplace = false;
       x = XEXP (x, 0);
       all_constant = 1;
       goto restart;
@@ -136,19 +140,25 @@  plus_constant (enum machine_mode mode, r
 
       if (CONSTANT_P (XEXP (x, 1)))
 	{
-	  x = gen_rtx_PLUS (mode, XEXP (x, 0),
-			    plus_constant (mode, XEXP (x, 1), c));
+	  rtx term = plus_constant (mode, XEXP (x, 1), c, inplace);
+	  if (term == const0_rtx)
+	    x = XEXP (x, 0);
+	  else if (inplace)
+	    XEXP (x, 1) = term;
+	  else
+	    x = gen_rtx_PLUS (mode, XEXP (x, 0), term);
 	  c = 0;
 	}
-      else if (find_constant_term_loc (&y))
+      else if (rtx *const_loc = find_constant_term_loc (&y))
 	{
-	  /* We need to be careful since X may be shared and we can't
-	     modify it in place.  */
-	  rtx copy = copy_rtx (x);
-	  rtx *const_loc = find_constant_term_loc (&copy);
-
+	  if (!inplace)
+	    {
+	      /* We need to be careful since X may be shared and we can't
+		 modify it in place.  */
+	      x = copy_rtx (x);
+	      const_loc = find_constant_term_loc (&x);
+	    }
 	  *const_loc = plus_constant (mode, *const_loc, c);
-	  x = copy;
 	  c = 0;
 	}
       break;
Index: gcc/function.c
===================================================================
--- gcc/function.c	2014-05-15 11:27:06.000259353 +0100
+++ gcc/function.c	2014-05-16 09:11:42.480556303 +0100
@@ -1459,8 +1459,8 @@  instantiate_virtual_regs_in_rtx (rtx *lo
       new_rtx = instantiate_new_reg (XEXP (x, 0), &offset);
       if (new_rtx)
 	{
-	  new_rtx = plus_constant (GET_MODE (x), new_rtx, offset);
-	  *loc = simplify_gen_binary (PLUS, GET_MODE (x), new_rtx, XEXP (x, 1));
+	  XEXP (x, 0) = new_rtx;
+	  *loc = plus_constant (GET_MODE (x), x, offset, true);
 	  if (changed)
 	    *changed = true;
 	  return -1;
@@ -1622,7 +1622,7 @@  instantiate_virtual_regs_in_insn (rtx in
 	      continue;
 
 	    start_sequence ();
-	    x = replace_equiv_address (x, addr);
+	    x = replace_equiv_address (x, addr, true);
 	    /* It may happen that the address with the virtual reg
 	       was valid (e.g. based on the virtual stack reg, which might
 	       be acceptable to the predicates with all offsets), whereas
@@ -1635,7 +1635,7 @@  instantiate_virtual_regs_in_insn (rtx in
 	    if (!safe_insn_predicate (insn_code, i, x))
 	      {
 		addr = force_reg (GET_MODE (addr), addr);
-		x = replace_equiv_address (x, addr);
+		x = replace_equiv_address (x, addr, true);
 	      }
 	    seq = get_insns ();
 	    end_sequence ();