diff mbox

Avoid creating unbounded complexity debug insns in valtrack (PR debug/77844)

Message ID 20161212192839.GG3541@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek Dec. 12, 2016, 7:28 p.m. UTC
Hi!

On the following testcase (where I've failed to reduce it without the
header, got to over 70KB with both delta and creduce) we end up with huge
RTL expressions in debug insns that just slow down combiner as well as
var-tracking etc. too much.
Generally, at GIMPLE level we try to keep the debug stmts using debug
temporaries very simple, then during expansion we have a function that uses
debug temporaries in case TER creates larger expressions, and finally during
var-tracking we also punt on too large expressions.  But valtrack APIs,
used e.g. by the combiner, when substituting lots of pseudos for expressions
that e.g. contain each two other pseudos, is able to create many KB sized
expressions.

The following patch just throws away expressions that have 32 register
references in there, I think that is so huge that it will be really very
unlikely to be beneficial in the debug info and var-tracking would likely
throw it away anyway.  Or shall I use some --param instead of the constant
(any suggestions on how to call it)?  Another option (with or without param)
is to create debug temporaries in those cases and split the expression into
smaller expressions.  Not sure if it is worth it though.

Bootstrapped/regtested on x86_64-linux and i686-linux.

2016-12-12  Jakub Jelinek  <jakub@redhat.com>

	PR debug/77844
	* valtrack.c: Include rtl-iter.h.
	(propagate_for_debug): Reset debug insns which after simplification
	contain more than 32 regs.

	* g++.dg/opt/pr77844.C: New test.


	Jakub

Comments

Richard Biener Dec. 13, 2016, 9:05 a.m. UTC | #1
On Mon, 12 Dec 2016, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase (where I've failed to reduce it without the
> header, got to over 70KB with both delta and creduce) we end up with huge
> RTL expressions in debug insns that just slow down combiner as well as
> var-tracking etc. too much.
> Generally, at GIMPLE level we try to keep the debug stmts using debug
> temporaries very simple, then during expansion we have a function that uses
> debug temporaries in case TER creates larger expressions, and finally during
> var-tracking we also punt on too large expressions.  But valtrack APIs,
> used e.g. by the combiner, when substituting lots of pseudos for expressions
> that e.g. contain each two other pseudos, is able to create many KB sized
> expressions.
> 
> The following patch just throws away expressions that have 32 register
> references in there, I think that is so huge that it will be really very
> unlikely to be beneficial in the debug info and var-tracking would likely
> throw it away anyway.  Or shall I use some --param instead of the constant
> (any suggestions on how to call it)?  Another option (with or without param)
> is to create debug temporaries in those cases and split the expression into
> smaller expressions.  Not sure if it is worth it though.

Hmm, so the "easier" fix would be to always create a debug insn
right after INSN setting a new debug reg to SRC and then doing the
replacement with the new debug reg?  With the "optimization"
to not do that for some single use and/or "simple" SRC cases
(I think "simple" SRC would be sth with a single/zero register reference).

You are fixing it at the wrong end IMHO, by throwing away the
thing after propagating (and doing all the work and doing the
check on each individual propagation result).

Richard.

> Bootstrapped/regtested on x86_64-linux and i686-linux.
> 
> 2016-12-12  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR debug/77844
> 	* valtrack.c: Include rtl-iter.h.
> 	(propagate_for_debug): Reset debug insns which after simplification
> 	contain more than 32 regs.
> 
> 	* g++.dg/opt/pr77844.C: New test.
> 
> --- gcc/valtrack.c.jj	2016-12-02 16:41:05.000000000 +0100
> +++ gcc/valtrack.c	2016-12-12 15:12:39.637451127 +0100
> @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
>  #include "regs.h"
>  #include "memmodel.h"
>  #include "emit-rtl.h"
> +#include "rtl-iter.h"
>  
>  /* gen_lowpart_no_emit hook implementation for DEBUG_INSNs.  In DEBUG_INSNs,
>     all lowpart SUBREGs are valid, despite what the machine requires for
> @@ -197,6 +198,16 @@ propagate_for_debug (rtx_insn *insn, rtx
>  					 dest, propagate_for_debug_subst, &p);
>  	  if (loc == INSN_VAR_LOCATION_LOC (insn))
>  	    continue;
> +	  /* Avoid propagation from growing DEBUG_INSN expressions
> +	     too much.  */
> +	  int cnt = 0;
> +	  subrtx_iterator::array_type array;
> +	  FOR_EACH_SUBRTX (iter, array, loc, ALL)
> +	    if (REG_P (*iter) && ++cnt > 32)
> +	      {
> +		loc = gen_rtx_UNKNOWN_VAR_LOC ();
> +		break;
> +	      }
>  	  INSN_VAR_LOCATION_LOC (insn) = loc;
>  	  df_insn_rescan (insn);
>  	}
> --- gcc/testsuite/g++.dg/opt/pr77844.C.jj	2016-12-12 17:28:25.146711803 +0100
> +++ gcc/testsuite/g++.dg/opt/pr77844.C	2016-12-12 17:27:43.000000000 +0100
> @@ -0,0 +1,32 @@
> +// PR debug/77844
> +// { dg-do compile }
> +// { dg-options "-O3 -g" }
> +
> +#include <vector>
> +
> +void
> +foo (std::vector<bool>& v, int i, int j)
> +{
> +  for (int k = 0; k < 5; ++k)
> +    v[5 * i + k] = v[5 * i + k] | v[5 * j + k];
> +}
> +
> +void
> +bar (std::vector<bool>& v, int i, int j)
> +{
> +  for (int k = 0; k < 5; ++k)
> +    v[5 * i + k] = v[5 * i + k] ^ v[5 * j + k];
> +}
> +
> +void
> +baz (std::vector<bool>& v)
> +{
> +  foo (v, 4, 1);
> +  foo (v, 4, 2);
> +  foo (v, 4, 3);
> +  foo (v, 4, 4);
> +  bar (v, 4, 1);
> +  bar (v, 4, 2);
> +  bar (v, 4, 3);
> +  bar (v, 4, 4);
> +}
> 
> 	Jakub
> 
>
Jakub Jelinek Dec. 13, 2016, 9:34 a.m. UTC | #2
On Tue, Dec 13, 2016 at 10:05:58AM +0100, Richard Biener wrote:
> > The following patch just throws away expressions that have 32 register
> > references in there, I think that is so huge that it will be really very
> > unlikely to be beneficial in the debug info and var-tracking would likely
> > throw it away anyway.  Or shall I use some --param instead of the constant
> > (any suggestions on how to call it)?  Another option (with or without param)
> > is to create debug temporaries in those cases and split the expression into
> > smaller expressions.  Not sure if it is worth it though.
> 
> Hmm, so the "easier" fix would be to always create a debug insn
> right after INSN setting a new debug reg to SRC and then doing the
> replacement with the new debug reg?  With the "optimization"

Well, that would IMHO kill the simplification effect of the combiner
for debug insn, nothing would be simplified in debug insns anymore.
But perhaps we can try to simplify and after simplification using
some metrics compare if the new expression is more complex than before or
not.  If it is more complex, then (if not emitted already) emit the extra
debug insn with a debug temp (next to the insn being removed, aka insn - 
first argument to propage_for_debug) and replace again from DEST to the debug
temp.

so it means we will often create uselessly large debug info.
> The  to not do that for some single use and/or "simple" SRC cases
> (I think "simple" SRC would be sth with a single/zero register reference).

But simplify-rtx will often do something even for the more complex SRC
cases.

> You are fixing it at the wrong end IMHO, by throwing away the
> thing after propagating (and doing all the work and doing the
> check on each individual propagation result).

	Jakub
diff mbox

Patch

--- gcc/valtrack.c.jj	2016-12-02 16:41:05.000000000 +0100
+++ gcc/valtrack.c	2016-12-12 15:12:39.637451127 +0100
@@ -29,6 +29,7 @@  along with GCC; see the file COPYING3.
 #include "regs.h"
 #include "memmodel.h"
 #include "emit-rtl.h"
+#include "rtl-iter.h"
 
 /* gen_lowpart_no_emit hook implementation for DEBUG_INSNs.  In DEBUG_INSNs,
    all lowpart SUBREGs are valid, despite what the machine requires for
@@ -197,6 +198,16 @@  propagate_for_debug (rtx_insn *insn, rtx
 					 dest, propagate_for_debug_subst, &p);
 	  if (loc == INSN_VAR_LOCATION_LOC (insn))
 	    continue;
+	  /* Avoid propagation from growing DEBUG_INSN expressions
+	     too much.  */
+	  int cnt = 0;
+	  subrtx_iterator::array_type array;
+	  FOR_EACH_SUBRTX (iter, array, loc, ALL)
+	    if (REG_P (*iter) && ++cnt > 32)
+	      {
+		loc = gen_rtx_UNKNOWN_VAR_LOC ();
+		break;
+	      }
 	  INSN_VAR_LOCATION_LOC (insn) = loc;
 	  df_insn_rescan (insn);
 	}
--- gcc/testsuite/g++.dg/opt/pr77844.C.jj	2016-12-12 17:28:25.146711803 +0100
+++ gcc/testsuite/g++.dg/opt/pr77844.C	2016-12-12 17:27:43.000000000 +0100
@@ -0,0 +1,32 @@ 
+// PR debug/77844
+// { dg-do compile }
+// { dg-options "-O3 -g" }
+
+#include <vector>
+
+void
+foo (std::vector<bool>& v, int i, int j)
+{
+  for (int k = 0; k < 5; ++k)
+    v[5 * i + k] = v[5 * i + k] | v[5 * j + k];
+}
+
+void
+bar (std::vector<bool>& v, int i, int j)
+{
+  for (int k = 0; k < 5; ++k)
+    v[5 * i + k] = v[5 * i + k] ^ v[5 * j + k];
+}
+
+void
+baz (std::vector<bool>& v)
+{
+  foo (v, 4, 1);
+  foo (v, 4, 2);
+  foo (v, 4, 3);
+  foo (v, 4, 4);
+  bar (v, 4, 1);
+  bar (v, 4, 2);
+  bar (v, 4, 3);
+  bar (v, 4, 4);
+}