diff mbox

[PR58479] introduce a param to limit debug stmts count

Message ID or7g7w18o3.fsf@livre.home
State New
Headers show

Commit Message

Alexandre Oliva March 15, 2014, 2:45 a.m. UTC
This bug report had various testcases that had to do with full loop
unrolling with non-automatic iterators and fixed boundaries, which
resulted in duplicating debug stmts in the loop for each iteration.  In
some cases, the resulting executable code is none, but the debug stmts
add up to millions.  Just dropping them on the floor is somewhat
undesirable, even though they're not usable for much with today's
compiler and debugger infrastructure.  I decided to introduce a param to
limit debug stmts, so that this sort of testcase doesn't run nearly
forever, eating up all memory while at that.  This is what this patchset
does.

The first patch introduces the parameter (defaulting to no limit) and
the infrastructure around it.  I decided to only detect that the limit
was exceeded and clean up existing debug stmts at pass boundaries, so
that no pass would be surprised by a failure to create a debug stmt
where it used to be possible to do so.

The second patch addresses the actual problem in some passes,
particularly in the code that copies basic blocks, used by the loop
unroller, by testing regularly whether building more debug stmts is
desirable.  They will still collect debug stmts only at the end of the
pass, since there's not much reason to rush to do that.

The third patch enables the limit by default.  I played a bit with lower
values, and on i686 and x86_64, I could build all of stage3, with all
languages enabled, without any debug info degradation, with the limit
set as low as 25000, so 1 million is a very conservative limit.
fold-const, the largest debug stmt builder, builds a total of 24382
debug stmts, and with the limit at 10000, fold_binary_loc is still the
only function that VTA disabled because the limit is reached.  At 4000,
there are only some 20 functions in the entire stage3 (host and target
libs) that get degradation.  fold-const.c:fold_comparison and
tree-browser.c:browse_tree got my attention because they built a few
hundred debug stmts more than the limit set, indicating there are other
passes that may still benefit from BUILD_DEBUG_STMTS_P testing, but I
haven't investigated any further.

Regstrapped incrementally (first patch alone, first and second, and all
3) on i686- and x86_64-linux-gnu.  Ok to install?

Comments

Mike Stump March 15, 2014, 4:09 a.m. UTC | #1
On Mar 14, 2014, at 7:45 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> In some cases, the resulting executable code is none, but the debug stmts
> add up to millions.

I’d like to think there is a better theoretic answer to the specific problem…  trimming random debug info I think just invites a bad experience where people want to know what is going on and to them it just feels like a bad compiler that just randomly messed up debug info.  A user that wants faster compilation can refrain from using -g, or use -g1?

For example, if there truly is no code, removing all scopes that have no instruction between the start and the end along with all the debug info that goes with those scopes.  If there is one instruction, seems tome that it should be hard to have more than a few debug statements per instruction.  If there are more than 5, it would be curious to review each one and ask the question, is this useful and interesting?  I’d like to think there are entire classes of useless things that can be removed with no loss to the debug experience.
Richard Biener March 15, 2014, 9:57 a.m. UTC | #2
On Sat, Mar 15, 2014 at 5:09 AM, Mike Stump <mikestump@comcast.net> wrote:
> On Mar 14, 2014, at 7:45 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> In some cases, the resulting executable code is none, but the debug stmts
>> add up to millions.
>
> I'd like to think there is a better theoretic answer to the specific problem...  trimming random debug info I think just invites a bad experience where people want to know what is going on and to them it just feels like a bad compiler that just randomly messed up debug info.  A user that wants faster compilation can refrain from using -g, or use -g1?
>
> For example, if there truly is no code, removing all scopes that have no instruction between the start and the end along with all the debug info that goes with those scopes.  If there is one instruction, seems tome that it should be hard to have more than a few debug statements per instruction.  If there are more than 5, it would be curious to review each one and ask the question, is this useful and interesting?  I'd like to think there are entire classes of useless things that can be removed with no loss to the debug experience.

I agree, this doesn't seem to be a good solution (though the ability
to disable VTA per function looks good to me).  If we want to limit
sth then we should
limit the number of debug stmts inbetween two real stmts (ok, more
like the ratio of debug vs. real stmts).  But then the question is
which debug stmts do
we retain?  IMHO generating debug stmts in the first place for each
initializer in an unrolled

int a[10000];
for (;;)
  a[i] = 0;

is bad.  That is, I question the usefulness of the fancy debug stmts
we create from a dead

 a[12345] = 0;

stmt.  Can we add a -fextra-verbose-var-tracking-assignments for
those?  Or disable it for arrays?

Richard.
Jakub Jelinek March 18, 2014, 4:30 p.m. UTC | #3
On Fri, Mar 14, 2014 at 11:45:48PM -0300, Alexandre Oliva wrote:
> This bug report had various testcases that had to do with full loop
> unrolling with non-automatic iterators and fixed boundaries, which
> resulted in duplicating debug stmts in the loop for each iteration.  In
> some cases, the resulting executable code is none, but the debug stmts
> add up to millions.  Just dropping them on the floor is somewhat
> undesirable, even though they're not usable for much with today's
> compiler and debugger infrastructure.  I decided to introduce a param to
> limit debug stmts, so that this sort of testcase doesn't run nearly
> forever, eating up all memory while at that.  This is what this patchset
> does.

To some extent this is really a big hammer approach, on the other side we
already have a precedent here, the max-vartrack-size limit, and if we have
too many debug stmts in a single function, we also most likely hit the
max-vartrack-size limit anyway.

It would be nice if for the loop unrolling we could try to do something
smarter (as I wrote in the PR, I think it would be e.g. nice to preserve
debug stmts on the first few unrolled iterations and the last one and
just say the vars are all unavailable for the middle iterations or
something, instead of dropping everything).

> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -4498,6 +4498,8 @@ allocate_struct_function (tree fndecl, bool abstract_p)
>  
>    cfun = ggc_alloc_cleared_function ();
>  
> +  SET_BUILD_DEBUG_STMTS (cfun, flag_var_tracking_assignments);

Dunno how this plays together with __attribute__((optimize(...))),
I'm afraid not very well.  E.g. if in -O0 -g compilation some function is
__attribute__((optimize(2))) then we want to have debug stmts in there,
but the above would preclude it, on the other wise in -O2 -g compilation
with __attribute__((optimize(0))) function in it, we don't want debug stmts
in there.  So perhaps it needs to be updated when handling the optimize
attribute if the function doesn't have a body yet or something similar?

> @@ -121,6 +121,12 @@ gimple_alloc_stat (enum gimple_code code, unsigned num_ops MEM_STAT_DECL)
>    size_t size;
>    gimple stmt;
>  
> +  if (code == GIMPLE_DEBUG)
> +    {  
> +      gcc_checking_assert (MAY_HAVE_DEBUG_STMTS);
> +      cfun->debug_stmts++;
> +    }
> +
>    size = gimple_size (code);
>    if (num_ops > 0)
>      size += sizeof (tree) * (num_ops - 1);

I'd strongly prefer it you could move this hunk to
gimple_build_debug_bind_stat and gimple_build_debug_source_bind_stat,
yeah, it is duplication of it, but adding a branch for all GIMPLE allocation
doesn't look like a good idea to me.

	Jakub
diff mbox

Patch

Limit max debug stmts by default

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	PR debug/58479
	* params.def (PARAM_MAX_DEBUG_STMTS): Set default to 1,000,000.
	* doc/invoke.texi (max-vartrack-debug-stmts): Document it.
---
 gcc/doc/invoke.texi |    2 +-
 gcc/params.def      |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 3e26fc3..0e2d714 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9956,7 +9956,7 @@  Sets a maximum number of debug stmts to be built in a function before
 reached, additional debug stmts may be built, until a collection point
 is reached, where all debug stmts in the function are garbage collected
 and no further debug stmts can be built for the function.  The default
-is 0, that stands for no limit.
+is 1,000,000.  Setting it to zero makes it unlimited.
 
 @item min-nondebug-insn-uid
 Use uids starting at this parameter for nondebug insns.  The range below
diff --git a/gcc/params.def b/gcc/params.def
index 47d3900..398448c 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -915,7 +915,7 @@  DEFPARAM (PARAM_MAX_VARTRACK_REVERSE_OP_SIZE,
 DEFPARAM (PARAM_MAX_DEBUG_STMTS,
 	  "max-vartrack-debug-stmts",
 	  "Max. per-function number of debug stmts before VTA is disabled",
-	  0, 0, 0)
+	  1000000, 0, 0)
 
 /* Set minimum insn uid for non-debug insns.  */