Message ID | 20161212192839.GG3541@tucnak.redhat.com |
---|---|
State | New |
Headers | show |
On Mon, 12 Dec 2016, Jakub Jelinek wrote: > Hi! > > On the following testcase (where I've failed to reduce it without the > header, got to over 70KB with both delta and creduce) we end up with huge > RTL expressions in debug insns that just slow down combiner as well as > var-tracking etc. too much. > Generally, at GIMPLE level we try to keep the debug stmts using debug > temporaries very simple, then during expansion we have a function that uses > debug temporaries in case TER creates larger expressions, and finally during > var-tracking we also punt on too large expressions. But valtrack APIs, > used e.g. by the combiner, when substituting lots of pseudos for expressions > that e.g. contain each two other pseudos, is able to create many KB sized > expressions. > > The following patch just throws away expressions that have 32 register > references in there, I think that is so huge that it will be really very > unlikely to be beneficial in the debug info and var-tracking would likely > throw it away anyway. Or shall I use some --param instead of the constant > (any suggestions on how to call it)? Another option (with or without param) > is to create debug temporaries in those cases and split the expression into > smaller expressions. Not sure if it is worth it though. Hmm, so the "easier" fix would be to always create a debug insn right after INSN setting a new debug reg to SRC and then doing the replacement with the new debug reg? With the "optimization" to not do that for some single use and/or "simple" SRC cases (I think "simple" SRC would be sth with a single/zero register reference). You are fixing it at the wrong end IMHO, by throwing away the thing after propagating (and doing all the work and doing the check on each individual propagation result). Richard. > Bootstrapped/regtested on x86_64-linux and i686-linux. > > 2016-12-12 Jakub Jelinek <jakub@redhat.com> > > PR debug/77844 > * valtrack.c: Include rtl-iter.h. > (propagate_for_debug): Reset debug insns which after simplification > contain more than 32 regs. > > * g++.dg/opt/pr77844.C: New test. > > --- gcc/valtrack.c.jj 2016-12-02 16:41:05.000000000 +0100 > +++ gcc/valtrack.c 2016-12-12 15:12:39.637451127 +0100 > @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. > #include "regs.h" > #include "memmodel.h" > #include "emit-rtl.h" > +#include "rtl-iter.h" > > /* gen_lowpart_no_emit hook implementation for DEBUG_INSNs. In DEBUG_INSNs, > all lowpart SUBREGs are valid, despite what the machine requires for > @@ -197,6 +198,16 @@ propagate_for_debug (rtx_insn *insn, rtx > dest, propagate_for_debug_subst, &p); > if (loc == INSN_VAR_LOCATION_LOC (insn)) > continue; > + /* Avoid propagation from growing DEBUG_INSN expressions > + too much. */ > + int cnt = 0; > + subrtx_iterator::array_type array; > + FOR_EACH_SUBRTX (iter, array, loc, ALL) > + if (REG_P (*iter) && ++cnt > 32) > + { > + loc = gen_rtx_UNKNOWN_VAR_LOC (); > + break; > + } > INSN_VAR_LOCATION_LOC (insn) = loc; > df_insn_rescan (insn); > } > --- gcc/testsuite/g++.dg/opt/pr77844.C.jj 2016-12-12 17:28:25.146711803 +0100 > +++ gcc/testsuite/g++.dg/opt/pr77844.C 2016-12-12 17:27:43.000000000 +0100 > @@ -0,0 +1,32 @@ > +// PR debug/77844 > +// { dg-do compile } > +// { dg-options "-O3 -g" } > + > +#include <vector> > + > +void > +foo (std::vector<bool>& v, int i, int j) > +{ > + for (int k = 0; k < 5; ++k) > + v[5 * i + k] = v[5 * i + k] | v[5 * j + k]; > +} > + > +void > +bar (std::vector<bool>& v, int i, int j) > +{ > + for (int k = 0; k < 5; ++k) > + v[5 * i + k] = v[5 * i + k] ^ v[5 * j + k]; > +} > + > +void > +baz (std::vector<bool>& v) > +{ > + foo (v, 4, 1); > + foo (v, 4, 2); > + foo (v, 4, 3); > + foo (v, 4, 4); > + bar (v, 4, 1); > + bar (v, 4, 2); > + bar (v, 4, 3); > + bar (v, 4, 4); > +} > > Jakub > >
On Tue, Dec 13, 2016 at 10:05:58AM +0100, Richard Biener wrote: > > The following patch just throws away expressions that have 32 register > > references in there, I think that is so huge that it will be really very > > unlikely to be beneficial in the debug info and var-tracking would likely > > throw it away anyway. Or shall I use some --param instead of the constant > > (any suggestions on how to call it)? Another option (with or without param) > > is to create debug temporaries in those cases and split the expression into > > smaller expressions. Not sure if it is worth it though. > > Hmm, so the "easier" fix would be to always create a debug insn > right after INSN setting a new debug reg to SRC and then doing the > replacement with the new debug reg? With the "optimization" Well, that would IMHO kill the simplification effect of the combiner for debug insn, nothing would be simplified in debug insns anymore. But perhaps we can try to simplify and after simplification using some metrics compare if the new expression is more complex than before or not. If it is more complex, then (if not emitted already) emit the extra debug insn with a debug temp (next to the insn being removed, aka insn - first argument to propage_for_debug) and replace again from DEST to the debug temp. so it means we will often create uselessly large debug info. > The to not do that for some single use and/or "simple" SRC cases > (I think "simple" SRC would be sth with a single/zero register reference). But simplify-rtx will often do something even for the more complex SRC cases. > You are fixing it at the wrong end IMHO, by throwing away the > thing after propagating (and doing all the work and doing the > check on each individual propagation result). Jakub
--- gcc/valtrack.c.jj 2016-12-02 16:41:05.000000000 +0100 +++ gcc/valtrack.c 2016-12-12 15:12:39.637451127 +0100 @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. #include "regs.h" #include "memmodel.h" #include "emit-rtl.h" +#include "rtl-iter.h" /* gen_lowpart_no_emit hook implementation for DEBUG_INSNs. In DEBUG_INSNs, all lowpart SUBREGs are valid, despite what the machine requires for @@ -197,6 +198,16 @@ propagate_for_debug (rtx_insn *insn, rtx dest, propagate_for_debug_subst, &p); if (loc == INSN_VAR_LOCATION_LOC (insn)) continue; + /* Avoid propagation from growing DEBUG_INSN expressions + too much. */ + int cnt = 0; + subrtx_iterator::array_type array; + FOR_EACH_SUBRTX (iter, array, loc, ALL) + if (REG_P (*iter) && ++cnt > 32) + { + loc = gen_rtx_UNKNOWN_VAR_LOC (); + break; + } INSN_VAR_LOCATION_LOC (insn) = loc; df_insn_rescan (insn); } --- gcc/testsuite/g++.dg/opt/pr77844.C.jj 2016-12-12 17:28:25.146711803 +0100 +++ gcc/testsuite/g++.dg/opt/pr77844.C 2016-12-12 17:27:43.000000000 +0100 @@ -0,0 +1,32 @@ +// PR debug/77844 +// { dg-do compile } +// { dg-options "-O3 -g" } + +#include <vector> + +void +foo (std::vector<bool>& v, int i, int j) +{ + for (int k = 0; k < 5; ++k) + v[5 * i + k] = v[5 * i + k] | v[5 * j + k]; +} + +void +bar (std::vector<bool>& v, int i, int j) +{ + for (int k = 0; k < 5; ++k) + v[5 * i + k] = v[5 * i + k] ^ v[5 * j + k]; +} + +void +baz (std::vector<bool>& v) +{ + foo (v, 4, 1); + foo (v, 4, 2); + foo (v, 4, 3); + foo (v, 4, 4); + bar (v, 4, 1); + bar (v, 4, 2); + bar (v, 4, 3); + bar (v, 4, 4); +}