Message ID | 20121114233407.GC12910@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
On Thu, Nov 15, 2012 at 12:34:07AM +0100, Jan Hubicka wrote: > * params.def (max-peeled-insns, max-completely-peeled-insns): Reduce to 100. Ok, thanks. > --- params.def (revision 193505) > +++ params.def (working copy) > @@ -290,7 +290,7 @@ DEFPARAM(PARAM_MAX_UNROLL_TIMES, > DEFPARAM(PARAM_MAX_PEELED_INSNS, > "max-peeled-insns", > "The maximum number of insns of a peeled loop", > - 400, 0, 0) > + 100, 0, 0) > /* The maximum number of peelings of a single loop. */ > DEFPARAM(PARAM_MAX_PEEL_TIMES, > "max-peel-times", > @@ -305,7 +305,7 @@ DEFPARAM(PARAM_MAX_PEEL_BRANCHES, > DEFPARAM(PARAM_MAX_COMPLETELY_PEELED_INSNS, > "max-completely-peeled-insns", > "The maximum number of insns of a completely peeled loop", > - 400, 0, 0) > + 100, 0, 0) > /* The maximum number of peelings of a single loop that is peeled completely. */ > DEFPARAM(PARAM_MAX_COMPLETELY_PEEL_TIMES, > "max-completely-peel-times", Jakub
> this patch reduces max-peeled-insns and max-completely-peeled-insns from 400 > to 100. The reason why I am doing this is that I want to reduce code bloat > caused by my cunroll work that enabled a lot more unrolling then previously > causing considerable code size regression at -O3. Did you notice that gcc.c-torture/compile/pr43186.c regressed? It now again takes a while to compile, so times out on slow machines: FAIL: gcc.c-torture/compile/pr43186.c -O3 -fomit-frame-pointer (test for excess errors) FAIL: gcc.c-torture/compile/pr43186.c -O3 -fomit-frame-pointer -funroll-loops (test for excess errors) FAIL: gcc.c-torture/compile/pr43186.c -O3 -fomit-frame-pointer -funroll-all- loops -finline-functions (test for excess errors) FAIL: gcc.c-torture/compile/pr43186.c -O3 -g (test for excess errors) WARNING: program timed out. WARNING: program timed out. WARNING: program timed out. WARNING: program timed out.
> > this patch reduces max-peeled-insns and max-completely-peeled-insns from 400 > > to 100. The reason why I am doing this is that I want to reduce code bloat > > caused by my cunroll work that enabled a lot more unrolling then previously > > causing considerable code size regression at -O3. > > Did you notice that gcc.c-torture/compile/pr43186.c regressed? It now again > takes a while to compile, so times out on slow machines: I did not :(. I am currently on a trip, but will take a look on tuesday. If it seems to disturb testing, please just revert the patch for time being. Honza
> > > this patch reduces max-peeled-insns and max-completely-peeled-insns from 400 > > > to 100. The reason why I am doing this is that I want to reduce code bloat > > > caused by my cunroll work that enabled a lot more unrolling then previously > > > causing considerable code size regression at -O3. > > > > Did you notice that gcc.c-torture/compile/pr43186.c regressed? It now again > > takes a while to compile, so times out on slow machines: > > I did not :(. I am currently on a trip, but will take a look on tuesday. > If it seems to disturb testing, please just revert the patch for time being. OK, here are multiple issues. 1) recursive inlining makes huge loop nest (of 18 loops) 2) SCEV is very slow on answering simple_iv tests in this case becuase it walks the nest 3) unroller is computing loop body size even when it is clear the body is much larger than the limit (the outer loop has 78000 instructions) I will prepare patches to fix those issues. Honza > > Honza
> OK, here are multiple issues. > 1) recursive inlining makes huge loop nest (of 18 loops) > 2) SCEV is very slow on answering simple_iv tests in this case becuase it > walks the nest > 3) unroller is computing loop body size even when it is clear the body is > much larger than the limit (the outer loop has 78000 instructions) > > I will prepare patches to fix those issues. Thanks for the analysis (and don't worry, I won't revert anything :-).
On Sun, 18 Nov 2012, Jan Hubicka wrote: > > > > this patch reduces max-peeled-insns and max-completely-peeled-insns from 400 > > > > to 100. The reason why I am doing this is that I want to reduce code bloat > > > > caused by my cunroll work that enabled a lot more unrolling then previously > > > > causing considerable code size regression at -O3. > > > > > > Did you notice that gcc.c-torture/compile/pr43186.c regressed? It now again > > > takes a while to compile, so times out on slow machines: > > > > I did not :(. I am currently on a trip, but will take a look on tuesday. > > If it seems to disturb testing, please just revert the patch for time being. > > OK, here are multiple issues. > 1) recursive inlining makes huge loop nest (of 18 loops) > 2) SCEV is very slow on answering simple_iv tests in this case becuase it walks > the nest > 3) unroller is computing loop body size even when it is clear the body is much larger > than the limit (the outer loop has 78000 instructions) > > I will prepare patches to fix those issues. The recent (well, a week ago) params.def change also regressed gfortran.dg/reassoc_4.f almost everywhere; see PR55452. I guess a fix is fairly trivial, I just don't know to what. brgds, H-P
> On Sun, 18 Nov 2012, Jan Hubicka wrote: > > > > > this patch reduces max-peeled-insns and max-completely-peeled-insns from 400 > > > > > to 100. The reason why I am doing this is that I want to reduce code bloat > > > > > caused by my cunroll work that enabled a lot more unrolling then previously > > > > > causing considerable code size regression at -O3. > > > > > > > > Did you notice that gcc.c-torture/compile/pr43186.c regressed? It now again > > > > takes a while to compile, so times out on slow machines: > > > > > > I did not :(. I am currently on a trip, but will take a look on tuesday. > > > If it seems to disturb testing, please just revert the patch for time being. > > > > OK, here are multiple issues. > > 1) recursive inlining makes huge loop nest (of 18 loops) > > 2) SCEV is very slow on answering simple_iv tests in this case becuase it walks > > the nest > > 3) unroller is computing loop body size even when it is clear the body is much larger > > than the limit (the outer loop has 78000 instructions) > > > > I will prepare patches to fix those issues. > > The recent (well, a week ago) params.def change also regressed > gfortran.dg/reassoc_4.f almost everywhere; see PR55452. > > I guess a fix is fairly trivial, I just don't know to what. Yes, we siply should add explicit unrolling limits there, I believe I posted a patch? I am currently on a way, I will look up the message and/or post it. Honza > > brgds, H-P
Index: params.def =================================================================== --- params.def (revision 193505) +++ params.def (working copy) @@ -290,7 +290,7 @@ DEFPARAM(PARAM_MAX_UNROLL_TIMES, DEFPARAM(PARAM_MAX_PEELED_INSNS, "max-peeled-insns", "The maximum number of insns of a peeled loop", - 400, 0, 0) + 100, 0, 0) /* The maximum number of peelings of a single loop. */ DEFPARAM(PARAM_MAX_PEEL_TIMES, "max-peel-times", @@ -305,7 +305,7 @@ DEFPARAM(PARAM_MAX_PEEL_BRANCHES, DEFPARAM(PARAM_MAX_COMPLETELY_PEELED_INSNS, "max-completely-peeled-insns", "The maximum number of insns of a completely peeled loop", - 400, 0, 0) + 100, 0, 0) /* The maximum number of peelings of a single loop that is peeled completely. */ DEFPARAM(PARAM_MAX_COMPLETELY_PEEL_TIMES, "max-completely-peel-times",