Message ID: yddr5j6ysjb.fsf@manam.CeBiTec.Uni-Bielefeld.DE
On Jul 14, 2010, at 8:47 AM, Rainer Orth wrote:
> The gcc.dg/pr43058.c test times out on most of my systems:
>
> WARNING: program timed out.
> FAIL: gcc.dg/pr43058.c (test for excess errors)
>
> Even on an idle Sun Fire T5220 (1.2 GHz UltraSPARC-T2), it takes
>
> real 4:56.38
> user 4:54.71
> sys  0.35
>
> or on a Sun Fire X4450 (2.93 GHz Xeon X7350)
>
> real 1:18.01
> user 1:17.20
> sys  0.26
>
> As soon as the machine is loaded (e.g. make -j<2 * ncpu> check), the
> test is practically guaranteed not to complete within the regular 5
> minute (300 s) timeout.  I'd therefore like to increase the timeout by a
> factor of 4.
>
> Ok for mainline and the 4.5 branch?

No.  I think the patch is wrong, because the testcase is wrong: it intentionally consumes tons of resources to try and blow the machine out of the water.

I think 43058 should be re-opened.

Philosophy: any testcase that takes more than 30 seconds on a slow machine is in danger of being a bad testcase.  Really, such tests should be trimmed, reduced, or split up, if possible.  The style of this testcase sucks.  At the very least, the limit should be lowered to 10 seconds and the testcase reduced to take around 5 seconds on a slow machine (1 GHz, say), if people are happy with how slow the compiler is.

A better fix would be to design an extension to the testing infrastructure, say:

	PERF: INT test_case_name

and then we could record things like memory usage of the compiler as:

	PERF: 121883 RAM gcc.dg/pr43058.c

and compile-time performance as:

	PERF: 312334 comptime gcc.dg/pr43058.c

where the number is, say, the number of cycles the compiler spent compiling the testcase.  We could then add a tool to compare two runs (a la contrib/compare_tests) and report regressions in any performance numbers.  This works for RAM usage, paging, compile time, testcase run time, number of cache misses, number of spills...
If people like the slow compiler but don't want to add the infrastructure to do better, then the size of the testcase should just be reduced (and the limit dropped).
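Mike's PERF: idea can be sketched concretely.  A minimal comparison tool in the spirit of contrib/compare_tests might look like the following; note that the PERF: line format, the 10% regression threshold, and all function names here are hypothetical illustrations, not existing GCC infrastructure:

```python
# Hypothetical sketch of the PERF: comparison tool Mike describes.
# The "PERF: <value> <metric> <testcase>" format is his proposal,
# not an existing testsuite feature.

def parse_perf(lines):
    """Collect 'PERF: <value> <metric> <testcase>' lines into a dict."""
    perf = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 4 and parts[0] == "PERF:":
            value, metric, test = parts[1], parts[2], parts[3]
            perf[(metric, test)] = int(value)
    return perf

def report_regressions(old, new, threshold=0.10):
    """Return (metric, test, old, new) tuples that grew by more than threshold."""
    regressions = []
    for key, new_val in new.items():
        old_val = old.get(key)
        if old_val and new_val > old_val * (1 + threshold):
            regressions.append((key[0], key[1], old_val, new_val))
    return regressions

old = parse_perf(["PERF: 121883 RAM gcc.dg/pr43058.c",
                  "PERF: 312334 comptime gcc.dg/pr43058.c"])
new = parse_perf(["PERF: 125000 RAM gcc.dg/pr43058.c",
                  "PERF: 500000 comptime gcc.dg/pr43058.c"])
for metric, test, o, n in report_regressions(old, new):
    print(f"{test}: {metric} regressed {o} -> {n}")
# prints: gcc.dg/pr43058.c: comptime regressed 312334 -> 500000
```

A real version would read two .sum/.log files and handle metrics where smaller is not better; the point is only that regression detection over such records is cheap to build.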
Mike Stump wrote:
> Philosophy: any testcase that takes more than 30 seconds on a slow
> machine is in danger of being a bad testcase.

I tend to agree.  I'm not willing to make it a hard limit, because there really are bugs that take a lot of processing to reproduce, and such tests may be appropriate.

But it's bothered me for about a decade that we spend X% of our test cycles running Y% of our tests, where X >> Y.  The marginal cost of slow tests is usually not commensurate with their marginal benefits.
Mark Mitchell <mark@codesourcery.com> writes:
> Mike Stump wrote:
>
>> Philosophy: any testcase that takes more than 30 seconds on a slow
>> machine is in danger of being a bad testcase.
>
> I tend to agree.  I'm not willing to make it a hard limit, because there
> really are bugs that take a lot of processing to reproduce, and they may
> be appropriate.
>
> But it's bothered me for about a decade that we spend X% of our test
> cycles running Y% of our tests, where X >> Y.  The marginal cost of
> slow tests is usually not commensurate with their marginal benefits.

Fully agreed.  The worst offender so far is a single gcc.c-torture testcase that takes 4+ hours (2 multilibs, 8 different optimization options) on an UltraSPARC-T2:

	http://gcc.gnu.org/ml/gcc-patches/2010-07/msg01633.html

If this isn't changed, this single testcase runs longer than the rest of the testsuite combined ;-(

Rainer
On Fri, Jul 23, 2010 at 7:30 PM, Mike Stump <mikestump@comcast.net> wrote:
> On Jul 14, 2010, at 8:47 AM, Rainer Orth wrote:
>> The gcc.dg/pr43058.c test times out on most of my systems:
>>
>> WARNING: program timed out.
>> FAIL: gcc.dg/pr43058.c (test for excess errors)
>>
>> Even on an idle Sun Fire T5220 (1.2 GHz UltraSPARC-T2), it takes
>>
>> real 4:56.38
>> user 4:54.71
>> sys  0.35
>>
>> or on a Sun Fire X4450 (2.93 GHz Xeon X7350)
>>
>> real 1:18.01
>> user 1:17.20
>> sys  0.26
>>
>> As soon as the machine is loaded (e.g. make -j<2 * ncpu> check), the
>> test is practically guaranteed not to complete within the regular 5
>> minute (300 s) timeout.  I'd therefore like to increase the timeout by a
>> factor of 4.
>>
>> Ok for mainline and the 4.5 branch?
>
> No.  I think the patch is wrong, because the testcase is wrong: it intentionally consumes tons of resources to try and blow the machine out of the water.
>
> I think 43058 should be re-opened.

Well - it was fixed (it was about memory consumption), and to not regress we need to simulate the original failure, which the testcase does.

Now, as for slow machines, I'd rather have a { dg-effective-target fast-and-big } that will just skip these kinds of tests on small/slow machines.

Richard.
Rainer Orth wrote:
> Fully agreed.  The worst offender so far is a single gcc.c-torture
> testcase that takes 4+ hours (2 multilibs, 8 different optimization
> options) on an UltraSPARC-T2:
>
> http://gcc.gnu.org/ml/gcc-patches/2010-07/msg01633.html
>
> If this isn't changed, this single testcase runs longer than the rest of
> the testsuite combined ;-(

That's just silly.

I will approve a patch which moves this into a separate testsuite, which is run only upon explicit request, and we will remove this from the set of required tests to run before checking things in.

Thanks,
Jakub Jelinek wrote:
> That can't be true, because pr43058.c isn't run for all optimization levels.
> It is only run at -O2 -g.  So it definitely can't run for 4+ hours.
> With 2 multilibs, if you run with RUNTESTFLAGS='--target_board=unix\{-m32,-m64\}'
> (not the default), it can take at most 2 timeouts if it times out.

I thought Rainer was referring to limits-fnargs.c.
On Fri, Jul 23, 2010 at 12:06:02PM -0700, Mark Mitchell wrote:
> Rainer Orth wrote:
>
> > Fully agreed.  The worst offender so far is a single gcc.c-torture
> > testcase that takes 4+ hours (2 multilibs, 8 different optimization
> > options) on an UltraSPARC-T2:
> >
> > http://gcc.gnu.org/ml/gcc-patches/2010-07/msg01633.html
> >
> > If this isn't changed, this single testcase runs longer than the rest of
> > the testsuite combined ;-(

That can't be true, because pr43058.c isn't run for all optimization levels.  It is only run at -O2 -g, so it definitely can't run for 4+ hours.  With 2 multilibs, if you run with RUNTESTFLAGS='--target_board=unix\{-m32,-m64\}' (not the default), it can take at most 2 timeouts if it times out.

> That's just silly.
>
> I will approve a patch which moves this into a separate testsuite, which
> is run only upon explicit request, and we will remove this from the set
> of required tests to run before checking things in.

Jakub
Mark Mitchell <mark@codesourcery.com> writes:
> Jakub Jelinek wrote:
>
>> That can't be true, because pr43058.c isn't run for all optimization levels.
>> It is only run at -O2 -g.  So it definitely can't run for 4+ hours.
>> With 2 multilibs, if you run with RUNTESTFLAGS='--target_board=unix\{-m32,-m64\}'
>> (not the default), it can take at most 2 timeouts if it times out.
>
> I thought Rainer was referring to limits-fnargs.c.

Indeed: gcc.dg/pr43058.c is merely an annoyance, and its timeout warning is now avoided by increasing the timeout factor, but limits-fnargs.c is completely out of control.  limits-fndefn.c is in a similar category: on an otherwise idle T5220, a single compilation takes almost exactly 5 minutes, compared to the 16:40 minutes of limits-fnargs.c.

Rainer
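For reference, the "4+ hours" figure quoted earlier is consistent with the per-compilation time above; a quick back-of-the-envelope check, using only the numbers given in this thread:

```python
# Sanity check of the limits-fnargs.c numbers from the thread:
# ~16:40 per compilation on the T5220, run for 2 multilibs times
# 8 optimization options in gcc.c-torture.
per_compile_s = 16 * 60 + 40   # 16:40 -> 1000 seconds
runs = 2 * 8                   # multilibs x optimization options
total_hours = per_compile_s * runs / 3600.0
print(f"total: {total_hours:.1f} hours")  # prints: total: 4.4 hours
```

which indeed lands just above the "4+ hours" Rainer reports.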
Jakub Jelinek <jakub@redhat.com> writes:
>> I thought Rainer was referring to limits-fnargs.c.
>
> Oops, sorry.  For limits-fnargs.c, perhaps copying it over into gcc.dg and
> run only at one opt level (say -O2) in this size and shrink the test
> considerably for what stays in gcc.c-torture/compile?

The shrinking is part of my proposal in

	http://gcc.gnu.org/ml/gcc-patches/2010-07/msg01633.html

The exact amount remains to be determined.

Rainer
On Fri, Jul 23, 2010 at 12:14:25PM -0700, Mark Mitchell wrote:
> Jakub Jelinek wrote:
>
> > That can't be true, because pr43058.c isn't run for all optimization levels.
> > It is only run at -O2 -g.  So it definitely can't run for 4+ hours.
> > With 2 multilibs, if you run with RUNTESTFLAGS='--target_board=unix\{-m32,-m64\}'
> > (not the default), it can take at most 2 timeouts if it times out.
>
> I thought Rainer was referring to limits-fnargs.c.

Oops, sorry.  For limits-fnargs.c, perhaps copying it over into gcc.dg and run only at one opt level (say -O2) in this size and shrink the test considerably for what stays in gcc.c-torture/compile?

Jakub
Jakub Jelinek wrote:
> Oops, sorry.  For limits-fnargs.c, perhaps copying it over into gcc.dg and
> run only at one opt level (say -O2) in this size and shrink the test
> considerably for what stays in gcc.c-torture/compile?

If we want to have a "mini" version of it that stays in gcc.c-torture/compile, that's fine, but it shouldn't be more expensive than most tests in the testsuite.  A few seconds at most.

I don't think we should run the full version by default, even once.  It's just not doing anything very useful relative to the time it takes.  How often has it broken since it was added?  If the answer is "never", then it's consumed thousands of hours but provided almost no benefit.  Let's make it optional, and if someone (including automated testers) wants to run it, so be it -- but let's not slow down every GCC developer with this.
diff -r 53c3be5f051b gcc/testsuite/gcc.dg/pr43058.c
--- a/gcc/testsuite/gcc.dg/pr43058.c	Fri Jul 09 13:39:23 2010 +0200
+++ b/gcc/testsuite/gcc.dg/pr43058.c	Fri Jul 09 13:46:33 2010 +0200
@@ -1,6 +1,7 @@
 /* PR debug/43058 */
 /* { dg-do compile } */
 /* { dg-options "-g -O2" } */
+/* { dg-timeout-factor 4 } */
 
 extern void *f1 (void *, void *, void *);
 extern void *f2 (const char *, int, int, int, void *(*) ());