Message ID | 20221110135442.14501-1-mdoucha@suse.cz |
---|---|
State | Accepted |
Series | Allow graceful subtest cleanup in shell tests |
Hi Martin,

Reviewed-by: Petr Vorel <pvorel@suse.cz>

Thanks!

> The new shell test timeout code sends SIGTERM to any subprocesses when
> the main script hits timeout. SIGTERM isn't handled by the LTP library
> which means that tools like netstress will be instantly killed without
> performing any cleanup. Handle SIGTERM like SIGINT in LTP library
> to allow graceful cleanup.

Besides this, Cyril suggested some time ago defining TST_NO_DEFAULT_MAIN in nfs05_make_tree.c [1], which is also a helper like netstress.c. Looking at what this would require for netstress.c: implementing its own tst_brk(), which would call the cleanup() function before calling the library's tst_brk(), parsing getopt parameters, calling setup() in main() etc. The only things which work are tst_res() and tst_brk() printing. I'm not sure this is worth it just to avoid a problematic timeout.

Kind regards,
Petr

[1] https://lore.kernel.org/ltp/YqxFo1iFzHatNRIl@yuki/
Hello,

Martin Doucha <mdoucha@suse.cz> writes:

> The new shell test timeout code sends SIGTERM to any subprocesses when
> the main script hits timeout. SIGTERM isn't handled by the LTP library
> which means that tools like netstress will be instantly killed without
> performing any cleanup. Handle SIGTERM like SIGINT in LTP library
> to allow graceful cleanup.
>
> Signed-off-by: Martin Doucha <mdoucha@suse.cz>

Merged with Petr's tag, thanks!

Possibly we should also print the signal that we received somehow.

> ---
>
> Note: The current lack of graceful cleanup causes random failures in shell
> tests which run the same tool many times (e.g. netstress). When the PID
> counter wraps around and the tool accidentally gets the same PID as another
> process that got killed by SIGTERM, the new test process will fail during IPC
> setup.
>
>  lib/tst_test.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/lib/tst_test.c b/lib/tst_test.c
> index b225ba082..1732fd058 100644
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -1568,6 +1568,7 @@ static int fork_testrun(void)
>  	int status;
>
>  	SAFE_SIGNAL(SIGINT, sigint_handler);
> +	SAFE_SIGNAL(SIGTERM, sigint_handler);
>
>  	alarm(results->timeout);
>
> @@ -1579,6 +1580,7 @@ static int fork_testrun(void)
>  		tst_disable_oom_protection(0);
>  		SAFE_SIGNAL(SIGALRM, SIG_DFL);
>  		SAFE_SIGNAL(SIGUSR1, SIG_DFL);
> +		SAFE_SIGNAL(SIGTERM, SIG_DFL);
>  		SAFE_SIGNAL(SIGINT, SIG_DFL);
>  		SAFE_SETPGID(0, 0);
>  		testrun();
> @@ -1586,6 +1588,7 @@ static int fork_testrun(void)
>
>  	SAFE_WAITPID(test_pid, &status, 0);
>  	alarm(0);
> +	SAFE_SIGNAL(SIGTERM, SIG_DFL);
>  	SAFE_SIGNAL(SIGINT, SIG_DFL);
>
>  	if (tst_test->taint_check && tst_taint_check()) {
> --
> 2.37.3
diff --git a/lib/tst_test.c b/lib/tst_test.c
index b225ba082..1732fd058 100644
--- a/lib/tst_test.c
+++ b/lib/tst_test.c
@@ -1568,6 +1568,7 @@ static int fork_testrun(void)
 	int status;
 
 	SAFE_SIGNAL(SIGINT, sigint_handler);
+	SAFE_SIGNAL(SIGTERM, sigint_handler);
 
 	alarm(results->timeout);
 
@@ -1579,6 +1580,7 @@ static int fork_testrun(void)
 		tst_disable_oom_protection(0);
 		SAFE_SIGNAL(SIGALRM, SIG_DFL);
 		SAFE_SIGNAL(SIGUSR1, SIG_DFL);
+		SAFE_SIGNAL(SIGTERM, SIG_DFL);
 		SAFE_SIGNAL(SIGINT, SIG_DFL);
 		SAFE_SETPGID(0, 0);
 		testrun();
@@ -1586,6 +1588,7 @@ static int fork_testrun(void)
 
 	SAFE_WAITPID(test_pid, &status, 0);
 	alarm(0);
+	SAFE_SIGNAL(SIGTERM, SIG_DFL);
 	SAFE_SIGNAL(SIGINT, SIG_DFL);
 
 	if (tst_test->taint_check && tst_taint_check()) {
The new shell test timeout code sends SIGTERM to any subprocesses when
the main script hits timeout. SIGTERM isn't handled by the LTP library
which means that tools like netstress will be instantly killed without
performing any cleanup. Handle SIGTERM like SIGINT in LTP library
to allow graceful cleanup.

Signed-off-by: Martin Doucha <mdoucha@suse.cz>
---

Note: The current lack of graceful cleanup causes random failures in shell
tests which run the same tool many times (e.g. netstress). When the PID
counter wraps around and the tool accidentally gets the same PID as another
process that got killed by SIGTERM, the new test process will fail during IPC
setup.

 lib/tst_test.c | 3 +++
 1 file changed, 3 insertions(+)