Message ID | 20210720101249.10118-1-aleksei.kodanev@bell-sw.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [v2] crypto/af_alg02: thread termination fixes | expand |
Hi! > On musl, pthread_kill() doesn't return ESRCH if thread id is not found > (POSIX only recommends to return ESRCH). Use tst_atomic_store/load() > instead, when waiting for the thread. > > Also, the thread's resources wasn't properly freed after the run(), > so adding pthread_join() should fix that. I do not think that we even need atomic operations here as we do not have competing threads setting the value, it should work fine with regular assignments as long as the completed variable is marked as volatile (which will prevent compiler mis-optimizations).
Hi On 7/30/2021 11:12 AM, Cyril Hrubis wrote: > Hi! >> On musl, pthread_kill() doesn't return ESRCH if thread id is not found >> (POSIX only recommends to return ESRCH). Use tst_atomic_store/load() >> instead, when waiting for the thread. >> >> Also, the thread's resources wasn't properly freed after the run(), >> so adding pthread_join() should fix that. > I do not think that we even need atomic operations here as we do not > have competing threads setting the value, it should work fine with > regular assignments as long as the completed variable is marked as > volatile (which will prevent compiler mis-optimizations). +1 Using a volatile variable should be enough here. If only pthread_timedjoin_np was not _np... This is exactly the function that could be used here without any boilerplate. But is the custom timeout handling in this test even required? Does the default timeout using SIGALRM not work? The thread was introduced, because SIG_ALRM was apparently not able to interrupt the read call on linux < 4.14. But even there it should be possible to interrupt pthread_join. So just replacing the whole loop with pthread_join should be enough. Joerg
On Fri, Jul 30, 2021 at 5:32 PM Joerg Vehlow <lkml@jv-coder.de> wrote: > Hi > > On 7/30/2021 11:12 AM, Cyril Hrubis wrote: > > Hi! > >> On musl, pthread_kill() doesn't return ESRCH if thread id is not found > >> (POSIX only recommends to return ESRCH). Use tst_atomic_store/load() > >> instead, when waiting for the thread. > >> > >> Also, the thread's resources wasn't properly freed after the run(), > >> so adding pthread_join() should fix that. > > I do not think that we even need atomic operations here as we do not > > have competing threads setting the value, it should work fine with > > regular assignments as long as the completed variable is marked as > > volatile (which will prevent compiler mis-optimizations). > > +1 Using a volatile variable should be enough here. > I have no preference for atomic or volatile methods. Both should be fine. > If only pthread_timedjoin_np was not _np... This is exactly the function > that could be used here without any boilerplate. > But is the custom timeout handling in this test even required? Does the > default timeout using SIGALRM not work? > The default timeout obviously works. But with introducing the thread_B (custom timeout) can get a fine-grained report when the read() was stuck. That's the advantage I can think of. > The thread was introduced, because SIG_ALRM was apparently not able to > interrupt the read call on linux < 4.14. > But even there it should be possible to interrupt pthread_join. So just > replacing the whole loop with pthread_join should be enough. > That should work but no precise log to indicate where goes wrong, so I vote to go the loop way:).
Hi Li, On 7/30/2021 12:38 PM, Li Wang wrote: > > > On Fri, Jul 30, 2021 at 5:32 PM Joerg Vehlow <lkml@jv-coder.de > <mailto:lkml@jv-coder.de>> wrote: > > Hi > > On 7/30/2021 11:12 AM, Cyril Hrubis wrote: > > Hi! > >> On musl, pthread_kill() doesn't return ESRCH if thread id is > not found > >> (POSIX only recommends to return ESRCH). Use > tst_atomic_store/load() > >> instead, when waiting for the thread. > >> > >> Also, the thread's resources wasn't properly freed after the run(), > >> so adding pthread_join() should fix that. > > I do not think that we even need atomic operations here as we do not > > have competing threads setting the value, it should work fine with > > regular assignments as long as the completed variable is marked as > > volatile (which will prevent compiler mis-optimizations). > > +1 Using a volatile variable should be enough here. > > > I have no preference for atomic or volatile methods. > Both should be fine. > > If only pthread_timedjoin_np was not _np... This is exactly the > function > that could be used here without any boilerplate. > But is the custom timeout handling in this test even required? > Does the > default timeout using SIGALRM not work? > > > The default timeout obviously works. > > But with introducing the thread_B (custom timeout) can get a fine-grained > report when the read() was stuck. That's the advantage I can think of. > > The thread was introduced, because SIG_ALRM was apparently not > able to > interrupt the read call on linux < 4.14. > But even there it should be possible to interrupt pthread_join. So > just > replacing the whole loop with pthread_join should be enough. > > > That should work but no precise log to indicate where goes wrong, > so I vote to go the loop way:). Does it really matter? The being stuck in the read does not seem to be the failure case tested with this test. Otherwise it would TFAIL. Additionally the loop and "custom" timeout was only introduced at a later stage of the test. Initially it relied on the default timeout handling. If my assumption, that being stuck in the read is not the failure case of this test, then your argument is invalid. Your argument would work for each and every line of code, that might be executed, when the timeout hits. Joerg
Hi Joerg, > > That should work but no precise log to indicate where goes wrong, > > so I vote to go the loop way:). > Does it really matter? The being stuck in the read does not seem to be > the failure case tested with this test. Otherwise it would TFAIL. > Additionally the loop and "custom" timeout was only introduced at a > later stage of the test. Initially it relied on the default timeout > handling. > It's doesn't matter actually. > If my assumption, that being stuck in the read is not the failure case > of this test, then your argument is invalid. Your argument would work for > each and every line of code, that might be executed, when the timeout hits. > Yes right, but I was not arguing for that. I just provide more opinions for Alexey to make a final decision:).
Hi! > > That should work but no precise log to indicate where goes wrong, > > so I vote to go the loop way:). > Does it really matter? The being stuck in the read does not seem to be > the failure case tested with this test. Otherwise it would TFAIL. > Additionally the loop and "custom" timeout was only introduced at a > later stage of the test. Initially it relied on the default timeout > handling. > If my assumption, that being stuck in the read is not the failure case > of this test, then your argument is invalid. Your argument would work for > each and every line of code, that might be executed, when the timeout hits. The top level in the test actually says that for some kernels the read() does not return, which is a kernel bug so the test should report this case as a failure as well and also the commit that fixes this should be added to the tags structure.
Hi, On 7/30/2021 1:43 PM, Cyril Hrubis wrote: > Hi! >>> That should work but no precise log to indicate where goes wrong, >>> so I vote to go the loop way:). >> Does it really matter? The being stuck in the read does not seem to be >> the failure case tested with this test. Otherwise it would TFAIL. >> Additionally the loop and "custom" timeout was only introduced at a >> later stage of the test. Initially it relied on the default timeout >> handling. >> If my assumption, that being stuck in the read is not the failure case >> of this test, then your argument is invalid. Your argument would work for >> each and every line of code, that might be executed, when the timeout hits. > The top level in the test actually says that for some kernels the read() > does not return, which is a kernel bug so the test should report this > case as a failure as well and also the commit that fixes this should be > added to the tags structure. Ups, yes sorry, should have read the whole test and not just assumed, that the not returning read was something else. In that case +1 on keeping the custom handling, but maybe use FAIL and not BROK in that case? I guess there is no other good reason, why the test can trigger the timeout Joerg
diff --git a/testcases/kernel/crypto/af_alg02.c b/testcases/kernel/crypto/af_alg02.c index 31d30777c..5863fd26b 100644 --- a/testcases/kernel/crypto/af_alg02.c +++ b/testcases/kernel/crypto/af_alg02.c @@ -18,11 +18,13 @@ #include "tst_test.h" #include "tst_af_alg.h" #include "tst_safe_pthread.h" +#include "tst_atomic.h" #include <pthread.h> #include <errno.h> #define SALSA20_IV_SIZE 8 #define SALSA20_MIN_KEY_SIZE 16 +static int completed; static void *verify_encrypt(void *arg) { @@ -48,19 +50,21 @@ static void *verify_encrypt(void *arg) tst_res(TPASS, "Successfully \"encrypted\" an empty message"); else tst_res(TFAIL, "read() didn't return 0"); + + tst_atomic_store(1, &completed); return arg; } static void run(void) { pthread_t thr; - + tst_atomic_store(0, &completed); pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL); SAFE_PTHREAD_CREATE(&thr, NULL, verify_encrypt, NULL); TST_CHECKPOINT_WAIT(0); - while (pthread_kill(thr, 0) != ESRCH) { + while (!tst_atomic_load(&completed)) { if (tst_timeout_remaining() <= 10) { pthread_cancel(thr); tst_brk(TBROK, @@ -68,6 +72,7 @@ static void run(void) } usleep(1000); } + pthread_join(thr, NULL); } static struct tst_test test = {
On musl, pthread_kill() doesn't return ESRCH if thread id is not found (POSIX only recommends to return ESRCH). Use tst_atomic_store/load() instead, when waiting for the thread. Also, the thread's resources wasn't properly freed after the run(), so adding pthread_join() should fix that. Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> --- v2: convert pthread_tryjoin_np() into atomic_load/store() + pthread_join() testcases/kernel/crypto/af_alg02.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)