diff mbox series

tst-epoll: increase waiting time before sending signal to the child

Message ID 20240123211130.190553-1-aurelien@aurel32.net
State New
Series tst-epoll: increase waiting time before sending signal to the child

Commit Message

Aurelien Jarno Jan. 23, 2024, 9:11 p.m. UTC
When running the testsuite in parallel, for instance running make -j
$(nproc) check, from time to time tst-epoll fails with a timeout. It
happens because it sometimes takes a bit more than 10ms for the process
to get cloned and blocked by the syscall. In that case the signal is
sent too early, and the test fails with a timeout. This happens even on
fast hosts.

This patch increases the waiting time to 100ms to make it more reliable.
It corresponds to 20% of the epoll wait time, so there is still some
margin on that side.
---
 sysdeps/unix/sysv/linux/tst-epoll.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
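
The timing figures in the commit message can be checked with a few lines of C. This is only a sketch: the helper names are made up for illustration, and it assumes that the value 500000000 passed to epoll_wait_check in the patch context is a timeout in nanoseconds, which is what the 20% figure implies.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_MSEC 1000000L

/* Delay before the signal is sent, before and after the patch.  */
static const struct timespec old_delay = { 0, 10000000 };
static const struct timespec new_delay = { 0, 100000000 };

/* Assumption: the epoll_wait_check timeout from the patch context,
   taken to be in nanoseconds.  */
static const long long epoll_timeout_ns = 500000000LL;

/* Hypothetical helper: convert a timespec to whole milliseconds.  */
static long long
timespec_to_msec (const struct timespec *ts)
{
  return (long long) ts->tv_sec * 1000 + ts->tv_nsec / NSEC_PER_MSEC;
}

/* Hypothetical helper: express a delay as a percentage of the epoll
   timeout.  */
static long long
percent_of_timeout (const struct timespec *ts)
{
  long long ns = (long long) ts->tv_sec * 1000000000LL + ts->tv_nsec;
  return ns * 100 / epoll_timeout_ns;
}
```

With these values the old delay is 10ms (2% of the timeout) and the new one is 100ms (20%), leaving 400ms for the signal to interrupt the wait.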

Comments

Andreas Schwab Jan. 23, 2024, 10:23 p.m. UTC | #1
On Jan 23 2024, Aurelien Jarno wrote:

> When running the testsuite in parallel, for instance running make -j
> $(nproc) check, from time to time tst-epoll fails with a timeout. It
> happens because it sometimes takes a bit more than 10ms for the process
> to get cloned and blocked by the syscall. In that case the signal is
> sent too early, and the test fails with a timeout. This happens even on
> fast hosts.
>
> This patch increases the waiting time to 100ms to make it more reliable.
> It corresponds to 20% of the epoll wait time, so there is still some
> margin on that side.

Can this be synchronized properly?  A racy test is worthless.

Adhemerval Zanella Netto Jan. 23, 2024, 11:13 p.m. UTC | #2
On 23/01/24 19:23, Andreas Schwab wrote:
> On Jan 23 2024, Aurelien Jarno wrote:
> 
>> When running the testsuite in parallel, for instance running make -j
>> $(nproc) check, from time to time tst-epoll fails with a timeout. It
>> happens because it sometimes takes a bit more than 10ms for the process
>> to get cloned and blocked by the syscall. In that case the signal is
>> sent too early, and the test fails with a timeout. This happens even on
>> fast hosts.
>>
>> This patch increases the waiting time to 100ms to make it more reliable.
>> It corresponds to 20% of the epoll wait time, so there is still some
>> margin on that side.
> 
> Can this be synchronized properly?  A racy test is worthless.
> 


Maybe either:

  /* Barrier in shared memory so that parent and child can both use it.  */
  static pthread_barrier_t *barrier;
  barrier = support_shared_allocate (sizeof (*barrier));
  {
    pthread_barrierattr_t attr;
    xpthread_barrierattr_init (&attr);
    xpthread_barrierattr_setpshared (&attr, PTHREAD_PROCESS_SHARED);
    xpthread_barrier_init (barrier, &attr, 2);
    xpthread_barrierattr_destroy (&attr);
  }

  /* Child.  */
  xpthread_barrier_wait (barrier);
  epoll_ctl (...)

  /* Parent.  */
  xpthread_barrier_wait (barrier);
  kill (...)
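
For readers without the glibc test support library at hand, the barrier idea can be sketched with plain POSIX calls. This is an illustration under assumptions, not the code proposed for tst-epoll.c: make_shared_barrier and run_barrier_demo are hypothetical names, mmap stands in for support_shared_allocate, and pause stands in for the blocking epoll call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: place a barrier in anonymous shared memory and
   initialise it as process-shared for COUNT participants.  Returns
   NULL on failure.  */
static pthread_barrier_t *
make_shared_barrier (unsigned int count)
{
  pthread_barrier_t *b = mmap (NULL, sizeof (*b), PROT_READ | PROT_WRITE,
                               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (b == MAP_FAILED)
    return NULL;

  pthread_barrierattr_t attr;
  int r = pthread_barrierattr_init (&attr);
  if (r == 0)
    {
      r = pthread_barrierattr_setpshared (&attr, PTHREAD_PROCESS_SHARED);
      if (r == 0)
        r = pthread_barrier_init (b, &attr, count);
      pthread_barrierattr_destroy (&attr);
    }
  if (r != 0)
    {
      munmap (b, sizeof (*b));
      return NULL;
    }
  return b;
}

/* Fork a child, rendezvous with it on the barrier, then send SIGUSR1.
   Returns the signal that terminated the child, or -1 on error.  */
static int
run_barrier_demo (void)
{
  pthread_barrier_t *barrier = make_shared_barrier (2);
  if (barrier == NULL)
    return -1;

  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      /* Child: report readiness, then block.  */
      pthread_barrier_wait (barrier);
      pause ();
      _exit (0);
    }

  /* Parent: returns only after the child has reached its barrier call.  */
  pthread_barrier_wait (barrier);
  if (kill (pid, SIGUSR1) != 0)
    return -1;

  int status;
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  munmap (barrier, sizeof (*barrier));
  return WIFSIGNALED (status) ? WTERMSIG (status) : -1;
}
```

Note that the barrier only guarantees that the child has reached the point just before the blocking call, not that it is already blocked inside it. In this sketch that makes no difference because the default action of SIGUSR1 terminates the child either way.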

Or:

  /* Parent.  */
  support_process_state_wait (pid, support_process_state_sleeping);
  kill (...)
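
The second idea, waiting until the child is observed sleeping, can be approximated outside the glibc tree by polling /proc on Linux. The sketch below is a stand-in and not the implementation of support_process_state_wait: read_process_state, wait_until_sleeping and run_state_demo are hypothetical names, pause replaces the blocking epoll call, and a mounted /proc is assumed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Return the state character from /proc/PID/stat, or 0 on error.  */
static char
read_process_state (pid_t pid)
{
  char path[64];
  snprintf (path, sizeof (path), "/proc/%ld/stat", (long) pid);
  FILE *f = fopen (path, "r");
  if (f == NULL)
    return 0;

  char buf[512];
  char state = 0;
  if (fgets (buf, sizeof (buf), f) != NULL)
    {
      /* The state field follows the ')' that closes the command name.  */
      char *p = strrchr (buf, ')');
      if (p != NULL && p[1] == ' ' && p[2] != '\0')
        state = p[2];
    }
  fclose (f);
  return state;
}

/* Poll until PID is in interruptible sleep ('S'), checking at most
   MAX_TRIES times with 1ms between checks.  Returns 1 on success and
   0 if the state was never observed.  */
static int
wait_until_sleeping (pid_t pid, int max_tries)
{
  const struct timespec pause_1ms = { 0, 1000000 };
  for (int i = 0; i < max_tries; i++)
    {
      if (read_process_state (pid) == 'S')
        return 1;
      nanosleep (&pause_1ms, NULL);
    }
  return 0;
}

/* Fork a child that blocks, wait until it is seen sleeping, then send
   SIGUSR1.  Returns the signal that terminated the child, or -1.  */
static int
run_state_demo (void)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      pause ();
      _exit (0);
    }

  /* Fall back to SIGKILL so the child never outlives the demo.  */
  int sleeping = wait_until_sleeping (pid, 5000);
  kill (pid, sleeping ? SIGUSR1 : SIGKILL);

  int status;
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFSIGNALED (status) ? WTERMSIG (status) : -1;
}
```

Compared with a fixed sleep, the parent waits only as long as the child actually needs, and a loaded machine stretches the wait instead of breaking the test. The 'S' state does not identify which call the child is sleeping in, so this remains an approximation.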

Patch

diff --git a/sysdeps/unix/sysv/linux/tst-epoll.c b/sysdeps/unix/sysv/linux/tst-epoll.c
index 3b38beae6e..39953c0a08 100644
--- a/sysdeps/unix/sysv/linux/tst-epoll.c
+++ b/sysdeps/unix/sysv/linux/tst-epoll.c
@@ -98,7 +98,7 @@  test_epoll_basic (epoll_wait_check_t epoll_wait_check)
   xclose (fds[1][1]);
 
   /* Wait some time so child is blocked on the syscall.  */
-  nanosleep (&(struct timespec) {0, 10000000}, NULL);
+  nanosleep (&(struct timespec) {0, 100000000}, NULL);
   TEST_COMPARE (kill (p, SIGUSR1), 0);
 
   int e = epoll_wait_check (efd, &event, 1, 500000000, &ss);