diff mbox series

syscalls/futex_waitv03: replace TST_THREAD_STATE_WAIT with repeated wake

Message ID d9cddda87dd121a4c57baa7f2d0e221b7a99bc1c.1657212941.git.jstancek@redhat.com
State Accepted
Headers show
Series syscalls/futex_waitv03: replace TST_THREAD_STATE_WAIT with repeated wake | expand

Commit Message

Jan Stancek July 7, 2022, 4:56 p.m. UTC
TST_THREAD_STATE_WAIT isn't reliable to tell that it's safe to call futex_wake().
futex_wake() can be called prematurely and return 0, which leaves other thread
timing out on futex call:
  tst_test.c:1459: TINFO: Timeout per run is 0h 10m 00s
  futex_waitv03.c:37: TINFO: Testing variant: syscall with old kernel spec
  tst_buffers.c:55: TINFO: Test is using guarded buffers
  futex_waitv03.c:106: TBROK: futex_waitv returned: -1: ETIMEDOUT (110)

Replace it with repeated futex_wake() until it fails or wakes at least 1 waiter.
Also extend timeout to 5 seconds to avoid false positives from systems with
high steal time (e.g. overloaded s390x host).

Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 .../kernel/syscalls/futex/futex_waitv03.c     | 25 ++++++++-----------
 1 file changed, 11 insertions(+), 14 deletions(-)

Comments

Li Wang July 11, 2022, 7:17 a.m. UTC | #1
On Fri, Jul 8, 2022 at 12:56 AM Jan Stancek <jstancek@redhat.com> wrote:

> TST_THREAD_STATE_WAIT isn't reliable to tell that it's safe to call
> futex_wake().
> futex_wake() can be called prematurely and return 0, which leaves other
> thread
> timing out on futex call:
>   tst_test.c:1459: TINFO: Timeout per run is 0h 10m 00s
>   futex_waitv03.c:37: TINFO: Testing variant: syscall with old kernel spec
>   tst_buffers.c:55: TINFO: Test is using guarded buffers
>   futex_waitv03.c:106: TBROK: futex_waitv returned: -1: ETIMEDOUT (110)
>
> Replace it with repeated futex_wake() until it fails or wakes at least 1
> waiter.
> Also extend timeout to 5 seconds to avoid false positives from systems with
> high steal time (e.g. overloaded s390x host).
>

Though TST_THREAD_STATE_WAIT is unreliable, I guess that would
still add more chances if we keep it?
(I mean go with repeat futex_wake() after checking 'S' state)
Jan Stancek July 12, 2022, 7:59 a.m. UTC | #2
On Mon, Jul 11, 2022 at 9:17 AM Li Wang <liwang@redhat.com> wrote:
>
>
>
> On Fri, Jul 8, 2022 at 12:56 AM Jan Stancek <jstancek@redhat.com> wrote:
>>
>> TST_THREAD_STATE_WAIT isn't reliable to tell that it's safe to call futex_wake().
>> futex_wake() can be called prematurely and return 0, which leaves other thread
>> timing out on futex call:
>>   tst_test.c:1459: TINFO: Timeout per run is 0h 10m 00s
>>   futex_waitv03.c:37: TINFO: Testing variant: syscall with old kernel spec
>>   tst_buffers.c:55: TINFO: Test is using guarded buffers
>>   futex_waitv03.c:106: TBROK: futex_waitv returned: -1: ETIMEDOUT (110)
>>
>> Replace it with repeated futex_wake() until it fails or wakes at least 1 waiter.
>> Also extend timeout to 5 seconds to avoid false positives from systems with
>> high steal time (e.g. overloaded s390x host).
>
>
> Though TST_THREAD_STATE_WAIT is unreliable, I guess that would
> still add more chances if we keep it?
> (I mean go with repeat futex_wake() after checking 'S' state)

We could keep it, though I'm not sure what benefit that has. You
would probably make fewer calls to futex_wake(). Without it,
the window where wake and wait calls are made in parallel
is larger, and it also makes test simpler.
Li Wang July 12, 2022, 8:33 a.m. UTC | #3
On Tue, Jul 12, 2022 at 4:00 PM Jan Stancek <jstancek@redhat.com> wrote:

> On Mon, Jul 11, 2022 at 9:17 AM Li Wang <liwang@redhat.com> wrote:
> >
> >
> >
> > On Fri, Jul 8, 2022 at 12:56 AM Jan Stancek <jstancek@redhat.com> wrote:
> >>
> >> TST_THREAD_STATE_WAIT isn't reliable to tell that it's safe to call
> futex_wake().
> >> futex_wake() can be called prematurely and return 0, which leaves other
> thread
> >> timing out on futex call:
> >>   tst_test.c:1459: TINFO: Timeout per run is 0h 10m 00s
> >>   futex_waitv03.c:37: TINFO: Testing variant: syscall with old kernel
> spec
> >>   tst_buffers.c:55: TINFO: Test is using guarded buffers
> >>   futex_waitv03.c:106: TBROK: futex_waitv returned: -1: ETIMEDOUT (110)
> >>
> >> Replace it with repeated futex_wake() until it fails or wakes at least
> 1 waiter.
> >> Also extend timeout to 5 seconds to avoid false positives from systems
> with
> >> high steal time (e.g. overloaded s390x host).
> >
> >
> > Though TST_THREAD_STATE_WAIT is unreliable, I guess that would
> > still add more chances if we keep it?
> > (I mean go with repeat futex_wake() after checking 'S' state)
>
> We could keep it, though I'm not sure what benefit that has. You
> would probably make fewer calls to futex_wake(). Without it,
> the window where wake and wait calls are made in parallel
> is larger, and it also makes test simpler.
>

Yes, that helps reduce the tries of futex_wake() to make the test faster.
But not benefit a lot compared with removing.

So feel free to merge it as you wish, I think it's OK with both.

Reviewed-by: Li Wang <liwang@redhat.com>
Jan Stancek July 19, 2022, 9:09 a.m. UTC | #4
On Tue, Jul 12, 2022 at 10:33 AM Li Wang <liwang@redhat.com> wrote:
>
>
>
> On Tue, Jul 12, 2022 at 4:00 PM Jan Stancek <jstancek@redhat.com> wrote:
>>
>> On Mon, Jul 11, 2022 at 9:17 AM Li Wang <liwang@redhat.com> wrote:
>> >
>> >
>> >
>> > On Fri, Jul 8, 2022 at 12:56 AM Jan Stancek <jstancek@redhat.com> wrote:
>> >>
>> >> TST_THREAD_STATE_WAIT isn't reliable to tell that it's safe to call futex_wake().
>> >> futex_wake() can be called prematurely and return 0, which leaves other thread
>> >> timing out on futex call:
>> >>   tst_test.c:1459: TINFO: Timeout per run is 0h 10m 00s
>> >>   futex_waitv03.c:37: TINFO: Testing variant: syscall with old kernel spec
>> >>   tst_buffers.c:55: TINFO: Test is using guarded buffers
>> >>   futex_waitv03.c:106: TBROK: futex_waitv returned: -1: ETIMEDOUT (110)
>> >>
>> >> Replace it with repeated futex_wake() until it fails or wakes at least 1 waiter.
>> >> Also extend timeout to 5 seconds to avoid false positives from systems with
>> >> high steal time (e.g. overloaded s390x host).
>> >
>> >
>> > Though TST_THREAD_STATE_WAIT is unreliable, I guess that would
>> > still add more chances if we keep it?
>> > (I mean go with repeat futex_wake() after checking 'S' state)
>>
>> We could keep it, though I'm not sure what benefit that has. You
>> would probably make fewer calls to futex_wake(). Without it,
>> the window where wake and wait calls are made in parallel
>> is larger, and it also makes test simpler.
>
>
> Yes, that helps reduce the tries of futex_wake() to make the test faster.
> But not benefit a lot compared with removing.
>
> So feel free to merge it as you wish, I think it's OK with both.
>
> Reviewed-by: Li Wang <liwang@redhat.com>

Pushed.

>
>
> --
> Regards,
> Li Wang
diff mbox series

Patch

diff --git a/testcases/kernel/syscalls/futex/futex_waitv03.c b/testcases/kernel/syscalls/futex/futex_waitv03.c
index ffe5c66cde77..ee79728474ee 100644
--- a/testcases/kernel/syscalls/futex/futex_waitv03.c
+++ b/testcases/kernel/syscalls/futex/futex_waitv03.c
@@ -70,19 +70,19 @@  static void cleanup(void)
 	}
 }
 
-static void *threaded(void *arg)
+static void *threaded(LTP_ATTRIBUTE_UNUSED void *arg)
 {
 	struct futex_test_variants tv = futex_variant();
-	int tid = *(int *)arg;
 
-	TST_THREAD_STATE_WAIT(tid, 'S', 0);
-
-	TEST(futex_wake(tv.fntype, (void *)(uintptr_t)waitv[numfutex - 1].uaddr,
+	do {
+		TEST(futex_wake(tv.fntype, (void *)(uintptr_t)waitv[numfutex - 1].uaddr,
 			1, 0));
-	if (TST_RET < 0) {
-		tst_brk(TBROK | TTERRNO,
-			"futex_wake private returned: %ld", TST_RET);
-	}
+		if (TST_RET < 0) {
+			tst_brk(TBROK | TTERRNO,
+				"futex_wake private returned: %ld", TST_RET);
+		}
+		usleep(1000);
+	} while (TST_RET < 1);
 
 	return NULL;
 }
@@ -90,16 +90,13 @@  static void *threaded(void *arg)
 static void run(void)
 {
 	struct timespec to;
-	int tid;
 	pthread_t t;
 
-	tid = tst_syscall(__NR_gettid);
-
-	SAFE_PTHREAD_CREATE(&t, NULL, threaded, (void *)&tid);
+	SAFE_PTHREAD_CREATE(&t, NULL, threaded, NULL);
 
 	/* setting absolute timeout for futex2 */
 	SAFE_CLOCK_GETTIME(CLOCK_MONOTONIC, &to);
-	to.tv_sec++;
+	to.tv_sec += 5;
 
 	TEST(futex_waitv(waitv, numfutex, 0, &to, CLOCK_MONOTONIC));
 	if (TST_RET < 0) {