[v2] futex_cmp_requeue01: fix test expectations

Message ID ec1792be62432bb64e4d5c5085f6ebfa7e5d4b53.1573562645.git.jstancek@redhat.com
State New

Commit Message

Jan Stancek Nov. 12, 2019, 2:08 p.m. UTC
There are a couple of problems:

1. Keeping the same uaddr1 value across the requeue leads to a side effect.
If there is a signal or spurious wakeup, the futex_wait() operation can
be restarted (by the kernel) with the same input parameters, which means
that a process requeued to uaddr2 goes back to queueing on uaddr1. Such
a use case is currently not valid, as it is expected that the uaddr1 value
changes, so that any late waiter can notice it and bail out with
-EWOULDBLOCK (-EAGAIN).
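
For illustration, changing the value first lets the waiter side detect
this situation (a sketch using the futextest.h helpers; old_val and
timeout are placeholders):

  /*
   * The parent bumps *uaddr1 before the requeue, so a late or
   * kernel-restarted waiter sees a value mismatch and returns
   * with EAGAIN instead of blocking on uaddr1 again.
   */
  if (futex_wait(&futexes[0], old_val, &timeout, 0) == -1
      && errno == EAGAIN)
          tst_res(TINFO, "spurious wakeup or signal");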

2. The test doesn't account for the possibility of spurious wakeups.

3. The test expects exact counts of woken/requeued processes.
Man page wording like "at most" or "a maximum of" does not guarantee
exact numbers.

Change the futex value before the requeue, so child processes can observe
spurious wakeups.

Change the child process so that it always sleeps, to guarantee that
TST_PROCESS_STATE_WAIT() always succeeds in finding a sleeping process.

Change the default timeout to 5 seconds. Spawning 1000 processes on slower
systems was getting close to 1 second. This doesn't affect runtime in the
PASS-ing case.

Change the checks so that they reflect the wording in the man page, and
don't test for absolute values (see comments in code). One exception is
that we assume futex_wake(tc->num_waiters) will always wake up all
remaining processes.

Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 .../kernel/syscalls/futex/futex_cmp_requeue01.c    | 105 +++++++++++++++------
 1 file changed, 78 insertions(+), 27 deletions(-)

v2 changes:
- use tst_multiply_timeout()
- remove tst_atomic_store from loop
- shared variables moved to single struct

Comments

Cyril Hrubis Nov. 20, 2019, 4:16 p.m. UTC | #1
Hi!
> 1. Keeping the same uaddr1 value across the requeue leads to a side effect.
> If there is a signal or spurious wakeup, the futex_wait() operation can
> be restarted (by the kernel) with the same input parameters, which means
> that a process requeued to uaddr2 goes back to queueing on uaddr1. Such
> a use case is currently not valid, as it is expected that the uaddr1 value
> changes, so that any late waiter can notice it and bail out with
> -EWOULDBLOCK (-EAGAIN).
> 
> 2. The test doesn't account for the possibility of spurious wakeups.
> 
> 3. The test expects exact counts of woken/requeued processes.
> Man page wording like "at most" or "a maximum of" does not guarantee
> exact numbers.
> 
> Change the futex value before the requeue, so child processes can observe
> spurious wakeups.
> 
> Change the child process so that it always sleeps, to guarantee that
> TST_PROCESS_STATE_WAIT() always succeeds in finding a sleeping process.
> 
> Change the default timeout to 5 seconds. Spawning 1000 processes on slower
> systems was getting close to 1 second. This doesn't affect runtime in the
> PASS-ing case.
> 
> Change the checks so that they reflect the wording in the man page, and
> don't test for absolute values (see comments in code). One exception is
> that we assume futex_wake(tc->num_waiters) will always wake up all
> remaining processes.

I was thinking about this, and the only imprecision we can get here is
the number of spuriously woken-up processes at the end of the test, and
that is because we cannot tell where exactly the spurious wakeup
happened, right?

That means that all the assertions we could have made without the
spurious wakeups will still hold, but we will have to take the number of
spurious wakeups as our measurement error, just like in physics.

Also, futex_cmp_requeue() should prefer waking processes up over the
requeue operation, so basically:

TST_RET - num_requeues = set_wakes

Unless a spurious wakeup happened between the requeue and wake
operations, which means that num_requeues can be smaller because we
will wake up fewer processes than were requeued. So if we sampled
spurious wakeups before the requeue operation and after the futex_wake()
for the requeued processes, and called that delta_spurious, we would get
a range:

TST_RET - num_requeues >= set_wakes

&&

TST_RET - num_requeues - delta_spurious <= set_wakes

Similarly, the number of processes left waiting on the futex should be
in a range of expected and MIN(0, expected - spurious) where
expected = num_waiters - set_wakes - set_requeues.

And lastly the num_requeues should be between set_requeues and MIN(0,
set_requeues - spurious).
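
As a worked example (hypothetical numbers): with num_waiters = 10,
set_wakes = 3, set_requeues = 2 and one spurious wakeup of a requeued
process between the requeue and the futex_wake(), we get TST_RET = 5 and
num_requeues = 1, so TST_RET - num_requeues = 4 >= set_wakes, and with
delta_spurious = 1 also 4 - 1 = 3 <= set_wakes, so both bounds hold.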

Or did I miss something that invalidates my line of thought?


Also btw, we are missing a tcase where we attempt to wake more processes
than are sleeping on the futex and check that we haven't requeued any
because all were woken up.
Jan Stancek Nov. 21, 2019, 8:04 a.m. UTC | #2
----- Original Message -----
> I was thinking about this, and the only imprecision we can get here is
> the number of spuriously woken-up processes at the end of the test, and
> that is because we cannot tell where exactly the spurious wakeup
> happened, right?
> 
> That means that all the assertions we could have made without the
> spurious wakeups will still hold, but we will have to take the number of
> spurious wakeups as our measurement error, just like in physics.
> 
> Also, futex_cmp_requeue() should prefer waking processes up over the
> requeue operation, so basically:
> 
> TST_RET - num_requeues = set_wakes

It comes down to how we interpret the man page (more below).

> 
> Unless a spurious wakeup happened between the requeue and wake
> operations, which means that num_requeues can be smaller because we
> will wake up fewer processes than were requeued. So if we sampled
> spurious wakeups before the requeue operation and after the futex_wake()
> for the requeued processes, and called that delta_spurious, we would get
> a range:
> 
> TST_RET - num_requeues >= set_wakes

This doesn't look correct if we consider spurious wakeups:

5 processes, set_wakes = 5, set_requeues = 0, 1 spuriously wakes up,
remaining 4 are woken up by futex(), 0 are requeued:

4 - 0 >= 5

> 
> &&
> 
> TST_RET - num_requeues - delta_spurious <= set_wakes

This seems OK - the number of processes woken up by futex_cmp_requeue()
must be at most set_wakes.

If the numbers of processes we find on uaddr1/uaddr2 have the expected
values and nothing timed out, that should imply the above as well.

> 
> Similarly, the number of processes left waiting on the futex should be
> in a range of expected and MIN(0, expected - spurious) where

I don't get the "MIN()". Is it 0 or less than zero?

> expected = num_waiters - set_wakes - set_requeues.

This might be where I took the man page too pessimistically. Specifically
this part: "wakes up a maximum of val waiters". I took that as "it can
wake up fewer waiters at any time", while your formulas seem to imply
that "if there are _enough_ waiters, it will _always_ wake up val
waiters".

Looking at futex_requeue():
                if (++task_count <= nr_wake && !requeue_pi) {
                        mark_wake_futex(&wake_q, this);
                        continue;
                }
the latter looks plausible. We don't use FUTEX_CMP_REQUEUE_PI, which
appears to be the only way to avoid a wakeup (when there are enough
waiters).

If we go with the latter case, then I agree v2 is unnecessarily cautious
in its assertions.

> 
> And lastly the num_requeues should be between set_requeues and MIN(0,
> set_requeues - spurious).

Was MIN supposed to be MAX?

> 
> Or did I miss something that invalidates my line of thought?
> 
> 
> Also btw, we are missing a tcase where we attempt to wake more processes
> than are sleeping on the futex and check that we haven't requeued any
> because all were woken up.

That looks like it would complicate things, because we could no longer
assume there are "enough waiters".

  expected = num_waiters - set_wakes - set_requeues

could go negative. It might be enough to have a tcase where
num_waiters == set_wakes and set_requeues == 0.
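
E.g. an entry along these lines in the tcases array (field names as in
the tcase struct in the patch below; values hypothetical):

  /* more than enough wakes: everyone is woken up, nothing is requeued */
  { .num_waiters = 100, .set_wakes = 100, .set_requeues = 0 },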
Cyril Hrubis Nov. 21, 2019, 11:02 a.m. UTC | #3
Hi!
> > Unless a spurious wakeup happened between the requeue and wake
> > operations, which means that num_requeues can be smaller because we
> > will wake up fewer processes than were requeued. So if we sampled
> > spurious wakeups before the requeue operation and after the futex_wake()
> > for the requeued processes, and called that delta_spurious, we would get
> > a range:
> > 
> > TST_RET - num_requeues >= set_wakes
> 
> This doesn't look correct if we consider spurious wakeups:
> 
> 5 processes, set_wakes = 5, set_requeues = 0, 1 spuriously wakes up,
> remaining 4 are woken up by futex(), 0 are requeued:
> 
> 4 - 0 >= 5

Well, I was betting on the fact that we wake up far fewer processes than
are blocked on the futex, and that waking up processes takes precedence.
If we can add delta_spurious to the right side, that wouldn't change much
and we would end up being correct all the time, i.e.

TST_RET + delta_spurious - num_requeues >= set_wakes

That together with the next one gives us a range:

TST_RET - num_requeues = set_wakes +- delta_spurious
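
With your example, assuming the spurious wakeup lands inside the sampling
window: TST_RET = 4, num_requeues = 0, delta_spurious = 1, and
4 + 1 - 0 >= 5 holds.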

> > 
> > &&
> > 
> > TST_RET - num_requeues - delta_spurious <= set_wakes
> 
> This seems OK - the number of processes woken up by futex_cmp_requeue()
> must be at most set_wakes.
> 
> If the numbers of processes we find on uaddr1/uaddr2 have the expected
> values and nothing timed out, that should imply the above as well.
> 
> > 
> > Similarly, the number of processes left waiting on the futex should be
> > in a range of expected and MIN(0, expected - spurious) where
> 
> I don't get the "MIN()". Is it 0 or less than zero?

Sigh, it should have been MAX() to avoid negative numbers in case
spurious > expected.

> > expected = num_waiters - set_wakes - set_requeues.
> 
> This might be where I took the man page too pessimistically. Specifically
> this part: "wakes up a maximum of val waiters". I took that as "it can
> wake up fewer waiters at any time", while your formulas seem to imply
> that "if there are _enough_ waiters, it will _always_ wake up val
> waiters".
> 
> Looking at futex_requeue():
>                 if (++task_count <= nr_wake && !requeue_pi) {
>                         mark_wake_futex(&wake_q, this);
>                         continue;
>                 }
> the latter looks plausible. We don't use FUTEX_CMP_REQUEUE_PI, which
> appears to be the only way to avoid a wakeup (when there are enough
> waiters).
> 
> If we go with the latter case, then I agree v2 is unnecessarily cautious
> in its assertions.

As far as I can tell, the wording in the man page is there to suggest that
it cannot wake more processes than are sleeping on the futex. I do not
think it's explicitly written anywhere that it will never fail to wake up
the waiters it was requested to wake, but all of the code from the
original futex tests depended on that, and the rewritten tests do as
well; have a look at futex_wake02.c for instance. Also, there is no point
in waking fewer waiters than are available; it would only complicate the
API, since we would have to loop over futex_wake() in userspace. Looking
at "Futexes are tricky" by Drepper, none of the example code does that
either.
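
A minimal standalone check of that assumption could look like this (a
sketch using raw syscalls, with a crude sleep() where the real tests use
TST_PROCESS_STATE_WAIT(), and ignoring spurious wakeups):

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <sys/mman.h>
  #include <sys/wait.h>
  #include <sys/syscall.h>
  #include <linux/futex.h>

  int main(void)
  {
          int i, nr_children = 5, val = 3;
          long woken;
          int *futex = mmap(NULL, sizeof(*futex), PROT_READ | PROT_WRITE,
                            MAP_ANONYMOUS | MAP_SHARED, -1, 0);

          *futex = 0;

          for (i = 0; i < nr_children; i++) {
                  if (fork() == 0) {
                          /* block while *futex == 0 */
                          syscall(SYS_futex, futex, FUTEX_WAIT, 0,
                                  NULL, NULL, 0);
                          exit(0);
                  }
          }

          sleep(1); /* crude: give all children time to block */

          /* the assumption: exactly MIN(val, nr_children) are woken */
          woken = syscall(SYS_futex, futex, FUTEX_WAKE, val, NULL, NULL, 0);
          printf("futex_wake(%d) woke %ld, expected %d\n", val, woken, val);

          /* release the remaining waiters and reap everyone */
          syscall(SYS_futex, futex, FUTEX_WAKE, nr_children, NULL, NULL, 0);
          while (wait(NULL) > 0)
                  ;
          return 0;
  }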

> > And lastly the num_requeues should be between set_requeues and MIN(0,
> > set_requeues - spurious).
> 
> Was MIN supposed to be MAX?

Yes, indeed, sorry.

> > Or did I miss something that invalidates my line of thought?
> > 
> > 
> > Also btw, we are missing a tcase where we attempt to wake more processes
> > than are sleeping on the futex and check that we haven't requeued any
> > because all were woken up.
> 
> That looks like it would complicate things, because we could no longer
> assume there are "enough waiters".
> 
>   expected = num_waiters - set_wakes - set_requeues
> 
> could go negative. It might be enough to have a tcase where
> num_waiters == set_wakes and set_requeues == 0.

I was thinking of putting that case into a separate testcase; all that
we would have to check there for a PASS would be that no sleepers were
requeued if the number of processes to wake was >= the number of
sleeping processes.
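
The PASS check there could be as simple as (a sketch, reusing
num_requeues from the patch below):

  /* set_wakes >= num_waiters, so nothing should have been requeued */
  if (num_requeues == 0)
          tst_res(TPASS, "all waiters woken up, none requeued");
  else
          tst_res(TFAIL, "%d waiters requeued, expected 0", num_requeues);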
Jan Stancek Nov. 21, 2019, 4:07 p.m. UTC | #4
----- Original Message -----
> Hi!
> > > Unless a spurious wakeup happened between the requeue and wake
> > > operations, which means that num_requeues can be smaller because we
> > > will wake up fewer processes than were requeued. So if we sampled
> > > spurious wakeups before the requeue operation and after the futex_wake()
> > > for the requeued processes, and called that delta_spurious, we would get
> > > a range:
> > > 
> > > TST_RET - num_requeues >= set_wakes
> > 
> > This doesn't look correct if we consider spurious wakeups:
> > 
> > 5 processes, set_wakes = 5, set_requeues = 0, 1 spuriously wakes up,
> > remaining 4 are woken up by futex(), 0 are requeued:
> > 
> > 4 - 0 >= 5
> 
> Well, I was betting on the fact that we wake up far fewer processes than
> are blocked on the futex, and that waking up processes takes precedence.
> If we can add delta_spurious to the right side, that wouldn't change much
> and we would end up being correct all the time, i.e.
> 
> TST_RET + delta_spurious - num_requeues >= set_wakes

I'd go with just spurious instead of delta_spurious. If there is a
spurious wakeup before the requeue (and the first sample), wouldn't that
fail in the same way as the example above?

TST_RET + delta_spurious - num_requeues >= set_wakes
4 + 0 - 0 >= 5

Also, delta_spurious looks racy, because it's based on a counter that is
increased only after userspace gets a chance to run, but the process may
have already been removed from the futex queue on the kernel side. So the
'first sample before requeue' can see an inaccurate state.

So I'd tweak your check to:
  set_wakes - spurious <= TST_RET - num_requeues <= set_wakes + spurious
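
In test terms that might look like (a sketch; spurious would be read from
the shared counter in the patch below):

  long woken = TST_RET - num_requeues;

  if (woken < tc->set_wakes - spurious || woken > tc->set_wakes + spurious)
          tst_res(TFAIL, "futex_cmp_requeue() woke %ld, expected %d +- %d",
                  woken, tc->set_wakes, spurious);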

Patch

diff --git a/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c b/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c
index f5264c2ba26f..1711861ef298 100644
--- a/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c
+++ b/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c
@@ -19,7 +19,14 @@ 
 #include "tst_test.h"
 #include "futextest.h"
 
-static futex_t *futexes;
+struct shared_data {
+	futex_t futexes[2];
+	int spurious;
+	int test_done;
+};
+
+static struct shared_data *sd;
+static int max_sleep_ms;
 
 static struct tcase {
 	int num_waiters;
@@ -37,14 +44,30 @@  static struct tcase {
 
 static void do_child(void)
 {
-	struct timespec usec = tst_ms_to_timespec(2000);
+	int slept_for_ms = 0;
+	struct timespec usec = tst_ms_to_timespec(max_sleep_ms);
 	int pid = getpid();
+	int ret = 0;
+
+	if (futex_wait(&sd->futexes[0], sd->futexes[0], &usec, 0) == -1) {
+		if (errno == EAGAIN) {
+			/* spurious wakeup or signal */
+			tst_atomic_inc(&sd->spurious);
+		} else {
+			tst_res(TFAIL | TERRNO, "process %d wasn't woken up",
+				pid);
+			ret = 1;
+		}
+	}
 
-	if (!futex_wait(&futexes[0], futexes[0], &usec, 0))
-		exit(0);
+	/* make sure TST_PROCESS_STATE_WAIT() can always succeed */
+	while (!tst_atomic_load(&sd->test_done)
+		&& (slept_for_ms < max_sleep_ms)) {
+		usleep(50000);
+		slept_for_ms += 50;
+	}
 
-	tst_res(TINFO | TERRNO, "process %d wasn't woken up", pid);
-	exit(1);
+	exit(ret);
 }
 
 static void verify_futex_cmp_requeue(unsigned int n)
@@ -55,6 +78,8 @@  static void verify_futex_cmp_requeue(unsigned int n)
 	int pid[tc->num_waiters];
 	int exp_ret = tc->set_wakes + tc->set_requeues;
 
+	tst_atomic_store(0, &sd->spurious);
+	tst_atomic_store(0, &sd->test_done);
 	for (i = 0; i < tc->num_waiters; i++) {
 		pid[i] = SAFE_FORK();
 		if (!pid[i])
@@ -64,61 +89,87 @@  static void verify_futex_cmp_requeue(unsigned int n)
 	for (i = 0; i < tc->num_waiters; i++)
 		TST_PROCESS_STATE_WAIT(pid[i], 'S');
 
-	TEST(futex_cmp_requeue(&futexes[0], futexes[0], &futexes[1],
-	     tc->set_wakes, tc->set_requeues, 0));
-	if (TST_RET != exp_ret) {
-		tst_res(TFAIL, "futex_cmp_requeue() returned %ld, expected %d",
+	tst_res(TINFO, "Test %d: waiters: %d, wakes: %d, requeues: %d",
+		n, tc->num_waiters, tc->set_wakes, tc->set_requeues);
+
+	/*
+	 * change futex value, so any spurious wakeups or signals after
+	 * this point get bounced back to userspace.
+	 */
+	sd->futexes[0]++;
+	sd->futexes[1]++;
+
+	/*
+	 * Wakes up a maximum of tc->set_wakes waiters. tc->set_requeues
+	 * specifies an upper limit on the number of waiters that are requeued.
+	 * Returns the total number of waiters that were woken up or requeued.
+	 */
+	TEST(futex_cmp_requeue(&sd->futexes[0], sd->futexes[0], &sd->futexes[1],
+		tc->set_wakes, tc->set_requeues, 0));
+
+	/* Fail if more than requested wakes + requeues were returned */
+	if (TST_RET > exp_ret) {
+		tst_res(TFAIL, "futex_cmp_requeue() returned %ld, expected <= %d",
 			TST_RET, exp_ret);
+	} else {
+		tst_res(TINFO, "futex_cmp_requeue() returned %ld", TST_RET);
 	}
 
-	num_requeues = futex_wake(&futexes[1], tc->num_waiters, 0);
-	num_waits = futex_wake(&futexes[0], tc->num_waiters, 0);
+	num_requeues = futex_wake(&sd->futexes[1], tc->num_waiters, 0);
+	num_waits = futex_wake(&sd->futexes[0], tc->num_waiters, 0);
 
+	tst_atomic_store(1, &sd->test_done);
 	for (i = 0; i < tc->num_waiters; i++) {
 		SAFE_WAITPID(pid[i], &status, 0);
 		if (WIFEXITED(status) && !WEXITSTATUS(status))
 			num_total++;
 	}
 
+	tst_res(TINFO, "children woken, futex0: %d, futex1: %d, "
+		"spurious wakeups: %d",
+		num_waits, num_requeues, tst_atomic_load(&sd->spurious));
+
+	/* Fail if any waiter timed out */
 	if (num_total != tc->num_waiters) {
 		tst_res(TFAIL, "%d waiters were not woken up normally",
 			tc->num_waiters - num_total);
 		return;
 	}
 
-	if (num_requeues != tc->set_requeues) {
+	/* Fail if more than upper limit of tc->set_requeues is at futex1 */
+	if (num_requeues > tc->set_requeues) {
 		tst_res(TFAIL,
-			"futex_cmp_requeue() requeues %d waiters, expected %d",
+			"requeued %d waiters, expected <= %d",
 			num_requeues, tc->set_requeues);
 		return;
 	}
 
-	if (tc->num_waiters - num_requeues - num_waits != tc->set_wakes) {
-		tst_res(TFAIL,
-			"futex_cmp_requeue() woke up %d waiters, expected %d",
-			tc->num_waiters - num_requeues - num_waits,
-			tc->set_wakes);
+	/* Fail if more than tc->set_wakes were woken up by requeue */
+	exp_ret = tc->num_waiters - TST_RET - tst_atomic_load(&sd->spurious);
+	if (num_waits < exp_ret) {
+		tst_res(TFAIL, "woken up %d on futex0, expected >= %d",
+			num_waits, exp_ret);
 		return;
 	}
 
-	tst_res(TPASS,
-		"futex_cmp_requeue() woke up %d waiters and requeued %d waiters",
-		tc->set_wakes, tc->set_requeues);
+	tst_res(TPASS, "futex_cmp_requeue()");
 }
 
 static void setup(void)
 {
-	futexes = SAFE_MMAP(NULL, sizeof(futex_t) * 2, PROT_READ | PROT_WRITE,
+	max_sleep_ms = tst_multiply_timeout(5000);
+
+	sd = SAFE_MMAP(NULL, sizeof(*sd), PROT_READ | PROT_WRITE,
 			    MAP_ANONYMOUS | MAP_SHARED, -1, 0);
 
-	futexes[0] = FUTEX_INITIALIZER;
-	futexes[1] = FUTEX_INITIALIZER + 1;
+	sd->futexes[0] = FUTEX_INITIALIZER;
+	sd->futexes[1] = FUTEX_INITIALIZER + 1000;
 }
 
 static void cleanup(void)
 {
-	if (futexes)
-		SAFE_MUNMAP((void *)futexes, sizeof(futex_t) * 2);
+	if (sd)
+		SAFE_MUNMAP((void *)sd, sizeof(*sd));
 }
 
 static struct tst_test test = {