diff mbox series

[1/2] ioctl09: wait for udevd complete the uevent handling

Message ID 20220420114720.1463473-1-liwang@redhat.com
State Superseded
Headers show
Series [1/2] ioctl09: wait for udevd complete the uevent handling | expand

Commit Message

Li Wang April 20, 2022, 11:47 a.m. UTC
This is quite necessary to wait for uevent handing settled before
checking the partition is taken effect.

  `udevadm monitor` listening from another terminal:

  KERNEL[447.077683] change   /devices/virtual/block/loop0 (block)
  KERNEL[447.077939] add      /devices/virtual/block/loop0/loop0p1 (block)
  KERNEL[447.077999] add      /devices/virtual/block/loop0/loop0p2 (block)
  UDEV  [447.081138] change   /devices/virtual/block/loop0 (block)
  UDEV  [447.083490] add      /devices/virtual/block/loop0/loop0p1 (block)
  UDEV  [447.086898] add      /devices/virtual/block/loop0/loop0p2 (block)

Which helps to get rid of such boring failures:

  ioctl09.c:44: TPASS: access /sys/block/loop0/loop0p1 succeeds
  ioctl09.c:52: TPASS: access /dev/loop0p1 succeeds
  ioctl09.c:47: TFAIL: access /sys/block/loop0/loop0p2 fails
  ioctl09.c:55: TFAIL: access /dev/loop0p2 fails

I manually confirmed those failures disappeared from a reproducible system
even with running 10000 times.

Signed-off-by: Li Wang <liwang@redhat.com>
---
 testcases/kernel/syscalls/ioctl/ioctl09.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Cyril Hrubis April 20, 2022, 12:05 p.m. UTC | #1
Hi!
> This is quite necessary to wait for uevent handing settled before
> checking the partition is taken effect.
> 
>   `udevadm monitor` listening from another terminal:
> 
>   KERNEL[447.077683] change   /devices/virtual/block/loop0 (block)
>   KERNEL[447.077939] add      /devices/virtual/block/loop0/loop0p1 (block)
>   KERNEL[447.077999] add      /devices/virtual/block/loop0/loop0p2 (block)
>   UDEV  [447.081138] change   /devices/virtual/block/loop0 (block)
>   UDEV  [447.083490] add      /devices/virtual/block/loop0/loop0p1 (block)
>   UDEV  [447.086898] add      /devices/virtual/block/loop0/loop0p2 (block)
> 
> Which helps to get rid of such boring failures:
> 
>   ioctl09.c:44: TPASS: access /sys/block/loop0/loop0p1 succeeds
>   ioctl09.c:52: TPASS: access /dev/loop0p1 succeeds
>   ioctl09.c:47: TFAIL: access /sys/block/loop0/loop0p2 fails
>   ioctl09.c:55: TFAIL: access /dev/loop0p2 fails
> 
> I manually confirmed those failures disappeared from a reproducible system
> even with running 10000 times.
> 
> Signed-off-by: Li Wang <liwang@redhat.com>
> ---
>  testcases/kernel/syscalls/ioctl/ioctl09.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/testcases/kernel/syscalls/ioctl/ioctl09.c b/testcases/kernel/syscalls/ioctl/ioctl09.c
> index 9728ecb9c..533a4a5d0 100644
> --- a/testcases/kernel/syscalls/ioctl/ioctl09.c
> +++ b/testcases/kernel/syscalls/ioctl/ioctl09.c
> @@ -58,6 +58,10 @@ static void check_partition(int part_num, bool value)
>  
>  static void verify_ioctl(void)
>  {
> +	char cmd[BUFSIZ];
> +
> +	sprintf(cmd, "udevadm settle --exit-if-exists=%s", dev_path);
> +
>  	const char *const cmd_parted_old[] = {"parted", "-s", "test.img",
>  					      "mklabel", "msdos", "mkpart",
>  					      "primary", "ext4", "1M", "10M",
> @@ -75,11 +79,13 @@ static void verify_ioctl(void)
>  
>  	loopinfo.lo_flags =  LO_FLAGS_PARTSCAN;
>  	SAFE_IOCTL(dev_fd, LOOP_SET_STATUS, &loopinfo);
> +	tst_system(cmd);
>  	check_partition(1, true);
>  	check_partition(2, false);
>  
>  	change_partition(cmd_parted_new);
>  	TST_RETRY_FUNC(ioctl(dev_fd, BLKRRPART, 0), TST_RETVAL_EQ0);
> +	tst_system(cmd);
>  	check_partition(1, true);
>  	check_partition(2, true);

We allready use TST_RETRY_FN_EXP_BACKOFF() in there with 30 second
timeout isn't that enough? And if it isn't wouldn't simply increasing
the timeout to a minute or two fix the issue?
Li Wang April 20, 2022, 12:31 p.m. UTC | #2
Cyril Hrubis <chrubis@suse.cz> wrote:



>
> >  static void verify_ioctl(void)
> >  {
> > +     char cmd[BUFSIZ];
> > +
> > +     sprintf(cmd, "udevadm settle --exit-if-exists=%s", dev_path);
> > +
> >       const char *const cmd_parted_old[] = {"parted", "-s", "test.img",
> >                                             "mklabel", "msdos", "mkpart",
> >                                             "primary", "ext4", "1M",
> "10M",
> > @@ -75,11 +79,13 @@ static void verify_ioctl(void)
> >
> >       loopinfo.lo_flags =  LO_FLAGS_PARTSCAN;
> >       SAFE_IOCTL(dev_fd, LOOP_SET_STATUS, &loopinfo);
> > +     tst_system(cmd);
> >       check_partition(1, true);
> >       check_partition(2, false);
> >
> >       change_partition(cmd_parted_new);
> >       TST_RETRY_FUNC(ioctl(dev_fd, BLKRRPART, 0), TST_RETVAL_EQ0);
> > +     tst_system(cmd);
> >       check_partition(1, true);
> >       check_partition(2, true);
>
> We allready use TST_RETRY_FN_EXP_BACKOFF() in there with 30 second
>

I have to say, for most systems 30 sec is long enough, but we definitely got
sporadically fails in larger-scale automation testing.


timeout isn't that enough? And if it isn't wouldn't simply increasing
> the timeout to a minute or two fix the issue?
>


That should be better, I just have a try on my reproducible system,
but it does not work with enlarged to 180 seconds.

ioctl09.c:52: TPASS: access /dev/loop0p1 succeeds
octl09.c:47: TFAIL: access /sys/block/loop0/loop0p2 fails
Test timeouted, sending SIGKILL!
tst_test.c:1509: TINFO: If you are running on slow machine, try exporting
LTP_TIMEOUT_MUL > 1
tst_test.c:1511: TBROK: Test killed! (timeout?)


Note, the `udevadm settle` uses 180s as default timeout as well,
but it can work, I will look into udevadm.c to see if that does
something additional besides the waiting.

If so, we might need to remove the TST_RETRY_FN_EXP_BACKOFF
from this test.
Cyril Hrubis April 20, 2022, 12:52 p.m. UTC | #3
Hi!
> timeout isn't that enough? And if it isn't wouldn't simply increasing
> > the timeout to a minute or two fix the issue?
> >
> 
> 
> That should be better, I just have a try on my reproducible system,
> but it does not work with enlarged to 180 seconds.

You have to set the test timeout to be four times of this timeout
otherwise it will report that the test had timeouted sporadically.

> ioctl09.c:52: TPASS: access /dev/loop0p1 succeeds
> octl09.c:47: TFAIL: access /sys/block/loop0/loop0p2 fails
> Test timeouted, sending SIGKILL!
> tst_test.c:1509: TINFO: If you are running on slow machine, try exporting
> LTP_TIMEOUT_MUL > 1
> tst_test.c:1511: TBROK: Test killed! (timeout?)
> 
> 
> Note, the `udevadm settle` uses 180s as default timeout as well,
> but it can work, I will look into udevadm.c to see if that does
> something additional besides the waiting.

I guess that the difference is that when we wait on a socket actively we
return sooner than with the exp backoff. By definition the exp backoff
may wait for twice as long as the udevadm and because of that the test
may actually timeout because we waited just a little bit longer.

> If so, we might need to remove the TST_RETRY_FN_EXP_BACKOFF
> from this test.

That would be valid option, remove the exp backoff and wait activelly.
Btw we already have an infrastructure for matching kernel event uevents
in the kernel/uevent/ directory. If we split the uevent.h there into a
header and libs/ltpuevent/ we could simply wait for the event by filling
in the uevent_desc structure and calling wait_for_uevents().
Li Wang April 21, 2022, 6:36 a.m. UTC | #4
On Wed, Apr 20, 2022 at 8:50 PM Cyril Hrubis <chrubis@suse.cz> wrote:


>
> > Note, the `udevadm settle` uses 180s as default timeout as well,
> > but it can work, I will look into udevadm.c to see if that does
> > something additional besides the waiting.
>
> I guess that the difference is that when we wait on a socket actively we
> return sooner than with the exp backoff. By definition the exp backoff
> may wait for twice as long as the udevadm and because of that the test
> may actually timeout because we waited just a little bit longer.
>

That's right, I overlooked that exp backoff time is longer than the set
value.



>
> > If so, we might need to remove the TST_RETRY_FN_EXP_BACKOFF
> > from this test.
>
> That would be valid option, remove the exp backoff and wait activelly.
>

Yes, wait actively and confirm event handling is completed.
(`udevadm settle` is the direct way I can think of now)



> Btw we already have an infrastructure for matching kernel event uevents
> in the kernel/uevent/ directory. If we split the uevent.h there into a
> header and libs/ltpuevent/ we could simply wait for the event by filling
> in the uevent_desc structure and calling wait_for_uevents().
>

Good to know this, I have looked through that but am afraid it can't
fully satisfy our requirements.

Because the wait_for_uevents is only falling into an infinite loop
and returns if detecting the expected UEVENT (which was sent
from the kernel function kobject_uevent_env).

The key point here is, we have to confirm `udevd` completed the UEVENT
handling, otherwise only detecting event exists is not enough, there is
still
potentially get the race condition again if the event queue there.
diff mbox series

Patch

diff --git a/testcases/kernel/syscalls/ioctl/ioctl09.c b/testcases/kernel/syscalls/ioctl/ioctl09.c
index 9728ecb9c..533a4a5d0 100644
--- a/testcases/kernel/syscalls/ioctl/ioctl09.c
+++ b/testcases/kernel/syscalls/ioctl/ioctl09.c
@@ -58,6 +58,10 @@  static void check_partition(int part_num, bool value)
 
 static void verify_ioctl(void)
 {
+	char cmd[BUFSIZ];
+
+	sprintf(cmd, "udevadm settle --exit-if-exists=%s", dev_path);
+
 	const char *const cmd_parted_old[] = {"parted", "-s", "test.img",
 					      "mklabel", "msdos", "mkpart",
 					      "primary", "ext4", "1M", "10M",
@@ -75,11 +79,13 @@  static void verify_ioctl(void)
 
 	loopinfo.lo_flags =  LO_FLAGS_PARTSCAN;
 	SAFE_IOCTL(dev_fd, LOOP_SET_STATUS, &loopinfo);
+	tst_system(cmd);
 	check_partition(1, true);
 	check_partition(2, false);
 
 	change_partition(cmd_parted_new);
 	TST_RETRY_FUNC(ioctl(dev_fd, BLKRRPART, 0), TST_RETVAL_EQ0);
+	tst_system(cmd);
 	check_partition(1, true);
 	check_partition(2, true);