diff mbox series

syscalls/preadv203: Add basic RWF_NOWAIT test

Message ID 20190128134656.27979-1-metan@ucw.cz
State Superseded
Series syscalls/preadv203: Add basic RWF_NOWAIT test

Commit Message

Cyril Hrubis Jan. 28, 2019, 1:46 p.m. UTC
From: Cyril Hrubis <chrubis@suse.cz>

We are attempting to trigger the EAGAIN path for the RWF_NOWAIT flag.

In order to do so the test runs three threads:

* nowait_reader: reads from a random offset from a random file with
                 RWF_NOWAIT flag and expects to get EAGAIN and short
                 read sooner or later

* writer_thread: rewrites random file in order to keep the underlying device
                 busy so that pages evicted from cache cannot be faulted
                 immediately

* cache_dropper: attempts to evict pages from a cache in order for reader to
                 hit evicted page sooner or later

Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
CC: Jiri Kosina <jikos@kernel.org>
CC: Linux-MM <linux-mm@kvack.org>
CC: kernel list <linux-kernel@vger.kernel.org>
CC: Linux API <linux-api@vger.kernel.org>

---

I was wondering if we can do a better job at flushing the caches. Is
there an interface for flushing caches just for the device we are using
for the test?

Also the RWF_NOWAIT should probably be benchmarked as well but that is
completely out of scope for LTP.

 runtest/syscalls                              |   2 +
 testcases/kernel/syscalls/preadv2/.gitignore  |   2 +
 testcases/kernel/syscalls/preadv2/Makefile    |   4 +
 testcases/kernel/syscalls/preadv2/preadv203.c | 266 ++++++++++++++++++
 4 files changed, 274 insertions(+)
 create mode 100644 testcases/kernel/syscalls/preadv2/preadv203.c

Comments

Petr Vorel Feb. 6, 2019, 7:52 a.m. UTC | #1
Hi Cyril,

> From: Cyril Hrubis <chrubis@suse.cz>
Reviewed-by: Petr Vorel <pvorel@suse.cz>

> We are attempting to trigger the EAGAIN path for the RWF_NOWAIT flag.

> In order to do so the test runs three threads:

> * nowait_reader: reads from a random offset from a random file with
>                  RWF_NOWAIT flag and expects to get EAGAIN and short
>                  read sooner or later

> * writer_thread: rewrites random file in order to keep the underlying device
>                  bussy so that pages evicted from cache cannot be faulted
typo => busy
>                  immediatelly
typo => immediately

...
> +++ b/testcases/kernel/syscalls/preadv2/preadv203.c
> @@ -0,0 +1,266 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Copyright (C) 2019 Cyril Hrubis <chrubis@suse.cz>
> + */
> +
> +/*
> + * This is a basic functional test for RWF_NOWAIT flag, we are attempting to
> + * force preadv2() either to return a short read or EAGAIN with three
> + * concurelntly running threads:
typo => concurrently

> + *
> + *  nowait_reader: reads from a random offset from a random file with
> + *                 RWF_NOWAIT flag and expects to get EAGAIN and short
> + *                 read sooner or later
> + *
> + *  writer_thread: rewrites random file in order to keep the underlying device
> + *                 bussy so that pages evicted from cache cannot be faulted
typo => busy

> + *                 immediatelly
typo => immediately

> + *
> + *  cache_dropper: attempts to evict pages from a cache in order for reader to
> + *                 hit evicted page sooner or later
> + */
> +
> +/*
> + * If test fails with EOPNOTSUPP you have likely hit a glibc bug:
> + *
> + * https://sourceware.org/bugzilla/show_bug.cgi?id=23579
> + *
> + * Which can be worked around by calling preadv2() directly by syscall() such as:
> + *
> + * static ssize_t sys_preadv2(int fd, const struct iovec *iov, int iovcnt,
> + *                            off_t offset, int flags)
> + * {
> + *	return syscall(SYS_preadv2, fd, iov, iovcnt, offset, offset>>32, flags);
> + * }
I wonder if we want to either warn the user or run it both via the (g)libc
wrapper and directly via the syscall.

BTW testing on kernel 4.20.0, glibc 2.28 with sys_preadv2() and still get TBROK
EOPNOTSUPP. I tried to test on ext[2-4], btrfs. I wonder what I'm missing.

Kind regards,
Petr
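
To make the "run it both ways" idea concrete, here is a minimal sketch, not taken from the patch. The names libc_preadv2, raw_preadv2 and preadv2_variants are hypothetical, and it assumes the installed headers provide both preadv2() and SYS_preadv2. The raw variant casts through uint64_t so that the shift stays defined when off_t is 32 bits wide; some ABIs may need a different argument layout.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

typedef ssize_t (*preadv2_fn)(int fd, const struct iovec *iov, int iovcnt,
                              off_t offset, int flags);

/* Variant 1: whatever the C library provides. */
static ssize_t libc_preadv2(int fd, const struct iovec *iov, int iovcnt,
                            off_t offset, int flags)
{
	return preadv2(fd, iov, iovcnt, offset, flags);
}

/* Variant 2: raw syscall, bypassing the C library wrapper. */
static ssize_t raw_preadv2(int fd, const struct iovec *iov, int iovcnt,
                           off_t offset, int flags)
{
	/* Going through uint64_t keeps the shift defined for a 32-bit off_t. */
	unsigned long lo = (unsigned long)offset;
	unsigned long hi = (unsigned long)((uint64_t)offset >> 32);

	assert(iovcnt >= 0);
	return syscall(SYS_preadv2, fd, iov, iovcnt, lo, hi, flags);
}

/* Hypothetical table a test could iterate over to cover both paths. */
static const struct {
	const char *name;
	preadv2_fn fn;
} preadv2_variants[] = {
	{"libc preadv2()", libc_preadv2},
	{"syscall(SYS_preadv2)", raw_preadv2},
};
```

A test could loop over preadv2_variants and report the variant name next to each result, which would show directly whether an EOPNOTSUPP comes from the wrapper or from the kernel.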
Petr Vorel Feb. 6, 2019, 8:04 a.m. UTC | #2
Hi Cyril,

> +static void *nowait_reader(void *unused LTP_ATTRIBUTE_UNUSED)
...
> +		TEST(preadv2(fd, rd_iovec, 2, off, RWF_NOWAIT));
RWF_NOWAIT needs to be declared in lapi for old distros.

Kind regards,
Petr
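
For illustration, a fallback definition of the kind meant here could look like the sketch below. It is only an assumption about how the lapi header might do it, not the actual LTP code; the value 0x00000008 is the one the kernel uapi header linux/fs.h uses for RWF_NOWAIT, and has_nowait() is a hypothetical helper added just to show the flag in use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/uio.h>

/* Fallback for libc headers that predate the flag (value from linux/fs.h). */
#ifndef RWF_NOWAIT
# define RWF_NOWAIT 0x00000008
#endif

/* Catch a mismatch between a libc-provided value and the kernel ABI value. */
static_assert(RWF_NOWAIT == 0x00000008, "unexpected RWF_NOWAIT value");

/* Hypothetical helper: returns 1 if the flags word requests RWF_NOWAIT. */
static int has_nowait(int flags)
{
	return (flags & RWF_NOWAIT) != 0;
}
```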
Vijay Kumar Feb. 6, 2019, 8:36 a.m. UTC | #3
On Monday 28 January 2019 07:16 PM, Cyril Hrubis wrote:

> From: Cyril Hrubis <chrubis@suse.cz>
>
> We are attempting to trigger the EAGAIN path for the RWF_NOWAIT flag.
>
> In order to do so the test runs three threads:
>
> * nowait_reader: reads from a random offset from a random file with
>                   RWF_NOWAIT flag and expects to get EAGAIN and short
>                   read sooner or later
>
> * writer_thread: rewrites random file in order to keep the underlying device
>                   bussy so that pages evicted from cache cannot be faulted
>                   immediatelly
>
> * cache_dropper: attempts to evict pages from a cache in order for reader to
>                   hit evicted page sooner or later
>
> Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
> CC: Jiri Kosina <jikos@kernel.org>
> CC: Linux-MM <linux-mm@kvack.org>
> CC: kernel list <linux-kernel@vger.kernel.org>
> CC: Linux API <linux-api@vger.kernel.org>
>
> ---
>
> I was wondering if we can do a better job at flushing the caches. Is
> there an interface for flushing caches just for the device we are using
> for the test?
>

Will the BLKFLSBUF ioctl be suitable for this purpose?

Regards,
Vijay
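
A minimal sketch of what issuing that ioctl could look like follows. flush_blockdev() is a hypothetical helper, not part of the patch. The call flushes the buffer cache of one block device and normally needs root; whether it also evicts the file-backed page cache pages the reader thread hits is an open question that would have to be verified.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>	/* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to flush the buffer cache of a
 * single block device. Returns 0 on success, -1 with errno set.
 */
static int flush_blockdev(const char *dev_path)
{
	int fd, ret, saved_errno;

	assert(dev_path != NULL);

	fd = open(dev_path, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, BLKFLSBUF, 0);

	/* Preserve the ioctl() errno across close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return ret;
}
```

In the test this would take the path of the device backing the mount point instead of writing to /proc/sys/vm/drop_caches.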
Cyril Hrubis Feb. 6, 2019, 12:07 p.m. UTC | #4
Hi!
> I wonder if we want either warn user or run it both via (g)libc wrapper and
> directly the syscall.

I guess that we may drop a hint in the error message.

> BTW testing on kernel 4.20.0, glibc 2.28 with sys_preadv2() and still get TBROK
> EOPNOTSUPP. I tried to test on ext[2-4], btrfs. I wonder what I'm missing.

That's strange, are you sure that the test was calling the wrapper
instead of the glibc call?

This should really work unless the support in kernel has been disabled
in distribution kernel for some reason. I was testing it on vanilla
kernel where it worked as expected.
Cyril Hrubis Feb. 6, 2019, 12:10 p.m. UTC | #5
Hi!
> > +static void *nowait_reader(void *unused LTP_ATTRIBUTE_UNUSED)
> ...
> > +		TEST(preadv2(fd, rd_iovec, 2, off, RWF_NOWAIT));
> RWF_NOWAIT needs to be declared in lapi for old distros.

Right and we should as well return TCONF on older kernels, I haven't
tested that properly yet but I guess that we should check for EOPNOTSUPP
and EINVAL. I will do that later on, I just didn't want to delay the
initial test patch because it seems that kernel developers wanted to
have it ASAP.
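
The errno handling described above could be sketched roughly as follows. This is an assumption about the eventual shape, not code from the patch; the enum and classify_nowait_errno() are hypothetical names. EAGAIN is the path the test wants to hit, EOPNOTSUPP and EINVAL are treated as "not supported here", and anything else is a real failure.

```c
#include <assert.h>
#include <errno.h>

enum nowait_verdict {
	NOWAIT_RETRY,		/* EAGAIN: data not in cache, expected */
	NOWAIT_UNSUPPORTED,	/* kernel or filesystem lacks RWF_NOWAIT */
	NOWAIT_BROKEN,		/* any other error is a genuine failure */
};

/* Hypothetical helper: map the errno of a failed preadv2() to a verdict. */
static enum nowait_verdict classify_nowait_errno(int err)
{
	assert(err > 0);

	switch (err) {
	case EAGAIN:
		return NOWAIT_RETRY;
	case EOPNOTSUPP:
	case EINVAL:
		return NOWAIT_UNSUPPORTED;
	default:
		return NOWAIT_BROKEN;
	}
}
```

In the reader loop, NOWAIT_UNSUPPORTED would presumably map to tst_brk(TCONF, ...) and NOWAIT_BROKEN to the existing tst_brk(TBROK | TTERRNO, ...) call.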
Petr Vorel Feb. 7, 2019, 11:47 a.m. UTC | #6
Hi Cyril,

> > I wonder if we want either warn user or run it both via (g)libc wrapper and
> > directly the syscall.

> I guess that we may drop a hint in the error message.

> > BTW testing on kernel 4.20.0, glibc 2.28 with sys_preadv2() and still get TBROK
> > EOPNOTSUPP. I tried to test on ext[2-4], btrfs. I wonder what I'm missing.

> That's strange, are you sure that the test was calling the wrapper
> instead of the glibc call?
See diff below, but I guess it's what you meant :).

> This should really work unless the support in kernel has been disabled
> in distribution kernel for some reason. I was testing it on vanilla
> kernel where it worked as expected.
What config option is needed?

It does not work on either of my machines (glibc 2.27 and 2.28), but it works
on various VMs with these versions.


Kind regards,
Petr


+++ testcases/kernel/syscalls/preadv2/preadv203.c
@@ -95,6 +95,12 @@ static int verify_short_read(struct iovec *iov, size_t iov_cnt,
 	return 0;
 }
 
+ static ssize_t sys_preadv2(int fd, const struct iovec *iov, int iovcnt,
+                            off_t offset, int flags)
+ {
+ 	return syscall(SYS_preadv2, fd, iov, iovcnt, offset, offset>>32, flags);
+ }
+
 static void *nowait_reader(void *unused LTP_ATTRIBUTE_UNUSED)
 {
 	char buf1[CHUNK_SZ/2];
@@ -115,7 +121,7 @@ static void *nowait_reader(void *unused LTP_ATTRIBUTE_UNUSED)
 		off_t off = random() % ((CHUNKS - 2) * CHUNK_SZ);
 		int fd = fds[random() % FILES];
 
-		TEST(preadv2(fd, rd_iovec, 2, off, RWF_NOWAIT));
+		TEST(sys_preadv2(fd, rd_iovec, 2, off, RWF_NOWAIT));
 
 		if (TST_RET < 0) {
 			if (TST_ERR != EAGAIN)
Naresh Kamboju Feb. 21, 2019, 5:57 p.m. UTC | #7
On Wed, 6 Feb 2019 at 17:40, Cyril Hrubis <chrubis@suse.cz> wrote:
>
> Hi!
> > > +static void *nowait_reader(void *unused LTP_ATTRIBUTE_UNUSED)
> > ...
> > > +           TEST(preadv2(fd, rd_iovec, 2, off, RWF_NOWAIT));
> > RWF_NOWAIT needs to be declared in lapi for old distros.
>
> Right and we should as well return TCONF on older kernels, I haven't
> tested that properly yet but I guess that we should check for EOPNOTSUPP
> and EINVAL. I will do that later on, I just didn't want to delay the
> initial test patch because it seems that kernel developers wanted to
> have it ASAP.


LTP syscalls test preadv201 and preadv201_64 failed on 4.4 branch
kernel on all devices.
They PASS on 4.9, 4.14 and 4.20 kernel versions.

OTOH,
preadv202 and preadv202_64 PASS on 4.4, 4.9, 4.14 and 4.20 kernel versions.

Test output log,
tst_test.c:1085: INFO: Timeout per run is 0h 15m 00s
preadv201.c:91: PASS: preadv2() read 64 bytes with content 'a' expectedly
preadv201.c:91: PASS: preadv2() read 64 bytes with content 'a' expectedly
preadv201.c:91: PASS: preadv2() read 32 bytes with content 'b' expectedly
preadv201.c:64: FAIL: preadv2() failed: EINVAL
preadv201.c:64: FAIL: preadv2() failed: EINVAL
preadv201.c:64: FAIL: preadv2() failed: EINVAL
Summary:
passed   3
failed   3
skipped  0
warnings 0
tst_test.c:1085: INFO: Timeout per run is 0h 15m 00s
preadv201.c:91: PASS: preadv2() read 64 bytes with content 'a' expectedly
preadv201.c:91: PASS: preadv2() read 64 bytes with content 'a' expectedly
preadv201.c:91: PASS: preadv2() read 32 bytes with content 'b' expectedly
preadv201.c:64: FAIL: preadv2() failed: EINVAL
preadv201.c:64: FAIL: preadv2() failed: EINVAL
preadv201.c:64: FAIL: preadv2() failed: EINVAL
Summary:
passed   3
failed   3
skipped  0
warnings 0

Full test log from 4.4  branch kernel x86_64,
https://lkft.validation.linaro.org/scheduler/job/614534#L7922

https://qa-reports.linaro.org/lkft/linux-stable-rc-4.4-oe/tests/ltp-syscalls-tests/preadv201

- Naresh

Patch

diff --git a/runtest/syscalls b/runtest/syscalls
index 34b47f36b..a69c431f1 100644
--- a/runtest/syscalls
+++ b/runtest/syscalls
@@ -853,6 +853,8 @@  preadv201 preadv201
 preadv201_64 preadv201_64
 preadv202 preadv202
 preadv202_64 preadv202_64
+preadv203 preadv203
+preadv203_64 preadv203_64
 
 profil01 profil01
 
diff --git a/testcases/kernel/syscalls/preadv2/.gitignore b/testcases/kernel/syscalls/preadv2/.gitignore
index 759d9ef5b..98b81abea 100644
--- a/testcases/kernel/syscalls/preadv2/.gitignore
+++ b/testcases/kernel/syscalls/preadv2/.gitignore
@@ -2,3 +2,5 @@ 
 /preadv201_64
 /preadv202
 /preadv202_64
+/preadv203
+/preadv203_64
diff --git a/testcases/kernel/syscalls/preadv2/Makefile b/testcases/kernel/syscalls/preadv2/Makefile
index fc1fbf3c7..fbedd0287 100644
--- a/testcases/kernel/syscalls/preadv2/Makefile
+++ b/testcases/kernel/syscalls/preadv2/Makefile
@@ -11,4 +11,8 @@  include $(abs_srcdir)/../utils/newer_64.mk
 
 %_64: CPPFLAGS += -D_FILE_OFFSET_BITS=64
 
+preadv203: CFLAGS += -pthread
+preadv203_64: CFLAGS += -pthread
+preadv203_64: LDFLAGS += -pthread
+
 include $(top_srcdir)/include/mk/generic_leaf_target.mk
diff --git a/testcases/kernel/syscalls/preadv2/preadv203.c b/testcases/kernel/syscalls/preadv2/preadv203.c
new file mode 100644
index 000000000..a6d5300f9
--- /dev/null
+++ b/testcases/kernel/syscalls/preadv2/preadv203.c
@@ -0,0 +1,266 @@ 
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2019 Cyril Hrubis <chrubis@suse.cz>
+ */
+
+/*
+ * This is a basic functional test for RWF_NOWAIT flag, we are attempting to
+ * force preadv2() either to return a short read or EAGAIN with three
+ * concurrently running threads:
+ *
+ *  nowait_reader: reads from a random offset from a random file with
+ *                 RWF_NOWAIT flag and expects to get EAGAIN and short
+ *                 read sooner or later
+ *
+ *  writer_thread: rewrites random file in order to keep the underlying device
+ *                 busy so that pages evicted from cache cannot be faulted
+ *                 immediately
+ *
+ *  cache_dropper: attempts to evict pages from a cache in order for reader to
+ *                 hit evicted page sooner or later
+ */
+
+/*
+ * If test fails with EOPNOTSUPP you have likely hit a glibc bug:
+ *
+ * https://sourceware.org/bugzilla/show_bug.cgi?id=23579
+ *
+ * Which can be worked around by calling preadv2() directly by syscall() such as:
+ *
+ * static ssize_t sys_preadv2(int fd, const struct iovec *iov, int iovcnt,
+ *                            off_t offset, int flags)
+ * {
+ *	return syscall(SYS_preadv2, fd, iov, iovcnt, offset, offset>>32, flags);
+ * }
+ *
+ */
+
+#define _GNU_SOURCE
+#include <string.h>
+#include <sys/uio.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <ctype.h>
+#include <pthread.h>
+
+#include "tst_test.h"
+#include "tst_safe_pthread.h"
+#include "lapi/preadv2.h"
+
+#define CHUNK_SZ 4123
+#define CHUNKS 60
+#define MNTPOINT "mntpoint"
+#define FILES 1000
+
+static int fds[FILES];
+
+static volatile int stop;
+
+static void drop_caches(void)
+{
+	SAFE_FILE_PRINTF("/proc/sys/vm/drop_caches", "3");
+}
+
+/*
+ * All files are divided in chunks each filled with the same bytes starting with
+ * '0' at offset 0 and with increasing value on each next chunk.
+ *
+ * 000....000111....111.......AAA......AAA...
+ * | chunk0 || chunk1 |  ...  |  chunk17 |
+ */
+static int verify_short_read(struct iovec *iov, size_t iov_cnt,
+		             off_t off, size_t size)
+{
+	unsigned int i;
+	size_t j, checked = 0;
+
+	for (i = 0; i < iov_cnt; i++) {
+		char *buf = iov[i].iov_base;
+		for (j = 0; j < iov[i].iov_len; j++) {
+			char exp_val = '0' + (off + checked)/CHUNK_SZ;
+
+			if (exp_val != buf[j]) {
+				tst_res(TFAIL,
+				        "Wrong value read pos %zu size %zu %c (%i) %c (%i)!",
+				        checked, size, exp_val, exp_val,
+					isprint(buf[j]) ? buf[j] : ' ', buf[j]);
+				return 1;
+			}
+
+			if (++checked >= size)
+				return 0;
+		}
+	}
+
+	return 0;
+}
+
+static void *nowait_reader(void *unused LTP_ATTRIBUTE_UNUSED)
+{
+	char buf1[CHUNK_SZ/2];
+	char buf2[CHUNK_SZ];
+	unsigned int full_read_cnt = 0, eagain_cnt = 0;
+	unsigned int short_read_cnt = 0, zero_read_cnt = 0;
+
+	struct iovec rd_iovec[] = {
+		{buf1, sizeof(buf1)},
+		{buf2, sizeof(buf2)},
+	};
+
+	while (!stop) {
+		if (eagain_cnt >= 100 && short_read_cnt >= 10)
+			stop = 1;
+
+		/* Ensure short reads don't happen because of tripping on EOF */
+		off_t off = random() % ((CHUNKS - 2) * CHUNK_SZ);
+		int fd = fds[random() % FILES];
+
+		TEST(preadv2(fd, rd_iovec, 2, off, RWF_NOWAIT));
+
+		if (TST_RET < 0) {
+			if (TST_ERR != EAGAIN)
+				tst_brk(TBROK | TTERRNO, "preadv2() failed");
+
+			eagain_cnt++;
+			continue;
+		}
+
+
+		if (TST_RET == 0) {
+			zero_read_cnt++;
+			continue;
+		}
+
+		if (TST_RET != CHUNK_SZ + CHUNK_SZ/2) {
+			verify_short_read(rd_iovec, 2, off, TST_RET);
+			short_read_cnt++;
+			continue;
+		}
+
+		full_read_cnt++;
+	}
+
+	tst_res(TINFO,
+	        "Number of full_reads %u, short reads %u, zero len reads %u, EAGAIN(s) %u",
+		full_read_cnt, short_read_cnt, zero_read_cnt, eagain_cnt);
+
+	return (void*)(long)eagain_cnt;
+}
+
+static void *writer_thread(void *unused)
+{
+	char buf[CHUNK_SZ];
+	unsigned int j, write_cnt = 0;
+
+	struct iovec wr_iovec[] = {
+		{buf, sizeof(buf)},
+	};
+
+	while (!stop) {
+		int fd = fds[random() % FILES];
+
+		for (j = 0; j < CHUNKS; j++) {
+			memset(buf, '0' + j, sizeof(buf));
+
+			off_t off = CHUNK_SZ * j;
+
+			if (pwritev(fd, wr_iovec, 1, off) < 0) {
+				if (errno == EBADF) {
+					tst_res(TBROK | TERRNO, "FDs closed?");
+					return unused;
+				}
+
+				tst_brk(TBROK | TERRNO, "pwritev()");
+			}
+
+			write_cnt++;
+		}
+	}
+
+	tst_res(TINFO, "Number of writes %u", write_cnt);
+
+	return unused;
+}
+
+static void *cache_dropper(void *unused)
+{
+	unsigned int drop_cnt = 0;
+
+	while (!stop) {
+		drop_caches();
+		drop_cnt++;
+	}
+
+	tst_res(TINFO, "Cache dropped %u times", drop_cnt);
+
+	return unused;
+}
+
+static void verify_preadv2(void)
+{
+	pthread_t reader, dropper, writer;
+	unsigned int max_runtime = 600;
+	void *eagains;
+
+	stop = 0;
+
+	drop_caches();
+
+	SAFE_PTHREAD_CREATE(&dropper, NULL, cache_dropper, NULL);
+	SAFE_PTHREAD_CREATE(&reader, NULL, nowait_reader, NULL);
+	SAFE_PTHREAD_CREATE(&writer, NULL, writer_thread, NULL);
+
+	while (!stop && max_runtime-- > 0)
+		usleep(100000);
+
+	stop = 1;
+
+	SAFE_PTHREAD_JOIN(reader, &eagains);
+	SAFE_PTHREAD_JOIN(dropper, NULL);
+	SAFE_PTHREAD_JOIN(writer, NULL);
+
+	if (eagains)
+		tst_res(TPASS, "Got some EAGAIN");
+	else
+		tst_res(TFAIL, "Haven't got EAGAIN");
+}
+
+static void setup(void)
+{
+	char path[1024];
+	char buf[CHUNK_SZ];
+	unsigned int i;
+	char j;
+
+	for (i = 0; i < FILES; i++) {
+		snprintf(path, sizeof(path), MNTPOINT"/file_%i", i);
+
+		fds[i] = SAFE_OPEN(path, O_RDWR | O_CREAT, 0644);
+
+		for (j = 0; j < CHUNKS; j++) {
+			memset(buf, '0' + j, sizeof(buf));
+			SAFE_WRITE(1, fds[i], buf, sizeof(buf));
+		}
+	}
+}
+
+static void do_cleanup(void)
+{
+	unsigned int i;
+
+	for (i = 0; i < FILES; i++) {
+		if (fds[i] > 0)
+			SAFE_CLOSE(fds[i]);
+	}
+}
+
+TST_DECLARE_ONCE_FN(cleanup, do_cleanup);
+
+static struct tst_test test = {
+	.setup = setup,
+	.cleanup = cleanup,
+	.test_all = verify_preadv2,
+	.mntpoint = MNTPOINT,
+	.all_filesystems = 1,
+	.needs_tmpdir = 1,
+};