
[PATCH/RFC] syscalls/readahead02: don't use cache size

Message ID 69e171efb14ec9078a4b70838f45ff5a550388d8.1551789098.git.jstancek@redhat.com
State Superseded, archived

Commit Message

Jan Stancek March 5, 2019, 12:34 p.m. UTC
Using the system-wide "Cached" size is not accurate. The test fails
sporadically with a warning on ppc64le 4.18 and 5.0 kernels.

The problem is that the test over-estimates the max readahead size, which
leads to fewer readahead calls, and the kernel can silently trim the length
in each of them:
  ...
  readahead02.c:244: INFO: Test #2: POSIX_FADV_WILLNEED on file
  readahead02.c:134: INFO: creating test file of size: 67108864
  readahead02.c:263: INFO: read_testfile(0)
  readahead02.c:274: INFO: read_testfile(1)
  readahead02.c:189: INFO: max ra estimate: 12320768
  readahead02.c:198: INFO: readahead calls made: 6
  readahead02.c:204: PASS: offset is still at 0 as expected
  readahead02.c:308: INFO: read_testfile(0) took: 492486 usec
  readahead02.c:309: INFO: read_testfile(1) took: 430627 usec
  readahead02.c:311: INFO: read_testfile(0) read: 67108864 bytes
  readahead02.c:313: INFO: read_testfile(1) read: 59244544 bytes
  readahead02.c:316: PASS: readahead saved some I/O
  readahead02.c:324: INFO: cache can hold at least: 264192 kB
  readahead02.c:325: INFO: read_testfile(0) used cache: 124992 kB
  readahead02.c:326: INFO: read_testfile(1) used cache: 12032 kB
  readahead02.c:338: WARN: using less cache than expected

Stop relying on the used cache size and use a minimal sane readahead length
that should work across all systems.

Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 testcases/kernel/syscalls/readahead/readahead02.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)
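
For reference, the numbers in the log above can be reproduced with a small
stand-alone sketch of the test's loop arithmetic. The helper names
(ra_calls, ra_max_populated) and the 2 MiB per-call cap are assumptions
made for illustration; they are not taken from readahead02.c or from any
particular kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Number of iterations of a do/while loop that advances by `step`
 * until the offset reaches `fsize` (same shape as the test's loop). */
static size_t ra_calls(size_t fsize, size_t step)
{
	size_t calls = 0, offset = 0;

	assert(step > 0);
	do {
		calls++;
		offset += step;
	} while (offset < fsize);

	return calls;
}

/* Upper bound on the bytes those calls can populate if every call is
 * trimmed to `kernel_cap` bytes. The cap value is a hypothetical input. */
static size_t ra_max_populated(size_t fsize, size_t step, size_t kernel_cap)
{
	size_t per_call = step < kernel_cap ? step : kernel_cap;
	size_t total = ra_calls(fsize, step) * per_call;

	return total < fsize ? total : fsize;
}
```

With fsize = 67108864 and step = 12320768, ra_calls() gives 6, matching
"readahead calls made: 6". If each call were trimmed to the assumed 2 MiB,
at most 12288 kB would be populated, which is the same order as the
12032 kB the log reports for read_testfile(1).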

Comments

Amir Goldstein March 5, 2019, 1:53 p.m. UTC | #1
On Tue, Mar 5, 2019 at 2:34 PM Jan Stancek <jstancek@redhat.com> wrote:
>
> Using system-wide "Cached" size is not accurate. The test is sporadically
> failing with warning on ppc64le 4.18 and 5.0 kernels.
>
> The problem is that the test over-estimates the max readahead size, which
> leads to fewer readahead calls, and the kernel can silently trim the length
> in each of them:
>   ...
>   readahead02.c:244: INFO: Test #2: POSIX_FADV_WILLNEED on file
>   readahead02.c:134: INFO: creating test file of size: 67108864
>   readahead02.c:263: INFO: read_testfile(0)
>   readahead02.c:274: INFO: read_testfile(1)
>   readahead02.c:189: INFO: max ra estimate: 12320768
>   readahead02.c:198: INFO: readahead calls made: 6
>   readahead02.c:204: PASS: offset is still at 0 as expected
>   readahead02.c:308: INFO: read_testfile(0) took: 492486 usec
>   readahead02.c:309: INFO: read_testfile(1) took: 430627 usec
>   readahead02.c:311: INFO: read_testfile(0) read: 67108864 bytes
>   readahead02.c:313: INFO: read_testfile(1) read: 59244544 bytes
>   readahead02.c:316: PASS: readahead saved some I/O
>   readahead02.c:324: INFO: cache can hold at least: 264192 kB
>   readahead02.c:325: INFO: read_testfile(0) used cache: 124992 kB
>   readahead02.c:326: INFO: read_testfile(1) used cache: 12032 kB
>   readahead02.c:338: WARN: using less cache than expected
>
> Stop relying on used cache size, and use minimal sane readahead length,
> that should work across all systems.

But now, instead of overestimating the readahead length, you definitely
underestimate it, resulting in way too many readahead calls
(readahead every 4K block), which does not validate that the long
readahead code is working at all.

How about this patch instead?
Can you say with sufficient confidence if it solves the sporadic errors?
Also, IMO if you solve the estimation problem, there is no need to
remove the WARNING, it may still be informative.

Thanks,
Amir.

--- a/testcases/kernel/syscalls/readahead/readahead02.c
+++ b/testcases/kernel/syscalls/readahead/readahead02.c
@@ -50,6 +50,8 @@ static int ovl_mounted;
 #define OVL_WORK       MNTPOINT"/work"
 #define OVL_MNT                MNTPOINT"/ovl"
 #define MIN_SANE_READAHEAD (4u * 1024u)
+/* To avoid over estimation of readahead length */
+#define MAX_SANE_READAHEAD (1024u * 1024u)

 static const char mntpoint[] = MNTPOINT;

@@ -193,7 +195,7 @@ static int read_testfile(struct tcase *tc, int do_readahead,
                        }

                        i++;
-                       offset += max_ra_estimate;
+                       offset += MIN(max_ra_estimate, MAX_SANE_READAHEAD);
                } while ((size_t)offset < fsize);
                tst_res(TINFO, "readahead calls made: %zu", i);
                *cached = get_cached_size();
Jan Stancek March 5, 2019, 3:17 p.m. UTC | #2
----- Original Message -----
> On Tue, Mar 5, 2019 at 2:34 PM Jan Stancek <jstancek@redhat.com> wrote:
> >
> > Using system-wide "Cached" size is not accurate. The test is sporadically
> > failing with warning on ppc64le 4.18 and 5.0 kernels.
> >
> > The problem is that the test over-estimates the max readahead size, which
> > leads to fewer readahead calls, and the kernel can silently trim the length
> > in each of them:
> >   ...
> >   readahead02.c:244: INFO: Test #2: POSIX_FADV_WILLNEED on file
> >   readahead02.c:134: INFO: creating test file of size: 67108864
> >   readahead02.c:263: INFO: read_testfile(0)
> >   readahead02.c:274: INFO: read_testfile(1)
> >   readahead02.c:189: INFO: max ra estimate: 12320768
> >   readahead02.c:198: INFO: readahead calls made: 6
> >   readahead02.c:204: PASS: offset is still at 0 as expected
> >   readahead02.c:308: INFO: read_testfile(0) took: 492486 usec
> >   readahead02.c:309: INFO: read_testfile(1) took: 430627 usec
> >   readahead02.c:311: INFO: read_testfile(0) read: 67108864 bytes
> >   readahead02.c:313: INFO: read_testfile(1) read: 59244544 bytes
> >   readahead02.c:316: PASS: readahead saved some I/O
> >   readahead02.c:324: INFO: cache can hold at least: 264192 kB
> >   readahead02.c:325: INFO: read_testfile(0) used cache: 124992 kB
> >   readahead02.c:326: INFO: read_testfile(1) used cache: 12032 kB
> >   readahead02.c:338: WARN: using less cache than expected
> >
> > Stop relying on used cache size, and use minimal sane readahead length,
> > that should work across all systems.
> 
> But now, instead of overestimating the readahead length, you definitely
> underestimate it, resulting in way too many readahead calls
> (readahead every 4K block), which does not validate that the long
> readahead code is working at all.

We could try to cap on backing device read_ahead_kb.
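
A rough sketch of what such a cap could look like, assuming the per-device
value is read from /sys/class/bdi/<major>:<minor>/read_ahead_kb. The helper
names are hypothetical and not part of LTP, and the lookup may not resolve
for stacked or virtual filesystems (overlayfs, for example), so a fallback
to the uncapped estimate is kept:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysmacros.h>

/* Build the bdi sysfs path for a device number (e.g. st_dev from stat()). */
static int ra_sysfs_path(dev_t dev, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "/sys/class/bdi/%u:%u/read_ahead_kb",
		     (unsigned)major(dev), (unsigned)minor(dev));

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Read a read_ahead_kb style file; return bytes, or -1 on any failure. */
static long read_ahead_bytes(const char *path)
{
	FILE *f = fopen(path, "r");
	long kb;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &kb) != 1)
		kb = -1;
	fclose(f);

	return kb < 0 ? -1 : kb * 1024;
}

/* Cap the loop step at the device value; keep the estimate if unknown. */
static size_t cap_step(size_t estimate, long dev_ra_bytes)
{
	if (dev_ra_bytes <= 0)
		return estimate;

	return estimate < (size_t)dev_ra_bytes ? estimate : (size_t)dev_ra_bytes;
}
```

With a device reporting 128 kB, the 12320768-byte estimate from the log
would be capped to 131072 bytes per step.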

> 
> How about this patch instead.
> Can you say with sufficient confidence if it solves the sporadic errors?

It likely won't. We started at 2M a couple of years back, but there
have been several commits that kept changing the limit: [1][2]

[1] 600e19afc5f8a6c18ea49cee9511c5797db02391
[2] https://lkml.org/lkml/2016/7/25/308

> Also, IMO if you solve the estimation problem, there is no need to
> remove the WARNING, it may still be informative.
> 
> Thanks,
> Amir.
> 
> --- a/testcases/kernel/syscalls/readahead/readahead02.c
> +++ b/testcases/kernel/syscalls/readahead/readahead02.c
> @@ -50,6 +50,8 @@ static int ovl_mounted;
>  #define OVL_WORK       MNTPOINT"/work"
>  #define OVL_MNT                MNTPOINT"/ovl"
>  #define MIN_SANE_READAHEAD (4u * 1024u)
> +/* To avoid over estimation of readahead length */
> +#define MAX_SANE_READAHEAD (1024u * 1024u)
> 
>  static const char mntpoint[] = MNTPOINT;
> 
> @@ -193,7 +195,7 @@ static int read_testfile(struct tcase *tc, int
> do_readahead,
>                         }
> 
>                         i++;
> -                       offset += max_ra_estimate;
> +                       offset += MIN(max_ra_estimate, MAX_SANE_READAHEAD);
>                 } while ((size_t)offset < fsize);
>                 tst_res(TINFO, "readahead calls made: %zu", i);
>                 *cached = get_cached_size();
>

Amir Goldstein March 5, 2019, 3:33 p.m. UTC | #3
On Tue, Mar 5, 2019 at 5:17 PM Jan Stancek <jstancek@redhat.com> wrote:
>
>
> ----- Original Message -----
> > On Tue, Mar 5, 2019 at 2:34 PM Jan Stancek <jstancek@redhat.com> wrote:
> > >
> > > Using system-wide "Cached" size is not accurate. The test is sporadically
> > > failing with warning on ppc64le 4.18 and 5.0 kernels.
> > >
> > > The problem is that the test over-estimates the max readahead size, which
> > > leads to fewer readahead calls, and the kernel can silently trim the length
> > > in each of them:
> > >   ...
> > >   readahead02.c:244: INFO: Test #2: POSIX_FADV_WILLNEED on file
> > >   readahead02.c:134: INFO: creating test file of size: 67108864
> > >   readahead02.c:263: INFO: read_testfile(0)
> > >   readahead02.c:274: INFO: read_testfile(1)
> > >   readahead02.c:189: INFO: max ra estimate: 12320768
> > >   readahead02.c:198: INFO: readahead calls made: 6
> > >   readahead02.c:204: PASS: offset is still at 0 as expected
> > >   readahead02.c:308: INFO: read_testfile(0) took: 492486 usec
> > >   readahead02.c:309: INFO: read_testfile(1) took: 430627 usec
> > >   readahead02.c:311: INFO: read_testfile(0) read: 67108864 bytes
> > >   readahead02.c:313: INFO: read_testfile(1) read: 59244544 bytes
> > >   readahead02.c:316: PASS: readahead saved some I/O
> > >   readahead02.c:324: INFO: cache can hold at least: 264192 kB
> > >   readahead02.c:325: INFO: read_testfile(0) used cache: 124992 kB
> > >   readahead02.c:326: INFO: read_testfile(1) used cache: 12032 kB
> > >   readahead02.c:338: WARN: using less cache than expected
> > >
> > > Stop relying on used cache size, and use minimal sane readahead length,
> > > that should work across all systems.
> >
> > But now, instead of overestimating the readahead length, you definitely
> > underestimate it, resulting in way too many readahead calls
> > (readahead every 4K block), which does not validate that the long
> > readahead code is working at all.
>
> We could try to cap on backing device read_ahead_kb.

Makes sense. It should make the test more deterministic.

>
> >
> > How about this patch instead.
> > Can you say with sufficient confidence if it solves the sporadic errors?
>
> It likely won't. We started at 2M a couple of years back, but there
> have been several commits that kept changing the limit: [1][2]
>
> [1] 600e19afc5f8a6c18ea49cee9511c5797db02391
> [2] https://lkml.org/lkml/2016/7/25/308
>

Well, perhaps the name I chose for the constant is wrong.
I did not mean that a readahead configuration > 1MB is not sane.
I meant we could call the readahead syscall at most every 1MB,
to mitigate overestimation of the loop step.

So maybe set the readahead (setra) to 1MB on the test device and call
readahead in 1MB steps without estimation?

Thanks,
Amir.

Patch

diff --git a/testcases/kernel/syscalls/readahead/readahead02.c b/testcases/kernel/syscalls/readahead/readahead02.c
index 293c839e169e..2f4d7f05a550 100644
--- a/testcases/kernel/syscalls/readahead/readahead02.c
+++ b/testcases/kernel/syscalls/readahead/readahead02.c
@@ -165,13 +165,11 @@  static int read_testfile(struct tcase *tc, int do_readahead,
 	size_t i = 0;
 	long read_bytes_start;
 	unsigned char *p, tmp;
-	unsigned long cached_start, max_ra_estimate = 0;
 	off_t offset = 0;
 
 	fd = SAFE_OPEN(fname, O_RDONLY);
 
 	if (do_readahead) {
-		cached_start = get_cached_size();
 		do {
 			TEST(tc->readahead(fd, offset, fsize - offset));
 			if (TST_RET != 0) {
@@ -179,21 +177,8 @@  static int read_testfile(struct tcase *tc, int do_readahead,
 				return TST_ERR;
 			}
 
-			/* estimate max readahead size based on first call */
-			if (!max_ra_estimate) {
-				*cached = get_cached_size();
-				if (*cached > cached_start) {
-					max_ra_estimate = (1024 *
-						(*cached - cached_start));
-					tst_res(TINFO, "max ra estimate: %lu",
-						max_ra_estimate);
-				}
-				max_ra_estimate = MAX(max_ra_estimate,
-					MIN_SANE_READAHEAD);
-			}
-
 			i++;
-			offset += max_ra_estimate;
+			offset += MIN_SANE_READAHEAD;
 		} while ((size_t)offset < fsize);
 		tst_res(TINFO, "readahead calls made: %zu", i);
 		*cached = get_cached_size();
@@ -334,8 +319,6 @@  static void test_readahead(unsigned int n)
 			tst_res(TPASS, "using cache as expected");
 		else if (!cached_ra)
 			tst_res(TFAIL, "readahead failed to use any cache");
-		else
-			tst_res(TWARN, "using less cache than expected");
 	} else {
 		tst_res(TCONF, "Page cache on your system is too small "
 			"to hold whole testfile.");