
madvise06: Raise the bar for judging failure

Message ID 20230218040919.3548296-1-liwang@redhat.com
State Superseded

Commit Message

Li Wang Feb. 18, 2023, 4:09 a.m. UTC
There is an intermittent failure that we have observed many times on both
RHEL and mainline kernels, but we are unable to reproduce it reliably:

    43	madvise06.c:201: TFAIL: less than 102400 Kb were moved to the swap cache
    ...

However, it does not look like a kernel issue, because the change in SwapCached
is not strictly tied to the MADV_WILLNEED advice; it depends on the kernel's
circumstances at the time. The threshold value is debatable: from my point of
view, the 1/4 it uses is not guaranteed to be 100% safe.

MADV_WILLNEED is just advice to the kernel, not a guarantee. The kernel may
choose to ignore the advice, or may prioritize other memory management tasks
over pre-loading the advised pages.

So this patch is aimed at improving the accuracy and clarity of the test results.
Specifically, the use of two separate variables to track the results of different
comparisons will make it easier to understand what the test is doing.

Additionally, reporting "TINFO" instead of "TFAIL" when the swap cache size
is less than expected is intended to indicate that this is an acceptable
outcome.

Finally, the change to the second tst_res call is intended to make the test more
lenient, as it now passes if either no page faults occur or the swap cache size
is larger than expected.

Reported-by: Paul Bunyan <pbunyan@redhat.com>
Signed-off-by: Li Wang <liwang@redhat.com>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
Cc: Eirik Fuller <efuller@redhat.com>
---
 testcases/kernel/syscalls/madvise/madvise06.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
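
The combined pass/fail rule described in the commit message can be sketched in isolation. This is a minimal illustration, not the LTP code: the `verdict` enum and both helper functions are hypothetical names introduced here, standing in for the `TPASS`/`TINFO`/`TFAIL` values passed to `tst_res`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the LTP result codes; not the real tst_res API. */
enum verdict { VERDICT_PASS, VERDICT_INFO, VERDICT_FAIL };

/*
 * First check: did SwapCached grow by more than the threshold?
 * Under the patch, missing the threshold is only informational.
 */
static enum verdict swapcache_verdict(long start_kb, long now_kb,
				      long threshold_kb)
{
	return now_kb > start_kb + threshold_kb ? VERDICT_PASS : VERDICT_INFO;
}

/*
 * Second check: pass if no pages faulted, or if the first check
 * already succeeded. Only "faults and a small swap cache" fails.
 */
static enum verdict fault_verdict(int faults, bool swapcache_ok)
{
	/* Assumes the fault counter only grows, so the delta is never negative. */
	assert(faults >= 0);
	return (faults == 0 || swapcache_ok) ? VERDICT_PASS : VERDICT_FAIL;
}
```

Under this rule the only failing combination is a non-zero fault count together with a swap cache that stayed below the threshold.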

Comments

Richard Palethorpe Feb. 27, 2023, 11:33 a.m. UTC | #1
Hello Li,

Why not skip to making them all TINFO?

It's undefined what action will result from MADV_WILLNEED. If it were
better for performance *not* to read in pages, then it would be valid
for the kernel to ignore it.

Yang Xu added a tag for a perf regression that this test could
reproduce. However, looking at the kernel commit, the regression was first
found by stress-ng.

commit 66383800df9cbdbf3b0c34d5a51bf35bcdb72fd2
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat Nov 21 22:17:22 2020 -0800

    mm: fix madvise WILLNEED performance problem

    The calculation of the end page index was incorrect, leading to a
    regression of 70% when running stress-ng.

    With this fix, we instead see a performance improvement of 3%

I found a bug with this test, but it was causing an Oops. It wouldn't
matter if the test printed pass or fail.

So I think we are wasting our time by constantly tweaking this test.
Li Wang Feb. 28, 2023, 5:45 a.m. UTC | #2
Hi Richard,

On Mon, Feb 27, 2023 at 8:27 PM Richard Palethorpe <rpalethorpe@suse.de>
wrote:

> Why not skip to making them all TINFO?
>
> It's undefined what action will result from MADV_WILLNEED. If it were
> better for performance *not* to read in pages, then it would be valid
> for the kernel to ignore it.
>

Yes, but I didn't do that because the madvise06 test checks the free memory
and free swap sizes at the beginning, which guarantees that the system has at
least 2 * CHUNK_SZ (800MB + 800MB) available for the test. Unless something
else is running in parallel, the kernel will handle the MADV_WILLNEED request
correctly in most scenarios.
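
The sizes mentioned above fit together as simple arithmetic. The sketch below assumes CHUNK_SZ is 400 MB, which is inferred from the "800MB" figure for 2 * CHUNK_SZ and from the 102400 Kb threshold being 1/4 of the chunk. The function name `has_headroom` is hypothetical, and the real test may obtain and compare these values differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, inferred from the thread rather than copied from madvise06.c. */
#define CHUNK_SZ          (400LL * 1024 * 1024)   /* 400 MB, so 2 * CHUNK_SZ = 800 MB */
#define PASS_THRESHOLD_KB (CHUNK_SZ / 4 / 1024)   /* 1/4 of the chunk, in KB */

/* Hypothetical precondition: free RAM and free swap must each hold two chunks. */
static bool has_headroom(long long free_ram_bytes, long long free_swap_bytes)
{
	assert(free_ram_bytes >= 0 && free_swap_bytes >= 0);
	return free_ram_bytes >= 2 * CHUNK_SZ &&
	       free_swap_bytes >= 2 * CHUNK_SZ;
}
```

With these assumed values, PASS_THRESHOLD_KB works out to 102400, which matches the number in the reported TFAIL message.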

And we indeed no longer see unexpected page-fault failures
since commit 00e769e63515e51, so I just combined the
two checks in this patch. I believe that is enough, and it also
gives the kernel some leeway.

I hope we can keep a lenient test for MADV_WILLNEED.
I will definitely take your suggestion if the failure appears again.




Patch

diff --git a/testcases/kernel/syscalls/madvise/madvise06.c b/testcases/kernel/syscalls/madvise/madvise06.c
index c7967ae6f..5bd428bd9 100644
--- a/testcases/kernel/syscalls/madvise/madvise06.c
+++ b/testcases/kernel/syscalls/madvise/madvise06.c
@@ -164,7 +164,7 @@  static int get_page_fault_num(void)
 
 static void test_advice_willneed(void)
 {
-	int loops = 100, res;
+	int loops = 100, res1, res2;
 	char *target;
 	long swapcached_start, swapcached;
 	int page_fault_num_1, page_fault_num_2;
@@ -197,10 +197,10 @@  static void test_advice_willneed(void)
 	} while (swapcached < swapcached_start + PASS_THRESHOLD_KB && loops > 0);
 
 	meminfo_diag("After madvise");
-	res = swapcached > swapcached_start + PASS_THRESHOLD_KB;
-	tst_res(res ? TPASS : TFAIL,
+	res1 = swapcached > swapcached_start + PASS_THRESHOLD_KB;
+	tst_res(res1 ? TPASS : TINFO,
 		"%s than %ld Kb were moved to the swap cache",
-		res ? "more" : "less", PASS_THRESHOLD_KB);
+		res1 ? "more" : "less", PASS_THRESHOLD_KB);
 
 	loops = 100;
 	SAFE_FILE_LINES_SCANF("/proc/meminfo", "SwapCached: %ld", &swapcached_start);
@@ -225,9 +225,9 @@  static void test_advice_willneed(void)
 			page_fault_num_2);
 	meminfo_diag("After page access");
 
-	res = page_fault_num_2 - page_fault_num_1;
-	tst_res(res == 0 ? TPASS : TFAIL,
-		"%d pages were faulted out of 3 max", res);
+	res2 = page_fault_num_2 - page_fault_num_1;
+	tst_res(((res2 == 0) || res1) ? TPASS : TFAIL,
+		"%d pages were faulted out of 3 max", res2);
 
 	SAFE_MUNMAP(target, CHUNK_SZ);
 }