diff mbox series

[v1] memcontrol03: Using clean page cache to avoid dependency on IO rate

Message ID 20240308023312.25449-1-wegao@suse.com
State Changes Requested
Headers show
Series [v1] memcontrol03: Using clean page cache to avoid dependency on IO rate | expand

Commit Message

Wei Gao March 8, 2024, 2:33 a.m. UTC
Bad IO situation(storage bandwidth ~10MB/sec) will lead background
writeback has uncertain progress for dirty page. So system can not
reclaim enough memory for new process and finally lead a unexpected
OOM.

memcontrol03.c:218: TPASS: Expect: (A/B/E memory.current=0) ~= 0
memcontrol03.c:116: TPASS: Child 1918 killed by OOM
memcontrol03.c:224: TPASS: Expect: (A/B memory.current=52588544) ~= 52428800
memcontrol03.c:129: TFAIL: Expected child 1944 to exit(0), but instead killed by SIGKILL

Signed-off-by: Wei Gao <wegao@suse.com>
---
 testcases/kernel/controllers/memcg/memcontrol03.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Cyril Hrubis March 14, 2024, 11:24 a.m. UTC | #1
Hi!
> Bad IO situation(storage bandwidth ~10MB/sec) will lead background
> writeback has uncertain progress for dirty page. So system can not
> reclaim enough memory for new process and finally lead a unexpected
> OOM.
> 
> memcontrol03.c:218: TPASS: Expect: (A/B/E memory.current=0) ~= 0
> memcontrol03.c:116: TPASS: Child 1918 killed by OOM
> memcontrol03.c:224: TPASS: Expect: (A/B memory.current=52588544) ~= 52428800
> memcontrol03.c:129: TFAIL: Expected child 1944 to exit(0), but instead killed by SIGKILL

This is strange the order of messages here does not make any sense. the
last line in the log shouldn't be there at all, the test should have
ended with the "Expect: (A/B memory.current=52588544) ~= 52428800"

Can you please send the whole log?

> Signed-off-by: Wei Gao <wegao@suse.com>
> ---
>  testcases/kernel/controllers/memcg/memcontrol03.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/testcases/kernel/controllers/memcg/memcontrol03.c b/testcases/kernel/controllers/memcg/memcontrol03.c
> index 9c6c808e0..9903ba82b 100644
> --- a/testcases/kernel/controllers/memcg/memcontrol03.c
> +++ b/testcases/kernel/controllers/memcg/memcontrol03.c
> @@ -201,6 +201,7 @@ static void test_memcg_min(void)
>  		sleep(1);
>  	}
>  
> +	sync();
>  	alloc_anon_in_child(trunk_cg[G], MB(148), 0);
>  
>  	SAFE_CG_SCANF(trunk_cg[B], "memory.current", "%ld", c);
> @@ -217,6 +218,7 @@ static void test_memcg_min(void)
>  	TST_EXP_EXPR(values_close(c[2], 0, 1),
>  		     "(A/B/E memory.current=%ld) ~= 0", c[2]);
>  
> +	sync();
>  	alloc_anon_in_child(trunk_cg[G], MB(170), 1);
>  
>  	SAFE_CG_SCANF(trunk_cg[B], "memory.current", "%ld", c);
Cyril Hrubis March 14, 2024, 3:46 p.m. UTC | #2
Hi!
> Bad IO situation(storage bandwidth ~10MB/sec) will lead background
> writeback has uncertain progress for dirty page. So system can not
> reclaim enough memory for new process and finally lead a unexpected
> OOM.
> 
> memcontrol03.c:218: TPASS: Expect: (A/B/E memory.current=0) ~= 0
> memcontrol03.c:116: TPASS: Child 1918 killed by OOM
> memcontrol03.c:224: TPASS: Expect: (A/B memory.current=52588544) ~= 52428800
> memcontrol03.c:129: TFAIL: Expected child 1944 to exit(0), but instead killed by SIGKILL

So I've read the corresponding bug:

https://bugzilla.suse.com/show_bug.cgi?id=1218178

And now I understand what this is supposed to fix. Hover doing sync() is
a too big hammer, there is no need to sync the whole system, we just
need to make sure that the pages from the file we wrote into are written
to a pernament storage, so we likely just need:

diff --git a/testcases/kernel/controllers/memcg/memcontrol03.c b/testcases/kernel/controllers/memcg/memcontrol03.c
index e927dfd19..6d9c490b5 100644
--- a/testcases/kernel/controllers/memcg/memcontrol03.c
+++ b/testcases/kernel/controllers/memcg/memcontrol03.c
@@ -144,6 +144,7 @@ static void alloc_pagecache_in_child(const struct tst_cg_group *const cg,
        tst_res(TINFO, "Child %d in %s: Allocating pagecache: %"PRIdPTR,
                getpid(), tst_cg_group_name(cg), size);
        alloc_pagecache(fd, size);
+       SAFE_FSYNC(fd);

        TST_CHECKPOINT_WAKE(CHILD_IDLE);
        TST_CHECKPOINT_WAIT(TEST_DONE);
diff mbox series

Patch

diff --git a/testcases/kernel/controllers/memcg/memcontrol03.c b/testcases/kernel/controllers/memcg/memcontrol03.c
index 9c6c808e0..9903ba82b 100644
--- a/testcases/kernel/controllers/memcg/memcontrol03.c
+++ b/testcases/kernel/controllers/memcg/memcontrol03.c
@@ -201,6 +201,7 @@  static void test_memcg_min(void)
 		sleep(1);
 	}
 
+	sync();
 	alloc_anon_in_child(trunk_cg[G], MB(148), 0);
 
 	SAFE_CG_SCANF(trunk_cg[B], "memory.current", "%ld", c);
@@ -217,6 +218,7 @@  static void test_memcg_min(void)
 	TST_EXP_EXPR(values_close(c[2], 0, 1),
 		     "(A/B/E memory.current=%ld) ~= 0", c[2]);
 
+	sync();
 	alloc_anon_in_child(trunk_cg[G], MB(170), 1);
 
 	SAFE_CG_SCANF(trunk_cg[B], "memory.current", "%ld", c);