Message ID | 20200130161337.31614-1-mdoucha@suse.cz |
---|---|
State | Superseded |
Series | Taunt OOM killer in fork12 setup() |
On Fri, Jan 31, 2020 at 12:13 AM Martin Doucha <mdoucha@suse.cz> wrote:

> On a system with low memory, fork12 can trigger OOM killer before it hits
> any fork() limits. The OOM killer might accidentally kill e.g. the parent
> shell and external testing tools will assume the test failed.
>
> Set high oom_score_adj on the fork12 process so that the OOM killer focuses
> on it and its children.

It sounds more like the OOM-Killer defect but not fork12. What we do for that
is to protect the parent shell and its harness to avoid oom_kill_process()
acting on them.

On the other side, if we do raise the oom score of fork12, that would not
guarantee OOM-Killer do right evaluation but just makes fork12 easily to be
killed in testing.
----- Original Message -----
>
> On Fri, Jan 31, 2020 at 12:13 AM Martin Doucha < mdoucha@suse.cz > wrote:
>
> On a system with low memory, fork12 can trigger OOM killer before it hits
> any fork() limits. The OOM killer might accidentally kill e.g. the parent
> shell and external testing tools will assume the test failed.
>
> Set high oom_score_adj on the fork12 process so that the OOM killer focuses
> on it and its children.
>
> It sounds more like the OOM-Killer defect but not fork12.

Badness score is based on proportion of rss/swap. It doesn't seem like
defect to me, we just quickly spawn many small tasks.

> What we do for that
> is to protect the parent shell and its harness to avoid oom_kill_process()
> acting on them.
>
> On the other side, if we do raise the oom score of fork12, that would not
> guarantee OOM-Killer do right evaluation but just makes fork12 easily to be
> killed in testing.

fork12 is not an OOM test, so I don't see problem with this. We only need OOM
to kill something we don't care about, in case it triggers.

I'd move oom_score_adj after fork, so only child processes are better target,
not the parent.
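To make the "set it after fork" suggestion concrete, here is a minimal sketch in plain C rather than the LTP helper macros. The helper names (set_oom_score_adj, get_oom_score_adj, spawn_expendable_child) are hypothetical and are not part of fork12.c; the child exits immediately only to keep the sketch short, whereas a real fork12 child would keep running.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define OOM_ADJ_PATH "/proc/self/oom_score_adj"

/* Hypothetical helper: write a new oom_score_adj for the calling process.
 * Returns 0 on success, -1 on any error. */
static int set_oom_score_adj(int value)
{
	FILE *f = fopen(OOM_ADJ_PATH, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d", value) > 0;
	/* The buffered write reaches the kernel at fclose(), so check it too. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Hypothetical helper: read the calling process's oom_score_adj.
 * Returns 0 on success, -1 on any error. */
static int get_oom_score_adj(int *value)
{
	FILE *f = fopen(OOM_ADJ_PATH, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%d", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Fork one child and raise oom_score_adj in the child only, so the parent
 * keeps its original value. Returns the child's exit status (0 if the child
 * managed to raise its score, 1 if not), or -1 if fork/wait failed. */
static int spawn_expendable_child(int adj)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(set_oom_score_adj(adj) == 0 ? 0 : 1);
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_expendable_child(500) and then get_oom_score_adj() in the parent should show the parent's value unchanged, which is the property this variant is after.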
On Fri, Jan 31, 2020 at 5:37 PM Jan Stancek <jstancek@redhat.com> wrote:

> ----- Original Message -----
> >
> > On Fri, Jan 31, 2020 at 12:13 AM Martin Doucha < mdoucha@suse.cz > wrote:
> >
> > On a system with low memory, fork12 can trigger OOM killer before it hits
> > any fork() limits. The OOM killer might accidentally kill e.g. the parent
> > shell and external testing tools will assume the test failed.
> >
> > Set high oom_score_adj on the fork12 process so that the OOM killer focuses
> > on it and its children.
> >
> > It sounds more like the OOM-Killer defect but not fork12.
>
> Badness score is based on proportion of rss/swap. It doesn't seem like
> defect to me, we just quickly spawn many small tasks.

Ok, but killing the parent shell is not what we wanted anyway. Though oom
killer does the right evaluation and gives a high score to the whole test
framework, that will break ltp reporting.

> > What we do for that
> > is to protect the parent shell and its harness to avoid oom_kill_process()
> > acting on them.
> >
> > On the other side, if we do raise the oom score of fork12, that would not
> > guarantee OOM-Killer do right evaluation but just makes fork12 easily to be
> > killed in testing.
>
> fork12 is not an OOM test, so I don't see problem with this. We only need OOM
> to kill something we don't care about, in case it triggers.
>
> I'd move oom_score_adj after fork, so only child processes are better target,
> not the parent.

Exactly, I agree with this. At least only target for the child process.
On 1/31/20 10:37 AM, Jan Stancek wrote:
> ----- Original Message ----
>> It sounds more like the OOM-Killer defect but not fork12.
>
> Badness score is based on proportion of rss/swap. It doesn't seem
> like defect to me, we just quickly spawn many small tasks.

Yes, OOM killer is working as intended here. fork12 is basically a fork
bomb test so it spawns thousands of processes with almost no allocated
memory. Since kernel 2.6.36, OOM killer uses only two criteria to decide
which process to kill:
- how much memory/swap it has allocated
- whether the process is privileged

Since fork12 children have low memory footprint, most system processes
look like better targets for OOM killer right now. But we're not testing
userspace resilience against fork bomb here. We're trying to crash the
kernel itself.

>> What we do for that is to protect the parent shell and its harness
>> to avoid oom_kill_process() acting on them.
>>
>> On the other side, if we do raise the oom score of fork12, that
>> would not guarantee OOM-Killer do right evaluation but just makes
>> fork12 easily to be killed in testing.
>
> fork12 is not an OOM test, so I don't see problem with this. We only
> need OOM to kill something we don't care about, in case it triggers.
>
> I'd move oom_score_adj after fork, so only child processes are better
> target, not the parent.

oom_score_adj is inherited by child processes and OOM killer tries to
kill first-level children if it can. So setting oom_score_adj on the
main fork12 process will work exactly the way we want - OOM killer will
kill one of the child processes, fork12 will notice on line 80 and exit
gracefully.

There could be problems only on kernels older than 2.6.36 where the
number of forked children was included in OOM score calculation and the
main worker process might get targeted directly (not sure if the
kill-children-first approach was used back then).

Either way, trying to protect the parent shell is a bad idea. We'd have
to set negative oom_score_adj on it and if fork12 crashes before it can
reset it back to zero, all further test processes would inherit the OOM
protection.
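The inheritance claim above is easy to check outside of LTP. The following is a rough sketch, not code from fork12.c: the function name child_inherits_oom_score_adj is hypothetical, and it assumes a Linux /proc that exposes /proc/self/oom_score_adj. Note that it leaves the calling process at the raised value, which is exactly the side effect the patch wants for fork12 itself.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define OOM_ADJ_PATH "/proc/self/oom_score_adj"

/* Hypothetical check: set oom_score_adj in the parent, fork, and let the
 * child compare its own value against the one the parent wrote.
 * Returns 1 if the child inherited the value, 0 if it did not,
 * and -1 if anything (open, write, fork, wait) failed. */
static int child_inherits_oom_score_adj(int adj)
{
	int status;
	pid_t pid;
	FILE *f = fopen(OOM_ADJ_PATH, "w");

	if (!f)
		return -1;
	fprintf(f, "%d", adj);
	/* The write is flushed here; a rejected value shows up as an error. */
	if (fclose(f) != 0)
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: /proc/self now refers to the child process. */
		int seen = 0;

		f = fopen(OOM_ADJ_PATH, "r");
		if (!f || fscanf(f, "%d", &seen) != 1)
			_exit(2);
		_exit(seen == adj ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	switch (WEXITSTATUS(status)) {
	case 0:
		return 1;
	case 1:
		return 0;
	default:
		return -1;
	}
}
```

A call such as child_inherits_oom_score_adj(500) mirrors the value used in the patch; raising the score does not need privileges, while lowering it below the process's permitted minimum does.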
On Fri, Jan 31, 2020 at 8:40 PM Martin Doucha <mdoucha@suse.cz> wrote:

> On 1/31/20 10:37 AM, Jan Stancek wrote:
> > ----- Original Message ----
> >> It sounds more like the OOM-Killer defect but not fork12.
> >
> > Badness score is based on proportion of rss/swap. It doesn't seem
> > like defect to me, we just quickly spawn many small tasks.
>
> Yes, OOM killer is working as intended here. fork12 is basically a fork
> bomb test so it spawns thousands of processes with almost no allocated
> memory. Since kernel 2.6.36, OOM killer uses only two criteria to decide
> which process to kill:
> - how much memory/swap it has allocated
> - whether the process is privileged
>
> Since fork12 children have low memory footprint, most system processes
> look like better targets for OOM killer right now. But we're not testing
> userspace resilience against fork bomb here. We're trying to crash the
> kernel itself.

Sounds reasonable to me, thanks!

> >> What we do for that is to protect the parent shell and its harness
> >> to avoid oom_kill_process() acting on them.
> >>
> >> On the other side, if we do raise the oom score of fork12, that
> >> would not guarantee OOM-Killer do right evaluation but just makes
> >> fork12 easily to be killed in testing.
> >
> > fork12 is not an OOM test, so I don't see problem with this. We only
> > need OOM to kill something we don't care about, in case it triggers.
> >
> > I'd move oom_score_adj after fork, so only child processes are better
> > target, not the parent.
>
> oom_score_adj is inherited by child processes and OOM killer tries to
> kill first-level children if it can. So setting oom_score_adj on the
> main fork12 process will work exactly the way we want - OOM killer will
> kill one of the child processes, fork12 will notice on line 80 and exit
> gracefully.

Theoretically yes! Isn't it makes the main fork12 more robust if not set
high score to it?

> There could be problems only on kernels older than 2.6.36 where the
> number of forked children was included in OOM score calculation and the
> main worker process might get targeted directly (not sure if the
> kill-children-first approach was used back then).

It sounds a little tricky. The method I can think of now is to reset the
max process limitation for the lower memory system, to make the test
includes fork bomb but NOT costs too much resource in fork12?

> Either way, trying to protect the parent shell is a bad idea. We'd have
> to set negative oom_score_adj on it and if fork12 crashes before it can
> reset it back to zero, all further test processes would inherit the OOM
> protection.

My bad! The way we have tried is to set a negative score for the test
harness and LTP related process, then reset parent shell score to 0, which
avoids other tests inherit OOM protection from the shell.
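One possible reading of the "reset the max process limitation" idea is to lower the soft RLIMIT_NPROC before the fork loop starts. This is only a sketch of that interpretation: the helper name cap_nproc and the limit value are hypothetical, and nothing in the thread says this is what would actually be implemented. Keep in mind that the kernel does not enforce RLIMIT_NPROC for processes holding CAP_SYS_ADMIN or CAP_SYS_RESOURCE, so a test running as root would not be constrained by it.

```c
#include <sys/resource.h>

/* Hypothetical helper: make sure the soft RLIMIT_NPROC is no higher than
 * max_procs. The limit is only ever lowered, never raised, and the hard
 * limit is left untouched. Returns 0 on success, -1 on error. */
static int cap_nproc(rlim_t max_procs)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NPROC, &rl) != 0)
		return -1;

	/* Already at or below the requested cap: nothing to do. */
	if (rl.rlim_cur != RLIM_INFINITY && rl.rlim_cur <= max_procs)
		return 0;

	/* Lowering the soft limit is always permitted; since the old soft
	 * limit was above max_procs, the new one cannot exceed the hard limit. */
	rl.rlim_cur = max_procs;
	return setrlimit(RLIMIT_NPROC, &rl);
}
```

With such a cap in place, fork() would start failing with EAGAIN once the user's process count reaches the limit, which is the condition fork12 is waiting for, ideally before memory runs out on a small system.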
diff --git a/testcases/kernel/syscalls/fork/fork12.c b/testcases/kernel/syscalls/fork/fork12.c
index 75278b012..99b6900f4 100644
--- a/testcases/kernel/syscalls/fork/fork12.c
+++ b/testcases/kernel/syscalls/fork/fork12.c
@@ -108,6 +108,8 @@ int main(int ac, char **av)
 static void setup(void)
 {
 	tst_sig(FORK, fork12_sigs, cleanup);
+	/* Taunt the OOM killer so that it doesn't kill system processes */
+	SAFE_FILE_PRINTF(cleanup, "/proc/self/oom_score_adj", "500");
 	TEST_PAUSE;
 }
On a system with low memory, fork12 can trigger OOM killer before it hits
any fork() limits. The OOM killer might accidentally kill e.g. the parent
shell and external testing tools will assume the test failed.

Set high oom_score_adj on the fork12 process so that the OOM killer focuses
on it and its children.

Signed-off-by: Martin Doucha <mdoucha@suse.cz>
---
 testcases/kernel/syscalls/fork/fork12.c | 2 ++
 1 file changed, 2 insertions(+)