Message ID | 1611570288-23040-1-git-send-email-liuxp11@chinatelecom.cn |
---|---|
State | Changes Requested |
Series | [1/2] syscalls/ioctl: ioctl_sg01.c: ioctl_sg01 invoked oom-killer |
Hi Xinpeng,

On Wed, Jan 27, 2021 at 11:28 AM Xinpeng Liu <liuxp11@chinatelecom.cn> wrote:

> Kernel version is 5.4.81+,the available RAM is less than free,as follow:
> [root@liuxp mywork]# head /proc/meminfo
> MemTotal: 198101744 kB
> MemFree: 189303148 kB
> MemAvailable: 188566732 kB
>
> So use available RAM to avoid OOM killer.
>
> Signed-off-by: Xinpeng Liu <liuxp11@chinatelecom.cn>
> ---
>  lib/tst_memutils.c | 29 ++++++++++++++++++++++++++---
>  1 file changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/lib/tst_memutils.c b/lib/tst_memutils.c
> index dd09db4..21df9a8 100644
> --- a/lib/tst_memutils.c
> +++ b/lib/tst_memutils.c
> @@ -10,14 +10,33 @@
>
>  #define TST_NO_DEFAULT_MAIN
>  #include "tst_test.h"
> +#include "tst_safe_stdio.h"
>
>  #define BLOCKSIZE (16 * 1024 * 1024)
>
> +static unsigned long get_available_ram(void)
> +{
> +	char buf[60]; /* actual lines we expect are ~30 chars or less */
> +	unsigned long available_kb = 0;
> +	FILE *fp;
> +
> +	fp = SAFE_FOPEN("/proc/meminfo","r");
> +	while (fgets(buf, sizeof(buf), fp)) {
> +		if (sscanf(buf, "MemAvailable: %lu %*s\n", &available_kb) == 1){
> +			break;
> +		}
> +	}
> +	SAFE_FCLOSE(fp);
> +
> +	return 1024 * available_kb;
> +}
> +
>  void tst_pollute_memory(size_t maxsize, int fillchar)
>  {
>  	size_t i, map_count = 0, safety = 0, blocksize = BLOCKSIZE;
>  	void **map_blocks;
>  	struct sysinfo info;
> +	unsigned long available_ram = get_available_ram();

LTP provides the SAFE_READ_MEMINFO() macro for reading /proc/meminfo.
See: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/swapping/swapping01.c#L85

>
>  	SAFE_SYSINFO(&info);
>  	safety = MAX(4096 * SAFE_SYSCONF(_SC_PAGESIZE), 128 * 1024 * 1024);
> @@ -26,15 +45,19 @@ void tst_pollute_memory(size_t maxsize, int fillchar)
>  	if (info.freeswap > safety)
>  		safety = 0;
>
> +	/*"MemAvailable" field maybe not exist, or freeram less than available_ram*/
> +	if(available_ram == 0 || info.freeram < available_ram)
> +		available_ram = info.freeram;
> +
>  	/* Not enough free memory to avoid invoking OOM killer */
> -	if (info.freeram <= safety)
> +	if (available_ram <= safety)
>  		return;
>
>  	if (!maxsize)
>  		maxsize = SIZE_MAX;
>
> -	if (info.freeram - safety < maxsize / info.mem_unit)
> -		maxsize = (info.freeram - safety) * info.mem_unit;
> +	if (available_ram - safety < maxsize / info.mem_unit)
> +		maxsize = (available_ram - safety) * info.mem_unit;
>
>  	blocksize = MIN(maxsize, blocksize);
>  	map_count = maxsize / blocksize;
> --
> 1.8.3.1
>
> --
> Mailing list info: https://lists.linux.it/listinfo/ltp
OK, thanks for your direction!
Hi Li,

I have a question about using the SAFE_READ_MEMINFO macro to get the MemAvailable value: some old kernels may not provide the "MemAvailable" field, in which case the test will break.
Hi Xinpeng,

I already pointed you to the swapping01 test case, which solves this (via FILE_LINES_SCANF); feel free to use it as a reference:
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/swapping/swapping01.c#L85

On Wed, Jan 27, 2021 at 2:54 PM liuxp11@chinatelecom.cn <liuxp11@chinatelecom.cn> wrote:

> Hi Li,
> Have a question about using macro SAFE_READ_MEMINFO get MemAvailable value,
> Some old kernels maybe not privode "MemAvailable" field, which will broken.
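The difference between a lookup that aborts and one that merely reports a missing field can be sketched in plain C. This is a minimal illustration only: `meminfo_lookup_kb()` is a hypothetical helper, not an LTP API, and it parses a text buffer rather than opening /proc/meminfo so that it can be tried on any input.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of LTP): look up "<key>: <value> kB" in
 * /proc/meminfo-style text. Returns 0 and stores the value on success,
 * 1 if the key is absent, so the caller can fall back instead of aborting.
 */
static int meminfo_lookup_kb(const char *text, const char *key,
			     unsigned long *value_kb)
{
	size_t key_len;
	const char *line = text;

	assert(text && key && value_kb);
	key_len = strlen(key);

	while (line && *line) {
		/* Match the key only at the start of a line, followed by ':' */
		if (!strncmp(line, key, key_len) && line[key_len] == ':' &&
		    sscanf(line + key_len + 1, "%lu", value_kb) == 1)
			return 0;

		/* Advance to the character after the next newline, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return 1;
}
```

A caller on an old kernel would see the return value 1 for "MemAvailable" and could then choose another estimate, rather than having the whole test stop.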
Li Wang <liwang@redhat.com> wrote:

> Hi Xinpeng,
>
> I sent to you the case swapping01 solving this(via FILE_LINES_SCANF)
> already, feel free to take an reference:
>
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/swapping/swapping01.c#L85

Or maybe we can extract this process into a function and put it in tst_memutils.h, so that other test cases can conveniently reuse it too?

void tst_get_MemAvailable(void);
In this test case, we first check MemAvailable. If MemAvailable doesn't exist, we use info.freeram.
Other test cases may not need to do this.
Hi!

> > I sent to you the case swapping01 solving this(via FILE_LINES_SCANF)
> > already, feel free to take an reference:
> >
> > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/swapping/swapping01.c#L85
>
> Or, maybe we can extract this process into a function and put it in
> tst_memutils.h, to convinently reuse by other testcases too?
>
> void tst_get_MemAvailable(void);

Please do not use CamelCase.

It should be tst_get_mem_available(void) and it should also return unsigned long.
Hi!

> MemTotal: 198101744 kB
> MemFree: 189303148 kB
> MemAvailable: 188566732 kB

This sounds really strange; usually MemFree is smaller than MemAvailable, since MemFree does not include page cache.

Is this a freshly booted system?

Can you send the whole meminfo here, please?
> Signed-off-by: Xinpeng Liu <liuxp11@chinatelecom.cn>
> ---
>  lib/tst_memutils.c | 29 ++++++++++++++++++++++++++---
>  1 file changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/lib/tst_memutils.c b/lib/tst_memutils.c
> index dd09db4..21df9a8 100644
> --- a/lib/tst_memutils.c
> +++ b/lib/tst_memutils.c
> @@ -10,14 +10,33 @@
>
>  #define TST_NO_DEFAULT_MAIN
>  #include "tst_test.h"
> +#include "tst_safe_stdio.h"
>
>  #define BLOCKSIZE (16 * 1024 * 1024)
>
> +static unsigned long get_available_ram(void)
> +{

Can we prefix this function with tst_ and make it non-static?

I guess that other tests may use it later on.

> +	char buf[60]; /* actual lines we expect are ~30 chars or less */
> +	unsigned long available_kb = 0;
> +	FILE *fp;
> +
> +	fp = SAFE_FOPEN("/proc/meminfo","r");
> +	while (fgets(buf, sizeof(buf), fp)) {
> +		if (sscanf(buf, "MemAvailable: %lu %*s\n", &available_kb) == 1){
> +			break;
> +		}
> +	}
> +	SAFE_FCLOSE(fp);

Just use FILE_LINES_SCANF() instead.

Also, we should fall back to something like 90% of (MemFree + Cached) here if MemAvailable is not present, so that the function returns a sensible number on older kernels as well.

> +	return 1024 * available_kb;

Can we just return kilobytes instead? It will be less likely to overflow if we do all the calculations in kilobytes.
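The fallback suggested in this review can be sketched as follows. This is an illustration under stated assumptions, not the function that was eventually merged: `struct meminfo_kb` and `mem_available_kb()` are hypothetical names, and the values are assumed to have been parsed from /proc/meminfo already.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for already-parsed /proc/meminfo values, in kB */
struct meminfo_kb {
	bool has_available;      /* false on kernels without MemAvailable */
	unsigned long available; /* MemAvailable, valid if has_available */
	unsigned long free;      /* MemFree */
	unsigned long cached;    /* Cached */
};

/*
 * Return an estimate of available memory in kB. When MemAvailable is
 * missing, fall back to 90% of (MemFree + Cached), as suggested in the
 * review. Everything stays in kB, so nothing is multiplied by 1024.
 */
static unsigned long mem_available_kb(const struct meminfo_kb *mi)
{
	assert(mi);

	if (mi->has_available)
		return mi->available;

	/* Divide before multiplying to keep the intermediate value small */
	return (mi->free + mi->cached) / 10 * 9;
}
```

Staying in kilobytes matters on platforms where unsigned long is 32 bits wide: a value such as 519936744 kB fits, but the same amount expressed in bytes does not.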
On 27. 01. 21 10:39, Cyril Hrubis wrote:
> Hi!
>> MemTotal: 198101744 kB
>> MemFree: 189303148 kB
>> MemAvailable: 188566732 kB
>
> This sounds really strange, usually MemFree is smaller than MemAvailable
> since MemFree does not include page cache.
>
> Is this freshly booted system?
>
> Can you send the whole meminfo here please?

Please also send the contents of /proc/sys/vm/min_free_kbytes. I suspect that that's where the OOM issues are actually coming from.
On Wed, Jan 27, 2021 at 5:23 PM Cyril Hrubis <chrubis@suse.cz> wrote:

> Hi!
> > > I sent to you the case swapping01 solving this(via FILE_LINES_SCANF)
> > > already, feel free to take an reference:
> > >
> > > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/swapping/swapping01.c#L85
> >
> > Or, maybe we can extract this process into a function and put it in
> > tst_memutils.h, to convinently reuse by other testcases too?
> >
> > void tst_get_MemAvailable(void);
>
> Please do not use CamelCase.

+1

Sorry, I just pasted the name from the editor, and YES, we should avoid that.

> It should be tst_get_mem_available(void) and it should also return
> unsigned long.

Absolutely right.

liuxp11@chinatelecom.cn <liuxp11@chinatelecom.cn> wrote:

> In this testcase,we first check MemAvailable. If MemAvailable doesn't
> exist,then use info.freeram.
> Maybe not other cases need do these.

Yes, but we could also make use of tst_get_mem_available() here because, in the patch, you're trying to avoid the (MemFree > MemAvailable) situation. If the fix is correct, we just need to get both and use the smaller one, don't we?

>> Have a question about using macro SAFE_READ_MEMINFO get MemAvailable
>> value,
>> Some old kernels maybe not privode "MemAvailable" field, which will
>> broken.

The main difference with the SAFE_* macros is that they exit with TBROK if the expected field is not found. FILE_LINES_SCANF, which Cyril proposes, has no such concern.
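The "get both and use the smaller one" idea is small enough to show directly. A minimal sketch, assuming both values are in kB and that 0 stands for "MemAvailable was not found"; `usable_kb()` is a hypothetical name used only for illustration.

```c
/*
 * Pick the memory estimate to work with, in kB. A MemAvailable value of 0
 * is treated as "field not present" (assumption for this sketch), in which
 * case MemFree is used. Otherwise the smaller of the two values wins.
 */
static unsigned long usable_kb(unsigned long free_kb,
			       unsigned long available_kb)
{
	if (available_kb == 0 || free_kb < available_kb)
		return free_kb;

	return available_kb;
}
```

With the numbers from the original report (MemFree 189303148 kB, MemAvailable 188566732 kB), this selects MemAvailable.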
[root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/meminfo
MemTotal: 526997420 kB
MemFree: 520224908 kB
MemAvailable: 519936744 kB
Buffers: 0 kB
Cached: 2509036 kB
SwapCached: 0 kB
Active: 906868 kB
Inactive: 2398084 kB
Active(anon): 816396 kB
Inactive(anon): 77236 kB
Active(file): 90472 kB
Inactive(file): 2320848 kB
Unevictable: 610056 kB
Mlocked: 610056 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 1406336 kB
Mapped: 118628 kB
Shmem: 84400 kB
KReclaimable: 193752 kB
Slab: 703668 kB
SReclaimable: 193752 kB
SUnreclaim: 509916 kB
KernelStack: 12672 kB
PageTables: 14056 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 263498708 kB
Committed_AS: 10263760 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 651756 kB
VmallocChunk: 0 kB
Percpu: 62464 kB
HardwareCorrupted: 88 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 1371808 kB
DirectMap2M: 49661952 kB
DirectMap1G: 486539264 kB
[root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/sys/vm/min_free_kbytes
90112
[root@test-env-nm05-compute-14e5e72e38 ~]# w
 18:05:00 up 15 days, 1:06, 3 users, load average: 0.39, 0.42, 0.47
On 27. 01. 21 11:04, liuxp11@chinatelecom.cn wrote:
> [root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/sys/vm/min_free_kbytes
> 90112

Yep, there it is. min_free_kbytes is 90MB but we only leave 64MB safety margin. I'll prepare a patch that'll increase safety margin to 2*min_free_kbytes if needed.
safety = MAX(4096 * SAFE_SYSCONF(_SC_PAGESIZE), 128 * 1024 * 1024);

The safety margin is now 128MB, not 64MB, right?
On 27. 01. 21 12:41, liuxp11@chinatelecom.cn wrote:
> safety = MAX(4096 * SAFE_SYSCONF(_SC_PAGESIZE), 128 * 1024 * 1024);
> now safety margin is 128MB,not 64MB. Right?

Yes, right, sorry.
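The proposed rule (keep the existing 128MB floor, but grow the margin to 2*min_free_kbytes when that is larger) can be sketched like this. It is an illustration of the idea discussed above, not the patch that was later sent; `safety_margin_kb()` and `MIN_SAFETY_KB` are hypothetical names, and all values are kept in kB.

```c
/* 128 MiB floor, matching the constant in the quoted MAX() expression */
#define MIN_SAFETY_KB (128UL * 1024)

/*
 * Compute a safety margin in kB from the page size (bytes) and the value
 * read from /proc/sys/vm/min_free_kbytes. The result is the largest of:
 * 4096 pages, 128 MiB, and twice min_free_kbytes.
 */
static unsigned long safety_margin_kb(unsigned long page_size,
				      unsigned long min_free_kbytes)
{
	/* 4096 pages in kB: 4096 * page_size / 1024 == 4 * page_size */
	unsigned long safety = 4UL * page_size;

	if (safety < MIN_SAFETY_KB)
		safety = MIN_SAFETY_KB;

	if (safety < 2 * min_free_kbytes)
		safety = 2 * min_free_kbytes;

	return safety;
}
```

For the reported system (4 kB pages assumed, min_free_kbytes of 90112) this yields 180224 kB, which is above the 131072 kB floor.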
Using the available memory can avoid the OOM killer.

# man free
available
    Estimation of how much memory is available for starting new applications, without swapping. Unlike the data provided by the cache or free fields, this field takes into account page cache and also that not all reclaimable memory slabs will be reclaimed due to items being in use (MemAvailable in /proc/meminfo, available on kernels 3.14, emulated on kernels 2.6.27+, otherwise the same as free)

[root@bogon ltp]# cat my.diff
commit eb28176a3351c6854620aaa8248bf17edea210ae
Author: Xinpeng Liu <liuxp11@chinatelecom.cn>
Date: Mon Jan 25 20:58:20 2021 +0800

    syscalls/ioctl: ioctl_sg01.c: ioctl_sg01 invoked oom-killer

    Kernel version is 5.4.81+,the available RAM is less than free,as follow:
    [root@liuxp mywork]# head /proc/meminfo
    MemTotal: 198101744 kB
    MemFree: 189303148 kB
    MemAvailable: 188566732 kB

    So use available RAM to avoid OOM killer.

diff --git a/include/tst_memutils.h b/include/tst_memutils.h
index 91dad07..3fd70b2 100644
--- a/include/tst_memutils.h
+++ b/include/tst_memutils.h
@@ -6,6 +6,8 @@
 #ifndef TST_MEMUTILS_H__
 #define TST_MEMUTILS_H__

+unsigned long tst_get_mem_available(void);
+
 /*
  * Fill up to maxsize physical memory with fillchar, then free it for reuse.
  * If maxsize is zero, fill as much memory as possible. This function is
diff --git a/lib/tst_memutils.c b/lib/tst_memutils.c
index dd09db4..9408b37 100644
--- a/lib/tst_memutils.c
+++ b/lib/tst_memutils.c
@@ -13,11 +13,21 @@

 #define BLOCKSIZE (16 * 1024 * 1024)

+unsigned long tst_get_mem_available(void)
+{
+	unsigned long available_kb = 0;
+
+	FILE_LINES_SCANF("/proc/meminfo", "MemAvailable: %lu", &available_kb);
+
+	return available_kb;
+}
+
 void tst_pollute_memory(size_t maxsize, int fillchar)
 {
 	size_t i, map_count = 0, safety = 0, blocksize = BLOCKSIZE;
 	void **map_blocks;
 	struct sysinfo info;
+	unsigned long available_ram;

 	SAFE_SYSINFO(&info);
 	safety = MAX(4096 * SAFE_SYSCONF(_SC_PAGESIZE), 128 * 1024 * 1024);
@@ -26,15 +36,22 @@ void tst_pollute_memory(size_t maxsize, int fillchar)
 	if (info.freeswap > safety)
 		safety = 0;

+	available_ram = 1024 * tst_get_mem_available();
+	available_ram /= info.mem_unit;
+
+	/*"MemAvailable" field maybe not exist, or freeram less than available_ram*/
+	if(available_ram == 0 || info.freeram < available_ram)
+		available_ram = info.freeram;
+
 	/* Not enough free memory to avoid invoking OOM killer */
-	if (info.freeram <= safety)
+	if (available_ram <= safety)
 		return;

 	if (!maxsize)
 		maxsize = SIZE_MAX;

-	if (info.freeram - safety < maxsize / info.mem_unit)
-		maxsize = (info.freeram - safety) * info.mem_unit;
+	if (available_ram - safety < maxsize / info.mem_unit)
+		maxsize = (available_ram - safety) * info.mem_unit;

 	blocksize = MIN(maxsize, blocksize);
 	map_count = maxsize / blocksize;
Hi Xinpeng,

> [root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/meminfo
> MemTotal: 526997420 kB
> MemFree: 520224908 kB
> MemAvailable: 519936744 kB
> Buffers: 0 kB
> Cached: 2509036 kB
> SwapCached: 0 kB
> ...
> SwapTotal: 0 kB
> SwapFree: 0 kB
> ...
> CommitLimit: 263498708 kB
> Committed_AS: 10263760 kB
>
> [root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/sys/vm/min_free_kbytes
> 90112

After looking back on this problem, I prefer to think the reasons were caused by lower CommitLimit.

CommitLimit: 263498708 kB < MemAvailable: 519936744 kB

If you try to enable all swap-disk or reset to a high ratio in overcommit_ratio
to make it larger than MemAvailable, probably no OOM occurs anymore.

Btw, I also observed that ioctl_sg01 almost being killed by OOM
every time on an aarch64 with little swap space, but if I add more
swap or set a high value of overcommit_ratio, the problem is gone.
(I manually tried with another x86_64 to confirm this too)

        total    used    free    shared    buff/cache    available
Mem:    259828   5365    247383  68        7079          231296
Swap:   4095     55      4040
---
MemTotal: 266063872 kB
MemFree: 253320768 kB
MemAvailable: 236848064 kB
Buffers: 1472 kB
Cached: 6755456 kB
SwapCached: 12160 kB
...
CommitLimit: 137226176 kB
Committed_AS: 1206912 kB
---

The previous method in the patch[1] seems not good enough, but that can
help to verify if OOM disappears when resetting the overcommit_ratio.
[1] http://lists.linux.it/pipermail/ltp/2021-February/020907.html

Hence, another improvement way based on the above is to allocate proper
memory-size according to CommitLimit value when detecting the value of
CommitLimit is less than MemAvailable. That will make the test happy with
a little swap-space size system.

Any thoughts, or comments?
--- a/lib/tst_memutils.c
+++ b/lib/tst_memutils.c
@@ -36,6 +36,13 @@ void tst_pollute_memory(size_t maxsize, int fillchar)
 	if (info.freeram - safety < maxsize / info.mem_unit)
 		maxsize = (info.freeram - safety) * info.mem_unit;

+	/*
+	 * To respect CommitLimit to prevent test invoking OOM killer,
+	 * this may appear on system with a smaller swap-disk (or disabled).
+	 */
+	if (SAFE_READ_MEMINFO("CommitLimit:") < SAFE_READ_MEMINFO("MemAvailable:"))
+		maxsize = SAFE_READ_MEMINFO("CommitLimit:") * 1024 - (safety * info.mem_unit);
+
 	blocksize = MIN(maxsize, blocksize);
 	map_count = maxsize / blocksize;
 	map_blocks = SAFE_MALLOC(map_count * sizeof(void *));

========================

About MemAvailable < MemFree: I think that is correct behavior on
your system and not the root cause of the OOM.

Generally, we assume that MemAvailable is higher than MemFree,
but we sometimes also allow situations that break that. We'd better
count all of the different free watermarks from /proc/zoneinfo, then
add the sum of the low watermarks to MemAvailable; if we get a value
larger than MemFree, that should be OK from my perspective.

-----
# echo 675840 > /proc/sys/vm/min_free_kbytes
# cat /proc/meminfo |grep -i mem
MemTotal: 5888584 kB
MemFree: 4518064 kB
MemAvailable: 3692008 kB
Shmem: 21128 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB

# cat /proc/zoneinfo |grep low -B 3
...
  pages free 3840
        min 440
        low 550
--
Node 0, zone DMA32
  pages free 355602
        min 79706
        low 99632
--
Node 0, zone Normal
  pages free 0
        min 0
        low 0
--
Node 0, zone Movable
  pages free 0
        min 0
        low 0
--
Node 0, zone Device
  pages free 0
        min 0
        low 0
--
Node 1, zone DMA
  pages free 0
        min 0
        low 0
--
Node 1, zone DMA32
  pages free 0
        min 0
        low 0
--
      nr_kernel_misc_reclaimable 0
  pages free 769192
        min 88812
        low 111015

(111015 + 99632 + 550) * 4 + 3692008 (MemAvailable) = 4536796 > 4518064 (MemFree)

Btw, the formula to count MemAvailable is:

available = MemFree - totalreserve_pages + pages[LRU_ACTIVE_FILE] + pages[LRU_INACTIVE_FILE] - min(pagecache / 2, wmark_low)
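The watermark check above can be reproduced with a few lines of C. This is only a worked version of the arithmetic in the message: `low_watermarks_kb()` is a hypothetical helper, the per-zone values are the "low" numbers from the quoted /proc/zoneinfo output, and a 4 kB page size is assumed (the factor 4 in the message).

```c
#include <stddef.h>

/*
 * Sum per-zone low watermarks (given in pages) and convert to kB.
 * page_kb is the page size in kB; 4 is assumed for the system above.
 */
static unsigned long low_watermarks_kb(const unsigned long *low_pages,
				       size_t zones, unsigned long page_kb)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < zones; i++)
		pages += low_pages[i];

	return pages * page_kb;
}
```

For the three non-zero zones (550, 99632 and 111015 pages) the sum is 844788 kB; added to MemAvailable (3692008 kB) it gives 4536796 kB, slightly above MemFree (4518064 kB).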
--- a/lib/tst_memutils.c
+++ b/lib/tst_memutils.c
@@ -36,6 +36,13 @@ void tst_pollute_memory(size_t maxsize, int fillchar)
if (info.freeram - safety < maxsize / info.mem_unit)
maxsize = (info.freeram - safety) * info.mem_unit;
==> Thanks, but the original maxsize code needs to be deleted, right?
+ /*
+ * To respect CommitLimit to prevent test invoking OOM killer,
+ * this may appear on system with a smaller swap-disk (or disabled).
+ */
+ if (SAFE_READ_MEMINFO("CommitLimit:") < SAFE_READ_MEMINFO("MemAvailable:"))
+ maxsize = SAFE_READ_MEMINFO("CommitLimit:") * 1024 - (safety * info.mem_unit);
+
blocksize = MIN(maxsize, blocksize);
map_count = maxsize / blocksize;
map_blocks = SAFE_MALLOC(map_count * sizeof(void *));
Thanks!
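One way to look at the question above is that the two limits do not have to replace each other: the allocation can be capped by whichever is smaller. The sketch below shows that alternative in kB; it is an assumption made for illustration, not what either patch in this thread does, and `pollute_limit_kb()` is a hypothetical name.

```c
/*
 * Return how much memory (kB) may be filled. maxsize_kb == 0 means
 * "as much as possible". The cap is the smaller of free memory and
 * CommitLimit, minus the safety margin; 0 is returned when nothing
 * can be filled safely.
 */
static unsigned long pollute_limit_kb(unsigned long maxsize_kb,
				      unsigned long free_kb,
				      unsigned long commit_limit_kb,
				      unsigned long safety_kb)
{
	unsigned long limit = free_kb < commit_limit_kb ?
			      free_kb : commit_limit_kb;

	/* Not enough headroom to avoid invoking the OOM killer */
	if (limit <= safety_kb)
		return 0;

	limit -= safety_kb;

	/* Keep the caller's request if it is already below the cap */
	if (maxsize_kb == 0 || maxsize_kb > limit)
		return limit;

	return maxsize_kb;
}
```

With the values reported earlier (MemFree 520224908 kB, CommitLimit 263498708 kB, 128MB safety margin) and no explicit maxsize, the cap comes from CommitLimit.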
From: Li Wang
Date: 2021-03-04 15:52
To: liuxp11@chinatelecom.cn
CC: Cyril Hrubis; ltp; Martin Doucha
Subject: Re: [LTP] [PATCH 1/2] syscalls/ioctl: ioctl_sg01.c: ioctl_sg01 invoked oom-killer
Hi Xinpeng,
[root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/meminfo
MemTotal: 526997420 kB
MemFree: 520224908 kB
MemAvailable: 519936744 kB
Buffers: 0 kB
Cached: 2509036 kB
SwapCached: 0 kB
...
SwapTotal: 0 kB
SwapFree: 0 kB
...
CommitLimit: 263498708 kB
Committed_AS: 10263760 kB
[root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/sys/vm/min_free_kbytes
90112
After looking back on this problem, I prefer to think the reasons were caused by lower CommitLimit.
CommitLimit: 263498708 kB < MemAvailable: 519936744 kB
If you try to enable all swap-disk or reset to a high ratio in overcommit_ratio
to make it larger than MemAvailable, probably no OOM occurs anymore.
Btw, I also observed that ioctl_sg01 almost being killed by OOM
every time on an aarch64 with little swap space, but if I add more
swap or set a high value of overcommit_ratio, the problem is gone.
(I manually tried with another x86_64 to confirm this too)
total used free shared buff/cache available
Mem: 259828 5365 247383 68 7079 231296
Swap: 4095 55 4040---
MemTotal: 266063872 kB
MemFree: 253320768 kB
MemAvailable: 236848064 kB
Buffers: 1472 kB
Cached: 6755456 kB
SwapCached: 12160 kB
...
CommitLimit: 137226176 kB
Committed_AS: 1206912 kB
---
The previous method in the patch[1] does not seem good enough, but it can
help to verify whether the OOM disappears when overcommit_ratio is reset.
[1] http://lists.linux.it/pipermail/ltp/2021-February/020907.html
Hence, another possible improvement based on the above is to allocate a
memory size derived from CommitLimit whenever CommitLimit is less than
MemAvailable. That would keep the test working on systems with
little swap space.
Any thoughts or comments?
--- a/lib/tst_memutils.c
+++ b/lib/tst_memutils.c
@@ -36,6 +36,13 @@ void tst_pollute_memory(size_t maxsize, int fillchar)
 	if (info.freeram - safety < maxsize / info.mem_unit)
 		maxsize = (info.freeram - safety) * info.mem_unit;
 
+	/*
+	 * To respect CommitLimit to prevent test invoking OOM killer,
+	 * this may appear on system with a smaller swap-disk (or disabled).
+	 */
+	if (SAFE_READ_MEMINFO("CommitLimit:") < SAFE_READ_MEMINFO("MemAvailable:"))
+		maxsize = SAFE_READ_MEMINFO("CommitLimit:") * 1024 - (safety * info.mem_unit);
+
 	blocksize = MIN(maxsize, blocksize);
 	map_count = maxsize / blocksize;
 	map_blocks = SAFE_MALLOC(map_count * sizeof(void *));
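The same idea can be sketched as a pure helper that never raises a smaller caller-supplied maxsize and guards against the subtraction going negative. This is only an illustration, not a proposed LTP change: clamp_to_commit_limit and its parameters are hypothetical names, the two kB arguments stand in for the values SAFE_READ_MEMINFO() would return, and safety_bytes stands in for safety * info.mem_unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of LTP: cap maxsize (bytes) by CommitLimit
 * when CommitLimit is the tighter bound compared to MemAvailable.
 */
size_t clamp_to_commit_limit(size_t maxsize,
			     unsigned long long commit_limit_kb,
			     unsigned long long mem_available_kb,
			     size_t safety_bytes)
{
	unsigned long long limit;

	/* CommitLimit is not below MemAvailable: keep the existing value. */
	if (commit_limit_kb >= mem_available_kb)
		return maxsize;

	limit = commit_limit_kb * 1024ULL;

	/* Nothing can be allocated without eating into the safety margin. */
	if (limit <= safety_bytes)
		return 0;

	limit -= safety_bytes;

	/* Only ever lower maxsize; the cast is safe because limit < maxsize. */
	return limit < maxsize ? (size_t)limit : maxsize;
}
```

For example, clamp_to_commit_limit(SIZE_MAX, 100, 200, 1024) returns 101376 (100 kB minus the 1024-byte margin), while a request of 4096 bytes with the same limits is returned unchanged.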
========================
About MemAvailable < MemFree: I think that is correct behavior on
your system and not the root cause of the OOM.
Generally we assume MemAvailable is higher than MemFree,
but there are situations where that does not hold. A better check is to
take the low watermarks of all zones from /proc/zoneinfo and
add their sum to MemAvailable; if the result is
larger than MemFree, that should be OK from my perspective.
-----
# echo 675840 > /proc/sys/vm/min_free_kbytes
# cat /proc/meminfo |grep -i mem
MemTotal: 5888584 kB
MemFree: 4518064 kB
MemAvailable: 3692008 kB
Shmem: 21128 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
# cat /proc/zoneinfo |grep low -B 3
...
pages free 3840
min 440
low 550
--
Node 0, zone DMA32
pages free 355602
min 79706
low 99632
--
Node 0, zone Normal
pages free 0
min 0
low 0
--
Node 0, zone Movable
pages free 0
min 0
low 0
--
Node 0, zone Device
pages free 0
min 0
low 0
--
Node 1, zone DMA
pages free 0
min 0
low 0
--
Node 1, zone DMA32
pages free 0
min 0
low 0
--
nr_kernel_misc_reclaimable 0
pages free 769192
min 88812
low 111015
(111015+99632+550)*4 + 3692008(MemAvailable) = 4536796 > 4518064(MemFree)
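The check above can be sketched as a small helper. The function name and parameters are illustrative only, and the 4 kB page size is an assumption taken from the "*4" factor in the calculation. The inputs are the three non-zero low watermarks from the zoneinfo listing (550, 99632 and 111015 pages) and MemAvailable from the meminfo listing.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative helper: add the sum of the per-zone low watermarks
 * (given in pages) to MemAvailable (given in kB) and return the total in kB.
 */
unsigned long long available_plus_low_wmark_kb(unsigned long long mem_available_kb,
					       const unsigned long *low_pages,
					       size_t nr_zones,
					       unsigned long page_size_kb)
{
	unsigned long long sum_pages = 0;
	size_t i;

	assert(page_size_kb > 0);

	for (i = 0; i < nr_zones; i++)
		sum_pages += low_pages[i];

	return mem_available_kb + sum_pages * page_size_kb;
}
```

With those inputs the helper returns 4536796 kB, which is above the MemFree value of 4518064 kB shown in the meminfo listing.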
Btw, the formula used to compute MemAvailable is:
available = MemFree - totalreserve_pages + pages[LRU_ACTIVE_FILE] + pages[LRU_INACTIVE_FILE] - min(pagecache / 2, wmark_low)
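A minimal sketch of the formula as quoted, with all quantities in pages. The function and parameter names are illustrative and are not kernel identifiers; pagecache is taken to be the sum of the active and inactive file LRU pages. As far as I know the kernel's own estimate also accounts for reclaimable slab, which the quoted formula leaves out, so treat this as an approximation.

```c
#include <assert.h>

/*
 * Illustrative only: MemAvailable estimate following the formula above.
 * All arguments and the return value are page counts.
 */
long mem_available_pages(long mem_free, long totalreserve,
			 long active_file, long inactive_file,
			 long wmark_low)
{
	long pagecache = active_file + inactive_file;
	long half = pagecache / 2;
	long available;

	assert(wmark_low >= 0);

	available = mem_free - totalreserve;
	/* At most half of the page cache, or wmark_low, is kept as unusable. */
	available += pagecache - (half < wmark_low ? half : wmark_low);

	return available < 0 ? 0 : available;
}
```

Because totalreserve_pages is subtracted and only part of the page cache is added back, the result can drop below MemFree when the watermarks are high and the page cache is small, which matches the min_free_kbytes experiment above.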
Hi Li Wang,
I think your patch is good.
1. CommitLimit is the memory that can be allocated by applications.
2. I ran ioctl_sg01 on several machines with your patch, and the test passed.
Thanks!
From: Li Wang
Date: 2021-03-05 17:02
To: liuxp11@chinatelecom.cn
CC: Cyril Hrubis; ltp; mdoucha
Subject: Re: Re: [LTP] [PATCH 1/2] syscalls/ioctl: ioctl_sg01.c: ioctl_sg01 invoked oom-killer
Hi Xinpeng,
On Fri, Mar 5, 2021 at 1:52 PM liuxp11@chinatelecom.cn <liuxp11@chinatelecom.cn> wrote:
> --- a/lib/tst_memutils.c
> +++ b/lib/tst_memutils.c
> @@ -36,6 +36,13 @@ void tst_pollute_memory(size_t maxsize, int fillchar)
>  	if (info.freeram - safety < maxsize / info.mem_unit)
>  		maxsize = (info.freeram - safety) * info.mem_unit;
>
> ==>Thanks, but the original maxsize code needs to be deleted, right?
No, the original maxsize code is still useful: it defines the value in
the most common situation (i.e. CommitLimit > MemAvailable).
But I'm still hesitant to use CommitLimit as the threshold for 'maxsize',
because according to the Linux documentation it only takes effect
when overcommit_memory is set to 2, while our test systems all use 0
by default.
"This limit is only adhered to if strict overcommit accounting is enabled
(mode 2 in 'vm.overcommit_memory')."
see: https://github.com/torvalds/linux/blob/master/Documentation/filesystems/proc.rst
Using CommitLimit seems a bit strict and strange for this test.
I even think using MemAvailable is acceptable when
MemFree > MemAvailable, just like what you did in your last patch.
I'm still not very sure so far~
(But one thing I can confirm is that MemAvailable < MemFree is sometimes correct behavior.)
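The concern about overcommit modes could be checked at run time before CommitLimit is used as a bound. The sketch below is only an illustration: read_overcommit_mode and commit_limit_is_enforced are hypothetical helpers, not LTP API, and the path is passed in so the caller decides which file to read (normally /proc/sys/vm/overcommit_memory).

```c
#include <assert.h>
#include <stdio.h>

/* Values of vm.overcommit_memory. */
enum {
	OVERCOMMIT_HEURISTIC = 0,
	OVERCOMMIT_ALWAYS = 1,
	OVERCOMMIT_STRICT = 2
};

/* Return the mode stored in the given file, or -1 if it cannot be read. */
int read_overcommit_mode(const char *path)
{
	FILE *fp = fopen(path, "r");
	int mode = -1;

	if (!fp)
		return -1;

	if (fscanf(fp, "%d", &mode) != 1)
		mode = -1;

	fclose(fp);
	return mode;
}

/* CommitLimit is only adhered to with strict accounting (mode 2). */
int commit_limit_is_enforced(int mode)
{
	return mode == OVERCOMMIT_STRICT;
}
```

A caller would pass the result of read_overcommit_mode("/proc/sys/vm/overcommit_memory") to commit_limit_is_enforced() and fall back to the MemAvailable-based bound when it returns 0.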
> +	/*
> +	 * To respect CommitLimit to prevent test invoking OOM killer,
> +	 * this may appear on system with a smaller swap-disk (or disabled).
> +	 */
> +	if (SAFE_READ_MEMINFO("CommitLimit:") < SAFE_READ_MEMINFO("MemAvailable:"))
> +		maxsize = SAFE_READ_MEMINFO("CommitLimit:") * 1024 - (safety * info.mem_unit);
> +
>  	blocksize = MIN(maxsize, blocksize);
>  	map_count = maxsize / blocksize;
>  	map_blocks = SAFE_MALLOC(map_count * sizeof(void *));
Hi all,
It seems this ioctl_sg01 problem is caused by the kernel commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8c7829b04c523cdc732cb77f59f03320e09f3386
I have just started a new thread on Linux-MM to track it:
https://lists.linux.it/pipermail/ltp/2021-April/021903.html
liuxp11@chinatelecom.cn <liuxp11@chinatelecom.cn> wrote:
> Hi Li Wang,
> I think your patch is good.
> 1. CommitLimit is the memory that can be allocated by applications.
> 2. I ran ioctl_sg01 on several machines with your patch, and the test passed.
Thank you Xinpeng,
That is just a compromise workaround, not the key point of the OOM issue,
so I would not suggest merging my patch into LTP :).
diff --git a/lib/tst_memutils.c b/lib/tst_memutils.c
index dd09db4..21df9a8 100644
--- a/lib/tst_memutils.c
+++ b/lib/tst_memutils.c
@@ -10,14 +10,33 @@
 
 #define TST_NO_DEFAULT_MAIN
 #include "tst_test.h"
+#include "tst_safe_stdio.h"
 
 #define BLOCKSIZE (16 * 1024 * 1024)
 
+static unsigned long get_available_ram(void)
+{
+	char buf[60]; /* actual lines we expect are ~30 chars or less */
+	unsigned long available_kb = 0;
+	FILE *fp;
+
+	fp = SAFE_FOPEN("/proc/meminfo","r");
+	while (fgets(buf, sizeof(buf), fp)) {
+		if (sscanf(buf, "MemAvailable: %lu %*s\n", &available_kb) == 1){
+			break;
+		}
+	}
+	SAFE_FCLOSE(fp);
+
+	return 1024 * available_kb;
+}
+
 void tst_pollute_memory(size_t maxsize, int fillchar)
 {
 	size_t i, map_count = 0, safety = 0, blocksize = BLOCKSIZE;
 	void **map_blocks;
 	struct sysinfo info;
+	unsigned long available_ram = get_available_ram();
 
 	SAFE_SYSINFO(&info);
 	safety = MAX(4096 * SAFE_SYSCONF(_SC_PAGESIZE), 128 * 1024 * 1024);
@@ -26,15 +45,19 @@ void tst_pollute_memory(size_t maxsize, int fillchar)
 	if (info.freeswap > safety)
 		safety = 0;
 
+	/*"MemAvailable" field maybe not exist, or freeram less than available_ram*/
+	if(available_ram == 0 || info.freeram < available_ram)
+		available_ram = info.freeram;
+
 	/* Not enough free memory to avoid invoking OOM killer */
-	if (info.freeram <= safety)
+	if (available_ram <= safety)
 		return;
 
 	if (!maxsize)
 		maxsize = SIZE_MAX;
 
-	if (info.freeram - safety < maxsize / info.mem_unit)
-		maxsize = (info.freeram - safety) * info.mem_unit;
+	if (available_ram - safety < maxsize / info.mem_unit)
+		maxsize = (available_ram - safety) * info.mem_unit;
 
 	blocksize = MIN(maxsize, blocksize);
 	map_count = maxsize / blocksize;
Kernel version is 5.4.81+, and the available RAM is less than the free RAM, as follows:
[root@liuxp mywork]# head /proc/meminfo
MemTotal: 198101744 kB
MemFree: 189303148 kB
MemAvailable: 188566732 kB
So use available RAM to avoid OOM killer.
Signed-off-by: Xinpeng Liu <liuxp11@chinatelecom.cn>
---
 lib/tst_memutils.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)