Message ID | 1328521034-17401-2-git-send-email-apw@canonical.com |
---|---|
State | New |
On 06.02.2012 10:37, Andy Whitcroft wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
> PTE pages eat up memory just like anything else, but we do not account for
> them in any way in the OOM scores. They are also _guaranteed_ to get
> freed up when a process is OOM killed, while RSS is not.
>
> Reported-by: Dave Hansen <dave@linux.vnet.ibm.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Acked-by: David Rientjes <rientjes@google.com>
> Cc: <stable@kernel.org> [2.6.36+]
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>
> (cherry picked from commit f755a042d82b51b54f3bdd0890e5ea56c0fb6807)
> CVE-2011-2498
> BugLink: http://bugs.launchpad.net/bugs/922374
> Signed-off-by: Andy Whitcroft <apw@canonical.com>
> ---
>  mm/oom_kill.c |    9 ++++++---
>  1 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index ea16f72..49ea0cc 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -172,10 +172,13 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem,
>
>  	/*
>  	 * The baseline for the badness score is the proportion of RAM that each
> -	 * task's rss and swap space use.
> +	 * task's rss, pagetable and swap space use.
>  	 */
> -	points = (get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS)) * 1000 /
> -			totalpages;
> +	points = get_mm_rss(p->mm) + p->mm->nr_ptes;
> +	points += get_mm_counter(p->mm, MM_SWAPENTS);
> +
> +	points *= 1000;
> +	points /= totalpages;
>  	task_unlock(p);
>
>  	/*

Essentially only adding the pte count and cherry pick anyway...

Acked-by: Stefan Bader <smb@canonical.com>
On 06/02/12 09:37, Andy Whitcroft wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
> PTE pages eat up memory just like anything else, but we do not account for
> them in any way in the OOM scores. They are also _guaranteed_ to get
> freed up when a process is OOM killed, while RSS is not.
[...]
> +	points = get_mm_rss(p->mm) + p->mm->nr_ptes;
> +	points += get_mm_counter(p->mm, MM_SWAPENTS);
[...]

Makes sense to add in the pte count, and this is a cherry-pick, so...

Acked-by: Colin King <colin.king@canonical.com>
Applied. -apw
On Mon, Feb 06, 2012 at 09:37:14AM +0000, Andy Whitcroft wrote:
[...]
> -	points = (get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS)) * 1000 /
> -			totalpages;
> +	points = get_mm_rss(p->mm) + p->mm->nr_ptes;
> +	points += get_mm_counter(p->mm, MM_SWAPENTS);
> +
> +	points *= 1000;
> +	points /= totalpages;

This split-up of the computation introduced a bug on 64-bit arches, which
is fixed by commit ff05b6f. ARM should be unaffected, but natty has this
broken at least on x86_64; oneiric already got the fix through stable.
On 02/06/2012 05:43 AM, Herton Ronaldo Krzesinski wrote:
> On Mon, Feb 06, 2012 at 09:37:14AM +0000, Andy Whitcroft wrote:
[...]
>> +	points *= 1000;
>> +	points /= totalpages;
>
> This split-up of the computation introduced a bug on 64-bit arches, which
> is fixed by commit ff05b6f. ARM should be unaffected, but natty has this
> broken at least on x86_64; oneiric already got the fix through stable.

Good catch. Applied ff05b6f to natty/master-next.

rtg
On Mon, Feb 06, 2012 at 10:43:50AM -0200, Herton Ronaldo Krzesinski wrote:
> On Mon, Feb 06, 2012 at 09:37:14AM +0000, Andy Whitcroft wrote:
[...]
> > +	points *= 1000;
> > +	points /= totalpages;
>
> This split-up of the computation introduced a bug on 64-bit arches, which
> is fixed by commit ff05b6f. ARM should be unaffected, but natty has this
> broken at least on x86_64; oneiric already got the fix through stable.

Well spotted. Thanks.

-apw
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ea16f72..49ea0cc 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -172,10 +172,13 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem,
 
 	/*
 	 * The baseline for the badness score is the proportion of RAM that each
-	 * task's rss and swap space use.
+	 * task's rss, pagetable and swap space use.
 	 */
-	points = (get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS)) * 1000 /
-			totalpages;
+	points = get_mm_rss(p->mm) + p->mm->nr_ptes;
+	points += get_mm_counter(p->mm, MM_SWAPENTS);
+
+	points *= 1000;
+	points /= totalpages;
 	task_unlock(p);
 
 	/*