Message ID | 20110720214714.CFFF8F8AFC@sepang.rtg.net |
---|---|
State | New |
On Wed, Jul 20, 2011 at 03:47:14PM -0600, Tim Gardner wrote:
> From 64ba82a4b63a45fd0c102dd97ea81c59fc522b76 Mon Sep 17 00:00:00 2001
> From: Shaohua Li <shaohua.li@intel.com>
> Date: Tue, 19 Jul 2011 08:49:26 -0700
> Subject: [PATCH] vmscan: fix a livelock in kswapd
>
> BugLink: http://bugs.launchpad.net/bugs/813797
>
> I'm running a workload which triggers a lot of swap in a machine with 4
> nodes. After I kill the workload, I found a kswapd livelock. Sometimes
> kswapd3 or kswapd2 are keeping running and I can't access filesystem,
> but most memory is free.
>
> This looks like a regression since commit 08951e545918c159 ("mm: vmscan:
> correct check for kswapd sleeping in sleeping_prematurely").
>
> Node 2 and 3 have only ZONE_NORMAL, but balance_pgdat() will return 0
> for classzone_idx. The reason is end_zone in balance_pgdat() is 0 by
> default, if all zones have watermark ok, end_zone will keep 0.
>
> Later sleeping_prematurely() always returns true. Because this is an
> order 3 wakeup, and if classzone_idx is 0, both balanced_pages and
> present_pages in pgdat_balanced() are 0. We add a special case here.
> If a zone has no page, we think it's balanced. This fixes the livelock.
>
> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Cc: Minchan Kim <minchan.kim@gmail.com>
> Cc: <stable@kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> (cherry picked from commit 4746efded84d7c5a9c8d64d4c6e814ff0cf9fb42)
>
> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
> ---
>  mm/vmscan.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4b8b37c..1e0eefe 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2245,7 +2245,8 @@ static bool pgdat_balanced(pg_data_t *pgdat, unsigned long balanced_pages,
>  	for (i = 0; i <= classzone_idx; i++)
>  		present_pages += pgdat->node_zones[i].present_pages;
>
> -	return balanced_pages > (present_pages >> 2);
> +	/* A special case here: if zone has no page, we think it's balanced */
> +	return balanced_pages >= (present_pages >> 2);
>  }
>
>  /* is kswapd sleeping prematurely? */

Yeeks that is subtle. Looks to do what it claims.

Acked-by: Andy Whitcroft <apw@canonical.com>

-apw

On 20.07.2011 23:47, Tim Gardner wrote:
> From 64ba82a4b63a45fd0c102dd97ea81c59fc522b76 Mon Sep 17 00:00:00 2001
> From: Shaohua Li <shaohua.li@intel.com>
> Date: Tue, 19 Jul 2011 08:49:26 -0700
> Subject: [PATCH] vmscan: fix a livelock in kswapd
>
> BugLink: http://bugs.launchpad.net/bugs/813797
>
> I'm running a workload which triggers a lot of swap in a machine with 4
> nodes. After I kill the workload, I found a kswapd livelock. Sometimes
> kswapd3 or kswapd2 are keeping running and I can't access filesystem,
> but most memory is free.
>
> This looks like a regression since commit 08951e545918c159 ("mm: vmscan:
> correct check for kswapd sleeping in sleeping_prematurely").
>
> Node 2 and 3 have only ZONE_NORMAL, but balance_pgdat() will return 0
> for classzone_idx. The reason is end_zone in balance_pgdat() is 0 by
> default, if all zones have watermark ok, end_zone will keep 0.
>
> Later sleeping_prematurely() always returns true. Because this is an
> order 3 wakeup, and if classzone_idx is 0, both balanced_pages and
> present_pages in pgdat_balanced() are 0. We add a special case here.
> If a zone has no page, we think it's balanced. This fixes the livelock.
>
> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Cc: Minchan Kim <minchan.kim@gmail.com>
> Cc: <stable@kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> (cherry picked from commit 4746efded84d7c5a9c8d64d4c6e814ff0cf9fb42)
>
> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
> ---
>  mm/vmscan.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4b8b37c..1e0eefe 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2245,7 +2245,8 @@ static bool pgdat_balanced(pg_data_t *pgdat, unsigned long balanced_pages,
>  	for (i = 0; i <= classzone_idx; i++)
>  		present_pages += pgdat->node_zones[i].present_pages;
>
> -	return balanced_pages > (present_pages >> 2);
> +	/* A special case here: if zone has no page, we think it's balanced */
> +	return balanced_pages >= (present_pages >> 2);
>  }
>
>  /* is kswapd sleeping prematurely? */

Acked-by: Stefan Bader <stefan.bader@canonical.com>

On 07/20/2011 02:47 PM, Tim Gardner wrote:
> From 64ba82a4b63a45fd0c102dd97ea81c59fc522b76 Mon Sep 17 00:00:00 2001
> From: Shaohua Li <shaohua.li@intel.com>
> Date: Tue, 19 Jul 2011 08:49:26 -0700
> Subject: [PATCH] vmscan: fix a livelock in kswapd
>
> BugLink: http://bugs.launchpad.net/bugs/813797
>
> I'm running a workload which triggers a lot of swap in a machine with 4
> nodes. After I kill the workload, I found a kswapd livelock. Sometimes
> kswapd3 or kswapd2 are keeping running and I can't access filesystem,
> but most memory is free.
>
> This looks like a regression since commit 08951e545918c159 ("mm: vmscan:
> correct check for kswapd sleeping in sleeping_prematurely").
>
> Node 2 and 3 have only ZONE_NORMAL, but balance_pgdat() will return 0
> for classzone_idx. The reason is end_zone in balance_pgdat() is 0 by
> default, if all zones have watermark ok, end_zone will keep 0.
>
> Later sleeping_prematurely() always returns true. Because this is an
> order 3 wakeup, and if classzone_idx is 0, both balanced_pages and
> present_pages in pgdat_balanced() are 0. We add a special case here.
> If a zone has no page, we think it's balanced. This fixes the livelock.
>
> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Cc: Minchan Kim <minchan.kim@gmail.com>
> Cc: <stable@kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> (cherry picked from commit 4746efded84d7c5a9c8d64d4c6e814ff0cf9fb42)
>
> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
> ---
>  mm/vmscan.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4b8b37c..1e0eefe 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2245,7 +2245,8 @@ static bool pgdat_balanced(pg_data_t *pgdat, unsigned long balanced_pages,
>  	for (i = 0; i <= classzone_idx; i++)
>  		present_pages += pgdat->node_zones[i].present_pages;
>
> -	return balanced_pages > (present_pages >> 2);
> +	/* A special case here: if zone has no page, we think it's balanced */
> +	return balanced_pages >= (present_pages >> 2);
>  }
>
>  /* is kswapd sleeping prematurely? */

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4b8b37c..1e0eefe 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2245,7 +2245,8 @@ static bool pgdat_balanced(pg_data_t *pgdat, unsigned long balanced_pages,
 	for (i = 0; i <= classzone_idx; i++)
 		present_pages += pgdat->node_zones[i].present_pages;
 
-	return balanced_pages > (present_pages >> 2);
+	/* A special case here: if zone has no page, we think it's balanced */
+	return balanced_pages >= (present_pages >> 2);
 }
 
 /* is kswapd sleeping prematurely? */
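
For readers following the one-character change, here is a minimal userspace sketch of just the final comparison in pgdat_balanced(). It is not kernel code: the helper names balanced_before(), balanced_after() and check_comparison_change() are hypothetical, and the page counts are made-up illustration values. It only models the case the commit message describes, where both balanced_pages and present_pages are 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the comparison before the patch: strictly greater. */
static bool balanced_before(unsigned long balanced_pages,
			    unsigned long present_pages)
{
	return balanced_pages > (present_pages >> 2);
}

/* Hypothetical model of the comparison after the patch: greater or equal. */
static bool balanced_after(unsigned long balanced_pages,
			   unsigned long present_pages)
{
	return balanced_pages >= (present_pages >> 2);
}

/* Exercise the cases discussed in the commit message (values are made up). */
static void check_comparison_change(void)
{
	/* No pages counted at all: 0 > 0 is false, 0 >= 0 is true. */
	assert(!balanced_before(0, 0));
	assert(balanced_after(0, 0));

	/* Exactly 25% balanced now also counts as balanced. */
	assert(!balanced_before(250, 1000));
	assert(balanced_after(250, 1000));

	/* Away from the boundary the two versions agree. */
	assert(balanced_before(251, 1000) && balanced_after(251, 1000));
	assert(!balanced_before(249, 1000) && !balanced_after(249, 1000));
}
```

Calling check_comparison_change() from a small main() runs the checks. With the old comparison the all-zero case can never report "balanced", which matches the commit message's description of sleeping_prematurely() always returning true. Note that the change also moves the exact 25% boundary from unbalanced to balanced.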