Patchwork [1/2] mm: vmscan: correctly check if reclaimer should schedule during shrink_slab

Submitter Colin King
Date June 17, 2011, 10:06 a.m.
Message ID <1308305213-4657-2-git-send-email-colin.king@canonical.com>
Download mbox | patch
Permalink /patch/100791/
State New
Headers show

Comments

Colin King - June 17, 2011, 10:06 a.m.
From: Minchan Kim <minchan.kim@gmail.com>

It has been reported on some laptops that kswapd consumes large
amounts of CPU and is not being scheduled away when SLUB is enabled
during heavy file copying.  This is believed to be because kswapd
misses every cond_resched() point:

shrink_page_list() calls cond_resched() if inactive pages were isolated
        which in turn may not happen if all_unreclaimable is set in
        shrink_zones(). If for whatever reason, all_unreclaimable is
        set on all zones, we can miss calling cond_resched().

balance_pgdat() only calls cond_resched() if the zones are not
        balanced. For a high-order allocation that is balanced, it
        checks order-0 again. During that window, order-0 might have
        become unbalanced so it loops again for order-0 and returns
        that it was reclaiming for order-0 to kswapd(). It can then
        find that a caller has re-woken kswapd for a high-order
        allocation and re-enters balance_pgdat() without ever calling
        cond_resched().

shrink_slab() only calls cond_resched() if we are reclaiming slab
        pages. If there are a large number of direct reclaimers, the
        shrinker_rwsem can be contended and prevent kswapd from
        calling cond_resched().

This patch modifies the shrink_slab() case.  If the semaphore is
contended, the caller now still calls cond_resched() before returning.
The cond_resched() after each successful call into a shrinker is
preserved in case one shrinker is particularly slow.
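
For comparison, a condensed sketch of the post-patch control flow,
matching the diff below: both exit paths now funnel through a common
label, so cond_resched() always runs before shrink_slab() returns.

	if (!down_read_trylock(&shrinker_rwsem)) {
		/* Assume we'll be able to shrink next time */
		ret = 1;
		goto out;
	}

	list_for_each_entry(shrinker, &shrinker_list, list) {
		/* ... call into each registered shrinker, with the
		 * existing cond_resched() after each call preserved ... */
	}
	up_read(&shrinker_rwsem);
out:
	/* Reached on both the contended and uncontended paths, so the
	 * reclaimer always gets a chance to be rescheduled here. */
	cond_resched();
	return ret;

Keeping the scheduling point at a single exit label avoids duplicating
the cond_resched() call on every return path.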

[mgorman@suse.de: preserve call to cond_resched after each call into shrinker]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Colin King <colin.king@canonical.com>
Cc: Raghavendra D Prabhu <raghu.prabhu13@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org>		[2.6.38+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 mm/vmscan.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)
Brad Figg - June 17, 2011, 10:15 a.m.
On 06/17/2011 11:06 AM, Colin King wrote:
> [... full patch and diff quoted; trimmed ...]

Acked-by: Brad Figg <brad.figg@canonical.com>
Herton Ronaldo Krzesinski - June 17, 2011, 12:17 p.m.
On Fri, Jun 17, 2011 at 11:06:52AM +0100, Colin King wrote:
> [... full patch and diff quoted; trimmed ...]

Acked-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0665520..648aab8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -230,8 +230,11 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 	if (scanned == 0)
 		scanned = SWAP_CLUSTER_MAX;
 
-	if (!down_read_trylock(&shrinker_rwsem))
-		return 1;	/* Assume we'll be able to shrink next time */
+	if (!down_read_trylock(&shrinker_rwsem)) {
+		/* Assume we'll be able to shrink next time */
+		ret = 1;
+		goto out;
+	}
 
 	list_for_each_entry(shrinker, &shrinker_list, list) {
 		unsigned long long delta;
@@ -282,6 +285,8 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 		shrinker->nr += total_scan;
 	}
 	up_read(&shrinker_rwsem);
+out:
+	cond_resched();
 	return ret;
 }