Message ID | 1305127773-10570-3-git-send-email-mgorman@suse.de |
---|---|
State | Not Applicable, archived |
On Wed, 11 May 2011, Mel Gorman wrote:

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 9f8a97b..057f1e2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1972,6 +1972,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  {
>  	int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
>  	const gfp_t wait = gfp_mask & __GFP_WAIT;
> +	const gfp_t can_wake_kswapd = !(gfp_mask & __GFP_NO_KSWAPD);
>
>  	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
>  	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
> @@ -1984,7 +1985,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  	 */
>  	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
>
> -	if (!wait) {
> +	if (!wait && can_wake_kswapd) {
>  		/*
>  		 * Not worth trying to allocate harder for
>  		 * __GFP_NOMEMALLOC even if it can't schedule.
> diff --git a/mm/slub.c b/mm/slub.c
> index 98c358d..1071723 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1170,7 +1170,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>  	 * Let the initial higher-order allocation fail under memory pressure
>  	 * so we fall-back to the minimum order allocation.
>  	 */
> -	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
> +	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) &
> +			~(__GFP_NOFAIL | __GFP_WAIT);

__GFP_NORETRY is a no-op without __GFP_WAIT.

>
>  	page = alloc_slab_page(alloc_gfp, node, oo);
>  	if (unlikely(!page)) {

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 11, 2011 at 01:38:44PM -0700, David Rientjes wrote:
> On Wed, 11 May 2011, Mel Gorman wrote:
>
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 9f8a97b..057f1e2 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1972,6 +1972,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> >  {
> >  	int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
> >  	const gfp_t wait = gfp_mask & __GFP_WAIT;
> > +	const gfp_t can_wake_kswapd = !(gfp_mask & __GFP_NO_KSWAPD);
> >
> >  	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
> >  	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
> > @@ -1984,7 +1985,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> >  	 */
> >  	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
> >
> > -	if (!wait) {
> > +	if (!wait && can_wake_kswapd) {
> >  		/*
> >  		 * Not worth trying to allocate harder for
> >  		 * __GFP_NOMEMALLOC even if it can't schedule.
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 98c358d..1071723 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -1170,7 +1170,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >  	 * Let the initial higher-order allocation fail under memory pressure
> >  	 * so we fall-back to the minimum order allocation.
> >  	 */
> > -	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
> > +	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) &
> > +			~(__GFP_NOFAIL | __GFP_WAIT);
>
> __GFP_NORETRY is a no-op without __GFP_WAIT.
>

True. I'll remove it in a V2 but I won't respin just yet.

> >
> >  	page = alloc_slab_page(alloc_gfp, node, oo);
> >  	if (unlikely(!page)) {
Hi,

On Wed, May 11, 2011 at 10:10:43PM +0100, Mel Gorman wrote:
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 98c358d..1071723 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -1170,7 +1170,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> > >  	 * Let the initial higher-order allocation fail under memory pressure
> > >  	 * so we fall-back to the minimum order allocation.
> > >  	 */
> > > -	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
> > > +	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) &
> > > +			~(__GFP_NOFAIL | __GFP_WAIT);
> >
> > __GFP_NORETRY is a no-op without __GFP_WAIT.
> >
>
> True. I'll remove it in a V2 but I won't respin just yet.

There is nothing wrong with clearing __GFP_NORETRY too, and it makes no
performance difference. If anything, it doesn't make sense for a caller
to use __GFP_NOFAIL without __GFP_WAIT, so the original version above
looks cleaner.

Overall I like this change: it only polls the buddy allocator, without
spinning kswapd and without invoking lumpy reclaim. As you noted in the
first mail, compaction was disabled, and very bad behavior is expected
without it unless GFP_ATOMIC|__GFP_NO_KSWAPD is set (that is what I had
to use, before disabling lumpy compaction, when first developing THP,
for the same reasons).

With compaction enabled, slub could instead clear only __GFP_NOFAIL and
leave __GFP_WAIT set, and no bad behavior should happen. But that is
probably slower, so I prefer to clear __GFP_WAIT too. For THP,
compaction is worth it because the allocation is generally long lived,
but slub allocations such as a tiny skb can be extremely short lived, so
it is unlikely to be worth it. This way compaction is invoked only by
the later minimal-order allocation, if needed.
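To make the flag arithmetic discussed above concrete, here is a minimal user-space sketch of the patched mask in allocate_slab(). It is not kernel code: the `demo_` names and the bit positions are placeholders invented for illustration (they are not the kernel's real `__GFP_*` values), and the fallback helper assumes the minimum-order retry simply reuses the caller's flags.

```c
#include <assert.h>

typedef unsigned int demo_gfp_t;

/* Placeholder bits for illustration only; NOT the kernel's __GFP_* values. */
#define DEMO_GFP_WAIT      (1u << 0)
#define DEMO_GFP_NOWARN    (1u << 1)
#define DEMO_GFP_NORETRY   (1u << 2)
#define DEMO_GFP_NOFAIL    (1u << 3)
#define DEMO_GFP_NO_KSWAPD (1u << 4)

/*
 * First, higher-order attempt: same shape as the patched expression.
 * The attempt is allowed to fail quietly, must not sleep, and must not
 * wake kswapd.
 */
static inline demo_gfp_t demo_high_order_gfp(demo_gfp_t flags)
{
	return (flags | DEMO_GFP_NOWARN | DEMO_GFP_NORETRY | DEMO_GFP_NO_KSWAPD) &
	       ~(DEMO_GFP_NOFAIL | DEMO_GFP_WAIT);
}

/*
 * Fallback, minimum-order attempt. Assumption for this sketch: the
 * caller's flags are passed through unchanged, so sleeping (and any
 * reclaim or compaction that depends on it) is still possible here.
 */
static inline demo_gfp_t demo_min_order_gfp(demo_gfp_t flags)
{
	return flags;
}

/* Small self-check that walks through one caller's flags. */
static inline void demo_self_check(void)
{
	demo_gfp_t caller = DEMO_GFP_WAIT | DEMO_GFP_NOFAIL;
	demo_gfp_t first = demo_high_order_gfp(caller);

	assert(!(first & DEMO_GFP_WAIT));      /* speculative try cannot sleep */
	assert(!(first & DEMO_GFP_NOFAIL));    /* and is allowed to fail */
	assert(first & DEMO_GFP_NO_KSWAPD);    /* and leaves kswapd alone */
	assert(demo_min_order_gfp(caller) == caller);
}
```

Calling `demo_self_check()` from a small `main()` exercises the case from the thread: a caller that may sleep and must not fail still gets a first attempt that only polls the allocator, while the fallback keeps the caller's original guarantees.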
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9f8a97b..057f1e2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1972,6 +1972,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 {
 	int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
 	const gfp_t wait = gfp_mask & __GFP_WAIT;
+	const gfp_t can_wake_kswapd = !(gfp_mask & __GFP_NO_KSWAPD);
 
 	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
 	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
@@ -1984,7 +1985,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 	 */
 	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
 
-	if (!wait) {
+	if (!wait && can_wake_kswapd) {
 		/*
 		 * Not worth trying to allocate harder for
 		 * __GFP_NOMEMALLOC even if it can't schedule.
diff --git a/mm/slub.c b/mm/slub.c
index 98c358d..1071723 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1170,7 +1170,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 * Let the initial higher-order allocation fail under memory pressure
 	 * so we fall-back to the minimum order allocation.
 	 */
-	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
+	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) &
+			~(__GFP_NOFAIL | __GFP_WAIT);
 
 	page = alloc_slab_page(alloc_gfp, node, oo);
 	if (unlikely(!page)) {
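The two hunks appear to depend on each other: once slub clears __GFP_WAIT, the old `if (!wait)` test in gfp_to_alloc_flags() would presumably treat slub's speculative attempt like any other non-sleeping allocation. The sketch below models only the branch condition before and after the patch, in user space. The `demo_` names and bit values are placeholders (not the kernel's), and what the branch body does is deliberately left out because the diff does not show it in full.

```c
#include <stdbool.h>

typedef unsigned int demo_gfp_t;

/* Placeholder bits for illustration only; NOT the kernel's __GFP_* values. */
#define DEMO_GFP_WAIT      (1u << 0)
#define DEMO_GFP_NO_KSWAPD (1u << 1)

/* Condition before the patch: every non-waiting allocation takes the branch. */
static inline bool demo_branch_before(demo_gfp_t gfp_mask)
{
	const demo_gfp_t wait = gfp_mask & DEMO_GFP_WAIT;

	return !wait;
}

/*
 * Condition after the patch: a non-waiting allocation that also asked
 * not to wake kswapd no longer takes the branch.
 */
static inline bool demo_branch_after(demo_gfp_t gfp_mask)
{
	const demo_gfp_t wait = gfp_mask & DEMO_GFP_WAIT;
	const bool can_wake_kswapd = !(gfp_mask & DEMO_GFP_NO_KSWAPD);

	return !wait && can_wake_kswapd;
}
```

With these helpers, a mask of `DEMO_GFP_NO_KSWAPD` alone (no wait bit, as in slub's first attempt) takes the branch under the old condition but not under the new one, while a mask of `0` (a plain non-sleeping allocation in this model) takes it in both.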