Message ID | 1254405917-15796-1-git-send-email-sjayaraman@suse.de |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
On Thu, 1 Oct 2009, Suresh Jayaraman wrote:

> Index: mmotm/mm/page_alloc.c
> ===================================================================
> --- mmotm.orig/mm/page_alloc.c
> +++ mmotm/mm/page_alloc.c
> @@ -1501,8 +1501,10 @@ zonelist_scan:
>  try_this_zone:
>  		page = buffered_rmqueue(preferred_zone, zone, order,
>  						gfp_mask, migratetype);
> -		if (page)
> +		if (page) {
> +			page->reserve = !!(alloc_flags & ALLOC_NO_WATERMARKS);
>  			break;
> +		}
>  this_zone_full:
>  		if (NUMA_BUILD)
>  			zlc_mark_zone_full(zonelist, z);

page->reserve won't necessarily indicate that access to reserves was
_necessary_ for the allocation to succeed, though. This will mark any
page being allocated under PF_MEMALLOC as reserve when all zones may be
well above their min watermarks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday October 1, rientjes@google.com wrote:
> On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
>
> > Index: mmotm/mm/page_alloc.c
> > ===================================================================
> > --- mmotm.orig/mm/page_alloc.c
> > +++ mmotm/mm/page_alloc.c
> > @@ -1501,8 +1501,10 @@ zonelist_scan:
> >  try_this_zone:
> >  		page = buffered_rmqueue(preferred_zone, zone, order,
> >  						gfp_mask, migratetype);
> > -		if (page)
> > +		if (page) {
> > +			page->reserve = !!(alloc_flags & ALLOC_NO_WATERMARKS);
> >  			break;
> > +		}
> >  this_zone_full:
> >  		if (NUMA_BUILD)
> >  			zlc_mark_zone_full(zonelist, z);
>
> page->reserve won't necessarily indicate that access to reserves was
> _necessary_ for the allocation to succeed, though. This will mark any
> page being allocated under PF_MEMALLOC as reserve when all zones may be
> well above their min watermarks.

Normally, if zones are above their watermarks, page->reserve will not
be set.
This is because __alloc_pages_nodemask (which seems to be the main
non-inline entry point) first calls get_page_from_freelist with
alloc_flags set to ALLOC_WMARK_LOW|ALLOC_CPUSET.
Only if this fails does __alloc_pages_nodemask call
__alloc_pages_slowpath, which potentially sets ALLOC_NO_WATERMARKS in
alloc_flags.

So page->reserve being set actually tells us:
   PF_MEMALLOC or GFP_MEMALLOC was used, and
   a WMARK_LOW allocation attempt failed very recently

which is close enough to "the emergency reserves were used", I think.

Thanks,
NeilBrown
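The ordering described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct model_zone, model_get_page(), model_alloc_page() and the MODEL_ALLOC_* bit values are hypothetical names invented here, a single zone stands in for the whole zonelist, and the non-PF_MEMALLOC slow path is reduced to a plain min-watermark check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits for this model only; not the kernel's values. */
enum {
	MODEL_ALLOC_WMARK_MIN     = 1 << 0,
	MODEL_ALLOC_WMARK_LOW     = 1 << 1,
	MODEL_ALLOC_CPUSET        = 1 << 2,
	MODEL_ALLOC_NO_WATERMARKS = 1 << 3,
};

/* Hypothetical one-zone stand-in for the zonelist. */
struct model_zone {
	long free_pages;
	long wmark_min;
	long wmark_low;
};

/*
 * Stand-in for get_page_from_freelist(): check the watermark selected
 * by alloc_flags, then mark the page the way the patch does.
 */
static bool model_get_page(const struct model_zone *z, int alloc_flags,
			   int *reserve)
{
	long mark = 0;

	assert(z != NULL && reserve != NULL);
	if (z->free_pages <= 0)
		return false;
	if (!(alloc_flags & MODEL_ALLOC_NO_WATERMARKS)) {
		mark = (alloc_flags & MODEL_ALLOC_WMARK_LOW) ?
			z->wmark_low : z->wmark_min;
		if (z->free_pages <= mark)
			return false;
	}
	*reserve = !!(alloc_flags & MODEL_ALLOC_NO_WATERMARKS);
	return true;
}

/* Stand-in for the fast path followed by the slow path. */
static bool model_alloc_page(const struct model_zone *z, bool memalloc,
			     int *reserve)
{
	int alloc_flags = MODEL_ALLOC_WMARK_LOW | MODEL_ALLOC_CPUSET;

	/* Fast path: ALLOC_NO_WATERMARKS is never set here. */
	if (model_get_page(z, alloc_flags, reserve))
		return true;

	/* Slow path: only now may the watermarks be ignored. */
	alloc_flags = memalloc ? MODEL_ALLOC_NO_WATERMARKS
			       : MODEL_ALLOC_WMARK_MIN | MODEL_ALLOC_CPUSET;
	return model_get_page(z, alloc_flags, reserve);
}
```

In this model a PF_MEMALLOC-style caller gets reserve == 0 whenever the zone is above its low watermark, and reserve == 1 only after the WMARK_LOW attempt has failed. That includes the case where free pages sit between the min and low watermarks, which is the "close enough" approximation being argued for.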
On Fri, 2 Oct 2009, Neil Brown wrote:

> Normally, if zones are above their watermarks, page->reserve will not
> be set.
> This is because __alloc_pages_nodemask (which seems to be the main
> non-inline entry point) first calls get_page_from_freelist with
> alloc_flags set to ALLOC_WMARK_LOW|ALLOC_CPUSET.
> Only if this fails does __alloc_pages_nodemask call
> __alloc_pages_slowpath, which potentially sets ALLOC_NO_WATERMARKS in
> alloc_flags.
>
> So page->reserve being set actually tells us:
>    PF_MEMALLOC or GFP_MEMALLOC was used, and
>    a WMARK_LOW allocation attempt failed very recently
>
> which is close enough to "the emergency reserves were used", I think.
>

There are a couple of corner cases for GFP_ATOMIC, though:

 - it isn't restricted by cpuset, so ALLOC_CPUSET will never get set for
   the slowpath allocs, which may very well allow the allocation to
   succeed in zones far above their min watermark.

 - it allows for allocating beyond the min watermark in allowed zones
   simply by setting ALLOC_HARDER; these types of "reserve" allocations
   wouldn't be marked as page->reserve with your patches if
   ALLOC_NO_WATERMARKS wasn't set because of the allocation context.

Whether the second one fits your definition of reserve is debatable,
but if it doesn't, there's an inconsistency: the allocation may succeed
in "no watermark" context (for example, in hard irq context) even though
that privilege wasn't necessary to successfully allocate. Perhaps it
only needed ALLOC_HARDER.
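The ALLOC_HARDER corner case can be sketched in the same spirit. The model below is hypothetical throughout: demo_effective_min(), demo_try_zone() and the DEMO_ALLOC_* bit values are invented for illustration, and the halving and quartering of the min watermark is an assumption recalled from zone_watermark_ok() in kernels of that era, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits for this model only; not the kernel's values. */
enum {
	DEMO_ALLOC_HARDER        = 1 << 0,
	DEMO_ALLOC_HIGH          = 1 << 1,
	DEMO_ALLOC_NO_WATERMARKS = 1 << 2,
};

/*
 * Assumed watermark relaxation: ALLOC_HIGH halves the min watermark and
 * ALLOC_HARDER then removes a further quarter of what is left.
 */
static long demo_effective_min(long wmark_min, int alloc_flags)
{
	long min = wmark_min;

	if (alloc_flags & DEMO_ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & DEMO_ALLOC_HARDER)
		min -= min / 4;
	return min;
}

/* One zone attempt, with page->reserve set the way the patch sets it. */
static bool demo_try_zone(long free_pages, long wmark_min, int alloc_flags,
			  int *reserve)
{
	assert(reserve != NULL);
	if (free_pages <= 0)
		return false;
	if (!(alloc_flags & DEMO_ALLOC_NO_WATERMARKS) &&
	    free_pages <= demo_effective_min(wmark_min, alloc_flags))
		return false;
	/* Only the "no watermarks" bit is recorded, not ALLOC_HARDER. */
	*reserve = !!(alloc_flags & DEMO_ALLOC_NO_WATERMARKS);
	return true;
}
```

With wmark_min = 100, an ALLOC_HARDER request succeeds at 80 free pages, below the nominal min watermark, and is left with reserve == 0. A "no watermark" request at 500 free pages is marked reserve == 1 even though it never needed the privilege. Those are the two sides of the inconsistency raised above.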
Index: mmotm/include/linux/mm_types.h
===================================================================
--- mmotm.orig/include/linux/mm_types.h
+++ mmotm/include/linux/mm_types.h
@@ -77,6 +77,7 @@ struct page {
 	union {
 		pgoff_t index;		/* Our offset within mapping. */
 		void *freelist;		/* SLUB: freelist req. slab lock */
+		int reserve;		/* page_alloc: page is a reserve page */
 	};
 	struct list_head lru;		/* Pageout list, eg. active_list
 					 * protected by zone->lru_lock !
Index: mmotm/mm/page_alloc.c
===================================================================
--- mmotm.orig/mm/page_alloc.c
+++ mmotm/mm/page_alloc.c
@@ -1501,8 +1501,10 @@ zonelist_scan:
 try_this_zone:
 		page = buffered_rmqueue(preferred_zone, zone, order,
 						gfp_mask, migratetype);
-		if (page)
+		if (page) {
+			page->reserve = !!(alloc_flags & ALLOC_NO_WATERMARKS);
 			break;
+		}
 this_zone_full:
 		if (NUMA_BUILD)
 			zlc_mark_zone_full(zonelist, z);