Message ID: 1308575540-25219-2-git-send-email-mgorman@suse.de
State: RFC, archived
Delegated to: David Miller
On Mon, 20 Jun 2011 14:12:07 +0100 Mel Gorman <mgorman@suse.de> wrote:

> There is a race between the min_free_kbytes sysctl, memory hotplug
> and transparent hugepage support enablement. Memory hotplug uses a
> zonelists_mutex to avoid a race when building zonelists. Reuse it to
> serialise watermark updates.

This patch appears to be a standalone fix, unrelated to the overall
patch series?

How does one trigger the race and what happens when it hits, btw?
On Wed, Jul 06, 2011 at 04:44:47PM -0700, Andrew Morton wrote:
> On Mon, 20 Jun 2011 14:12:07 +0100 Mel Gorman <mgorman@suse.de> wrote:
>
> > There is a race between the min_free_kbytes sysctl, memory hotplug
> > and transparent hugepage support enablement. Memory hotplug uses a
> > zonelists_mutex to avoid a race when building zonelists. Reuse it to
> > serialise watermark updates.
>
> This patch appears to be a standalone fix, unrelated to the overall
> patch series?

Yes. In the original series this would have been a more serious problem
as min_free_kbytes was potentially adjusted more frequently.

> How does one trigger the race and what happens when it hits, btw?

One could trigger the race by having multiple processes on different
CPUs write to min_free_kbytes. One could add memory hotplug events to
that for extra fun but it is unnecessary to trigger the race.

The consequence is that the value of min_free_kbytes and the zone
watermarks get out of sync. Whether the zone watermarks end up too high
or too low depends on the timing. For the most part, the consequence
will simply be that the min free level for some zones is wrong. A more
serious consequence is that totalreserve_pages could get out of sync,
so with strict no-memory-overcommit an mmap could fail when it should
have succeeded for the value of min_free_kbytes, or suspend could fail
because it did not preallocate enough pages. It's not exactly earth
shattering.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4e8985a..a327a72 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5045,14 +5045,7 @@ static void setup_per_zone_lowmem_reserve(void)
 	calculate_totalreserve_pages();
 }
 
-/**
- * setup_per_zone_wmarks - called when min_free_kbytes changes
- * or when memory is hot-{added|removed}
- *
- * Ensures that the watermark[min,low,high] values for each zone are set
- * correctly with respect to min_free_kbytes.
- */
-void setup_per_zone_wmarks(void)
+static void __setup_per_zone_wmarks(void)
 {
 	unsigned long pages_min = min_free_kbytes >> (PAGE_SHIFT - 10);
 	unsigned long lowmem_pages = 0;
@@ -5107,6 +5100,20 @@ void setup_per_zone_wmarks(void)
 	calculate_totalreserve_pages();
 }
 
+/**
+ * setup_per_zone_wmarks - called when min_free_kbytes changes
+ * or when memory is hot-{added|removed}
+ *
+ * Ensures that the watermark[min,low,high] values for each zone are set
+ * correctly with respect to min_free_kbytes.
+ */
+void setup_per_zone_wmarks(void)
+{
+	mutex_lock(&zonelists_mutex);
+	__setup_per_zone_wmarks();
+	mutex_unlock(&zonelists_mutex);
+}
+
 /*
  * The inactive anon list should be small enough that the VM never has to
  * do too much work, but large enough that each inactive page has a chance
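The shape of the refactoring above is the common kernel pattern of a
public self-locking entry point wrapping a double-underscore unlocked
worker. A user-space sketch of that pattern (names here are
illustrative, and the hotplug-style caller is an assumption about why
such a split is useful rather than something shown in this hunk):

```c
#include <assert.h>
#include <pthread.h>

/* Stands in for the kernel's zonelists_mutex. */
static pthread_mutex_t zonelists_mutex = PTHREAD_MUTEX_INITIALIZER;
static int watermarks_set;

/* Unlocked worker: caller must hold zonelists_mutex. */
static void __setup_wmarks(void)
{
	watermarks_set++;	/* stands in for the real watermark maths */
}

/* Public entry point: takes the lock itself, as the patched
 * setup_per_zone_wmarks() does. */
static void setup_wmarks(void)
{
	pthread_mutex_lock(&zonelists_mutex);
	__setup_wmarks();
	pthread_mutex_unlock(&zonelists_mutex);
}

/*
 * A caller that already holds the mutex (e.g. something hotplug-like
 * that rebuilds state under the same lock) must use the unlocked
 * worker directly: calling the public entry point here would deadlock
 * on a non-recursive mutex.
 */
static void hotplug_style_path(void)
{
	pthread_mutex_lock(&zonelists_mutex);
	/* ... rebuild other state under the lock ... */
	__setup_wmarks();
	pthread_mutex_unlock(&zonelists_mutex);
}
```

The split keeps the locking rule simple to audit: the public name is
always safe to call, and any use of the `__`-prefixed worker signals
that the caller is responsible for the lock.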
There is a race between the min_free_kbytes sysctl, memory hotplug
and transparent hugepage support enablement. Memory hotplug uses a
zonelists_mutex to avoid a race when building zonelists. Reuse it to
serialise watermark updates.

[a.p.zijlstra@chello.nl: Older patch fixed the race with spinlock]
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/page_alloc.c |   23 +++++++++++++++--------
 1 files changed, 15 insertions(+), 8 deletions(-)