Patchwork [quantal,sru,v2] module: do percpu allocation after uniqueness check. No, really!

Submitter Chris J Arges
Date March 28, 2014, 3:47 p.m.
Message ID <1396021646-8212-2-git-send-email-chris.j.arges@canonical.com>
Permalink /patch/334746/
State New

Comments

Chris J Arges - March 28, 2014, 3:47 p.m.
From: Rusty Russell <rusty@rustcorp.com.au>

BugLink: http://bugs.launchpad.net/bugs/1088433

v3.8-rc1-5-g1fb9341 was supposed to stop parallel kvm loads exhausting
percpu memory on large machines:

    Now we have a new state MODULE_STATE_UNFORMED, we can insert the
    module into the list (and thus guarantee its uniqueness) before we
    allocate the per-cpu region.

In my defence, it didn't actually say the patch did this.  Just that
we "can".

This patch actually *does* it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Jim Hull <jim.hull@hp.com>
Cc: stable@kernel.org # 3.8
(backported from commit 8d8022e8aba85192e937f1f0f7450e256d66ae5c)

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
---
 kernel/module.c |   34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)
Seth Forshee - March 28, 2014, 6:18 p.m.
On Fri, Mar 28, 2014 at 10:47:26AM -0500, Chris J Arges wrote:
> From: Rusty Russell <rusty@rustcorp.com.au>
> 
> BugLink: http://bugs.launchpad.net/bugs/1088433
> 
> v3.8-rc1-5-g1fb9341 was supposed to stop parallel kvm loads exhausting
> percpu memory on large machines:
> 
>     Now we have a new state MODULE_STATE_UNFORMED, we can insert the
>     module into the list (and thus guarantee its uniqueness) before we
>     allocate the per-cpu region.
> 
> In my defence, it didn't actually say the patch did this.  Just that
> we "can".
> 
> This patch actually *does* it.
> 
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Tested-by: Jim Hull <jim.hull@hp.com>
> Cc: stable@kernel.org # 3.8
> (backported from commit 8d8022e8aba85192e937f1f0f7450e256d66ae5c)
> 
> Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>

The backport looks good to me.

Acked-by: Seth Forshee <seth.forshee@canonical.com>

Patch

diff --git a/kernel/module.c b/kernel/module.c
index 2ad732d1..0962e1e 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2810,7 +2810,6 @@  static struct module *layout_and_allocate(struct load_info *info)
 {
 	/* Module within temporary copy. */
 	struct module *mod;
-	Elf_Shdr *pcpusec;
 	int err;
 
 	mod = setup_load_info(info);
@@ -2825,17 +2824,10 @@  static struct module *layout_and_allocate(struct load_info *info)
 	err = module_frob_arch_sections(info->hdr, info->sechdrs,
 					info->secstrings, mod);
 	if (err < 0)
-		goto out;
+		return ERR_PTR(err);
 
-	pcpusec = &info->sechdrs[info->index.pcpu];
-	if (pcpusec->sh_size) {
-		/* We have a special allocation for this section. */
-		err = percpu_modalloc(mod,
-				      pcpusec->sh_size, pcpusec->sh_addralign);
-		if (err)
-			goto out;
-		pcpusec->sh_flags &= ~(unsigned long)SHF_ALLOC;
-	}
+	/* We will do a special allocation for per-cpu sections later. */
+	info->sechdrs[info->index.pcpu].sh_flags &= ~(unsigned long)SHF_ALLOC;
 
 	/* Determine total sizes, and put offsets in sh_entsize.  For now
 	   this is done generically; there doesn't appear to be any
@@ -2846,17 +2838,22 @@  static struct module *layout_and_allocate(struct load_info *info)
 	/* Allocate and move to the final place */
 	err = move_module(mod, info);
 	if (err)
-		goto free_percpu;
+		return ERR_PTR(err);
 
 	/* Module has been copied to its final place now: return it. */
 	mod = (void *)info->sechdrs[info->index.mod].sh_addr;
 	kmemleak_load_module(mod, info);
 	return mod;
+}
 
-free_percpu:
-	percpu_modfree(mod);
-out:
-	return ERR_PTR(err);
+static int alloc_module_percpu(struct module *mod, struct load_info *info)
+{
+	Elf_Shdr *pcpusec = &info->sechdrs[info->index.pcpu];
+	if (!pcpusec->sh_size)
+		return 0;
+
+	/* We have a special allocation for this section. */
+	return percpu_modalloc(mod, pcpusec->sh_size, pcpusec->sh_addralign);
 }
 
 /* mod is no longer valid after this! */
@@ -2956,6 +2953,11 @@  again:
 	list_add_rcu(&mod->list, &modules);
 	mutex_unlock(&module_mutex);
 
+	/* To avoid stressing percpu allocator, do this once we're unique. */
+	err = alloc_module_percpu(mod, &info);
+	if (err)
+		goto unlink_mod;
+
 	/* Now module is in final location, initialize linked lists, etc. */
 	err = module_unload_init(mod);
 	if (err)