From patchwork Mon Jan 7 20:31:52 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Herton Ronaldo Krzesinski X-Patchwork-Id: 210116 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from chlorine.canonical.com (chlorine.canonical.com [91.189.94.204]) by ozlabs.org (Postfix) with ESMTP id 0EBB32C009A for ; Tue, 8 Jan 2013 07:32:10 +1100 (EST) Received: from localhost ([127.0.0.1] helo=chlorine.canonical.com) by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from ) id 1TsJMZ-0005yR-5Z; Mon, 07 Jan 2013 20:32:03 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from ) id 1TsJMX-0005xn-3D for kernel-team@lists.ubuntu.com; Mon, 07 Jan 2013 20:32:01 +0000 Received: from 189.114.234.143.dynamic.adsl.gvt.net.br ([189.114.234.143] helo=canonical.com) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1TsJMT-0002nu-IM; Mon, 07 Jan 2013 20:31:58 +0000 From: Herton Ronaldo Krzesinski To: Tejun Heo Subject: [ 3.5.y.z extended stable ] Patch "cgroup: cgroup_subsys->fork() should be called after the" has been added to staging queue Date: Mon, 7 Jan 2013 18:31:52 -0200 Message-Id: <1357590712-16492-1-git-send-email-herton.krzesinski@canonical.com> X-Mailer: git-send-email 1.7.9.5 X-Extended-Stable: 3.5 Cc: "Rafael J. Wysocki" , kernel-team@lists.ubuntu.com, Oleg Nesterov X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.13 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: kernel-team-bounces@lists.ubuntu.com Errors-To: kernel-team-bounces@lists.ubuntu.com This is a note to let you know that I have just added a patch titled cgroup: cgroup_subsys->fork() should be called after the to the linux-3.5.y-queue branch of the 3.5.y.z extended stable tree which can be found at: http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.5.y-queue If you, or anyone else, feels it should not be added to this tree, please reply to this email. For more information about the 3.5.y.z tree, see https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable Thanks. -Herton ------ From c7c467a1e121c6cc23afd2807ae5d106f3da7d91 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Tue, 16 Oct 2012 15:03:14 -0700 Subject: [PATCH] cgroup: cgroup_subsys->fork() should be called after the task is added to css_set commit 5edee61edeaaebafe584f8fb7074c1ef4658596b upstream. cgroup core has a bug which violates a basic rule about event notifications - when a new entity needs to be added, you add that to the notification list first and then make the new entity conform to the current state. If done in the reverse order, an event happening inbetween will be lost. cgroup_subsys->fork() is invoked way before the new task is added to the css_set. Currently, cgroup_freezer is the only user of ->fork() and uses it to make new tasks conform to the current state of the freezer. If FROZEN state is requested while fork is in progress between cgroup_fork_callbacks() and cgroup_post_fork(), the child could escape freezing - the cgroup isn't frozen when ->fork() is called and the freezer couldn't see the new task on the css_set. This patch moves cgroup_subsys->fork() invocation to cgroup_post_fork() after the new task is added to the css_set. cgroup_fork_callbacks() is removed. Because now a task may be migrated during cgroup_subsys->fork(), freezer_fork() is updated so that it adheres to the usual RCU locking and the rather pointless comment on why locking can be different there is removed (if it doesn't make anything simpler, why even bother?). Signed-off-by: Tejun Heo Cc: Oleg Nesterov Cc: Rafael J. Wysocki [ herton: backport to 3.5, CGROUP_BUILTIN_SUBSYS_COUNT is still present, move the same code in cgroup_fork_callbacks over to cgroup_post_fork ] Signed-off-by: Herton Ronaldo Krzesinski --- include/linux/cgroup.h | 1 - kernel/cgroup.c | 55 ++++++++++++++++++++++------------------------- kernel/cgroup_freezer.c | 13 ++++------- kernel/fork.c | 9 +------- 4 files changed, 31 insertions(+), 47 deletions(-) -- 1.7.9.5 diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index d3f5fba..986b06a 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -33,7 +33,6 @@ extern int cgroup_lock_is_held(void); extern bool cgroup_lock_live_group(struct cgroup *cgrp); extern void cgroup_unlock(void); extern void cgroup_fork(struct task_struct *p); -extern void cgroup_fork_callbacks(struct task_struct *p); extern void cgroup_post_fork(struct task_struct *p); extern void cgroup_exit(struct task_struct *p, int run_callbacks); extern int cgroupstats_build(struct cgroupstats *stats, diff --git a/kernel/cgroup.c b/kernel/cgroup.c index a91aa0b..a31b636 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -4757,41 +4757,19 @@ void cgroup_fork(struct task_struct *child) } /** - * cgroup_fork_callbacks - run fork callbacks - * @child: the new task - * - * Called on a new task very soon before adding it to the - * tasklist. No need to take any locks since no-one can - * be operating on this task. - */ -void cgroup_fork_callbacks(struct task_struct *child) -{ - if (need_forkexit_callback) { - int i; - /* - * forkexit callbacks are only supported for builtin - * subsystems, and the builtin section of the subsys array is - * immutable, so we don't need to lock the subsys array here. - */ - for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) { - struct cgroup_subsys *ss = subsys[i]; - if (ss->fork) - ss->fork(child); - } - } -} - -/** * cgroup_post_fork - called on a new task after adding it to the task list * @child: the task in question * - * Adds the task to the list running through its css_set if necessary. - * Has to be after the task is visible on the task list in case we race - * with the first call to cgroup_iter_start() - to guarantee that the - * new task ends up on its list. + * Adds the task to the list running through its css_set if necessary and + * call the subsystem fork() callbacks. Has to be after the task is + * visible on the task list in case we race with the first call to + * cgroup_iter_start() - to guarantee that the new task ends up on its + * list. */ void cgroup_post_fork(struct task_struct *child) { + int i; + /* * use_task_css_set_links is set to 1 before we walk the tasklist * under the tasklist_lock and we read it here after we added the child @@ -4811,7 +4789,26 @@ void cgroup_post_fork(struct task_struct *child) task_unlock(child); write_unlock(&css_set_lock); } + + /* + * Call ss->fork(). This must happen after @child is linked on + * css_set; otherwise, @child might change state between ->fork() + * and addition to css_set. + */ + if (need_forkexit_callback) { + /* + * forkexit callbacks are only supported for builtin + * subsystems, and the builtin section of the subsys array is + * immutable, so we don't need to lock the subsys array here. + */ + for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) { + struct cgroup_subsys *ss = subsys[i]; + if (ss->fork) + ss->fork(child); + } + } } + /** * cgroup_exit - detach cgroup from exiting task * @tsk: pointer to task_struct of exiting process diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c index 3649fc6..f990569 100644 --- a/kernel/cgroup_freezer.c +++ b/kernel/cgroup_freezer.c @@ -186,23 +186,15 @@ static void freezer_fork(struct task_struct *task) { struct freezer *freezer; - /* - * No lock is needed, since the task isn't on tasklist yet, - * so it can't be moved to another cgroup, which means the - * freezer won't be removed and will be valid during this - * function call. Nevertheless, apply RCU read-side critical - * section to suppress RCU lockdep false positives. - */ rcu_read_lock(); freezer = task_freezer(task); - rcu_read_unlock(); /* * The root cgroup is non-freezable, so we can skip the * following check. */ if (!freezer->css.cgroup->parent) - return; + goto out; spin_lock_irq(&freezer->lock); BUG_ON(freezer->state == CGROUP_FROZEN); @@ -210,7 +202,10 @@ static void freezer_fork(struct task_struct *task) /* Locking avoids race with FREEZING -> THAWED transitions. */ if (freezer->state == CGROUP_FREEZING) freeze_task(task); + spin_unlock_irq(&freezer->lock); +out: + rcu_read_unlock(); } /* diff --git a/kernel/fork.c b/kernel/fork.c index f9d0499..bbb6881 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1160,7 +1160,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, { int retval; struct task_struct *p; - int cgroup_callbacks_done = 0; if ((clone_flags & (CLONE_NEWNS|CLONE_FS)) == (CLONE_NEWNS|CLONE_FS)) return ERR_PTR(-EINVAL); @@ -1422,12 +1421,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, INIT_LIST_HEAD(&p->thread_group); INIT_HLIST_HEAD(&p->task_works); - /* Now that the task is set up, run cgroup callbacks if - * necessary. We need to run them before the task is visible - * on the tasklist. */ - cgroup_fork_callbacks(p); - cgroup_callbacks_done = 1; - /* Need tasklist lock for parent etc handling! */ write_lock_irq(&tasklist_lock); @@ -1532,7 +1525,7 @@ bad_fork_cleanup_cgroup: #endif if (clone_flags & CLONE_THREAD) threadgroup_change_end(current); - cgroup_exit(p, cgroup_callbacks_done); + cgroup_exit(p, 0); delayacct_tsk_free(p); module_put(task_thread_info(p)->exec_domain->module); bad_fork_cleanup_count: