From patchwork Wed Dec 11 17:13:26 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris J Arges X-Patchwork-Id: 300237 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 811C92C0091 for ; Thu, 12 Dec 2013 04:13:37 +1100 (EST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1VqnLj-0002Jx-P5; Wed, 11 Dec 2013 17:13:27 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1VqnLe-0002Jq-SM for kernel-team@lists.ubuntu.com; Wed, 11 Dec 2013 17:13:22 +0000 Received: from cpe-66-68-155-223.austin.res.rr.com ([66.68.155.223] helo=macgyver.austin.rr.com) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1VqnLe-0004WK-Gm for kernel-team@lists.ubuntu.com; Wed, 11 Dec 2013 17:13:22 +0000 From: Chris J Arges To: kernel-team@lists.ubuntu.com Subject: [trusty][PATCH] sched: Avoid throttle_cfs_rq() racing with period_timer stopping Date: Wed, 11 Dec 2013 11:13:26 -0600 Message-Id: <1386782006-3169-1-git-send-email-chris.j.arges@canonical.com> X-Mailer: git-send-email 1.7.9.5 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com From: Ben Segall BugLink: http://bugs.launchpad.net/bugs/1259645 throttle_cfs_rq() doesn't check to make sure that period_timer is running, and while update_curr/assign_cfs_runtime does, a concurrently running period_timer on another cpu could cancel itself between this cpu's update_curr and throttle_cfs_rq(). If there are no other cfs_rqs running in the tg to restart the timer, this causes the cfs_rq to be stranded forever. Fix this by calling __start_cfs_bandwidth() in throttle if the timer is inactive. (Also add some sched_debug lines for cfs_bandwidth.) Tested: make a run/sleep task in a cgroup, loop switching the cgroup between 1ms/100ms quota and unlimited, checking for timer_active=0 and throttled=1 as a failure. With the throttle_cfs_rq() change commented out this fails, with the full patch it passes. Signed-off-by: Ben Segall Signed-off-by: Peter Zijlstra Cc: pjt@google.com Link: http://lkml.kernel.org/r/20131016181632.22647.84174.stgit@sword-of-the-dawn.mtv.corp.google.com Signed-off-by: Ingo Molnar (cherry picked from commit f9f9ffc237dd924f048204e8799da74f9ecf40cf) Signed-off-by: Chris J Arges --- kernel/sched/debug.c | 8 ++++++++ kernel/sched/fair.c | 2 ++ 2 files changed, 10 insertions(+) diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c index 1965599..fd9ca1d 100644 --- a/kernel/sched/debug.c +++ b/kernel/sched/debug.c @@ -225,6 +225,14 @@ void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq) atomic_read(&cfs_rq->tg->runnable_avg)); #endif #endif +#ifdef CONFIG_CFS_BANDWIDTH + SEQ_printf(m, " .%-30s: %d\n", "tg->cfs_bandwidth.timer_active", + cfs_rq->tg->cfs_bandwidth.timer_active); + SEQ_printf(m, " .%-30s: %d\n", "throttled", + cfs_rq->throttled); + SEQ_printf(m, " .%-30s: %d\n", "throttle_count", + cfs_rq->throttle_count); +#endif #ifdef CONFIG_FAIR_GROUP_SCHED print_cfs_group_stats(m, cpu, cfs_rq->tg); diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 7c70201..513fc2f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -2335,6 +2335,8 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq) cfs_rq->throttled_clock = rq_clock(rq); raw_spin_lock(&cfs_b->lock); list_add_tail_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq); + if (!cfs_b->timer_active) + __start_cfs_bandwidth(cfs_b); raw_spin_unlock(&cfs_b->lock); }