From patchwork Wed Sep 30 09:50:04 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Luis Henriques X-Patchwork-Id: 524928 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 4E975140D76; Thu, 1 Oct 2015 21:31:49 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1Zhc5N-0005c9-Eo; Thu, 01 Oct 2015 11:31:41 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1ZhE3o-0007dH-Kf for kernel-team@lists.ubuntu.com; Wed, 30 Sep 2015 09:52:28 +0000 Received: from av-217-129-142-138.netvisao.pt ([217.129.142.138] helo=localhost) by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1ZhE3n-000626-D6; Wed, 30 Sep 2015 09:52:27 +0000 From: Luis Henriques To: linux-kernel@vger.kernel.org, stable@vger.kernel.org, kernel-team@lists.ubuntu.com Subject: [PATCH 3.16.y-ckt 056/133] sched: Fix cpu_active_mask/cpu_online_mask race Date: Wed, 30 Sep 2015 10:50:04 +0100 Message-Id: <1443606681-7124-57-git-send-email-luis.henriques@canonical.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1443606681-7124-1-git-send-email-luis.henriques@canonical.com> References: <1443606681-7124-1-git-send-email-luis.henriques@canonical.com> MIME-Version: 1.0 X-Extended-Stable: 3.16 X-Mailman-Approved-At: Thu, 01 Oct 2015 11:31:40 +0000 Cc: Joerg Roedel , Peter Zijlstra , =?UTF-8?q?Jan=20H=2E=20Sch=C3=B6nherr?= , Borislav Petkov , Anton Blanchard , Matt Wilson , Thomas Gleixner , Linus Torvalds , Ingo Molnar X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com 3.16.7-ckt18 -stable review patch. If anyone has any objections, please let me know. ------------------ From: =?UTF-8?q?Jan=20H=2E=20Sch=C3=B6nherr?= commit dd9d3843755da95f63dd3a376f62b3e45c011210 upstream. There is a race condition in SMP bootup code, which may result in WARNING: CPU: 0 PID: 1 at kernel/workqueue.c:4418 workqueue_cpu_up_callback() or kernel BUG at kernel/smpboot.c:135! It can be triggered with a bit of luck in Linux guests running on busy hosts. CPU0 CPUn ==== ==== _cpu_up() __cpu_up() start_secondary() set_cpu_online() cpumask_set_cpu(cpu, to_cpumask(cpu_online_bits)); cpu_notify(CPU_ONLINE) cpumask_set_cpu(cpu, to_cpumask(cpu_active_bits)); During the various CPU_ONLINE callbacks CPUn is online but not active. Several things can go wrong at that point, depending on the scheduling of tasks on CPU0. Variant 1: cpu_notify(CPU_ONLINE) workqueue_cpu_up_callback() rebind_workers() set_cpus_allowed_ptr() This call fails because it requires an active CPU; rebind_workers() ends with a warning: WARNING: CPU: 0 PID: 1 at kernel/workqueue.c:4418 workqueue_cpu_up_callback() Variant 2: cpu_notify(CPU_ONLINE) smpboot_thread_call() smpboot_unpark_threads() .. __kthread_unpark() __kthread_bind() wake_up_state() .. select_task_rq() select_fallback_rq() The ->wake_cpu of the unparked thread is not allowed, making a call to select_fallback_rq() necessary. Then, select_fallback_rq() cannot find an allowed, active CPU and promptly resets the allowed CPUs, so that the task in question ends up on CPU0. When those unparked tasks are eventually executed, they run immediately into a BUG: kernel BUG at kernel/smpboot.c:135! Just changing the order in which the online/active bits are set (and adding some memory barriers), would solve the two issues above. However, it would change the order of operations back to the one before commit 6acbfb96976f ("sched: Fix hotplug vs. set_cpus_allowed_ptr()"), thus, reintroducing that particular problem. Going further back into history, we have at least the following commits touching this topic: - commit 2baab4e90495 ("sched: Fix select_fallback_rq() vs cpu_active/cpu_online") - commit 5fbd036b552f ("sched: Cleanup cpu_active madness") Together, these give us the following non-working solutions: - secondary CPU sets active before online, because active is assumed to be a subset of online; - secondary CPU sets online before active, because the primary CPU assumes that an online CPU is also active; - secondary CPU sets online and waits for primary CPU to set active, because it might deadlock. Commit 875ebe940d77 ("powerpc/smp: Wait until secondaries are active & online") introduces an arch-specific solution to this arch-independent problem. Now, go for a more general solution without explicit waiting and simply set active twice: once on the secondary CPU after online was set and once on the primary CPU after online was seen. set_cpus_allowed_ptr()") Signed-off-by: Jan H. Schönherr Acked-by: Peter Zijlstra Cc: Anton Blanchard Cc: Borislav Petkov Cc: Joerg Roedel Cc: Linus Torvalds Cc: Matt Wilson Cc: Michael Ellerman Cc: Peter Zijlstra Cc: Thomas Gleixner Fixes: 6acbfb96976f ("sched: Fix hotplug vs. set_cpus_allowed_ptr()") Link: http://lkml.kernel.org/r/1439408156-18840-1-git-send-email-jschoenh@amazon.de Signed-off-by: Ingo Molnar Signed-off-by: Luis Henriques --- kernel/sched/core.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 5c89af9ea192..f06dcf7dcd00 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -5174,6 +5174,14 @@ static int sched_cpu_active(struct notifier_block *nfb, case CPU_STARTING: set_cpu_rq_start_time(); return NOTIFY_OK; + case CPU_ONLINE: + /* + * At this point a starting CPU has marked itself as online via + * set_cpu_online(). But it might not yet have marked itself + * as active, which is essential from here on. + * + * Thus, fall-through and help the starting CPU along. + */ case CPU_DOWN_FAILED: set_cpu_active((long)hcpu, true); return NOTIFY_OK;