From patchwork Mon Feb 25 08:39:36 2013
X-Patchwork-Submitter: Stefan Bader
X-Patchwork-Id: 222862
Message-ID: <512B2348.10309@canonical.com>
Date: Mon, 25 Feb 2013 09:39:36 +0100
From: Stefan Bader
To: Ubuntu Kernel Team
Subject: Fwd: Patch Upstream: xen: Send spinlock IPI to all waiters
References: <20130225020250.6FB82203F@git.kroah.org>
In-Reply-To: <20130225020250.6FB82203F@git.kroah.org>
List-Id: Kernel team discussions

The patch below is now in the upstream stable process. When pulling /
applying this one to Precise, please make sure that the following
commit is reverted:

commit dec8ea944c1a873ccc33680e6155b829d3e129b2
Author: Stefan Bader
Date:   Tue Feb 5 18:17:33 2013 +0100

    [PATCH] UBUNTU: SAUCE: xen/pv-spinlock: Never enable interrupts
    in xen_spin_lock_slow()

Thanks,
Stefan

-------- Original Message --------
Subject: Patch Upstream: xen: Send spinlock IPI to all waiters
Date: Sun, 24 Feb 2013 21:02:50 -0500 (EST)
From: Gregs git-bot
To: greg@kroah.com, stable@vger.kernel.org

commit: 76eaca031f0af2bb303e405986f637811956a422
From: Stefan Bader
Date: Fri, 15 Feb 2013 09:48:52 +0100
Subject: xen: Send spinlock IPI to all waiters

There is a loophole between Xen's current implementation of
pv-spinlocks and the scheduler. This was triggerable through a
testcase until v3.6 changed the TLB flushing code. The problem
potentially is still there, just not observable in the same way.

What could happen was (is):

1. CPU n tries to schedule task x away and goes into a slow wait
   for the runq lock of CPU n-# (must be one with a lower number).
2. CPU n-#, while processing softirqs, tries to balance domains and
   goes into a slow wait for its own runq lock (for updating some
   records). Since this is a spin_lock_irqsave in softirq context,
   interrupts will be re-enabled for the duration of the poll_irq
   hypercall used by Xen.
3. Before the runq lock of CPU n-# is unlocked, CPU n-1 receives an
   interrupt (e.g. endio) and, when processing the interrupt, tries
   to wake up task x. But task x is in schedule and still on_cpu, so
   try_to_wake_up goes into a tight loop.
4. The runq lock of CPU n-# gets unlocked, but the message only gets
   sent to the first waiter, which is CPU n-#, and that one is busily
   stuck.
5. CPU n-# never returns from the nested interruption to take and
   release the lock, because the scheduler uses a busy wait. And
   CPU n never finishes the task migration, because the unlock
   notification only went to CPU n-#.

To avoid this, and since the unlocking code has no real sense of
which waiter is best suited to grab the lock, just send the IPI to
all of them. This causes the waiters to return from the hypercall
(those not interrupted, at least) and do active spinlocking.

BugLink: http://bugs.launchpad.net/bugs/1011792
Acked-by: Jan Beulich
Signed-off-by: Stefan Bader
Cc: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk
---
 arch/x86/xen/spinlock.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index 83e866d..f7a080e 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -328,7 +328,6 @@ static noinline void xen_spin_unlock_slow(struct xen_spinlock *xl)
 		if (per_cpu(lock_spinners, cpu) == xl) {
 			ADD_STATS(released_slow_kicked, 1);
 			xen_send_IPI_one(cpu, XEN_SPIN_UNLOCK_VECTOR);
-			break;
 		}
 	}
 }
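For reference, here is a sketch of how xen_spin_unlock_slow() reads
once the break is gone. Only the lines inside the hunk above are
verbatim; the enclosing for_each_online_cpu() walk, the declarations
and the released_slow counter are a reconstruction of the surrounding
code in arch/x86/xen/spinlock.c of that era, so treat them as
assumptions rather than exact upstream source:

/*
 * Sketch only -- rebuilt from the hunk context above. The enclosing
 * loop and the released_slow counter are assumptions about the
 * surrounding code, not part of the patch itself.
 */
static noinline void xen_spin_unlock_slow(struct xen_spinlock *xl)
{
	int cpu;

	ADD_STATS(released_slow, 1);

	for_each_online_cpu(cpu) {
		/* Kick every CPU recorded as polling on this lock. */
		if (per_cpu(lock_spinners, cpu) == xl) {
			ADD_STATS(released_slow_kicked, 1);
			xen_send_IPI_one(cpu, XEN_SPIN_UNLOCK_VECTOR);
			/*
			 * No break here any more: with the old break,
			 * only the first matching waiter was notified,
			 * and if that waiter was stuck in a nested
			 * interruption the wakeup was effectively lost.
			 */
		}
	}
}

Each waiter parks in the poll_irq hypercall with its per-CPU
lock_spinners entry pointing at the lock it waits on; the IPI makes
the hypercall return on every waiter, so all of them (those not
interrupted, at least) resume active spinning and one of them takes
the lock.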