diff mbox

[3.16.y-ckt,17/17] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors

Message ID 1460484484-22395-18-git-send-email-luis.henriques@canonical.com
State New
Headers show

Commit Message

Luis Henriques April 12, 2016, 6:08 p.m. UTC
3.16.7-ckt27 -stable review patch.  If anyone has any objections, please let me know.

---8<------------------------------------------------------------

From: Vitaly Kuznetsov <vkuznets@redhat.com>

commit e513229b4c386e6c9f66298c13fde92f73e6e1ac upstream.

When an SMP Hyper-V guest is running on top of 2012R2 Server and secondary
cpus are sent offline (with echo 0 > /sys/devices/system/cpu/cpu$cpu/online)
the system freeze is observed. This happens due to the fact that on newer
hypervisors (Win8, WS2012R2, ...) vmbus channel handlers are distributed
across all cpus (see init_vp_index() function in drivers/hv/channel_mgmt.c)
and on cpu offlining nobody reassigns them to CPU0. Prevent cpu offlining
when vmbus is loaded until the issue is fixed host-side.

This patch also disables hibernation but it is OK as it is also broken (MCE
error is hit on resume). Suspend still works.

Tested with WS2008R2 and WS2012R2.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Chas Williams <3chas3@gmail.com>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
---
 drivers/hv/vmbus_drv.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

Comments

Ben Hutchings April 14, 2016, 7:11 p.m. UTC | #1
On Tue, 2016-04-12 at 19:08 +0100, Luis Henriques wrote:
> 3.16.7-ckt27 -stable review patch.  If anyone has any objections, please let me know.
> 
> ---8<------------------------------------------------------------
> 
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> commit e513229b4c386e6c9f66298c13fde92f73e6e1ac upstream.
> 
> When an SMP Hyper-V guest is running on top of 2012R2 Server and secondary
> cpus are sent offline (with echo 0 > /sys/devices/system/cpu/cpu$cpu/online)
> the system freeze is observed. This happens due to the fact that on newer
> hypervisors (Win8, WS2012R2, ...) vmbus channel handlers are distributed
> across all cpus (see init_vp_index() function in drivers/hv/channel_mgmt.c)
> and on cpu offlining nobody reassigns them to CPU0. Prevent cpu offlining
> when vmbus is loaded until the issue is fixed host-side.
> 
> This patch also disables hibernation but it is OK as it is also broken (MCE
> error is hit on resume). Suspend still works.
[...]
> +static void hv_cpu_hotplug_quirk(bool vmbus_loaded)
> +{
> +	static void *previous_cpu_disable;
> +
> +	/*
> +	 * Offlining a CPU when running on newer hypervisors (WS2012R2, Win8,
> +	 * ...) is not supported at this moment as channel interrupts are
> +	 * distributed across all of them.
> +	 */
> +
> +	if ((vmbus_proto_version == VERSION_WS2008) ||
> +	    (vmbus_proto_version == VERSION_WIN7))
> +		return;
> +
> +	if (vmbus_loaded) {
> +		previous_cpu_disable = smp_ops.cpu_disable;
> +		smp_ops.cpu_disable = hyperv_cpu_disable;
> +		pr_notice("CPU offlining is not supported by hypervisor\n");
> +	} else if (previous_cpu_disable)
> +		smp_ops.cpu_disable = previous_cpu_disable;
[...]

This is a really bad hack.  What if two different drivers patched
smp_ops and got unloaded in a different order?  Perhaps the core
support code for Hyper-V should define its own smp_ops.

I don't want to stop this going into stable, but seriously, please
clean this up.

Ben.
diff mbox

Patch

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 4d6b26979fbd..233da0b9f4b9 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -32,6 +32,7 @@ 
 #include <linux/completion.h>
 #include <linux/hyperv.h>
 #include <linux/kernel_stat.h>
+#include <linux/cpu.h>
 #include <asm/hyperv.h>
 #include <asm/hypervisor.h>
 #include <asm/mshyperv.h>
@@ -671,6 +672,39 @@  static void vmbus_isr(void)
 		tasklet_schedule(&msg_dpc);
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+static int hyperv_cpu_disable(void)
+{
+	return -ENOSYS;
+}
+
+static void hv_cpu_hotplug_quirk(bool vmbus_loaded)
+{
+	static void *previous_cpu_disable;
+
+	/*
+	 * Offlining a CPU when running on newer hypervisors (WS2012R2, Win8,
+	 * ...) is not supported at this moment as channel interrupts are
+	 * distributed across all of them.
+	 */
+
+	if ((vmbus_proto_version == VERSION_WS2008) ||
+	    (vmbus_proto_version == VERSION_WIN7))
+		return;
+
+	if (vmbus_loaded) {
+		previous_cpu_disable = smp_ops.cpu_disable;
+		smp_ops.cpu_disable = hyperv_cpu_disable;
+		pr_notice("CPU offlining is not supported by hypervisor\n");
+	} else if (previous_cpu_disable)
+		smp_ops.cpu_disable = previous_cpu_disable;
+}
+#else
+static void hv_cpu_hotplug_quirk(bool vmbus_loaded)
+{
+}
+#endif
+
 /*
  * vmbus_bus_init -Main vmbus driver initialization routine.
  *
@@ -711,6 +745,7 @@  static int vmbus_bus_init(int irq)
 	if (ret)
 		goto err_alloc;
 
+	hv_cpu_hotplug_quirk(true);
 	vmbus_request_offers();
 
 	return 0;
@@ -964,6 +999,7 @@  static void __exit vmbus_exit(void)
 	bus_unregister(&hv_bus);
 	hv_cleanup();
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
+	hv_cpu_hotplug_quirk(false);
 }