[4/4] powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest.

Message ID 20140617084441.GA18798@in.ibm.com (mailing list archive)
State Not Applicable

Commit Message

Mahesh J Salgaonkar June 17, 2014, 8:44 a.m. UTC
On 2014-06-17 16:23:58 Tue, Paul Mackerras wrote:
> On Wed, Jun 11, 2014 at 02:18:21PM +0530, Mahesh J Salgaonkar wrote:
> > From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> > 
> > Currently we forward MCEs to guest which have been recovered by guest.
> > And for unhandled errors we do not deliver the MCE to guest. It looks like
> > with no support of FWNMI in qemu, guest just panics whenever we deliver the
> > recovered MCEs to guest. Also, the existing code used to return to host for
> > unhandled errors which was causing guest to hang with soft lockups inside
> > guest and making it difficult to recover guest instance.
> > 
> > This patch now forwards all fatal MCEs to guest causing guest to crash/panic.
> > And, for recovered errors we just go back to normal functioning of guest
> > instead of returning to host.
> 
> ... having corrupted possibly live values that the guest had in SRR0/1.
> 
> Ideally the guest should have cleared MSR[RI] before putting values in
> SRR0/1, so perhaps you could check that and return to the guest
> without giving it a machine check if MSR[RI] is set.  But if MSR[RI]
> is clear, the guest is unfixably corrupted because the machine check
> overwrote SRR0/1, and the only thing we can do, in the absence of
> FWNMI support, is give the guest a machine check interrupt and let it
> crash.

Yes, agreed. I have a patch (below) ready for this; I will test/verify it and
send it out soon.

Thanks,
-Mahesh.

-------------
Deliver machine check with MSR(RI=0) to guest as MCE

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>


---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Patch

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 868347e..c9c56ee 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2257,7 +2257,6 @@  machine_check_realmode:
 	mr	r3, r9		/* get vcpu pointer */
 	bl	kvmppc_realmode_machine_check
 	nop
-	cmpdi	r3, 0		/* Did we handle MCE ? */
 	ld	r9, HSTATE_KVM_VCPU(r13)
 	li	r12, BOOK3S_INTERRUPT_MACHINE_CHECK
 	/*
@@ -2270,13 +2269,18 @@  machine_check_realmode:
 	 * The old code used to return to host for unhandled errors which
 	 * was causing guest to hang with soft lockups inside guest and
 	 * makes it difficult to recover guest instance.
+	 *
+	 * If we receive a machine check with MSR(RI=0) then deliver it to
+	 * the guest as a machine check, causing the guest to crash.
 	 */
-	ld	r10, VCPU_PC(r9)
 	ld	r11, VCPU_MSR(r9)
+	andi.	r10, r11, MSR_RI	/* check for unrecoverable exception */
+	beq	1f			/* Deliver a machine check to guest */
+	ld	r10, VCPU_PC(r9)
+	cmpdi	r3, 0		/* Did we handle MCE ? */
 	bne	2f	/* Continue guest execution. */
 	/* If not, deliver a machine check.  SRR0/1 are already set */
-	li	r10, BOOK3S_INTERRUPT_MACHINE_CHECK
-	ld	r11, VCPU_MSR(r9)
+1:	li	r10, BOOK3S_INTERRUPT_MACHINE_CHECK
 	bl	kvmppc_msr_interrupt
 2:	b	fast_interrupt_c_return
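
For readers less familiar with the assembly, the branch structure after this patch can be sketched in C roughly as follows. This is only an illustrative model, not kernel code: the helper mc_decide() and the enum mc_action are hypothetical names introduced here, guest_msr stands for the value loaded from VCPU_MSR(r9), and handled stands for the return value of kvmppc_realmode_machine_check in r3. MSR_RI is assumed to be 0x2, as in the kernel's register definitions.

```c
#include <stdint.h>

/* MSR[RI] (recoverable interrupt) bit; assumed to be 0x2 as in the kernel. */
#define MSR_RI 0x2UL

/* Hypothetical outcome names; they map onto the labels in the assembly. */
enum mc_action {
	MC_RESUME_GUEST,	/* "bne 2f": continue guest execution */
	MC_DELIVER_TO_GUEST	/* label "1:": inject a machine check */
};

/*
 * Hypothetical helper mirroring the patched branch order:
 * the MSR[RI] test comes first, then the "did we handle it" test.
 */
static enum mc_action mc_decide(uint64_t guest_msr, long handled)
{
	/* RI clear: SRR0/1 may have been live, so the guest cannot resume. */
	if (!(guest_msr & MSR_RI))
		return MC_DELIVER_TO_GUEST;

	/* RI set and the error was handled: go back to the guest. */
	if (handled != 0)
		return MC_RESUME_GUEST;

	/* RI set but the error was not handled: deliver the machine check. */
	return MC_DELIVER_TO_GUEST;
}
```

The point of the ordering is that a handled error is no longer enough on its own to resume the guest; the guest must also have had MSR[RI] set when the machine check arrived.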