Message ID | 20150317092707.16806.62378.stgit@mars (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Paul Mackerras |
On Tue, Mar 17, 2015 at 02:57:48PM +0530, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> and deliver MCE to guest if recovery is failed. For recovered errors
> we just go back to normal functioning of guest. But there are cases
> where we may hit MCE in guest with MSR(RI=0), which means MCE interrupt is
> not recoverable and guest can not function normally it should go down to
> panic path. The current implementation does not have check for MSR(RI=0)
> which can cause guest to crash with Bad kernel stack pointer instead of
> machine check oops message.
>
> [26281.490060] Bad kernel stack pointer 3fff9ccce5b0 at c00000000000490c
> [26281.490434] Oops: Bad kernel stack pointer, sig: 6 [#1]
> [26281.490472] SMP NR_CPUS=2048 NUMA pSeries
>
> This patch fixes this issue by checking MSR(RI=0) in KVM layer and forwarding
> unrecoverable interrupt to guest which then panics with proper machine check
> Oops message.
>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)

The patch itself is fine, but you need a proper headline (something
like "KVM: PPC: Book3S HV: Inform guest of unrecoverable machine
checks" perhaps) as the subject of the email, and you need to post the
patch to both the kvm@vger.kernel.org list and the
kvm-ppc@vger.kernel.org list.

Also, the English in the patch description could use some improvement.

Acked-by: Paul Mackerras <paulus@samba.org>

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index bb94e6f..258f46d 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2063,7 +2063,6 @@ machine_check_realmode:
 	mr	r3, r9		/* get vcpu pointer */
 	bl	kvmppc_realmode_machine_check
 	nop
-	cmpdi	r3, 0		/* Did we handle MCE ? */
 	ld	r9, HSTATE_KVM_VCPU(r13)
 	li	r12, BOOK3S_INTERRUPT_MACHINE_CHECK
 	/*
@@ -2076,13 +2075,18 @@ machine_check_realmode:
 	 * The old code used to return to host for unhandled errors which
 	 * was causing guest to hang with soft lockups inside guest and
 	 * makes it difficult to recover guest instance.
+	 *
+	 * if we receive machine check with MSR(RI=0) then deliver it to
+	 * guest as machine check causing guest to crash.
 	 */
-	ld	r10, VCPU_PC(r9)
 	ld	r11, VCPU_MSR(r9)
+	andi.	r10, r11, MSR_RI	/* check for unrecoverable exception */
+	beq	1f			/* Deliver a machine check to guest */
+	ld	r10, VCPU_PC(r9)
+	cmpdi	r3, 0		/* Did we handle MCE ? */
 	bne	2f	/* Continue guest execution. */
 	/* If not, deliver a machine check. SRR0/1 are already set */
-	li	r10, BOOK3S_INTERRUPT_MACHINE_CHECK
-	ld	r11, VCPU_MSR(r9)
+1:	li	r10, BOOK3S_INTERRUPT_MACHINE_CHECK
 	bl	kvmppc_msr_interrupt
 2:	b	fast_interrupt_c_return
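
For readers who do not follow PowerPC assembly, the branch order the patch introduces can be modelled roughly in C. This is only a sketch of the control flow, not kernel code: the enum, the function name mce_decide() and the `handled` parameter (standing in for a non-zero r3 returned by kvmppc_realmode_machine_check) are hypothetical names chosen for illustration, and the MSR_RI value of 0x2 is assumed to match the definition in the Linux powerpc headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: MSR_RI (Recoverable Interrupt) is bit value 0x2. */
#define MSR_RI 0x2ULL

/* Hypothetical names for the two exits of machine_check_realmode. */
enum mce_action {
	MCE_RESUME_GUEST,	/* label 2: b fast_interrupt_c_return */
	MCE_DELIVER_TO_GUEST,	/* label 1: bl kvmppc_msr_interrupt first */
};

/*
 * guest_msr models the value loaded from VCPU_MSR(r9);
 * handled models "r3 != 0" after kvmppc_realmode_machine_check.
 */
static enum mce_action mce_decide(uint64_t guest_msr, bool handled)
{
	/* andi. r10, r11, MSR_RI ; beq 1f */
	if (!(guest_msr & MSR_RI))
		return MCE_DELIVER_TO_GUEST;

	/* cmpdi r3, 0 ; bne 2f */
	if (handled)
		return MCE_RESUME_GUEST;

	/* Fall through to label 1: unhandled error. */
	return MCE_DELIVER_TO_GUEST;
}
```

The point of the reordering is visible in the first test: with RI clear, the guest receives the machine check even when the host-side handler reports that it recovered the error, so the guest takes its own machine check panic path rather than resuming in a state it cannot recover from.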