diff mbox series

[v2,2/3] KVM: PPC: Book3S HV: Don't push XIVE context when not using XIVE device

Message ID 20190813100100.GC9567@blackberry
State Accepted
Headers show
Series powerpc/xive: Fix race condition leading to host crashes and hangs | expand

Commit Message

Paul Mackerras Aug. 13, 2019, 10:01 a.m. UTC
At present, when running a guest on POWER9 using HV KVM but not using
an in-kernel interrupt controller (XICS or XIVE), for example if QEMU
is run with the kernel_irqchip=off option, the guest entry code goes
ahead and tries to load the guest context into the XIVE hardware, even
though no context has been set up.

To fix this, we check that the "CAM word" is non-zero before pushing
it to the hardware.  The CAM word is initialized to a non-zero value
in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(),
and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu.

Cc: stable@vger.kernel.org # v4.11+
Reported-by: Cédric Le Goater <clg@kaod.org>
Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |  2 ++
 arch/powerpc/kvm/book3s_xive.c          | 11 ++++++++++-
 arch/powerpc/kvm/book3s_xive_native.c   |  3 +++
 3 files changed, 15 insertions(+), 1 deletion(-)

Comments

Cédric Le Goater Aug. 13, 2019, 12:18 p.m. UTC | #1
On 13/08/2019 12:01, Paul Mackerras wrote:
> At present, when running a guest on POWER9 using HV KVM but not using
> an in-kernel interrupt controller (XICS or XIVE), for example if QEMU
> is run with the kernel_irqchip=off option, the guest entry code goes
> ahead and tries to load the guest context into the XIVE hardware, even
> though no context has been set up.
> 
> To fix this, we check that the "CAM word" is non-zero before pushing
> it to the hardware.  The CAM word is initialized to a non-zero value
> in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(),
> and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu.

If a "CAM word" is defined, it means the vCPU (VP) was enabled at the
XIVE HW level. So this is the criteria to consider that a vCPU needs
to update (push) its XIVE thread interrupt context when scheduled
to run.


Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.

> 
> Cc: stable@vger.kernel.org # v4.11+
> Reported-by: Cédric Le Goater <clg@kaod.org>
> Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
> ---
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |  2 ++
>  arch/powerpc/kvm/book3s_xive.c          | 11 ++++++++++-
>  arch/powerpc/kvm/book3s_xive_native.c   |  3 +++
>  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 2e7e788..07181d0 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -942,6 +942,8 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
>  	ld	r11, VCPU_XIVE_SAVED_STATE(r4)
>  	li	r9, TM_QW1_OS
>  	lwz	r8, VCPU_XIVE_CAM_WORD(r4)
> +	cmpwi	r8, 0
> +	beq	no_xive
>  	li	r7, TM_QW1_OS + TM_WORD2
>  	mfmsr	r0
>  	andi.	r0, r0, MSR_DR		/* in real mode? */
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index 09f838a..586867e 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -67,8 +67,14 @@ void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu)
>  	void __iomem *tima = local_paca->kvm_hstate.xive_tima_virt;
>  	u64 pq;
>  
> -	if (!tima)
> +	/*
> +	 * Nothing to do if the platform doesn't have a XIVE
> +	 * or this vCPU doesn't have its own XIVE context
> +	 * (e.g. because it's not using an in-kernel interrupt controller).
> +	 */
> +	if (!tima || !vcpu->arch.xive_cam_word)
>  		return;
> +
>  	eieio();
>  	__raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS);
>  	__raw_writel(vcpu->arch.xive_cam_word, tima + TM_QW1_OS + TM_WORD2);
> @@ -1146,6 +1152,9 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu)
>  	/* Disable the VP */
>  	xive_native_disable_vp(xc->vp_id);
>  
> +	/* Clear the cam word so guest entry won't try to push context */
> +	vcpu->arch.xive_cam_word = 0;
> +
>  	/* Free the queues */
>  	for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
>  		struct xive_q *q = &xc->queues[i];
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> index 368427f..11b91b4 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -81,6 +81,9 @@ void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu)
>  	/* Disable the VP */
>  	xive_native_disable_vp(xc->vp_id);
>  
> +	/* Clear the cam word so guest entry won't try to push context */
> +	vcpu->arch.xive_cam_word = 0;
> +
>  	/* Free the queues */
>  	for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
>  		kvmppc_xive_native_cleanup_queue(vcpu, i);
>
Michael Ellerman Aug. 22, 2019, 10:46 a.m. UTC | #2
On Tue, 2019-08-13 at 10:01:00 UTC, Paul Mackerras wrote:
> At present, when running a guest on POWER9 using HV KVM but not using
> an in-kernel interrupt controller (XICS or XIVE), for example if QEMU
> is run with the kernel_irqchip=off option, the guest entry code goes
> ahead and tries to load the guest context into the XIVE hardware, even
> though no context has been set up.
> 
> To fix this, we check that the "CAM word" is non-zero before pushing
> it to the hardware.  The CAM word is initialized to a non-zero value
> in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(),
> and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu.
> 
> Cc: stable@vger.kernel.org # v4.11+
> Reported-by: Cédric Le Goater <clg@kaod.org>
> Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
> Reviewed-by: Cédric Le Goater <clg@kaod.org>

Applied to powerpc topic/ppc-kvm, thanks.

https://git.kernel.org/powerpc/c/8d4ba9c931bc384bcc6889a43915aaaf19d3e499

cheers
diff mbox series

Patch

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 2e7e788..07181d0 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -942,6 +942,8 @@  ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
 	ld	r11, VCPU_XIVE_SAVED_STATE(r4)
 	li	r9, TM_QW1_OS
 	lwz	r8, VCPU_XIVE_CAM_WORD(r4)
+	cmpwi	r8, 0
+	beq	no_xive
 	li	r7, TM_QW1_OS + TM_WORD2
 	mfmsr	r0
 	andi.	r0, r0, MSR_DR		/* in real mode? */
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index 09f838a..586867e 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -67,8 +67,14 @@  void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu)
 	void __iomem *tima = local_paca->kvm_hstate.xive_tima_virt;
 	u64 pq;
 
-	if (!tima)
+	/*
+	 * Nothing to do if the platform doesn't have a XIVE
+	 * or this vCPU doesn't have its own XIVE context
+	 * (e.g. because it's not using an in-kernel interrupt controller).
+	 */
+	if (!tima || !vcpu->arch.xive_cam_word)
 		return;
+
 	eieio();
 	__raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS);
 	__raw_writel(vcpu->arch.xive_cam_word, tima + TM_QW1_OS + TM_WORD2);
@@ -1146,6 +1152,9 @@  void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu)
 	/* Disable the VP */
 	xive_native_disable_vp(xc->vp_id);
 
+	/* Clear the cam word so guest entry won't try to push context */
+	vcpu->arch.xive_cam_word = 0;
+
 	/* Free the queues */
 	for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
 		struct xive_q *q = &xc->queues[i];
diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index 368427f..11b91b4 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -81,6 +81,9 @@  void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu)
 	/* Disable the VP */
 	xive_native_disable_vp(xc->vp_id);
 
+	/* Clear the cam word so guest entry won't try to push context */
+	vcpu->arch.xive_cam_word = 0;
+
 	/* Free the queues */
 	for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
 		kvmppc_xive_native_cleanup_queue(vcpu, i);