diff mbox

[v2,2/3] KVM: PPC: epapr: Add idle hcall support for host

Message ID 1325754448-30055-1-git-send-email-yu.liu@freescale.com
State New, archived
Headers show

Commit Message

Liu Yu-B13201 Jan. 5, 2012, 9:07 a.m. UTC
And add a new flag definition in kvm_ppc_pvinfo to indicate
whether host support EV_IDLE hcall.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
---
v2:
1. instead of adding new field in kvm_ppc_pvinfo, use flags.
2. expose hcall definitions to userspace

 arch/powerpc/include/asm/kvm_para.h |   14 ++++++++++++--
 arch/powerpc/kvm/powerpc.c          |    8 ++++++++
 include/linux/kvm.h                 |    2 ++
 3 files changed, 22 insertions(+), 2 deletions(-)

Comments

Alexander Graf Jan. 9, 2012, 2:15 p.m. UTC | #1
On 05.01.2012, at 10:07, Liu Yu wrote:

> And add a new flag definition in kvm_ppc_pvinfo to indicate
> whether host support EV_IDLE hcall.
> 
> Signed-off-by: Liu Yu <yu.liu@freescale.com>
> ---
> v2:
> 1. instead of adding new field in kvm_ppc_pvinfo, use flags.
> 2. expose hcall definitions to userspace
> 
> arch/powerpc/include/asm/kvm_para.h |   14 ++++++++++++--
> arch/powerpc/kvm/powerpc.c          |    8 ++++++++
> include/linux/kvm.h                 |    2 ++
> 3 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
> index 50533f9..e8632b6 100644
> --- a/arch/powerpc/include/asm/kvm_para.h
> +++ b/arch/powerpc/include/asm/kvm_para.h
> @@ -41,9 +41,19 @@ struct kvm_vcpu_arch_shared {
> };
> 
> #define KVM_SC_MAGIC_R0		0x4b564d21 /* "KVM!" */
> -#define HC_VENDOR_KVM		(42 << 16)
> +
> +#include <asm/epapr_hcalls.h>
> +
> +/* ePAPR Hypercall Vendor ID */
> +#define HC_VENDOR_EPAPR		(EV_EPAPR_VENDOR_ID << 16)
> +#define HC_VENDOR_KVM		(EV_KVM_VENDOR_ID << 16)
> +
> +/* ePAPR Hypercall Token */
> +#define HC_EV_IDLE		EV_IDLE
> +
> +/* ePAPR Hypercall Return Codes */
> #define HC_EV_SUCCESS		0
> -#define HC_EV_UNIMPLEMENTED	12
> +#define HC_EV_UNIMPLEMENTED	EV_UNIMPLEMENTED
> 
> #define KVM_FEATURE_MAGIC_PAGE	1
> 
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index c33f6a7..1242ee1 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -81,6 +81,10 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
> 
> 		/* Second return value is in r4 */
> 		break;
> +	case HC_VENDOR_EPAPR | HC_EV_IDLE:
> +		r = HC_EV_SUCCESS;
> +		kvm_vcpu_block(vcpu);

Hrm. This will return on signal. So if the guest sends an idle hcall, then user space gets a random signal, we'll continue executing the guest CPU, getting us out of idle even though the guest didn't expect it, since the guest really wants to get an interrupt after the idle hcall.

Maybe we should just set the MSR_WE bit, so we'll get back into wait state on next vcpu_run? But then again that's guest visible, so I suppose a separate flag would be better.


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yoder Stuart-B08248 Jan. 9, 2012, 4:04 p.m. UTC | #2
> -----Original Message-----
> From: kvm-ppc-owner@vger.kernel.org [mailto:kvm-ppc-owner@vger.kernel.org] On Behalf Of
> Alexander Graf
> Sent: Monday, January 09, 2012 8:15 AM
> To: Liu Yu-B13201
> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; galak@kernel.crashing.org; Wood Scott-
> B07421; Tabi Timur-B04825
> Subject: Re: [PATCH v2 2/3] KVM: PPC: epapr: Add idle hcall support for host
> 
> 
> On 05.01.2012, at 10:07, Liu Yu wrote:
> 
> > And add a new flag definition in kvm_ppc_pvinfo to indicate whether
> > host support EV_IDLE hcall.
> >
> > Signed-off-by: Liu Yu <yu.liu@freescale.com>
> > ---
> > v2:
> > 1. instead of adding new field in kvm_ppc_pvinfo, use flags.
> > 2. expose hcall definitions to userspace
> >
> > arch/powerpc/include/asm/kvm_para.h |   14 ++++++++++++--
> > arch/powerpc/kvm/powerpc.c          |    8 ++++++++
> > include/linux/kvm.h                 |    2 ++
> > 3 files changed, 22 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/kvm_para.h
> > b/arch/powerpc/include/asm/kvm_para.h
> > index 50533f9..e8632b6 100644
> > --- a/arch/powerpc/include/asm/kvm_para.h
> > +++ b/arch/powerpc/include/asm/kvm_para.h
> > @@ -41,9 +41,19 @@ struct kvm_vcpu_arch_shared { };
> >
> > #define KVM_SC_MAGIC_R0		0x4b564d21 /* "KVM!" */
> > -#define HC_VENDOR_KVM		(42 << 16)
> > +
> > +#include <asm/epapr_hcalls.h>
> > +
> > +/* ePAPR Hypercall Vendor ID */
> > +#define HC_VENDOR_EPAPR		(EV_EPAPR_VENDOR_ID << 16)
> > +#define HC_VENDOR_KVM		(EV_KVM_VENDOR_ID << 16)
> > +
> > +/* ePAPR Hypercall Token */
> > +#define HC_EV_IDLE		EV_IDLE
> > +
> > +/* ePAPR Hypercall Return Codes */
> > #define HC_EV_SUCCESS		0
> > -#define HC_EV_UNIMPLEMENTED	12
> > +#define HC_EV_UNIMPLEMENTED	EV_UNIMPLEMENTED
> >
> > #define KVM_FEATURE_MAGIC_PAGE	1
> >
> > diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> > index c33f6a7..1242ee1 100644
> > --- a/arch/powerpc/kvm/powerpc.c
> > +++ b/arch/powerpc/kvm/powerpc.c
> > @@ -81,6 +81,10 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
> >
> > 		/* Second return value is in r4 */
> > 		break;
> > +	case HC_VENDOR_EPAPR | HC_EV_IDLE:
> > +		r = HC_EV_SUCCESS;
> > +		kvm_vcpu_block(vcpu);
> 
> Hrm. This will return on signal. So if the guest sends an idle hcall, then user space gets a
> random signal, we'll continue executing the guest CPU, getting us out of idle even though the
> guest didn't expect it, since the guest really wants to get an interrupt after the idle hcall.

Is that a problem though?   Won't it be just like a spurious interrupt where
the guest would wake up, see that there is nothing to do, and then
go idle again?   What is your concern here?

> Maybe we should just set the MSR_WE bit, so we'll get back into wait state on next vcpu_run?
> But then again that's guest visible, so I suppose a separate flag would be better.

Put a separate flag where?

Stuart

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Graf Jan. 9, 2012, 4:07 p.m. UTC | #3
On 09.01.2012, at 17:04, Yoder Stuart-B08248 wrote:

> 
> 
>> -----Original Message-----
>> From: kvm-ppc-owner@vger.kernel.org [mailto:kvm-ppc-owner@vger.kernel.org] On Behalf Of
>> Alexander Graf
>> Sent: Monday, January 09, 2012 8:15 AM
>> To: Liu Yu-B13201
>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; galak@kernel.crashing.org; Wood Scott-
>> B07421; Tabi Timur-B04825
>> Subject: Re: [PATCH v2 2/3] KVM: PPC: epapr: Add idle hcall support for host
>> 
>> 
>> On 05.01.2012, at 10:07, Liu Yu wrote:
>> 
>>> And add a new flag definition in kvm_ppc_pvinfo to indicate whether
>>> host support EV_IDLE hcall.
>>> 
>>> Signed-off-by: Liu Yu <yu.liu@freescale.com>
>>> ---
>>> v2:
>>> 1. instead of adding new field in kvm_ppc_pvinfo, use flags.
>>> 2. expose hcall definitions to userspace
>>> 
>>> arch/powerpc/include/asm/kvm_para.h |   14 ++++++++++++--
>>> arch/powerpc/kvm/powerpc.c          |    8 ++++++++
>>> include/linux/kvm.h                 |    2 ++
>>> 3 files changed, 22 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/arch/powerpc/include/asm/kvm_para.h
>>> b/arch/powerpc/include/asm/kvm_para.h
>>> index 50533f9..e8632b6 100644
>>> --- a/arch/powerpc/include/asm/kvm_para.h
>>> +++ b/arch/powerpc/include/asm/kvm_para.h
>>> @@ -41,9 +41,19 @@ struct kvm_vcpu_arch_shared { };
>>> 
>>> #define KVM_SC_MAGIC_R0		0x4b564d21 /* "KVM!" */
>>> -#define HC_VENDOR_KVM		(42 << 16)
>>> +
>>> +#include <asm/epapr_hcalls.h>
>>> +
>>> +/* ePAPR Hypercall Vendor ID */
>>> +#define HC_VENDOR_EPAPR		(EV_EPAPR_VENDOR_ID << 16)
>>> +#define HC_VENDOR_KVM		(EV_KVM_VENDOR_ID << 16)
>>> +
>>> +/* ePAPR Hypercall Token */
>>> +#define HC_EV_IDLE		EV_IDLE
>>> +
>>> +/* ePAPR Hypercall Return Codes */
>>> #define HC_EV_SUCCESS		0
>>> -#define HC_EV_UNIMPLEMENTED	12
>>> +#define HC_EV_UNIMPLEMENTED	EV_UNIMPLEMENTED
>>> 
>>> #define KVM_FEATURE_MAGIC_PAGE	1
>>> 
>>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>>> index c33f6a7..1242ee1 100644
>>> --- a/arch/powerpc/kvm/powerpc.c
>>> +++ b/arch/powerpc/kvm/powerpc.c
>>> @@ -81,6 +81,10 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
>>> 
>>> 		/* Second return value is in r4 */
>>> 		break;
>>> +	case HC_VENDOR_EPAPR | HC_EV_IDLE:
>>> +		r = HC_EV_SUCCESS;
>>> +		kvm_vcpu_block(vcpu);
>> 
>> Hrm. This will return on signal. So if the guest sends an idle hcall, then user space gets a
>> random signal, we'll continue executing the guest CPU, getting us out of idle even though the
>> guest didn't expect it, since the guest really wants to get an interrupt after the idle hcall.
> 
> Is that a problem though?   Won't it be just like a spurious interrupt where
> the guest would wake up, see that there is nothing to do, and then
> go idle again?   What is your concern here?

We would have changed state by enabling interrupts in the asm code. Also, it's not how hardware would work when setting MSR_WE, which if I understand Scott correctly, is the purpose of this hypercall.

> 
>> Maybe we should just set the MSR_WE bit, so we'll get back into wait state on next vcpu_run?
>> But then again that's guest visible, so I suppose a separate flag would be better.
> 
> Put a separate flag where?

In the vcpu struct. Just a bit somewhere to check in kvmppc_core_prepare_to_enter() in parallel to MSR_WE.


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Scott Wood Jan. 9, 2012, 8 p.m. UTC | #4
On 01/09/2012 08:15 AM, Alexander Graf wrote:
> 
> On 05.01.2012, at 10:07, Liu Yu wrote:
> 
>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>> index c33f6a7..1242ee1 100644
>> --- a/arch/powerpc/kvm/powerpc.c
>> +++ b/arch/powerpc/kvm/powerpc.c
>> @@ -81,6 +81,10 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
>>
>> 		/* Second return value is in r4 */
>> 		break;
>> +	case HC_VENDOR_EPAPR | HC_EV_IDLE:
>> +		r = HC_EV_SUCCESS;
>> +		kvm_vcpu_block(vcpu);
> 
> Hrm. This will return on signal. So if the guest sends an idle hcall,
> then user space gets a random signal, we'll continue executing the
> guest CPU, getting us out of idle even though the guest didn't expect
> it, since the guest really wants to get an interrupt after the idle
> hcall.

The ePAPR description of this hcall is a little vague (Stuart, put on
list to fix in next ePAPR revision?), but this is expected.  It will
also be the case if a guest directly uses the wait instruction.  Guests
must be able to deal with spurious wakeups.

-Scott

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Graf Jan. 9, 2012, 8:18 p.m. UTC | #5
Am 09.01.2012 um 21:00 schrieb Scott Wood <scottwood@freescale.com>:

> On 01/09/2012 08:15 AM, Alexander Graf wrote:
>> 
>> On 05.01.2012, at 10:07, Liu Yu wrote:
>> 
>>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>>> index c33f6a7..1242ee1 100644
>>> --- a/arch/powerpc/kvm/powerpc.c
>>> +++ b/arch/powerpc/kvm/powerpc.c
>>> @@ -81,6 +81,10 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
>>> 
>>>        /* Second return value is in r4 */
>>>        break;
>>> +    case HC_VENDOR_EPAPR | HC_EV_IDLE:
>>> +        r = HC_EV_SUCCESS;
>>> +        kvm_vcpu_block(vcpu);
>> 
>> Hrm. This will return on signal. So if the guest sends an idle hcall,
>> then user space gets a random signal, we'll continue executing the
>> guest CPU, getting us out of idle even though the guest didn't expect
>> it, since the guest really wants to get an interrupt after the idle
>> hcall.
> 
> The ePAPR description of this hcall is a little vague (Stuart, put on
> list to fix in next ePAPR revision?), but this is expected.  It will
> also be the case if a guest directly uses the wait instruction.  Guests
> must be able to deal with spurious wakeups.

The wait instruction does get executed in an infinite loop though, while this hcall is only executed once.

Alex

> 
> -Scott
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Scott Wood Jan. 9, 2012, 10:33 p.m. UTC | #6
On 01/09/2012 02:18 PM, Alexander Graf wrote:
> 
> 
> Am 09.01.2012 um 21:00 schrieb Scott Wood <scottwood@freescale.com>:
> 
>> On 01/09/2012 08:15 AM, Alexander Graf wrote:
>>>
>>> On 05.01.2012, at 10:07, Liu Yu wrote:
>>>
>>>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>>>> index c33f6a7..1242ee1 100644
>>>> --- a/arch/powerpc/kvm/powerpc.c
>>>> +++ b/arch/powerpc/kvm/powerpc.c
>>>> @@ -81,6 +81,10 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
>>>>
>>>>        /* Second return value is in r4 */
>>>>        break;
>>>> +    case HC_VENDOR_EPAPR | HC_EV_IDLE:
>>>> +        r = HC_EV_SUCCESS;
>>>> +        kvm_vcpu_block(vcpu);
>>>
>>> Hrm. This will return on signal. So if the guest sends an idle hcall,
>>> then user space gets a random signal, we'll continue executing the
>>> guest CPU, getting us out of idle even though the guest didn't expect
>>> it, since the guest really wants to get an interrupt after the idle
>>> hcall.
>>
>> The ePAPR description of this hcall is a little vague (Stuart, put on
>> list to fix in next ePAPR revision?), but this is expected.  It will
>> also be the case if a guest directly uses the wait instruction.  Guests
>> must be able to deal with spurious wakeups.
> 
> The wait instruction does get executed in an infinite loop though, while this hcall is only executed once.

Yes, I pointed that out as something that needs to be fixed in the
ev_idle patch.

-Scott


--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yoder Stuart-B08248 Jan. 9, 2012, 10:41 p.m. UTC | #7
> >> Hrm. This will return on signal. So if the guest sends an idle hcall,
> >> then user space gets a random signal, we'll continue executing the
> >> guest CPU, getting us out of idle even though the guest didn't expect it, since the guest
> really wants to get an interrupt after the idle hcall.
> >
> > Is that a problem though?   Won't it be just like a spurious interrupt where
> > the guest would wake up, see that there is nothing to do, and then
> > go idle again?   What is your concern here?
> 
> We would have changed state by enabling interrupts in the asm code.

What do you mean here, could you elaborate?

Stuart


--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Graf Jan. 9, 2012, 10:51 p.m. UTC | #8
On 09.01.2012, at 23:41, Yoder Stuart-B08248 wrote:

> 
>>>> Hrm. This will return on signal. So if the guest sends an idle hcall,
>>>> then user space gets a random signal, we'll continue executing the
>>>> guest CPU, getting us out of idle even though the guest didn't expect it, since the guest
>> really wants to get an interrupt after the idle hcall.
>>> 
>>> Is that a problem though?   Won't it be just like a spurious interrupt where
>>> the guest would wake up, see that there is nothing to do, and then
>>> go idle again?   What is your concern here?
>> 
>> We would have changed state by enabling interrupts in the asm code.
> 
> What do you mean here, could you elaborate?

> 
> +#ifdef CONFIG_KVM_GUEST
> +/*
> + * r3 contains the pointer to in[8]
> + * r4 contains the pointer to out[8]
> + * r5 contains the hcall vendor and nr
> + * r6 contains the handler which send hcall
> + */
> +_GLOBAL(e500_ev_idle)

entering with interrupts disabled

> +	rlwinm	r7,r1,0,0,31-THREAD_SHIFT	/* current thread_info */
> +	lwz	r8,TI_LOCAL_FLAGS(r7)	/* set napping bit */
> +	ori	r8,r8,_TLF_NAPPING	/* so when we take an exception */
> +	stw	r8,TI_LOCAL_FLAGS(r7)	/* it will return to our caller */
> +	wrteei	1

enable interrupts

> +	mtctr	r6
> +	bctr

call hypercall, race occurs, return to caller with interrupts enabled


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index 50533f9..e8632b6 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -41,9 +41,19 @@  struct kvm_vcpu_arch_shared {
 };
 
 #define KVM_SC_MAGIC_R0		0x4b564d21 /* "KVM!" */
-#define HC_VENDOR_KVM		(42 << 16)
+
+#include <asm/epapr_hcalls.h>
+
+/* ePAPR Hypercall Vendor ID */
+#define HC_VENDOR_EPAPR		(EV_EPAPR_VENDOR_ID << 16)
+#define HC_VENDOR_KVM		(EV_KVM_VENDOR_ID << 16)
+
+/* ePAPR Hypercall Token */
+#define HC_EV_IDLE		EV_IDLE
+
+/* ePAPR Hypercall Return Codes */
 #define HC_EV_SUCCESS		0
-#define HC_EV_UNIMPLEMENTED	12
+#define HC_EV_UNIMPLEMENTED	EV_UNIMPLEMENTED
 
 #define KVM_FEATURE_MAGIC_PAGE	1
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index c33f6a7..1242ee1 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -81,6 +81,10 @@  int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
 
 		/* Second return value is in r4 */
 		break;
+	case HC_VENDOR_EPAPR | HC_EV_IDLE:
+		r = HC_EV_SUCCESS;
+		kvm_vcpu_block(vcpu);
+		break;
 	default:
 		r = HC_EV_UNIMPLEMENTED;
 		break;
@@ -772,6 +776,10 @@  static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 	pvinfo->hcall[2] = inst_sc;
 	pvinfo->hcall[3] = inst_nop;
 
+#ifdef CONFIG_BOOKE
+	pvinfo->flags |= KVM_PPC_PVINFO_FLAGS_EV_IDLE;
+#endif
+
 	return 0;
 }
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c107fae..501712d 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -429,6 +429,8 @@  struct kvm_ppc_pvinfo {
 	__u8  pad[108];
 };
 
+#define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)
+
 #define KVMIO 0xAE
 
 /*