Patchwork [08/26] KVM: PPC: Add PV guest critical sections

login
register
mail settings
Submitter Alexander Graf
Date June 25, 2010, 11:24 p.m.
Message ID <1277508314-915-9-git-send-email-agraf@suse.de>
Download mbox | patch
Permalink /patch/57021/
State Not Applicable
Headers show

Comments

Alexander Graf - June 25, 2010, 11:24 p.m.
When running in hooked code we need a way to disable interrupts without
clobbering any interrupts or exiting out to the hypervisor.

To achieve this, we have an additional critical field in the shared page. If
that field is equal to the r1 register of the guest, it tells the hypervisor
that we're in such a critical section and thus may not receive any interrupts.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_para.h |    1 +
 arch/powerpc/kvm/book3s.c           |   15 +++++++++++++--
 arch/powerpc/kvm/booke.c            |   12 ++++++++++++
 3 files changed, 26 insertions(+), 2 deletions(-)
Avi Kivity - June 27, 2010, 8:21 a.m.
On 06/26/2010 02:24 AM, Alexander Graf wrote:
> When running in hooked code we need a way to disable interrupts without
> clobbering any interrupts or exiting out to the hypervisor.
>
> To achieve this, we have an additional critical field in the shared page. If
> that field is equal to the r1 register of the guest, it tells the hypervisor
> that we're in such a critical section and thus may not receive any interrupts.
>    

Is r1 reserved for this purpose?  Can't it match accidentally?

Why won't zero/nonzero work for this?
Alexander Graf - June 27, 2010, 9:40 a.m.
Am 27.06.2010 um 10:21 schrieb Avi Kivity <avi@redhat.com>:

> On 06/26/2010 02:24 AM, Alexander Graf wrote:
>> When running in hooked code we need a way to disable interrupts  
>> without
>> clobbering any interrupts or exiting out to the hypervisor.
>>
>> To achieve this, we have an additional critical field in the shared  
>> page. If
>> that field is equal to the r1 register of the guest, it tells the  
>> hypervisor
>> that we're in such a critical section and thus may not receive any  
>> interrupts.
>>
>
> Is r1 reserved for this purpose?  Can't it match accidentally?

r1 is defined by the abi to be the stack.

>
> Why won't zero/nonzero work for this?

Because there is no store immediate opcode on powerpc :(.

Alex
Avi Kivity - June 27, 2010, 9:52 a.m.
On 06/27/2010 12:40 PM, Alexander Graf wrote:
>
> Am 27.06.2010 um 10:21 schrieb Avi Kivity <avi@redhat.com>:
>
>> On 06/26/2010 02:24 AM, Alexander Graf wrote:
>>> When running in hooked code we need a way to disable interrupts without
>>> clobbering any interrupts or exiting out to the hypervisor.
>>>
>>> To achieve this, we have an additional critical field in the shared 
>>> page. If
>>> that field is equal to the r1 register of the guest, it tells the 
>>> hypervisor
>>> that we're in such a critical section and thus may not receive any 
>>> interrupts.
>>>
>>
>> Is r1 reserved for this purpose?  Can't it match accidentally?
>
> r1 is defined by the abi to be the stack.

Neat trick!

>>
>> Why won't zero/nonzero work for this?
>
> Because there is no store immediate opcode on powerpc :(.

Or inc/dec...
Avi Kivity - June 27, 2010, 10:03 a.m.
On 06/26/2010 02:24 AM, Alexander Graf wrote:
> When running in hooked code we need a way to disable interrupts without
> clobbering any interrupts or exiting out to the hypervisor.
>
> To achieve this, we have an additional critical field in the shared page. If
> that field is equal to the r1 register of the guest, it tells the hypervisor
> that we're in such a critical section and thus may not receive any interrupts.
>
>
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -251,14 +251,25 @@ int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
>   	int deliver = 1;
>   	int vec = 0;
>   	ulong flags = 0ULL;
> +	ulong crit_raw = vcpu->arch.shared->critical;
> +	ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
> +	bool crit;
> +
> +	/* Truncate crit indicators in 32 bit mode */
> +	if (!(vcpu->arch.shared->msr&  MSR_SF)) {
> +		crit_raw&= 0xffffffff;
> +		crit_r1&= 0xffffffff;
> +	}
> +
> +	crit = (crit_raw == crit_r1);
>    

I think you need to qualify that for supervisor mode only.  Otherwise 
guest userspace can guess the value of shared->critical and disable 
interrupts.
Alexander Graf - June 27, 2010, 10:33 a.m.
Am 27.06.2010 um 11:52 schrieb Avi Kivity <avi@redhat.com>:

> On 06/27/2010 12:40 PM, Alexander Graf wrote:
>>
>> Am 27.06.2010 um 10:21 schrieb Avi Kivity <avi@redhat.com>:
>>
>>> On 06/26/2010 02:24 AM, Alexander Graf wrote:
>>>> When running in hooked code we need a way to disable interrupts  
>>>> without
>>>> clobbering any interrupts or exiting out to the hypervisor.
>>>>
>>>> To achieve this, we have an additional critical field in the  
>>>> shared page. If
>>>> that field is equal to the r1 register of the guest, it tells the  
>>>> hypervisor
>>>> that we're in such a critical section and thus may not receive  
>>>> any interrupts.
>>>>
>>>
>>> Is r1 reserved for this purpose?  Can't it match accidentally?
>>
>> r1 is defined by the abi to be the stack.
>
> Neat trick!
>
>>>
>>> Why won't zero/nonzero work for this?
>>
>> Because there is no store immediate opcode on powerpc :(.
>
> Or inc/dec...

Uh - huh? How would that help?

Alex
Alexander Graf - June 27, 2010, 10:35 a.m.
Am 27.06.2010 um 12:03 schrieb Avi Kivity <avi@redhat.com>:

> On 06/26/2010 02:24 AM, Alexander Graf wrote:
>> When running in hooked code we need a way to disable interrupts  
>> without
>> clobbering any interrupts or exiting out to the hypervisor.
>>
>> To achieve this, we have an additional critical field in the shared  
>> page. If
>> that field is equal to the r1 register of the guest, it tells the  
>> hypervisor
>> that we're in such a critical section and thus may not receive any  
>> interrupts.
>>
>>
>> --- a/arch/powerpc/kvm/book3s.c
>> +++ b/arch/powerpc/kvm/book3s.c
>> @@ -251,14 +251,25 @@ int kvmppc_book3s_irqprio_deliver(struct  
>> kvm_vcpu *vcpu, unsigned int priority)
>>      int deliver = 1;
>>      int vec = 0;
>>      ulong flags = 0ULL;
>> +    ulong crit_raw = vcpu->arch.shared->critical;
>> +    ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
>> +    bool crit;
>> +
>> +    /* Truncate crit indicators in 32 bit mode */
>> +    if (!(vcpu->arch.shared->msr&  MSR_SF)) {
>> +        crit_raw&= 0xffffffff;
>> +        crit_r1&= 0xffffffff;
>> +    }
>> +
>> +    crit = (crit_raw == crit_r1);
>>
>
> I think you need to qualify that for supervisor mode only.   
> Otherwise guest userspace can guess the value of shared->critical  
> and disable interrupts.


Yes, you're right. Good catch!

Alex

>
Avi Kivity - June 27, 2010, 10:59 a.m.
On 06/27/2010 01:33 PM, Alexander Graf wrote:
>> Or inc/dec...
>
>
> Uh - huh? How would that help?

inc shared->critical   # disable interrupts
Alexander Graf - June 27, 2010, 11:49 a.m.
Am 27.06.2010 um 12:59 schrieb Avi Kivity <avi@redhat.com>:

> On 06/27/2010 01:33 PM, Alexander Graf wrote:
>>> Or inc/dec...
>>
>>
>> Uh - huh? How would that help?
>
> inc shared->critical   # disable interrupts

There is no opcode in the ppc isa that acts on momery without  
involving a register.

An inc would be:

ld rX, A(0)
addi rX, rX, 1
std rX, A(0)

Alex
>
Avi Kivity - June 27, 2010, 11:53 a.m.
On 06/27/2010 02:49 PM, Alexander Graf wrote:
>
> Am 27.06.2010 um 12:59 schrieb Avi Kivity <avi@redhat.com>:
>
>> On 06/27/2010 01:33 PM, Alexander Graf wrote:
>>>> Or inc/dec...
>>>
>>>
>>> Uh - huh? How would that help?
>>
>> inc shared->critical   # disable interrupts
>
> There is no opcode in the ppc isa that acts on momery without 
> involving a register.
>
> An inc would be:
>
> ld rX, A(0)
> addi rX, rX, 1
> std rX, A(0)

Right, I was agreeing with you.  I meant there was no inc/dec mem insns.
Alexander Graf - June 27, 2010, 12:06 p.m.
Am 27.06.2010 um 13:53 schrieb Avi Kivity <avi@redhat.com>:

> On 06/27/2010 02:49 PM, Alexander Graf wrote:
>>
>> Am 27.06.2010 um 12:59 schrieb Avi Kivity <avi@redhat.com>:
>>
>>> On 06/27/2010 01:33 PM, Alexander Graf wrote:
>>>>> Or inc/dec...
>>>>
>>>>
>>>> Uh - huh? How would that help?
>>>
>>> inc shared->critical   # disable interrupts
>>
>> There is no opcode in the ppc isa that acts on momery without  
>> involving a register.
>>
>> An inc would be:
>>
>> ld rX, A(0)
>> addi rX, rX, 1
>> std rX, A(0)
>
> Right, I was agreeing with you.  I meant there was no inc/dec mem  
> insns.

Ah, I see :). I think that's really the only point where I deem the  
x86 isa superior :).

Alex
Benjamin Herrenschmidt - June 27, 2010, 10:03 p.m.
On Sun, 2010-06-27 at 14:06 +0200, Alexander Graf wrote:
> > Right, I was agreeing with you.  I meant there was no inc/dec mem  
> > insns.
> 
> Ah, I see :). I think that's really the only point where I deem the  
> x86 isa superior :). 

Heh, well, that's a typical RISC thing tho. I think ARM and MIPS are
like PowerPC here at least :-)

Cheers,
Ben.

Patch

diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index eaab306..d1fe9ae 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -23,6 +23,7 @@ 
 #include <linux/types.h>
 
 struct kvm_vcpu_arch_shared {
+	__u64 critical;		/* Guest may not get interrupts if == r1 */
 	__u64 sprg0;
 	__u64 sprg1;
 	__u64 sprg2;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index e8001c5..f0e8047 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -251,14 +251,25 @@  int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
 	int deliver = 1;
 	int vec = 0;
 	ulong flags = 0ULL;
+	ulong crit_raw = vcpu->arch.shared->critical;
+	ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
+	bool crit;
+
+	/* Truncate crit indicators in 32 bit mode */
+	if (!(vcpu->arch.shared->msr & MSR_SF)) {
+		crit_raw &= 0xffffffff;
+		crit_r1 &= 0xffffffff;
+	}
+
+	crit = (crit_raw == crit_r1);
 
 	switch (priority) {
 	case BOOK3S_IRQPRIO_DECREMENTER:
-		deliver = vcpu->arch.shared->msr & MSR_EE;
+		deliver = (vcpu->arch.shared->msr & MSR_EE) && !crit;
 		vec = BOOK3S_INTERRUPT_DECREMENTER;
 		break;
 	case BOOK3S_IRQPRIO_EXTERNAL:
-		deliver = vcpu->arch.shared->msr & MSR_EE;
+		deliver = (vcpu->arch.shared->msr & MSR_EE) && !crit;
 		vec = BOOK3S_INTERRUPT_EXTERNAL;
 		break;
 	case BOOK3S_IRQPRIO_SYSTEM_RESET:
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index e7d1216..485f8fa 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -147,6 +147,17 @@  static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 	int allowed = 0;
 	ulong uninitialized_var(msr_mask);
 	bool update_esr = false, update_dear = false;
+	ulong crit_raw = vcpu->arch.shared->critical;
+	ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
+	bool crit;
+
+	/* Truncate crit indicators in 32 bit mode */
+	if (!(vcpu->arch.shared->msr & MSR_SF)) {
+		crit_raw &= 0xffffffff;
+		crit_r1 &= 0xffffffff;
+	}
+
+	crit = (crit_raw == crit_r1);
 
 	switch (priority) {
 	case BOOKE_IRQPRIO_DTLB_MISS:
@@ -181,6 +192,7 @@  static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 	case BOOKE_IRQPRIO_DECREMENTER:
 	case BOOKE_IRQPRIO_FIT:
 		allowed = vcpu->arch.shared->msr & MSR_EE;
+		allowed = allowed && !crit;
 		msr_mask = MSR_CE|MSR_ME|MSR_DE;
 		break;
 	case BOOKE_IRQPRIO_DEBUG: