Message ID | 1383878656-4196-1-git-send-email-pingfank@linux.vnet.ibm.com |
---|---|
State | New, archived |
On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:

> syscall is a very common behavior inside guest, and this patch
> optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
> so hypervisor can return to guest without heavy exit, i.e, no need
> to swap TLB, HTAB,.. etc

The syscall exit you touch here only happens when you do an sc > 0 with MSR_PR set inside the guest. The only case you realistically see this is when you run PR KVM inside of an HV KVM guest.

I don't think we should optimize for that case. Instead, we should rather try to not bounce to the 1st hypervisor in the first place in that scenario :).

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2013-11-08 at 04:10 +0100, Alexander Graf wrote:
> On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
>
> > syscall is a very common behavior inside guest, and this patch
> > optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
> > so hypervisor can return to guest without heavy exit, i.e, no need
> > to swap TLB, HTAB,.. etc
>
> The syscall exit you touch here only happens when you do an sc > 0
> with MSR_PR set inside the guest. The only case you realistically see
> this is when you run PR KVM inside of an HV KVM guest.
>
> I don't think we should optimize for that case. Instead, we should
> rather try to not bounce to the 1st hypervisor in the first place in
> that scenario :).

Well, so unfortunately openstack CI uses PR inside HV pretty heavily .... it *might* be worthwhile optimizing that path if the patch is simple enough... I'd make that Paul's call.

Cheers,
Ben.
On Fri, 2013-11-08 at 15:05 +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2013-11-08 at 04:10 +0100, Alexander Graf wrote:
> > On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
> >
> > > syscall is a very common behavior inside guest, and this patch
> > > optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
> > > so hypervisor can return to guest without heavy exit, i.e, no need
> > > to swap TLB, HTAB,.. etc
> >
> > The syscall exit you touch here only happens when you do an sc > 0
> > with MSR_PR set inside the guest. The only case you realistically see
> > this is when you run PR KVM inside of an HV KVM guest.
> >
> > I don't think we should optimize for that case. Instead, we should
> > rather try to not bounce to the 1st hypervisor in the first place in
> > that scenario :).
>
> Well, so unfortunately openstack CI uses PR inside HV pretty
> heavily .... it *might* be worthwhile optimizing that path if the patch
> is simple enough... I'd make that Paul's call.

Note that this is a statement of value for the idea ... not the implementation ;-) From a quick look with Paulus, the patch is quite broken. I'll let Paul comment in details.

Cheers,
Ben.
On Fri, Nov 8, 2013 at 11:10 AM, Alexander Graf <agraf@suse.de> wrote:
>
> On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
>
>> syscall is a very common behavior inside guest, and this patch
>> optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
>> so hypervisor can return to guest without heavy exit, i.e, no need
>> to swap TLB, HTAB,.. etc
>
> The syscall exit you touch here only happens when you do an sc > 0 with MSR_PR set inside the guest. The only case you realistically see this is when you run PR KVM inside of an HV KVM guest.
>
Maybe I misunderstood the ISA spec, but refer for "6.5.14 System Call Interrupt", no description about the MSR_PR when sc trigger a syscall interrupt. So I think, guest application "sc 0" will also fall to the kernel who owns hypervisor mode. Am I right?

> I don't think we should optimize for that case. Instead, we should rather try to not bounce to the 1st hypervisor in the first place in that scenario :).
>
Sorry, but just want to make clear about the idiom: 0 -> kernel run with NV, and 1st -> kernel run on HV-KVM and provide PR-KVM to up layer? Right?

When you say "try to not bounce to the 1st hypervisor ", what is the exact meaning and how can we achieve this? I am a quite newer on powerpc, and hope that I can get more clear figure about it :)

Thanks
Pingfan
On Fri, Nov 8, 2013 at 12:11 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> On Fri, 2013-11-08 at 15:05 +1100, Benjamin Herrenschmidt wrote:
>> On Fri, 2013-11-08 at 04:10 +0100, Alexander Graf wrote:
>> > On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
>> >
>> > > syscall is a very common behavior inside guest, and this patch
>> > > optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
>> > > so hypervisor can return to guest without heavy exit, i.e, no need
>> > > to swap TLB, HTAB,.. etc
>> >
>> > The syscall exit you touch here only happens when you do an sc > 0
>> > with MSR_PR set inside the guest. The only case you realistically see
>> > this is when you run PR KVM inside of an HV KVM guest.
>> >
>> > I don't think we should optimize for that case. Instead, we should
>> > rather try to not bounce to the 1st hypervisor in the first place in
>> > that scenario :).
>>
>> Well, so unfortunately openstack CI uses PR inside HV pretty
>> heavily .... it *might* be worthwhile optimizing that path if the patch
>> is simple enough... I'd make that Paul's call.
>
> Note that this is a statement of value for the idea ... not the
> implementation ;-) From a quick look with Paulus, the patch is quite
> broken. I'll let Paul comment in details.
>
Thank you very much,

Regards,
Pingfan
On Fri, Nov 8, 2013 at 12:19 PM, Liu ping fan <kernelfans@gmail.com> wrote:
> On Fri, Nov 8, 2013 at 11:10 AM, Alexander Graf <agraf@suse.de> wrote:
>>
>> On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
>>
>>> syscall is a very common behavior inside guest, and this patch
>>> optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
>>> so hypervisor can return to guest without heavy exit, i.e, no need
>>> to swap TLB, HTAB,.. etc
>>
>> The syscall exit you touch here only happens when you do an sc > 0 with MSR_PR set inside the guest. The only case you realistically see this is when you run PR KVM inside of an HV KVM guest.
>>
> Maybe I misunderstood the ISA spec, but refer for "6.5.14 System Call
> Interrupt", no description about the MSR_PR when sc trigger a syscall
> interrupt. So I think, guest application "sc 0" will also fall to the
> kernel who owns hypervisor mode. Am I right?
>
Some further comment: I think the essential of the problem is whether we switch RMA from guest to HV when interrupts raise. DSI/ISI will be redirected to HDSI and RMA switch. But what about SYSCALL, and DEC, external interrupt, ...etc?

>> I don't think we should optimize for that case. Instead, we should rather try to not bounce to the 1st hypervisor in the first place in that scenario :).
>>
> Sorry, but just want to make clear about the idiom: 0 -> kernel run
> with NV, and 1st -> kernel run on HV-KVM and provide PR-KVM to up
> layer? Right?
>
> When you say "try to not bounce to the 1st hypervisor ", what is the
> exact meaning and how can we achieve this? I am a quite newer on
> powerpc, and hope that I can get more clear figure about it :)
>
Thanks
Pingfan
On Fri, Nov 08, 2013 at 10:44:16AM +0800, Liu Ping Fan wrote:
> syscall is a very common behavior inside guest, and this patch
> optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
> so hypervisor can return to guest without heavy exit, i.e, no need
> to swap TLB, HTAB,.. etc

Many interrupts that are caused by guest code go directly to the guest
and don't come to the hypervisor at all. That includes system call
(sc 0), alignment interrupts, program interrupts, SLB miss interrupts,
etc. See section 6.5 of Book 3S of the Power ISA specification; all
the interrupts with '-' in the 'HV' column of the table there get
delivered directly to the guest when they occur inside a guest.

> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -1388,7 +1388,8 @@ kvmppc_hisi:
>  hcall_try_real_mode:
>  	ld	r3,VCPU_GPR(R3)(r9)
>  	andi.	r0,r11,MSR_PR
> -	bne	guest_exit_cont
> +	/* sc 1 from userspace - reflect to guest syscall */
> +	bne	sc_0_fast_return

Discrepancy between comment and code here. In fact we would only take
the branch for a sc 1 instruction in userspace, which occurs when a PR
KVM guest nested inside a HV KVM guest does a hypercall (i.e., not for
normal system calls). It is probably worthwhile to speed those up.

> +sc_0_fast_return:
> +	ld	r10,VCPU_PC(r9)
> +	ld	r11,VCPU_MSR(r9)

r11 must already contain this since you just did andi. r0,r11,MSR_PR.
In fact r10 already contains VCPU_PC(r9) at this point also, though
that is not so obvious.

> +	mtspr	SPRN_SRR0,r10
> +	mtspr	SPRN_SRR1,r11
> +	li	r10, BOOK3S_INTERRUPT_SYSCALL
> +	LOAD_REG_IMMEDIATE(r3,0xffffffff87a0ffff)	/* zero 33:36,42:47 */
> +	and	r11,r11,r3

This is not correct, since you don't even clear PR. In fact what you
need is to load up MSR_SF | MSR_ME, though that value changes with
little-endian support and changes again with transactional memory
support for POWER8. There is an idiom for loading that MSR value,
which is:

	li	r11, (MSR_ME << 1) | 1	/* synthesize MSR_SF | MSR_ME */
	rotldi	r11, r11, 63

which you could use for now, but it will need to be changed when
Anton's LE patch gets accepted.

Paul.
On Fri, Nov 8, 2013 at 7:12 PM, Paul Mackerras <paulus@samba.org> wrote:
> On Fri, Nov 08, 2013 at 10:44:16AM +0800, Liu Ping Fan wrote:
>> syscall is a very common behavior inside guest, and this patch
>> optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
>> so hypervisor can return to guest without heavy exit, i.e, no need
>> to swap TLB, HTAB,.. etc
>
> Many interrupts that are caused by guest code go directly to the guest
> and don't come to the hypervisor at all. That includes system call
> (sc 0), alignment interrupts, program interrupts, SLB miss interrupts,
> etc. See section 6.5 of Book 3S of the Power ISA specification; all
> the interrupts with '-' in the 'HV' column of the table there get
> delivered directly to the guest when they occur inside a guest.
>
Oh, got it, thanks! That is an important thing I tried to find out but missed all these days.

>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> @@ -1388,7 +1388,8 @@ kvmppc_hisi:
>>  hcall_try_real_mode:
>>  	ld	r3,VCPU_GPR(R3)(r9)
>>  	andi.	r0,r11,MSR_PR
>> -	bne	guest_exit_cont
>> +	/* sc 1 from userspace - reflect to guest syscall */
>> +	bne	sc_0_fast_return
>
> Discrepancy between comment and code here. In fact we would only take
> the branch for a sc 1 instruction in userspace, which occurs when a PR
> KVM guest nested inside a HV KVM guest does a hypercall (i.e., not for

I made a big mistake from the beginning, and now get a more clear understanding of the scene. Thanks!

> normal system calls). It is probably worthwhile to speed those up.
>
>> +sc_0_fast_return:
>> +	ld	r10,VCPU_PC(r9)
>> +	ld	r11,VCPU_MSR(r9)
>
> r11 must already contain this since you just did andi. r0,r11,MSR_PR.
> In fact r10 already contains VCPU_PC(r9) at this point also, though
> that is not so obvious.
>
>> +	mtspr	SPRN_SRR0,r10
>> +	mtspr	SPRN_SRR1,r11
>> +	li	r10, BOOK3S_INTERRUPT_SYSCALL
>> +	LOAD_REG_IMMEDIATE(r3,0xffffffff87a0ffff)	/* zero 33:36,42:47 */
>> +	and	r11,r11,r3
>
> This is not correct, since you don't even clear PR. In fact what you

Yes.

> need is to load up MSR_SF | MSR_ME, though that value changes with

Is it enough to only set "MSR_SF | MSR_ME"? Should the HV guest (PR KVM) need to fake msr, so that PR guest feels that "sc 1" is trapped by PR KVM directly? I.e, according to ISA "Figure 51. MSR setting due to interrupt", about "System Call", we need to keep MSR_IR/_DR unchanged. If it is true, then HV need to help HV guest to make this fake. Right?

> little-endian support and changes again with transactional memory
> support for POWER8. There is an idiom for loading that MSR value,
> which is:
>
>	li	r11, (MSR_ME << 1) | 1	/* synthesize MSR_SF | MSR_ME */
>	rotldi	r11, r11, 63
>
Why do we define MSR_SF_LG as bit 63, not like the ISA says bit 0 is SF? And could you enlighten me briefly that what is the effect on the value, when LE and transactional memory are introduced?

Thanks and best regards,
Pingfan

> which you could use for now, but it will need to be changed when
> Anton's LE patch gets accepted.
>
> Paul.
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 62a2b5a..73dc852 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -628,12 +628,6 @@ static int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		/* hcall - punt to userspace */
 		int i;
 
-		if (vcpu->arch.shregs.msr & MSR_PR) {
-			/* sc 1 from userspace - reflect to guest syscall */
-			kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_SYSCALL);
-			r = RESUME_GUEST;
-			break;
-		}
 		run->papr_hcall.nr = kvmppc_get_gpr(vcpu, 3);
 		for (i = 0; i < 9; ++i)
 			run->papr_hcall.args[i] = kvmppc_get_gpr(vcpu, 4 + i);
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index c71103b..9f626c3 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1388,7 +1388,8 @@ kvmppc_hisi:
 hcall_try_real_mode:
 	ld	r3,VCPU_GPR(R3)(r9)
 	andi.	r0,r11,MSR_PR
-	bne	guest_exit_cont
+	/* sc 1 from userspace - reflect to guest syscall */
+	bne	sc_0_fast_return
 	clrrdi	r3,r3,2
 	cmpldi	r3,hcall_real_table_end - hcall_real_table
 	bge	guest_exit_cont
@@ -1409,6 +1410,16 @@ hcall_try_real_mode:
 	ld	r11,VCPU_MSR(r4)
 	b	fast_guest_return
 
+sc_0_fast_return:
+	ld	r10,VCPU_PC(r9)
+	ld	r11,VCPU_MSR(r9)
+	mtspr	SPRN_SRR0,r10
+	mtspr	SPRN_SRR1,r11
+	li	r10, BOOK3S_INTERRUPT_SYSCALL
+	LOAD_REG_IMMEDIATE(r3,0xffffffff87a0ffff)	/* zero 33:36,42:47 */
+	and	r11,r11,r3
+	b	fast_guest_return
+
 /* We've attempted a real mode hcall, but it's punted it back
  * to userspace.  We need to restore some clobbered volatiles
  * before resuming the pass-it-to-qemu path */
syscall is a very common behavior inside guest, and this patch
optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
so hypervisor can return to guest without heavy exit, i.e, no need
to swap TLB, HTAB,.. etc

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
Compiled, but lack of bare metal, I have not tested it yet.
---
 arch/powerpc/kvm/book3s_hv.c            |  6 ------
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 13 ++++++++++++-
 2 files changed, 12 insertions(+), 7 deletions(-)