diff mbox

powerpc/kvm: support to handle sw breakpoint

Message ID 1402780097-28827-1-git-send-email-maddy@linux.vnet.ibm.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

maddy June 14, 2014, 9:08 p.m. UTC
This patch adds kernel side support for software breakpoint.
Design is that, by using an illegal instruction, we trap to hypervisor
via Emulation Assistance interrupt, where we check for the illegal instruction
and accordingly we return to Host or Guest. Patch mandates use of "abs" instruction
(primary opcode 31 and extended opcode 360) as sw breakpoint instruction.
Based on PowerISA v2.01, ABS instruction has been dropped from the architecture
and treated an illegal instruction.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/kvm/book3s.c    |  3 ++-
 arch/powerpc/kvm/book3s_hv.c | 23 +++++++++++++++++++----
 2 files changed, 21 insertions(+), 5 deletions(-)

Comments

Alexander Graf June 17, 2014, 8:54 a.m. UTC | #1
On 14.06.14 23:08, Madhavan Srinivasan wrote:
> This patch adds kernel side support for software breakpoint.
> Design is that, by using an illegal instruction, we trap to hypervisor
> via Emulation Assistance interrupt, where we check for the illegal instruction
> and accordingly we return to Host or Guest. Patch mandates use of "abs" instruction
> (primary opcode 31 and extended opcode 360) as sw breakpoint instruction.
> Based on PowerISA v2.01, ABS instruction has been dropped from the architecture
> and treated an illegal instruction.
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
>   arch/powerpc/kvm/book3s.c    |  3 ++-
>   arch/powerpc/kvm/book3s_hv.c | 23 +++++++++++++++++++----
>   2 files changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index c254c27..b40fe5d 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
>   int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>   					struct kvm_guest_debug *dbg)
>   {
> -	return -EINVAL;
> +	vcpu->guest_debug = dbg->control;
> +	return 0;
>   }
>   
>   void kvmppc_decrementer_func(unsigned long data)
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 7a12edb..688421d 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -67,6 +67,14 @@
>   /* Used as a "null" value for timebase values */
>   #define TB_NIL	(~(u64)0)
>   
> +/*
> + * SW_BRK_DBG_INT is debug Instruction for supporting Software Breakpoint.
> + * Instruction mnemonic is ABS, primary opcode is 31 and extended opcode is 360.
> + * Based on PowerISA v2.01, ABS instruction has been dropped from the architecture
> + * and treated an illegal instruction.
> + */
> +#define SW_BRK_DBG_INT 0x7c0002d0

The instruction we use to trap needs to get exposed to user space via a 
ONE_REG property.

Also, why don't we use twi always or something else that actually is 
defined as illegal instruction? I would like to see this shared with 
book3s_32 PR.

> +
>   static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
>   static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
>   
> @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   		break;
>   	/*
>   	 * This occurs if the guest executes an illegal instruction.
> -	 * We just generate a program interrupt to the guest, since
> -	 * we don't emulate any guest instructions at this stage.
> +	 * To support software breakpoint, we check for the sw breakpoint
> +	 * instruction to return back to host, if not we just generate a
> +	 * program interrupt to the guest.
>   	 */
>   	case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
> -		kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
> -		r = RESUME_GUEST;
> +		if (vcpu->arch.last_inst == SW_BRK_DBG_INT) {

Don't access last_inst directly. Instead use the provided helpers.

> +			run->exit_reason = KVM_EXIT_DEBUG;
> +			run->debug.arch.address = vcpu->arch.pc;
> +			r = RESUME_HOST;
> +		} else {
> +			kvmppc_core_queue_program(vcpu, 0x80000);

magic numbers

> +			r = RESUME_GUEST;
> +		}
>   		break;
>   	/*
>   	 * This occurs if the guest (kernel or userspace), does something that

Please enable PR KVM as well while you're at it.


Alex
Benjamin Herrenschmidt June 17, 2014, 9:22 a.m. UTC | #2
On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
> 
> Also, why don't we use twi always or something else that actually is 
> defined as illegal instruction? I would like to see this shared with 
> book3s_32 PR.

twi will be directed to the guest on HV no ? We want a real illegal
because those go to the host (for potential emulation by the HV). I'm
trying to see if I can get the architect to set one in stone in a future
proof way.

Cheers,
Ben.
Alexander Graf June 17, 2014, 9:25 a.m. UTC | #3
On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
> On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
>> Also, why don't we use twi always or something else that actually is
>> defined as illegal instruction? I would like to see this shared with
>> book3s_32 PR.
> twi will be directed to the guest on HV no ? We want a real illegal
> because those go to the host (for potential emulation by the HV).

Ah, good point. I guess we need different one for PR and HV then to 
ensure compatibility with older ISAs on PR.


Alex

> I'm
> trying to see if I can get the architect to set one in stone in a future
> proof way.
>
> Cheers,
> Ben.
>
Benjamin Herrenschmidt June 17, 2014, 9:32 a.m. UTC | #4
On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
> On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
> > On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
> >> Also, why don't we use twi always or something else that actually is
> >> defined as illegal instruction? I would like to see this shared with
> >> book3s_32 PR.
> > twi will be directed to the guest on HV no ? We want a real illegal
> > because those go to the host (for potential emulation by the HV).
> 
> Ah, good point. I guess we need different one for PR and HV then to 
> ensure compatibility with older ISAs on PR.

Well, we also need to be careful with what happens if a PR guest puts
that instruction in, do that stop its HV guest/host ?

What if it's done in userspace ? Do that stop the kernel ? :-)

Maddy, I haven't checked, does your patch ensure that we only ever stop
if the instruction is at a recorded bkpt address ? It still means that a
userspace process can practically DOS its kernel by issuing a lot of
these causing a crapload of exits.

Cheers,
Ben.

> Alex
> 
> > I'm
> > trying to see if I can get the architect to set one in stone in a future
> > proof way.
> >
> > Cheers,
> > Ben.
> >
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Graf June 17, 2014, 9:43 a.m. UTC | #5
On 17.06.14 11:32, Benjamin Herrenschmidt wrote:
> On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
>> On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
>>> On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
>>>> Also, why don't we use twi always or something else that actually is
>>>> defined as illegal instruction? I would like to see this shared with
>>>> book3s_32 PR.
>>> twi will be directed to the guest on HV no ? We want a real illegal
>>> because those go to the host (for potential emulation by the HV).
>> Ah, good point. I guess we need different one for PR and HV then to
>> ensure compatibility with older ISAs on PR.
> Well, we also need to be careful with what happens if a PR guest puts
> that instruction in, do that stop its HV guest/host ?
>
> What if it's done in userspace ? Do that stop the kernel ? :-)

The way SW breakpointing is handled is that when we see one, it gets 
deflected into user space. User space then has an array of breakpoints 
it configured itself. If the breakpoint is part of that list, it 
consumes it. If not, it injects a debug interrupt (program in this case) 
into the guest.

That way we can overlay that one instruction with as many layers as we 
like :). We only get a performance hit on execution of that instruction.

> Maddy, I haven't checked, does your patch ensure that we only ever stop
> if the instruction is at a recorded bkpt address ? It still means that a
> userspace process can practically DOS its kernel by issuing a lot of
> these causing a crapload of exits.

Only user space knows about its breakpoint addresses, so we have to 
deflect. However since time still ticks on, we only increase jitter of 
the guest. The process would still get scheduled away after the same 
amount of real time, no?


Alex
maddy June 17, 2014, 10:51 a.m. UTC | #6
On Tuesday 17 June 2014 03:02 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
>> On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
>>> On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
>>>> Also, why don't we use twi always or something else that actually is
>>>> defined as illegal instruction? I would like to see this shared with
>>>> book3s_32 PR.
>>> twi will be directed to the guest on HV no ? We want a real illegal
>>> because those go to the host (for potential emulation by the HV).
>>
>> Ah, good point. I guess we need different one for PR and HV then to 
>> ensure compatibility with older ISAs on PR.
> 
> Well, we also need to be careful with what happens if a PR guest puts
> that instruction in, do that stop its HV guest/host ?
> 

Damn, my mail client is messed up. did not see the mail till now.


I havent tried this incase of PR guest kernel. I will need to try this
before commenting.

> What if it's done in userspace ? Do that stop the kernel ? :-)
> 

Basically flow is that, when we see this instruction, we return to host,
and host checks for address in the SW array and if not it returns to kernel.

> Maddy, I haven't checked, does your patch ensure that we only ever stop
> if the instruction is at a recorded bkpt address ? It still means that a
> userspace process can practically DOS its kernel by issuing a lot of
> these causing a crapload of exits.
> 
This is valid, userspace can create a mess, need to handle this, meaning
incase if we dont find a valid SW breakpoint for this address in the
HOST, we need to route it to guest and kill it at app.

Regards
Maddy

> Cheers,
> Ben.
> 
>> Alex
>>
>>> I'm
>>> trying to see if I can get the architect to set one in stone in a future
>>> proof way.
>>>
>>> Cheers,
>>> Ben.
>>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
>
maddy June 17, 2014, 11:07 a.m. UTC | #7
On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote:
> 
> On 14.06.14 23:08, Madhavan Srinivasan wrote:
>> This patch adds kernel side support for software breakpoint.
>> Design is that, by using an illegal instruction, we trap to hypervisor
>> via Emulation Assistance interrupt, where we check for the illegal
>> instruction
>> and accordingly we return to Host or Guest. Patch mandates use of
>> "abs" instruction
>> (primary opcode 31 and extended opcode 360) as sw breakpoint instruction.
>> Based on PowerISA v2.01, ABS instruction has been dropped from the
>> architecture
>> and treated an illegal instruction.
>>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/kvm/book3s.c    |  3 ++-
>>   arch/powerpc/kvm/book3s_hv.c | 23 +++++++++++++++++++----
>>   2 files changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
>> index c254c27..b40fe5d 100644
>> --- a/arch/powerpc/kvm/book3s.c
>> +++ b/arch/powerpc/kvm/book3s.c
>> @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu
>> *vcpu,
>>   int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>>                       struct kvm_guest_debug *dbg)
>>   {
>> -    return -EINVAL;
>> +    vcpu->guest_debug = dbg->control;
>> +    return 0;
>>   }
>>     void kvmppc_decrementer_func(unsigned long data)
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 7a12edb..688421d 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -67,6 +67,14 @@
>>   /* Used as a "null" value for timebase values */
>>   #define TB_NIL    (~(u64)0)
>>   +/*
>> + * SW_BRK_DBG_INT is debug Instruction for supporting Software
>> Breakpoint.
>> + * Instruction mnemonic is ABS, primary opcode is 31 and extended
>> opcode is 360.
>> + * Based on PowerISA v2.01, ABS instruction has been dropped from the
>> architecture
>> + * and treated an illegal instruction.
>> + */
>> +#define SW_BRK_DBG_INT 0x7c0002d0
> 
> The instruction we use to trap needs to get exposed to user space via a
> ONE_REG property.
> 

Yes. I got to know about that from Bharat (patchset "ppc debug: Add
debug stub support"). I will change it.

> Also, why don't we use twi always or something else that actually is
> defined as illegal instruction? I would like to see this shared with
> book3s_32 PR.
> 
>> +
>>   static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
>>   static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
>>   @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct
>> kvm_run *run, struct kvm_vcpu *vcpu,
>>           break;
>>       /*
>>        * This occurs if the guest executes an illegal instruction.
>> -     * We just generate a program interrupt to the guest, since
>> -     * we don't emulate any guest instructions at this stage.
>> +     * To support software breakpoint, we check for the sw breakpoint
>> +     * instruction to return back to host, if not we just generate a
>> +     * program interrupt to the guest.
>>        */
>>       case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
>> -        kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
>> -        r = RESUME_GUEST;
>> +        if (vcpu->arch.last_inst == SW_BRK_DBG_INT) {
> 
> Don't access last_inst directly. Instead use the provided helpers.
> 

Ok. Will look and replace it.

>> +            run->exit_reason = KVM_EXIT_DEBUG;
>> +            run->debug.arch.address = vcpu->arch.pc;
>> +            r = RESUME_HOST;
>> +        } else {
>> +            kvmppc_core_queue_program(vcpu, 0x80000);
> 
> magic numbers
^^^^^
I did not understand this?

>> +            r = RESUME_GUEST;
>> +        }
>>           break;
>>       /*
>>        * This occurs if the guest (kernel or userspace), does
>> something that
> 
> Please enable PR KVM as well while you're at it.
> 
My bad, I did not try the PR KVM. I will try it out.

> 
> Alex
> 
Thanks for review
Regards
Maddy
Alexander Graf June 17, 2014, 11:08 a.m. UTC | #8
On 17.06.14 13:07, Madhavan Srinivasan wrote:
> On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote:
>> On 14.06.14 23:08, Madhavan Srinivasan wrote:
>>> This patch adds kernel side support for software breakpoint.
>>> Design is that, by using an illegal instruction, we trap to hypervisor
>>> via Emulation Assistance interrupt, where we check for the illegal
>>> instruction
>>> and accordingly we return to Host or Guest. Patch mandates use of
>>> "abs" instruction
>>> (primary opcode 31 and extended opcode 360) as sw breakpoint instruction.
>>> Based on PowerISA v2.01, ABS instruction has been dropped from the
>>> architecture
>>> and treated an illegal instruction.
>>>
>>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>>> ---
>>>    arch/powerpc/kvm/book3s.c    |  3 ++-
>>>    arch/powerpc/kvm/book3s_hv.c | 23 +++++++++++++++++++----
>>>    2 files changed, 21 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
>>> index c254c27..b40fe5d 100644
>>> --- a/arch/powerpc/kvm/book3s.c
>>> +++ b/arch/powerpc/kvm/book3s.c
>>> @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu
>>> *vcpu,
>>>    int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>>>                        struct kvm_guest_debug *dbg)
>>>    {
>>> -    return -EINVAL;
>>> +    vcpu->guest_debug = dbg->control;
>>> +    return 0;
>>>    }
>>>      void kvmppc_decrementer_func(unsigned long data)
>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>>> index 7a12edb..688421d 100644
>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>> @@ -67,6 +67,14 @@
>>>    /* Used as a "null" value for timebase values */
>>>    #define TB_NIL    (~(u64)0)
>>>    +/*
>>> + * SW_BRK_DBG_INT is debug Instruction for supporting Software
>>> Breakpoint.
>>> + * Instruction mnemonic is ABS, primary opcode is 31 and extended
>>> opcode is 360.
>>> + * Based on PowerISA v2.01, ABS instruction has been dropped from the
>>> architecture
>>> + * and treated an illegal instruction.
>>> + */
>>> +#define SW_BRK_DBG_INT 0x7c0002d0
>> The instruction we use to trap needs to get exposed to user space via a
>> ONE_REG property.
>>
> Yes. I got to know about that from Bharat (patchset "ppc debug: Add
> debug stub support"). I will change it.
>
>> Also, why don't we use twi always or something else that actually is
>> defined as illegal instruction? I would like to see this shared with
>> book3s_32 PR.
>>
>>> +
>>>    static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
>>>    static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
>>>    @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct
>>> kvm_run *run, struct kvm_vcpu *vcpu,
>>>            break;
>>>        /*
>>>         * This occurs if the guest executes an illegal instruction.
>>> -     * We just generate a program interrupt to the guest, since
>>> -     * we don't emulate any guest instructions at this stage.
>>> +     * To support software breakpoint, we check for the sw breakpoint
>>> +     * instruction to return back to host, if not we just generate a
>>> +     * program interrupt to the guest.
>>>         */
>>>        case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
>>> -        kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
>>> -        r = RESUME_GUEST;
>>> +        if (vcpu->arch.last_inst == SW_BRK_DBG_INT) {
>> Don't access last_inst directly. Instead use the provided helpers.
>>
> Ok. Will look and replace it.
>
>>> +            run->exit_reason = KVM_EXIT_DEBUG;
>>> +            run->debug.arch.address = vcpu->arch.pc;
>>> +            r = RESUME_HOST;
>>> +        } else {
>>> +            kvmppc_core_queue_program(vcpu, 0x80000);
>> magic numbers
> ^^^^^
> I did not understand this?

You're replacing the readable SRR1_PROGILL with the unreadable 0x80000. 
That's bad.


Alex
maddy June 17, 2014, 11:13 a.m. UTC | #9
On Tuesday 17 June 2014 04:38 PM, Alexander Graf wrote:
> 
> On 17.06.14 13:07, Madhavan Srinivasan wrote:
>> On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote:
>>> On 14.06.14 23:08, Madhavan Srinivasan wrote:
>>>> This patch adds kernel side support for software breakpoint.
>>>> Design is that, by using an illegal instruction, we trap to hypervisor
>>>> via Emulation Assistance interrupt, where we check for the illegal
>>>> instruction
>>>> and accordingly we return to Host or Guest. Patch mandates use of
>>>> "abs" instruction
>>>> (primary opcode 31 and extended opcode 360) as sw breakpoint
>>>> instruction.
>>>> Based on PowerISA v2.01, ABS instruction has been dropped from the
>>>> architecture
>>>> and treated an illegal instruction.
>>>>
>>>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>>>> ---
>>>>    arch/powerpc/kvm/book3s.c    |  3 ++-
>>>>    arch/powerpc/kvm/book3s_hv.c | 23 +++++++++++++++++++----
>>>>    2 files changed, 21 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
>>>> index c254c27..b40fe5d 100644
>>>> --- a/arch/powerpc/kvm/book3s.c
>>>> +++ b/arch/powerpc/kvm/book3s.c
>>>> @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu
>>>> *vcpu,
>>>>    int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>>>>                        struct kvm_guest_debug *dbg)
>>>>    {
>>>> -    return -EINVAL;
>>>> +    vcpu->guest_debug = dbg->control;
>>>> +    return 0;
>>>>    }
>>>>      void kvmppc_decrementer_func(unsigned long data)
>>>> diff --git a/arch/powerpc/kvm/book3s_hv.c
>>>> b/arch/powerpc/kvm/book3s_hv.c
>>>> index 7a12edb..688421d 100644
>>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>>> @@ -67,6 +67,14 @@
>>>>    /* Used as a "null" value for timebase values */
>>>>    #define TB_NIL    (~(u64)0)
>>>>    +/*
>>>> + * SW_BRK_DBG_INT is debug Instruction for supporting Software
>>>> Breakpoint.
>>>> + * Instruction mnemonic is ABS, primary opcode is 31 and extended
>>>> opcode is 360.
>>>> + * Based on PowerISA v2.01, ABS instruction has been dropped from the
>>>> architecture
>>>> + * and treated an illegal instruction.
>>>> + */
>>>> +#define SW_BRK_DBG_INT 0x7c0002d0
>>> The instruction we use to trap needs to get exposed to user space via a
>>> ONE_REG property.
>>>
>> Yes. I got to know about that from Bharat (patchset "ppc debug: Add
>> debug stub support"). I will change it.
>>
>>> Also, why don't we use twi always or something else that actually is
>>> defined as illegal instruction? I would like to see this shared with
>>> book3s_32 PR.
>>>
>>>> +
>>>>    static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
>>>>    static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
>>>>    @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct
>>>> kvm_run *run, struct kvm_vcpu *vcpu,
>>>>            break;
>>>>        /*
>>>>         * This occurs if the guest executes an illegal instruction.
>>>> -     * We just generate a program interrupt to the guest, since
>>>> -     * we don't emulate any guest instructions at this stage.
>>>> +     * To support software breakpoint, we check for the sw breakpoint
>>>> +     * instruction to return back to host, if not we just generate a
>>>> +     * program interrupt to the guest.
>>>>         */
>>>>        case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
>>>> -        kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
>>>> -        r = RESUME_GUEST;
>>>> +        if (vcpu->arch.last_inst == SW_BRK_DBG_INT) {
>>> Don't access last_inst directly. Instead use the provided helpers.
>>>
>> Ok. Will look and replace it.
>>
>>>> +            run->exit_reason = KVM_EXIT_DEBUG;
>>>> +            run->debug.arch.address = vcpu->arch.pc;
>>>> +            r = RESUME_HOST;
>>>> +        } else {
>>>> +            kvmppc_core_queue_program(vcpu, 0x80000);
>>> magic numbers
>> ^^^^^
>> I did not understand this?
> 
> You're replacing the readable SRR1_PROGILL with the unreadable 0x80000.
> That's bad.
> 

Oops. My bad. Will undo that. I guess I messed up when was re basing it.

> 
> Alex
> 
Thanks for review
Regards
Maddy
maddy June 17, 2014, 11:20 a.m. UTC | #10
On Tuesday 17 June 2014 03:13 PM, Alexander Graf wrote:
> 
> On 17.06.14 11:32, Benjamin Herrenschmidt wrote:
>> On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
>>> On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
>>>> On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
>>>>> Also, why don't we use twi always or something else that actually is
>>>>> defined as illegal instruction? I would like to see this shared with
>>>>> book3s_32 PR.
>>>> twi will be directed to the guest on HV no ? We want a real illegal
>>>> because those go to the host (for potential emulation by the HV).
>>> Ah, good point. I guess we need different one for PR and HV then to
>>> ensure compatibility with older ISAs on PR.
>> Well, we also need to be careful with what happens if a PR guest puts
>> that instruction in, do that stop its HV guest/host ?
>>
>> What if it's done in userspace ? Do that stop the kernel ? :-)
> 
> The way SW breakpointing is handled is that when we see one, it gets
> deflected into user space. User space then has an array of breakpoints
> it configured itself. If the breakpoint is part of that list, it
> consumes it. If not, it injects a debug interrupt (program in this case)
> into the guest.
> 
> That way we can overlay that one instruction with as many layers as we
> like :). We only get a performance hit on execution of that instruction.
> 
>> Maddy, I haven't checked, does your patch ensure that we only ever stop
>> if the instruction is at a recorded bkpt address ? It still means that a
>> userspace process can practically DOS its kernel by issuing a lot of
>> these causing a crapload of exits.
> 
> Only user space knows about its breakpoint addresses, so we have to
> deflect. However since time still ticks on, we only increase jitter of
> the guest. The process would still get scheduled away after the same
 ^^^ Where is this taken care. I am still trying to understand. Kindly
can you explain or point to the code. Will help.

> amount of real time, no?
> 
> 
> Alex
> 
Thanks for review.
Regards
Maddy
Alexander Graf June 17, 2014, 11:31 a.m. UTC | #11
On 17.06.14 13:20, Madhavan Srinivasan wrote:
> On Tuesday 17 June 2014 03:13 PM, Alexander Graf wrote:
>> On 17.06.14 11:32, Benjamin Herrenschmidt wrote:
>>> On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
>>>> On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
>>>>> On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
>>>>>> Also, why don't we use twi always or something else that actually is
>>>>>> defined as illegal instruction? I would like to see this shared with
>>>>>> book3s_32 PR.
>>>>> twi will be directed to the guest on HV no ? We want a real illegal
>>>>> because those go to the host (for potential emulation by the HV).
>>>> Ah, good point. I guess we need different one for PR and HV then to
>>>> ensure compatibility with older ISAs on PR.
>>> Well, we also need to be careful with what happens if a PR guest puts
>>> that instruction in, do that stop its HV guest/host ?
>>>
>>> What if it's done in userspace ? Do that stop the kernel ? :-)
>> The way SW breakpointing is handled is that when we see one, it gets
>> deflected into user space. User space then has an array of breakpoints
>> it configured itself. If the breakpoint is part of that list, it
>> consumes it. If not, it injects a debug interrupt (program in this case)
>> into the guest.
>>
>> That way we can overlay that one instruction with as many layers as we
>> like :). We only get a performance hit on execution of that instruction.
>>
>>> Maddy, I haven't checked, does your patch ensure that we only ever stop
>>> if the instruction is at a recorded bkpt address ? It still means that a
>>> userspace process can practically DOS its kernel by issuing a lot of
>>> these causing a crapload of exits.
>> Only user space knows about its breakpoint addresses, so we have to
>> deflect. However since time still ticks on, we only increase jitter of
>> the guest. The process would still get scheduled away after the same
>   ^^^ Where is this taken care. I am still trying to understand. Kindly
> can you explain or point to the code. Will help.

We tell the guest via VPA about its steal time which includes QEMU time.


Alex
Alexander Graf June 17, 2014, 2:42 p.m. UTC | #12
On 17.06.14 13:13, Madhavan Srinivasan wrote:
> On Tuesday 17 June 2014 04:38 PM, Alexander Graf wrote:
>> On 17.06.14 13:07, Madhavan Srinivasan wrote:
>>> On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote:
>>>> On 14.06.14 23:08, Madhavan Srinivasan wrote:
>>>>> This patch adds kernel side support for software breakpoint.
>>>>> Design is that, by using an illegal instruction, we trap to hypervisor
>>>>> via Emulation Assistance interrupt, where we check for the illegal
>>>>> instruction
>>>>> and accordingly we return to Host or Guest. Patch mandates use of
>>>>> "abs" instruction
>>>>> (primary opcode 31 and extended opcode 360) as sw breakpoint
>>>>> instruction.
>>>>> Based on PowerISA v2.01, ABS instruction has been dropped from the
>>>>> architecture
>>>>> and treated an illegal instruction.
>>>>>
>>>>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>>>>> ---
>>>>>     arch/powerpc/kvm/book3s.c    |  3 ++-
>>>>>     arch/powerpc/kvm/book3s_hv.c | 23 +++++++++++++++++++----
>>>>>     2 files changed, 21 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
>>>>> index c254c27..b40fe5d 100644
>>>>> --- a/arch/powerpc/kvm/book3s.c
>>>>> +++ b/arch/powerpc/kvm/book3s.c
>>>>> @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu
>>>>> *vcpu,
>>>>>     int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>>>>>                         struct kvm_guest_debug *dbg)
>>>>>     {
>>>>> -    return -EINVAL;
>>>>> +    vcpu->guest_debug = dbg->control;
>>>>> +    return 0;
>>>>>     }
>>>>>       void kvmppc_decrementer_func(unsigned long data)
>>>>> diff --git a/arch/powerpc/kvm/book3s_hv.c
>>>>> b/arch/powerpc/kvm/book3s_hv.c
>>>>> index 7a12edb..688421d 100644
>>>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>>>> @@ -67,6 +67,14 @@
>>>>>     /* Used as a "null" value for timebase values */
>>>>>     #define TB_NIL    (~(u64)0)
>>>>>     +/*
>>>>> + * SW_BRK_DBG_INT is debug Instruction for supporting Software
>>>>> Breakpoint.
>>>>> + * Instruction mnemonic is ABS, primary opcode is 31 and extended
>>>>> opcode is 360.
>>>>> + * Based on PowerISA v2.01, ABS instruction has been dropped from the
>>>>> architecture
>>>>> + * and treated an illegal instruction.
>>>>> + */
>>>>> +#define SW_BRK_DBG_INT 0x7c0002d0
>>>> The instruction we use to trap needs to get exposed to user space via a
>>>> ONE_REG property.
>>>>
>>> Yes. I got to know about that from Bharat (patchset "ppc debug: Add
>>> debug stub support"). I will change it.

Also please make sure to pick an instruction that preferably looks 
identical regardless of guest endianness. Segher suggested 0x00dddd00. 
Does that trap properly for you?


Alex
diff mbox

Patch

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index c254c27..b40fe5d 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -789,7 +789,8 @@  int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
 int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 					struct kvm_guest_debug *dbg)
 {
-	return -EINVAL;
+	vcpu->guest_debug = dbg->control;
+	return 0;
 }
 
 void kvmppc_decrementer_func(unsigned long data)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 7a12edb..688421d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -67,6 +67,14 @@ 
 /* Used as a "null" value for timebase values */
 #define TB_NIL	(~(u64)0)
 
+/*
+ * SW_BRK_DBG_INT is debug Instruction for supporting Software Breakpoint.
+ * Instruction mnemonic is ABS, primary opcode is 31 and extended opcode is 360.
+ * Based on PowerISA v2.01, ABS instruction has been dropped from the architecture
+ * and treated an illegal instruction.
+ */
+#define SW_BRK_DBG_INT 0x7c0002d0
+
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
 
@@ -721,12 +729,19 @@  static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		break;
 	/*
 	 * This occurs if the guest executes an illegal instruction.
-	 * We just generate a program interrupt to the guest, since
-	 * we don't emulate any guest instructions at this stage.
+	 * To support software breakpoint, we check for the sw breakpoint
+	 * instruction to return back to host, if not we just generate a
+	 * program interrupt to the guest.
 	 */
 	case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
-		kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-		r = RESUME_GUEST;
+		if (vcpu->arch.last_inst == SW_BRK_DBG_INT) {
+			run->exit_reason = KVM_EXIT_DEBUG;
+			run->debug.arch.address = vcpu->arch.pc;
+			r = RESUME_HOST;
+		} else {
+			kvmppc_core_queue_program(vcpu, 0x80000);
+			r = RESUME_GUEST;
+		}
 		break;
 	/*
 	 * This occurs if the guest (kernel or userspace), does something that