[for-5.0] xive/kvm: Trigger interrupts from userspace
diff mbox series

Message ID 157408992731.494439.3405812941731584740.stgit@bahia.lan
State New
Headers show
Series
  • [for-5.0] xive/kvm: Trigger interrupts from userspace
Related show

Commit Message

Greg Kurz Nov. 18, 2019, 3:12 p.m. UTC
When using the XIVE KVM device, the trigger page is directly accessible
in QEMU. Unlike with XICS, no need to ask KVM to fire the interrupt. A
simple store on the trigger page does the job.

Just call xive_esb_trigger().

This may improve performance of emulated devices that go through
qemu_set_irq(), eg. virtio devices created with ioeventfd=off or
configured by the guest to use LSI interrupts, which aren't really
recommended setups.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/spapr_xive_kvm.c |   16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

Comments

Cédric Le Goater Nov. 18, 2019, 3:37 p.m. UTC | #1
On 18/11/2019 16:12, Greg Kurz wrote:
> When using the XIVE KVM device, the trigger page is directly accessible
> in QEMU. Unlike with XICS, no need to ask KVM to fire the interrupt. A
> simple store on the trigger page does the job.
> 
> Just call xive_esb_trigger().

Yes but the KVM XIVE device does a few other checks. 

It checks that the interrupt was correctly initialized at the KVM device
level. We should be fine in QEMU which has similar checks.

It caches the LSI assertion level. We should be fine also because it is
useless in KVM when using the XIVE native exploitation mode.

It checks it is not a passthru interrupt. Any idea on how to check this 
condition under QEMU ? 
 
> This may improve performance of emulated devices that go through
> qemu_set_irq(), eg. virtio devices created with ioeventfd=off or
> configured by the guest to use LSI interrupts, which aren't really
> recommended setups.

LGTM.

Any figures to share ? 

C.

> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/intc/spapr_xive_kvm.c |   16 ++--------------
>  1 file changed, 2 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> index 08012ac7cd76..69e73552f1ef 100644
> --- a/hw/intc/spapr_xive_kvm.c
> +++ b/hw/intc/spapr_xive_kvm.c
> @@ -354,32 +354,20 @@ static void kvmppc_xive_source_get_state(XiveSource *xsrc)
>  void kvmppc_xive_source_set_irq(void *opaque, int srcno, int val)
>  {
>      XiveSource *xsrc = opaque;
> -    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> -    struct kvm_irq_level args;
> -    int rc;
> -
> -    /* The KVM XIVE device should be in use */
> -    assert(xive->fd != -1);
>  
> -    args.irq = srcno;
>      if (!xive_source_irq_is_lsi(xsrc, srcno)) {
>          if (!val) {
>              return;
>          }
> -        args.level = KVM_INTERRUPT_SET;
>      } else {
>          if (val) {
>              xsrc->status[srcno] |= XIVE_STATUS_ASSERTED;
> -            args.level = KVM_INTERRUPT_SET_LEVEL;
>          } else {
>              xsrc->status[srcno] &= ~XIVE_STATUS_ASSERTED;
> -            args.level = KVM_INTERRUPT_UNSET;
>          }
>      }
> -    rc = kvm_vm_ioctl(kvm_state, KVM_IRQ_LINE, &args);
> -    if (rc < 0) {
> -        error_report("XIVE: kvm_irq_line() failed : %s", strerror(errno));
> -    }
> +
> +    xive_esb_trigger(xsrc, srcno);
>  }
Greg Kurz Nov. 18, 2019, 5:25 p.m. UTC | #2
On Mon, 18 Nov 2019 16:37:16 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> On 18/11/2019 16:12, Greg Kurz wrote:
> > When using the XIVE KVM device, the trigger page is directly accessible
> > in QEMU. Unlike with XICS, no need to ask KVM to fire the interrupt. A
> > simple store on the trigger page does the job.
> > 
> > Just call xive_esb_trigger().
> 
> Yes but the KVM XIVE device does a few other checks. 
> 
> It checks that the interrupt was correctly initialized at the KVM device
> level. We should be fine in QEMU which has similar checks.
> 

Yeah, not sure we can do much more than to trust the QEMU irq code to
pass us a valid srcno.

> It caches the LSI assertion level. We should be fine also because it is
> useless in KVM when using the XIVE native exploitation mode.
> 

Yeah, I see it is set in kvmppc_xive_native_set_source() (why?) but never
used anywhere else in book3s_xive_native.c.

> It checks it is not a passthru interrupt. Any idea on how to check this 
> condition under QEMU ? 
> 

AFAICT passthru interrupts don't go through here either.
 
> > This may improve performance of emulated devices that go through
> > qemu_set_irq(), eg. virtio devices created with ioeventfd=off or
> > configured by the guest to use LSI interrupts, which aren't really
> > recommended setups.
> 
> LGTM.
> 
> Any figures to share ? 
> 

No. Since performance critical setups don't go through that path,
I didn't try too much.

> C.
> 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/intc/spapr_xive_kvm.c |   16 ++--------------
> >  1 file changed, 2 insertions(+), 14 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> > index 08012ac7cd76..69e73552f1ef 100644
> > --- a/hw/intc/spapr_xive_kvm.c
> > +++ b/hw/intc/spapr_xive_kvm.c
> > @@ -354,32 +354,20 @@ static void kvmppc_xive_source_get_state(XiveSource *xsrc)
> >  void kvmppc_xive_source_set_irq(void *opaque, int srcno, int val)
> >  {
> >      XiveSource *xsrc = opaque;
> > -    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> > -    struct kvm_irq_level args;
> > -    int rc;
> > -
> > -    /* The KVM XIVE device should be in use */
> > -    assert(xive->fd != -1);
> >  
> > -    args.irq = srcno;
> >      if (!xive_source_irq_is_lsi(xsrc, srcno)) {
> >          if (!val) {
> >              return;
> >          }
> > -        args.level = KVM_INTERRUPT_SET;
> >      } else {
> >          if (val) {
> >              xsrc->status[srcno] |= XIVE_STATUS_ASSERTED;
> > -            args.level = KVM_INTERRUPT_SET_LEVEL;
> >          } else {
> >              xsrc->status[srcno] &= ~XIVE_STATUS_ASSERTED;
> > -            args.level = KVM_INTERRUPT_UNSET;
> >          }
> >      }
> > -    rc = kvm_vm_ioctl(kvm_state, KVM_IRQ_LINE, &args);
> > -    if (rc < 0) {
> > -        error_report("XIVE: kvm_irq_line() failed : %s", strerror(errno));
> > -    }
> > +
> > +    xive_esb_trigger(xsrc, srcno);
> >  }
> 
>
David Gibson Nov. 19, 2019, 12:47 a.m. UTC | #3
On Mon, Nov 18, 2019 at 04:37:16PM +0100, Cédric Le Goater wrote:
> On 18/11/2019 16:12, Greg Kurz wrote:
> > When using the XIVE KVM device, the trigger page is directly accessible
> > in QEMU. Unlike with XICS, no need to ask KVM to fire the interrupt. A
> > simple store on the trigger page does the job.
> > 
> > Just call xive_esb_trigger().
> 
> Yes but the KVM XIVE device does a few other checks. 
> 
> It checks that the interrupt was correctly initialized at the KVM device
> level. We should be fine in QEMU which has similar checks.
> 
> It caches the LSI assertion level. We should be fine also because it is
> useless in KVM when using the XIVE native exploitation mode.
> 
> It checks it is not a passthru interrupt. Any idea on how to check this 
> condition under QEMU ? 
>  
> > This may improve performance of emulated devices that go through
> > qemu_set_irq(), eg. virtio devices created with ioeventfd=off or
> > configured by the guest to use LSI interrupts, which aren't really
> > recommended setups.
> 
> LGTM.

Ok, between the comments above and this, I'm not sure if this is ready
to merge or not.

> 
> Any figures to share ? 
> 
> C.
> 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/intc/spapr_xive_kvm.c |   16 ++--------------
> >  1 file changed, 2 insertions(+), 14 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> > index 08012ac7cd76..69e73552f1ef 100644
> > --- a/hw/intc/spapr_xive_kvm.c
> > +++ b/hw/intc/spapr_xive_kvm.c
> > @@ -354,32 +354,20 @@ static void kvmppc_xive_source_get_state(XiveSource *xsrc)
> >  void kvmppc_xive_source_set_irq(void *opaque, int srcno, int val)
> >  {
> >      XiveSource *xsrc = opaque;
> > -    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> > -    struct kvm_irq_level args;
> > -    int rc;
> > -
> > -    /* The KVM XIVE device should be in use */
> > -    assert(xive->fd != -1);
> >  
> > -    args.irq = srcno;
> >      if (!xive_source_irq_is_lsi(xsrc, srcno)) {
> >          if (!val) {
> >              return;
> >          }
> > -        args.level = KVM_INTERRUPT_SET;
> >      } else {
> >          if (val) {
> >              xsrc->status[srcno] |= XIVE_STATUS_ASSERTED;
> > -            args.level = KVM_INTERRUPT_SET_LEVEL;
> >          } else {
> >              xsrc->status[srcno] &= ~XIVE_STATUS_ASSERTED;
> > -            args.level = KVM_INTERRUPT_UNSET;
> >          }
> >      }
> > -    rc = kvm_vm_ioctl(kvm_state, KVM_IRQ_LINE, &args);
> > -    if (rc < 0) {
> > -        error_report("XIVE: kvm_irq_line() failed : %s", strerror(errno));
> > -    }
> > +
> > +    xive_esb_trigger(xsrc, srcno);
> >  }
> 
>
Cédric Le Goater Nov. 19, 2019, 8:15 a.m. UTC | #4
On 19/11/2019 01:47, David Gibson wrote:
> On Mon, Nov 18, 2019 at 04:37:16PM +0100, Cédric Le Goater wrote:
>> On 18/11/2019 16:12, Greg Kurz wrote:
>>> When using the XIVE KVM device, the trigger page is directly accessible
>>> in QEMU. Unlike with XICS, no need to ask KVM to fire the interrupt. A
>>> simple store on the trigger page does the job.
>>>
>>> Just call xive_esb_trigger().
>>
>> Yes but the KVM XIVE device does a few other checks. 
>>
>> It checks that the interrupt was correctly initialized at the KVM device
>> level. We should be fine in QEMU which has similar checks.
>>
>> It caches the LSI assertion level. We should be fine also because it is
>> useless in KVM when using the XIVE native exploitation mode.
>>
>> It checks it is not a passthru interrupt. Any idea on how to check this 
>> condition under QEMU ? 
>>  
>>> This may improve performance of emulated devices that go through
>>> qemu_set_irq(), eg. virtio devices created with ioeventfd=off or
>>> configured by the guest to use LSI interrupts, which aren't really
>>> recommended setups.
>>
>> LGTM.
> 
> Ok, between the comments above and this, I'm not sure if this is ready
> to merge or not.

I think it is. 

With this change, we are loosing a check on passthrough interrupts but 
I am not sure how critical this is given that QEMU can anyhow bypass 
KVM and trigger the interrupt using a store on the ESB page. 

>> Any figures to share ? 

I am torturing Greg to have numbers :) but he resisted well.

>> C.
>>
>>> Signed-off-by: Greg Kurz <groug@kaod.org>

Let's move on.

Reviewed-by: Cédric Le Goater <clg@kaod.org>

C.

>>> ---
>>>  hw/intc/spapr_xive_kvm.c |   16 ++--------------
>>>  1 file changed, 2 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
>>> index 08012ac7cd76..69e73552f1ef 100644
>>> --- a/hw/intc/spapr_xive_kvm.c
>>> +++ b/hw/intc/spapr_xive_kvm.c
>>> @@ -354,32 +354,20 @@ static void kvmppc_xive_source_get_state(XiveSource *xsrc)
>>>  void kvmppc_xive_source_set_irq(void *opaque, int srcno, int val)
>>>  {
>>>      XiveSource *xsrc = opaque;
>>> -    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
>>> -    struct kvm_irq_level args;
>>> -    int rc;
>>> -
>>> -    /* The KVM XIVE device should be in use */
>>> -    assert(xive->fd != -1);
>>>  
>>> -    args.irq = srcno;
>>>      if (!xive_source_irq_is_lsi(xsrc, srcno)) {
>>>          if (!val) {
>>>              return;
>>>          }
>>> -        args.level = KVM_INTERRUPT_SET;
>>>      } else {
>>>          if (val) {
>>>              xsrc->status[srcno] |= XIVE_STATUS_ASSERTED;
>>> -            args.level = KVM_INTERRUPT_SET_LEVEL;
>>>          } else {
>>>              xsrc->status[srcno] &= ~XIVE_STATUS_ASSERTED;
>>> -            args.level = KVM_INTERRUPT_UNSET;
>>>          }
>>>      }
>>> -    rc = kvm_vm_ioctl(kvm_state, KVM_IRQ_LINE, &args);
>>> -    if (rc < 0) {
>>> -        error_report("XIVE: kvm_irq_line() failed : %s", strerror(errno));
>>> -    }
>>> +
>>> +    xive_esb_trigger(xsrc, srcno);
>>>  }
>>
>>
>
Greg Kurz Nov. 19, 2019, 8:53 a.m. UTC | #5
On Tue, 19 Nov 2019 09:15:52 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> On 19/11/2019 01:47, David Gibson wrote:
> > On Mon, Nov 18, 2019 at 04:37:16PM +0100, Cédric Le Goater wrote:
> >> On 18/11/2019 16:12, Greg Kurz wrote:
> >>> When using the XIVE KVM device, the trigger page is directly accessible
> >>> in QEMU. Unlike with XICS, no need to ask KVM to fire the interrupt. A
> >>> simple store on the trigger page does the job.
> >>>
> >>> Just call xive_esb_trigger().
> >>
> >> Yes but the KVM XIVE device does a few other checks. 
> >>
> >> It checks that the interrupt was correctly initialized at the KVM device
> >> level. We should be fine in QEMU which has similar checks.
> >>
> >> It caches the LSI assertion level. We should be fine also because it is
> >> useless in KVM when using the XIVE native exploitation mode.
> >>
> >> It checks it is not a passthru interrupt. Any idea on how to check this 
> >> condition under QEMU ? 
> >>  
> >>> This may improve performance of emulated devices that go through
> >>> qemu_set_irq(), eg. virtio devices created with ioeventfd=off or
> >>> configured by the guest to use LSI interrupts, which aren't really
> >>> recommended setups.
> >>
> >> LGTM.
> > 
> > Ok, between the comments above and this, I'm not sure if this is ready
> > to merge or not.
> 
> I think it is. 
> 
> With this change, we are loosing a check on passthrough interrupts but 
> I am not sure how critical this is given that QEMU can anyhow bypass 
> KVM and trigger the interrupt using a store on the ESB page. 
> 

True. Thinking a bit more about this: nothing prevents such a store to
be the result of a bug somewhere else in QEMU, eg. some dangling pointer
with the same value, in a much easier way than doing the KVM ioctl. Is
it a concern we should take into account ?

> >> Any figures to share ? 
> 
> I am torturing Greg to have numbers :) but he resisted well.
> 

Maybe a _liquid_ bribe or two can be convincing enough :-)

> >> C.
> >>
> >>> Signed-off-by: Greg Kurz <groug@kaod.org>
> 
> Let's move on.
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> C.
> 
> >>> ---
> >>>  hw/intc/spapr_xive_kvm.c |   16 ++--------------
> >>>  1 file changed, 2 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> >>> index 08012ac7cd76..69e73552f1ef 100644
> >>> --- a/hw/intc/spapr_xive_kvm.c
> >>> +++ b/hw/intc/spapr_xive_kvm.c
> >>> @@ -354,32 +354,20 @@ static void kvmppc_xive_source_get_state(XiveSource *xsrc)
> >>>  void kvmppc_xive_source_set_irq(void *opaque, int srcno, int val)
> >>>  {
> >>>      XiveSource *xsrc = opaque;
> >>> -    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> >>> -    struct kvm_irq_level args;
> >>> -    int rc;
> >>> -
> >>> -    /* The KVM XIVE device should be in use */
> >>> -    assert(xive->fd != -1);
> >>>  
> >>> -    args.irq = srcno;
> >>>      if (!xive_source_irq_is_lsi(xsrc, srcno)) {
> >>>          if (!val) {
> >>>              return;
> >>>          }
> >>> -        args.level = KVM_INTERRUPT_SET;
> >>>      } else {
> >>>          if (val) {
> >>>              xsrc->status[srcno] |= XIVE_STATUS_ASSERTED;
> >>> -            args.level = KVM_INTERRUPT_SET_LEVEL;
> >>>          } else {
> >>>              xsrc->status[srcno] &= ~XIVE_STATUS_ASSERTED;
> >>> -            args.level = KVM_INTERRUPT_UNSET;
> >>>          }
> >>>      }
> >>> -    rc = kvm_vm_ioctl(kvm_state, KVM_IRQ_LINE, &args);
> >>> -    if (rc < 0) {
> >>> -        error_report("XIVE: kvm_irq_line() failed : %s", strerror(errno));
> >>> -    }
> >>> +
> >>> +    xive_esb_trigger(xsrc, srcno);
> >>>  }
> >>
> >>
> > 
>
David Gibson Nov. 19, 2019, 9:52 p.m. UTC | #6
On Tue, Nov 19, 2019 at 09:15:52AM +0100, Cédric Le Goater wrote:
> On 19/11/2019 01:47, David Gibson wrote:
> > On Mon, Nov 18, 2019 at 04:37:16PM +0100, Cédric Le Goater wrote:
> >> On 18/11/2019 16:12, Greg Kurz wrote:
> >>> When using the XIVE KVM device, the trigger page is directly accessible
> >>> in QEMU. Unlike with XICS, no need to ask KVM to fire the interrupt. A
> >>> simple store on the trigger page does the job.
> >>>
> >>> Just call xive_esb_trigger().
> >>
> >> Yes but the KVM XIVE device does a few other checks. 
> >>
> >> It checks that the interrupt was correctly initialized at the KVM device
> >> level. We should be fine in QEMU which has similar checks.
> >>
> >> It caches the LSI assertion level. We should be fine also because it is
> >> useless in KVM when using the XIVE native exploitation mode.
> >>
> >> It checks it is not a passthru interrupt. Any idea on how to check this 
> >> condition under QEMU ? 
> >>  
> >>> This may improve performance of emulated devices that go through
> >>> qemu_set_irq(), eg. virtio devices created with ioeventfd=off or
> >>> configured by the guest to use LSI interrupts, which aren't really
> >>> recommended setups.
> >>
> >> LGTM.
> > 
> > Ok, between the comments above and this, I'm not sure if this is ready
> > to merge or not.
> 
> I think it is. 
> 
> With this change, we are loosing a check on passthrough interrupts but 
> I am not sure how critical this is given that QEMU can anyhow bypass 
> KVM and trigger the interrupt using a store on the ESB page. 
> 
> >> Any figures to share ? 
> 
> I am torturing Greg to have numbers :) but he resisted well.
> 
> >> C.
> >>
> >>> Signed-off-by: Greg Kurz <groug@kaod.org>
> 
> Let's move on.
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>

Works for me.  Applied.

Patch
diff mbox series

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index 08012ac7cd76..69e73552f1ef 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -354,32 +354,20 @@  static void kvmppc_xive_source_get_state(XiveSource *xsrc)
 void kvmppc_xive_source_set_irq(void *opaque, int srcno, int val)
 {
     XiveSource *xsrc = opaque;
-    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
-    struct kvm_irq_level args;
-    int rc;
-
-    /* The KVM XIVE device should be in use */
-    assert(xive->fd != -1);
 
-    args.irq = srcno;
     if (!xive_source_irq_is_lsi(xsrc, srcno)) {
         if (!val) {
             return;
         }
-        args.level = KVM_INTERRUPT_SET;
     } else {
         if (val) {
             xsrc->status[srcno] |= XIVE_STATUS_ASSERTED;
-            args.level = KVM_INTERRUPT_SET_LEVEL;
         } else {
             xsrc->status[srcno] &= ~XIVE_STATUS_ASSERTED;
-            args.level = KVM_INTERRUPT_UNSET;
         }
     }
-    rc = kvm_vm_ioctl(kvm_state, KVM_IRQ_LINE, &args);
-    if (rc < 0) {
-        error_report("XIVE: kvm_irq_line() failed : %s", strerror(errno));
-    }
+
+    xive_esb_trigger(xsrc, srcno);
 }
 
 /*