
[v10,13/14] vfio-user: handle device interrupts

Message ID 2a492c16e0464f70f7be1fd9c04172f4f18d14ca.1653404595.git.jag.raman@oracle.com
State New
Series vfio-user server in QEMU

Commit Message

Jag Raman May 24, 2022, 3:30 p.m. UTC
Forward remote device's interrupts to the guest

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 include/hw/pci/pci.h              |  13 ++++
 include/hw/remote/vfio-user-obj.h |   6 ++
 hw/pci/msi.c                      |  16 ++--
 hw/pci/msix.c                     |  10 ++-
 hw/pci/pci.c                      |  13 ++++
 hw/remote/machine.c               |  14 +++-
 hw/remote/vfio-user-obj.c         | 123 ++++++++++++++++++++++++++++++
 stubs/vfio-user-obj.c             |   6 ++
 MAINTAINERS                       |   1 +
 hw/remote/trace-events            |   1 +
 stubs/meson.build                 |   1 +
 11 files changed, 193 insertions(+), 11 deletions(-)
 create mode 100644 include/hw/remote/vfio-user-obj.h
 create mode 100644 stubs/vfio-user-obj.c

Comments

Stefan Hajnoczi May 25, 2022, 2:53 p.m. UTC | #1
On Tue, May 24, 2022 at 11:30:32AM -0400, Jagannathan Raman wrote:
> Forward remote device's interrupts to the guest

It would be great if Michael Tsirkin and Alex Williamson would review
this.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Jag Raman May 31, 2022, 3:01 p.m. UTC | #2
> On May 25, 2022, at 10:53 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Tue, May 24, 2022 at 11:30:32AM -0400, Jagannathan Raman wrote:
>> Forward remote device's interrupts to the guest
> 
> It would be great if Michael Tsirkin and Alex Williamson would review
> this.

Hi Michael and Alex,

Do you have any thoughts on this patch?

Thank you!
--
Jag

Alex Williamson May 31, 2022, 8:10 p.m. UTC | #3
On Tue, 31 May 2022 15:01:57 +0000
Jag Raman <jag.raman@oracle.com> wrote:

> > On May 25, 2022, at 10:53 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > 
> > On Tue, May 24, 2022 at 11:30:32AM -0400, Jagannathan Raman wrote:  
> >> Forward remote device's interrupts to the guest
> > 
> > It would be great if Michael Tsirkin and Alex Williamson would review
> > this.  
> 
> Hi Michael and Alex,
> 
> Do you have any thoughts on this patch?

Ultimately this is just how to insert callbacks to replace the default
MSI/X triggers so you can send a vector# over the wire for a remote
machine, right?  I'll let the code owners, Michael and Marcel, comment
if they have grand vision how to architect this differently.  Thanks,

Alex
Stefan Hajnoczi May 31, 2022, 9:03 p.m. UTC | #4
On Tue, 31 May 2022 at 21:11, Alex Williamson
<alex.williamson@redhat.com> wrote:
>
> On Tue, 31 May 2022 15:01:57 +0000
> Jag Raman <jag.raman@oracle.com> wrote:
>
> > > On May 25, 2022, at 10:53 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > >
> > > On Tue, May 24, 2022 at 11:30:32AM -0400, Jagannathan Raman wrote:
> > >> Forward remote device's interrupts to the guest
> > >
> > > It would be great if Michael Tsirkin and Alex Williamson would review
> > > this.
> >
> > Hi Michael and Alex,
> >
> > Do you have any thoughts on this patch?
>
> Ultimately this is just how to insert callbacks to replace the default
> MSI/X triggers so you can send a vector# over the wire for a remote
> machine, right?  I'll let the code owners, Michael and Marcel, comment
> if they have grand vision how to architect this differently.  Thanks,

An earlier version of the patch intercepted MSI-X at the msix_notify()
level, replacing the entire function. This patch replaces
msix_get_message() and msi_send_message(), leaving the masking logic
in place.

I haven't seen the latest vfio-user client implementation for QEMU,
but if the idea is to allow the guest to directly control the
vfio-user device's MSI-X table's mask bits, then I think this is a
different design from VFIO kernel where masking is emulated by QEMU
and not passed through to the PCI device.

It's been a while since I looked at how this works in QEMU's hw/vfio/
code, so I may not be explaining it correctly, but I think there is a
design difference here between VFIO kernel and vfio-user that's worth
evaluating.

Stefan
Alex Williamson May 31, 2022, 9:45 p.m. UTC | #5
On Tue, 31 May 2022 22:03:14 +0100
Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Tue, 31 May 2022 at 21:11, Alex Williamson
> <alex.williamson@redhat.com> wrote:
> >
> > On Tue, 31 May 2022 15:01:57 +0000
> > Jag Raman <jag.raman@oracle.com> wrote:
> >  
> > > > On May 25, 2022, at 10:53 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > >
> > > > On Tue, May 24, 2022 at 11:30:32AM -0400, Jagannathan Raman wrote:  
> > > >> Forward remote device's interrupts to the guest
> > > >>
> > > >> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> > > >> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> > > >> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> > > >> ---
> > > >> include/hw/pci/pci.h              |  13 ++++
> > > >> include/hw/remote/vfio-user-obj.h |   6 ++
> > > >> hw/pci/msi.c                      |  16 ++--
> > > >> hw/pci/msix.c                     |  10 ++-
> > > >> hw/pci/pci.c                      |  13 ++++
> > > >> hw/remote/machine.c               |  14 +++-
> > > >> hw/remote/vfio-user-obj.c         | 123 ++++++++++++++++++++++++++++++
> > > >> stubs/vfio-user-obj.c             |   6 ++
> > > >> MAINTAINERS                       |   1 +
> > > >> hw/remote/trace-events            |   1 +
> > > >> stubs/meson.build                 |   1 +
> > > >> 11 files changed, 193 insertions(+), 11 deletions(-)
> > > >> create mode 100644 include/hw/remote/vfio-user-obj.h
> > > >> create mode 100644 stubs/vfio-user-obj.c  
> > > >
> > > > It would be great if Michael Tsirkin and Alex Williamson would review
> > > > this.  
> > >
> > > Hi Michael and Alex,
> > >
> > > Do you have any thoughts on this patch?  
> >
> > Ultimately this is just how to insert callbacks to replace the default
> > MSI/X triggers so you can send a vector# over the wire for a remote
> > machine, right?  I'll let the code owners, Michael and Marcel, comment
> > if they have grand vision how to architect this differently.  Thanks,  
> 
> An earlier version of the patch intercepted MSI-X at the msix_notify()
> level, replacing the entire function. This patch replaces
> msix_get_message() and msi_send_message(), leaving the masking logic
> in place.
> 
> I haven't seen the latest vfio-user client implementation for QEMU,
> but if the idea is to allow the guest to directly control the
> vfio-user device's MSI-X table's mask bits, then I think this is a
> different design from VFIO kernel where masking is emulated by QEMU
> and not passed through to the PCI device.

Essentially what's happening here is an implementation of an interrupt
handler callback in the remote QEMU instance.  The default handler is
to simply write the MSI message data at the MSI message address of the
vCPU, vfio-user replaces that with hijacking the MSI message itself to
simply report the vector# so that the "handler", ie. trigger, can
forward it to the client.  That's very analogous to the kernel
implementation.
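
For concreteness, a minimal sketch of such a trigger (the names are mine,
not necessarily the patch's; vfu_irq_trigger() is the libvfio-user call):

static void vfu_object_msi_trigger(PCIDevice *pci_dev, MSIMessage msg)
{
    /* irq_opaque is assumed to hold the vfu context, set at IRQ setup */
    vfu_ctx_t *vfu_ctx = pci_dev->irq_opaque;

    /* the prepare_message override stored the vector# in msg.data */
    vfu_irq_trigger(vfu_ctx, msg.data);
}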

The equivalent masking we have today with vfio kernel would happen on
the client side, where the MSI/X code might instead set a pending bit
if the vector is masked on the client.  Likewise the possibility
remains, just as it does on the kernel side, that the guest masking a
vector could be relayed over ioctl/socket to set the equivalent mask on
the host/remote.
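
From memory, QEMU's msix_notify() already has roughly this shape, so the
client-side pending-bit handling would fall out of the existing code:

void msix_notify(PCIDevice *dev, unsigned vector)
{
    MSIMessage msg;

    if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector]) {
        return;
    }

    if (msix_is_masked(dev, vector)) {
        msix_set_pending(dev, vector);   /* deliver later, on unmask */
        return;
    }

    msg = msix_get_message(dev, vector);
    msi_send_message(dev, msg);
}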

I don't see a fundamental design difference here, but please point out
if I'm missing something.  Thanks,

Alex
John Johnson June 1, 2022, 6:37 a.m. UTC | #6
> On May 31, 2022, at 2:45 PM, Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> On Tue, 31 May 2022 22:03:14 +0100
> Stefan Hajnoczi <stefanha@gmail.com> wrote:
> 
>> On Tue, 31 May 2022 at 21:11, Alex Williamson
>> <alex.williamson@redhat.com> wrote:
>>> 
>>> On Tue, 31 May 2022 15:01:57 +0000
>>> Jag Raman <jag.raman@oracle.com> wrote:
>>> 
>>>>> On May 25, 2022, at 10:53 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>>>> 
>>>>> On Tue, May 24, 2022 at 11:30:32AM -0400, Jagannathan Raman wrote:  
>>>>>> Forward remote device's interrupts to the guest
>>>>> 
>>>>> It would be great if Michael Tsirkin and Alex Williamson would review
>>>>> this.  
>>>> 
>>>> Hi Michael and Alex,
>>>> 
>>>> Do you have any thoughts on this patch?  
>>> 
>>> Ultimately this is just how to insert callbacks to replace the default
>>> MSI/X triggers so you can send a vector# over the wire for a remote
>>> machine, right?  I'll let the code owners, Michael and Marcel, comment
>>> if they have grand vision how to architect this differently.  Thanks,  
>> 
>> An earlier version of the patch intercepted MSI-X at the msix_notify()
>> level, replacing the entire function. This patch replaces
>> msix_get_message() and msi_send_message(), leaving the masking logic
>> in place.
>> 
>> I haven't seen the latest vfio-user client implementation for QEMU,
>> but if the idea is to allow the guest to directly control the
>> vfio-user device's MSI-X table's mask bits, then I think this is a
>> different design from VFIO kernel where masking is emulated by QEMU
>> and not passed through to the PCI device.
> 
> Essentially what's happening here is an implementation of an interrupt
> handler callback in the remote QEMU instance.  The default handler is
> to simply write the MSI message data at the MSI message address of the
> vCPU, vfio-user replaces that with hijacking the MSI message itself to
> simply report the vector# so that the "handler", ie. trigger, can
> forward it to the client.  That's very analogous to the kernel
> implementation.
> 
> The equivalent masking we have today with vfio kernel would happen on
> the client side, where the MSI/X code might instead set a pending bit
> if the vector is masked on the client.  Likewise the possibility
> remains, just as it does on the kernel side, that the guest masking a
> vector could be relayed over ioctl/socket to set the equivalent mask on
> the host/remote.
> 
> I don't see a fundamental design difference here, but please point out
> if I'm missing something.  Thanks,
> 



	I don’t think you’ve missed anything.  We were trying to stay
close to the kernel implementation.

	Do you have any comments on the client side implementation I
sent on 5/5?

							JJ
Stefan Hajnoczi June 1, 2022, 9:38 a.m. UTC | #7
On Wed, 1 Jun 2022 at 07:40, John Johnson <john.g.johnson@oracle.com> wrote:
>
>         I don’t think you’ve missed anything.  We were trying to stay
> close to the kernel implementation.

Okay.

Stefan
Jag Raman June 1, 2022, 5 p.m. UTC | #8
> On May 31, 2022, at 5:45 PM, Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> Essentially what's happening here is an implementation of an interrupt
> handler callback in the remote QEMU instance.  The default handler is
> to simply write the MSI message data at the MSI message address of the
> vCPU, vfio-user replaces that with hijacking the MSI message itself to
> simply report the vector# so that the "handler", ie. trigger, can
> forward it to the client.  That's very analogous to the kernel
> implementation.
> 
> The equivalent masking we have today with vfio kernel would happen on
> the client side, where the MSI/X code might instead set a pending bit
> if the vector is masked on the client.  Likewise the possibility
> remains, just as it does on the kernel side, that the guest masking a
> vector could be relayed over ioctl/socket to set the equivalent mask on
> the host/remote.

Hi Alex,

Just to add some more detail, the emulated PCI device in QEMU presently
maintains an MSIx table (PCIDevice->msix_table) and Pending Bit Array. In the
present VFIO PCI device implementation, QEMU leverages the same
MSIx table for interrupt masking/unmasking. The backend PCI device (such as
the passthru device) always thinks that the interrupt is unmasked and lets
QEMU manage masking.

Whereas in the vfio-user case, the client additionally pushes a copy of the
emulated PCI device’s table downstream to the remote device. We did this
to allow a small set of devices (such as e1000e) to clear the
PBA (msix_clr_pending()). Secondly, the remote device uses its copy of the
MSIx table to determine if an interrupt should be triggered - this prevents
an interrupt from being sent to the client unnecessarily when it's masked.
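
As a sketch, the remote-side check against that table copy is just the
standard per-vector mask-bit test (the helper name here is hypothetical;
the PCI_MSIX_ENTRY_* constants are the standard ones):

static bool vfu_object_vector_masked(PCIDevice *dev, unsigned int vector)
{
    uint8_t *entry = dev->msix_table + vector * PCI_MSIX_ENTRY_SIZE;

    /* bit 0 of the vector control word is the per-vector mask bit */
    return pci_get_long(entry + PCI_MSIX_ENTRY_VECTOR_CTRL) &
           PCI_MSIX_ENTRY_CTRL_MASKBIT;
}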

We are wondering if pushing the MSIx table to the remote device and
reading PBA from it would diverge from the VFIO protocol specification?

From your comment, I understand it’s similar to the VFIO protocol because VFIO
clients could mask an interrupt using the VFIO_DEVICE_SET_IRQS ioctl +
VFIO_IRQ_SET_ACTION_MASK / _UNMASK flags. I observed that QEMU presently
does not use this approach and that the kernel does not support it for MSI.
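
For reference, that uAPI request - which, as noted, the kernel currently
rejects for MSI/MSI-X - would be a sketch along these lines (device_fd
and vector assumed in scope):

struct vfio_irq_set irq_set = {
    .argsz = sizeof(irq_set),
    .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
    .index = VFIO_PCI_MSIX_IRQ_INDEX,
    .start = vector,    /* first vector to mask */
    .count = 1,
};

ioctl(device_fd, VFIO_DEVICE_SET_IRQS, &irq_set);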

Thank you!
--
Jag
Alex Williamson June 1, 2022, 5:26 p.m. UTC | #9
On Wed, 1 Jun 2022 17:00:54 +0000
Jag Raman <jag.raman@oracle.com> wrote:
> We are wondering if pushing the MSIx table to the remote device and
> reading PBA from it would diverge from the VFIO protocol specification?
> 
> From your comment, I understand it’s similar to VFIO protocol because VFIO
> clients could mask an interrupt using VFIO_DEVICE_SET_IRQS ioctl +
> VFIO_IRQ_SET_ACTION_MASK / _UNMASK flags. I observed that QEMU presently
> does not use this approach and the kernel does not support it for MSI.

I believe the SET_IRQS ioctl definition is pre-enabled to support
masking and unmasking, we've just lacked kernel support to mask at the
device which leads to the hybrid approach we have today.  Our intention
would be to use the current uAPI, to provide that masking support, at
which point we'd leave the PBA mapped to the device.

So whether your proposal diverges from the VFIO uAPI depends on what
you mean by "pushing the MSIx table to the remote device".  If that's
done by implementing the existing SET_IRQS masking support, then you're
spot on.  OTOH, if you're actually pushing a copy of the MSIx table
from the client, that's certainly not how I had envisioned the kernel
interface.  Thanks,

Alex
Jag Raman June 1, 2022, 6:01 p.m. UTC | #10
> On Jun 1, 2022, at 1:26 PM, Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> 
> I believe the SET_IRQS ioctl definition is pre-enabled to support
> masking and unmasking, we've just lacked kernel support to mask at the
> device which leads to the hybrid approach we have today.  Our intention
> would be to use the current uAPI, to provide that masking support, at
> which point we'd leave the PBA mapped to the device.

Thank you for clarifying!

> 
> So whether your proposal diverges from the VFIO uAPI depends on what
> you mean by "pushing the MSIx table to the remote device".  If that's
> done by implementing the existing SET_IRQS masking support, then you're
> spot on.  OTOH, if you're actually pushing a copy of the MSIx table
> from the client, that's certainly not how I had envisioned the kernel

In the current implementation - when the guest accesses the MSIx table and
PBA, the client passes these accesses through to the remote device.

If we switch to using the SET_IRQS approach, we could use SET_IRQS
messages for masking/unmasking, but still pass through the PBA
access to the backend PCI device.

So I think the question is whether we should switch vfio-user to SET_IRQS
for masking/unmasking now, or whenever QEMU does it in the future.
The PBA access would remain the same as it is now - via the device BAR.

Thank you!
--
Jag

Alex Williamson June 1, 2022, 6:30 p.m. UTC | #11
On Wed, 1 Jun 2022 18:01:39 +0000
Jag Raman <jag.raman@oracle.com> wrote:

> 
> In the current implementation - when the guest accesses the MSIx table and
> PBA, the client passes these accesses through to the remote device.

I suppose you can do this because you don't need some means for the
client to register a notification mechanism for the interrupt; you can
already send notifications via the socket.  But this is now a
divergence from the kernel vfio uapi and eliminates what is intended to
be a device agnostic interrupt interface.

> If we switch to using the SET_IRQS approach, we could use SET_IRQS
> messages for masking/unmasking, but still pass through the PBA
> access to the backend PCI device.

Yes.

> So I think the question is whether we should switch vfio-user to SET_IRQS
> for masking/unmasking now, or whenever QEMU does it in the future.
> The PBA access would remain the same as it is now - via the device BAR.

I apologize that I'm constantly overwhelmed with requests that I
haven't reviewed it, but it seems like the client side would be far
more compliant to the vfio kernel interface if vfio-user did implement
an interpretation of the SET_IRQS ioctl.  Potentially couldn't you also
make use of eventfds or define other data types to pass that would give
the client more flexibility in receiving interrupts?  Thanks,

Alex
Jag Raman June 1, 2022, 7:38 p.m. UTC | #12
> On Jun 1, 2022, at 2:30 PM, Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> I apologize that I'm constantly overwhelmed with requests that I
> haven't reviewed it, but it seems like the client side would be far
> more compliant to the vfio kernel interface if vfio-user did implement
> an interpretation of the SET_IRQS ioctl.

Thanks for confirming! We’ll explore using SET_IRQS for masking
and unmasking.

> Potentially couldn't you also
> make use of eventfds or define other data types to pass that would give
> the client more flexibility in receiving interrupts? Thanks,

I think the libvfio-user library already uses eventfds to pass interrupts
to the client.

--
Jag
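
(For reference, the eventfds mentioned above use the existing SET_IRQS
data type; a client-side sketch of registering one, with device_fd
assumed to be the vfio-user device handle:)

size_t argsz = sizeof(struct vfio_irq_set) + sizeof(int32_t);
struct vfio_irq_set *set = g_malloc0(argsz);
int32_t efd = eventfd(0, EFD_CLOEXEC);   /* fd the server will signal */

set->argsz = argsz;
set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
set->index = VFIO_PCI_MSIX_IRQ_INDEX;
set->start = 0;                          /* vector 0 */
set->count = 1;
memcpy(set->data, &efd, sizeof(efd));

ioctl(device_fd, VFIO_DEVICE_SET_IRQS, set);
g_free(set);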
John Johnson June 3, 2022, 12:20 p.m. UTC | #13
> On Jun 1, 2022, at 1:26 PM, Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> 
> I believe the SET_IRQS ioctl definition is pre-enabled to support
> masking and unmasking, we've just lacked kernel support to mask at the
> device which leads to the hybrid approach we have today.  Our intention
> would be to use the current uAPI, to provide that masking support, at
> which point we'd leave the PBA mapped to the device.
> 

	The reason I didn’t use SET_IRQS was that the kernel implementation
didn’t, and I wanted to avoid having the two behave differently.  If we
want to go down the modal path, then if we use SET_IRQS to mask interrupts
at the source, we don’t need to emulate masking by changing the interrupt
eventfd from KVM to QEMU when the interrupt is masked.  Do you want that
change as well?

								JJ
Alexander H Duyck June 6, 2022, 6:32 p.m. UTC | #14
On Tue, May 24, 2022 at 9:11 AM Jagannathan Raman <jag.raman@oracle.com> wrote:
>
> Forward remote device's interrupts to the guest
>

So I had a question about a few bits below. Specifically I ran into
issues when I had set up two devices to be assigned to the same VM via
two vfio-user-pci/x-vfio-user-server interfaces.

What I am hitting is an assert(irq_num < bus->nirq) in
pci_bus_change_irq_level in the server.

> diff --git a/hw/remote/machine.c b/hw/remote/machine.c
> index 645b54343d..75d550daae 100644
> --- a/hw/remote/machine.c
> +++ b/hw/remote/machine.c
> @@ -23,6 +23,8 @@
>  #include "hw/remote/iommu.h"
>  #include "hw/qdev-core.h"
>  #include "hw/remote/iommu.h"
> +#include "hw/remote/vfio-user-obj.h"
> +#include "hw/pci/msi.h"
>
>  static void remote_machine_init(MachineState *machine)
>  {
> @@ -54,12 +56,16 @@ static void remote_machine_init(MachineState *machine)
>
>      if (s->vfio_user) {
>          remote_iommu_setup(pci_host->bus);
> -    }
>
> -    remote_iohub_init(&s->iohub);
> +        msi_nonbroken = true;
> +
> +        vfu_object_set_bus_irq(pci_host->bus);
> +    } else {
> +        remote_iohub_init(&s->iohub);
>
> -    pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
> -                 &s->iohub, REMOTE_IOHUB_NB_PIRQS);
> +        pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
> +                     &s->iohub, REMOTE_IOHUB_NB_PIRQS);
> +    }
>
>      qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
>  }

If I am reading the code right this limits us to one legacy interrupt
in the vfio_user case, irq 0, correct? Is this intentional? Just
wanted to verify as this seems to limit us to supporting only one
device based on the mapping below.

> diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
> index ee28a93782..eeb165a805 100644
> --- a/hw/remote/vfio-user-obj.c
> +++ b/hw/remote/vfio-user-obj.c
> @@ -53,6 +53,9 @@
>  #include "hw/pci/pci.h"
>  #include "qemu/timer.h"
>  #include "exec/memory.h"
> +#include "hw/pci/msi.h"
> +#include "hw/pci/msix.h"
> +#include "hw/remote/vfio-user-obj.h"
>
>  #define TYPE_VFU_OBJECT "x-vfio-user-server"
>  OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
> @@ -96,6 +99,10 @@ struct VfuObject {
>      Error *unplug_blocker;
>
>      int vfu_poll_fd;
> +
> +    MSITriggerFunc *default_msi_trigger;
> +    MSIPrepareMessageFunc *default_msi_prepare_message;
> +    MSIxPrepareMessageFunc *default_msix_prepare_message;
>  };
>
>  static void vfu_object_init_ctx(VfuObject *o, Error **errp);
> @@ -520,6 +527,111 @@ static void vfu_object_register_bars(vfu_ctx_t *vfu_ctx, PCIDevice *pdev)
>      }
>  }
>
> +static int vfu_object_map_irq(PCIDevice *pci_dev, int intx)
> +{
> +    int pci_bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
> +                                pci_dev->devfn);
> +
> +    return pci_bdf;
> +}
> +

This bit ends up mapping it so that the BDF ends up setting the IRQ
number. So for example device 0, function 0 will be IRQ 0, and device
1, function 0 will be IRQ 8. Just wondering why it is implemented this
way if we only intend to support one device. Also I am wondering if we
should support some sort of IRQ sharing?
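
To spell out the arithmetic:

/* devfn = (slot << 3) | func, so with pci_bus_num() == 0:
 *   slot 0, fn 0 -> BDF 0x00 -> "irq" 0
 *   slot 1, fn 0 -> BDF 0x08 -> "irq" 8
 * while pci_bus_irqs() below registers only nirq == 1.
 */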

> +static int vfu_object_setup_irqs(VfuObject *o, PCIDevice *pci_dev)
> +{
> +    vfu_ctx_t *vfu_ctx = o->vfu_ctx;
> +    int ret;
> +
> +    ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_INTX_IRQ, 1);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    if (msix_nr_vectors_allocated(pci_dev)) {
> +        ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_MSIX_IRQ,
> +                                       msix_nr_vectors_allocated(pci_dev));
> +    } else if (msi_nr_vectors_allocated(pci_dev)) {
> +        ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_MSI_IRQ,
> +                                       msi_nr_vectors_allocated(pci_dev));
> +    }
> +
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    vfu_object_setup_msi_cbs(o);
> +
> +    pci_dev->irq_opaque = vfu_ctx;
> +
> +    return 0;
> +}
> +
> +void vfu_object_set_bus_irq(PCIBus *pci_bus)
> +{
> +    pci_bus_irqs(pci_bus, vfu_object_set_irq, vfu_object_map_irq, pci_bus, 1);
> +}
> +
>  /*
>   * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
>   * properties. It also depends on devices instantiated in QEMU. These

So this is the code that was called earlier that is being used to
assign 1 interrupt to the bus. I am just wondering if that is
intentional and if the expected behavior is to only support one device
per server for now?
Jag Raman June 6, 2022, 7:29 p.m. UTC | #15
> On Jun 6, 2022, at 2:32 PM, Alexander Duyck <alexander.duyck@gmail.com> wrote:
> 
> On Tue, May 24, 2022 at 9:11 AM Jagannathan Raman <jag.raman@oracle.com> wrote:
>> 
>> Forward remote device's interrupts to the guest
> 
> So I had a question about a few bits below. Specifically I ran into
> issues when I had set up two devices to be assigned to the same VM via
> two vfio-user-pci/x-vfio-user-server interfaces.

Thanks for the heads up, Alexander!

> 
> What I am hitting is an assert(irq_num < bus->nirq) in
> pci_bus_change_irq_level in the server.

That issue is because we are initializing only one
IRQ for the PCI bus - but it should be more. We will update
how the bus initializes interrupts and get back to you.

We only tested MSI/x devices for the multi-device config. We
should also test INTx - could you please confirm which device
you’re using so we could verify that it works before posting
the next revision.

Thank you!
--
Jag

Alexander H Duyck June 6, 2022, 8:38 p.m. UTC | #16
On Mon, Jun 6, 2022 at 12:29 PM Jag Raman <jag.raman@oracle.com> wrote:
>
>
>
> > On Jun 6, 2022, at 2:32 PM, Alexander Duyck <alexander.duyck@gmail.com> wrote:
> >
> > On Tue, May 24, 2022 at 9:11 AM Jagannathan Raman <jag.raman@oracle.com> wrote:
> >>
> >> Forward remote device's interrupts to the guest
> > So I had a question about a few bits below. Specifically I ran into
> > issues when I had setup two devices to be assigned to the same VM via
> > two vfio-user-pci/x-vfio-user-server interfaces.
>
> Thanks for the heads up, Alexander!
>
> >
> > What I am hitting is an assert(irq_num < bus->nirq) in
> > pci_bus_change_irq_level in the server.
>
> That issue is because we are initializing only one
> IRQ for the PCI bus - but it should be more. We will update
> how the bus initializes interrupts and get back to you.
>
> We only tested MSI/x devices for the multi-device config. We
> should also test INTx - could you please confirm which device
> you’re using so we could verify that it works before posting
> the next revision.
>
> Thank you!
> --
> Jag

I was testing an MSI-X capable NIC, so you can reproduce with e1000e.
During the device enumeration before the driver was even loaded it
threw the error:
qemu-system-x86_64: ../hw/pci/pci.c:276: pci_bus_change_irq_level:
Assertion `irq_num < bus->nirq' failed.

All it really takes is attaching to the second device; that is enough
to trigger the error since the irq_num will be non-zero.

If I recall, all the kernel is doing is unmasking the interrupt via
the config space and it is triggering the error which then shuts down
the server.
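
One possible direction (an untested sketch on my part) would be to size
the bus IRQ array to cover every BDF the map function can return:

void vfu_object_set_bus_irq(PCIBus *pci_bus)
{
    /* cover the whole devfn space instead of a single IRQ */
    int max_bdf = PCI_BUILD_BDF(pci_bus_num(pci_bus), PCI_DEVFN_MAX - 1);

    pci_bus_irqs(pci_bus, vfu_object_set_irq, vfu_object_map_irq, pci_bus,
                 max_bdf + 1);
}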

Patch

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 44dacfa224..b54b6ef88f 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -16,6 +16,7 @@  extern bool pci_available;
 #define PCI_SLOT(devfn)         (((devfn) >> 3) & 0x1f)
 #define PCI_FUNC(devfn)         ((devfn) & 0x07)
 #define PCI_BUILD_BDF(bus, devfn)     ((bus << 8) | (devfn))
+#define PCI_BDF_TO_DEVFN(x)     ((x) & 0xff)
 #define PCI_BUS_MAX             256
 #define PCI_DEVFN_MAX           256
 #define PCI_SLOT_MAX            32
@@ -127,6 +128,10 @@  typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num,
                                 pcibus_t addr, pcibus_t size, int type);
 typedef void PCIUnregisterFunc(PCIDevice *pci_dev);
 
+typedef void MSITriggerFunc(PCIDevice *dev, MSIMessage msg);
+typedef MSIMessage MSIPrepareMessageFunc(PCIDevice *dev, unsigned vector);
+typedef MSIMessage MSIxPrepareMessageFunc(PCIDevice *dev, unsigned vector);
+
 typedef struct PCIIORegion {
     pcibus_t addr; /* current PCI mapping address. -1 means not mapped */
 #define PCI_BAR_UNMAPPED (~(pcibus_t)0)
@@ -329,6 +334,14 @@  struct PCIDevice {
     /* Space to store MSIX table & pending bit array */
     uint8_t *msix_table;
     uint8_t *msix_pba;
+
+    /* May be used by INTx or MSI during interrupt notification */
+    void *irq_opaque;
+
+    MSITriggerFunc *msi_trigger;
+    MSIPrepareMessageFunc *msi_prepare_message;
+    MSIxPrepareMessageFunc *msix_prepare_message;
+
     /* MemoryRegion container for msix exclusive BAR setup */
     MemoryRegion msix_exclusive_bar;
     /* Memory Regions for MSIX table and pending bit entries. */
diff --git a/include/hw/remote/vfio-user-obj.h b/include/hw/remote/vfio-user-obj.h
new file mode 100644
index 0000000000..87ab78b875
--- /dev/null
+++ b/include/hw/remote/vfio-user-obj.h
@@ -0,0 +1,6 @@ 
+#ifndef VFIO_USER_OBJ_H
+#define VFIO_USER_OBJ_H
+
+void vfu_object_set_bus_irq(PCIBus *pci_bus);
+
+#endif
diff --git a/hw/pci/msi.c b/hw/pci/msi.c
index 47d2b0f33c..d556e17a09 100644
--- a/hw/pci/msi.c
+++ b/hw/pci/msi.c
@@ -134,7 +134,7 @@  void msi_set_message(PCIDevice *dev, MSIMessage msg)
     pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data);
 }
 
-MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector)
+static MSIMessage msi_prepare_message(PCIDevice *dev, unsigned int vector)
 {
     uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
     bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
@@ -159,6 +159,11 @@  MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector)
     return msg;
 }
 
+MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector)
+{
+    return dev->msi_prepare_message(dev, vector);
+}
+
 bool msi_enabled(const PCIDevice *dev)
 {
     return msi_present(dev) &&
@@ -241,6 +246,8 @@  int msi_init(struct PCIDevice *dev, uint8_t offset,
                      0xffffffff >> (PCI_MSI_VECTORS_MAX - nr_vectors));
     }
 
+    dev->msi_prepare_message = msi_prepare_message;
+
     return 0;
 }
 
@@ -256,6 +263,7 @@  void msi_uninit(struct PCIDevice *dev)
     cap_size = msi_cap_sizeof(flags);
     pci_del_capability(dev, PCI_CAP_ID_MSI, cap_size);
     dev->cap_present &= ~QEMU_PCI_CAP_MSI;
+    dev->msi_prepare_message = NULL;
 
     MSI_DEV_PRINTF(dev, "uninit\n");
 }
@@ -334,11 +342,7 @@  void msi_notify(PCIDevice *dev, unsigned int vector)
 
 void msi_send_message(PCIDevice *dev, MSIMessage msg)
 {
-    MemTxAttrs attrs = {};
-
-    attrs.requester_id = pci_requester_id(dev);
-    address_space_stl_le(&dev->bus_master_as, msg.address, msg.data,
-                         attrs, NULL);
+    dev->msi_trigger(dev, msg);
 }
 
 /* Normally called by pci_default_write_config(). */
diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index ae9331cd0b..6f85192d6f 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -31,7 +31,7 @@ 
 #define MSIX_ENABLE_MASK (PCI_MSIX_FLAGS_ENABLE >> 8)
 #define MSIX_MASKALL_MASK (PCI_MSIX_FLAGS_MASKALL >> 8)
 
-MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
+static MSIMessage msix_prepare_message(PCIDevice *dev, unsigned vector)
 {
     uint8_t *table_entry = dev->msix_table + vector * PCI_MSIX_ENTRY_SIZE;
     MSIMessage msg;
@@ -41,6 +41,11 @@  MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
     return msg;
 }
 
+MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
+{
+    return dev->msix_prepare_message(dev, vector);
+}
+
 /*
  * Special API for POWER to configure the vectors through
  * a side channel. Should never be used by devices.
@@ -344,6 +349,8 @@  int msix_init(struct PCIDevice *dev, unsigned short nentries,
                           "msix-pba", pba_size);
     memory_region_add_subregion(pba_bar, pba_offset, &dev->msix_pba_mmio);
 
+    dev->msix_prepare_message = msix_prepare_message;
+
     return 0;
 }
 
@@ -429,6 +436,7 @@  void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion *pba_bar)
     g_free(dev->msix_entry_used);
     dev->msix_entry_used = NULL;
     dev->cap_present &= ~QEMU_PCI_CAP_MSIX;
+    dev->msix_prepare_message = NULL;
 }
 
 void msix_uninit_exclusive_bar(PCIDevice *dev)
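MSI-X differs from MSI only in where the message comes from: each vector's address/data pair lives in the device's MSI-X table, and the renamed msix_prepare_message() simply reads that entry. A toy version of the lookup (names invented; the entry layout mirrors PCI_MSIX_ENTRY_SIZE, and a little-endian host is assumed in place of the pci_get_quad()/pci_get_long() accessors the real code uses):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define ENTRY_SIZE 16 /* mirrors PCI_MSIX_ENTRY_SIZE */

typedef struct { uint64_t address; uint32_t data; } Msg;

/* models msix_prepare_message(): read entry 'vector' from the table */
static Msg table_lookup(const uint8_t *table, unsigned vector)
{
    Msg m;
    memcpy(&m.address, table + vector * ENTRY_SIZE, 8);
    memcpy(&m.data, table + vector * ENTRY_SIZE + 8, 4);
    return m;
}

int main(void)
{
    uint8_t table[2 * ENTRY_SIZE] = { 0 };
    table[ENTRY_SIZE] = 0xfe;     /* entry 1: address LSB */
    table[ENTRY_SIZE + 8] = 0x07; /* entry 1: data LSB */

    Msg m = table_lookup(table, 1);
    printf("addr 0x%lx data %u\n", (unsigned long)m.address,
           (unsigned)m.data);
    return 0;
}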
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index a9b37f8000..435f84393c 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -317,6 +317,15 @@  void pci_device_deassert_intx(PCIDevice *dev)
     }
 }
 
+static void pci_msi_trigger(PCIDevice *dev, MSIMessage msg)
+{
+    MemTxAttrs attrs = {};
+
+    attrs.requester_id = pci_requester_id(dev);
+    address_space_stl_le(&dev->bus_master_as, msg.address, msg.data,
+                         attrs, NULL);
+}
+
 static void pci_reset_regions(PCIDevice *dev)
 {
     int r;
@@ -1212,6 +1221,8 @@  static void pci_qdev_unrealize(DeviceState *dev)
 
     pci_device_deassert_intx(pci_dev);
     do_pci_unregister_device(pci_dev);
+
+    pci_dev->msi_trigger = NULL;
 }
 
 void pci_register_bar(PCIDevice *pci_dev, int region_num,
@@ -2251,6 +2262,8 @@  static void pci_qdev_realize(DeviceState *qdev, Error **errp)
     }
 
     pci_set_power(pci_dev, true);
+
+    pci_dev->msi_trigger = pci_msi_trigger;
 }
 
 PCIDevice *pci_new_multifunction(int devfn, bool multifunction,
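pci_msi_trigger() carries the old msi_send_message() behaviour over unchanged: a 32-bit little-endian store of msg.data at msg.address in the device's bus-master address space, tagged with the device's requester ID, now installed at realize time and cleared at unrealize. A toy model of just the store (invented names; the real path goes through address_space_stl_le() with MemTxAttrs):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

static uint8_t fake_as[16]; /* stand-in for the bus-master address space */

/* models address_space_stl_le(): 32-bit little-endian store */
static void stl_le(uint64_t addr, uint32_t val)
{
    uint8_t le[4] = { (uint8_t)val, (uint8_t)(val >> 8),
                      (uint8_t)(val >> 16), (uint8_t)(val >> 24) };
    memcpy(&fake_as[addr], le, 4);
}

int main(void)
{
    stl_le(4, 0x00000025); /* "MSI message": data 0x25 at offset 4 */
    printf("%02x %02x %02x %02x\n",
           fake_as[4], fake_as[5], fake_as[6], fake_as[7]);
    return 0;
}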
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 645b54343d..75d550daae 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -23,6 +23,8 @@ 
 #include "hw/remote/iommu.h"
 #include "hw/qdev-core.h"
 #include "hw/remote/iommu.h"
+#include "hw/remote/vfio-user-obj.h"
+#include "hw/pci/msi.h"
 
 static void remote_machine_init(MachineState *machine)
 {
@@ -54,12 +56,16 @@  static void remote_machine_init(MachineState *machine)
 
     if (s->vfio_user) {
         remote_iommu_setup(pci_host->bus);
-    }
 
-    remote_iohub_init(&s->iohub);
+        msi_nonbroken = true;
+
+        vfu_object_set_bus_irq(pci_host->bus);
+    } else {
+        remote_iohub_init(&s->iohub);
 
-    pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
-                 &s->iohub, REMOTE_IOHUB_NB_PIRQS);
+        pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
+                     &s->iohub, REMOTE_IOHUB_NB_PIRQS);
+    }
 
     qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }
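One ordering subtlety above: msi_nonbroken must be set before any device's msi_init()/msix_init() runs, because those helpers bail out on platforms that leave it false. A reduced sketch of that guard (invented names; msi_nonbroken itself is the real QEMU global, and the -ENOTSUP return is believed to match msi_init()'s early exit):

#include <stdio.h>
#include <errno.h>

static int msi_supported; /* models the msi_nonbroken global */

static int fake_msi_init(void)
{
    if (!msi_supported) {
        return -ENOTSUP; /* platform never declared working MSI */
    }
    return 0;
}

int main(void)
{
    printf("before: %d\n", fake_msi_init()); /* -ENOTSUP */
    msi_supported = 1;                       /* machine init sets this */
    printf("after: %d\n", fake_msi_init());  /* 0 */
    return 0;
}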
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index ee28a93782..eeb165a805 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -53,6 +53,9 @@ 
 #include "hw/pci/pci.h"
 #include "qemu/timer.h"
 #include "exec/memory.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+#include "hw/remote/vfio-user-obj.h"
 
 #define TYPE_VFU_OBJECT "x-vfio-user-server"
 OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
@@ -96,6 +99,10 @@  struct VfuObject {
     Error *unplug_blocker;
 
     int vfu_poll_fd;
+
+    MSITriggerFunc *default_msi_trigger;
+    MSIPrepareMessageFunc *default_msi_prepare_message;
+    MSIxPrepareMessageFunc *default_msix_prepare_message;
 };
 
 static void vfu_object_init_ctx(VfuObject *o, Error **errp);
@@ -520,6 +527,111 @@  static void vfu_object_register_bars(vfu_ctx_t *vfu_ctx, PCIDevice *pdev)
     }
 }
 
+static int vfu_object_map_irq(PCIDevice *pci_dev, int intx)
+{
+    int pci_bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
+                                pci_dev->devfn);
+
+    return pci_bdf;
+}
+
+static void vfu_object_set_irq(void *opaque, int pirq, int level)
+{
+    PCIBus *pci_bus = opaque;
+    PCIDevice *pci_dev = NULL;
+    vfu_ctx_t *vfu_ctx = NULL;
+    int pci_bus_num, devfn;
+
+    if (level) {
+        pci_bus_num = PCI_BUS_NUM(pirq);
+        devfn = PCI_BDF_TO_DEVFN(pirq);
+
+        /*
+         * pci_find_device() runs in O(1) if the device is attached
+         * to the root PCI bus. If the device is attached to a
+         * secondary PCI bus (such as behind a root port), finding
+         * the parent PCI bus could take O(n).
+         */
+        pci_dev = pci_find_device(pci_bus, pci_bus_num, devfn);
+
+        vfu_ctx = pci_dev->irq_opaque;
+
+        g_assert(vfu_ctx);
+
+        vfu_irq_trigger(vfu_ctx, 0);
+    }
+}
+
+static MSIMessage vfu_object_msi_prepare_msg(PCIDevice *pci_dev,
+                                             unsigned int vector)
+{
+    MSIMessage msg;
+
+    msg.address = 0;
+    msg.data = vector;
+
+    return msg;
+}
+
+static void vfu_object_msi_trigger(PCIDevice *pci_dev, MSIMessage msg)
+{
+    vfu_ctx_t *vfu_ctx = pci_dev->irq_opaque;
+
+    vfu_irq_trigger(vfu_ctx, msg.data);
+}
+
+static void vfu_object_setup_msi_cbs(VfuObject *o)
+{
+    o->default_msi_trigger = o->pci_dev->msi_trigger;
+    o->default_msi_prepare_message = o->pci_dev->msi_prepare_message;
+    o->default_msix_prepare_message = o->pci_dev->msix_prepare_message;
+
+    o->pci_dev->msi_trigger = vfu_object_msi_trigger;
+    o->pci_dev->msi_prepare_message = vfu_object_msi_prepare_msg;
+    o->pci_dev->msix_prepare_message = vfu_object_msi_prepare_msg;
+}
+
+static void vfu_object_restore_msi_cbs(VfuObject *o)
+{
+    o->pci_dev->msi_trigger = o->default_msi_trigger;
+    o->pci_dev->msi_prepare_message = o->default_msi_prepare_message;
+    o->pci_dev->msix_prepare_message = o->default_msix_prepare_message;
+}
+
+static int vfu_object_setup_irqs(VfuObject *o, PCIDevice *pci_dev)
+{
+    vfu_ctx_t *vfu_ctx = o->vfu_ctx;
+    int ret;
+
+    ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_INTX_IRQ, 1);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (msix_nr_vectors_allocated(pci_dev)) {
+        ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_MSIX_IRQ,
+                                       msix_nr_vectors_allocated(pci_dev));
+    } else if (msi_nr_vectors_allocated(pci_dev)) {
+        ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_MSI_IRQ,
+                                       msi_nr_vectors_allocated(pci_dev));
+    }
+
+    if (ret < 0) {
+        return ret;
+    }
+
+    vfu_object_setup_msi_cbs(o);
+
+    pci_dev->irq_opaque = vfu_ctx;
+
+    return 0;
+}
+
+void vfu_object_set_bus_irq(PCIBus *pci_bus)
+{
+    pci_bus_irqs(pci_bus, vfu_object_set_irq, vfu_object_map_irq, pci_bus, 1);
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -632,6 +744,13 @@  static void vfu_object_init_ctx(VfuObject *o, Error **errp)
 
     vfu_object_register_bars(o->vfu_ctx, o->pci_dev);
 
+    ret = vfu_object_setup_irqs(o, o->pci_dev);
+    if (ret < 0) {
+        error_setg(errp, "vfu: Failed to setup interrupts for %s",
+                   o->device);
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
@@ -657,6 +776,8 @@  fail:
         o->unplug_blocker = NULL;
     }
     if (o->pci_dev) {
+        vfu_object_restore_msi_cbs(o);
+        o->pci_dev->irq_opaque = NULL;
         object_unref(OBJECT(o->pci_dev));
         o->pci_dev = NULL;
     }
@@ -716,6 +837,8 @@  static void vfu_object_finalize(Object *obj)
     }
 
     if (o->pci_dev) {
+        vfu_object_restore_msi_cbs(o);
+        o->pci_dev->irq_opaque = NULL;
         object_unref(OBJECT(o->pci_dev));
         o->pci_dev = NULL;
     }
diff --git a/stubs/vfio-user-obj.c b/stubs/vfio-user-obj.c
new file mode 100644
index 0000000000..79100d768e
--- /dev/null
+++ b/stubs/vfio-user-obj.c
@@ -0,0 +1,6 @@ 
+#include "qemu/osdep.h"
+#include "hw/remote/vfio-user-obj.h"
+
+void vfu_object_set_bus_irq(PCIBus *pci_bus)
+{
+}
diff --git a/MAINTAINERS b/MAINTAINERS
index 9d8695b68d..844ed75834 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3604,6 +3604,7 @@  F: hw/remote/iohub.c
 F: include/hw/remote/iohub.h
 F: subprojects/libvfio-user
 F: hw/remote/vfio-user-obj.c
+F: include/hw/remote/vfio-user-obj.h
 F: hw/remote/iommu.c
 F: include/hw/remote/iommu.h
 
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 847d50d88f..c167b3c7a5 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -12,3 +12,4 @@  vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
 vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 0x%"PRIx64" size 0x%"PRIx64""
 vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR address 0x%"PRIx64""
 vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR address 0x%"PRIx64""
+vfu_interrupt(int pirq) "vfu: sending interrupt to device - PIRQ %d"
diff --git a/stubs/meson.build b/stubs/meson.build
index 6f80fec761..d8f3fd5c44 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -60,3 +60,4 @@  if have_system
 else
   stub_ss.add(files('qdev.c'))
 endif
+stub_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_false: files('vfio-user-obj.c'))