[RFC,v3,03/25] hw/iommu: introduce IOMMUContext

Message ID 1580300216-86172-4-git-send-email-yi.l.liu@intel.com
State New
Series intel_iommu: expose Shared Virtual Addressing to VMs

Commit Message

Yi Liu Jan. 29, 2020, 12:16 p.m. UTC
From: Peter Xu <peterx@redhat.com>

Many platform vendors now provide dual-stage DMA address translation in
hardware, for example nested translation in Intel VT-d scalable mode and
nested stage translation on ARM SMMUv3. There are also efforts to back
the QEMU vIOMMU with this hardware capability, in order to give passthru
devices better address translation support.

Backing the vIOMMU with dual-stage translation requires that the QEMU
vIOMMU has a way to learn about the hardware capability, and a way to
receive DMA address translation faults (e.g. I/O page requests) from the
host, because the guest owns the stage-1 translation structures in
dual-stage DMA address translation.

This patch adds IOMMUContext as an abstraction of vIOMMU-related
operations. For now it provides a way for passthru modules (e.g. VFIO)
to register DualStageIOMMUObject instances. In future it is expected to
also support receiving host DMA translation faults that happen on
stage-1 translation.

For more background, refer to the discussion below; note that the
current implementation differs from the original proposal. IOMMUContext
is introduced as an abstraction layer for calls from passthru modules
(e.g. VFIO) into the vIOMMU. The first interface makes the QEMU vIOMMU
aware of the dual-stage translation capability.

https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg05022.html
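
The intended call flow can be sketched outside of QEMU as follows. This is a
minimal, standalone illustration and not part of the patch: the type and
function declarations mirror iommu_context.h and iommu_context.c from this
patch, DualStageIOMMUObject is reduced to a placeholder, and everything
prefixed with demo_ is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the type in hw/iommu/dual_stage_iommu.h (contents assumed). */
typedef struct DualStageIOMMUObject {
    int placeholder;
} DualStageIOMMUObject;

/* Mirrors the declarations in this patch's iommu_context.h. */
typedef struct IOMMUContext IOMMUContext;
typedef struct IOMMUContextOps {
    int (*register_ds_iommu)(IOMMUContext *iommu_ctx,
                             DualStageIOMMUObject *dsi_obj);
    void (*unregister_ds_iommu)(IOMMUContext *iommu_ctx,
                                DualStageIOMMUObject *dsi_obj);
} IOMMUContextOps;

struct IOMMUContext {
    IOMMUContextOps *ops;
};

/* Same dispatch logic as iommu_context.c in this patch. */
int iommu_context_register_ds_iommu(IOMMUContext *iommu_ctx,
                                    DualStageIOMMUObject *dsi_obj)
{
    if (!iommu_ctx || !dsi_obj) {
        return -ENOENT;
    }
    if (iommu_ctx->ops && iommu_ctx->ops->register_ds_iommu) {
        return iommu_ctx->ops->register_ds_iommu(iommu_ctx, dsi_obj);
    }
    return -ENOENT;
}

void iommu_context_unregister_ds_iommu(IOMMUContext *iommu_ctx,
                                       DualStageIOMMUObject *dsi_obj)
{
    if (!iommu_ctx || !dsi_obj) {
        return;
    }
    if (iommu_ctx->ops && iommu_ctx->ops->unregister_ds_iommu) {
        iommu_ctx->ops->unregister_ds_iommu(iommu_ctx, dsi_obj);
    }
}

void iommu_context_init(IOMMUContext *iommu_ctx, IOMMUContextOps *ops)
{
    iommu_ctx->ops = ops;
}

/* Hypothetical vIOMMU side: remember which object was registered. */
DualStageIOMMUObject *demo_registered;

int demo_register(IOMMUContext *iommu_ctx, DualStageIOMMUObject *dsi_obj)
{
    (void)iommu_ctx;
    demo_registered = dsi_obj;
    return 0;
}

void demo_unregister(IOMMUContext *iommu_ctx, DualStageIOMMUObject *dsi_obj)
{
    (void)iommu_ctx;
    if (demo_registered == dsi_obj) {
        demo_registered = NULL;
    }
}

IOMMUContextOps demo_ops = {
    .register_ds_iommu = demo_register,
    .unregister_ds_iommu = demo_unregister,
};

/* Hypothetical passthru side: returns 0 when the round trip works. */
int demo_passthru_flow(void)
{
    IOMMUContext ctx;
    DualStageIOMMUObject obj = { 0 };

    iommu_context_init(&ctx, &demo_ops);
    if (iommu_context_register_ds_iommu(&ctx, &obj) != 0 ||
        demo_registered != &obj) {
        return -1;
    }
    iommu_context_unregister_ds_iommu(&ctx, &obj);
    return demo_registered == NULL ? 0 : -1;
}
```

The vIOMMU fills in the ops table once and the passthru module only ever
goes through the iommu_context_*() wrappers, so a context whose vIOMMU
does not implement the hook simply yields -ENOENT.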

Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Yi Sun <yi.y.sun@linux.intel.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
---
 hw/iommu/Makefile.objs           |  1 +
 hw/iommu/iommu_context.c         | 54 +++++++++++++++++++++++++++++++++++
 include/hw/iommu/iommu_context.h | 61 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 116 insertions(+)
 create mode 100644 hw/iommu/iommu_context.c
 create mode 100644 include/hw/iommu/iommu_context.h

Comments

David Gibson Jan. 31, 2020, 4:06 a.m. UTC | #1
On Wed, Jan 29, 2020 at 04:16:34AM -0800, Liu, Yi L wrote:
> From: Peter Xu <peterx@redhat.com>
> 
> Currently, many platform vendors provide the capability of dual stage
> DMA address translation in hardware. For example, nested translation
> on Intel VT-d scalable mode, nested stage translation on ARM SMMUv3,
> and etc. Also there are efforts to make QEMU vIOMMU be backed by dual
> stage DMA address translation capability provided by hardware to have
> better address translation support for passthru devices.
> 
> As so, making vIOMMU be backed by dual stage translation capability
> requires QEMU vIOMMU to have a way to get aware of such hardware
> capability and also require a way to receive DMA address translation
> faults (e.g. I/O page request) from host as guest owns stage-1 translation
> structures in dual stage DAM address translation.
> 
> This patch adds IOMMUContext as an abstract of vIOMMU related operations.
> Like provide a way for passthru modules (e.g. VFIO) to register
> DualStageIOMMUObject instances. And in future, it is expected to offer
> support for receiving host DMA translation faults happened on stage-1
> translation.
> 
> For more backgrounds, may refer to the discussion below, while there
> is also difference between the current implementation and original
> proposal. This patch introduces the IOMMUContext as an abstract layer
> for passthru module (e.g. VFIO) calls into vIOMMU. The first introduced
> interface is to make QEMU vIOMMU be aware of dual stage translation
> capability.
> 
> https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg05022.html

Again, is there a reason for not making this a QOM class or interface?


I'm not very clear on the relationship between an IOMMUContext and a
DualStageIOMMUObject.  Can there be many IOMMUContexts to a
DualStageIOMMUObject?  The other way around?  Or is it just
zero-or-one DualStageIOMMUObjects to an IOMMUContext?

> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Yi Sun <yi.y.sun@linux.intel.com>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> ---
>  hw/iommu/Makefile.objs           |  1 +
>  hw/iommu/iommu_context.c         | 54 +++++++++++++++++++++++++++++++++++
>  include/hw/iommu/iommu_context.h | 61 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 116 insertions(+)
>  create mode 100644 hw/iommu/iommu_context.c
>  create mode 100644 include/hw/iommu/iommu_context.h
> 
> diff --git a/hw/iommu/Makefile.objs b/hw/iommu/Makefile.objs
> index d4f3b39..1e45072 100644
> --- a/hw/iommu/Makefile.objs
> +++ b/hw/iommu/Makefile.objs
> @@ -1 +1,2 @@
>  obj-y += dual_stage_iommu.o
> +obj-y += iommu_context.o
> diff --git a/hw/iommu/iommu_context.c b/hw/iommu/iommu_context.c
> new file mode 100644
> index 0000000..6340ca3
> --- /dev/null
> +++ b/hw/iommu/iommu_context.c
> @@ -0,0 +1,54 @@
> +/*
> + * QEMU abstract of vIOMMU context
> + *
> + * Copyright (C) 2020 Red Hat Inc.
> + *
> + * Authors: Peter Xu <peterx@redhat.com>,
> + *          Liu Yi L <yi.l.liu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> +
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> +
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/iommu/iommu_context.h"
> +
> +int iommu_context_register_ds_iommu(IOMMUContext *iommu_ctx,
> +                                    DualStageIOMMUObject *dsi_obj)
> +{
> +    if (!iommu_ctx || !dsi_obj) {

Would this ever happen apart from a bug in the caller?  If not it
should be an assert().

> +        return -ENOENT;
> +    }
> +
> +    if (iommu_ctx->ops && iommu_ctx->ops->register_ds_iommu) {
> +        return iommu_ctx->ops->register_ds_iommu(iommu_ctx, dsi_obj);
> +    }
> +    return -ENOENT;
> +}
> +
> +void iommu_context_unregister_ds_iommu(IOMMUContext *iommu_ctx,
> +                                      DualStageIOMMUObject *dsi_obj)
> +{
> +    if (!iommu_ctx || !dsi_obj) {
> +        return;
> +    }
> +
> +    if (iommu_ctx->ops && iommu_ctx->ops->unregister_ds_iommu) {
> +        iommu_ctx->ops->unregister_ds_iommu(iommu_ctx, dsi_obj);
> +    }
> +}
> +
> +void iommu_context_init(IOMMUContext *iommu_ctx, IOMMUContextOps *ops)
> +{
> +    iommu_ctx->ops = ops;
> +}
> diff --git a/include/hw/iommu/iommu_context.h b/include/hw/iommu/iommu_context.h
> new file mode 100644
> index 0000000..6f2ccb5
> --- /dev/null
> +++ b/include/hw/iommu/iommu_context.h
> @@ -0,0 +1,61 @@
> +/*
> + * QEMU abstraction of IOMMU Context
> + *
> + * Copyright (C) 2020 Red Hat Inc.
> + *
> + * Authors: Peter Xu <peterx@redhat.com>,
> + *          Liu, Yi L <yi.l.liu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> +
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> +
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_IOMMU_CONTEXT_H
> +#define HW_IOMMU_CONTEXT_H
> +
> +#include "qemu/queue.h"
> +#ifndef CONFIG_USER_ONLY
> +#include "exec/hwaddr.h"
> +#endif
> +#include "hw/iommu/dual_stage_iommu.h"
> +
> +typedef struct IOMMUContext IOMMUContext;
> +typedef struct IOMMUContextOps IOMMUContextOps;
> +
> +struct IOMMUContextOps {
> +    /*
> +     * Register DualStageIOMMUObject to vIOMMU thus vIOMMU
> +     * is aware of dual stage translation capability, and
> +     * also be able to setup dual stage translation via
> +     * interfaces exposed by DualStageIOMMUObject.
> +     */
> +    int (*register_ds_iommu)(IOMMUContext *iommu_ctx,
> +                             DualStageIOMMUObject *dsi_obj);
> +    void (*unregister_ds_iommu)(IOMMUContext *iommu_ctx,
> +                                DualStageIOMMUObject *dsi_obj);
> +};
> +
> +/*
> + * This is an abstraction of IOMMU context.
> + */
> +struct IOMMUContext {
> +    IOMMUContextOps *ops;
> +};
> +
> +int iommu_context_register_ds_iommu(IOMMUContext *iommu_ctx,
> +                                    DualStageIOMMUObject *dsi_obj);
> +void iommu_context_unregister_ds_iommu(IOMMUContext *iommu_ctx,
> +                                       DualStageIOMMUObject *dsi_obj);
> +void iommu_context_init(IOMMUContext *iommu_ctx, IOMMUContextOps *ops);
> +
> +#endif
Yi Liu Jan. 31, 2020, 11:42 a.m. UTC | #2
Hi David,

> From: David Gibson [mailto:david@gibson.dropbear.id.au]
> Sent: Friday, January 31, 2020 12:07 PM
> To: Liu, Yi L <yi.l.liu@intel.com>
> Subject: Re: [RFC v3 03/25] hw/iommu: introduce IOMMUContext
> 
> On Wed, Jan 29, 2020 at 04:16:34AM -0800, Liu, Yi L wrote:
> > From: Peter Xu <peterx@redhat.com>
> >
> > Currently, many platform vendors provide the capability of dual stage
> > DMA address translation in hardware. For example, nested translation
> > on Intel VT-d scalable mode, nested stage translation on ARM SMMUv3,
> > and etc. Also there are efforts to make QEMU vIOMMU be backed by dual
> > stage DMA address translation capability provided by hardware to have
> > better address translation support for passthru devices.
> >
> > As so, making vIOMMU be backed by dual stage translation capability
> > requires QEMU vIOMMU to have a way to get aware of such hardware
> > capability and also require a way to receive DMA address translation
> > faults (e.g. I/O page request) from host as guest owns stage-1 translation
> > structures in dual stage DAM address translation.
> >
> > This patch adds IOMMUContext as an abstract of vIOMMU related operations.
> > Like provide a way for passthru modules (e.g. VFIO) to register
> > DualStageIOMMUObject instances. And in future, it is expected to offer
> > support for receiving host DMA translation faults happened on stage-1
> > translation.
> >
> > For more backgrounds, may refer to the discussion below, while there
> > is also difference between the current implementation and original
> > proposal. This patch introduces the IOMMUContext as an abstract layer
> > for passthru module (e.g. VFIO) calls into vIOMMU. The first introduced
> > interface is to make QEMU vIOMMU be aware of dual stage translation
> > capability.
> >
> > https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg05022.html
> 
> Again, is there a reason for not making this a QOM class or interface?

I think a simple abstraction layer is enough, as explained in the prior
email. IOMMUContext provides an explicit method for VFIO to call into
vIOMMU emulators.

> 
> I'm not very clear on the relationship betwen an IOMMUContext and a
> DualStageIOMMUObject.  Can there be many IOMMUContexts to a
> DualStageIOMMUOBject?  The other way around?  Or is it just
> zero-or-one DualStageIOMMUObjects to an IOMMUContext?

It is possible. As the patch below shows, DualStageIOMMUObject is per VFIO
container. IOMMUContext can be either per-device or shared across devices;
that depends on the vendor-specific vIOMMU emulator.
[RFC v3 10/25] vfio: register DualStageIOMMUObject to vIOMMU
https://www.spinics.net/lists/kvm/msg205198.html

Take the Intel vIOMMU as an example: there is a per-device structure that
includes an IOMMUContext instance and a DualStageIOMMUObject pointer.

+struct VTDIOMMUContext {
+    VTDBus *vtd_bus;
+    uint8_t devfn;
+    IOMMUContext iommu_context;
+    DualStageIOMMUObject *dsi_obj;
+    IntelIOMMUState *iommu_state;
+};
https://www.spinics.net/lists/kvm/msg205196.html

I think this would leave space for vendor specific vIOMMU emulators to
design their own relationship between an IOMMUContext and a
DualStageIOMMUObject.
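
As an illustration of how such an embedded IOMMUContext can be used, a
callback that only receives the IOMMUContext pointer can recover the
enclosing per-device structure with a container_of-style macro. The sketch
below is standalone and hypothetical: DemoVTDIOMMUContext is a cut-down
stand-in for VTDIOMMUContext, demo_container_of is a local macro, and the
"one object per device" rule is an assumption of the sketch, not something
stated in the series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local equivalent of a container_of macro. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the real type (contents assumed). */
typedef struct DualStageIOMMUObject {
    int placeholder;
} DualStageIOMMUObject;

typedef struct IOMMUContext IOMMUContext;
typedef struct IOMMUContextOps {
    int (*register_ds_iommu)(IOMMUContext *iommu_ctx,
                             DualStageIOMMUObject *dsi_obj);
    void (*unregister_ds_iommu)(IOMMUContext *iommu_ctx,
                                DualStageIOMMUObject *dsi_obj);
} IOMMUContextOps;

struct IOMMUContext {
    IOMMUContextOps *ops;
};

/* Cut-down, hypothetical version of the per-device structure. */
typedef struct DemoVTDIOMMUContext {
    uint8_t devfn;
    IOMMUContext iommu_context;     /* embedded, handed out to VFIO */
    DualStageIOMMUObject *dsi_obj;  /* set on registration */
} DemoVTDIOMMUContext;

int demo_vtd_register(IOMMUContext *iommu_ctx, DualStageIOMMUObject *dsi_obj)
{
    /* Walk back from the embedded member to the per-device structure. */
    DemoVTDIOMMUContext *vtd =
        demo_container_of(iommu_ctx, DemoVTDIOMMUContext, iommu_context);

    if (vtd->dsi_obj) {
        return -EBUSY;  /* assumption: one object per device in this sketch */
    }
    vtd->dsi_obj = dsi_obj;
    return 0;
}

void demo_vtd_unregister(IOMMUContext *iommu_ctx,
                         DualStageIOMMUObject *dsi_obj)
{
    DemoVTDIOMMUContext *vtd =
        demo_container_of(iommu_ctx, DemoVTDIOMMUContext, iommu_context);

    if (vtd->dsi_obj == dsi_obj) {
        vtd->dsi_obj = NULL;
    }
}

IOMMUContextOps demo_vtd_ops = {
    .register_ds_iommu = demo_vtd_register,
    .unregister_ds_iommu = demo_vtd_unregister,
};
```

With this layout the relationship between the two objects is decided
entirely inside the vIOMMU's callbacks, which is the flexibility described
above.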

> > Cc: Kevin Tian <kevin.tian@intel.com>
> > Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Cc: Peter Xu <peterx@redhat.com>

[...]

> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "hw/iommu/iommu_context.h"
> > +
> > +int iommu_context_register_ds_iommu(IOMMUContext *iommu_ctx,
> > +                                    DualStageIOMMUObject *dsi_obj)
> > +{
> > +    if (!iommu_ctx || !dsi_obj) {
> 
> Would this ever happen apart from a bug in the caller?  If not it
> should be an assert().

Got it, thanks. I'll check for other similar cases in this series and fix
them in the next version.

Thanks,
Yi Liu
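
For reference, the assert() variant suggested in the review could look
roughly like the sketch below. It is standalone and only illustrative: the
types mirror the patch, DualStageIOMMUObject is a placeholder, demo_accept
is hypothetical, and keeping -ENOENT for a missing hook is an assumption
about which conditions are caller bugs and which are not.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the real type (contents assumed). */
typedef struct DualStageIOMMUObject {
    int placeholder;
} DualStageIOMMUObject;

typedef struct IOMMUContext IOMMUContext;
typedef struct IOMMUContextOps {
    int (*register_ds_iommu)(IOMMUContext *iommu_ctx,
                             DualStageIOMMUObject *dsi_obj);
    void (*unregister_ds_iommu)(IOMMUContext *iommu_ctx,
                                DualStageIOMMUObject *dsi_obj);
} IOMMUContextOps;

struct IOMMUContext {
    IOMMUContextOps *ops;
};

int iommu_context_register_ds_iommu(IOMMUContext *iommu_ctx,
                                    DualStageIOMMUObject *dsi_obj)
{
    /* A NULL argument can only come from a caller bug, so fail loudly. */
    assert(iommu_ctx && dsi_obj);

    /* Assumed: a vIOMMU without the hook is a runtime condition. */
    if (iommu_ctx->ops && iommu_ctx->ops->register_ds_iommu) {
        return iommu_ctx->ops->register_ds_iommu(iommu_ctx, dsi_obj);
    }
    return -ENOENT;
}

/* Hypothetical hook, used only to exercise the function. */
int demo_accept(IOMMUContext *iommu_ctx, DualStageIOMMUObject *dsi_obj)
{
    (void)iommu_ctx;
    (void)dsi_obj;
    return 0;
}
```

The error return then carries only one meaning (no hook available), and a
NULL pointer is caught at the call site during development.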
Peter Xu Feb. 11, 2020, 4:58 p.m. UTC | #3
On Fri, Jan 31, 2020 at 11:42:13AM +0000, Liu, Yi L wrote:
> > I'm not very clear on the relationship betwen an IOMMUContext and a
> > DualStageIOMMUObject.  Can there be many IOMMUContexts to a
> > DualStageIOMMUOBject?  The other way around?  Or is it just
> > zero-or-one DualStageIOMMUObjects to an IOMMUContext?
> 
> It is possible. As the below patch shows, DualStageIOMMUObject is per vfio
> container. IOMMUContext can be either per-device or shared across devices,
> it depends on vendor specific vIOMMU emulators.

Is there an example when an IOMMUContext can be not per-device?

It makes sense to me to have an object that is per-container (in your
case, the DualStageIOMMUObject, IIUC), then we can connect that object
to a device.  However I'm a bit confused on why we've got two abstract
layers (the other one is IOMMUContext)?  That was previously for the
whole SVA new APIs, now it's all moved over to the other new object,
then IOMMUContext only register/unregister... Can we put the reg/unreg
procedures into DualStageIOMMUObject as well?  Then we drop the
IOMMUContext (or say, keep IOMMUContext and drop DualStageIOMMUObject
but let IOMMUContext to be per-vfio-container, the major difference is
the naming here, say, PASID allocation does not seem to be related to
dual-stage at all).

Besides that, not sure I read it right... but even with your current
series, the container->iommu_ctx will always only be bound to the
first device created within that container, since you've got:

    group = vfio_get_group(groupid, pci_device_iommu_address_space(pdev),
                           pci_device_iommu_context(pdev), errp);

And:

    if (vfio_connect_container(group, as, iommu_ctx, errp)) {
        error_prepend(errp, "failed to setup container for group %d: ",
                      groupid);
        goto close_fd_exit;
    }

The iommu_ctx will be set to container->iommu_ctx if there's no
existing container.
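
To make the concern concrete, here is a heavily simplified, standalone
model of that behaviour. It is not the VFIO code: DemoContainer,
demo_connect_container() and the integer address-space id are all
hypothetical, and IOMMUContext is reduced to a placeholder. It only shows
that when an existing container is reused, the iommu_ctx of the second
device never reaches the container.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real type (contents assumed). */
typedef struct IOMMUContext {
    int id;
} IOMMUContext;

/* Hypothetical, simplified container. */
typedef struct DemoContainer {
    int in_use;
    int as_id;                /* address space the container serves */
    IOMMUContext *iommu_ctx;  /* set once, when the container is created */
} DemoContainer;

#define DEMO_MAX_CONTAINERS 4
DemoContainer demo_containers[DEMO_MAX_CONTAINERS];

DemoContainer *demo_connect_container(int as_id, IOMMUContext *iommu_ctx)
{
    size_t i;

    /* An existing container is reused as-is: this caller's iommu_ctx is dropped. */
    for (i = 0; i < DEMO_MAX_CONTAINERS; i++) {
        if (demo_containers[i].in_use && demo_containers[i].as_id == as_id) {
            return &demo_containers[i];
        }
    }

    /* Otherwise create a container and bind the caller's iommu_ctx to it. */
    for (i = 0; i < DEMO_MAX_CONTAINERS; i++) {
        if (!demo_containers[i].in_use) {
            demo_containers[i].in_use = 1;
            demo_containers[i].as_id = as_id;
            demo_containers[i].iommu_ctx = iommu_ctx;
            return &demo_containers[i];
        }
    }
    return NULL;
}
```

Connecting two devices with different contexts to the same address space
leaves the container pointing at the first device's context only.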

> [RFC v3 10/25] vfio: register DualStageIOMMUObject to vIOMMU
> https://www.spinics.net/lists/kvm/msg205198.html
> 
> Take Intel vIOMMU as an example, there is a per device structure which
> includes IOMMUContext instance and a DualStageIOMMUObject pointer.
> 
> +struct VTDIOMMUContext {
> +    VTDBus *vtd_bus;
> +    uint8_t devfn;
> +    IOMMUContext iommu_context;
> +    DualStageIOMMUObject *dsi_obj;
> +    IntelIOMMUState *iommu_state;
> +};
> https://www.spinics.net/lists/kvm/msg205196.html
> 
> I think this would leave space for vendor specific vIOMMU emulators to
> design their own relationship between an IOMMUContext and a
> DualStageIOMMUObject.
Yi Liu Feb. 12, 2020, 7:15 a.m. UTC | #4
Hi Peter,

> From: Peter Xu <peterx@redhat.com>
> Sent: Wednesday, February 12, 2020 12:59 AM
> To: Liu, Yi L <yi.l.liu@intel.com>
> Subject: Re: [RFC v3 03/25] hw/iommu: introduce IOMMUContext
> 
> On Fri, Jan 31, 2020 at 11:42:13AM +0000, Liu, Yi L wrote:
> > > I'm not very clear on the relationship betwen an IOMMUContext and a
> > > DualStageIOMMUObject.  Can there be many IOMMUContexts to a
> > > DualStageIOMMUOBject?  The other way around?  Or is it just
> > > zero-or-one DualStageIOMMUObjects to an IOMMUContext?
> >
> > It is possible. As the below patch shows, DualStageIOMMUObject is per vfio
> > container. IOMMUContext can be either per-device or shared across devices,
> > it depends on vendor specific vIOMMU emulators.
> 
> Is there an example when an IOMMUContext can be not per-device?

No, I don't have such an example so far. But since the IOMMUContext is
obtained from pci_device_iommu_context(), in concept it does not have to be
per-device. It is left to the vIOMMU to decide whether different devices can
share a single IOMMUContext.

> It makes sense to me to have an object that is per-container (in your
> case, the DualStageIOMMUObject, IIUC), then we can connect that object
> to a device.  However I'm a bit confused on why we've got two abstract
> layers (the other one is IOMMUContext)?  That was previously for the
> whole SVA new APIs, now it's all moved over to the other new object,
> then IOMMUContext only register/unregister...

Your understanding is correct. Actually, I also struggled with adding two
abstraction layers. But there are two kinds of calls needed for vSVA
enabling. The first is an explicit method for the vIOMMU to call into
VFIO (PASID allocation, binding the guest page table, cache invalidation).
The second is an explicit method for VFIO to call into the vIOMMU (DMA
fault/PRQ injection, which is not included in this series yet but will be
upstreamed later). So I added DualStageIOMMUObject to cover vIOMMU-to-VFIO
calls, and IOMMUContext to cover VFIO-to-vIOMMU calls. Since IOMMUContext
covers VFIO-to-vIOMMU calls, I made it include register/unregister.

> Can we put the reg/unreg
> procedures into DualStageIOMMUObject as well?  Then we drop the
> IOMMUContext (or say, keep IOMMUContext and drop DualStageIOMMUObject
> but let IOMMUContext to be per-vfio-container, the major difference is
> the naming here, say, PASID allocation does not seem to be related to
> dual-stage at all).
>
> Besides that, not sure I read it right... but even with your current
> series, the container->iommu_ctx will always only be bound to the
> first device created within that container, since you've got:
> 
>     group = vfio_get_group(groupid, pci_device_iommu_address_space(pdev),
>                            pci_device_iommu_context(pdev), errp);
> 
> And:
> 
>     if (vfio_connect_container(group, as, iommu_ctx, errp)) {
>         error_prepend(errp, "failed to setup container for group %d: ",
>                       groupid);
>         goto close_fd_exit;
>     }
> 
> The iommu_ctx will be set to container->iommu_ctx if there's no
> existing container.

Yes, that's true. We may need to add an iommu_ctx list to the VFIO container,
or add a check on the iommu_ctx passed to vfio_get_group(), if we stick to
this direction.

While considering your suggestion to drop one of the two abstraction
layers, I came up with a new proposal:

We may drop IOMMUContext in this series and rename DualStageIOMMUObject
to HostIOMMUContext, which is per-vfio-container. Then add an interface in
the PCI layer (e.g. a callback in PCIDevice) to let the vIOMMU get the
HostIOMMUContext. I think this could cover the requirement of providing an
explicit method for the vIOMMU to call into VFIO and then program the host
IOMMU.

For the requirement of VFIO-to-vIOMMU calls (e.g. PRQ), I think it could be
done via the PCI layer by adding an operation to PCIIOMMUOps. Thoughts?

Thanks,
Yi Liu
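
One possible shape of that proposal, purely as a standalone sketch: only
the names HostIOMMUContext and PCIIOMMUOps come from the mail above. The
struct fields, the DemoPCIDevice type, both callbacks and all demo_
functions are hypothetical and do not describe any existing or planned
QEMU interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Per-VFIO-container object; the field is assumed for illustration. */
typedef struct HostIOMMUContext {
    int container_fd;
} HostIOMMUContext;

typedef struct DemoPCIDevice DemoPCIDevice;

/* Hypothetical ops table standing in for PCIIOMMUOps. */
typedef struct DemoPCIIOMMUOps {
    /* Hands the host context to the vIOMMU (enables vIOMMU -> VFIO calls). */
    int (*set_host_context)(DemoPCIDevice *dev, HostIOMMUContext *host_ctx);
    /* Lets VFIO report a fault such as a PRQ (VFIO -> vIOMMU call). */
    int (*report_fault)(DemoPCIDevice *dev, uint64_t addr);
} DemoPCIIOMMUOps;

struct DemoPCIDevice {
    const DemoPCIIOMMUOps *iommu_ops;
    HostIOMMUContext *host_ctx;
    uint64_t last_fault_addr;
};

/* PCI-layer wrappers that a passthru module would call. */
int demo_pci_set_host_context(DemoPCIDevice *dev, HostIOMMUContext *host_ctx)
{
    if (dev->iommu_ops && dev->iommu_ops->set_host_context) {
        return dev->iommu_ops->set_host_context(dev, host_ctx);
    }
    return -ENOENT;
}

int demo_pci_report_fault(DemoPCIDevice *dev, uint64_t addr)
{
    if (dev->iommu_ops && dev->iommu_ops->report_fault) {
        return dev->iommu_ops->report_fault(dev, addr);
    }
    return -ENOENT;
}

/* Hypothetical vIOMMU implementation of the two hooks. */
int demo_viommu_set_host_context(DemoPCIDevice *dev,
                                 HostIOMMUContext *host_ctx)
{
    dev->host_ctx = host_ctx;
    return 0;
}

int demo_viommu_report_fault(DemoPCIDevice *dev, uint64_t addr)
{
    dev->last_fault_addr = addr;
    return 0;
}

const DemoPCIIOMMUOps demo_viommu_ops = {
    .set_host_context = demo_viommu_set_host_context,
    .report_fault = demo_viommu_report_fault,
};
```

In this shape there is a single new object type (HostIOMMUContext), and
both call directions go through the PCI layer instead of a separate
IOMMUContext.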
Peter Xu Feb. 12, 2020, 3:59 p.m. UTC | #5
On Wed, Feb 12, 2020 at 07:15:13AM +0000, Liu, Yi L wrote:

[...]

> While considering your suggestion on dropping one of the two abstract
> layers. I came up a new proposal as below:
> 
> We may drop the IOMMUContext in this series, and rename DualStageIOMMUObject
> to HostIOMMUContext, which is per-vfio-container. Add an interface in PCI
> layer(e.g. an callback in  PCIDevice) to let vIOMMU get HostIOMMUContext.
> I think this could cover the requirement of providing explicit method for
> vIOMMU to call into VFIO and then program host IOMMU.
> 
> While for the requirement of VFIO to vIOMMU callings (e.g. PRQ), I think it
> could be done via PCI layer by adding an operation in PCIIOMMUOps. Thoughts?

Hmm sounds good. :)

The thing is that the calls in the other direction (e.g. VFIO injecting
faults into the vIOMMU) are neither per-container nor per-device, but
per-vIOMMU.  PCIIOMMUOps suits that job I'd say, since it is per-vIOMMU.

Let's see how it goes.
Yi Liu Feb. 13, 2020, 2:46 a.m. UTC | #6
> From: Peter Xu <peterx@redhat.com>
> Sent: Thursday, February 13, 2020 12:00 AM
> To: Liu, Yi L <yi.l.liu@intel.com>
> Subject: Re: [RFC v3 03/25] hw/iommu: introduce IOMMUContext
> 
> On Wed, Feb 12, 2020 at 07:15:13AM +0000, Liu, Yi L wrote:
> 
> [...]
> 
> > While considering your suggestion on dropping one of the two abstract
> > layers. I came up a new proposal as below:
> >
> > We may drop the IOMMUContext in this series, and rename
> > DualStageIOMMUObject to HostIOMMUContext, which is per-vfio-container.
> > Add an interface in PCI layer(e.g. an callback in  PCIDevice) to let vIOMMU get
> HostIOMMUContext.
> > I think this could cover the requirement of providing explicit method
> > for vIOMMU to call into VFIO and then program host IOMMU.
> >
> > While for the requirement of VFIO to vIOMMU callings (e.g. PRQ), I
> > think it could be done via PCI layer by adding an operation in PCIIOMMUOps.
> Thoughts?
> 
> Hmm sounds good. :)
> 
> The thing is for the calls to the other direction (e.g. VFIO injecting faults to
> vIOMMU), that's neither per-container nor per-device, but per-vIOMMU.
> PCIIOMMUOps suites for that job I'd say, which is per-vIOMMU.
> 
> Let's see how it goes.

Thanks, let me get a new version out by the end of this week.

Regards,
Yi Liu
David Gibson Feb. 14, 2020, 5:36 a.m. UTC | #7
On Wed, Feb 12, 2020 at 07:15:13AM +0000, Liu, Yi L wrote:
> Hi Peter,
> 
> > From: Peter Xu <peterx@redhat.com>
> > Sent: Wednesday, February 12, 2020 12:59 AM
> > To: Liu, Yi L <yi.l.liu@intel.com>
> > Subject: Re: [RFC v3 03/25] hw/iommu: introduce IOMMUContext
> > 
> > On Fri, Jan 31, 2020 at 11:42:13AM +0000, Liu, Yi L wrote:
> > > > I'm not very clear on the relationship betwen an IOMMUContext and a
> > > > DualStageIOMMUObject.  Can there be many IOMMUContexts to a
> > > > DualStageIOMMUOBject?  The other way around?  Or is it just
> > > > zero-or-one DualStageIOMMUObjects to an IOMMUContext?
> > >
> > > It is possible. As the below patch shows, DualStageIOMMUObject is per vfio
> > > container. IOMMUContext can be either per-device or shared across devices,
> > > it depends on vendor specific vIOMMU emulators.
> > 
> > Is there an example when an IOMMUContext can be not per-device?
> 
> No, I don’t have such example so far. But as IOMMUContext is got from
> pci_device_iommu_context(),  in concept it possible to be not per-device.
> It is kind of leave to vIOMMU to decide if different devices could share a
> single IOMMUContext.

On the "pseries" machine the vIOMMU only has one set of translations
for a whole virtual PCI Host Bridge (vPHB).  So if you attach multiple
devices to a single vPHB, I believe you'd get multiple devices in an
IOMMUContext.  Well.. if we did the PASID stuff, which we don't at the
moment.

Note that on pseries on the other hand it's routine to create multiple
vPHBs, rather than multiple PCI roots being an oddity as it is on x86.
Yi Liu Feb. 15, 2020, 6:25 a.m. UTC | #8
> From: David Gibson < david@gibson.dropbear.id.au >
> Sent: Friday, February 14, 2020 1:36 PM
> To: Liu, Yi L <yi.l.liu@intel.com>
> Subject: Re: [RFC v3 03/25] hw/iommu: introduce IOMMUContext
> 
> On Wed, Feb 12, 2020 at 07:15:13AM +0000, Liu, Yi L wrote:
> > Hi Peter,
> >
> > > From: Peter Xu <peterx@redhat.com>
> > > Sent: Wednesday, February 12, 2020 12:59 AM
> > > To: Liu, Yi L <yi.l.liu@intel.com>
> > > Subject: Re: [RFC v3 03/25] hw/iommu: introduce IOMMUContext
> > >
> > > On Fri, Jan 31, 2020 at 11:42:13AM +0000, Liu, Yi L wrote:
> > > > > I'm not very clear on the relationship betwen an IOMMUContext and a
> > > > > DualStageIOMMUObject.  Can there be many IOMMUContexts to a
> > > > > DualStageIOMMUOBject?  The other way around?  Or is it just
> > > > > zero-or-one DualStageIOMMUObjects to an IOMMUContext?
> > > >
> > > > It is possible. As the below patch shows, DualStageIOMMUObject is per vfio
> > > > container. IOMMUContext can be either per-device or shared across devices,
> > > > it depends on vendor specific vIOMMU emulators.
> > >
> > > Is there an example when an IOMMUContext can be not per-device?
> >
> > No, I don’t have such example so far. But as IOMMUContext is got from
> > pci_device_iommu_context(),  in concept it possible to be not per-device.
> > It is kind of leave to vIOMMU to decide if different devices could share a
> > single IOMMUContext.
> 
> On the "pseries" machine the vIOMMU only has one set of translations
> for a whole virtual PCI Host Bridge (vPHB).  So if you attach multiple
> devices to a single vPHB, I believe you'd get multiple devices in an
> IOMMUContext.  Well.. if we did the PASID stuff, which we don't at the
> moment.
> 
> Note that on pseries on the other hand it's routine to create multiple
> vPHBs, rather than multiple PCI roots being an oddity as it is on x86.

Thanks for the example, David. :-) BTW, I'll drop IOMMUContext in the next
version, as mentioned in the email below. Please feel free to let me know
your opinion.

https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg02874.html

Regards,
Yi Liu

Patch

diff --git a/hw/iommu/Makefile.objs b/hw/iommu/Makefile.objs
index d4f3b39..1e45072 100644
--- a/hw/iommu/Makefile.objs
+++ b/hw/iommu/Makefile.objs
@@ -1 +1,2 @@ 
 obj-y += dual_stage_iommu.o
+obj-y += iommu_context.o
diff --git a/hw/iommu/iommu_context.c b/hw/iommu/iommu_context.c
new file mode 100644
index 0000000..6340ca3
--- /dev/null
+++ b/hw/iommu/iommu_context.c
@@ -0,0 +1,54 @@ 
+/*
+ * QEMU abstract of vIOMMU context
+ *
+ * Copyright (C) 2020 Red Hat Inc.
+ *
+ * Authors: Peter Xu <peterx@redhat.com>,
+ *          Liu Yi L <yi.l.liu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/iommu/iommu_context.h"
+
+int iommu_context_register_ds_iommu(IOMMUContext *iommu_ctx,
+                                    DualStageIOMMUObject *dsi_obj)
+{
+    if (!iommu_ctx || !dsi_obj) {
+        return -ENOENT;
+    }
+
+    if (iommu_ctx->ops && iommu_ctx->ops->register_ds_iommu) {
+        return iommu_ctx->ops->register_ds_iommu(iommu_ctx, dsi_obj);
+    }
+    return -ENOENT;
+}
+
+void iommu_context_unregister_ds_iommu(IOMMUContext *iommu_ctx,
+                                      DualStageIOMMUObject *dsi_obj)
+{
+    if (!iommu_ctx || !dsi_obj) {
+        return;
+    }
+
+    if (iommu_ctx->ops && iommu_ctx->ops->unregister_ds_iommu) {
+        iommu_ctx->ops->unregister_ds_iommu(iommu_ctx, dsi_obj);
+    }
+}
+
+void iommu_context_init(IOMMUContext *iommu_ctx, IOMMUContextOps *ops)
+{
+    iommu_ctx->ops = ops;
+}
diff --git a/include/hw/iommu/iommu_context.h b/include/hw/iommu/iommu_context.h
new file mode 100644
index 0000000..6f2ccb5
--- /dev/null
+++ b/include/hw/iommu/iommu_context.h
@@ -0,0 +1,61 @@ 
+/*
+ * QEMU abstraction of IOMMU Context
+ *
+ * Copyright (C) 2020 Red Hat Inc.
+ *
+ * Authors: Peter Xu <peterx@redhat.com>,
+ *          Liu, Yi L <yi.l.liu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_IOMMU_CONTEXT_H
+#define HW_IOMMU_CONTEXT_H
+
+#include "qemu/queue.h"
+#ifndef CONFIG_USER_ONLY
+#include "exec/hwaddr.h"
+#endif
+#include "hw/iommu/dual_stage_iommu.h"
+
+typedef struct IOMMUContext IOMMUContext;
+typedef struct IOMMUContextOps IOMMUContextOps;
+
+struct IOMMUContextOps {
+    /*
+     * Register DualStageIOMMUObject to vIOMMU thus vIOMMU
+     * is aware of dual stage translation capability, and
+     * also be able to setup dual stage translation via
+     * interfaces exposed by DualStageIOMMUObject.
+     */
+    int (*register_ds_iommu)(IOMMUContext *iommu_ctx,
+                             DualStageIOMMUObject *dsi_obj);
+    void (*unregister_ds_iommu)(IOMMUContext *iommu_ctx,
+                                DualStageIOMMUObject *dsi_obj);
+};
+
+/*
+ * This is an abstraction of IOMMU context.
+ */
+struct IOMMUContext {
+    IOMMUContextOps *ops;
+};
+
+int iommu_context_register_ds_iommu(IOMMUContext *iommu_ctx,
+                                    DualStageIOMMUObject *dsi_obj);
+void iommu_context_unregister_ds_iommu(IOMMUContext *iommu_ctx,
+                                       DualStageIOMMUObject *dsi_obj);
+void iommu_context_init(IOMMUContext *iommu_ctx, IOMMUContextOps *ops);
+
+#endif