mbox series

[RFC,v4,00/16] VIRTIO-IOMMU device

Message ID 1505807208-9063-1-git-send-email-eric.auger@redhat.com
Headers show
Series VIRTIO-IOMMU device | expand

Message

Eric Auger Sept. 19, 2017, 7:46 a.m. UTC
This series implements the virtio-iommu device.

This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
- probe request support although no reserved region is returned at
  the moment
- unmap semantics less strict, as specified in v0.4
- device registration, attach/detach revisited
- split into smaller patches to ease review
- propose a way to inform the IOMMU mr about the page_size_mask
  of underlying HW IOMMU, if any
- remove warning associated with the translation of the MSI doorbell

The device gets instantiated using the "-device virtio-iommu-device"
option. It currently works with ARM virt machine only, as the machine
must handle the dt binding between the virtio-mmio "iommu" node and
the PCI host bridge node.

The associated VHOST/VFIO adaptation is available on the branch
below but is not officially delivered as part of this series.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v2.10.0-virtio-iommu-v4

References:
[1] [RFC] virtio-iommu version 0.4
    git://linux-arm.org/virtio-iommu.git branch viommu/v0.4

Testing:
- guest kernel with v0.4 virtio-iommu driver (4kB page)
- tested with guest using virtio-pci-net and vhost-net
  (,vhost=on/off,iommu_platform,disable-modern=off,disable-legacy=on  )
- tested with VFIO and guest assigned with 2 VFs
- tested with DPDK on guest with 2 assigned VFs

Not tested:
- hot-plug/hot-unplug of EP: not implemented

History:
v2-> v4:
- see above

v2 -> v3:
- rebase on top of 2.10-rc0 and especially
  [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion
- add mutex init
- fix as->mappings deletion using g_tree_ref/unref
- when a dev is attached whereas it is already attached to
  another address space, first detach it
- fix some error values
- page_sizes = TARGET_PAGE_MASK;
- I haven't changed the unmap() semantics yet, waiting for the
  next virtio-iommu spec revision.

v1 -> v2:
- fix redifinition of viommu_as typedef

Eric Auger (16):
  update-linux-headers: import virtio_iommu.h
  linux-headers: Update for virtio-iommu
  virtio-iommu: add skeleton
  virtio-iommu: Decode the command payload
  virtio-iommu: Add the iommu regions
  virtio-iommu: Register attached devices
  virtio-iommu: Implement attach/detach command
  virtio-iommu: Implement map/unmap
  virtio-iommu: Implement translate
  virtio-iommu: Implement probe request
  hw/arm/virt: Add 2.11 machine type
  hw/arm/virt: Add virtio-iommu to the virt board
  memory.h: Add set_page_size_mask IOMMUMemoryRegion callback
  hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes
  virtio-iommu: Implement set_page_size_mask
  hw/vfio/common: Do not print error when viommu translates into an mmio
    region

 hw/arm/virt.c                                 | 115 +++-
 hw/vfio/common.c                              |   7 +-
 hw/virtio/Makefile.objs                       |   1 +
 hw/virtio/trace-events                        |  24 +
 hw/virtio/virtio-iommu.c                      | 923 ++++++++++++++++++++++++++
 include/exec/memory.h                         |   4 +
 include/hw/arm/virt.h                         |   5 +
 include/hw/vfio/vfio-common.h                 |   1 +
 include/hw/virtio/virtio-iommu.h              |  61 ++
 include/standard-headers/linux/virtio_ids.h   |   1 +
 include/standard-headers/linux/virtio_iommu.h | 177 +++++
 linux-headers/linux/virtio_iommu.h            |   1 +
 scripts/update-linux-headers.sh               |   3 +
 13 files changed, 1312 insertions(+), 11 deletions(-)
 create mode 100644 hw/virtio/virtio-iommu.c
 create mode 100644 include/hw/virtio/virtio-iommu.h
 create mode 100644 include/standard-headers/linux/virtio_iommu.h
 create mode 100644 linux-headers/linux/virtio_iommu.h

Comments

Tomasz Nowicki Sept. 27, 2017, 11:07 a.m. UTC | #1
Hi Eric,

Out of curiosity, I compared your SMMUv3 full emulation against 
virtio-iommu. For virtio-net device behind the virtio-iommu I get 50% 
performance of what I can get for SMMUv3 emulation.

Do you have similar observations? Since there is no need to emulate HW 
behaviour in QEMU I expected virtio-iommu to be faster. I will run some 
more benchmarks.

Thanks,
Tomasz

On 19.09.2017 09:46, Eric Auger wrote:
> This series implements the virtio-iommu device.
> 
> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
> - probe request support although no reserved region is returned at
>    the moment
> - unmap semantics less strict, as specified in v0.4
> - device registration, attach/detach revisited
> - split into smaller patches to ease review
> - propose a way to inform the IOMMU mr about the page_size_mask
>    of underlying HW IOMMU, if any
> - remove warning associated with the translation of the MSI doorbell
> 
> The device gets instantiated using the "-device virtio-iommu-device"
> option. It currently works with ARM virt machine only, as the machine
> must handle the dt binding between the virtio-mmio "iommu" node and
> the PCI host bridge node.
> 
> The associated VHOST/VFIO adaptation is available on the branch
> below but is not officially delivered as part of this series.
> 
> Best Regards
> 
> Eric
> 
> This series can be found at:
> https://github.com/eauger/qemu/tree/v2.10.0-virtio-iommu-v4
> 
> References:
> [1] [RFC] virtio-iommu version 0.4
>      git://linux-arm.org/virtio-iommu.git branch viommu/v0.4
> 
> Testing:
> - guest kernel with v0.4 virtio-iommu driver (4kB page)
> - tested with guest using virtio-pci-net and vhost-net
>    (,vhost=on/off,iommu_platform,disable-modern=off,disable-legacy=on  )
> - tested with VFIO and guest assigned with 2 VFs
> - tested with DPDK on guest with 2 assigned VFs
> 
> Not tested:
> - hot-plug/hot-unplug of EP: not implemented
> 
> History:
> v2-> v4:
> - see above
> 
> v2 -> v3:
> - rebase on top of 2.10-rc0 and especially
>    [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion
> - add mutex init
> - fix as->mappings deletion using g_tree_ref/unref
> - when a dev is attached whereas it is already attached to
>    another address space, first detach it
> - fix some error values
> - page_sizes = TARGET_PAGE_MASK;
> - I haven't changed the unmap() semantics yet, waiting for the
>    next virtio-iommu spec revision.
> 
> v1 -> v2:
> - fix redifinition of viommu_as typedef
> 
> Eric Auger (16):
>    update-linux-headers: import virtio_iommu.h
>    linux-headers: Update for virtio-iommu
>    virtio-iommu: add skeleton
>    virtio-iommu: Decode the command payload
>    virtio-iommu: Add the iommu regions
>    virtio-iommu: Register attached devices
>    virtio-iommu: Implement attach/detach command
>    virtio-iommu: Implement map/unmap
>    virtio-iommu: Implement translate
>    virtio-iommu: Implement probe request
>    hw/arm/virt: Add 2.11 machine type
>    hw/arm/virt: Add virtio-iommu to the virt board
>    memory.h: Add set_page_size_mask IOMMUMemoryRegion callback
>    hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes
>    virtio-iommu: Implement set_page_size_mask
>    hw/vfio/common: Do not print error when viommu translates into an mmio
>      region
> 
>   hw/arm/virt.c                                 | 115 +++-
>   hw/vfio/common.c                              |   7 +-
>   hw/virtio/Makefile.objs                       |   1 +
>   hw/virtio/trace-events                        |  24 +
>   hw/virtio/virtio-iommu.c                      | 923 ++++++++++++++++++++++++++
>   include/exec/memory.h                         |   4 +
>   include/hw/arm/virt.h                         |   5 +
>   include/hw/vfio/vfio-common.h                 |   1 +
>   include/hw/virtio/virtio-iommu.h              |  61 ++
>   include/standard-headers/linux/virtio_ids.h   |   1 +
>   include/standard-headers/linux/virtio_iommu.h | 177 +++++
>   linux-headers/linux/virtio_iommu.h            |   1 +
>   scripts/update-linux-headers.sh               |   3 +
>   13 files changed, 1312 insertions(+), 11 deletions(-)
>   create mode 100644 hw/virtio/virtio-iommu.c
>   create mode 100644 include/hw/virtio/virtio-iommu.h
>   create mode 100644 include/standard-headers/linux/virtio_iommu.h
>   create mode 100644 linux-headers/linux/virtio_iommu.h
>
Eric Auger Sept. 27, 2017, 3:38 p.m. UTC | #2
Hi Tomasz

On 27/09/2017 13:07, Tomasz Nowicki wrote:
> Hi Eric,
> 
> Out of curiosity, I compared your SMMUv3 full emulation against
> virtio-iommu. For virtio-net device behind the virtio-iommu I get 50%
> performance of what I can get for SMMUv3 emulation.
> 
> Do you have similar observations? Since there is no need to emulate HW
> behaviour in QEMU I expected virtio-iommu to be faster. I will run some
> more benchmarks.

No I don't have formal comparative figures at the moment. Benchmarking
is also on the top of my todo list though.

One reason could be that virtio-iommu sends requests for each map
operations whereas this does not happen for vsmmuv3 (in non caching
mode). You could try to turn the vsmmuv3 caching-mode option and I guess
you should have more similar perf figures between virtio-iommu and
vsmuv3.

Thanks

Eric
> 
> Thanks,
> Tomasz
> 
> On 19.09.2017 09:46, Eric Auger wrote:
>> This series implements the virtio-iommu device.
>>
>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>> - probe request support although no reserved region is returned at
>>    the moment
>> - unmap semantics less strict, as specified in v0.4
>> - device registration, attach/detach revisited
>> - split into smaller patches to ease review
>> - propose a way to inform the IOMMU mr about the page_size_mask
>>    of underlying HW IOMMU, if any
>> - remove warning associated with the translation of the MSI doorbell
>>
>> The device gets instantiated using the "-device virtio-iommu-device"
>> option. It currently works with ARM virt machine only, as the machine
>> must handle the dt binding between the virtio-mmio "iommu" node and
>> the PCI host bridge node.
>>
>> The associated VHOST/VFIO adaptation is available on the branch
>> below but is not officially delivered as part of this series.
>>
>> Best Regards
>>
>> Eric
>>
>> This series can be found at:
>> https://github.com/eauger/qemu/tree/v2.10.0-virtio-iommu-v4
>>
>> References:
>> [1] [RFC] virtio-iommu version 0.4
>>      git://linux-arm.org/virtio-iommu.git branch viommu/v0.4
>>
>> Testing:
>> - guest kernel with v0.4 virtio-iommu driver (4kB page)
>> - tested with guest using virtio-pci-net and vhost-net
>>    (,vhost=on/off,iommu_platform,disable-modern=off,disable-legacy=on  )
>> - tested with VFIO and guest assigned with 2 VFs
>> - tested with DPDK on guest with 2 assigned VFs
>>
>> Not tested:
>> - hot-plug/hot-unplug of EP: not implemented
>>
>> History:
>> v2-> v4:
>> - see above
>>
>> v2 -> v3:
>> - rebase on top of 2.10-rc0 and especially
>>    [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion
>> - add mutex init
>> - fix as->mappings deletion using g_tree_ref/unref
>> - when a dev is attached whereas it is already attached to
>>    another address space, first detach it
>> - fix some error values
>> - page_sizes = TARGET_PAGE_MASK;
>> - I haven't changed the unmap() semantics yet, waiting for the
>>    next virtio-iommu spec revision.
>>
>> v1 -> v2:
>> - fix redifinition of viommu_as typedef
>>
>> Eric Auger (16):
>>    update-linux-headers: import virtio_iommu.h
>>    linux-headers: Update for virtio-iommu
>>    virtio-iommu: add skeleton
>>    virtio-iommu: Decode the command payload
>>    virtio-iommu: Add the iommu regions
>>    virtio-iommu: Register attached devices
>>    virtio-iommu: Implement attach/detach command
>>    virtio-iommu: Implement map/unmap
>>    virtio-iommu: Implement translate
>>    virtio-iommu: Implement probe request
>>    hw/arm/virt: Add 2.11 machine type
>>    hw/arm/virt: Add virtio-iommu to the virt board
>>    memory.h: Add set_page_size_mask IOMMUMemoryRegion callback
>>    hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes
>>    virtio-iommu: Implement set_page_size_mask
>>    hw/vfio/common: Do not print error when viommu translates into an mmio
>>      region
>>
>>   hw/arm/virt.c                                 | 115 +++-
>>   hw/vfio/common.c                              |   7 +-
>>   hw/virtio/Makefile.objs                       |   1 +
>>   hw/virtio/trace-events                        |  24 +
>>   hw/virtio/virtio-iommu.c                      | 923
>> ++++++++++++++++++++++++++
>>   include/exec/memory.h                         |   4 +
>>   include/hw/arm/virt.h                         |   5 +
>>   include/hw/vfio/vfio-common.h                 |   1 +
>>   include/hw/virtio/virtio-iommu.h              |  61 ++
>>   include/standard-headers/linux/virtio_ids.h   |   1 +
>>   include/standard-headers/linux/virtio_iommu.h | 177 +++++
>>   linux-headers/linux/virtio_iommu.h            |   1 +
>>   scripts/update-linux-headers.sh               |   3 +
>>   13 files changed, 1312 insertions(+), 11 deletions(-)
>>   create mode 100644 hw/virtio/virtio-iommu.c
>>   create mode 100644 include/hw/virtio/virtio-iommu.h
>>   create mode 100644 include/standard-headers/linux/virtio_iommu.h
>>   create mode 100644 linux-headers/linux/virtio_iommu.h
>>
>
Peter Maydell Oct. 11, 2017, 2:56 p.m. UTC | #3
On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
> This series implements the virtio-iommu device.
>
> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
> - probe request support although no reserved region is returned at
>   the moment
> - unmap semantics less strict, as specified in v0.4
> - device registration, attach/detach revisited
> - split into smaller patches to ease review
> - propose a way to inform the IOMMU mr about the page_size_mask
>   of underlying HW IOMMU, if any
> - remove warning associated with the translation of the MSI doorbell
>
> The device gets instantiated using the "-device virtio-iommu-device"
> option. It currently works with ARM virt machine only, as the machine
> must handle the dt binding between the virtio-mmio "iommu" node and
> the PCI host bridge node.

Could this work on x86, or is it inherently arm-only?

thanks
-- PMM
Eric Auger Oct. 11, 2017, 4:08 p.m. UTC | #4
Hi Peter,

On 11/10/2017 16:56, Peter Maydell wrote:
> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
>> This series implements the virtio-iommu device.
>>
>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>> - probe request support although no reserved region is returned at
>>   the moment
>> - unmap semantics less strict, as specified in v0.4
>> - device registration, attach/detach revisited
>> - split into smaller patches to ease review
>> - propose a way to inform the IOMMU mr about the page_size_mask
>>   of underlying HW IOMMU, if any
>> - remove warning associated with the translation of the MSI doorbell
>>
>> The device gets instantiated using the "-device virtio-iommu-device"
>> option. It currently works with ARM virt machine only, as the machine
>> must handle the dt binding between the virtio-mmio "iommu" node and
>> the PCI host bridge node.
> 
> Could this work on x86, or is it inherently arm-only?

Yes this is the goal. At the moment the ACPI probing is not yet properly
specified but a Q35 prototype was developed in the Red Hat Virt team.
This will be presented at the KVM forum.

Thanks

Eric
> 
> thanks
> -- PMM
>
Peter Maydell Oct. 12, 2017, 9:54 a.m. UTC | #5
On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:
> Hi Peter,
>
> On 11/10/2017 16:56, Peter Maydell wrote:
>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
>>> This series implements the virtio-iommu device.
>>>
>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>>> - probe request support although no reserved region is returned at
>>>   the moment
>>> - unmap semantics less strict, as specified in v0.4
>>> - device registration, attach/detach revisited
>>> - split into smaller patches to ease review
>>> - propose a way to inform the IOMMU mr about the page_size_mask
>>>   of underlying HW IOMMU, if any
>>> - remove warning associated with the translation of the MSI doorbell
>>>
>>> The device gets instantiated using the "-device virtio-iommu-device"
>>> option. It currently works with ARM virt machine only, as the machine
>>> must handle the dt binding between the virtio-mmio "iommu" node and
>>> the PCI host bridge node.
>>
>> Could this work on x86, or is it inherently arm-only?
>
> Yes this is the goal. At the moment the ACPI probing is not yet properly
> specified but a Q35 prototype was developed in the Red Hat Virt team.
> This will be presented at the KVM forum.

Since I have very little familiarity with virtio or iommu code,
I'd be much happier if this was reviewed as a generic virtio-iommu
by the x86/virtio devs and then the arm specific parts done second...

I'm also not clear on what we're expecting the recommended or normal
way to do device passthrough is going to be -- this virtio-mmio,
or presenting the guest with an SMMUv3 interface? Do we really
need to implement both ?

thanks
-- PMM
Eric Auger Oct. 12, 2017, 10:09 a.m. UTC | #6
Hi Peter,

On 12/10/2017 11:54, Peter Maydell wrote:
> On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:
>> Hi Peter,
>>
>> On 11/10/2017 16:56, Peter Maydell wrote:
>>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
>>>> This series implements the virtio-iommu device.
>>>>
>>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>>>> - probe request support although no reserved region is returned at
>>>>   the moment
>>>> - unmap semantics less strict, as specified in v0.4
>>>> - device registration, attach/detach revisited
>>>> - split into smaller patches to ease review
>>>> - propose a way to inform the IOMMU mr about the page_size_mask
>>>>   of underlying HW IOMMU, if any
>>>> - remove warning associated with the translation of the MSI doorbell
>>>>
>>>> The device gets instantiated using the "-device virtio-iommu-device"
>>>> option. It currently works with ARM virt machine only, as the machine
>>>> must handle the dt binding between the virtio-mmio "iommu" node and
>>>> the PCI host bridge node.
>>>
>>> Could this work on x86, or is it inherently arm-only?
>>
>> Yes this is the goal. At the moment the ACPI probing is not yet properly
>> specified but a Q35 prototype was developed in the Red Hat Virt team.
>> This will be presented at the KVM forum.
> 
> Since I have very little familiarity with virtio or iommu code,
> I'd be much happier if this was reviewed as a generic virtio-iommu
> by the x86/virtio devs and then the arm specific parts done second...

Understood. I was rather expecting you to review the smmuv3 emulation
code which you did, in a comprehensive manner ;-), and many thanks for that.

Note sure this is time yet to get this RFC reviewed as
- the v0.4 virtio-iommu driver it relies on was not officially submitted,
- the virtio-iommu specification review has not really been reviewed,
- the ACPI probing method has not been discussed yet.

Jean-Philippe, please correct me if I am wrong.

So to me, this is pure RFC at the moment.
> 
> I'm also not clear on what we're expecting the recommended or normal
> way to do device passthrough is going to be -- this virtio-mmio,
> or presenting the guest with an SMMUv3 interface? Do we really
> need to implement both ?

I think the KVM forum is the right place to sync as both approaches will
be presented and some pros/cons + performance figures will be given.

As we talk about choosing, there is one alternative that was suggested
on the ML by Alex & Michael but never really get considered yet and
maybe should be: using intel iommu emulation code for ARM. I aknowledge
this deserves a thorough impact study on kernel and FW side but I would
be happy to get your opinion about the QEMU side. Would you have a by-
principle rejection of this idea to instantiate such an Intel device in
mach virt or would it be something you would be ready to consider?

Thanks

Eric


> 
> thanks
> -- PMM
>
Jean-Philippe Brucker Oct. 12, 2017, 10:46 a.m. UTC | #7
On 12/10/17 11:09, Auger Eric wrote:
> Hi Peter,
> 
> On 12/10/2017 11:54, Peter Maydell wrote:
>> On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:
>>> Hi Peter,
>>>
>>> On 11/10/2017 16:56, Peter Maydell wrote:
>>>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
>>>>> This series implements the virtio-iommu device.
>>>>>
>>>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>>>>> - probe request support although no reserved region is returned at
>>>>>   the moment
>>>>> - unmap semantics less strict, as specified in v0.4
>>>>> - device registration, attach/detach revisited
>>>>> - split into smaller patches to ease review
>>>>> - propose a way to inform the IOMMU mr about the page_size_mask
>>>>>   of underlying HW IOMMU, if any
>>>>> - remove warning associated with the translation of the MSI doorbell
>>>>>
>>>>> The device gets instantiated using the "-device virtio-iommu-device"
>>>>> option. It currently works with ARM virt machine only, as the machine
>>>>> must handle the dt binding between the virtio-mmio "iommu" node and
>>>>> the PCI host bridge node.
>>>>
>>>> Could this work on x86, or is it inherently arm-only?
>>>
>>> Yes this is the goal. At the moment the ACPI probing is not yet properly
>>> specified but a Q35 prototype was developed in the Red Hat Virt team.
>>> This will be presented at the KVM forum.
>>
>> Since I have very little familiarity with virtio or iommu code,
>> I'd be much happier if this was reviewed as a generic virtio-iommu
>> by the x86/virtio devs and then the arm specific parts done second...
> 
> Understood. I was rather expecting you to review the smmuv3 emulation
> code which you did, in a comprehensive manner ;-), and many thanks for that.
> 
> Note sure this is time yet to get this RFC reviewed as
> - the v0.4 virtio-iommu driver it relies on was not officially submitted,
> - the virtio-iommu specification review has not really been reviewed,
> - the ACPI probing method has not been discussed yet.
> 
> Jean-Philippe, please correct me if I am wrong.
> 
> So to me, this is pure RFC at the moment.

Yes, having it as an RFC for the moment allowed to break compatibility
between versions of the specification. The downside is that it probably
doesn't get as many eyeballs as it would without an RFC tag (although I
did receive tonnes of helpful comments). Maybe v0.5, that should be
published shortly, can be a candidate for mainline.

Thanks,
Jean

>> I'm also not clear on what we're expecting the recommended or normal
>> way to do device passthrough is going to be -- this virtio-mmio,
>> or presenting the guest with an SMMUv3 interface? Do we really
>> need to implement both ?
> 
> I think the KVM forum is the right place to sync as both approaches will
> be presented and some pros/cons + performance figures will be given.
> 
> As we talk about choosing, there is one alternative that was suggested
> on the ML by Alex & Michael but never really get considered yet and
> maybe should be: using intel iommu emulation code for ARM. I aknowledge
> this deserves a thorough impact study on kernel and FW side but I would
> be happy to get your opinion about the QEMU side. Would you have a by-
> principle rejection of this idea to instantiate such an Intel device in
> mach virt or would it be something you would be ready to consider?
> 
> Thanks
> 
> Eric
> 
> 
>>
>> thanks
>> -- PMM
>>
Tian, Kevin Oct. 13, 2017, 7:01 a.m. UTC | #8
> From: Auger Eric [mailto:eric.auger@redhat.com]

> Sent: Thursday, October 12, 2017 6:10 PM

> 

> Hi Peter,

> 

> On 12/10/2017 11:54, Peter Maydell wrote:

> > On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:

> >> Hi Peter,

> >>

> >> On 11/10/2017 16:56, Peter Maydell wrote:

> >>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com>

> wrote:

> >>>> This series implements the virtio-iommu device.

> >>>>

> >>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.

> >>>> - probe request support although no reserved region is returned at

> >>>>   the moment

> >>>> - unmap semantics less strict, as specified in v0.4

> >>>> - device registration, attach/detach revisited

> >>>> - split into smaller patches to ease review

> >>>> - propose a way to inform the IOMMU mr about the page_size_mask

> >>>>   of underlying HW IOMMU, if any

> >>>> - remove warning associated with the translation of the MSI doorbell

> >>>>

> >>>> The device gets instantiated using the "-device virtio-iommu-device"

> >>>> option. It currently works with ARM virt machine only, as the machine

> >>>> must handle the dt binding between the virtio-mmio "iommu" node

> and

> >>>> the PCI host bridge node.

> >>>

> >>> Could this work on x86, or is it inherently arm-only?

> >>

> >> Yes this is the goal. At the moment the ACPI probing is not yet properly

> >> specified but a Q35 prototype was developed in the Red Hat Virt team.

> >> This will be presented at the KVM forum.

> >

> > Since I have very little familiarity with virtio or iommu code,

> > I'd be much happier if this was reviewed as a generic virtio-iommu

> > by the x86/virtio devs and then the arm specific parts done second...

> 

> Understood. I was rather expecting you to review the smmuv3 emulation

> code which you did, in a comprehensive manner ;-), and many thanks for

> that.

> 

> Note sure this is time yet to get this RFC reviewed as

> - the v0.4 virtio-iommu driver it relies on was not officially submitted,

> - the virtio-iommu specification review has not really been reviewed,

> - the ACPI probing method has not been discussed yet.

> 

> Jean-Philippe, please correct me if I am wrong.

> 

> So to me, this is pure RFC at the moment.

> >

> > I'm also not clear on what we're expecting the recommended or normal

> > way to do device passthrough is going to be -- this virtio-mmio,

> > or presenting the guest with an SMMUv3 interface? Do we really

> > need to implement both ?

> 

> I think the KVM forum is the right place to sync as both approaches will

> be presented and some pros/cons + performance figures will be given.

> 

> As we talk about choosing, there is one alternative that was suggested

> on the ML by Alex & Michael but never really get considered yet and

> maybe should be: using intel iommu emulation code for ARM. I

> aknowledge

> this deserves a thorough impact study on kernel and FW side but I would

> be happy to get your opinion about the QEMU side. Would you have a by-

> principle rejection of this idea to instantiate such an Intel device in

> mach virt or would it be something you would be ready to consider?

> 


bear posting a link to Alex/Michael's comment? interesting to know
the rationale...

Thanks
Kevin
Eric Auger Oct. 13, 2017, 7:43 a.m. UTC | #9
Hi Kevin,

On 13/10/2017 09:01, Tian, Kevin wrote:
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: Thursday, October 12, 2017 6:10 PM
>>
>> Hi Peter,
>>
>> On 12/10/2017 11:54, Peter Maydell wrote:
>>> On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:
>>>> Hi Peter,
>>>>
>>>> On 11/10/2017 16:56, Peter Maydell wrote:
>>>>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com>
>> wrote:
>>>>>> This series implements the virtio-iommu device.
>>>>>>
>>>>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>>>>>> - probe request support although no reserved region is returned at
>>>>>>   the moment
>>>>>> - unmap semantics less strict, as specified in v0.4
>>>>>> - device registration, attach/detach revisited
>>>>>> - split into smaller patches to ease review
>>>>>> - propose a way to inform the IOMMU mr about the page_size_mask
>>>>>>   of underlying HW IOMMU, if any
>>>>>> - remove warning associated with the translation of the MSI doorbell
>>>>>>
>>>>>> The device gets instantiated using the "-device virtio-iommu-device"
>>>>>> option. It currently works with ARM virt machine only, as the machine
>>>>>> must handle the dt binding between the virtio-mmio "iommu" node
>> and
>>>>>> the PCI host bridge node.
>>>>>
>>>>> Could this work on x86, or is it inherently arm-only?
>>>>
>>>> Yes this is the goal. At the moment the ACPI probing is not yet properly
>>>> specified but a Q35 prototype was developed in the Red Hat Virt team.
>>>> This will be presented at the KVM forum.
>>>
>>> Since I have very little familiarity with virtio or iommu code,
>>> I'd be much happier if this was reviewed as a generic virtio-iommu
>>> by the x86/virtio devs and then the arm specific parts done second...
>>
>> Understood. I was rather expecting you to review the smmuv3 emulation
>> code which you did, in a comprehensive manner ;-), and many thanks for
>> that.
>>
>> Note sure this is time yet to get this RFC reviewed as
>> - the v0.4 virtio-iommu driver it relies on was not officially submitted,
>> - the virtio-iommu specification review has not really been reviewed,
>> - the ACPI probing method has not been discussed yet.
>>
>> Jean-Philippe, please correct me if I am wrong.
>>
>> So to me, this is pure RFC at the moment.
>>>
>>> I'm also not clear on what we're expecting the recommended or normal
>>> way to do device passthrough is going to be -- this virtio-mmio,
>>> or presenting the guest with an SMMUv3 interface? Do we really
>>> need to implement both ?
>>
>> I think the KVM forum is the right place to sync as both approaches will
>> be presented and some pros/cons + performance figures will be given.
>>
>> As we talk about choosing, there is one alternative that was suggested
>> on the ML by Alex & Michael but never really get considered yet and
>> maybe should be: using intel iommu emulation code for ARM. I
>> aknowledge
>> this deserves a thorough impact study on kernel and FW side but I would
>> be happy to get your opinion about the QEMU side. Would you have a by-
>> principle rejection of this idea to instantiate such an Intel device in
>> mach virt or would it be something you would be ready to consider?
>>
> 
> bear posting a link to Alex/Michael's comment? interesting to know
> the rationale...

This was suggested here for instance:

https://lkml.org/lkml/2017/7/12/579

I think rationale was
- this is an emulated platform so there is more freedom
- vtd emulation code is rather stable
- it has pieces missing on smmu: cachine mode, IOTLB invalidation
command with addr_mask,
- intel iommu driver implements deferred IOTLB Invalidation which boosts
perf

But problems I foresee are:
- MSI handling. ARM MSI doorbells can be anywhere in the GPA address
space whereas Intel has MSIs within the APIC window: [FEE0_0000h –
FEF0_000h]. So I suspect the MSI handling may not work at kernel level.
- intel IOMMU input/output address ranges and page sizes may be
different from ARM ones.
- ARM uses ACPI IORT for binding RC <-> IOMMU <-> MSI controller whereas
Intel uses other tables
- Intel's dmar kernel code must be compiled/enabled on ARM

So personally I don't think this solution is viable but I prefer this
gets discussed on the ML.

Thanks

Eric
> 
> Thanks
> Kevin
>