Message ID: 1585542301-84087-1-git-send-email-yi.l.liu@intel.com
Series: intel_iommu: expose Shared Virtual Addressing to VMs
Patchew URL: https://patchew.org/QEMU/1585542301-84087-1-git-send-email-yi.l.liu@intel.com/

Hi,

This series failed the docker-mingw@fedora build test. Please find the
testing commands and their output below. If you have Docker installed,
you can probably reproduce it locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

                 from /tmp/qemu-test/src/include/hw/pci/pci_bus.h:4,
                 from /tmp/qemu-test/src/include/hw/pci-host/i440fx.h:15,
                 from /tmp/qemu-test/src/stubs/pci-host-piix.c:2:
/tmp/qemu-test/src/include/hw/iommu/host_iommu_context.h:28:10: fatal error: linux/iommu.h: No such file or directory
 #include <linux/iommu.h>
          ^~~~~~~~~~~~~~~
compilation terminated.
  CC      scsi/pr-manager-stub.o
make: *** [/tmp/qemu-test/src/rules.mak:69: stubs/pci-host-piix.o] Error 1
make: *** Waiting for unfinished jobs....
  CC      block/curl.o
Traceback (most recent call last):
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=a71cba547b0b47ef91f874b42e00f828', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-enp9m7rr/src/docker-src.2020-03-30-01.38.53.2480:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=a71cba547b0b47ef91f874b42e00f828
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-enp9m7rr/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    2m1.872s
user    0m8.422s

The full log is available at
http://patchew.org/logs/1585542301-84087-1-git-send-email-yi.l.liu@intel.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
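For context on the failure above: <linux/iommu.h> is a Linux-only header, so
any QEMU header that includes it unconditionally breaks cross-builds such as
docker-mingw@fedora. A common pattern for this class of breakage is to guard
the include on the host OS. The sketch below is illustrative only; the guard
placement and the fallback declarations are assumptions, not the fix the
series actually adopted:

    /* include/hw/iommu/host_iommu_context.h -- hypothetical guard sketch */
    #ifdef CONFIG_LINUX          /* set by QEMU's build system on Linux hosts */
    #include <linux/iommu.h>     /* imported into linux-headers/ by this series */
    #else
    /*
     * Fallback for non-Linux builds (e.g. mingw): the vSVA feature is
     * Linux/VFIO-only, so forward declarations are enough to keep
     * unrelated objects such as stubs/pci-host-piix.o compiling.
     */
    struct iommu_gpasid_bind_data;
    struct iommu_cache_invalidate_info;
    #endif

The struct names are taken from the proposed IOMMU uAPI the series imports;
whether forward declarations suffice depends on how the header uses them.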
Hi Yi,

On 3/30/20 6:24 AM, Liu Yi L wrote:
> Shared Virtual Addressing (SVA), a.k.a. Shared Virtual Memory (SVM) on
> Intel platforms, allows address space sharing between device DMA and
> applications. SVA can reduce programming complexity and enhance security.
>
> This QEMU series is intended to expose SVA usage to VMs, i.e. sharing
> guest application address space with passthru devices. This is called
> vSVA in this series. The whole vSVA enabling requires QEMU/VFIO/IOMMU
> changes.
>
> The high-level architecture for SVA virtualization is shown below; the key
> design of vSVA support is to utilize the dual-stage IOMMU translation
> (also known as IOMMU nesting translation) capability in the host IOMMU.
>
>     .-------------.  .---------------------------.
>     |   vIOMMU    |  | Guest process CR3, FL only|
>     |             |  '---------------------------'
>     .----------------/
>     | PASID Entry |--- PASID cache flush -
>     '-------------'                       |
>     |             |                       V
>     |             |                CR3 in GPA
>     '-------------'
> Guest
> ------| Shadow |--------------------------|--------
>       v        v                          v
> Host
>     .-------------.  .----------------------.
>     |   pIOMMU    |  | Bind FL for GVA-GPA  |
>     |             |  '----------------------'
>     .----------------/  |
>     | PASID Entry |     V (Nested xlate)
>     '----------------\.------------------------------.
>     |             |   |SL for GPA-HPA, default domain|
>     |             |   '------------------------------'
>     '-------------'
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables
>
> The complete vSVA kernel upstream patches are divided into three phases:
>     1. Common APIs and PCI device direct assignment
>     2. IOMMU-backed Mediated Device assignment
>     3. Page Request Services (PRS) support
>
> This QEMU patchset is aiming for phase 1 and phase 2. It is based
> on the two kernel series below.
> [1] [PATCH V10 00/11] Nested Shared Virtual Address (SVA) VT-d support:
>     https://lkml.org/lkml/2020/3/20/1172
> [2] [PATCH v1 0/8] vfio: expose virtual Shared Virtual Addressing to VMs
>     https://lkml.org/lkml/2020/3/22/116

+ [PATCH v2 0/3] IOMMU user API enhancement, right?

I think in general, as long as the kernel dependencies are not resolved,
the QEMU series is supposed to stay in RFC state.

Thanks

Eric

> There are roughly two parts:
> 1. Introduce HostIOMMUContext as an abstraction of the host IOMMU. It
>    provides explicit methods for vIOMMU emulators to communicate with the
>    host IOMMU, e.g. propagate guest page table bindings to the host IOMMU
>    to set up dual-stage DMA translation in the host IOMMU, and flush the
>    IOMMU iotlb.
> 2. Set up dual-stage IOMMU translation for the Intel vIOMMU. This includes:
>    - Check IOMMU uAPI version compatibility and VFIO nesting capabilities,
>      which include hardware compatibility (stage-1 format) and
>      VFIO_PASID_REQ availability. This is preparation for setting up
>      dual-stage DMA translation in the host IOMMU.
>    - Propagate guest PASID allocation and free requests to the host.
>    - Propagate guest page table bindings to the host to set up dual-stage
>      IOMMU DMA translation in the host IOMMU.
>    - Propagate guest IOMMU cache invalidations to the host to ensure iotlb
>      correctness.
>
> The complete QEMU set can be found in the below link:
> https://github.com/luxis1999/qemu.git: sva_vtd_v10_v2
>
> The complete kernel can be found in:
> https://github.com/luxis1999/linux-vsva.git: vsva-linux-5.6-rc6
>
> Tests: basic vSVA functionality test, VM reboot/shutdown/crash, kernel
> build in guest, boot VM with vSVA disabled, full compilation with all
> archs.
>
> Regards,
> Yi Liu
>
> Changelog:
> - Patch v1 -> Patch v2:
>   a) Refactor the vfio HostIOMMUContext init code (patches 0008 - 0009 of
>      the v1 series)
>   b) Refactor the pasid binding handling (patches 0011 - 0016 of the v1
>      series)
>   Patch v1: https://patchwork.ozlabs.org/cover/1259648/
>
> - RFC v3.1 -> Patch v1:
>   a) Implement HostIOMMUContext in QOM manner.
>   b) Add pci_set/unset_iommu_context() to register HostIOMMUContext with
>      the vIOMMU, so that the lifecycle of HostIOMMUContext is known to the
>      vIOMMU side. In such a way, the vIOMMU can use the methods provided
>      by HostIOMMUContext safely.
>   c) Add back patch "[RFC v3 01/25] hw/pci: modify pci_setup_iommu() to
>      set PCIIOMMUOps"
>   RFCv3.1: https://patchwork.kernel.org/cover/11397879/
>
> - RFC v3 -> v3.1:
>   a) Drop IOMMUContext, and rename DualStageIOMMUObject to
>      HostIOMMUContext. HostIOMMUContext is per-vfio-container; it is
>      exposed to the vIOMMU via the PCI layer. VFIO registers a
>      PCIHostIOMMUFunc callback with the PCI layer, through which the
>      vIOMMU can get the HostIOMMUContext instance.
>   b) Check the IOMMU uAPI version by VFIO_CHECK_EXTENSION.
>   c) Add a check on VFIO_PASID_REQ availability via VFIO_IOMMU_GET_INFO.
>   d) Reorder the series: put the vSVA linux header file update at the
>      beginning and the x-scalable-mode option modification at the end of
>      the series.
>   e) Dropped patch "[RFC v3 01/25] hw/pci: modify pci_setup_iommu() to
>      set PCIIOMMUOps"
>   RFCv3: https://patchwork.kernel.org/cover/11356033/
>
> - RFC v2 -> v3:
>   a) Introduce DualStageIOMMUObject to abstract the host IOMMU programming
>      capability, e.g. request PASIDs from the host, set up IOMMU nesting
>      translation on the host IOMMU. The
>      pasid_alloc/bind_guest_page_table/iommu_cache_flush operations are
>      moved to be DualStageIOMMUOps. Thus, DualStageIOMMUObject is an
>      abstraction layer which provides QEMU vIOMMU emulators with an
>      explicit method to program the host IOMMU.
>   b) Compared with RFC v2, the IOMMUContext has also been updated. It is
>      modified to provide an abstraction for vIOMMU emulators. It provides
>      the method for pass-through modules (like VFIO) to communicate with
>      the host IOMMU, e.g. tell vIOMMU emulators about the IOMMU nesting
>      capability on the host side and report host IOMMU DMA translation
>      faults to vIOMMU emulators.
>   RFC v2: https://www.spinics.net/lists/kvm/msg198556.html
>
> - RFC v1 -> v2:
>   Introduce IOMMUContext to abstract the connection between VFIO
>   and vIOMMU emulators, as a replacement of the PCIPASIDOps
>   in RFC v1. Modify x-scalable-mode to be a string option instead of
>   adding a new option as RFC v1 did. Refine the pasid cache management
>   and address the TODOs mentioned in RFC v1.
>   RFC v1: https://patchwork.kernel.org/cover/11033657/
>
> Eric Auger (1):
>   scripts/update-linux-headers: Import iommu.h
>
> Liu Yi L (21):
>   header file update VFIO/IOMMU vSVA APIs
>   vfio: check VFIO_TYPE1_NESTING_IOMMU support
>   hw/iommu: introduce HostIOMMUContext
>   hw/pci: modify pci_setup_iommu() to set PCIIOMMUOps
>   hw/pci: introduce pci_device_set/unset_iommu_context()
>   intel_iommu: add set/unset_iommu_context callback
>   vfio/common: provide PASID alloc/free hooks
>   vfio/common: init HostIOMMUContext per-container
>   vfio/pci: set host iommu context to vIOMMU
>   intel_iommu: add virtual command capability support
>   intel_iommu: process PASID cache invalidation
>   intel_iommu: add PASID cache management infrastructure
>   vfio: add bind stage-1 page table support
>   intel_iommu: bind/unbind guest page table to host
>   intel_iommu: replay pasid binds after context cache invalidation
>   intel_iommu: do not pass down pasid bind for PASID #0
>   vfio: add support for flush iommu stage-1 cache
>   intel_iommu: process PASID-based iotlb invalidation
>   intel_iommu: propagate PASID-based iotlb invalidation to host
>   intel_iommu: process PASID-based Device-TLB invalidation
>   intel_iommu: modify x-scalable-mode to be string option
>
>  hw/Makefile.objs                      |    1 +
>  hw/alpha/typhoon.c                    |    6 +-
>  hw/arm/smmu-common.c                  |    6 +-
>  hw/hppa/dino.c                        |    6 +-
>  hw/i386/amd_iommu.c                   |    6 +-
>  hw/i386/intel_iommu.c                 | 1109 ++++++++++++++++++++++++++++++++-
>  hw/i386/intel_iommu_internal.h        |  114 ++++
>  hw/i386/trace-events                  |    6 +
>  hw/iommu/Makefile.objs                |    1 +
>  hw/iommu/host_iommu_context.c         |  161 +++++
>  hw/pci-host/designware.c              |    6 +-
>  hw/pci-host/pnv_phb3.c                |    6 +-
>  hw/pci-host/pnv_phb4.c                |    6 +-
>  hw/pci-host/ppce500.c                 |    6 +-
>  hw/pci-host/prep.c                    |    6 +-
>  hw/pci-host/sabre.c                   |    6 +-
>  hw/pci/pci.c                          |   53 +-
>  hw/ppc/ppc440_pcix.c                  |    6 +-
>  hw/ppc/spapr_pci.c                    |    6 +-
>  hw/s390x/s390-pci-bus.c               |    8 +-
>  hw/vfio/common.c                      |  260 +++++++-
>  hw/vfio/pci.c                         |   13 +
>  hw/virtio/virtio-iommu.c              |    6 +-
>  include/hw/i386/intel_iommu.h         |   57 +-
>  include/hw/iommu/host_iommu_context.h |  116 ++++
>  include/hw/pci/pci.h                  |   18 +-
>  include/hw/pci/pci_bus.h              |    2 +-
>  include/hw/vfio/vfio-common.h         |    4 +
>  linux-headers/linux/iommu.h           |  378 +++++++++++
>  linux-headers/linux/vfio.h            |  127 ++++
>  scripts/update-linux-headers.sh       |    2 +-
>  31 files changed, 2463 insertions(+), 45 deletions(-)
>  create mode 100644 hw/iommu/Makefile.objs
>  create mode 100644 hw/iommu/host_iommu_context.c
>  create mode 100644 include/hw/iommu/host_iommu_context.h
>  create mode 100644 linux-headers/linux/iommu.h
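To make the HostIOMMUContext layering in the cover letter concrete, here is a
minimal C sketch of what such a per-container ops table can look like. All
names and signatures below are guesses pieced together from the patch titles
(PASID alloc/free hooks, bind stage-1 page table, flush stage-1 cache), not
the actual definitions from the series:

    /* Hypothetical sketch -- not the series' real code. */
    #include <stdint.h>

    typedef struct HostIOMMUContext HostIOMMUContext;

    typedef struct HostIOMMUOps {
        /* Ask the host (via VFIO) to allocate a PASID within [min, max]. */
        int (*pasid_alloc)(HostIOMMUContext *ctx, uint32_t min, uint32_t max,
                           uint32_t *pasid);
        /* Return a guest-released PASID to the host pool. */
        int (*pasid_free)(HostIOMMUContext *ctx, uint32_t pasid);
        /* Bind a guest (stage-1/FL) page table so the host IOMMU performs
         * nested GVA->GPA->HPA translation. */
        int (*bind_stage1_pgtbl)(HostIOMMUContext *ctx, void *bind_data);
        /* Propagate a guest cache invalidation to the host IOMMU iotlb. */
        int (*flush_stage1_cache)(HostIOMMUContext *ctx, void *inv_data);
    } HostIOMMUOps;

    struct HostIOMMUContext {
        const HostIOMMUOps *ops;   /* installed by VFIO, one per container */
    };

The value of the indirection is that the vIOMMU emulator (intel_iommu) only
calls ops methods; it never needs to know that a VFIO container, ioctls, and
a particular kernel uAPI version sit underneath.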
On Mon, Mar 30, 2020 at 12:36:23PM +0200, Auger Eric wrote:
> I think in general, as long as the kernel dependencies are not resolved,
> the QEMU series is supposed to stay in RFC state.

Yeah I agree.  I think the subject is not extremely important, but we
definitely should wait for the kernel part to be ready before merging
the series.

Side note: I offered quite a few r-bs for the series (and I still plan to
move on reading it this week since there's a new version, and try to offer
more r-bs when I still have some context in my brain-cache), however
they're mostly only for myself to avoid re-reading the whole series again
in the future, especially because it's huge... :)

Thanks,
Hi Eric,

> From: Peter Xu <peterx@redhat.com>
> Sent: Monday, March 30, 2020 10:47 PM
> To: Auger Eric <eric.auger@redhat.com>
> Subject: Re: [PATCH v2 00/22] intel_iommu: expose Shared Virtual Addressing to VMs
>
> On Mon, Mar 30, 2020 at 12:36:23PM +0200, Auger Eric wrote:
> > I think in general, as long as the kernel dependencies are not
> > resolved, the QEMU series is supposed to stay in RFC state.
>
> Yeah I agree.  I think the subject is not extremely important, but we
> definitely should wait for the kernel part to be ready before merging
> the series.
>
> Side note: I offered quite a few r-bs for the series (and I still plan to
> move on reading it this week since there's a new version, and try to offer
> more r-bs when I still have some context in my brain-cache), however
> they're mostly only for myself to avoid re-reading the whole series again
> in the future, especially because it's huge... :)

Agreed. I'll rename the next version as RFCv6 then.

BTW, although there is a dependency on the kernel side, I think we could
still reach agreement on the interaction mechanism between VFIO and the
vIOMMU within QEMU. Also, the VT-d specific changes (e.g. the PASID cache
invalidation patches and the PASID-based iotlb invalidations) can actually
be made ready now, as they have no dependency on kernel-side changes.
Please help review. :-)

Regards,
Yi Liu
On 2020/3/30 12:24 PM, Liu Yi L wrote:
> Shared Virtual Addressing (SVA), a.k.a. Shared Virtual Memory (SVM) on
> Intel platforms, allows address space sharing between device DMA and
> applications. SVA can reduce programming complexity and enhance security.
[...]
> The complete QEMU set can be found in the below link:
> https://github.com/luxis1999/qemu.git: sva_vtd_v10_v2

Hi Yi:

I could not find the branch there.

Thanks
On Thu, Apr 02, 2020 at 04:33:02PM +0800, Jason Wang wrote:
> > The complete QEMU set can be found in the below link:
> > https://github.com/luxis1999/qemu.git: sva_vtd_v10_v2
>
> Hi Yi:
>
> I could not find the branch there.

Jason,

He typed it wrong... It's actually (I found it myself):

https://github.com/luxis1999/qemu/tree/sva_vtd_v10_qemu_v2
On Sun, Mar 29, 2020 at 09:24:39PM -0700, Liu Yi L wrote:
> Tests: basic vSVA functionality test,

Could you elaborate on what the functionality test is?  Does it contain
at least some I/Os that go through the SVA-capable device, so that the
nested page table is used?  I thought it was a yes, but after I noticed
that the BIND message flags seem to be wrong, I really think I should ask
this out loud..

> VM reboot/shutdown/crash,

What's the VM crash test?

> kernel build in
> guest, boot VM with vSVA disabled, full compilation with all archs.

I believe I've said similar things, but... I'd appreciate it if you could
also smoke-test 2nd-level-only translation with the series applied.

Thanks,
On 2020/4/2 9:46 PM, Peter Xu wrote:
> On Thu, Apr 02, 2020 at 04:33:02PM +0800, Jason Wang wrote:
> > > The complete QEMU set can be found in the below link:
> > > https://github.com/luxis1999/qemu.git: sva_vtd_v10_v2
> >
> > Hi Yi:
> >
> > I could not find the branch there.
>
> Jason,
>
> He typed it wrong... It's actually (I found it myself):
>
> https://github.com/luxis1999/qemu/tree/sva_vtd_v10_qemu_v2

Aha, I see. Thanks.
> From: Peter Xu <peterx@redhat.com>
> Sent: Thursday, April 2, 2020 9:46 PM
> To: Jason Wang <jasowang@redhat.com>
> Subject: Re: [PATCH v2 00/22] intel_iommu: expose Shared Virtual Addressing to VMs
>
> On Thu, Apr 02, 2020 at 04:33:02PM +0800, Jason Wang wrote:
> > > The complete QEMU set can be found in the below link:
> > > https://github.com/luxis1999/qemu.git: sva_vtd_v10_v2
> >
> > Hi Yi:
> >
> > I could not find the branch there.
>
> Jason,
>
> He typed it wrong... It's actually (I found it myself):
>
> https://github.com/luxis1999/qemu/tree/sva_vtd_v10_qemu_v2

Thanks, really a silly typo.

Regards,
Yi Liu
> From: Peter Xu <peterx@redhat.com>
> Sent: Friday, April 3, 2020 2:13 AM
> To: Liu, Yi L <yi.l.liu@intel.com>
> Subject: Re: [PATCH v2 00/22] intel_iommu: expose Shared Virtual Addressing to VMs
>
> On Sun, Mar 29, 2020 at 09:24:39PM -0700, Liu Yi L wrote:
> > Tests: basic vSVA functionality test,
>
> Could you elaborate on what the functionality test is?  Does it contain
> at least some I/Os that go through the SVA-capable device, so that the
> nested page table is used?  I thought it was a yes, but after I noticed
> that the BIND message flags seem to be wrong, I really think I should
> ask this out loud..

As just replied, only the SRE bit is used in the verification, so the issue
was not spotted. In my functionality test, I passed through an SVA-capable
device and issued SVA transactions.

> > VM reboot/shutdown/crash,
>
> What's the VM crash test?

It's ctrl+c to kill the VM.

> > kernel build in
> > guest, boot VM with vSVA disabled, full compilation with all archs.
>
> I believe I've said similar things, but... I'd appreciate it if you could
> also smoke-test 2nd-level-only translation with the series applied.

Yeah, you mean the legacy case; I booted with such a config.

Regards,
Yi Liu
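As a footnote to the testing discussion: the series' "check
VFIO_TYPE1_NESTING_IOMMU support" step boils down to a VFIO_CHECK_EXTENSION
query on the container. Below is a simplified standalone illustration of that
probe; error handling is trimmed, and in real code the result is only
meaningful after a group is attached to the container, as QEMU's
vfio_connect_container() does:

    /* Simplified sketch of probing a VFIO container for nesting support. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    int main(void)
    {
        int container = open("/dev/vfio/vfio", O_RDWR);
        if (container < 0) {
            perror("open /dev/vfio/vfio");
            return 1;
        }

        /* VFIO_CHECK_EXTENSION returns > 0 when the IOMMU type is usable;
         * real code queries this after VFIO_GROUP_SET_CONTAINER. */
        if (ioctl(container, VFIO_CHECK_EXTENSION,
                  VFIO_TYPE1_NESTING_IOMMU) > 0) {
            printf("nesting supported: dual-stage (vSVA) setup can proceed\n");
            /* ...followed by VFIO_SET_IOMMU with VFIO_TYPE1_NESTING_IOMMU. */
        } else {
            printf("no nesting: fall back to legacy 2nd-level-only mode\n");
        }
        return 0;
    }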