Message ID | 20181008064713.27349-3-peterx@redhat.com |
---|---|
State | New |
Headers | show |
Series | intel_iommu: handle invalid ce for shadow sync | expand |
Hi Peter, On 10/8/18 8:47 AM, Peter Xu wrote: > We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing > shadow page tables. Having invalid context entry there is perfectly > valid when we move a device out of an existing domain. When that > happens, instead of posting an error we invalidate the whole region. > > Without this patch, QEMU will crash if we do these steps: > > (1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe) > (2) bind the NICs with vfio-pci in the guest > (3) start testpmd with the NICs applied > (4) stop testpmd > (5) rebind the NIC back to ixgbe kernel driver > > The patch should fix it. > > Reported-by: Pei Zhang <pezhang@redhat.com> > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272 > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > hw/i386/intel_iommu.c | 20 +++++++++++++++++++- > 1 file changed, 19 insertions(+), 1 deletion(-) > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index d6b4f8705d..b0884e87e8 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -37,6 +37,8 @@ > #include "kvm_i386.h" > #include "trace.h" > > +static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n); > + > static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val, > uint64_t wmask, uint64_t w1cmask) > { > @@ -1056,12 +1058,28 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as) > { > int ret; > VTDContextEntry ce; > + IOMMUNotifier *n; > > ret = vtd_dev_to_context_entry(vtd_as->iommu_state, > pci_bus_num(vtd_as->bus), > vtd_as->devfn, &ce); > if (ret) { > - return ret; > + if (ret == -VTD_FR_CONTEXT_ENTRY_P) { > + /* > + * It's a valid scenario to have a context entry that is > + * not present. For example, when a device is removed > + * from an existing domain then the context entry will be > + * zeroed by the guest before it was put into another > + * domain. When this happens, instead of synchronizing > + * the shadow pages we should invalidate all existing > + * mappings and notify the backends. > + */ > + IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) { > + vtd_address_space_unmap(vtd_as, n); > + } don't you want to return here also in this case? Thanks Eric > + } else { > + return ret; > + } > } > > return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX); >
On Tue, Oct 09, 2018 at 09:27:15AM +0200, Auger Eric wrote: [...] > > @@ -1056,12 +1058,28 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as) > > { > > int ret; > > VTDContextEntry ce; > > + IOMMUNotifier *n; > > > > ret = vtd_dev_to_context_entry(vtd_as->iommu_state, > > pci_bus_num(vtd_as->bus), > > vtd_as->devfn, &ce); > > if (ret) { > > - return ret; > > + if (ret == -VTD_FR_CONTEXT_ENTRY_P) { > > + /* > > + * It's a valid scenario to have a context entry that is > > + * not present. For example, when a device is removed > > + * from an existing domain then the context entry will be > > + * zeroed by the guest before it was put into another > > + * domain. When this happens, instead of synchronizing > > + * the shadow pages we should invalidate all existing > > + * mappings and notify the backends. > > + */ > > + IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) { > > + vtd_address_space_unmap(vtd_as, n); > > + } > don't you want to return here also in this case? Yes... a stupid mistake! I'll repost. Thank you,
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index d6b4f8705d..b0884e87e8 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -37,6 +37,8 @@ #include "kvm_i386.h" #include "trace.h" +static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n); + static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val, uint64_t wmask, uint64_t w1cmask) { @@ -1056,12 +1058,28 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as) { int ret; VTDContextEntry ce; + IOMMUNotifier *n; ret = vtd_dev_to_context_entry(vtd_as->iommu_state, pci_bus_num(vtd_as->bus), vtd_as->devfn, &ce); if (ret) { - return ret; + if (ret == -VTD_FR_CONTEXT_ENTRY_P) { + /* + * It's a valid scenario to have a context entry that is + * not present. For example, when a device is removed + * from an existing domain then the context entry will be + * zeroed by the guest before it was put into another + * domain. When this happens, instead of synchronizing + * the shadow pages we should invalidate all existing + * mappings and notify the backends. + */ + IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) { + vtd_address_space_unmap(vtd_as, n); + } + } else { + return ret; + } } return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX);
We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing shadow page tables. Having invalid context entry there is perfectly valid when we move a device out of an existing domain. When that happens, instead of posting an error we invalidate the whole region. Without this patch, QEMU will crash if we do these steps: (1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe) (2) bind the NICs with vfio-pci in the guest (3) start testpmd with the NICs applied (4) stop testpmd (5) rebind the NIC back to ixgbe kernel driver The patch should fix it. Reported-by: Pei Zhang <pezhang@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272 Signed-off-by: Peter Xu <peterx@redhat.com> --- hw/i386/intel_iommu.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-)