diff mbox series

[v2,2/2] intel_iommu: handle invalid ce for shadow sync

Message ID 20181008064713.27349-3-peterx@redhat.com
State New
Headers show
Series intel_iommu: handle invalid ce for shadow sync | expand

Commit Message

Peter Xu Oct. 8, 2018, 6:47 a.m. UTC
We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing
shadow page tables.  Having invalid context entry there is perfectly
valid when we move a device out of an existing domain.  When that
happens, instead of posting an error we invalidate the whole region.

Without this patch, QEMU will crash if we do these steps:

(1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe)
(2) bind the NICs with vfio-pci in the guest
(3) start testpmd with the NICs applied
(4) stop testpmd
(5) rebind the NIC back to ixgbe kernel driver

The patch should fix it.

Reported-by: Pei Zhang <pezhang@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

Comments

Eric Auger Oct. 9, 2018, 7:27 a.m. UTC | #1
Hi Peter,

On 10/8/18 8:47 AM, Peter Xu wrote:
> We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing
> shadow page tables.  Having invalid context entry there is perfectly
> valid when we move a device out of an existing domain.  When that
> happens, instead of posting an error we invalidate the whole region.
> 
> Without this patch, QEMU will crash if we do these steps:
> 
> (1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe)
> (2) bind the NICs with vfio-pci in the guest
> (3) start testpmd with the NICs applied
> (4) stop testpmd
> (5) rebind the NIC back to ixgbe kernel driver
> 
> The patch should fix it.
> 
> Reported-by: Pei Zhang <pezhang@redhat.com>
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/i386/intel_iommu.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index d6b4f8705d..b0884e87e8 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -37,6 +37,8 @@
>  #include "kvm_i386.h"
>  #include "trace.h"
>  
> +static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
> +
>  static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
>                              uint64_t wmask, uint64_t w1cmask)
>  {
> @@ -1056,12 +1058,28 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
>  {
>      int ret;
>      VTDContextEntry ce;
> +    IOMMUNotifier *n;
>  
>      ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
>                                     pci_bus_num(vtd_as->bus),
>                                     vtd_as->devfn, &ce);
>      if (ret) {
> -        return ret;
> +        if (ret == -VTD_FR_CONTEXT_ENTRY_P) {
> +            /*
> +             * It's a valid scenario to have a context entry that is
> +             * not present.  For example, when a device is removed
> +             * from an existing domain then the context entry will be
> +             * zeroed by the guest before it was put into another
> +             * domain.  When this happens, instead of synchronizing
> +             * the shadow pages we should invalidate all existing
> +             * mappings and notify the backends.
> +             */
> +            IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
> +                vtd_address_space_unmap(vtd_as, n);
> +            }
don't you want to return here also in this case?

Thanks

Eric
> +        } else {
> +            return ret;
> +        }
>      }
>  
>      return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX);
>
Peter Xu Oct. 9, 2018, 7:38 a.m. UTC | #2
On Tue, Oct 09, 2018 at 09:27:15AM +0200, Auger Eric wrote:

[...]

> > @@ -1056,12 +1058,28 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
> >  {
> >      int ret;
> >      VTDContextEntry ce;
> > +    IOMMUNotifier *n;
> >  
> >      ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
> >                                     pci_bus_num(vtd_as->bus),
> >                                     vtd_as->devfn, &ce);
> >      if (ret) {
> > -        return ret;
> > +        if (ret == -VTD_FR_CONTEXT_ENTRY_P) {
> > +            /*
> > +             * It's a valid scenario to have a context entry that is
> > +             * not present.  For example, when a device is removed
> > +             * from an existing domain then the context entry will be
> > +             * zeroed by the guest before it was put into another
> > +             * domain.  When this happens, instead of synchronizing
> > +             * the shadow pages we should invalidate all existing
> > +             * mappings and notify the backends.
> > +             */
> > +            IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
> > +                vtd_address_space_unmap(vtd_as, n);
> > +            }
> don't you want to return here also in this case?

Yes... a stupid mistake!

I'll repost.  Thank you,
diff mbox series

Patch

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index d6b4f8705d..b0884e87e8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -37,6 +37,8 @@ 
 #include "kvm_i386.h"
 #include "trace.h"
 
+static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
+
 static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
                             uint64_t wmask, uint64_t w1cmask)
 {
@@ -1056,12 +1058,28 @@  static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
 {
     int ret;
     VTDContextEntry ce;
+    IOMMUNotifier *n;
 
     ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
                                    pci_bus_num(vtd_as->bus),
                                    vtd_as->devfn, &ce);
     if (ret) {
-        return ret;
+        if (ret == -VTD_FR_CONTEXT_ENTRY_P) {
+            /*
+             * It's a valid scenario to have a context entry that is
+             * not present.  For example, when a device is removed
+             * from an existing domain then the context entry will be
+             * zeroed by the guest before it was put into another
+             * domain.  When this happens, instead of synchronizing
+             * the shadow pages we should invalidate all existing
+             * mappings and notify the backends.
+             */
+            IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
+                vtd_address_space_unmap(vtd_as, n);
+            }
+        } else {
+            return ret;
+        }
     }
 
     return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX);