diff mbox

[for-2.10,3/3] qdev: defer DEVICE_DEL event until instance_finalize()

Message ID 1501119055-4060-4-git-send-email-mdroth@linux.vnet.ibm.com
State New
Headers show

Commit Message

Michael Roth July 27, 2017, 1:30 a.m. UTC
DEVICE_DEL is currently emitted when a Device is unparented, as
opposed to when it is finalized. The main design motivation for this
seems to be that after unparent()/unrealize(), the Device is no
longer visible to the guest, and thus the operation is complete
from the perspective of management.

However, there are cases where remaining host-side cleanup is also
pertinent to management. The is generally handled by treating these
resources as aspects of the "backend", which can be managed via
separate interfaces/events, such as blockdev_add/del, netdev_add/del,
object_add/del, etc, but some devices do not have this level of
compartmentalization, namely vfio-pci, and possibly to lend themselves
well to it.

In the case of vfio-pci, the "backend" cleanup happens as part of
the finalization of the vfio-pci device itself, in particular the
cleanup of the VFIO group FD. Failing to wait for this cleanup can
result in tools like libvirt attempting to rebind the device to
the host while it's still being used by VFIO, which can result in
host crashes or other misbehavior depending on the host driver.

Deferring DEVICE_DEL still affords us the ability to manage backends
explicitly, while also addressing cases like vfio-pci's, so we
implement that approach here.

An alternative proposal involving having VFIO emit a separate event
to denote completion of host-side cleanup was discussed, but the
prevailing opinion seems to be that it is not worth the added
complexity, and leaves the issue open for other Device implementations
solve in the future.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/core/qdev.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

Comments

Greg Kurz July 31, 2017, 5:11 p.m. UTC | #1
On Wed, 26 Jul 2017 20:30:55 -0500
Michael Roth <mdroth@linux.vnet.ibm.com> wrote:

> DEVICE_DEL is currently emitted when a Device is unparented, as
> opposed to when it is finalized. The main design motivation for this
> seems to be that after unparent()/unrealize(), the Device is no
> longer visible to the guest, and thus the operation is complete
> from the perspective of management.
> 
> However, there are cases where remaining host-side cleanup is also
> pertinent to management. The is generally handled by treating these
> resources as aspects of the "backend", which can be managed via
> separate interfaces/events, such as blockdev_add/del, netdev_add/del,
> object_add/del, etc, but some devices do not have this level of
> compartmentalization, namely vfio-pci, and possibly to lend themselves
> well to it.
> 
> In the case of vfio-pci, the "backend" cleanup happens as part of
> the finalization of the vfio-pci device itself, in particular the
> cleanup of the VFIO group FD. Failing to wait for this cleanup can
> result in tools like libvirt attempting to rebind the device to
> the host while it's still being used by VFIO, which can result in
> host crashes or other misbehavior depending on the host driver.
> 
> Deferring DEVICE_DEL still affords us the ability to manage backends
> explicitly, while also addressing cases like vfio-pci's, so we
> implement that approach here.
> 
> An alternative proposal involving having VFIO emit a separate event
> to denote completion of host-side cleanup was discussed, but the
> prevailing opinion seems to be that it is not worth the added
> complexity, and leaves the issue open for other Device implementations
> solve in the future.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/core/qdev.c | 23 ++++++++++++-----------
>  1 file changed, 12 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 08c4061..d14acba 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -1067,7 +1067,6 @@ static void device_finalize(Object *obj)
>      NamedGPIOList *ngl, *next;
>  
>      DeviceState *dev = DEVICE(obj);
> -    qemu_opts_del(dev->opts);
>  
>      QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
>          QLIST_REMOVE(ngl, node);
> @@ -1078,6 +1077,18 @@ static void device_finalize(Object *obj)
>           * here
>           */
>      }
> +
> +    /* Only send event if the device had been completely realized */
> +    if (dev->pending_deleted_event) {
> +        g_assert(dev->canonical_path);
> +
> +        qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
> +                                       &error_abort);
> +        g_free(dev->canonical_path);
> +        dev->canonical_path = NULL;
> +    }
> +
> +    qemu_opts_del(dev->opts);
>  }
>  
>  static void device_class_base_init(ObjectClass *class, void *data)
> @@ -1107,16 +1118,6 @@ static void device_unparent(Object *obj)
>          object_unref(OBJECT(dev->parent_bus));
>          dev->parent_bus = NULL;
>      }
> -
> -    /* Only send event if the device had been completely realized */
> -    if (dev->pending_deleted_event) {
> -        g_assert(dev->canonical_path);
> -
> -        qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
> -                                       &error_abort);
> -        g_free(dev->canonical_path);
> -        dev->canonical_path = NULL;
> -    }
>  }
>  
>  static void device_class_init(ObjectClass *class, void *data)
Eric Auger Aug. 9, 2017, 2:04 p.m. UTC | #2
Hi Michael,

On 27/07/2017 03:30, Michael Roth wrote:
> DEVICE_DEL is currently emitted when a Device is unparented, as
> opposed to when it is finalized. The main design motivation for this
> seems to be that after unparent()/unrealize(), the Device is no
> longer visible to the guest, and thus the operation is complete
> from the perspective of management.
> 
> However, there are cases where remaining host-side cleanup is also
> pertinent to management. The is generally handled by treating these
> resources as aspects of the "backend", which can be managed via
> separate interfaces/events, such as blockdev_add/del, netdev_add/del,
> object_add/del, etc, but some devices do not have this level of
> compartmentalization, namely vfio-pci, and possibly to lend themselves
> well to it.
> 
> In the case of vfio-pci, the "backend" cleanup happens as part of
> the finalization of the vfio-pci device itself, in particular the
> cleanup of the VFIO group FD. Failing to wait for this cleanup can
> result in tools like libvirt attempting to rebind the device to
> the host while it's still being used by VFIO, which can result in
> host crashes or other misbehavior depending on the host driver.
> 
> Deferring DEVICE_DEL still affords us the ability to manage backends
> explicitly, while also addressing cases like vfio-pci's, so we
> implement that approach here.
> 
> An alternative proposal involving having VFIO emit a separate event
> to denote completion of host-side cleanup was discussed, but the
> prevailing opinion seems to be that it is not worth the added
> complexity, and leaves the issue open for other Device implementations
> solve in the future.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/core/qdev.c | 23 ++++++++++++-----------
>  1 file changed, 12 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 08c4061..d14acba 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -1067,7 +1067,6 @@ static void device_finalize(Object *obj)
>      NamedGPIOList *ngl, *next;
>  
>      DeviceState *dev = DEVICE(obj);
> -    qemu_opts_del(dev->opts);
>  
>      QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
>          QLIST_REMOVE(ngl, node);
> @@ -1078,6 +1077,18 @@ static void device_finalize(Object *obj)
>           * here
>           */
>      }
> +
> +    /* Only send event if the device had been completely realized */
> +    if (dev->pending_deleted_event) {
> +        g_assert(dev->canonical_path);
> +
> +        qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
> +                                       &error_abort);
> +        g_free(dev->canonical_path);
> +        dev->canonical_path = NULL;
> +    }
> +
> +    qemu_opts_del(dev->opts);
>  }
>  
>  static void device_class_base_init(ObjectClass *class, void *data)
> @@ -1107,16 +1118,6 @@ static void device_unparent(Object *obj)
>          object_unref(OBJECT(dev->parent_bus));
>          dev->parent_bus = NULL;
>      }
> -
> -    /* Only send event if the device had been completely realized */
> -    if (dev->pending_deleted_event) {
> -        g_assert(dev->canonical_path);
> -
> -        qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
> -                                       &error_abort);
> -        g_free(dev->canonical_path);
> -        dev->canonical_path = NULL;
> -    }
is the code below, introduced in patch 1/device_set_realized() still
relevant?
        /* always re-initialize since we clean up in device_unparent()
instead
         * of unrealize()
         */
        g_free(dev->canonical_path);

Thanks

Eric
>  }
>  
>  static void device_class_init(ObjectClass *class, void *data)
>
Michael Roth Oct. 7, 2017, 12:03 a.m. UTC | #3
Quoting Auger Eric (2017-08-09 09:04:54)
> Hi Michael,
> 
> On 27/07/2017 03:30, Michael Roth wrote:
> > DEVICE_DEL is currently emitted when a Device is unparented, as
> > opposed to when it is finalized. The main design motivation for this
> > seems to be that after unparent()/unrealize(), the Device is no
> > longer visible to the guest, and thus the operation is complete
> > from the perspective of management.
> > 
> > However, there are cases where remaining host-side cleanup is also
> > pertinent to management. The is generally handled by treating these
> > resources as aspects of the "backend", which can be managed via
> > separate interfaces/events, such as blockdev_add/del, netdev_add/del,
> > object_add/del, etc, but some devices do not have this level of
> > compartmentalization, namely vfio-pci, and possibly to lend themselves
> > well to it.
> > 
> > In the case of vfio-pci, the "backend" cleanup happens as part of
> > the finalization of the vfio-pci device itself, in particular the
> > cleanup of the VFIO group FD. Failing to wait for this cleanup can
> > result in tools like libvirt attempting to rebind the device to
> > the host while it's still being used by VFIO, which can result in
> > host crashes or other misbehavior depending on the host driver.
> > 
> > Deferring DEVICE_DEL still affords us the ability to manage backends
> > explicitly, while also addressing cases like vfio-pci's, so we
> > implement that approach here.
> > 
> > An alternative proposal involving having VFIO emit a separate event
> > to denote completion of host-side cleanup was discussed, but the
> > prevailing opinion seems to be that it is not worth the added
> > complexity, and leaves the issue open for other Device implementations
> > solve in the future.
> > 
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > ---
> >  hw/core/qdev.c | 23 ++++++++++++-----------
> >  1 file changed, 12 insertions(+), 11 deletions(-)
> > 
> > diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> > index 08c4061..d14acba 100644
> > --- a/hw/core/qdev.c
> > +++ b/hw/core/qdev.c
> > @@ -1067,7 +1067,6 @@ static void device_finalize(Object *obj)
> >      NamedGPIOList *ngl, *next;
> >  
> >      DeviceState *dev = DEVICE(obj);
> > -    qemu_opts_del(dev->opts);
> >  
> >      QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
> >          QLIST_REMOVE(ngl, node);
> > @@ -1078,6 +1077,18 @@ static void device_finalize(Object *obj)
> >           * here
> >           */
> >      }
> > +
> > +    /* Only send event if the device had been completely realized */
> > +    if (dev->pending_deleted_event) {
> > +        g_assert(dev->canonical_path);
> > +
> > +        qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
> > +                                       &error_abort);
> > +        g_free(dev->canonical_path);
> > +        dev->canonical_path = NULL;
> > +    }
> > +
> > +    qemu_opts_del(dev->opts);
> >  }
> >  
> >  static void device_class_base_init(ObjectClass *class, void *data)
> > @@ -1107,16 +1118,6 @@ static void device_unparent(Object *obj)
> >          object_unref(OBJECT(dev->parent_bus));
> >          dev->parent_bus = NULL;
> >      }
> > -
> > -    /* Only send event if the device had been completely realized */
> > -    if (dev->pending_deleted_event) {
> > -        g_assert(dev->canonical_path);
> > -
> > -        qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
> > -                                       &error_abort);
> > -        g_free(dev->canonical_path);
> > -        dev->canonical_path = NULL;
> > -    }
> is the code below, introduced in patch 1/device_set_realized() still
> relevant?
>         /* always re-initialize since we clean up in device_unparent()
> instead
>          * of unrealize()
>          */
>         g_free(dev->canonical_path);

Hi Eric,

Sorry for missing your reply previously. That comment does indeed need
some adjusting after patch 3. Will fix it up for v2.

> 
> Thanks
> 
> Eric
> >  }
> >  
> >  static void device_class_init(ObjectClass *class, void *data)
> > 
>
diff mbox

Patch

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 08c4061..d14acba 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -1067,7 +1067,6 @@  static void device_finalize(Object *obj)
     NamedGPIOList *ngl, *next;
 
     DeviceState *dev = DEVICE(obj);
-    qemu_opts_del(dev->opts);
 
     QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
         QLIST_REMOVE(ngl, node);
@@ -1078,6 +1077,18 @@  static void device_finalize(Object *obj)
          * here
          */
     }
+
+    /* Only send event if the device had been completely realized */
+    if (dev->pending_deleted_event) {
+        g_assert(dev->canonical_path);
+
+        qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
+                                       &error_abort);
+        g_free(dev->canonical_path);
+        dev->canonical_path = NULL;
+    }
+
+    qemu_opts_del(dev->opts);
 }
 
 static void device_class_base_init(ObjectClass *class, void *data)
@@ -1107,16 +1118,6 @@  static void device_unparent(Object *obj)
         object_unref(OBJECT(dev->parent_bus));
         dev->parent_bus = NULL;
     }
-
-    /* Only send event if the device had been completely realized */
-    if (dev->pending_deleted_event) {
-        g_assert(dev->canonical_path);
-
-        qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
-                                       &error_abort);
-        g_free(dev->canonical_path);
-        dev->canonical_path = NULL;
-    }
 }
 
 static void device_class_init(ObjectClass *class, void *data)