diff mbox

[RFC,02/15] qdev: store DeviceState's canonical path to use when unparenting

Message ID 55422F91.4050504@redhat.com
State New
Headers show

Commit Message

Paolo Bonzini April 30, 2015, 1:35 p.m. UTC
On 29/04/2015 21:20, Michael Roth wrote:
> If the parent is finalized as a result of object_unparent(), it
> will still be attached to the composition tree at the time any
> children are unparented as a result of that same call to
> object_unparent(). However, in some cases, object_unparent()
> will complete without finalizing the parent device, due to
> lingering references that won't be released till some time later.
> One such example is if the parent has MemoryRegion children (which
> take a ref on their parent), who in turn have AddressSpace's (which
> take a ref on their regions), since those AddressSpaces get cleaned
> up asynchronously by the RCU thread.
> 
> In this case qdev:device_unparent() may be called for a child Device
> that no longer has a path to the root/machine container, causing
> object_get_canonical_path() to assert.

This doesn't seem right.  Unparent callbacks are _not_ called when you 
finalize, they are called in post-order as soon as you unplug a device 
(the call tree is object_unparent ==> device_unparent(parent) ==> 
bus_unparent(parent->bus) ==> device_unparent(parent->bus->child[0]) 
and so on).

DEVICE_DELETED is called after a device's children have been 
unparented.  It could be called after a bus is dead though.  Could it 
be that the patch you want is simply this:


?

Paolo

Comments

Michael Roth April 30, 2015, 11:03 p.m. UTC | #1
Quoting Paolo Bonzini (2015-04-30 08:35:13)
> 
> 
> On 29/04/2015 21:20, Michael Roth wrote:
> > If the parent is finalized as a result of object_unparent(), it
> > will still be attached to the composition tree at the time any
> > children are unparented as a result of that same call to
> > object_unparent(). However, in some cases, object_unparent()
> > will complete without finalizing the parent device, due to
> > lingering references that won't be released till some time later.
> > One such example is if the parent has MemoryRegion children (which
> > take a ref on their parent), who in turn have AddressSpace's (which
> > take a ref on their regions), since those AddressSpaces get cleaned
> > up asynchronously by the RCU thread.
> > 
> > In this case qdev:device_unparent() may be called for a child Device
> > that no longer has a path to the root/machine container, causing
> > object_get_canonical_path() to assert.
> 
> This doesn't seem right.  Unparent callbacks are _not_ called when you 
> finalize, they are called in post-order as soon as you unplug a device 
> (the call tree is object_unparent ==> device_unparent(parent) ==> 
> bus_unparent(parent->bus) ==> device_unparent(parent->bus->child[0]) 
> and so on).

Hmm, well, that does seem to be the case for devices that reside on a
bus, since the post-order traversal from the parent will eventually
reach any such devices.

And for a device that gets unparented before it's parent bus, I think
the fix you posted ends up not being needed because the child is
actually a link property of the parent bus, as opposed to an actual
child property, so removing the property doesn't "disconnect" the
device from the composition tree: presumably the *actual* parent
object/container (/machine/unattached I believe?) is still around
when the DEVICE_DELETED event is sent, so it still has a canonical
path and we don't get the assert from object_get_canonical_path().

In my case though I have a couple device types (spapr_drc, and
spapr_iommu) that are direct child properties of the PHB, and
from what I can tell the clean up path for those when we do
object_unparent(phb) goes something like:

object_unparent(o):
  object_property_del_child(o->parent, o, NULL):
    object_property_del(p, prop_name):
      prop->release(p, prop_name, prop_opaque):
      | object_finalize_child_property(p, prop_name, o):
      |   o->class->unparent(o):
      |     device_unparent(o) <- (post-order clean-up, but skips
      |                            direct children like spapr_drc/spapr_iommu)
      |   o->parent = NULL
      |   object_unref(o):
      |     object_finalize(o): <- may happen asynchronously due to RCU cleanup
      |                            for AddressSpace/MemoryRegion ref holder.
      |                            object o will no longer be child prop of
      |                            o->parent. actually, this seems like it would
      |                            be the case during synchronous finalization
      |                            as well now that I look at it more closely...
      |       object_property_del_all(o)
      |         for prop in o->properties:
      |           prop->release(o, prop->name, prop->opaque/o->child)
      |             object_finalize_child_property(o, prop_name, c):
      |               o->class->unparent(c):
      |                 device_unparent(c) <- (spapr_drc/spapr_iommu children
      |                                        get their callbacks after being
      |                                        disconnected, no longer have
      |                                        canonical paths)
      |           QTAILQ_REMOVE(&o->properties, prop, node)
      |       object_deinit(o)
      |       o->free(o)
      QTAILQ_REMOVE(&o->properties, prop, node)

I played around with the idea of temporarilly moving unparented, unfinalized
objects to an "orphan" container. It seemed like a fun way of tracking leaked
objects, and avoids the assert, but that got wierd pretty quickly... and
having DEVICE_DELETED randomly change up the device path didn't seem like
the intended behavior, so this hack ended up seeming pretty reasonable.

The other approach, which I hadn't looked into too closely, was to defer
unparenting an object until it's ref count goes to 0. Could maybe look into
that instead if it seems less hacky.

> 
> DEVICE_DELETED is called after a device's children have been 
> unparented.  It could be called after a bus is dead though.  Could it 
> be that the patch you want is simply this:
> 
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 6e6a65d..46019c4 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -1241,11 +1241,6 @@ static void device_unparent(Object *obj)
>          bus = QLIST_FIRST(&dev->child_bus);
>          object_unparent(OBJECT(bus));
>      }
> -    if (dev->parent_bus) {
> -        bus_remove_child(dev->parent_bus, dev);
> -        object_unref(OBJECT(dev->parent_bus));
> -        dev->parent_bus = NULL;
> -    }
> 
>      /* Only send event if the device had been completely realized */
>      if (dev->pending_deleted_event) {
> @@ -1254,6 +1249,12 @@ static void device_unparent(Object *obj)
>          qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);
>          g_free(path);
>      }
> +
> +    if (dev->parent_bus) {
> +        bus_remove_child(dev->parent_bus, dev);
> +        object_unref(OBJECT(dev->parent_bus));
> +        dev->parent_bus = NULL;
> +    }
>  }
> 
>  static void device_class_init(ObjectClass *class, void *data)
> 
> ?
> 
> Paolo
>
Paolo Bonzini May 1, 2015, 8:43 p.m. UTC | #2
On 01/05/2015 01:03, Michael Roth wrote:
> 
> I played around with the idea of temporarilly moving unparented, unfinalized
> objects to an "orphan" container. It seemed like a fun way of tracking leaked
> objects, and avoids the assert, but that got wierd pretty quickly... and
> having DEVICE_DELETED randomly change up the device path didn't seem like
> the intended behavior, so this hack ended up seeming pretty reasonable.
> 
> The other approach, which I hadn't looked into too closely, was to defer
> unparenting an object until it's ref count goes to 0. Could maybe look into
> that instead if it seems less hacky.

What about unparenting children devices in the device's unrealize
callback?  It sucks that you have to do it manually, but using stale
canonical paths isn't the nicest thing either.

Paolo
Michael Roth May 1, 2015, 10:54 p.m. UTC | #3
Quoting Paolo Bonzini (2015-05-01 15:43:45)
> 
> 
> On 01/05/2015 01:03, Michael Roth wrote:
> > 
> > I played around with the idea of temporarilly moving unparented, unfinalized
> > objects to an "orphan" container. It seemed like a fun way of tracking leaked
> > objects, and avoids the assert, but that got wierd pretty quickly... and
> > having DEVICE_DELETED randomly change up the device path didn't seem like
> > the intended behavior, so this hack ended up seeming pretty reasonable.
> > 
> > The other approach, which I hadn't looked into too closely, was to defer
> > unparenting an object until it's ref count goes to 0. Could maybe look into
> > that instead if it seems less hacky.
> 
> What about unparenting children devices in the device's unrealize
> callback?  It sucks that you have to do it manually, but using stale
> canonical paths isn't the nicest thing either.

That does seems to do the trick. It felt wrong when I first looked at
it because in some cases the children attach themselves to the parent
without making making the parent aware, but using the child property
list ends up seeming pretty reasonable:

static int spapr_phb_unparent_child_devices(Object *child, void *opaque)
{
    DeviceState *dev = (DeviceState *) object_dynamic_cast(child, TYPE_DEVICE);
    if (dev) {
        object_unparent(child);
    }                                                                                                      
    return 0;
}

static void spapr_phb_unrealize(DeviceState *dev, Error **errp)
{
    ...
    /* clean up child devices (spapr_drc, spapr_iommu, etc.).
     * non-DeviceState's will be handled explicitly
     */
    object_child_foreach(OBJECT(dev), spapr_phb_unparent_child_devices, NULL);
    ...
}

Maybe if we have some more examples pop up it might make sense to
move that to qdev:device_unparent(), since we sort of expect that
for Devices.

Either way this does seem nicer. Thanks!

> 
> Paolo
>
Paolo Bonzini May 4, 2015, 9:35 a.m. UTC | #4
On 02/05/2015 00:54, Michael Roth wrote:
>> > 
>> > What about unparenting children devices in the device's unrealize
>> > callback?  It sucks that you have to do it manually, but using stale
>> > canonical paths isn't the nicest thing either.
> That does seems to do the trick. It felt wrong when I first looked at
> it because in some cases the children attach themselves to the parent
> without making making the parent aware,

Can you point to an example?  The parent "owns" its property namespace,
so it would be wrong for a child to attach itself unbeknownst to the parent.

There are a couple cases where an object adds a property to another
object, e.g. the rtc-time property of /machine, but in that case the
property is a well-known name whose setting is "outsourced" by /machine
to the device.

Paolo
Michael Roth May 5, 2015, 3:48 p.m. UTC | #5
Quoting Paolo Bonzini (2015-05-04 04:35:11)
> 
> 
> On 02/05/2015 00:54, Michael Roth wrote:
> >> > 
> >> > What about unparenting children devices in the device's unrealize
> >> > callback?  It sucks that you have to do it manually, but using stale
> >> > canonical paths isn't the nicest thing either.
> > That does seems to do the trick. It felt wrong when I first looked at
> > it because in some cases the children attach themselves to the parent
> > without making making the parent aware,
> 
> Can you point to an example?  The parent "owns" its property namespace,
> so it would be wrong for a child to attach itself unbeknownst to the parent.

Well, I was referring to:

  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn)

and:

  sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
                                           sPAPRDRConnectorType type,
                                           uint32_t id)

It wasn't immediately obvious to me that this would result in the
resulting objects attaching themselves as children, but in retrospect
that does seem to be implied by the 'owner' parameter.

If there are cases where this is done with a parameter that isn't explicitly
named 'owner' though it might be somewhat ambiguous (could be pulling out
individual fields as opposed to attaching itself), but I don't see any
examples outside of local functions or qdev-managed devices.

> 
> There are a couple cases where an object adds a property to another
> object, e.g. the rtc-time property of /machine, but in that case the
> property is a well-known name whose setting is "outsourced" by /machine
> to the device.
> 
> Paolo
>
Paolo Bonzini May 19, 2015, 9:41 a.m. UTC | #6
On 05/05/2015 17:48, Michael Roth wrote:
> Well, I was referring to:
> 
>   sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn)
> 
> and:
> 
>   sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
>                                            sPAPRDRConnectorType type,
>                                            uint32_t id)
> 
> It wasn't immediately obvious to me that this would result in the
> resulting objects attaching themselves as children, but in retrospect
> that does seem to be implied by the 'owner' parameter.

I guess that's okay, as long as the callers of these functions pass
their "this" object (or "self", or whatever :)) as the owner.

Paolo

> If there are cases where this is done with a parameter that isn't explicitly
> named 'owner' though it might be somewhat ambiguous (could be pulling out
> individual fields as opposed to attaching itself), but I don't see any
> examples outside of local functions or qdev-managed devices.
diff mbox

Patch

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 6e6a65d..46019c4 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -1241,11 +1241,6 @@  static void device_unparent(Object *obj)
         bus = QLIST_FIRST(&dev->child_bus);
         object_unparent(OBJECT(bus));
     }
-    if (dev->parent_bus) {
-        bus_remove_child(dev->parent_bus, dev);
-        object_unref(OBJECT(dev->parent_bus));
-        dev->parent_bus = NULL;
-    }
 
     /* Only send event if the device had been completely realized */
     if (dev->pending_deleted_event) {
@@ -1254,6 +1249,12 @@  static void device_unparent(Object *obj)
         qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);
         g_free(path);
     }
+
+    if (dev->parent_bus) {
+        bus_remove_child(dev->parent_bus, dev);
+        object_unref(OBJECT(dev->parent_bus));
+        dev->parent_bus = NULL;
+    }
 }
 
 static void device_class_init(ObjectClass *class, void *data)