diff mbox series

[6/6] spapr: Model DR connectors as simple objects

Message ID 20201218103400.689660-7-groug@kaod.org
State New
Headers show
Series spapr: Fix visibility and traversal of DR connectors | expand

Commit Message

Greg Kurz Dec. 18, 2020, 10:34 a.m. UTC
Modeling DR connectors as individual devices raises some
concerns, as already discussed a year ago in this thread:

https://patchew.org/QEMU/20191017205953.13122-1-cheloha@linux.vnet.ibm.com/

First, high maxmem settings creates too many DRC devices.
This causes scalability issues. It severely increase boot
time because the multiple traversals of the DRC list that
are performed during machine setup are quadratic operations.
This is directly related to the fact that DRCs are modeled
as individual devices and added to the composition tree.

Second, DR connectors are really an internal concept of
PAPR. They aren't something that the user or management
layer can manipulate in any way. We already don't allow
their creation with device_add by clearing user_creatable.

DR connectors don't even need to be modeled as actual
devices since they don't sit in a bus. They just need
to be associated to an 'owner' object and to have the
equivalent of realize/unrealize functions.

Downgrade them to be simple objects. Convert the existing
realize() and unrealize() to be methods of the DR connector
base class. Also have the base class to inherit from the
vmstate_if interface directly. The get_id() hook simply
returns NULL, just as device_vmstate_if_get_id() does for
devices that don't sit in a bus. The DR connector is no
longer made a child object. This means that it must be
explicitely freed when no longer needed. This is only
required for PHBs and PCI bridges actually : have them to
free the DRC with spapr_dr_connector_free() instead of
object_unparent().

No longer add the DRCs to the QOM composition tree. Track
them with a glib hash table using the global DRC index as
the key instead. This makes traversal a linear operation.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 include/hw/ppc/spapr_drc.h |   8 +-
 hw/ppc/spapr_drc.c         | 166 ++++++++++++++-----------------------
 hw/ppc/spapr_pci.c         |   2 +-
 3 files changed, 69 insertions(+), 107 deletions(-)

Comments

Daniel Henrique Barboza Dec. 21, 2020, 8:45 p.m. UTC | #1
On 12/18/20 7:34 AM, Greg Kurz wrote:
> Modeling DR connectors as individual devices raises some
> concerns, as already discussed a year ago in this thread:
> 
> https://patchew.org/QEMU/20191017205953.13122-1-cheloha@linux.vnet.ibm.com/
> 
> First, high maxmem settings creates too many DRC devices.
> This causes scalability issues. It severely increase boot

s/increase/increases

> time because the multiple traversals of the DRC list that
> are performed during machine setup are quadratic operations.
> This is directly related to the fact that DRCs are modeled
> as individual devices and added to the composition tree.
> 
> Second, DR connectors are really an internal concept of
> PAPR. They aren't something that the user or management
> layer can manipulate in any way. We already don't allow
> their creation with device_add by clearing user_creatable.
> 
> DR connectors don't even need to be modeled as actual
> devices since they don't sit in a bus. They just need
> to be associated to an 'owner' object and to have the
> equivalent of realize/unrealize functions.
> 
> Downgrade them to be simple objects. Convert the existing
> realize() and unrealize() to be methods of the DR connector
> base class. Also have the base class to inherit from the
> vmstate_if interface directly. The get_id() hook simply
> returns NULL, just as device_vmstate_if_get_id() does for
> devices that don't sit in a bus. The DR connector is no
> longer made a child object. This means that it must be
> explicitely freed when no longer needed. This is only


s/explicitely/explicitly

> required for PHBs and PCI bridges actually : have them to
> free the DRC with spapr_dr_connector_free() instead of
> object_unparent().
> 
> No longer add the DRCs to the QOM composition tree. Track
> them with a glib hash table using the global DRC index as
> the key instead. This makes traversal a linear operation.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---


Code LGTM. The maintainer is welcome to fix the nits while pushing.


Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>




>   include/hw/ppc/spapr_drc.h |   8 +-
>   hw/ppc/spapr_drc.c         | 166 ++++++++++++++-----------------------
>   hw/ppc/spapr_pci.c         |   2 +-
>   3 files changed, 69 insertions(+), 107 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index 8982927d5c24..a26aa8b9d4c3 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -170,7 +170,7 @@ typedef enum {
>   
>   typedef struct SpaprDrc {
>       /*< private >*/
> -    DeviceState parent;
> +    Object parent;
>   
>       uint32_t id;
>       Object *owner;
> @@ -193,7 +193,7 @@ struct SpaprMachineState;
>   
>   typedef struct SpaprDrcClass {
>       /*< private >*/
> -    DeviceClass parent;
> +    ObjectClass parent;
>       SpaprDrcState empty_state;
>       SpaprDrcState ready_state;
>   
> @@ -209,6 +209,9 @@ typedef struct SpaprDrcClass {
>   
>       int (*dt_populate)(SpaprDrc *drc, struct SpaprMachineState *spapr,
>                          void *fdt, int *fdt_start_offset, Error **errp);
> +
> +    void (*realize)(SpaprDrc *drc);
> +    void (*unrealize)(SpaprDrc *drc);
>   } SpaprDrcClass;
>   
>   typedef struct SpaprDrcPhysical {
> @@ -232,6 +235,7 @@ SpaprDrcType spapr_drc_type(SpaprDrc *drc);
>   
>   SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
>                                            uint32_t id);
> +void spapr_dr_connector_free(SpaprDrc *drc);
>   SpaprDrc *spapr_drc_by_index(uint32_t index);
>   SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id);
>   int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask);
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 8571d5bafe4e..e26763f8b5a4 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -27,7 +27,6 @@
>   #include "sysemu/reset.h"
>   #include "trace.h"
>   
> -#define DRC_CONTAINER_PATH "/dr-connector"
>   #define DRC_INDEX_TYPE_SHIFT 28
>   #define DRC_INDEX_ID_MASK ((1ULL << DRC_INDEX_TYPE_SHIFT) - 1)
>   
> @@ -503,65 +502,56 @@ static const VMStateDescription vmstate_spapr_drc = {
>       }
>   };
>   
> -static void drc_realize(DeviceState *d, Error **errp)
> +static GHashTable *drc_hash_table(void)
>   {
> -    SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
> -    Object *root_container;
> -    gchar *link_name;
> -    const char *child_name;
> +    static GHashTable *dht;
>   
> +    if (!dht) {
> +        dht = g_hash_table_new(NULL, NULL);
> +    }
> +
> +    return dht;
> +}
> +
> +
> +static void drc_realize(SpaprDrc *drc)
> +{
>       trace_spapr_drc_realize(spapr_drc_index(drc));
> -    /* NOTE: we do this as part of realize/unrealize due to the fact
> -     * that the guest will communicate with the DRC via RTAS calls
> -     * referencing the global DRC index. By unlinking the DRC
> -     * from DRC_CONTAINER_PATH/<drc_index> we effectively make it
> -     * inaccessible by the guest, since lookups rely on this path
> -     * existing in the composition tree
> -     */
> -    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> -    link_name = g_strdup_printf("%x", spapr_drc_index(drc));
> -    child_name = object_get_canonical_path_component(OBJECT(drc));
> -    trace_spapr_drc_realize_child(spapr_drc_index(drc), child_name);
> -    object_property_add_alias(root_container, link_name,
> -                              drc->owner, child_name);
> -    g_free(link_name);
> +
> +    g_hash_table_insert(drc_hash_table(),
> +                        GUINT_TO_POINTER(spapr_drc_index(drc)), drc);
>       vmstate_register(VMSTATE_IF(drc), spapr_drc_index(drc), &vmstate_spapr_drc,
>                        drc);
>       trace_spapr_drc_realize_complete(spapr_drc_index(drc));
>   }
>   
> -static void drc_unrealize(DeviceState *d)
> +static void drc_unrealize(SpaprDrc *drc)
>   {
> -    SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
> -    Object *root_container;
> -    gchar *name;
> -
>       trace_spapr_drc_unrealize(spapr_drc_index(drc));
>       vmstate_unregister(VMSTATE_IF(drc), &vmstate_spapr_drc, drc);
> -    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> -    name = g_strdup_printf("%x", spapr_drc_index(drc));
> -    object_property_del(root_container, name);
> -    g_free(name);
> +    g_hash_table_remove(drc_hash_table(),
> +                        GUINT_TO_POINTER(spapr_drc_index(drc)));
>   }
>   
>   SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
>                                            uint32_t id)
>   {
>       SpaprDrc *drc = SPAPR_DR_CONNECTOR(object_new(type));
> -    char *prop_name;
>   
>       drc->id = id;
> -    drc->owner = owner;
> -    prop_name = g_strdup_printf("dr-connector[%"PRIu32"]",
> -                                spapr_drc_index(drc));
> -    object_property_add_child(owner, prop_name, OBJECT(drc));
> -    object_unref(OBJECT(drc));
> -    qdev_realize(DEVICE(drc), NULL, NULL);
> -    g_free(prop_name);
> +    drc->owner = object_ref(owner);
> +    SPAPR_DR_CONNECTOR_GET_CLASS(drc)->realize(drc);
>   
>       return drc;
>   }
>   
> +void spapr_dr_connector_free(SpaprDrc *drc)
> +{
> +    SPAPR_DR_CONNECTOR_GET_CLASS(drc)->unrealize(drc);
> +    object_unref(drc->owner);
> +    object_unref(drc);
> +}
> +
>   static void spapr_dr_connector_instance_init(Object *obj)
>   {
>       SpaprDrc *drc = SPAPR_DR_CONNECTOR(obj);
> @@ -575,17 +565,19 @@ static void spapr_dr_connector_instance_init(Object *obj)
>       drc->state = drck->empty_state;
>   }
>   
> +static char *drc_vmstate_if_get_id(VMStateIf *obj)
> +{
> +    return NULL;
> +}
> +
>   static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>   {
> -    DeviceClass *dk = DEVICE_CLASS(k);
> +    SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
> +    VMStateIfClass *vc = VMSTATE_IF_CLASS(k);
>   
> -    dk->realize = drc_realize;
> -    dk->unrealize = drc_unrealize;
> -    /*
> -     * Reason: DR connector needs to be wired to either the machine or to a
> -     * PHB in spapr_dr_connector_new().
> -     */
> -    dk->user_creatable = false;
> +    drck->realize = drc_realize;
> +    drck->unrealize = drc_unrealize;
> +    vc->get_id = drc_vmstate_if_get_id;
>   }
>   
>   static bool drc_physical_needed(void *opaque)
> @@ -623,39 +615,32 @@ static void drc_physical_reset(void *opaque)
>       }
>   }
>   
> -static void realize_physical(DeviceState *d, Error **errp)
> +static void realize_physical(SpaprDrc *drc)
>   {
> -    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(d);
> -    Error *local_err = NULL;
> -
> -    drc_realize(d, &local_err);
> -    if (local_err) {
> -        error_propagate(errp, local_err);
> -        return;
> -    }
> +    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(drc);
>   
> +    drc_realize(drc);
>       vmstate_register(VMSTATE_IF(drcp),
>                        spapr_drc_index(SPAPR_DR_CONNECTOR(drcp)),
>                        &vmstate_spapr_drc_physical, drcp);
>       qemu_register_reset(drc_physical_reset, drcp);
>   }
>   
> -static void unrealize_physical(DeviceState *d)
> +static void unrealize_physical(SpaprDrc *drc)
>   {
> -    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(d);
> +    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(drc);
>   
> -    drc_unrealize(d);
> -    vmstate_unregister(VMSTATE_IF(drcp), &vmstate_spapr_drc_physical, drcp);
>       qemu_unregister_reset(drc_physical_reset, drcp);
> +    vmstate_unregister(VMSTATE_IF(drcp), &vmstate_spapr_drc_physical, drcp);
> +    drc_unrealize(drc);
>   }
>   
>   static void spapr_drc_physical_class_init(ObjectClass *k, void *data)
>   {
> -    DeviceClass *dk = DEVICE_CLASS(k);
>       SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
>   
> -    dk->realize = realize_physical;
> -    dk->unrealize = unrealize_physical;
> +    drck->realize = realize_physical;
> +    drck->unrealize = unrealize_physical;
>       drck->dr_entity_sense = physical_entity_sense;
>       drck->isolate = drc_isolate_physical;
>       drck->unisolate = drc_unisolate_physical;
> @@ -731,12 +716,16 @@ static void spapr_drc_pmem_class_init(ObjectClass *k, void *data)
>   
>   static const TypeInfo spapr_dr_connector_info = {
>       .name          = TYPE_SPAPR_DR_CONNECTOR,
> -    .parent        = TYPE_DEVICE,
> +    .parent        = TYPE_OBJECT,
>       .instance_size = sizeof(SpaprDrc),
>       .instance_init = spapr_dr_connector_instance_init,
>       .class_size    = sizeof(SpaprDrcClass),
>       .class_init    = spapr_dr_connector_class_init,
>       .abstract      = true,
> +    .interfaces = (InterfaceInfo[]) {
> +        { TYPE_VMSTATE_IF },
> +        { }
> +    },
>   };
>   
>   static const TypeInfo spapr_drc_physical_info = {
> @@ -789,14 +778,9 @@ static const TypeInfo spapr_drc_pmem_info = {
>   
>   SpaprDrc *spapr_drc_by_index(uint32_t index)
>   {
> -    Object *obj;
> -    gchar *name;
> -
> -    name = g_strdup_printf("%s/%x", DRC_CONTAINER_PATH, index);
> -    obj = object_resolve_path(name, NULL);
> -    g_free(name);
> -
> -    return !obj ? NULL : SPAPR_DR_CONNECTOR(obj);
> +    return
> +        SPAPR_DR_CONNECTOR(g_hash_table_lookup(drc_hash_table(),
> +                                               GUINT_TO_POINTER(index)));
>   }
>   
>   SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id)
> @@ -824,13 +808,12 @@ SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id)
>    */
>   int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask)
>   {
> -    Object *root_container;
> -    ObjectProperty *prop;
> -    ObjectPropertyIterator iter;
> +    GHashTableIter iter;
>       uint32_t drc_count = 0;
>       GArray *drc_indexes, *drc_power_domains;
>       GString *drc_names, *drc_types;
>       int ret;
> +    SpaprDrc *drc;
>   
>       /*
>        * This should really be only called once per node since it overwrites
> @@ -851,26 +834,12 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask)
>       drc_names = g_string_set_size(g_string_new(NULL), sizeof(uint32_t));
>       drc_types = g_string_set_size(g_string_new(NULL), sizeof(uint32_t));
>   
> -    /* aliases for all DRConnector objects will be rooted in QOM
> -     * composition tree at DRC_CONTAINER_PATH
> -     */
> -    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> -
> -    object_property_iter_init(&iter, root_container);
> -    while ((prop = object_property_iter_next(&iter))) {
> -        Object *obj;
> -        SpaprDrc *drc;
> +    g_hash_table_iter_init(&iter, drc_hash_table());
> +    while (g_hash_table_iter_next(&iter, NULL, (gpointer) &drc)) {
>           SpaprDrcClass *drck;
>           char *drc_name = NULL;
>           uint32_t drc_index, drc_power_domain;
>   
> -        if (!strstart(prop->type, "link<", NULL)) {
> -            continue;
> -        }
> -
> -        obj = object_property_get_link(root_container, prop->name,
> -                                       &error_abort);
> -        drc = SPAPR_DR_CONNECTOR(obj);
>           drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>   
>           if (owner && (drc->owner != owner)) {
> @@ -951,23 +920,12 @@ out:
>   
>   void spapr_drc_reset_all(SpaprMachineState *spapr)
>   {
> -    Object *drc_container;
> -    ObjectProperty *prop;
> -    ObjectPropertyIterator iter;
> +    GHashTableIter iter;
> +    SpaprDrc *drc;
>   
> -    drc_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
>   restart:
> -    object_property_iter_init(&iter, drc_container);
> -    while ((prop = object_property_iter_next(&iter))) {
> -        SpaprDrc *drc;
> -
> -        if (!strstart(prop->type, "link<", NULL)) {
> -            continue;
> -        }
> -        drc = SPAPR_DR_CONNECTOR(object_property_get_link(drc_container,
> -                                                          prop->name,
> -                                                          &error_abort));
> -
> +    g_hash_table_iter_init(&iter, drc_hash_table());
> +    while (g_hash_table_iter_next(&iter, NULL, (gpointer) &drc)) {
>           /*
>            * This will complete any pending plug/unplug requests.
>            * In case of a unplugged PHB or PCI bridge, this will
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 76d7c91e9c64..ca0cca664e3c 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1262,7 +1262,7 @@ static void remove_drcs(SpaprPhbState *phb, PCIBus *bus)
>           SpaprDrc *drc = drc_from_devfn(phb, chassis, i);
>   
>           if (drc) {
> -            object_unparent(OBJECT(drc));
> +            spapr_dr_connector_free(drc);
>           }
>       }
>   }
>
David Gibson Dec. 28, 2020, 8:28 a.m. UTC | #2
On Fri, Dec 18, 2020 at 11:34:00AM +0100, Greg Kurz wrote:
> Modeling DR connectors as individual devices raises some
> concerns, as already discussed a year ago in this thread:
> 
> https://patchew.org/QEMU/20191017205953.13122-1-cheloha@linux.vnet.ibm.com/
> 
> First, high maxmem settings creates too many DRC devices.
> This causes scalability issues. It severely increase boot
> time because the multiple traversals of the DRC list that
> are performed during machine setup are quadratic operations.
> This is directly related to the fact that DRCs are modeled
> as individual devices and added to the composition tree.
> 
> Second, DR connectors are really an internal concept of
> PAPR. They aren't something that the user or management
> layer can manipulate in any way. We already don't allow
> their creation with device_add by clearing user_creatable.
> 
> DR connectors don't even need to be modeled as actual
> devices since they don't sit in a bus. They just need
> to be associated to an 'owner' object and to have the
> equivalent of realize/unrealize functions.
> 
> Downgrade them to be simple objects. Convert the existing
> realize() and unrealize() to be methods of the DR connector
> base class. Also have the base class to inherit from the
> vmstate_if interface directly. The get_id() hook simply
> returns NULL, just as device_vmstate_if_get_id() does for
> devices that don't sit in a bus. The DR connector is no
> longer made a child object. This means that it must be
> explicitely freed when no longer needed. This is only
> required for PHBs and PCI bridges actually : have them to
> free the DRC with spapr_dr_connector_free() instead of
> object_unparent().
> 
> No longer add the DRCs to the QOM composition tree. Track
> them with a glib hash table using the global DRC index as
> the key instead. This makes traversal a linear operation.

I have some reservations about this one.  The main thing is that
attaching migration state to something that's not a device seems a bit
odd to me.  AFAICT exactly one other non-device implements
TYPE_VMSTATE_IF, and what it does isn't very clear to me.

As I might have mentioned to you I had a different idea for how to
address this problem: still use a TYPE_DEVICE, but have it manage a
whole array of DRCs as one unit, rather than just a single one.
Specifically I was thinking:

* one array per PCI bus (DRCs for each function on the bus)
* one array for each block of memory (so one for base memory, one for
  each DIMM)
* one array for all the cpus
* one array for all the PHBs

It has some disadvantages compared to your scheme: it still leaves
(less) devices which can't be user managed, which is a bit ugly.  On
the other hand, each of those arrays can reasonably be dense, so we
can use direct indexing rather than a hash table, which is a bit
nicer.

Thoughts?

> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  include/hw/ppc/spapr_drc.h |   8 +-
>  hw/ppc/spapr_drc.c         | 166 ++++++++++++++-----------------------
>  hw/ppc/spapr_pci.c         |   2 +-
>  3 files changed, 69 insertions(+), 107 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index 8982927d5c24..a26aa8b9d4c3 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -170,7 +170,7 @@ typedef enum {
>  
>  typedef struct SpaprDrc {
>      /*< private >*/
> -    DeviceState parent;
> +    Object parent;
>  
>      uint32_t id;
>      Object *owner;
> @@ -193,7 +193,7 @@ struct SpaprMachineState;
>  
>  typedef struct SpaprDrcClass {
>      /*< private >*/
> -    DeviceClass parent;
> +    ObjectClass parent;
>      SpaprDrcState empty_state;
>      SpaprDrcState ready_state;
>  
> @@ -209,6 +209,9 @@ typedef struct SpaprDrcClass {
>  
>      int (*dt_populate)(SpaprDrc *drc, struct SpaprMachineState *spapr,
>                         void *fdt, int *fdt_start_offset, Error **errp);
> +
> +    void (*realize)(SpaprDrc *drc);
> +    void (*unrealize)(SpaprDrc *drc);
>  } SpaprDrcClass;
>  
>  typedef struct SpaprDrcPhysical {
> @@ -232,6 +235,7 @@ SpaprDrcType spapr_drc_type(SpaprDrc *drc);
>  
>  SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
>                                           uint32_t id);
> +void spapr_dr_connector_free(SpaprDrc *drc);
>  SpaprDrc *spapr_drc_by_index(uint32_t index);
>  SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id);
>  int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask);
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 8571d5bafe4e..e26763f8b5a4 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -27,7 +27,6 @@
>  #include "sysemu/reset.h"
>  #include "trace.h"
>  
> -#define DRC_CONTAINER_PATH "/dr-connector"
>  #define DRC_INDEX_TYPE_SHIFT 28
>  #define DRC_INDEX_ID_MASK ((1ULL << DRC_INDEX_TYPE_SHIFT) - 1)
>  
> @@ -503,65 +502,56 @@ static const VMStateDescription vmstate_spapr_drc = {
>      }
>  };
>  
> -static void drc_realize(DeviceState *d, Error **errp)
> +static GHashTable *drc_hash_table(void)
>  {
> -    SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
> -    Object *root_container;
> -    gchar *link_name;
> -    const char *child_name;
> +    static GHashTable *dht;
>  
> +    if (!dht) {
> +        dht = g_hash_table_new(NULL, NULL);
> +    }
> +
> +    return dht;
> +}
> +
> +
> +static void drc_realize(SpaprDrc *drc)
> +{
>      trace_spapr_drc_realize(spapr_drc_index(drc));
> -    /* NOTE: we do this as part of realize/unrealize due to the fact
> -     * that the guest will communicate with the DRC via RTAS calls
> -     * referencing the global DRC index. By unlinking the DRC
> -     * from DRC_CONTAINER_PATH/<drc_index> we effectively make it
> -     * inaccessible by the guest, since lookups rely on this path
> -     * existing in the composition tree
> -     */
> -    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> -    link_name = g_strdup_printf("%x", spapr_drc_index(drc));
> -    child_name = object_get_canonical_path_component(OBJECT(drc));
> -    trace_spapr_drc_realize_child(spapr_drc_index(drc), child_name);
> -    object_property_add_alias(root_container, link_name,
> -                              drc->owner, child_name);
> -    g_free(link_name);
> +
> +    g_hash_table_insert(drc_hash_table(),
> +                        GUINT_TO_POINTER(spapr_drc_index(drc)), drc);
>      vmstate_register(VMSTATE_IF(drc), spapr_drc_index(drc), &vmstate_spapr_drc,
>                       drc);
>      trace_spapr_drc_realize_complete(spapr_drc_index(drc));
>  }
>  
> -static void drc_unrealize(DeviceState *d)
> +static void drc_unrealize(SpaprDrc *drc)
>  {
> -    SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
> -    Object *root_container;
> -    gchar *name;
> -
>      trace_spapr_drc_unrealize(spapr_drc_index(drc));
>      vmstate_unregister(VMSTATE_IF(drc), &vmstate_spapr_drc, drc);
> -    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> -    name = g_strdup_printf("%x", spapr_drc_index(drc));
> -    object_property_del(root_container, name);
> -    g_free(name);
> +    g_hash_table_remove(drc_hash_table(),
> +                        GUINT_TO_POINTER(spapr_drc_index(drc)));
>  }
>  
>  SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
>                                           uint32_t id)
>  {
>      SpaprDrc *drc = SPAPR_DR_CONNECTOR(object_new(type));
> -    char *prop_name;
>  
>      drc->id = id;
> -    drc->owner = owner;
> -    prop_name = g_strdup_printf("dr-connector[%"PRIu32"]",
> -                                spapr_drc_index(drc));
> -    object_property_add_child(owner, prop_name, OBJECT(drc));
> -    object_unref(OBJECT(drc));
> -    qdev_realize(DEVICE(drc), NULL, NULL);
> -    g_free(prop_name);
> +    drc->owner = object_ref(owner);
> +    SPAPR_DR_CONNECTOR_GET_CLASS(drc)->realize(drc);
>  
>      return drc;
>  }
>  
> +void spapr_dr_connector_free(SpaprDrc *drc)
> +{
> +    SPAPR_DR_CONNECTOR_GET_CLASS(drc)->unrealize(drc);
> +    object_unref(drc->owner);
> +    object_unref(drc);
> +}
> +
>  static void spapr_dr_connector_instance_init(Object *obj)
>  {
>      SpaprDrc *drc = SPAPR_DR_CONNECTOR(obj);
> @@ -575,17 +565,19 @@ static void spapr_dr_connector_instance_init(Object *obj)
>      drc->state = drck->empty_state;
>  }
>  
> +static char *drc_vmstate_if_get_id(VMStateIf *obj)
> +{
> +    return NULL;
> +}
> +
>  static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>  {
> -    DeviceClass *dk = DEVICE_CLASS(k);
> +    SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
> +    VMStateIfClass *vc = VMSTATE_IF_CLASS(k);
>  
> -    dk->realize = drc_realize;
> -    dk->unrealize = drc_unrealize;
> -    /*
> -     * Reason: DR connector needs to be wired to either the machine or to a
> -     * PHB in spapr_dr_connector_new().
> -     */
> -    dk->user_creatable = false;
> +    drck->realize = drc_realize;
> +    drck->unrealize = drc_unrealize;
> +    vc->get_id = drc_vmstate_if_get_id;
>  }
>  
>  static bool drc_physical_needed(void *opaque)
> @@ -623,39 +615,32 @@ static void drc_physical_reset(void *opaque)
>      }
>  }
>  
> -static void realize_physical(DeviceState *d, Error **errp)
> +static void realize_physical(SpaprDrc *drc)
>  {
> -    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(d);
> -    Error *local_err = NULL;
> -
> -    drc_realize(d, &local_err);
> -    if (local_err) {
> -        error_propagate(errp, local_err);
> -        return;
> -    }
> +    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(drc);
>  
> +    drc_realize(drc);
>      vmstate_register(VMSTATE_IF(drcp),
>                       spapr_drc_index(SPAPR_DR_CONNECTOR(drcp)),
>                       &vmstate_spapr_drc_physical, drcp);
>      qemu_register_reset(drc_physical_reset, drcp);
>  }
>  
> -static void unrealize_physical(DeviceState *d)
> +static void unrealize_physical(SpaprDrc *drc)
>  {
> -    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(d);
> +    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(drc);
>  
> -    drc_unrealize(d);
> -    vmstate_unregister(VMSTATE_IF(drcp), &vmstate_spapr_drc_physical, drcp);
>      qemu_unregister_reset(drc_physical_reset, drcp);
> +    vmstate_unregister(VMSTATE_IF(drcp), &vmstate_spapr_drc_physical, drcp);
> +    drc_unrealize(drc);
>  }
>  
>  static void spapr_drc_physical_class_init(ObjectClass *k, void *data)
>  {
> -    DeviceClass *dk = DEVICE_CLASS(k);
>      SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
>  
> -    dk->realize = realize_physical;
> -    dk->unrealize = unrealize_physical;
> +    drck->realize = realize_physical;
> +    drck->unrealize = unrealize_physical;
>      drck->dr_entity_sense = physical_entity_sense;
>      drck->isolate = drc_isolate_physical;
>      drck->unisolate = drc_unisolate_physical;
> @@ -731,12 +716,16 @@ static void spapr_drc_pmem_class_init(ObjectClass *k, void *data)
>  
>  static const TypeInfo spapr_dr_connector_info = {
>      .name          = TYPE_SPAPR_DR_CONNECTOR,
> -    .parent        = TYPE_DEVICE,
> +    .parent        = TYPE_OBJECT,
>      .instance_size = sizeof(SpaprDrc),
>      .instance_init = spapr_dr_connector_instance_init,
>      .class_size    = sizeof(SpaprDrcClass),
>      .class_init    = spapr_dr_connector_class_init,
>      .abstract      = true,
> +    .interfaces = (InterfaceInfo[]) {
> +        { TYPE_VMSTATE_IF },
> +        { }
> +    },
>  };
>  
>  static const TypeInfo spapr_drc_physical_info = {
> @@ -789,14 +778,9 @@ static const TypeInfo spapr_drc_pmem_info = {
>  
>  SpaprDrc *spapr_drc_by_index(uint32_t index)
>  {
> -    Object *obj;
> -    gchar *name;
> -
> -    name = g_strdup_printf("%s/%x", DRC_CONTAINER_PATH, index);
> -    obj = object_resolve_path(name, NULL);
> -    g_free(name);
> -
> -    return !obj ? NULL : SPAPR_DR_CONNECTOR(obj);
> +    return
> +        SPAPR_DR_CONNECTOR(g_hash_table_lookup(drc_hash_table(),
> +                                               GUINT_TO_POINTER(index)));
>  }
>  
>  SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id)
> @@ -824,13 +808,12 @@ SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id)
>   */
>  int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask)
>  {
> -    Object *root_container;
> -    ObjectProperty *prop;
> -    ObjectPropertyIterator iter;
> +    GHashTableIter iter;
>      uint32_t drc_count = 0;
>      GArray *drc_indexes, *drc_power_domains;
>      GString *drc_names, *drc_types;
>      int ret;
> +    SpaprDrc *drc;
>  
>      /*
>       * This should really be only called once per node since it overwrites
> @@ -851,26 +834,12 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask)
>      drc_names = g_string_set_size(g_string_new(NULL), sizeof(uint32_t));
>      drc_types = g_string_set_size(g_string_new(NULL), sizeof(uint32_t));
>  
> -    /* aliases for all DRConnector objects will be rooted in QOM
> -     * composition tree at DRC_CONTAINER_PATH
> -     */
> -    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> -
> -    object_property_iter_init(&iter, root_container);
> -    while ((prop = object_property_iter_next(&iter))) {
> -        Object *obj;
> -        SpaprDrc *drc;
> +    g_hash_table_iter_init(&iter, drc_hash_table());
> +    while (g_hash_table_iter_next(&iter, NULL, (gpointer) &drc)) {
>          SpaprDrcClass *drck;
>          char *drc_name = NULL;
>          uint32_t drc_index, drc_power_domain;
>  
> -        if (!strstart(prop->type, "link<", NULL)) {
> -            continue;
> -        }
> -
> -        obj = object_property_get_link(root_container, prop->name,
> -                                       &error_abort);
> -        drc = SPAPR_DR_CONNECTOR(obj);
>          drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>  
>          if (owner && (drc->owner != owner)) {
> @@ -951,23 +920,12 @@ out:
>  
>  void spapr_drc_reset_all(SpaprMachineState *spapr)
>  {
> -    Object *drc_container;
> -    ObjectProperty *prop;
> -    ObjectPropertyIterator iter;
> +    GHashTableIter iter;
> +    SpaprDrc *drc;
>  
> -    drc_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
>  restart:
> -    object_property_iter_init(&iter, drc_container);
> -    while ((prop = object_property_iter_next(&iter))) {
> -        SpaprDrc *drc;
> -
> -        if (!strstart(prop->type, "link<", NULL)) {
> -            continue;
> -        }
> -        drc = SPAPR_DR_CONNECTOR(object_property_get_link(drc_container,
> -                                                          prop->name,
> -                                                          &error_abort));
> -
> +    g_hash_table_iter_init(&iter, drc_hash_table());
> +    while (g_hash_table_iter_next(&iter, NULL, (gpointer) &drc)) {
>          /*
>           * This will complete any pending plug/unplug requests.
>           * In case of a unplugged PHB or PCI bridge, this will
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 76d7c91e9c64..ca0cca664e3c 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1262,7 +1262,7 @@ static void remove_drcs(SpaprPhbState *phb, PCIBus *bus)
>          SpaprDrc *drc = drc_from_devfn(phb, chassis, i);
>  
>          if (drc) {
> -            object_unparent(OBJECT(drc));
> +            spapr_dr_connector_free(drc);
>          }
>      }
>  }
Greg Kurz Jan. 6, 2021, 6:15 p.m. UTC | #3
On Mon, 28 Dec 2020 19:28:39 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Dec 18, 2020 at 11:34:00AM +0100, Greg Kurz wrote:
> > Modeling DR connectors as individual devices raises some
> > concerns, as already discussed a year ago in this thread:
> > 
> > https://patchew.org/QEMU/20191017205953.13122-1-cheloha@linux.vnet.ibm.com/
> > 
> > First, high maxmem settings creates too many DRC devices.
> > This causes scalability issues. It severely increase boot
> > time because the multiple traversals of the DRC list that
> > are performed during machine setup are quadratic operations.
> > This is directly related to the fact that DRCs are modeled
> > as individual devices and added to the composition tree.
> > 
> > Second, DR connectors are really an internal concept of
> > PAPR. They aren't something that the user or management
> > layer can manipulate in any way. We already don't allow
> > their creation with device_add by clearing user_creatable.
> > 
> > DR connectors don't even need to be modeled as actual
> > devices since they don't sit in a bus. They just need
> > to be associated to an 'owner' object and to have the
> > equivalent of realize/unrealize functions.
> > 
> > Downgrade them to be simple objects. Convert the existing
> > realize() and unrealize() to be methods of the DR connector
> > base class. Also have the base class to inherit from the
> > vmstate_if interface directly. The get_id() hook simply
> > returns NULL, just as device_vmstate_if_get_id() does for
> > devices that don't sit in a bus. The DR connector is no
> > longer made a child object. This means that it must be
> > explicitely freed when no longer needed. This is only
> > required for PHBs and PCI bridges actually : have them to
> > free the DRC with spapr_dr_connector_free() instead of
> > object_unparent().
> > 
> > No longer add the DRCs to the QOM composition tree. Track
> > them with a glib hash table using the global DRC index as
> > the key instead. This makes traversal a linear operation.
> 
> I have some reservations about this one.  The main thing is that
> attaching migration state to something that's not a device seems a bit
> odd to me.  AFAICT exactly one other non-device implements
> TYPE_VMSTATE_IF, and what it does isn't very clear to me.
> 

Even with your proposal below, the current SpaprDrc type, which is
used all over the place, will stop being a TYPE_DEVICE but we still
need to support migration with existing machine types for which DRC
are devices. Implementing TYPE_VMSTATE_IF is essentially a hack that
allows to do that without keeping the current TYPE_DEVICE based
implementation around.

> As I might have mentioned to you I had a different idea for how to
> address this problem: still use a TYPE_DEVICE, but have it manage a
> whole array of DRCs as one unit, rather than just a single one.
> Specifically I was thinking:
> 
> * one array per PCI bus (DRCs for each function on the bus)
> * one array for each block of memory (so one for base memory, one for
>   each DIMM)
> * one array for all the cpus
> * one array for all the PHBs
> 
> It has some disadvantages compared to your scheme: it still leaves
> (less) devices which can't be user managed, which is a bit ugly.  On
> the other hand, each of those arrays can reasonably be dense, so we
> can use direct indexing rather than a hash table, which is a bit
> nicer.
> 
> Thoughts?
> 

I find it a bit overkill to introduce a new TYPE_DEVICE (let's
call it a DRC manager) for something that:
- doesn't sit on a bus
- can't be user managed
- isn't directly represented to the guest as a full node
  in the DT unlike all other devices, but just as indexes
  in some properties of actual DR capable devices.

Given that the DRC index space is global and this is what
the guest passes to DR RTAS calls, we can't do direct
indexing, strictly speaking. We need at least some logic
to dispatch operations on individual DRC states to the
appropriate DRC manager. This logic belongs to the machine
IMHO.

This shouldn't be too complex for CPUs and PHBs since they
sit directly under the machine and have 1:1 relation with
the attached device. It just boils down to instantiate
some DRC managers during machine init:

- range [ 0x10000000 ... 0x10000000 + ${max_cpus} [
  for CPUs
- range [ 0x20000000 ... 0x20000000 + 31 [
  for PHBs

For memory, the code currently generates DRC indexes in the range:

[ 0x80000000 ... 0x80000000 + ${base_ram_size}/256M ... ${max_ram_size}/256M [

ie. it doesn't generate DRC indexes for the base memory AFAICT. Also
each DIMM can be of arbitrary size, ie. consume an arbitrary amount
of DRC indexes. So the machine would instantiate SPAPR_MAX_RAM_SLOTS (32)
DRC managers, each capable of managing the full set of LMB DRCs, just
in case ? Likely a lot of zeroes with high maxmem settings but I guess
we can live with it.

PCI busses would need some extra care though since the machine
doesn't know about them. This would require to be able to
register/unregister DRC managers for SPAPR_DR_CONNECTOR_TYPE_PCI
indexes, so that the dispatching logic know about the ranges
they cover (PHB internals).

And finally comes migration : I cannot think of a way to generate
the VMState sections used by existing machine types out of a set
of arrays of integers... We could keep the current implementation
around and use it with older machine types, but this certainly
looks terrible from a maintenance perspective. Did you have any
suggestion to handle that ?

I seem to remember that one of the motivation to have arrays
of DRCs is to avoid the inflation of VMState sections that
we currently get with high maxmem settings, and it is considered
preferable to stream sparse arrays. This could be achieved by
building these arrays out of the global DRC hash table in a machine
pre-save handler and migrate them in a subsection for the default
machine type. Older machine types would continue with the current
VMState sections thanks to the TYPE_VMSTATE_IF hack.

Does this seem a reasonable trade-off to be able to support
older and newer machine types with the same implementation ?

> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  include/hw/ppc/spapr_drc.h |   8 +-
> >  hw/ppc/spapr_drc.c         | 166 ++++++++++++++-----------------------
> >  hw/ppc/spapr_pci.c         |   2 +-
> >  3 files changed, 69 insertions(+), 107 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> > index 8982927d5c24..a26aa8b9d4c3 100644
> > --- a/include/hw/ppc/spapr_drc.h
> > +++ b/include/hw/ppc/spapr_drc.h
> > @@ -170,7 +170,7 @@ typedef enum {
> >  
> >  typedef struct SpaprDrc {
> >      /*< private >*/
> > -    DeviceState parent;
> > +    Object parent;
> >  
> >      uint32_t id;
> >      Object *owner;
> > @@ -193,7 +193,7 @@ struct SpaprMachineState;
> >  
> >  typedef struct SpaprDrcClass {
> >      /*< private >*/
> > -    DeviceClass parent;
> > +    ObjectClass parent;
> >      SpaprDrcState empty_state;
> >      SpaprDrcState ready_state;
> >  
> > @@ -209,6 +209,9 @@ typedef struct SpaprDrcClass {
> >  
> >      int (*dt_populate)(SpaprDrc *drc, struct SpaprMachineState *spapr,
> >                         void *fdt, int *fdt_start_offset, Error **errp);
> > +
> > +    void (*realize)(SpaprDrc *drc);
> > +    void (*unrealize)(SpaprDrc *drc);
> >  } SpaprDrcClass;
> >  
> >  typedef struct SpaprDrcPhysical {
> > @@ -232,6 +235,7 @@ SpaprDrcType spapr_drc_type(SpaprDrc *drc);
> >  
> >  SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
> >                                           uint32_t id);
> > +void spapr_dr_connector_free(SpaprDrc *drc);
> >  SpaprDrc *spapr_drc_by_index(uint32_t index);
> >  SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id);
> >  int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask);
> > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > index 8571d5bafe4e..e26763f8b5a4 100644
> > --- a/hw/ppc/spapr_drc.c
> > +++ b/hw/ppc/spapr_drc.c
> > @@ -27,7 +27,6 @@
> >  #include "sysemu/reset.h"
> >  #include "trace.h"
> >  
> > -#define DRC_CONTAINER_PATH "/dr-connector"
> >  #define DRC_INDEX_TYPE_SHIFT 28
> >  #define DRC_INDEX_ID_MASK ((1ULL << DRC_INDEX_TYPE_SHIFT) - 1)
> >  
> > @@ -503,65 +502,56 @@ static const VMStateDescription vmstate_spapr_drc = {
> >      }
> >  };
> >  
> > -static void drc_realize(DeviceState *d, Error **errp)
> > +static GHashTable *drc_hash_table(void)
> >  {
> > -    SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
> > -    Object *root_container;
> > -    gchar *link_name;
> > -    const char *child_name;
> > +    static GHashTable *dht;
> >  
> > +    if (!dht) {
> > +        dht = g_hash_table_new(NULL, NULL);
> > +    }
> > +
> > +    return dht;
> > +}
> > +
> > +
> > +static void drc_realize(SpaprDrc *drc)
> > +{
> >      trace_spapr_drc_realize(spapr_drc_index(drc));
> > -    /* NOTE: we do this as part of realize/unrealize due to the fact
> > -     * that the guest will communicate with the DRC via RTAS calls
> > -     * referencing the global DRC index. By unlinking the DRC
> > -     * from DRC_CONTAINER_PATH/<drc_index> we effectively make it
> > -     * inaccessible by the guest, since lookups rely on this path
> > -     * existing in the composition tree
> > -     */
> > -    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> > -    link_name = g_strdup_printf("%x", spapr_drc_index(drc));
> > -    child_name = object_get_canonical_path_component(OBJECT(drc));
> > -    trace_spapr_drc_realize_child(spapr_drc_index(drc), child_name);
> > -    object_property_add_alias(root_container, link_name,
> > -                              drc->owner, child_name);
> > -    g_free(link_name);
> > +
> > +    g_hash_table_insert(drc_hash_table(),
> > +                        GUINT_TO_POINTER(spapr_drc_index(drc)), drc);
> >      vmstate_register(VMSTATE_IF(drc), spapr_drc_index(drc), &vmstate_spapr_drc,
> >                       drc);
> >      trace_spapr_drc_realize_complete(spapr_drc_index(drc));
> >  }
> >  
> > -static void drc_unrealize(DeviceState *d)
> > +static void drc_unrealize(SpaprDrc *drc)
> >  {
> > -    SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
> > -    Object *root_container;
> > -    gchar *name;
> > -
> >      trace_spapr_drc_unrealize(spapr_drc_index(drc));
> >      vmstate_unregister(VMSTATE_IF(drc), &vmstate_spapr_drc, drc);
> > -    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> > -    name = g_strdup_printf("%x", spapr_drc_index(drc));
> > -    object_property_del(root_container, name);
> > -    g_free(name);
> > +    g_hash_table_remove(drc_hash_table(),
> > +                        GUINT_TO_POINTER(spapr_drc_index(drc)));
> >  }
> >  
> >  SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
> >                                           uint32_t id)
> >  {
> >      SpaprDrc *drc = SPAPR_DR_CONNECTOR(object_new(type));
> > -    char *prop_name;
> >  
> >      drc->id = id;
> > -    drc->owner = owner;
> > -    prop_name = g_strdup_printf("dr-connector[%"PRIu32"]",
> > -                                spapr_drc_index(drc));
> > -    object_property_add_child(owner, prop_name, OBJECT(drc));
> > -    object_unref(OBJECT(drc));
> > -    qdev_realize(DEVICE(drc), NULL, NULL);
> > -    g_free(prop_name);
> > +    drc->owner = object_ref(owner);
> > +    SPAPR_DR_CONNECTOR_GET_CLASS(drc)->realize(drc);
> >  
> >      return drc;
> >  }
> >  
> > +void spapr_dr_connector_free(SpaprDrc *drc)
> > +{
> > +    SPAPR_DR_CONNECTOR_GET_CLASS(drc)->unrealize(drc);
> > +    object_unref(drc->owner);
> > +    object_unref(drc);
> > +}
> > +
> >  static void spapr_dr_connector_instance_init(Object *obj)
> >  {
> >      SpaprDrc *drc = SPAPR_DR_CONNECTOR(obj);
> > @@ -575,17 +565,19 @@ static void spapr_dr_connector_instance_init(Object *obj)
> >      drc->state = drck->empty_state;
> >  }
> >  
> > +static char *drc_vmstate_if_get_id(VMStateIf *obj)
> > +{
> > +    return NULL;
> > +}
> > +
> >  static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
> >  {
> > -    DeviceClass *dk = DEVICE_CLASS(k);
> > +    SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
> > +    VMStateIfClass *vc = VMSTATE_IF_CLASS(k);
> >  
> > -    dk->realize = drc_realize;
> > -    dk->unrealize = drc_unrealize;
> > -    /*
> > -     * Reason: DR connector needs to be wired to either the machine or to a
> > -     * PHB in spapr_dr_connector_new().
> > -     */
> > -    dk->user_creatable = false;
> > +    drck->realize = drc_realize;
> > +    drck->unrealize = drc_unrealize;
> > +    vc->get_id = drc_vmstate_if_get_id;
> >  }
> >  
> >  static bool drc_physical_needed(void *opaque)
> > @@ -623,39 +615,32 @@ static void drc_physical_reset(void *opaque)
> >      }
> >  }
> >  
> > -static void realize_physical(DeviceState *d, Error **errp)
> > +static void realize_physical(SpaprDrc *drc)
> >  {
> > -    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(d);
> > -    Error *local_err = NULL;
> > -
> > -    drc_realize(d, &local_err);
> > -    if (local_err) {
> > -        error_propagate(errp, local_err);
> > -        return;
> > -    }
> > +    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(drc);
> >  
> > +    drc_realize(drc);
> >      vmstate_register(VMSTATE_IF(drcp),
> >                       spapr_drc_index(SPAPR_DR_CONNECTOR(drcp)),
> >                       &vmstate_spapr_drc_physical, drcp);
> >      qemu_register_reset(drc_physical_reset, drcp);
> >  }
> >  
> > -static void unrealize_physical(DeviceState *d)
> > +static void unrealize_physical(SpaprDrc *drc)
> >  {
> > -    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(d);
> > +    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(drc);
> >  
> > -    drc_unrealize(d);
> > -    vmstate_unregister(VMSTATE_IF(drcp), &vmstate_spapr_drc_physical, drcp);
> >      qemu_unregister_reset(drc_physical_reset, drcp);
> > +    vmstate_unregister(VMSTATE_IF(drcp), &vmstate_spapr_drc_physical, drcp);
> > +    drc_unrealize(drc);
> >  }
> >  
> >  static void spapr_drc_physical_class_init(ObjectClass *k, void *data)
> >  {
> > -    DeviceClass *dk = DEVICE_CLASS(k);
> >      SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
> >  
> > -    dk->realize = realize_physical;
> > -    dk->unrealize = unrealize_physical;
> > +    drck->realize = realize_physical;
> > +    drck->unrealize = unrealize_physical;
> >      drck->dr_entity_sense = physical_entity_sense;
> >      drck->isolate = drc_isolate_physical;
> >      drck->unisolate = drc_unisolate_physical;
> > @@ -731,12 +716,16 @@ static void spapr_drc_pmem_class_init(ObjectClass *k, void *data)
> >  
> >  static const TypeInfo spapr_dr_connector_info = {
> >      .name          = TYPE_SPAPR_DR_CONNECTOR,
> > -    .parent        = TYPE_DEVICE,
> > +    .parent        = TYPE_OBJECT,
> >      .instance_size = sizeof(SpaprDrc),
> >      .instance_init = spapr_dr_connector_instance_init,
> >      .class_size    = sizeof(SpaprDrcClass),
> >      .class_init    = spapr_dr_connector_class_init,
> >      .abstract      = true,
> > +    .interfaces = (InterfaceInfo[]) {
> > +        { TYPE_VMSTATE_IF },
> > +        { }
> > +    },
> >  };
> >  
> >  static const TypeInfo spapr_drc_physical_info = {
> > @@ -789,14 +778,9 @@ static const TypeInfo spapr_drc_pmem_info = {
> >  
> >  SpaprDrc *spapr_drc_by_index(uint32_t index)
> >  {
> > -    Object *obj;
> > -    gchar *name;
> > -
> > -    name = g_strdup_printf("%s/%x", DRC_CONTAINER_PATH, index);
> > -    obj = object_resolve_path(name, NULL);
> > -    g_free(name);
> > -
> > -    return !obj ? NULL : SPAPR_DR_CONNECTOR(obj);
> > +    return
> > +        SPAPR_DR_CONNECTOR(g_hash_table_lookup(drc_hash_table(),
> > +                                               GUINT_TO_POINTER(index)));
> >  }
> >  
> >  SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id)
> > @@ -824,13 +808,12 @@ SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id)
> >   */
> >  int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask)
> >  {
> > -    Object *root_container;
> > -    ObjectProperty *prop;
> > -    ObjectPropertyIterator iter;
> > +    GHashTableIter iter;
> >      uint32_t drc_count = 0;
> >      GArray *drc_indexes, *drc_power_domains;
> >      GString *drc_names, *drc_types;
> >      int ret;
> > +    SpaprDrc *drc;
> >  
> >      /*
> >       * This should really be only called once per node since it overwrites
> > @@ -851,26 +834,12 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask)
> >      drc_names = g_string_set_size(g_string_new(NULL), sizeof(uint32_t));
> >      drc_types = g_string_set_size(g_string_new(NULL), sizeof(uint32_t));
> >  
> > -    /* aliases for all DRConnector objects will be rooted in QOM
> > -     * composition tree at DRC_CONTAINER_PATH
> > -     */
> > -    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> > -
> > -    object_property_iter_init(&iter, root_container);
> > -    while ((prop = object_property_iter_next(&iter))) {
> > -        Object *obj;
> > -        SpaprDrc *drc;
> > +    g_hash_table_iter_init(&iter, drc_hash_table());
> > +    while (g_hash_table_iter_next(&iter, NULL, (gpointer) &drc)) {
> >          SpaprDrcClass *drck;
> >          char *drc_name = NULL;
> >          uint32_t drc_index, drc_power_domain;
> >  
> > -        if (!strstart(prop->type, "link<", NULL)) {
> > -            continue;
> > -        }
> > -
> > -        obj = object_property_get_link(root_container, prop->name,
> > -                                       &error_abort);
> > -        drc = SPAPR_DR_CONNECTOR(obj);
> >          drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> >  
> >          if (owner && (drc->owner != owner)) {
> > @@ -951,23 +920,12 @@ out:
> >  
> >  void spapr_drc_reset_all(SpaprMachineState *spapr)
> >  {
> > -    Object *drc_container;
> > -    ObjectProperty *prop;
> > -    ObjectPropertyIterator iter;
> > +    GHashTableIter iter;
> > +    SpaprDrc *drc;
> >  
> > -    drc_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> >  restart:
> > -    object_property_iter_init(&iter, drc_container);
> > -    while ((prop = object_property_iter_next(&iter))) {
> > -        SpaprDrc *drc;
> > -
> > -        if (!strstart(prop->type, "link<", NULL)) {
> > -            continue;
> > -        }
> > -        drc = SPAPR_DR_CONNECTOR(object_property_get_link(drc_container,
> > -                                                          prop->name,
> > -                                                          &error_abort));
> > -
> > +    g_hash_table_iter_init(&iter, drc_hash_table());
> > +    while (g_hash_table_iter_next(&iter, NULL, (gpointer) &drc)) {
> >          /*
> >           * This will complete any pending plug/unplug requests.
> >           * In case of a unplugged PHB or PCI bridge, this will
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 76d7c91e9c64..ca0cca664e3c 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -1262,7 +1262,7 @@ static void remove_drcs(SpaprPhbState *phb, PCIBus *bus)
> >          SpaprDrc *drc = drc_from_devfn(phb, chassis, i);
> >  
> >          if (drc) {
> > -            object_unparent(OBJECT(drc));
> > +            spapr_dr_connector_free(drc);
> >          }
> >      }
> >  }
>
David Gibson Feb. 8, 2021, 6:30 a.m. UTC | #4
On Wed, Jan 06, 2021 at 07:15:36PM +0100, Greg Kurz wrote:
> On Mon, 28 Dec 2020 19:28:39 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Fri, Dec 18, 2020 at 11:34:00AM +0100, Greg Kurz wrote:
> > > Modeling DR connectors as individual devices raises some
> > > concerns, as already discussed a year ago in this thread:
> > > 
> > > https://patchew.org/QEMU/20191017205953.13122-1-cheloha@linux.vnet.ibm.com/
> > > 
> > > First, high maxmem settings creates too many DRC devices.
> > > This causes scalability issues. It severely increase boot
> > > time because the multiple traversals of the DRC list that
> > > are performed during machine setup are quadratic operations.
> > > This is directly related to the fact that DRCs are modeled
> > > as individual devices and added to the composition tree.
> > > 
> > > Second, DR connectors are really an internal concept of
> > > PAPR. They aren't something that the user or management
> > > layer can manipulate in any way. We already don't allow
> > > their creation with device_add by clearing user_creatable.
> > > 
> > > DR connectors don't even need to be modeled as actual
> > > devices since they don't sit in a bus. They just need
> > > to be associated to an 'owner' object and to have the
> > > equivalent of realize/unrealize functions.
> > > 
> > > Downgrade them to be simple objects. Convert the existing
> > > realize() and unrealize() to be methods of the DR connector
> > > base class. Also have the base class to inherit from the
> > > vmstate_if interface directly. The get_id() hook simply
> > > returns NULL, just as device_vmstate_if_get_id() does for
> > > devices that don't sit in a bus. The DR connector is no
> > > longer made a child object. This means that it must be
> > > explicitely freed when no longer needed. This is only
> > > required for PHBs and PCI bridges actually : have them to
> > > free the DRC with spapr_dr_connector_free() instead of
> > > object_unparent().
> > > 
> > > No longer add the DRCs to the QOM composition tree. Track
> > > them with a glib hash table using the global DRC index as
> > > the key instead. This makes traversal a linear operation.
> > 
> > I have some reservations about this one.  The main thing is that
> > attaching migration state to something that's not a device seems a bit
> > odd to me.  AFAICT exactly one other non-device implements
> > TYPE_VMSTATE_IF, and what it does isn't very clear to me.
> > 
> 
> Even with your proposal below, the current SpaprDrc type, which is
> used all over the place, will stop being a TYPE_DEVICE but we still
> need to support migration with existing machine types for which DRC
> are devices.

Ah... that's a good point.

> Implementing TYPE_VMSTATE_IF is essentially a hack that
> allows to do that without keeping the current TYPE_DEVICE based
> implementation around.

Ok, that makes things clearer.

> > As I might have mentioned to you I had a different idea for how to
> > address this problem: still use a TYPE_DEVICE, but have it manage a
> > whole array of DRCs as one unit, rather than just a single one.
> > Specifically I was thinking:
> > 
> > * one array per PCI bus (DRCs for each function on the bus)
> > * one array for each block of memory (so one for base memory, one for
> >   each DIMM)
> > * one array for all the cpus
> > * one array for all the PHBs
> > 
> > It has some disadvantages compared to your scheme: it still leaves
> > (less) devices which can't be user managed, which is a bit ugly.  On
> > the other hand, each of those arrays can reasonably be dense, so we
> > can use direct indexing rather than a hash table, which is a bit
> > nicer.
> > 
> > Thoughts?
> > 
> 
> I find it a bit overkill to introduce a new TYPE_DEVICE (let's
> call it a DRC manager) for something that:
> - doesn't sit on a bus
> - can't be user managed
> - isn't directly represented to the guest as a full node
>   in the DT unlike all other devices, but just as indexes
>   in some properties of actual DR capable devices.
> 
> Given that the DRC index space is global and this is what
> the guest passes to DR RTAS calls, we can't do direct
> indexing, strictly speaking. We need at least some logic
> to dispatch operations on individual DRC states to the
> appropriate DRC manager. This logic belongs to the machine
> IMHO.
> 
> This shouldn't be too complex for CPUs and PHBs since they
> sit directly under the machine and have 1:1 relation with
> the attached device. It just boils down to instantiate
> some DRC managers during machine init:
> 
> - range [ 0x10000000 ... 0x10000000 + ${max_cpus} [
>   for CPUs
> - range [ 0x20000000 ... 0x20000000 + 31 [
>   for PHBs
> 
> For memory, the code currently generates DRC indexes in the range:
> 
> [ 0x80000000 ... 0x80000000 + ${base_ram_size}/256M ... ${max_ram_size}/256M [
> 
> ie. it doesn't generate DRC indexes for the base memory AFAICT. Also
> each DIMM can be of arbitrary size, ie. consume an arbitrary amount
> of DRC indexes. So the machine would instantiate SPAPR_MAX_RAM_SLOTS (32)
> DRC managers, each capable of managing the full set of LMB DRCs, just
> in case ? Likely a lot of zeroes with high maxmem settings but I guess
> we can live with it.

Actually, I was thinking of just a single manager for all the
(pluggable) LMB DRCs, a single manager for all CPU DRCs, a single
manager for all PHB DRCs and one per bus for PCI DRCs.  I'm not
assuming a 1:1 correspondance between manager and user side hotplug
operations.

Although... actually the "manager" could be an interface rather than
an object, in which case the DRC manager would be the machine for
LMBs, CPUs, and PHBs and the parent bus for each PCI slot.

> PCI busses would need some extra care though since the machine
> doesn't know about them. This would require to be able to
> register/unregister DRC managers for SPAPR_DR_CONNECTOR_TYPE_PCI
> indexes, so that the dispatching logic know about the ranges
> they cover (PHB internals).

Right, but that wouldn't really be any different from the dynamic
creation of DRCs we do add_drcs() / remove_drcs() right now, except
that it would create /destroy one object instead of a bunch.

> And finally comes migration : I cannot think of a way to generate
> the VMState sections used by existing machine types out of a set
> of arrays of integers... We could keep the current implementation
> around and use it with older machine types, but this certainly
> looks terrible from a maintenance perspective. Did you have any
> suggestion to handle that ?

Ugh, yeah.. that could be difficult.

> I seem to remember that one of the motivation to have arrays
> of DRCs is to avoid the inflation of VMState sections that
> we currently get with high maxmem settings, and it is considered
> preferable to stream sparse arrays. This could be achieved by
> building these arrays out of the global DRC hash table in a machine
> pre-save handler and migrate them in a subsection for the default
> machine type. Older machine types would continue with the current
> VMState sections thanks to the TYPE_VMSTATE_IF hack.
> 
> Does this seem a reasonable trade-off to be able to support
> older and newer machine types with the same implementation ?

Hrm, maybe, yeah.
diff mbox series

Patch

diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 8982927d5c24..a26aa8b9d4c3 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -170,7 +170,7 @@  typedef enum {
 
 typedef struct SpaprDrc {
     /*< private >*/
-    DeviceState parent;
+    Object parent;
 
     uint32_t id;
     Object *owner;
@@ -193,7 +193,7 @@  struct SpaprMachineState;
 
 typedef struct SpaprDrcClass {
     /*< private >*/
-    DeviceClass parent;
+    ObjectClass parent;
     SpaprDrcState empty_state;
     SpaprDrcState ready_state;
 
@@ -209,6 +209,9 @@  typedef struct SpaprDrcClass {
 
     int (*dt_populate)(SpaprDrc *drc, struct SpaprMachineState *spapr,
                        void *fdt, int *fdt_start_offset, Error **errp);
+
+    void (*realize)(SpaprDrc *drc);
+    void (*unrealize)(SpaprDrc *drc);
 } SpaprDrcClass;
 
 typedef struct SpaprDrcPhysical {
@@ -232,6 +235,7 @@  SpaprDrcType spapr_drc_type(SpaprDrc *drc);
 
 SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
                                          uint32_t id);
+void spapr_dr_connector_free(SpaprDrc *drc);
 SpaprDrc *spapr_drc_by_index(uint32_t index);
 SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id);
 int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask);
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 8571d5bafe4e..e26763f8b5a4 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -27,7 +27,6 @@ 
 #include "sysemu/reset.h"
 #include "trace.h"
 
-#define DRC_CONTAINER_PATH "/dr-connector"
 #define DRC_INDEX_TYPE_SHIFT 28
 #define DRC_INDEX_ID_MASK ((1ULL << DRC_INDEX_TYPE_SHIFT) - 1)
 
@@ -503,65 +502,56 @@  static const VMStateDescription vmstate_spapr_drc = {
     }
 };
 
-static void drc_realize(DeviceState *d, Error **errp)
+static GHashTable *drc_hash_table(void)
 {
-    SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
-    Object *root_container;
-    gchar *link_name;
-    const char *child_name;
+    static GHashTable *dht;
 
+    if (!dht) {
+        dht = g_hash_table_new(NULL, NULL);
+    }
+
+    return dht;
+}
+
+
+static void drc_realize(SpaprDrc *drc)
+{
     trace_spapr_drc_realize(spapr_drc_index(drc));
-    /* NOTE: we do this as part of realize/unrealize due to the fact
-     * that the guest will communicate with the DRC via RTAS calls
-     * referencing the global DRC index. By unlinking the DRC
-     * from DRC_CONTAINER_PATH/<drc_index> we effectively make it
-     * inaccessible by the guest, since lookups rely on this path
-     * existing in the composition tree
-     */
-    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
-    link_name = g_strdup_printf("%x", spapr_drc_index(drc));
-    child_name = object_get_canonical_path_component(OBJECT(drc));
-    trace_spapr_drc_realize_child(spapr_drc_index(drc), child_name);
-    object_property_add_alias(root_container, link_name,
-                              drc->owner, child_name);
-    g_free(link_name);
+
+    g_hash_table_insert(drc_hash_table(),
+                        GUINT_TO_POINTER(spapr_drc_index(drc)), drc);
     vmstate_register(VMSTATE_IF(drc), spapr_drc_index(drc), &vmstate_spapr_drc,
                      drc);
     trace_spapr_drc_realize_complete(spapr_drc_index(drc));
 }
 
-static void drc_unrealize(DeviceState *d)
+static void drc_unrealize(SpaprDrc *drc)
 {
-    SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
-    Object *root_container;
-    gchar *name;
-
     trace_spapr_drc_unrealize(spapr_drc_index(drc));
     vmstate_unregister(VMSTATE_IF(drc), &vmstate_spapr_drc, drc);
-    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
-    name = g_strdup_printf("%x", spapr_drc_index(drc));
-    object_property_del(root_container, name);
-    g_free(name);
+    g_hash_table_remove(drc_hash_table(),
+                        GUINT_TO_POINTER(spapr_drc_index(drc)));
 }
 
 SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
                                          uint32_t id)
 {
     SpaprDrc *drc = SPAPR_DR_CONNECTOR(object_new(type));
-    char *prop_name;
 
     drc->id = id;
-    drc->owner = owner;
-    prop_name = g_strdup_printf("dr-connector[%"PRIu32"]",
-                                spapr_drc_index(drc));
-    object_property_add_child(owner, prop_name, OBJECT(drc));
-    object_unref(OBJECT(drc));
-    qdev_realize(DEVICE(drc), NULL, NULL);
-    g_free(prop_name);
+    drc->owner = object_ref(owner);
+    SPAPR_DR_CONNECTOR_GET_CLASS(drc)->realize(drc);
 
     return drc;
 }
 
+void spapr_dr_connector_free(SpaprDrc *drc)
+{
+    SPAPR_DR_CONNECTOR_GET_CLASS(drc)->unrealize(drc);
+    object_unref(drc->owner);
+    object_unref(drc);
+}
+
 static void spapr_dr_connector_instance_init(Object *obj)
 {
     SpaprDrc *drc = SPAPR_DR_CONNECTOR(obj);
@@ -575,17 +565,19 @@  static void spapr_dr_connector_instance_init(Object *obj)
     drc->state = drck->empty_state;
 }
 
+static char *drc_vmstate_if_get_id(VMStateIf *obj)
+{
+    return NULL;
+}
+
 static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
 {
-    DeviceClass *dk = DEVICE_CLASS(k);
+    SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
+    VMStateIfClass *vc = VMSTATE_IF_CLASS(k);
 
-    dk->realize = drc_realize;
-    dk->unrealize = drc_unrealize;
-    /*
-     * Reason: DR connector needs to be wired to either the machine or to a
-     * PHB in spapr_dr_connector_new().
-     */
-    dk->user_creatable = false;
+    drck->realize = drc_realize;
+    drck->unrealize = drc_unrealize;
+    vc->get_id = drc_vmstate_if_get_id;
 }
 
 static bool drc_physical_needed(void *opaque)
@@ -623,39 +615,32 @@  static void drc_physical_reset(void *opaque)
     }
 }
 
-static void realize_physical(DeviceState *d, Error **errp)
+static void realize_physical(SpaprDrc *drc)
 {
-    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(d);
-    Error *local_err = NULL;
-
-    drc_realize(d, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        return;
-    }
+    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(drc);
 
+    drc_realize(drc);
     vmstate_register(VMSTATE_IF(drcp),
                      spapr_drc_index(SPAPR_DR_CONNECTOR(drcp)),
                      &vmstate_spapr_drc_physical, drcp);
     qemu_register_reset(drc_physical_reset, drcp);
 }
 
-static void unrealize_physical(DeviceState *d)
+static void unrealize_physical(SpaprDrc *drc)
 {
-    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(d);
+    SpaprDrcPhysical *drcp = SPAPR_DRC_PHYSICAL(drc);
 
-    drc_unrealize(d);
-    vmstate_unregister(VMSTATE_IF(drcp), &vmstate_spapr_drc_physical, drcp);
     qemu_unregister_reset(drc_physical_reset, drcp);
+    vmstate_unregister(VMSTATE_IF(drcp), &vmstate_spapr_drc_physical, drcp);
+    drc_unrealize(drc);
 }
 
 static void spapr_drc_physical_class_init(ObjectClass *k, void *data)
 {
-    DeviceClass *dk = DEVICE_CLASS(k);
     SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
 
-    dk->realize = realize_physical;
-    dk->unrealize = unrealize_physical;
+    drck->realize = realize_physical;
+    drck->unrealize = unrealize_physical;
     drck->dr_entity_sense = physical_entity_sense;
     drck->isolate = drc_isolate_physical;
     drck->unisolate = drc_unisolate_physical;
@@ -731,12 +716,16 @@  static void spapr_drc_pmem_class_init(ObjectClass *k, void *data)
 
 static const TypeInfo spapr_dr_connector_info = {
     .name          = TYPE_SPAPR_DR_CONNECTOR,
-    .parent        = TYPE_DEVICE,
+    .parent        = TYPE_OBJECT,
     .instance_size = sizeof(SpaprDrc),
     .instance_init = spapr_dr_connector_instance_init,
     .class_size    = sizeof(SpaprDrcClass),
     .class_init    = spapr_dr_connector_class_init,
     .abstract      = true,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_VMSTATE_IF },
+        { }
+    },
 };
 
 static const TypeInfo spapr_drc_physical_info = {
@@ -789,14 +778,9 @@  static const TypeInfo spapr_drc_pmem_info = {
 
 SpaprDrc *spapr_drc_by_index(uint32_t index)
 {
-    Object *obj;
-    gchar *name;
-
-    name = g_strdup_printf("%s/%x", DRC_CONTAINER_PATH, index);
-    obj = object_resolve_path(name, NULL);
-    g_free(name);
-
-    return !obj ? NULL : SPAPR_DR_CONNECTOR(obj);
+    return
+        SPAPR_DR_CONNECTOR(g_hash_table_lookup(drc_hash_table(),
+                                               GUINT_TO_POINTER(index)));
 }
 
 SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id)
@@ -824,13 +808,12 @@  SpaprDrc *spapr_drc_by_id(const char *type, uint32_t id)
  */
 int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask)
 {
-    Object *root_container;
-    ObjectProperty *prop;
-    ObjectPropertyIterator iter;
+    GHashTableIter iter;
     uint32_t drc_count = 0;
     GArray *drc_indexes, *drc_power_domains;
     GString *drc_names, *drc_types;
     int ret;
+    SpaprDrc *drc;
 
     /*
      * This should really be only called once per node since it overwrites
@@ -851,26 +834,12 @@  int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask)
     drc_names = g_string_set_size(g_string_new(NULL), sizeof(uint32_t));
     drc_types = g_string_set_size(g_string_new(NULL), sizeof(uint32_t));
 
-    /* aliases for all DRConnector objects will be rooted in QOM
-     * composition tree at DRC_CONTAINER_PATH
-     */
-    root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
-
-    object_property_iter_init(&iter, root_container);
-    while ((prop = object_property_iter_next(&iter))) {
-        Object *obj;
-        SpaprDrc *drc;
+    g_hash_table_iter_init(&iter, drc_hash_table());
+    while (g_hash_table_iter_next(&iter, NULL, (gpointer) &drc)) {
         SpaprDrcClass *drck;
         char *drc_name = NULL;
         uint32_t drc_index, drc_power_domain;
 
-        if (!strstart(prop->type, "link<", NULL)) {
-            continue;
-        }
-
-        obj = object_property_get_link(root_container, prop->name,
-                                       &error_abort);
-        drc = SPAPR_DR_CONNECTOR(obj);
         drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
         if (owner && (drc->owner != owner)) {
@@ -951,23 +920,12 @@  out:
 
 void spapr_drc_reset_all(SpaprMachineState *spapr)
 {
-    Object *drc_container;
-    ObjectProperty *prop;
-    ObjectPropertyIterator iter;
+    GHashTableIter iter;
+    SpaprDrc *drc;
 
-    drc_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
 restart:
-    object_property_iter_init(&iter, drc_container);
-    while ((prop = object_property_iter_next(&iter))) {
-        SpaprDrc *drc;
-
-        if (!strstart(prop->type, "link<", NULL)) {
-            continue;
-        }
-        drc = SPAPR_DR_CONNECTOR(object_property_get_link(drc_container,
-                                                          prop->name,
-                                                          &error_abort));
-
+    g_hash_table_iter_init(&iter, drc_hash_table());
+    while (g_hash_table_iter_next(&iter, NULL, (gpointer) &drc)) {
         /*
          * This will complete any pending plug/unplug requests.
          * In case of a unplugged PHB or PCI bridge, this will
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 76d7c91e9c64..ca0cca664e3c 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1262,7 +1262,7 @@  static void remove_drcs(SpaprPhbState *phb, PCIBus *bus)
         SpaprDrc *drc = drc_from_devfn(phb, chassis, i);
 
         if (drc) {
-            object_unparent(OBJECT(drc));
+            spapr_dr_connector_free(drc);
         }
     }
 }