diff mbox series

[3/6] ppc: Reparent presenter objects to the interrupt controller object

Message ID 157184233056.3053790.13073641279894392321.stgit@bahia.lan
State New
Headers show
Series ppc: Reparent the interrupt presenter | expand

Commit Message

Greg Kurz Oct. 23, 2019, 2:52 p.m. UTC
Each VCPU is associated to a presenter object within the interrupt
controller, ie. TCTX for XIVE and ICP for XICS, but our current
models put these objects below the VCPU, and we rely on CPU_FOREACH()
to do anything that involves presenters.

This recently bit us with the CAM line matching logic in XIVE because
CPU_FOREACH() can race with CPU hotplug and we ended up considering a
VCPU that wasn't associated to a TCTX object yet. Other users of
CPU_FOREACH() are 'info pic' for both XICS and XIVE. It is again very
easy to crash QEMU with concurrent VCPU hotplug/unplug and 'info pic'.

Reparent the presenter objects to the corresponding interrupt controller
object, ie. XIVE router or ICS, to make it clear they are not some extra
data hanging from the CPU but internal XIVE or XICS entities. The CPU
object now needs to explicitely take a reference on the presenter to
ensure its pointer remains valid until unrealize time.

This will allow to get rid of CPU_FOREACH() and ease further improvements
to the XIVE model.

This change doesn't impact section ids and is thus transparent to
migration.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/spapr_xive.c  |    6 +++++-
 hw/intc/xics.c        |    7 +++++--
 hw/intc/xics_spapr.c  |    8 ++++++--
 hw/intc/xive.c        |    4 +++-
 hw/ppc/pnv.c          |   17 +++++++++++++----
 include/hw/ppc/xics.h |    2 +-
 6 files changed, 33 insertions(+), 11 deletions(-)

Comments

David Gibson Oct. 24, 2019, 2:58 a.m. UTC | #1
On Wed, Oct 23, 2019 at 04:52:10PM +0200, Greg Kurz wrote:
> Each VCPU is associated to a presenter object within the interrupt
> controller, ie. TCTX for XIVE and ICP for XICS, but our current
> models put these objects below the VCPU, and we rely on CPU_FOREACH()
> to do anything that involves presenters.
> 
> This recently bit us with the CAM line matching logic in XIVE because
> CPU_FOREACH() can race with CPU hotplug and we ended up considering a
> VCPU that wasn't associated to a TCTX object yet. Other users of
> CPU_FOREACH() are 'info pic' for both XICS and XIVE. It is again very
> easy to crash QEMU with concurrent VCPU hotplug/unplug and 'info pic'.
> 
> Reparent the presenter objects to the corresponding interrupt controller
> object, ie. XIVE router or ICS, to make it clear they are not some extra
> data hanging from the CPU but internal XIVE or XICS entities. The CPU
> object now needs to explicitely take a reference on the presenter to
> ensure its pointer remains valid until unrealize time.
> 
> This will allow to get rid of CPU_FOREACH() and ease further improvements
> to the XIVE model.
> 
> This change doesn't impact section ids and is thus transparent to
> migration.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>


Urgh.  I see why we want something like this, but reparenting the
presenters to the source modules (particularly for XICS) makes me
uncomfortable.

AFAICT the association here is *purely* about managing lifetimes, not
because those ICPs inherently have something to do with those ICSes.
Same with XIVE, although since they'll be on the same chip there's a
bit more logic to it.

For spapr we kinda-sorta treat the (single) ICS or XiveRouter object
as the "master" of the interrupt controller.  But that's not really
accurate to the hardware, so I don't really want to push that way of
looking at it any further than it already is.

If we could make the presenters children of the machine (spapr) or
chip (pnv) that might make more sense?

I'm also not totally convinced that having the presenter as a child of
the cpu is inherently bad.  Yes we have bugs now, but maybe the right
fix *is* actually to do the NULL check on every CPU_FOREACH().

For comparison, the lapic on x86 machines is a child of the cpu, and I
believe they do have NULL checks on cpu->apic_state in various places
they use CPU_FOREACH().

> ---
>  hw/intc/spapr_xive.c  |    6 +++++-
>  hw/intc/xics.c        |    7 +++++--
>  hw/intc/xics_spapr.c  |    8 ++++++--
>  hw/intc/xive.c        |    4 +++-
>  hw/ppc/pnv.c          |   17 +++++++++++++----
>  include/hw/ppc/xics.h |    2 +-
>  6 files changed, 33 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index b09cc48bcb61..d74ee71e76b4 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -526,6 +526,7 @@ static int spapr_xive_cpu_intc_create(SpaprInterruptController *intc,
>          return -1;
>      }
>  
> +    object_ref(obj);
>      spapr_cpu->tctx = XIVE_TCTX(obj);
>      return 0;
>  }
> @@ -558,7 +559,10 @@ static void spapr_xive_cpu_intc_reset(SpaprInterruptController *intc,
>  static void spapr_xive_cpu_intc_destroy(SpaprInterruptController *intc,
>                                          PowerPCCPU *cpu)
>  {
> -    xive_tctx_destroy(spapr_cpu_state(cpu)->tctx);
> +    XiveTCTX *tctx = spapr_cpu_state(cpu)->tctx;
> +
> +    object_unref(OBJECT(tctx));
> +    xive_tctx_destroy(tctx);
>  }
>  
>  static void spapr_xive_set_irq(SpaprInterruptController *intc, int irq, int val)
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 5f746079be46..d5e4db668a4b 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -380,13 +380,16 @@ static const TypeInfo icp_info = {
>      .class_size = sizeof(ICPStateClass),
>  };
>  
> -Object *icp_create(Object *cpu, const char *type, XICSFabric *xi, Error **errp)
> +Object *icp_create(Object *cpu, const char *type, ICSState *ics, XICSFabric *xi,
> +                   Error **errp)
>  {
>      Error *local_err = NULL;
> +    g_autofree char *name = NULL;
>      Object *obj;
>  
>      obj = object_new(type);
> -    object_property_add_child(cpu, type, obj, &error_abort);
> +    name = g_strdup_printf("%s[%d]", type, CPU(cpu)->cpu_index);
> +    object_property_add_child(OBJECT(ics), name, obj, &error_abort);
>      object_unref(obj);
>      object_ref(OBJECT(xi));
>      object_property_add_const_link(obj, ICP_PROP_XICS, OBJECT(xi),
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index 5977d1bdb73f..080ed73aad64 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -337,11 +337,12 @@ static int xics_spapr_cpu_intc_create(SpaprInterruptController *intc,
>      Object *obj;
>      SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
>  
> -    obj = icp_create(OBJECT(cpu), TYPE_ICP, ics->xics, errp);
> +    obj = icp_create(OBJECT(cpu), TYPE_ICP, ics, ics->xics, errp);
>      if (!obj) {
>          return -1;
>      }
>  
> +    object_ref(obj);
>      spapr_cpu->icp = ICP(obj);
>      return 0;
>  }
> @@ -355,7 +356,10 @@ static void xics_spapr_cpu_intc_reset(SpaprInterruptController *intc,
>  static void xics_spapr_cpu_intc_destroy(SpaprInterruptController *intc,
>                                          PowerPCCPU *cpu)
>  {
> -    icp_destroy(spapr_cpu_state(cpu)->icp);
> +    ICPState *icp = spapr_cpu_state(cpu)->icp;
> +
> +    object_unref(OBJECT(icp));
> +    icp_destroy(icp);
>  }
>  
>  static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> index 952a461d5329..8d2da4a11163 100644
> --- a/hw/intc/xive.c
> +++ b/hw/intc/xive.c
> @@ -677,10 +677,12 @@ static const TypeInfo xive_tctx_info = {
>  Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp)
>  {
>      Error *local_err = NULL;
> +    g_autofree char *name = NULL;
>      Object *obj;
>  
>      obj = object_new(TYPE_XIVE_TCTX);
> -    object_property_add_child(cpu, TYPE_XIVE_TCTX, obj, &error_abort);
> +    name = g_strdup_printf(TYPE_XIVE_TCTX "[%d]", CPU(cpu)->cpu_index);
> +    object_property_add_child(OBJECT(xrtr), name, obj, &error_abort);
>      object_unref(obj);
>      object_ref(cpu);
>      object_property_add_const_link(obj, "cpu", cpu, &error_abort);
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index bd17c3536dd5..cbeabf98bff6 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -767,14 +767,16 @@ static void pnv_chip_power8_intc_create(PnvChip *chip, PowerPCCPU *cpu,
>      Error *local_err = NULL;
>      Object *obj;
>      PnvCPUState *pnv_cpu = pnv_cpu_state(cpu);
> +    Pnv8Chip *chip8 = PNV8_CHIP(chip);
>  
> -    obj = icp_create(OBJECT(cpu), TYPE_PNV_ICP, XICS_FABRIC(qdev_get_machine()),
> -                     &local_err);
> +    obj = icp_create(OBJECT(cpu), TYPE_PNV_ICP, &chip8->psi.ics,
> +                     XICS_FABRIC(qdev_get_machine()), &local_err);

Here, the association of the ICPs with the PSI ICS is particularly arbitrary.

>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
>      }
>  
> +    object_ref(obj);
>      pnv_cpu->intc = obj;
>  }
>  
> @@ -788,7 +790,10 @@ static void pnv_chip_power8_intc_reset(PnvChip *chip, PowerPCCPU *cpu)
>  
>  static void pnv_chip_power8_intc_destroy(PnvChip *chip, PowerPCCPU *cpu)
>  {
> -    icp_destroy(ICP(pnv_cpu_state(cpu)->intc));
> +    Object *intc = pnv_cpu_state(cpu)->intc;
> +
> +    object_unref(intc);
> +    icp_destroy(ICP(intc));
>  }
>  
>  /*
> @@ -825,6 +830,7 @@ static void pnv_chip_power9_intc_create(PnvChip *chip, PowerPCCPU *cpu,
>          return;
>      }
>  
> +    object_ref(obj);
>      pnv_cpu->intc = obj;
>  }
>  
> @@ -837,7 +843,10 @@ static void pnv_chip_power9_intc_reset(PnvChip *chip, PowerPCCPU *cpu)
>  
>  static void pnv_chip_power9_intc_destroy(PnvChip *chip, PowerPCCPU *cpu)
>  {
> -    xive_tctx_destroy(XIVE_TCTX(pnv_cpu_state(cpu)->intc));
> +    Object *intc = pnv_cpu_state(cpu)->intc;
> +
> +    object_unref(intc);
> +    xive_tctx_destroy(XIVE_TCTX(intc));
>  }
>  
>  /*
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 48a75aa4ab75..f4827e748fd8 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -179,7 +179,7 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon);
>  void ics_resend(ICSState *ics);
>  void icp_resend(ICPState *ss);
>  
> -Object *icp_create(Object *cpu, const char *type, XICSFabric *xi,
> +Object *icp_create(Object *cpu, const char *type, ICSState *ics, XICSFabric *xi,
>                     Error **errp);
>  void icp_destroy(ICPState *icp);
>  
>
Greg Kurz Oct. 24, 2019, 9:04 a.m. UTC | #2
On Thu, 24 Oct 2019 13:58:41 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Oct 23, 2019 at 04:52:10PM +0200, Greg Kurz wrote:
> > Each VCPU is associated to a presenter object within the interrupt
> > controller, ie. TCTX for XIVE and ICP for XICS, but our current
> > models put these objects below the VCPU, and we rely on CPU_FOREACH()
> > to do anything that involves presenters.
> > 
> > This recently bit us with the CAM line matching logic in XIVE because
> > CPU_FOREACH() can race with CPU hotplug and we ended up considering a
> > VCPU that wasn't associated to a TCTX object yet. Other users of
> > CPU_FOREACH() are 'info pic' for both XICS and XIVE. It is again very
> > easy to crash QEMU with concurrent VCPU hotplug/unplug and 'info pic'.
> > 
> > Reparent the presenter objects to the corresponding interrupt controller
> > object, ie. XIVE router or ICS, to make it clear they are not some extra
> > data hanging from the CPU but internal XIVE or XICS entities. The CPU
> > object now needs to explicitely take a reference on the presenter to
> > ensure its pointer remains valid until unrealize time.
> > 
> > This will allow to get rid of CPU_FOREACH() and ease further improvements
> > to the XIVE model.
> > 
> > This change doesn't impact section ids and is thus transparent to
> > migration.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> 
> 
> Urgh.  I see why we want something like this, but reparenting the
> presenters to the source modules (particularly for XICS) makes me
> uncomfortable.
> 
> AFAICT the association here is *purely* about managing lifetimes, not
> because those ICPs inherently have something to do with those ICSes.
> Same with XIVE, although since they'll be on the same chip there's a
> bit more logic to it.
> 

I did it this way for XIVE because they are indeed on the same chip,
but I agree it is questionable for XICS.

> For spapr we kinda-sorta treat the (single) ICS or XiveRouter object
> as the "master" of the interrupt controller.  But that's not really

Agreed for XICS. On the other side, the XIVE controller chip really has
a routing sub-engine (IVRE) and a presenter sub-engine (IVPE), and the
TCTXs do reside in an SRAM within the IVPE. The XiveRouter type might
be better named XiveController, and each instance to expose a XiveRouter
and a XivePresenter interface. We have one per chip for PNV and only a
single one for sPAPR.

> accurate to the hardware, so I don't really want to push that way of
> looking at it any further than it already is.
> 
> If we could make the presenters children of the machine (spapr) or
> chip (pnv) that might make more sense?
> 

It probably makes sense for XICS, not sure this makes things clearer
for XIVE.

> I'm also not totally convinced that having the presenter as a child of
> the cpu is inherently bad.  Yes we have bugs now, but maybe the right
> fix *is* actually to do the NULL check on every CPU_FOREACH().
> 
> For comparison, the lapic on x86 machines is a child of the cpu, and I
> believe they do have NULL checks on cpu->apic_state in various places
> they use CPU_FOREACH().
> 

I didn't want to go that way because I was finding CPU_FOREACH() to
be fragile since all users are broken, but I can revisit that. Maybe
worth consolidating the logic in a PRESENTER_FOREACH() macro in order
to avoid future breakage with CPU_FOREACH() ?

> > ---
> >  hw/intc/spapr_xive.c  |    6 +++++-
> >  hw/intc/xics.c        |    7 +++++--
> >  hw/intc/xics_spapr.c  |    8 ++++++--
> >  hw/intc/xive.c        |    4 +++-
> >  hw/ppc/pnv.c          |   17 +++++++++++++----
> >  include/hw/ppc/xics.h |    2 +-
> >  6 files changed, 33 insertions(+), 11 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index b09cc48bcb61..d74ee71e76b4 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -526,6 +526,7 @@ static int spapr_xive_cpu_intc_create(SpaprInterruptController *intc,
> >          return -1;
> >      }
> >  
> > +    object_ref(obj);
> >      spapr_cpu->tctx = XIVE_TCTX(obj);
> >      return 0;
> >  }
> > @@ -558,7 +559,10 @@ static void spapr_xive_cpu_intc_reset(SpaprInterruptController *intc,
> >  static void spapr_xive_cpu_intc_destroy(SpaprInterruptController *intc,
> >                                          PowerPCCPU *cpu)
> >  {
> > -    xive_tctx_destroy(spapr_cpu_state(cpu)->tctx);
> > +    XiveTCTX *tctx = spapr_cpu_state(cpu)->tctx;
> > +
> > +    object_unref(OBJECT(tctx));
> > +    xive_tctx_destroy(tctx);
> >  }
> >  
> >  static void spapr_xive_set_irq(SpaprInterruptController *intc, int irq, int val)
> > diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> > index 5f746079be46..d5e4db668a4b 100644
> > --- a/hw/intc/xics.c
> > +++ b/hw/intc/xics.c
> > @@ -380,13 +380,16 @@ static const TypeInfo icp_info = {
> >      .class_size = sizeof(ICPStateClass),
> >  };
> >  
> > -Object *icp_create(Object *cpu, const char *type, XICSFabric *xi, Error **errp)
> > +Object *icp_create(Object *cpu, const char *type, ICSState *ics, XICSFabric *xi,
> > +                   Error **errp)
> >  {
> >      Error *local_err = NULL;
> > +    g_autofree char *name = NULL;
> >      Object *obj;
> >  
> >      obj = object_new(type);
> > -    object_property_add_child(cpu, type, obj, &error_abort);
> > +    name = g_strdup_printf("%s[%d]", type, CPU(cpu)->cpu_index);
> > +    object_property_add_child(OBJECT(ics), name, obj, &error_abort);
> >      object_unref(obj);
> >      object_ref(OBJECT(xi));
> >      object_property_add_const_link(obj, ICP_PROP_XICS, OBJECT(xi),
> > diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> > index 5977d1bdb73f..080ed73aad64 100644
> > --- a/hw/intc/xics_spapr.c
> > +++ b/hw/intc/xics_spapr.c
> > @@ -337,11 +337,12 @@ static int xics_spapr_cpu_intc_create(SpaprInterruptController *intc,
> >      Object *obj;
> >      SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> >  
> > -    obj = icp_create(OBJECT(cpu), TYPE_ICP, ics->xics, errp);
> > +    obj = icp_create(OBJECT(cpu), TYPE_ICP, ics, ics->xics, errp);
> >      if (!obj) {
> >          return -1;
> >      }
> >  
> > +    object_ref(obj);
> >      spapr_cpu->icp = ICP(obj);
> >      return 0;
> >  }
> > @@ -355,7 +356,10 @@ static void xics_spapr_cpu_intc_reset(SpaprInterruptController *intc,
> >  static void xics_spapr_cpu_intc_destroy(SpaprInterruptController *intc,
> >                                          PowerPCCPU *cpu)
> >  {
> > -    icp_destroy(spapr_cpu_state(cpu)->icp);
> > +    ICPState *icp = spapr_cpu_state(cpu)->icp;
> > +
> > +    object_unref(OBJECT(icp));
> > +    icp_destroy(icp);
> >  }
> >  
> >  static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
> > diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> > index 952a461d5329..8d2da4a11163 100644
> > --- a/hw/intc/xive.c
> > +++ b/hw/intc/xive.c
> > @@ -677,10 +677,12 @@ static const TypeInfo xive_tctx_info = {
> >  Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp)
> >  {
> >      Error *local_err = NULL;
> > +    g_autofree char *name = NULL;
> >      Object *obj;
> >  
> >      obj = object_new(TYPE_XIVE_TCTX);
> > -    object_property_add_child(cpu, TYPE_XIVE_TCTX, obj, &error_abort);
> > +    name = g_strdup_printf(TYPE_XIVE_TCTX "[%d]", CPU(cpu)->cpu_index);
> > +    object_property_add_child(OBJECT(xrtr), name, obj, &error_abort);
> >      object_unref(obj);
> >      object_ref(cpu);
> >      object_property_add_const_link(obj, "cpu", cpu, &error_abort);
> > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> > index bd17c3536dd5..cbeabf98bff6 100644
> > --- a/hw/ppc/pnv.c
> > +++ b/hw/ppc/pnv.c
> > @@ -767,14 +767,16 @@ static void pnv_chip_power8_intc_create(PnvChip *chip, PowerPCCPU *cpu,
> >      Error *local_err = NULL;
> >      Object *obj;
> >      PnvCPUState *pnv_cpu = pnv_cpu_state(cpu);
> > +    Pnv8Chip *chip8 = PNV8_CHIP(chip);
> >  
> > -    obj = icp_create(OBJECT(cpu), TYPE_PNV_ICP, XICS_FABRIC(qdev_get_machine()),
> > -                     &local_err);
> > +    obj = icp_create(OBJECT(cpu), TYPE_PNV_ICP, &chip8->psi.ics,
> > +                     XICS_FABRIC(qdev_get_machine()), &local_err);
> 
> Here, the association of the ICPs with the PSI ICS is particularly arbitrary.
> 
> >      if (local_err) {
> >          error_propagate(errp, local_err);
> >          return;
> >      }
> >  
> > +    object_ref(obj);
> >      pnv_cpu->intc = obj;
> >  }
> >  
> > @@ -788,7 +790,10 @@ static void pnv_chip_power8_intc_reset(PnvChip *chip, PowerPCCPU *cpu)
> >  
> >  static void pnv_chip_power8_intc_destroy(PnvChip *chip, PowerPCCPU *cpu)
> >  {
> > -    icp_destroy(ICP(pnv_cpu_state(cpu)->intc));
> > +    Object *intc = pnv_cpu_state(cpu)->intc;
> > +
> > +    object_unref(intc);
> > +    icp_destroy(ICP(intc));
> >  }
> >  
> >  /*
> > @@ -825,6 +830,7 @@ static void pnv_chip_power9_intc_create(PnvChip *chip, PowerPCCPU *cpu,
> >          return;
> >      }
> >  
> > +    object_ref(obj);
> >      pnv_cpu->intc = obj;
> >  }
> >  
> > @@ -837,7 +843,10 @@ static void pnv_chip_power9_intc_reset(PnvChip *chip, PowerPCCPU *cpu)
> >  
> >  static void pnv_chip_power9_intc_destroy(PnvChip *chip, PowerPCCPU *cpu)
> >  {
> > -    xive_tctx_destroy(XIVE_TCTX(pnv_cpu_state(cpu)->intc));
> > +    Object *intc = pnv_cpu_state(cpu)->intc;
> > +
> > +    object_unref(intc);
> > +    xive_tctx_destroy(XIVE_TCTX(intc));
> >  }
> >  
> >  /*
> > diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> > index 48a75aa4ab75..f4827e748fd8 100644
> > --- a/include/hw/ppc/xics.h
> > +++ b/include/hw/ppc/xics.h
> > @@ -179,7 +179,7 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon);
> >  void ics_resend(ICSState *ics);
> >  void icp_resend(ICPState *ss);
> >  
> > -Object *icp_create(Object *cpu, const char *type, XICSFabric *xi,
> > +Object *icp_create(Object *cpu, const char *type, ICSState *ics, XICSFabric *xi,
> >                     Error **errp);
> >  void icp_destroy(ICPState *icp);
> >  
> > 
>
David Gibson Oct. 27, 2019, 4:57 p.m. UTC | #3
On Thu, Oct 24, 2019 at 11:04:53AM +0200, Greg Kurz wrote:
> On Thu, 24 Oct 2019 13:58:41 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Wed, Oct 23, 2019 at 04:52:10PM +0200, Greg Kurz wrote:
> > > Each VCPU is associated to a presenter object within the interrupt
> > > controller, ie. TCTX for XIVE and ICP for XICS, but our current
> > > models put these objects below the VCPU, and we rely on CPU_FOREACH()
> > > to do anything that involves presenters.
> > > 
> > > This recently bit us with the CAM line matching logic in XIVE because
> > > CPU_FOREACH() can race with CPU hotplug and we ended up considering a
> > > VCPU that wasn't associated to a TCTX object yet. Other users of
> > > CPU_FOREACH() are 'info pic' for both XICS and XIVE. It is again very
> > > easy to crash QEMU with concurrent VCPU hotplug/unplug and 'info pic'.
> > > 
> > > Reparent the presenter objects to the corresponding interrupt controller
> > > object, ie. XIVE router or ICS, to make it clear they are not some extra
> > > data hanging from the CPU but internal XIVE or XICS entities. The CPU
> > > object now needs to explicitely take a reference on the presenter to
> > > ensure its pointer remains valid until unrealize time.
> > > 
> > > This will allow to get rid of CPU_FOREACH() and ease further improvements
> > > to the XIVE model.
> > > 
> > > This change doesn't impact section ids and is thus transparent to
> > > migration.
> > > 
> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > 
> > 
> > Urgh.  I see why we want something like this, but reparenting the
> > presenters to the source modules (particularly for XICS) makes me
> > uncomfortable.
> > 
> > AFAICT the association here is *purely* about managing lifetimes, not
> > because those ICPs inherently have something to do with those ICSes.
> > Same with XIVE, although since they'll be on the same chip there's a
> > bit more logic to it.
> > 
> 
> I did it this way for XIVE because they are indeed on the same chip,
> but I agree it is questionable for XICS.
> 
> > For spapr we kinda-sorta treat the (single) ICS or XiveRouter object
> > as the "master" of the interrupt controller.  But that's not really
> 
> Agreed for XICS. On the other side, the XIVE controller chip really has
> a routing sub-engine (IVRE) and a presenter sub-engine (IVPE), and the
> TCTXs do reside in an SRAM within the IVPE. The XiveRouter type might
> be better named XiveController, and each instance to expose a XiveRouter
> and a XivePresenter interface. We have one per chip for PNV and only a
> single one for sPAPR.

Yes, that sounds like a reasonable approach for XIVE.

> 
> > accurate to the hardware, so I don't really want to push that way of
> > looking at it any further than it already is.
> > 
> > If we could make the presenters children of the machine (spapr) or
> > chip (pnv) that might make more sense?
> > 
> 
> It probably makes sense for XICS, not sure this makes things clearer
> for XIVE.
> 
> > I'm also not totally convinced that having the presenter as a child of
> > the cpu is inherently bad.  Yes we have bugs now, but maybe the right
> > fix *is* actually to do the NULL check on every CPU_FOREACH().
> > 
> > For comparison, the lapic on x86 machines is a child of the cpu, and I
> > believe they do have NULL checks on cpu->apic_state in various places
> > they use CPU_FOREACH().
> > 
> 
> I didn't want to go that way because I was finding CPU_FOREACH() to
> be fragile since all users are broken,

Hm, ok.  There aren't that many existing users though, right?

> but I can revisit that. Maybe
> worth consolidating the logic in a PRESENTER_FOREACH() macro in order
> to avoid future breakage with CPU_FOREACH() ?

That sounds worth considering at least, yes.
diff mbox series

Patch

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index b09cc48bcb61..d74ee71e76b4 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -526,6 +526,7 @@  static int spapr_xive_cpu_intc_create(SpaprInterruptController *intc,
         return -1;
     }
 
+    object_ref(obj);
     spapr_cpu->tctx = XIVE_TCTX(obj);
     return 0;
 }
@@ -558,7 +559,10 @@  static void spapr_xive_cpu_intc_reset(SpaprInterruptController *intc,
 static void spapr_xive_cpu_intc_destroy(SpaprInterruptController *intc,
                                         PowerPCCPU *cpu)
 {
-    xive_tctx_destroy(spapr_cpu_state(cpu)->tctx);
+    XiveTCTX *tctx = spapr_cpu_state(cpu)->tctx;
+
+    object_unref(OBJECT(tctx));
+    xive_tctx_destroy(tctx);
 }
 
 static void spapr_xive_set_irq(SpaprInterruptController *intc, int irq, int val)
diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 5f746079be46..d5e4db668a4b 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -380,13 +380,16 @@  static const TypeInfo icp_info = {
     .class_size = sizeof(ICPStateClass),
 };
 
-Object *icp_create(Object *cpu, const char *type, XICSFabric *xi, Error **errp)
+Object *icp_create(Object *cpu, const char *type, ICSState *ics, XICSFabric *xi,
+                   Error **errp)
 {
     Error *local_err = NULL;
+    g_autofree char *name = NULL;
     Object *obj;
 
     obj = object_new(type);
-    object_property_add_child(cpu, type, obj, &error_abort);
+    name = g_strdup_printf("%s[%d]", type, CPU(cpu)->cpu_index);
+    object_property_add_child(OBJECT(ics), name, obj, &error_abort);
     object_unref(obj);
     object_ref(OBJECT(xi));
     object_property_add_const_link(obj, ICP_PROP_XICS, OBJECT(xi),
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 5977d1bdb73f..080ed73aad64 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -337,11 +337,12 @@  static int xics_spapr_cpu_intc_create(SpaprInterruptController *intc,
     Object *obj;
     SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
 
-    obj = icp_create(OBJECT(cpu), TYPE_ICP, ics->xics, errp);
+    obj = icp_create(OBJECT(cpu), TYPE_ICP, ics, ics->xics, errp);
     if (!obj) {
         return -1;
     }
 
+    object_ref(obj);
     spapr_cpu->icp = ICP(obj);
     return 0;
 }
@@ -355,7 +356,10 @@  static void xics_spapr_cpu_intc_reset(SpaprInterruptController *intc,
 static void xics_spapr_cpu_intc_destroy(SpaprInterruptController *intc,
                                         PowerPCCPU *cpu)
 {
-    icp_destroy(spapr_cpu_state(cpu)->icp);
+    ICPState *icp = spapr_cpu_state(cpu)->icp;
+
+    object_unref(OBJECT(icp));
+    icp_destroy(icp);
 }
 
 static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index 952a461d5329..8d2da4a11163 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -677,10 +677,12 @@  static const TypeInfo xive_tctx_info = {
 Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp)
 {
     Error *local_err = NULL;
+    g_autofree char *name = NULL;
     Object *obj;
 
     obj = object_new(TYPE_XIVE_TCTX);
-    object_property_add_child(cpu, TYPE_XIVE_TCTX, obj, &error_abort);
+    name = g_strdup_printf(TYPE_XIVE_TCTX "[%d]", CPU(cpu)->cpu_index);
+    object_property_add_child(OBJECT(xrtr), name, obj, &error_abort);
     object_unref(obj);
     object_ref(cpu);
     object_property_add_const_link(obj, "cpu", cpu, &error_abort);
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index bd17c3536dd5..cbeabf98bff6 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -767,14 +767,16 @@  static void pnv_chip_power8_intc_create(PnvChip *chip, PowerPCCPU *cpu,
     Error *local_err = NULL;
     Object *obj;
     PnvCPUState *pnv_cpu = pnv_cpu_state(cpu);
+    Pnv8Chip *chip8 = PNV8_CHIP(chip);
 
-    obj = icp_create(OBJECT(cpu), TYPE_PNV_ICP, XICS_FABRIC(qdev_get_machine()),
-                     &local_err);
+    obj = icp_create(OBJECT(cpu), TYPE_PNV_ICP, &chip8->psi.ics,
+                     XICS_FABRIC(qdev_get_machine()), &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
     }
 
+    object_ref(obj);
     pnv_cpu->intc = obj;
 }
 
@@ -788,7 +790,10 @@  static void pnv_chip_power8_intc_reset(PnvChip *chip, PowerPCCPU *cpu)
 
 static void pnv_chip_power8_intc_destroy(PnvChip *chip, PowerPCCPU *cpu)
 {
-    icp_destroy(ICP(pnv_cpu_state(cpu)->intc));
+    Object *intc = pnv_cpu_state(cpu)->intc;
+
+    object_unref(intc);
+    icp_destroy(ICP(intc));
 }
 
 /*
@@ -825,6 +830,7 @@  static void pnv_chip_power9_intc_create(PnvChip *chip, PowerPCCPU *cpu,
         return;
     }
 
+    object_ref(obj);
     pnv_cpu->intc = obj;
 }
 
@@ -837,7 +843,10 @@  static void pnv_chip_power9_intc_reset(PnvChip *chip, PowerPCCPU *cpu)
 
 static void pnv_chip_power9_intc_destroy(PnvChip *chip, PowerPCCPU *cpu)
 {
-    xive_tctx_destroy(XIVE_TCTX(pnv_cpu_state(cpu)->intc));
+    Object *intc = pnv_cpu_state(cpu)->intc;
+
+    object_unref(intc);
+    xive_tctx_destroy(XIVE_TCTX(intc));
 }
 
 /*
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 48a75aa4ab75..f4827e748fd8 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -179,7 +179,7 @@  void ics_pic_print_info(ICSState *ics, Monitor *mon);
 void ics_resend(ICSState *ics);
 void icp_resend(ICPState *ss);
 
-Object *icp_create(Object *cpu, const char *type, XICSFabric *xi,
+Object *icp_create(Object *cpu, const char *type, ICSState *ics, XICSFabric *xi,
                    Error **errp);
 void icp_destroy(ICPState *icp);