[for-7.1] hw/mips/malta: turn off x86 specific features of PIIX4_PM

Message ID 20220728115034.1327988-1-imammedo@redhat.com
State New

Commit Message

Igor Mammedov July 28, 2022, 11:50 a.m. UTC
QEMU crashes when trying to save VMSTATE if only MIPS targets are compiled in:
  $ qemu-system-mips -monitor stdio
  (qemu) migrate "exec:gzip -c > STATEFILE.gz"
  Segmentation fault (core dumped)

This happens because PIIX4_PM tries to parse hotplug vmstate structures
that are valid only for x86 and not for MIPS (they require ACPI
tables support, which does not exist for the latter).

The issue was probably exposed by an attempt to clean up/compile out
unused ACPI bits from the MIPS target (while forgetting about the
migration bits).

Disable the compiled-out features using compat properties, as the least
risky way to deal with the issue.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
PS:
Another approach could be to set the defaults to the disabled state and
enable them using compat props on PC machines (which is more
code to deal with => more risky), or to continue the PIIX4_PM
refactoring to split the x86-isms out (which I'm not really
interested in, due to the risk of regressions for not much
benefit).
---
 hw/mips/malta.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Peter Maydell July 28, 2022, 12:29 p.m. UTC | #1
On Thu, 28 Jul 2022 at 12:50, Igor Mammedov <imammedo@redhat.com> wrote:
>
> QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
>   $ qemu-system-mips -monitor stdio
>   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>   Segmentation fault (core dumped)
>
> It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> which are valid only for x86 and not for MIPS (as it requires ACPI
> tables support which is not existent for ithe later)
>
> Issue was probably exposed by trying to cleanup/compile out unused
> ACPI bits from MIPS target (but forgetting about migration bits).
>
> Disable compiled out features using compat properties as the least
> risky way to deal with issue.
>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/995

> ---
> PS:
> another approach could be setting defaults to disabled state and
> enabling them using compat props on PC machines (which is more
> code to deal with => more risky) or continue with PIIX4_PM
> refactoring to split x86-shism out (which I'm not really
> interested in due to risk of regressions for not much of
> benefit)
> ---
>  hw/mips/malta.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/hw/mips/malta.c b/hw/mips/malta.c
> index 7a0ec513b0..0e932988e0 100644
> --- a/hw/mips/malta.c
> +++ b/hw/mips/malta.c
> @@ -1442,6 +1442,14 @@ static const TypeInfo mips_malta_device = {
>      .instance_init = mips_malta_instance_init,
>  };
>
> +GlobalProperty malta_compat[] = {
> +    { "PIIX4_PM", "memory-hotplug-support", "off" },
> +    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
> +    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
> +    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
> +};

Is there an easy way to assert in hw/acpi/piix4.c that if
CONFIG_ACPI_PCIHP was not set then the board has initialized
all these properties to the don't-use-hotplug state ?
That would be a guard against similar bugs (though I suppose
we probably aren't likely to add new piix4 boards...)

> +const size_t malta_compat_len = G_N_ELEMENTS(malta_compat);
> +
>  static void mips_malta_machine_init(MachineClass *mc)
>  {
>      mc->desc = "MIPS Malta Core LV";
> @@ -1455,6 +1463,7 @@ static void mips_malta_machine_init(MachineClass *mc)
>      mc->default_cpu_type = MIPS_CPU_TYPE_NAME("24Kf");
>  #endif
>      mc->default_ram_id = "mips_malta.ram";
> +    compat_props_add(mc->compat_props, malta_compat, malta_compat_len);
>  }
>
>  DEFINE_MACHINE("malta", mips_malta_machine_init)
> --
> 2.31.1

thanks
-- PMM
Igor Mammedov July 28, 2022, 1:16 p.m. UTC | #2
On Thu, 28 Jul 2022 13:29:07 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Thu, 28 Jul 2022 at 12:50, Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> >   $ qemu-system-mips -monitor stdio
> >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> >   Segmentation fault (core dumped)
> >
> > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > which are valid only for x86 and not for MIPS (as it requires ACPI
> > tables support which is not existent for ithe later)
> >
> > Issue was probably exposed by trying to cleanup/compile out unused
> > ACPI bits from MIPS target (but forgetting about migration bits).
> >
> > Disable compiled out features using compat properties as the least
> > risky way to deal with issue.
> >
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>  
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/995
> 
> > ---
> > PS:
> > another approach could be setting defaults to disabled state and
> > enabling them using compat props on PC machines (which is more
> > code to deal with => more risky) or continue with PIIX4_PM
> > refactoring to split x86-shism out (which I'm not really
> > interested in due to risk of regressions for not much of
> > benefit)
> > ---
> >  hw/mips/malta.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/hw/mips/malta.c b/hw/mips/malta.c
> > index 7a0ec513b0..0e932988e0 100644
> > --- a/hw/mips/malta.c
> > +++ b/hw/mips/malta.c
> > @@ -1442,6 +1442,14 @@ static const TypeInfo mips_malta_device = {
> >      .instance_init = mips_malta_instance_init,
> >  };
> >
> > +GlobalProperty malta_compat[] = {
> > +    { "PIIX4_PM", "memory-hotplug-support", "off" },
> > +    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
> > +    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
> > +    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
> > +};  
> 
> Is there an easy way to assert in hw/acpi/piix4.c that if
> CONFIG_ACPI_PCIHP was not set then the board has initialized
> all these properties to the don't-use-hotplug state ?
> That would be a guard against similar bugs (though I suppose
> we probably aren't likely to add new piix4 boards...)

Unfortunately, new features still creep into the 'pc' machine
(e.g. "acpi-root-pci-hotplug"), and I don't see an easy
way to check that at build time or to enforce it in the future.

A far-from-easy option would be to split piix4_pm into base/enhanced
classes, so that we wouldn't need x86-specific hacks in the 'base'
variant (assuming 'enhanced' could maintain the current
VMSTATE to keep cross-version migration working).

Dr. David Alan Gilbert July 28, 2022, 2:44 p.m. UTC | #3
* Igor Mammedov (imammedo@redhat.com) wrote:
> QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
>   $ qemu-system-mips -monitor stdio
>   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>   Segmentation fault (core dumped)
> 
> It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> which are valid only for x86 and not for MIPS (as it requires ACPI
> tables support which is not existent for ithe later)
> 
> Issue was probably exposed by trying to cleanup/compile out unused
> ACPI bits from MIPS target (but forgetting about migration bits).
> 
> Disable compiled out features using compat properties as the least
> risky way to deal with issue.

Isn't the problem partially due to a 'stub' vmsd which isn't terminated?

Dave

Igor Mammedov July 28, 2022, 2:54 p.m. UTC | #4
On Thu, 28 Jul 2022 15:44:20 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> >   $ qemu-system-mips -monitor stdio
> >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> >   Segmentation fault (core dumped)
> > 
> > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > which are valid only for x86 and not for MIPS (as it requires ACPI
> > tables support which is not existent for ithe later)
> > 
> > Issue was probably exposed by trying to cleanup/compile out unused
> > ACPI bits from MIPS target (but forgetting about migration bits).
> > 
> > Disable compiled out features using compat properties as the least
> > risky way to deal with issue.  
> 
> Isn't the problem partially due to a 'stub' vmsd which isn't terminated?

Not sure what "'stub' vmsd" is, can you explain?

Peter Maydell July 28, 2022, 3:04 p.m. UTC | #5
On Thu, 28 Jul 2022 at 15:44, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
>
> * Igor Mammedov (imammedo@redhat.com) wrote:
> > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> >   $ qemu-system-mips -monitor stdio
> >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> >   Segmentation fault (core dumped)
> >
> > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > which are valid only for x86 and not for MIPS (as it requires ACPI
> > tables support which is not existent for ithe later)
> >
> > Issue was probably exposed by trying to cleanup/compile out unused
> > ACPI bits from MIPS target (but forgetting about migration bits).
> >
> > Disable compiled out features using compat properties as the least
> > risky way to deal with issue.
>
> Isn't the problem partially due to a 'stub' vmsd which isn't terminated?

Yes, but setting these properties causes that vmsd
(vmstate_acpi_pcihp_pci_status) to not be used:

 * it is used only in VMSTATE_PCI_HOTPLUG()
 * that macro is used only in hw/acpi/ich9.c (not relevant here) and
   hw/acpi/piix4.c
 * in piix4.c it is invoked passing it the test functions
   vmstate_test_use_acpi_hotplug_bridge and
   vmstate_test_migrate_acpi_index
 * setting the properties on the device as this patch does
   causes those test functions to return false, so the
   vmstate_acpi_pcihp_pci_status is never examined

-- PMM
Dr. David Alan Gilbert July 28, 2022, 3:09 p.m. UTC | #6
* Igor Mammedov (imammedo@redhat.com) wrote:
> On Thu, 28 Jul 2022 15:44:20 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> > >   $ qemu-system-mips -monitor stdio
> > >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> > >   Segmentation fault (core dumped)
> > > 
> > > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > > which are valid only for x86 and not for MIPS (as it requires ACPI
> > > tables support which is not existent for ithe later)
> > > 
> > > Issue was probably exposed by trying to cleanup/compile out unused
> > > ACPI bits from MIPS target (but forgetting about migration bits).
> > > 
> > > Disable compiled out features using compat properties as the least
> > > risky way to deal with issue.  
> > 
> > Isn't the problem partially due to a 'stub' vmsd which isn't terminated?
> 
> Not sure what "'stub' vmsd" is, can you explain?

In hw/acpi/acpi-pci-hotplug-stub.c there is:
const VMStateDescription vmstate_acpi_pcihp_pci_status;

The segfault happens when the migration code walks into that. It should
always be populated with at least the minimal fields, in particular
.name and a .fields array terminated with VMSTATE_END_OF_LIST().
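
To illustrate the point about termination, here is a minimal sketch in C.
It assumes simplified stand-in types: DemoField, DemoVMSD and the demo_*
names are hypothetical and are not QEMU's real definitions. A stub whose
members are all zero has a NULL field list, so a walker that dereferences
it unconditionally would presumably fault, while a stub with an empty,
terminated list can be walked safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not QEMU's real definitions. */
typedef struct DemoField {
    const char *name;               /* a NULL name marks the end of the list */
} DemoField;

typedef struct DemoVMSD {
    const char *name;
    const DemoField *fields;
} DemoVMSD;

/*
 * Equivalent to a link-only stub declared without an initializer:
 * C zero-initializes it, so both members are NULL.
 */
static const DemoVMSD demo_stub_unterminated = { NULL, NULL };

/* A stub with a name and an empty but properly terminated field list. */
static const DemoField demo_empty_fields[] = {
    { NULL },                       /* plays the role of an end-of-list marker */
};

static const DemoVMSD demo_stub_terminated = {
    "demo_stub", demo_empty_fields
};

/*
 * Count the fields of a description. Returns -1 when there is no field
 * list at all; a walker without this check would dereference NULL.
 */
static int demo_count_fields(const DemoVMSD *vmsd)
{
    int n = 0;

    assert(vmsd != NULL);
    if (vmsd->fields == NULL) {
        return -1;
    }
    for (const DemoField *f = vmsd->fields; f->name != NULL; f++) {
        n++;
    }
    return n;
}
```

In this sketch demo_count_fields() reports -1 for the unterminated stub
and 0 for the terminated one. Whether a terminated stub is actually
desirable is a separate question.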

Dave

Peter Maydell July 28, 2022, 3:12 p.m. UTC | #7
On Thu, 28 Jul 2022 at 16:09, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
>
> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Thu, 28 Jul 2022 15:44:20 +0100
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >
> > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> > > >   $ qemu-system-mips -monitor stdio
> > > >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> > > >   Segmentation fault (core dumped)
> > > >
> > > > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > > > which are valid only for x86 and not for MIPS (as it requires ACPI
> > > > tables support which is not existent for ithe later)
> > > >
> > > > Issue was probably exposed by trying to cleanup/compile out unused
> > > > ACPI bits from MIPS target (but forgetting about migration bits).
> > > >
> > > > Disable compiled out features using compat properties as the least
> > > > risky way to deal with issue.
> > >
> > > Isn't the problem partially due to a 'stub' vmsd which isn't terminated?
> >
> > Not sure what "'stub' vmsd" is, can you explain?
>
> In hw/acpi/acpi-pci-hotplug-stub.c there is :
> const VMStateDescription vmstate_acpi_pcihp_pci_status;
>
> this seg happens when the migration code walks into that - this should
> always get populated with some of the minimal fields, in particular the
> .name and .fields array terminated with VMSTATE_END_OF_LIST().

Either:
 (1) we should be sure the vmstate struct does not get used if the
     compile-time config has ended up with the stub
or
 (2) it needs to actually match the real vmstate struct, otherwise
     migration between a QEMU built with a config that just got the
     stub version and a QEMU built with a config that got the full
     version will break

This patch does the former. Segfaulting if we got something wrong
and tried to use the vmstate when we weren't expecting to is
arguably better than producing an incompatible migration stream.
(Better still would be if we caught this on machine startup rather
than only when savevm was invoked.)
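
As a rough sketch of what catching this at startup could look like: the
check below is self-contained C with hypothetical names (DemoPMConfig,
demo_check_config and the have_hotplug_code flag are assumptions made
for illustration, not existing QEMU code). It only shows the shape of
the check: when the hotplug code is compiled out, every related property
must be in its don't-use-hotplug state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the four properties set by this patch. */
typedef struct DemoPMConfig {
    bool memory_hotplug_support;
    bool pci_hotplug_with_bridge_support;
    bool root_pci_hotplug;
    bool not_migrate_acpi_index;
} DemoPMConfig;

/*
 * Returns NULL when the configuration is consistent, otherwise the name
 * of the first property that would need code that was compiled out.
 */
static const char *demo_check_config(const DemoPMConfig *cfg,
                                     bool have_hotplug_code)
{
    assert(cfg != NULL);
    if (have_hotplug_code) {
        return NULL;                /* full build: any setting is fine */
    }
    if (cfg->memory_hotplug_support) {
        return "memory-hotplug-support";
    }
    if (cfg->pci_hotplug_with_bridge_support) {
        return "acpi-pci-hotplug-with-bridge-support";
    }
    if (cfg->root_pci_hotplug) {
        return "acpi-root-pci-hotplug";
    }
    if (!cfg->not_migrate_acpi_index) {
        return "x-not-migrate-acpi-index";
    }
    return NULL;
}
```

A device's realize step could call such a check and report the returned
property name as an error, so a misconfigured board would fail at
startup instead of at savevm time.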

thanks
-- PMM
Ani Sinha July 28, 2022, 6:48 p.m. UTC | #8
On Thu, 28 Jul 2022, Peter Maydell wrote:

> On Thu, 28 Jul 2022 at 15:44, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:

> > Isn't the problem partially due to a 'stub' vmsd which isn't terminated?
>
> Yes, but setting these properties causes that vmsd
> (vmstate_acpi_pcihp_pci_status) to not be used:
>
>  * it is used only in VMSTATE_PCI_HOTPLUG()
>  * that macro is used only in hw/acpi/ich9.c (not relevant here) and
>    hw/acpi/piix4.c
>  * in piix4.c it is invoked passing it the test functions
>    vmstate_test_use_acpi_hotplug_bridge and
>    vmstate_test_migrate_acpi_index
>  * setting the properties on the device as this patch does
>    causes those test functions to return false, so the
>    vmstate_acpi_pcihp_pci_status is never examined

I believe this happens in vmstate_save_state_v(), in this condition
check:

  while (field->name) {
        if ((field->field_exists &&
             field->field_exists(opaque, version_id)) ||
            (!field->field_exists &&
             field->version_id <= version_id)) {
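
A minimal, self-contained sketch of that condition, using simplified
stand-in types (DemoField, DemoPMState and the demo_* functions are
hypothetical and are not QEMU's real definitions). It shows how a field
guarded by a test function is skipped once the corresponding device
property is turned off:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not QEMU's real definitions. */
typedef struct DemoField {
    const char *name;                                   /* NULL ends the list */
    bool (*field_exists)(void *opaque, int version_id); /* optional gate */
    int version_id;
} DemoField;

typedef struct DemoPMState {
    bool use_acpi_hotplug_bridge;   /* mirrors a device property */
} DemoPMState;

/* Test function: the guarded field exists only if the property is on. */
static bool demo_test_use_hotplug_bridge(void *opaque, int version_id)
{
    (void)version_id;
    return ((DemoPMState *)opaque)->use_acpi_hotplug_bridge;
}

static const DemoField demo_fields[] = {
    { "pm_regs", NULL, 1 },                             /* always saved */
    { "pcihp_status", demo_test_use_hotplug_bridge, 1 },/* gated */
    { NULL, NULL, 0 },
};

/* Count the fields that would be saved, using the same condition. */
static int demo_count_saved_fields(const DemoField *field, void *opaque,
                                   int version_id)
{
    int saved = 0;

    assert(field != NULL);
    while (field->name) {
        if ((field->field_exists &&
             field->field_exists(opaque, version_id)) ||
            (!field->field_exists &&
             field->version_id <= version_id)) {
            saved++;
        }
        field++;
    }
    return saved;
}
```

With the property on, both fields are visited; with it off, the gated
field is skipped, so whatever it points at is never examined.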
Ani Sinha July 28, 2022, 6:50 p.m. UTC | #9
On Thu, 28 Jul 2022, Peter Maydell wrote:

> On Thu, 28 Jul 2022 at 12:50, Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> >   $ qemu-system-mips -monitor stdio
> >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> >   Segmentation fault (core dumped)
> >
> > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > which are valid only for x86 and not for MIPS (as it requires ACPI
> > tables support which is not existent for ithe later)
> >
> > Issue was probably exposed by trying to cleanup/compile out unused
> > ACPI bits from MIPS target (but forgetting about migration bits).
> >
> > Disable compiled out features using compat properties as the least
> > risky way to deal with issue.
> >
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/995


Reviewed-by: Ani Sinha <ani@anisinha.ca>

Igor Mammedov July 29, 2022, 8:09 a.m. UTC | #10
On Thu, 28 Jul 2022 16:04:58 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Thu, 28 Jul 2022 at 15:44, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> >
> > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> > >   $ qemu-system-mips -monitor stdio
> > >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> > >   Segmentation fault (core dumped)
> > >
> > > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > > which are valid only for x86 and not for MIPS (as it requires ACPI
> > > tables support which is not existent for ithe later)
> > >
> > > Issue was probably exposed by trying to cleanup/compile out unused
> > > ACPI bits from MIPS target (but forgetting about migration bits).
> > >
> > > Disable compiled out features using compat properties as the least
> > > risky way to deal with issue.  
> >
> > Isn't the problem partially due to a 'stub' vmsd which isn't terminated?  
> 
> Yes, but setting these properties causes that vmsd
> (vmstate_acpi_pcihp_pci_status) to not be used:
> 
>  * it is used only in VMSTATE_PCI_HOTPLUG()
>  * that macro is used only in hw/acpi/ich9.c (not relevant here) and
>    hw/acpi/piix4.c
>  * in piix4.c it is invoked passing it the test functions
>    vmstate_test_use_acpi_hotplug_bridge and
>    vmstate_test_migrate_acpi_index
>  * setting the properties on the device as this patch does
>    causes those test functions to return false, so the
>    vmstate_acpi_pcihp_pci_status is never examined

It's not limited to VMSTATE_PCI_HOTPLUG; it also applies to memory hotplug
and other x86-specific knobs that may cause a crash.
(I ignored the cpu hotplug one for now since it doesn't cause a crash.)

> 
> -- PMM
>
Igor Mammedov July 29, 2022, 9:57 a.m. UTC | #11
On Thu, 28 Jul 2022 16:12:34 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Thu, 28 Jul 2022 at 16:09, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> >
> > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > On Thu, 28 Jul 2022 15:44:20 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >  
> > > > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > > > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> > > > >   $ qemu-system-mips -monitor stdio
> > > > >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> > > > >   Segmentation fault (core dumped)
> > > > >
> > > > > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > > > > which are valid only for x86 and not for MIPS (as it requires ACPI
> > > > > tables support which is not existent for ithe later)
> > > > >
> > > > > Issue was probably exposed by trying to cleanup/compile out unused
> > > > > ACPI bits from MIPS target (but forgetting about migration bits).
> > > > >
> > > > > Disable compiled out features using compat properties as the least
> > > > > risky way to deal with issue.  
> > > >
> > > > Isn't the problem partially due to a 'stub' vmsd which isn't terminated?  
> > >
> > > Not sure what "'stub' vmsd" is, can you explain?  
> >
> > In hw/acpi/acpi-pci-hotplug-stub.c there is :
> > const VMStateDescription vmstate_acpi_pcihp_pci_status;
I think that one is there only for linking purposes and not meant
to be actually used.

> > this seg happens when the migration code walks into that - this should
> > always get populated with some of the minimal fields, in particular the
> > .name and .fields array terminated with VMSTATE_END_OF_LIST().  
> 
> Either:
>  (1) we should be sure the vmstate struct does not get used if the
>      compile-time config has ended up with the stub
> or

>  (2) it needs to actually match the real vmstate struct, otherwise
>      migration between a QEMU built with a config that just got the
>      stub version and a QEMU built with a config that got the full
>      version will break
>
> This patch does the former. Segfaulting if we got something wrong
> and tried to use the vmstate when we weren't expecting to is
> arguably better than producing an incompatible migration stream.

> (Better still would be if we caught this on machine startup rather
> than only when savevm was invoked.)
Theoretically possible with a bunch of mips and x86 stubs, but ...
we typically don't do this kind of check for migration's sake,
as it complicates things a lot in general.
I.e. it's common to let migration fail in case of an incompatible
migration stream. It's not exactly friendly to the user, but it's
a graceful failure (assuming the code is correct and does not crash QEMU).
 
> thanks
> -- PMM
>
Peter Maydell July 29, 2022, 10:17 a.m. UTC | #12
On Fri, 29 Jul 2022 at 10:57, Igor Mammedov <imammedo@redhat.com> wrote:
>
> On Thu, 28 Jul 2022 16:12:34 +0100
> Peter Maydell <peter.maydell@linaro.org> wrote:
>
> > On Thu, 28 Jul 2022 at 16:09, Dr. David Alan Gilbert
> > <dgilbert@redhat.com> wrote:
> > >
> > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > On Thu, 28 Jul 2022 15:44:20 +0100
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >
> > > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> > > > > >   $ qemu-system-mips -monitor stdio
> > > > > >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> > > > > >   Segmentation fault (core dumped)
> > > > > >
> > > > > > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > > > > > which are valid only for x86 and not for MIPS (as it requires ACPI
> > > > > > tables support which is not existent for ithe later)
> > > > > >
> > > > > > Issue was probably exposed by trying to cleanup/compile out unused
> > > > > > ACPI bits from MIPS target (but forgetting about migration bits).
> > > > > >
> > > > > > Disable compiled out features using compat properties as the least
> > > > > > risky way to deal with issue.
> > > > >
> > > > > Isn't the problem partially due to a 'stub' vmsd which isn't terminated?
> > > >
> > > > Not sure what "'stub' vmsd" is, can you explain?
> > >
> > > In hw/acpi/acpi-pci-hotplug-stub.c there is :
> > > const VMStateDescription vmstate_acpi_pcihp_pci_status;
> I think that one is there only for linking purposes and not meant
> to be actually used.

Yes, exactly. The problem is that without this patch which
sets various properties it *does* get used...

> > > this seg happens when the migration code walks into that - this should
> > > always get populated with some of the minimal fields, in particular the
> > > .name and .fields array terminated with VMSTATE_END_OF_LIST().
> >
> > Either:
> >  (1) we should be sure the vmstate struct does not get used if the
> >      compile-time config has ended up with the stub
> > or
>
> >  (2) it needs to actually match the real vmstate struct, otherwise
> >      migration between a QEMU built with a config that just got the
> >      stub version and a QEMU built with a config that got the full
> >      version will break
> >
> > This patch does the former. Segfaulting if we got something wrong
> > and tried to use the vmstate when we weren't expecting to is
> > arguably better than producing an incompatible migration stream.
>
> > (Better still would be if we caught this on machine startup rather
> > than only when savevm was invoked.)
> Theoretically possible with a bunch of mips and x86 stubs, but ...
> we typically don't do this kind of checks for migration sake
> as that complicates things a lot in general.
> i.e. it's common to let migration fail in case of incompatible
> migration stream. It's not exactly friendly to user but it's
> graceful failure (assuming code is correct and not crashes QEMU)

The point here is that if we ever try to do a migrate with the
stub vmstate struct then that's a bug in QEMU. We should prefer
to catch those early and clearly.

-- PMM
Dr. David Alan Gilbert Aug. 1, 2022, 9:17 a.m. UTC | #13
* Peter Maydell (peter.maydell@linaro.org) wrote:
> On Fri, 29 Jul 2022 at 10:57, Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > On Thu, 28 Jul 2022 16:12:34 +0100
> > Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > > On Thu, 28 Jul 2022 at 16:09, Dr. David Alan Gilbert
> > > <dgilbert@redhat.com> wrote:
> > > >
> > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > On Thu, 28 Jul 2022 15:44:20 +0100
> > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > >
> > > > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > > > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> > > > > > >   $ qemu-system-mips -monitor stdio
> > > > > > >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> > > > > > >   Segmentation fault (core dumped)
> > > > > > >
> > > > > > > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > > > > > > which are valid only for x86 and not for MIPS (as it requires ACPI
> > > > > > > tables support which is not existent for ithe later)
> > > > > > >
> > > > > > > Issue was probably exposed by trying to cleanup/compile out unused
> > > > > > > ACPI bits from MIPS target (but forgetting about migration bits).
> > > > > > >
> > > > > > > Disable compiled out features using compat properties as the least
> > > > > > > risky way to deal with issue.
> > > > > >
> > > > > > Isn't the problem partially due to a 'stub' vmsd which isn't terminated?
> > > > >
> > > > > Not sure what "'stub' vmsd" is, can you explain?
> > > >
> > > > In hw/acpi/acpi-pci-hotplug-stub.c there is :
> > > > const VMStateDescription vmstate_acpi_pcihp_pci_status;
> > I think that one is there only for linking purposes and not meant
> > to be actually used.
> 
> Yes, exactly. The problem is that without this patch which
> sets various properties it *does* get used...
> 
> > > > this seg happens when the migration code walks into that - this should
> > > > always get populated with some of the minimal fields, in particular the
> > > > .name and .fields array terminated with VMSTATE_END_OF_LIST().
> > >
> > > Either:
> > >  (1) we should be sure the vmstate struct does not get used if the
> > >      compile-time config has ended up with the stub
> > > or
> >
> > >  (2) it needs to actually match the real vmstate struct, otherwise
> > >      migration between a QEMU built with a config that just got the
> > >      stub version and a QEMU built with a config that got the full
> > >      version will break
> > >
> > > This patch does the former. Segfaulting if we got something wrong
> > > and tried to use the vmstate when we weren't expecting to is
> > > arguably better than producing an incompatible migration stream.
> >
> > > (Better still would be if we caught this on machine startup rather
> > > than only when savevm was invoked.)
> > Theoretically possible with a bunch of mips and x86 stubs, but ...
> > we typically don't do this kind of checks for migration sake
> > as that complicates things a lot in general.
> > i.e. it's common to let migration fail in case of incompatible
> > migration stream. It's not exactly friendly to user but it's
> > graceful failure (assuming code is correct and not crashes QEMU)
> 
> The point here is that if we ever try to do a migrate with the
> stub vmstate struct then that's a bug in QEMU. We should prefer
> to catch those early and clearly.

I'd rather have something that was explicitly poisoned rather than just
walking off the end of an uninitialised array and having to break out
gdb.

Dave

> -- PMM
>
Peter Maydell Aug. 1, 2022, 9:43 a.m. UTC | #14
On Mon, 1 Aug 2022 at 10:17, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>
> * Peter Maydell (peter.maydell@linaro.org) wrote:
> > On Fri, 29 Jul 2022 at 10:57, Igor Mammedov <imammedo@redhat.com> wrote:
> > >
> > > On Thu, 28 Jul 2022 16:12:34 +0100
> > > Peter Maydell <peter.maydell@linaro.org> wrote:
> > > > Either:
> > > >  (1) we should be sure the vmstate struct does not get used if the
> > > >      compile-time config has ended up with the stub
> > > > or
> > >
> > > >  (2) it needs to actually match the real vmstate struct, otherwise
> > > >      migration between a QEMU built with a config that just got the
> > > >      stub version and a QEMU built with a config that got the full
> > > >      version will break
> > > >
> > > > This patch does the former. Segfaulting if we got something wrong
> > > > and tried to use the vmstate when we weren't expecting to is
> > > > arguably better than producing an incompatible migration stream.
> > >
> > > > (Better still would be if we caught this on machine startup rather
> > > > than only when savevm was invoked.)
> > > Theoretically possible with a bunch of mips and x86 stubs, but ...
> > > we typically don't do this kind of checks for migration sake
> > > as that complicates things a lot in general.
> > > i.e. it's common to let migration fail in case of incompatible
> > > migration stream. It's not exactly friendly to user but it's
> > > graceful failure (assuming code is correct and not crashes QEMU)
> >
> > The point here is that if we ever try to do a migrate with the
> > stub vmstate struct then that's a bug in QEMU. We should prefer
> > to catch those early and clearly.
>
> I'd rather have something that was explicitly poisoned rather than just
> walking off the end of an uninitialised array and having to break out
> gdb.

It doesn't walk off the end of the array -- it segfaults because
it wants to dereference vmsd->name, which is NULL.

If we want to have a more obvious and concrete way to mark "this
vmsd is bad and should never be actively used" that's fine, but it
seems like a separate patch from this one, which is just fixing
the problem that we use a vmsd that we should not be using.

-- PMM
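
A minimal sketch of the two failure modes discussed above, assuming a cut-down stand-in for the vmstate description. Every name here (SketchVMSD, the poisoned flag, sketch_vmsd_usable) is hypothetical and is not QEMU API; the point is only that a zero-initialised stub has a NULL name, and that an explicit marker could be checked before the description is walked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for a vmstate description. */
typedef struct SketchVMSD {
    const char *name;   /* NULL in a zero-initialised stub */
    bool poisoned;      /* hypothetical "never use this" marker */
} SketchVMSD;

/* Models a stub that exists only so that the link succeeds:
 * all members are zero, so name is NULL. */
static const SketchVMSD sketch_stub_vmsd = { 0 };

/* Models a stub that is explicitly marked as unusable. */
static const SketchVMSD sketch_poisoned_vmsd = {
    .name = "stub/pcihp",
    .poisoned = true,
};

/* Return true only if it is safe to walk this description.
 * Without such a check, the first strlen(vmsd->name) on the
 * zero-initialised stub would dereference NULL. */
static bool sketch_vmsd_usable(const SketchVMSD *vmsd)
{
    if (vmsd == NULL || vmsd->poisoned) {
        return false;
    }
    if (vmsd->name == NULL) {
        return false;
    }
    return strlen(vmsd->name) > 0;
}
```

A caller could assert sketch_vmsd_usable() at device realize time rather than at savevm time, which would be one way to catch the bug on machine startup.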
Bernhard Beschow Aug. 3, 2022, 5:26 p.m. UTC | #15
On Tue, Aug 2, 2022 at 8:37 AM Philippe Mathieu-Daudé via <qemu-devel@nongnu.org> wrote:

> On 28/7/22 15:16, Igor Mammedov wrote:
> > On Thu, 28 Jul 2022 13:29:07 +0100
> > Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> >> On Thu, 28 Jul 2022 at 12:50, Igor Mammedov <imammedo@redhat.com>
> wrote:
> >>>
> >>> QEMU crashes trying to save VMSTATE when only MIPS target are compiled
> in
> >>>    $ qemu-system-mips -monitor stdio
> >>>    (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> >>>    Segmentation fault (core dumped)
> >>>
> >>> It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> >>> which are valid only for x86 and not for MIPS (as it requires ACPI
> >>> tables support which is not existent for ithe later)
>
> We already discussed this Frankenstein PIIX4 problem 2 and 4 years ago:
>
> https://lore.kernel.org/qemu-devel/4d42697e-ba84-e5af-3a17-a2cc52cf0dbc@redhat.com/
>
> https://lore.kernel.org/qemu-devel/20190304210359-mutt-send-email-mst@kernel.org/


Interesting reads!


> >>> Issue was probably exposed by trying to cleanup/compile out unused
> >>> ACPI bits from MIPS target (but forgetting about migration bits).
> >>>
> >>> Disable compiled out features using compat properties as the least
> >>> risky way to deal with issue.
>
> So now MIPS is forced to use meaningless compat[] to satisfy X86.
>
> Am I wrong seeing this as a dirty hack creeping in, yet another
> technical debt that will hit (me...) back in a close future?
>
> Are we sure there are no better solution (probably more time consuming
> and involving refactors) we could do instead?
>

Working on the consolidation of piix3 and -4 southbridges [1] I've stumbled
over certain design decisions where board/platform specific assumptions are
baked into the piix device models. I figure that's the core of the issue.

In our case the ACPI functionality is implemented by inheritance while
perhaps it should be implemented using composition. With composition, the
ACPI functionality could be injected by the caller: The pc board would
inject it while the Malta board wouldn't. This would solve both the crash
and above design problem.
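
To illustrate the idea, here is a rough sketch of composition, assuming the hotplug functionality lives in a separate component that the board passes in. All names (SketchHotplug, SketchPM, sketch_pm_create) are made up for illustration and do not exist in QEMU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical optional component carrying the x86/ACPI-only state. */
typedef struct SketchHotplug {
    bool pci_hotplug;
    bool memory_hotplug;
} SketchHotplug;

/* Hypothetical power-management device: the hotplug part is
 * injected by the board instead of being built into the device. */
typedef struct SketchPM {
    const char *board;
    const SketchHotplug *hotplug;   /* NULL when the board injects none */
} SketchPM;

/* The board decides what to inject: a pc-like board passes a
 * component, a Malta-like board passes NULL. */
static SketchPM sketch_pm_create(const char *board,
                                 const SketchHotplug *hotplug)
{
    SketchPM pm = { .board = board, .hotplug = hotplug };
    return pm;
}

/* Hotplug state is only part of the migration stream if the
 * component was injected at all. */
static bool sketch_pm_migrates_hotplug(const SketchPM *pm)
{
    return pm->hotplug != NULL;
}
```

With this shape the device never refers to hotplug state unless a board handed it in, so no per-board property overrides would be needed to switch it off.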

I'd be willing to implement it but can't make any promises about the time
frame since I'm currently doing this in my free time. Any hints regarding
the implementation would be welcome, though.

Best regards,
Bernhard

[1] https://github.com/shentok/qemu/commits/piix-consolidate


> Thanks,
>
> Phil.
>
> >>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> >>
> >> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/995
> >>
> >>> ---
> >>> PS:
> >>> another approach could be setting defaults to disabled state and
> >>> enabling them using compat props on PC machines (which is more
> >>> code to deal with => more risky) or continue with PIIX4_PM
> >>> refactoring to split x86-shism out (which I'm not really
> >>> interested in due to risk of regressions for not much of
> >>> benefit)
> >>> ---
> >>>   hw/mips/malta.c | 9 +++++++++
> >>>   1 file changed, 9 insertions(+)
> >>>
> >>> diff --git a/hw/mips/malta.c b/hw/mips/malta.c
> >>> index 7a0ec513b0..0e932988e0 100644
> >>> --- a/hw/mips/malta.c
> >>> +++ b/hw/mips/malta.c
> >>> @@ -1442,6 +1442,14 @@ static const TypeInfo mips_malta_device = {
> >>>       .instance_init = mips_malta_instance_init,
> >>>   };
> >>>
> >>> +GlobalProperty malta_compat[] = {
> >>> +    { "PIIX4_PM", "memory-hotplug-support", "off" },
> >>> +    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
> >>> +    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
> >>> +    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
> >>> +};
> >>
> >> Is there an easy way to assert in hw/acpi/piix4.c that if
> >> CONFIG_ACPI_PCIHP was not set then the board has initialized
> >> all these properties to the don't-use-hotplug state ?
> >> That would be a guard against similar bugs (though I suppose
> >> we probably aren't likely to add new piix4 boards...)
> >
> > unfortunately new features still creep in 'pc' machine
> > ex: "acpi-root-pci-hotplug"), and I don't see an easy
> > way to compile that nor enforce that in the future.
> >
> > Far from easy would be split piix4_pm on base/enhanced
> > classes so we wouldn't need x86 specific hacks in 'base'
> > variant (assuming 'enhanced' could maintain the current
> > VMSTATE to keep cross-version migration working).
> >
> >>> +const size_t malta_compat_len = G_N_ELEMENTS(malta_compat);
> >>> +
> >>>   static void mips_malta_machine_init(MachineClass *mc)
> >>>   {
> >>>       mc->desc = "MIPS Malta Core LV";
> >>> @@ -1455,6 +1463,7 @@ static void mips_malta_machine_init(MachineClass
> *mc)
> >>>       mc->default_cpu_type = MIPS_CPU_TYPE_NAME("24Kf");
> >>>   #endif
> >>>       mc->default_ram_id = "mips_malta.ram";
> >>> +    compat_props_add(mc->compat_props, malta_compat,
> malta_compat_len);
> >>>   }
> >>>
> >>>   DEFINE_MACHINE("malta", mips_malta_machine_init)
> >>> --
> >>> 2.31.1
> >>
> >> thanks
> >> -- PMM
> >>
> >
>
>
>
Peter Maydell Aug. 3, 2022, 6 p.m. UTC | #16
On Wed, 3 Aug 2022 at 18:26, Bernhard Beschow <shentey@gmail.com> wrote:
>
> On Tue, Aug 2, 2022 at 8:37 AM Philippe Mathieu-Daudé via <qemu-devel@nongnu.org> wrote:
>>
>> On 28/7/22 15:16, Igor Mammedov wrote:
>> > On Thu, 28 Jul 2022 13:29:07 +0100
>> > Peter Maydell <peter.maydell@linaro.org> wrote:
>> >
>> >> On Thu, 28 Jul 2022 at 12:50, Igor Mammedov <imammedo@redhat.com> wrote:
>> >>> Disable compiled out features using compat properties as the least
>> >>> risky way to deal with issue.
>>
>> So now MIPS is forced to use meaningless compat[] to satisfy X86.
>>
>> Am I wrong seeing this as a dirty hack creeping in, yet another
>> technical debt that will hit (me...) back in a close future?
>>
>> Are we sure there are no better solution (probably more time consuming
>> and involving refactors) we could do instead?
>
>
> Working on the consolidation of piix3 and -4 soutbridges [1] I've stumbled over certain design decisions where board/platform specific assumptions are baked into the piix device models. I figure that's the core of the issue.
>
> In our case the ACPI functionality is implemented by inheritance while perhaps it should be implemented using composition. With composition, the ACPI functionality could be injected by the caller: The pc board would inject it while the Malta board wouldn't. This would solve both the crash and above design problem.
>
> I'd be willing to implement it but can't make any promises about the time frame since I'm currently doing this in my free time. Any hints regarding the implementation would be welcome, though.


For the 7.1 release (coming up real soon now) can we get consensus
on this patch from Igor as the least risky way to at least fix
the segfault? We can look at better approaches for 7.2.

thanks
-- PMM
Michael S. Tsirkin Aug. 3, 2022, 10 p.m. UTC | #17
On Thu, Jul 28, 2022 at 07:50:34AM -0400, Igor Mammedov wrote:
> QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
>   $ qemu-system-mips -monitor stdio
>   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>   Segmentation fault (core dumped)
> 
> It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> which are valid only for x86 and not for MIPS (as it requires ACPI
> tables support which is not existent for ithe later)
> 
> Issue was probably exposed by trying to cleanup/compile out unused
> ACPI bits from MIPS target (but forgetting about migration bits).
> 
> Disable compiled out features using compat properties as the least
> risky way to deal with issue.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>


For 7.1 this seems like the lesser evil.

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
> PS:
> another approach could be setting defaults to disabled state and
> enabling them using compat props on PC machines (which is more
> code to deal with => more risky) or continue with PIIX4_PM
> refactoring to split x86-shism out (which I'm not really
> interested in due to risk of regressions for not much of
> benefit)
> ---
>  hw/mips/malta.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/hw/mips/malta.c b/hw/mips/malta.c
> index 7a0ec513b0..0e932988e0 100644
> --- a/hw/mips/malta.c
> +++ b/hw/mips/malta.c
> @@ -1442,6 +1442,14 @@ static const TypeInfo mips_malta_device = {
>      .instance_init = mips_malta_instance_init,
>  };
>  
> +GlobalProperty malta_compat[] = {
> +    { "PIIX4_PM", "memory-hotplug-support", "off" },
> +    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
> +    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
> +    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
> +};
> +const size_t malta_compat_len = G_N_ELEMENTS(malta_compat);
> +
>  static void mips_malta_machine_init(MachineClass *mc)
>  {
>      mc->desc = "MIPS Malta Core LV";
> @@ -1455,6 +1463,7 @@ static void mips_malta_machine_init(MachineClass *mc)
>      mc->default_cpu_type = MIPS_CPU_TYPE_NAME("24Kf");
>  #endif
>      mc->default_ram_id = "mips_malta.ram";
> +    compat_props_add(mc->compat_props, malta_compat, malta_compat_len);
>  }
>  
>  DEFINE_MACHINE("malta", mips_malta_machine_init)
> -- 
> 2.31.1
Ani Sinha Aug. 4, 2022, 6:44 a.m. UTC | #18
On Wed, Aug 3, 2022 at 3:00 PM Michael S. Tsirkin <mst@redhat.com> wrote:

> On Thu, Jul 28, 2022 at 07:50:34AM -0400, Igor Mammedov wrote:
> > QEMU crashes trying to save VMSTATE when only MIPS target are compiled in
> >   $ qemu-system-mips -monitor stdio
> >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> >   Segmentation fault (core dumped)
> >
> > It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > which are valid only for x86 and not for MIPS (as it requires ACPI
> > tables support which is not existent for ithe later)
> >
> > Issue was probably exposed by trying to cleanup/compile out unused
> > ACPI bits from MIPS target (but forgetting about migration bits).
> >
> > Disable compiled out features using compat properties as the least
> > risky way to deal with issue.
> >
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>
>
> For 7.1 this seems like the lesser evil.
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>


Yes, for 7.1 let's go ahead with this, as it seems the least risky approach.

I've already reviewed it.


>
> > ---
> > PS:
> > another approach could be setting defaults to disabled state and
> > enabling them using compat props on PC machines (which is more
> > code to deal with => more risky) or continue with PIIX4_PM
> > refactoring to split x86-shism out (which I'm not really
> > interested in due to risk of regressions for not much of
> > benefit)
> > ---
> >  hw/mips/malta.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/hw/mips/malta.c b/hw/mips/malta.c
> > index 7a0ec513b0..0e932988e0 100644
> > --- a/hw/mips/malta.c
> > +++ b/hw/mips/malta.c
> > @@ -1442,6 +1442,14 @@ static const TypeInfo mips_malta_device = {
> >      .instance_init = mips_malta_instance_init,
> >  };
> >
> > +GlobalProperty malta_compat[] = {
> > +    { "PIIX4_PM", "memory-hotplug-support", "off" },
> > +    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
> > +    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
> > +    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
> > +};
> > +const size_t malta_compat_len = G_N_ELEMENTS(malta_compat);
> > +
> >  static void mips_malta_machine_init(MachineClass *mc)
> >  {
> >      mc->desc = "MIPS Malta Core LV";
> > @@ -1455,6 +1463,7 @@ static void mips_malta_machine_init(MachineClass
> *mc)
> >      mc->default_cpu_type = MIPS_CPU_TYPE_NAME("24Kf");
> >  #endif
> >      mc->default_ram_id = "mips_malta.ram";
> > +    compat_props_add(mc->compat_props, malta_compat, malta_compat_len);
> >  }
> >
> >  DEFINE_MACHINE("malta", mips_malta_machine_init)
> > --
> > 2.31.1
>
>
Bernhard Beschow Aug. 4, 2022, 9:32 p.m. UTC | #19
On 3 August 2022 20:00:18 CEST, Peter Maydell <peter.maydell@linaro.org> wrote:
>On Wed, 3 Aug 2022 at 18:26, Bernhard Beschow <shentey@gmail.com> wrote:
>>
>> On Tue, Aug 2, 2022 at 8:37 AM Philippe Mathieu-Daudé via <qemu-devel@nongnu.org> wrote:
>>>
>>> On 28/7/22 15:16, Igor Mammedov wrote:
>>> > On Thu, 28 Jul 2022 13:29:07 +0100
>>> > Peter Maydell <peter.maydell@linaro.org> wrote:
>>> >
>>> >> On Thu, 28 Jul 2022 at 12:50, Igor Mammedov <imammedo@redhat.com> wrote:
>>> >>> Disable compiled out features using compat properties as the least
>>> >>> risky way to deal with issue.
>>>
>>> So now MIPS is forced to use meaningless compat[] to satisfy X86.
>>>
>>> Am I wrong seeing this as a dirty hack creeping in, yet another
>>> technical debt that will hit (me...) back in a close future?
>>>
>>> Are we sure there are no better solution (probably more time consuming
>>> and involving refactors) we could do instead?
>>
>>
>> Working on the consolidation of piix3 and -4 soutbridges [1] I've stumbled over certain design decisions where board/platform specific assumptions are baked into the piix device models. I figure that's the core of the issue.
>>
>> In our case the ACPI functionality is implemented by inheritance while perhaps it should be implemented using composition. With composition, the ACPI functionality could be injected by the caller: The pc board would inject it while the Malta board wouldn't. This would solve both the crash and above design problem.
>>
>> I'd be willing to implement it but can't make any promises about the time frame since I'm currently doing this in my free time. Any hints regarding the implementation would be welcome, though.
>
>
>For the 7.1 release (coming up real soon now) can we get consensus
>on this patch from Igor as the least risky way to at least fix
>the segfault ? We can look at better approaches for 7.2.

Hi,

My proposal isn't 7.1 material. I merely intended to start a design discussion on how to proceed after 7.1 that would make Phil's maintainer life easier and provide further insights for my consolidation work.

I don't feel qualified enough to judge the impact of Igor's patch, so I'd leave that for the competent.

Best regards,
Bernhard

>
>thanks
>-- PMM
Igor Mammedov Aug. 8, 2022, 12:15 p.m. UTC | #20
On Wed, 3 Aug 2022 19:26:30 +0200
Bernhard Beschow <shentey@gmail.com> wrote:

> On Tue, Aug 2, 2022 at 8:37 AM Philippe Mathieu-Daudé via <
> qemu-devel@nongnu.org> wrote:
> 
> > On 28/7/22 15:16, Igor Mammedov wrote:  
> > > On Thu, 28 Jul 2022 13:29:07 +0100
> > > Peter Maydell <peter.maydell@linaro.org> wrote:
> > >  
> > >> On Thu, 28 Jul 2022 at 12:50, Igor Mammedov <imammedo@redhat.com>  
> > wrote:  
> > >>>
> > >>> QEMU crashes trying to save VMSTATE when only MIPS target are compiled  
> > in  
> > >>>    $ qemu-system-mips -monitor stdio
> > >>>    (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> > >>>    Segmentation fault (core dumped)
> > >>>
> > >>> It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> > >>> which are valid only for x86 and not for MIPS (as it requires ACPI
> > >>> tables support which is not existent for ithe later)  
> >
> > We already discussed this Frankenstein PIIX4 problem 2 and 4 years ago:
> >
> > https://lore.kernel.org/qemu-devel/4d42697e-ba84-e5af-3a17-a2cc52cf0dbc@redhat.com/
> >
> > https://lore.kernel.org/qemu-devel/20190304210359-mutt-send-email-mst@kernel.org/  
> 
> 
> Interesting reads!
> 
> 
> > >>> Issue was probably exposed by trying to cleanup/compile out unused
> > >>> ACPI bits from MIPS target (but forgetting about migration bits).
> > >>>
> > >>> Disable compiled out features using compat properties as the least
> > >>> risky way to deal with issue.  
> >
> > So now MIPS is forced to use meaningless compat[] to satisfy X86.
> >
> > Am I wrong seeing this as a dirty hack creeping in, yet another
> > technical debt that will hit (me...) back in a close future?
> >
> > Are we sure there are no better solution (probably more time consuming
> > and involving refactors) we could do instead?
> >  
> 
> Working on the consolidation of piix3 and -4 soutbridges [1] I've stumbled
> over certain design decisions where board/platform specific assumptions are
> baked into the piix device models. I figure that's the core of the issue.
> 
> In our case the ACPI functionality is implemented by inheritance while
> perhaps it should be implemented using composition. With composition, the
> ACPI functionality could be injected by the caller: The pc board would
> inject it while the Malta board wouldn't. This would solve both the crash
> and above design problem.

While refactoring we should keep the migration stream compatible with older
QEMU versions (we must not regress the widely used x86 code path), which
might be tricky in this case.

Perhaps the best we could do is follow up on Philippe's idea to make
the PIIX4_PM frankenstein x86-specific (the least chance for regressions)
and create/use a clean version for anything else.

> I'd be willing to implement it but can't make any promises about the time
> frame since I'm currently doing this in my free time. Any hints regarding
> the implementation would be welcome, though.
> 
> Best regards,
> Bernhard
> 
> [1] https://github.com/shentok/qemu/commits/piix-consolidate
> 
> 
> > Thanks,
> >
> > Phil.
> >  
> > >>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>  
> > >>
> > >> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/995
> > >>  
> > >>> ---
> > >>> PS:
> > >>> another approach could be setting defaults to disabled state and
> > >>> enabling them using compat props on PC machines (which is more
> > >>> code to deal with => more risky) or continue with PIIX4_PM
> > >>> refactoring to split x86-shism out (which I'm not really
> > >>> interested in due to risk of regressions for not much of
> > >>> benefit)
> > >>> ---
> > >>>   hw/mips/malta.c | 9 +++++++++
> > >>>   1 file changed, 9 insertions(+)
> > >>>
> > >>> diff --git a/hw/mips/malta.c b/hw/mips/malta.c
> > >>> index 7a0ec513b0..0e932988e0 100644
> > >>> --- a/hw/mips/malta.c
> > >>> +++ b/hw/mips/malta.c
> > >>> @@ -1442,6 +1442,14 @@ static const TypeInfo mips_malta_device = {
> > >>>       .instance_init = mips_malta_instance_init,
> > >>>   };
> > >>>
> > >>> +GlobalProperty malta_compat[] = {
> > >>> +    { "PIIX4_PM", "memory-hotplug-support", "off" },
> > >>> +    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
> > >>> +    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
> > >>> +    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
> > >>> +};  
> > >>
> > >> Is there an easy way to assert in hw/acpi/piix4.c that if
> > >> CONFIG_ACPI_PCIHP was not set then the board has initialized
> > >> all these properties to the don't-use-hotplug state ?
> > >> That would be a guard against similar bugs (though I suppose
> > >> we probably aren't likely to add new piix4 boards...)  
> > >
> > > unfortunately new features still creep in 'pc' machine
> > > ex: "acpi-root-pci-hotplug"), and I don't see an easy
> > > way to compile that nor enforce that in the future.
> > >
> > > Far from easy would be split piix4_pm on base/enhanced
> > > classes so we wouldn't need x86 specific hacks in 'base'
> > > variant (assuming 'enhanced' could maintain the current
> > > VMSTATE to keep cross-version migration working).
> > >  
> > >>> +const size_t malta_compat_len = G_N_ELEMENTS(malta_compat);
> > >>> +
> > >>>   static void mips_malta_machine_init(MachineClass *mc)
> > >>>   {
> > >>>       mc->desc = "MIPS Malta Core LV";
> > >>> @@ -1455,6 +1463,7 @@ static void mips_malta_machine_init(MachineClass  
> > *mc)  
> > >>>       mc->default_cpu_type = MIPS_CPU_TYPE_NAME("24Kf");
> > >>>   #endif
> > >>>       mc->default_ram_id = "mips_malta.ram";
> > >>> +    compat_props_add(mc->compat_props, malta_compat,  
> > malta_compat_len);  
> > >>>   }
> > >>>
> > >>>   DEFINE_MACHINE("malta", mips_malta_machine_init)
> > >>> --
> > >>> 2.31.1  
> > >>
> > >> thanks
> > >> -- PMM
> > >>  
> > >  
> >
> >
> >
Bernhard Beschow Aug. 8, 2022, 5:57 p.m. UTC | #21
On 8 August 2022 14:15:40 CEST, Igor Mammedov <imammedo@redhat.com> wrote:
>On Wed, 3 Aug 2022 19:26:30 +0200
>Bernhard Beschow <shentey@gmail.com> wrote:
>
>> On Tue, Aug 2, 2022 at 8:37 AM Philippe Mathieu-Daudé via <
>> qemu-devel@nongnu.org> wrote:
>> 
>> > On 28/7/22 15:16, Igor Mammedov wrote:  
>> > > On Thu, 28 Jul 2022 13:29:07 +0100
>> > > Peter Maydell <peter.maydell@linaro.org> wrote:
>> > >  
>> > >> On Thu, 28 Jul 2022 at 12:50, Igor Mammedov <imammedo@redhat.com>  
>> > wrote:  
>> > >>>
>> > >>> QEMU crashes trying to save VMSTATE when only MIPS target are compiled  
>> > in  
>> > >>>    $ qemu-system-mips -monitor stdio
>> > >>>    (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>> > >>>    Segmentation fault (core dumped)
>> > >>>
>> > >>> It happens due to PIIX4_PM trying to parse hotplug vmstate structures
>> > >>> which are valid only for x86 and not for MIPS (as it requires ACPI
>> > >>> tables support which is not existent for ithe later)  
>> >
>> > We already discussed this Frankenstein PIIX4 problem 2 and 4 years ago:
>> >
>> > https://lore.kernel.org/qemu-devel/4d42697e-ba84-e5af-3a17-a2cc52cf0dbc@redhat.com/
>> >
>> > https://lore.kernel.org/qemu-devel/20190304210359-mutt-send-email-mst@kernel.org/  
>> 
>> 
>> Interesting reads!
>> 
>> 
>> > >>> Issue was probably exposed by trying to cleanup/compile out unused
>> > >>> ACPI bits from MIPS target (but forgetting about migration bits).
>> > >>>
>> > >>> Disable compiled out features using compat properties as the least
>> > >>> risky way to deal with issue.  
>> >
>> > So now MIPS is forced to use meaningless compat[] to satisfy X86.
>> >
>> > Am I wrong seeing this as a dirty hack creeping in, yet another
>> > technical debt that will hit (me...) back in a close future?
>> >
>> > Are we sure there are no better solution (probably more time consuming
>> > and involving refactors) we could do instead?
>> >  
>> 
>> Working on the consolidation of piix3 and -4 soutbridges [1] I've stumbled
>> over certain design decisions where board/platform specific assumptions are
>> baked into the piix device models. I figure that's the core of the issue.
>> 
>> In our case the ACPI functionality is implemented by inheritance while
>> perhaps it should be implemented using composition. With composition, the
>> ACPI functionality could be injected by the caller: The pc board would
>> inject it while the Malta board wouldn't. This would solve both the crash
>> and above design problem.
>
>While refactoring we should keep migration stream compatible with older
>QEMU versions (we must not regress widely x86 code path). Which might be
>tricky in this case.

Does this particular fix make future compatibility harder or easier, or is it that hard already? IIUC it omits the hotplug bits in the vm state for Malta, which is what one would expect there, right?
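
As a rough model of what "omitting the hotplug bits" means, assuming the device gates optional state on a per-field test callback that reads the properties: all names below (SketchPMState, SketchField, sketch_count_saved_fields) are hypothetical and only mimic the idea, they are not the real piix4 code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; the two flags stand in for the
 * properties that the board turns off. */
typedef struct SketchPMState {
    bool use_pci_hotplug;
    bool use_memory_hotplug;
} SketchPMState;

typedef bool (*SketchFieldTest)(const SketchPMState *s);

/* Hypothetical field descriptor: a NULL test means "always saved". */
typedef struct SketchField {
    const char *name;
    SketchFieldTest needed;
} SketchField;

static bool sketch_needs_pcihp(const SketchPMState *s)
{
    return s->use_pci_hotplug;
}

static bool sketch_needs_memhp(const SketchPMState *s)
{
    return s->use_memory_hotplug;
}

static const SketchField sketch_fields[] = {
    { "pm-registers",   NULL },
    { "pci-hotplug",    sketch_needs_pcihp },
    { "memory-hotplug", sketch_needs_memhp },
};

/* Count the fields that would end up in the migration stream. */
static size_t sketch_count_saved_fields(const SketchPMState *s)
{
    size_t saved = 0;
    size_t total = sizeof(sketch_fields) / sizeof(sketch_fields[0]);

    for (size_t i = 0; i < total; i++) {
        if (sketch_fields[i].needed == NULL || sketch_fields[i].needed(s)) {
            saved++;
        }
    }
    return saved;
}
```

With both flags off, as on Malta after the patch, only the unconditional field is saved, so the hotplug descriptions are never visited.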

>Perhaps the best we could do is follow up on Philippe's idea to make
>PIIX4_PM frankenstein x86-specific (the least chance for regressions)
>and create/use clean version for anything else.

Having two implementations of the same device means that we'll end up having duplicate code with board/platform-specific assumptions baked in. I guess what Phil cares about is a sustainable solution without hacks that doesn't cause bloat and/or regressions for MIPS, especially for features that MIPS doesn't benefit from. I believe that composition could be such a solution.

My consolidation work could actually make PIIX4 an option for the PC machine. This means that PIIX4_PM wouldn't be Frankenstein any more. This works already on my branch - for both PC and Malta. Furthermore, it looks like it allowed Malta to benefit more from KVM virtualization, but that's off-topic in this discussion.

>> I'd be willing to implement it but can't make any promises about the time
>> frame since I'm currently doing this in my free time. Any hints regarding
>> the implementation would be welcome, though.
>> 
>> Best regards,
>> Bernhard
>> 
>> [1] https://github.com/shentok/qemu/commits/piix-consolidate
>> 
>> 
>> > Thanks,
>> >
>> > Phil.
>> >  
>> > >>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>  
>> > >>
>> > >> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/995
>> > >>  
>> > >>> ---
>> > >>> PS:
>> > >>> another approach could be setting defaults to disabled state and
>> > >>> enabling them using compat props on PC machines (which is more
>> > >>> code to deal with => more risky) or continue with PIIX4_PM
>> > >>> refactoring to split x86-shism out (which I'm not really
>> > >>> interested in due to risk of regressions for not much of
>> > >>> benefit)
>> > >>> ---
>> > >>>   hw/mips/malta.c | 9 +++++++++
>> > >>>   1 file changed, 9 insertions(+)
>> > >>>
>> > >>> diff --git a/hw/mips/malta.c b/hw/mips/malta.c
>> > >>> index 7a0ec513b0..0e932988e0 100644
>> > >>> --- a/hw/mips/malta.c
>> > >>> +++ b/hw/mips/malta.c
>> > >>> @@ -1442,6 +1442,14 @@ static const TypeInfo mips_malta_device = {
>> > >>>       .instance_init = mips_malta_instance_init,
>> > >>>   };
>> > >>>
>> > >>> +GlobalProperty malta_compat[] = {
>> > >>> +    { "PIIX4_PM", "memory-hotplug-support", "off" },
>> > >>> +    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
>> > >>> +    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
>> > >>> +    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
>> > >>> +};  
>> > >>
>> > >> Is there an easy way to assert in hw/acpi/piix4.c that if
>> > >> CONFIG_ACPI_PCIHP was not set then the board has initialized
>> > >> all these properties to the don't-use-hotplug state ?
>> > >> That would be a guard against similar bugs (though I suppose
>> > >> we probably aren't likely to add new piix4 boards...)  
>> > >
>> > > unfortunately new features still creep in 'pc' machine
>> > > ex: "acpi-root-pci-hotplug"), and I don't see an easy
>> > > way to compile that nor enforce that in the future.
>> > >
>> > > Far from easy would be split piix4_pm on base/enhanced
>> > > classes so we wouldn't need x86 specific hacks in 'base'
>> > > variant (assuming 'enhanced' could maintain the current
>> > > VMSTATE to keep cross-version migration working).
>> > >  
>> > >>> +const size_t malta_compat_len = G_N_ELEMENTS(malta_compat);
>> > >>> +
>> > >>>   static void mips_malta_machine_init(MachineClass *mc)
>> > >>>   {
>> > >>>       mc->desc = "MIPS Malta Core LV";
>> > >>> @@ -1455,6 +1463,7 @@ static void mips_malta_machine_init(MachineClass  
>> > *mc)  
>> > >>>       mc->default_cpu_type = MIPS_CPU_TYPE_NAME("24Kf");
>> > >>>   #endif
>> > >>>       mc->default_ram_id = "mips_malta.ram";
>> > >>> +    compat_props_add(mc->compat_props, malta_compat,  
>> > malta_compat_len);  
>> > >>>   }
>> > >>>
>> > >>>   DEFINE_MACHINE("malta", mips_malta_machine_init)
>> > >>> --
>> > >>> 2.31.1  
>> > >>
>> > >> thanks
>> > >> -- PMM
>> > >>  
>> > >  
>> >
>> >
>> >  
>
Peter Maydell Aug. 8, 2022, 6:02 p.m. UTC | #22
On Mon, 8 Aug 2022 at 18:57, BB <shentey@gmail.com> wrote:
> On 8 August 2022 14:15:40 CEST, Igor Mammedov <imammedo@redhat.com> wrote:
> >On Wed, 3 Aug 2022 19:26:30 +0200
> >While refactoring we should keep the migration stream compatible with older
> >QEMU versions (we must not regress the widely used x86 code path), which might be
> >tricky in this case.
>
> Does this particular fix make future compatibility harder or easier or is it that hard already? IIUC it omits the hotplug bits in the vm state for Malta which is what one would expect there, right?

This patch's fix only affects Malta. It is (I suspect but have
not tested) a migration compat break on Malta, but we don't
care about cross-version migration compat for that board anyway.
Migration compat matters (to a first approximation) only for
those boards which have versioned machine types (eg pc-7.0,
pc-7.1, etc). For all other machine types we retain compat
only if it's easy.

thanks
-- PMM
Bernhard Beschow Aug. 8, 2022, 9:28 p.m. UTC | #23
On 8 August 2022 20:02:50 CEST, Peter Maydell <peter.maydell@linaro.org> wrote:
>On Mon, 8 Aug 2022 at 18:57, BB <shentey@gmail.com> wrote:
>> On 8 August 2022 14:15:40 CEST, Igor Mammedov <imammedo@redhat.com> wrote:
>> >On Wed, 3 Aug 2022 19:26:30 +0200
>> >While refactoring we should keep the migration stream compatible with older
>> >QEMU versions (we must not regress the widely used x86 code path), which might be
>> >tricky in this case.
>>
>> Does this particular fix make future compatibility harder or easier or is it that hard already? IIUC it omits the hotplug bits in the vm state for Malta which is what one would expect there, right?
>
>This patch's fix only affects Malta. It is (I suspect but have
>not tested) a migration compat break on Malta, but we don't
>care about cross-version migration compat for that board anyway.
>Migration compat matters (to a first approximation) only for
>those boards which have versioned machine types (eg pc-7.0,
>pc-7.1, etc). For all other machine types we retain compat
>only if it's easy.

I see. Thanks for the clarification!

Best regards,
Bernhard
>
>thanks
>-- PMM
Igor Mammedov Aug. 9, 2022, 7:27 a.m. UTC | #24
On Mon, 08 Aug 2022 19:57:23 +0200
BB <shentey@gmail.com> wrote:

> On 8 August 2022 14:15:40 CEST, Igor Mammedov <imammedo@redhat.com> wrote:
> >On Wed, 3 Aug 2022 19:26:30 +0200
> >Bernhard Beschow <shentey@gmail.com> wrote:
> >  
> >> On Tue, Aug 2, 2022 at 8:37 AM Philippe Mathieu-Daudé via <qemu-devel@nongnu.org> wrote:
> >>   
> >> > On 28/7/22 15:16, Igor Mammedov wrote:    
> >> > > On Thu, 28 Jul 2022 13:29:07 +0100
> >> > > Peter Maydell <peter.maydell@linaro.org> wrote:
> >> > >    
> >> > >> On Thu, 28 Jul 2022 at 12:50, Igor Mammedov <imammedo@redhat.com> wrote:
> >> > >>>
> >> > >>> QEMU crashes trying to save VMSTATE when only MIPS targets are compiled in
> >> > >>>    $ qemu-system-mips -monitor stdio
> >> > >>>    (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> >> > >>>    Segmentation fault (core dumped)
> >> > >>>
> >> > >>> It happens due to PIIX4_PM trying to parse hotplug vmstate structures
> >> > >>> which are valid only for x86 and not for MIPS (as it requires ACPI
> >> > >>> tables support which does not exist for the latter)
> >> >
> >> > We already discussed this Frankenstein PIIX4 problem 2 and 4 years ago:
> >> >
> >> > https://lore.kernel.org/qemu-devel/4d42697e-ba84-e5af-3a17-a2cc52cf0dbc@redhat.com/
> >> >
> >> > https://lore.kernel.org/qemu-devel/20190304210359-mutt-send-email-mst@kernel.org/    
> >> 
> >> 
> >> Interesting reads!
> >> 
> >>   
> >> > >>> Issue was probably exposed by trying to cleanup/compile out unused
> >> > >>> ACPI bits from MIPS target (but forgetting about migration bits).
> >> > >>>
> >> > >>> Disable compiled out features using compat properties as the least
> >> > >>> risky way to deal with issue.    
> >> >
> >> > So now MIPS is forced to use meaningless compat[] to satisfy X86.
> >> >
> >> > Am I wrong in seeing this as a dirty hack creeping in, yet another
> >> > piece of technical debt that will hit (me...) back in the near future?
> >> >
> >> > Are we sure there is no better solution (probably more time-consuming
> >> > and involving refactors) we could use instead?
> >> >    
> >> 
> >> Working on the consolidation of piix3 and -4 southbridges [1] I've stumbled
> >> over certain design decisions where board/platform specific assumptions are
> >> baked into the piix device models. I figure that's the core of the issue.
> >> 
> >> In our case the ACPI functionality is implemented by inheritance while
> >> perhaps it should be implemented using composition. With composition, the
> >> ACPI functionality could be injected by the caller: The pc board would
> >> inject it while the Malta board wouldn't. This would solve both the crash
> >> and above design problem.  
> >
> >While refactoring we should keep the migration stream compatible with older
> >QEMU versions (we must not regress the widely used x86 code path), which might be
> >tricky in this case.
> 
> Does this particular fix make future compatibility harder or easier or is it that hard already? IIUC it omits the hotplug bits in the vm state for Malta which is what one would expect there, right?
> 
> >Perhaps the best we could do is follow up on Philippe's idea to make
> >PIIX4_PM frankenstein x86-specific (the least chance for regressions)
> >and create/use clean version for anything else.  
> 
> Having two implementations of the same device means that we'll end up having duplicate code with board/platform-specific assumptions baked in. I guess what Phil cares about is a sustainable solution without hacks that doesn't cause bloat and/or regressions for MIPS, especially for features from which MIPS doesn't benefit. I believe that composition could be such a solution.

Maybe creating a PIIX4_PM-base that carries no VMState code, and then
deriving piix4_pm-speced and PIIX4_PM from it, each carrying
its own VMState descriptors (with minimal duplication, or partly
shared), can be made to work.
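
A rough standalone sketch of that split, to make the shape concrete. It is
not QEMU code and uses no QEMU headers: every type and function name here
(ModelPmBase, ModelPmSpec, ModelPmX86, model_save_*) is hypothetical, and
the "saved state" is just an array of words. The base carries only the
common state and has no save routine of its own; each derived variant has
its own routine and reuses a shared helper to keep duplication small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State common to every variant; the base itself has no save routine. */
typedef struct {
    uint32_t pm_regs;
} ModelPmBase;

/* Spec-only variant: nothing beyond the common state is migrated. */
typedef struct {
    ModelPmBase base;
} ModelPmSpec;

/* x86 variant: adds hotplug state that only x86 machines migrate. */
typedef struct {
    ModelPmBase base;
    uint32_t pci_hotplug;
    uint32_t mem_hotplug;
} ModelPmX86;

/* Shared helper so the two variants do not duplicate the common part. */
static size_t model_save_common(const ModelPmBase *b, uint32_t *out,
                                size_t cap)
{
    assert(b != NULL && out != NULL && cap >= 1);
    out[0] = b->pm_regs;
    return 1;
}

/* Stream layout of the spec-only variant: common state only. */
static size_t model_save_spec(const ModelPmSpec *s, uint32_t *out,
                              size_t cap)
{
    assert(s != NULL);
    return model_save_common(&s->base, out, cap);
}

/* Stream layout of the x86 variant: common state, then hotplug state. */
static size_t model_save_x86(const ModelPmX86 *s, uint32_t *out, size_t cap)
{
    size_t n;

    assert(s != NULL && cap >= 3);
    n = model_save_common(&s->base, out, cap);
    out[n++] = s->pci_hotplug;
    out[n++] = s->mem_hotplug;
    return n;
}
```

In this model the x86 variant keeps emitting the same words in the same
order whatever happens to the spec-only variant, which is the property
that matters for cross-version migration on 'pc'.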

> My consolidation work could actually make PIIX4 an option for the PC machine. This means that PIIX4_PM wouldn't be Frankenstein any more. This works already on my branch - for both PC and Malta. Furthermore, it looks like it allowed Malta to benefit more from KVM virtualization, but that's off-topic in this discussion.
> 
> >> I'd be willing to implement it but can't make any promises about the time
> >> frame since I'm currently doing this in my free time. Any hints regarding
> >> the implementation would be welcome, though.
> >> 
> >> Best regards,
> >> Bernhard
> >> 
> >> [1] https://github.com/shentok/qemu/commits/piix-consolidate
> >> 
> >>   
> >> > Thanks,
> >> >
> >> > Phil.
> >> >    
> >> > >>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>    
> >> > >>
> >> > >> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/995
> >> > >>    
> >> > >>> ---
> >> > >>> PS:
> >> > >>> another approach could be setting defaults to disabled state and
> >> > >>> enabling them using compat props on PC machines (which is more
> >> > >>> code to deal with => more risky) or continue with PIIX4_PM
> >> > >>> refactoring to split x86-shism out (which I'm not really
> >> > >>> interested in due to risk of regressions for not much of
> >> > >>> benefit)
> >> > >>> ---
> >> > >>>   hw/mips/malta.c | 9 +++++++++
> >> > >>>   1 file changed, 9 insertions(+)
> >> > >>>
> >> > >>> diff --git a/hw/mips/malta.c b/hw/mips/malta.c
> >> > >>> index 7a0ec513b0..0e932988e0 100644
> >> > >>> --- a/hw/mips/malta.c
> >> > >>> +++ b/hw/mips/malta.c
> >> > >>> @@ -1442,6 +1442,14 @@ static const TypeInfo mips_malta_device = {
> >> > >>>       .instance_init = mips_malta_instance_init,
> >> > >>>   };
> >> > >>>
> >> > >>> +GlobalProperty malta_compat[] = {
> >> > >>> +    { "PIIX4_PM", "memory-hotplug-support", "off" },
> >> > >>> +    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
> >> > >>> +    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
> >> > >>> +    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
> >> > >>> +};    
> >> > >>
> >> > >> Is there an easy way to assert in hw/acpi/piix4.c that if
> >> > >> CONFIG_ACPI_PCIHP was not set then the board has initialized
> >> > >> all these properties to the don't-use-hotplug state ?
> >> > >> That would be a guard against similar bugs (though I suppose
> >> > >> we probably aren't likely to add new piix4 boards...)    
> >> > >
> >> > > unfortunately new features still creep into the 'pc' machine
> >> > > (ex: "acpi-root-pci-hotplug"), and I don't see an easy
> >> > > way to compile that nor enforce that in the future.
> >> > >
> >> > > Far from easy would be to split piix4_pm into base/enhanced
> >> > > classes so we wouldn't need x86-specific hacks in the 'base'
> >> > > variant (assuming 'enhanced' could maintain the current
> >> > > VMSTATE to keep cross-version migration working).
> >> > >    
> >> > >>> +const size_t malta_compat_len = G_N_ELEMENTS(malta_compat);
> >> > >>> +
> >> > >>>   static void mips_malta_machine_init(MachineClass *mc)
> >> > >>>   {
> >> > >>>       mc->desc = "MIPS Malta Core LV";
> >> > >>> @@ -1455,6 +1463,7 @@ static void mips_malta_machine_init(MachineClass *mc)
> >> > >>>       mc->default_cpu_type = MIPS_CPU_TYPE_NAME("24Kf");
> >> > >>>   #endif
> >> > >>>       mc->default_ram_id = "mips_malta.ram";
> >> > >>> +    compat_props_add(mc->compat_props, malta_compat, malta_compat_len);
> >> > >>>   }
> >> > >>>
> >> > >>>   DEFINE_MACHINE("malta", mips_malta_machine_init)
> >> > >>> --
> >> > >>> 2.31.1    
> >> > >>
> >> > >> thanks
> >> > >> -- PMM
> >> > >>    
> >> > >    
> >> >
> >> >
> >> >    
> >  
>
Patch

diff --git a/hw/mips/malta.c b/hw/mips/malta.c
index 7a0ec513b0..0e932988e0 100644
--- a/hw/mips/malta.c
+++ b/hw/mips/malta.c
@@ -1442,6 +1442,14 @@  static const TypeInfo mips_malta_device = {
     .instance_init = mips_malta_instance_init,
 };
 
+GlobalProperty malta_compat[] = {
+    { "PIIX4_PM", "memory-hotplug-support", "off" },
+    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
+    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
+    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
+};
+const size_t malta_compat_len = G_N_ELEMENTS(malta_compat);
+
 static void mips_malta_machine_init(MachineClass *mc)
 {
     mc->desc = "MIPS Malta Core LV";
@@ -1455,6 +1463,7 @@  static void mips_malta_machine_init(MachineClass *mc)
     mc->default_cpu_type = MIPS_CPU_TYPE_NAME("24Kf");
 #endif
     mc->default_ram_id = "mips_malta.ram";
+    compat_props_add(mc->compat_props, malta_compat, malta_compat_len);
 }
 
 DEFINE_MACHINE("malta", mips_malta_machine_init)
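
To make the mechanism in the patch above easier to follow, here is a
minimal standalone sketch of how a table of compat properties can override
a device's defaults. It is a model only and does not use QEMU's headers:
ModelGlobalProperty, ModelPmDevice, model_pm_init and model_apply_compat
are hypothetical names, and the x86-flavoured defaults are an assumption
made for illustration. Only the four property names and their values are
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a {driver, property, value} triple; not QEMU's type. */
typedef struct {
    const char *driver;
    const char *property;
    const char *value;
} ModelGlobalProperty;

/* Hypothetical device state holding the four knobs named in the patch. */
typedef struct {
    const char *type_name;
    bool memory_hotplug;
    bool bridge_hotplug;
    bool root_hotplug;
    bool not_migrate_acpi_index;
} ModelPmDevice;

/* Same entries as malta_compat[] in the patch. */
static const ModelGlobalProperty model_malta_compat[] = {
    { "PIIX4_PM", "memory-hotplug-support", "off" },
    { "PIIX4_PM", "acpi-pci-hotplug-with-bridge-support", "off" },
    { "PIIX4_PM", "acpi-root-pci-hotplug", "off" },
    { "PIIX4_PM", "x-not-migrate-acpi-index", "true" },
};
static const size_t model_malta_compat_len =
    sizeof(model_malta_compat) / sizeof(model_malta_compat[0]);

/* Assumed defaults: hotplug features on, acpi-index migrated. */
static void model_pm_init(ModelPmDevice *dev)
{
    assert(dev != NULL);
    dev->type_name = "PIIX4_PM";
    dev->memory_hotplug = true;
    dev->bridge_hotplug = true;
    dev->root_hotplug = true;
    dev->not_migrate_acpi_index = false;
}

/* Treat "on" and "true" as enabled, anything else as disabled. */
static bool model_parse_bool(const char *s)
{
    return strcmp(s, "on") == 0 || strcmp(s, "true") == 0;
}

/* Apply every entry whose driver matches the device's type name. */
static void model_apply_compat(ModelPmDevice *dev,
                               const ModelGlobalProperty *props, size_t n)
{
    assert(dev != NULL && props != NULL);
    for (size_t i = 0; i < n; i++) {
        bool v;

        if (strcmp(props[i].driver, dev->type_name) != 0) {
            continue;
        }
        v = model_parse_bool(props[i].value);
        if (strcmp(props[i].property, "memory-hotplug-support") == 0) {
            dev->memory_hotplug = v;
        } else if (strcmp(props[i].property,
                          "acpi-pci-hotplug-with-bridge-support") == 0) {
            dev->bridge_hotplug = v;
        } else if (strcmp(props[i].property, "acpi-root-pci-hotplug") == 0) {
            dev->root_hotplug = v;
        } else if (strcmp(props[i].property,
                          "x-not-migrate-acpi-index") == 0) {
            dev->not_migrate_acpi_index = v;
        }
    }
}
```

Calling model_pm_init() and then model_apply_compat() with
model_malta_compat leaves all three hotplug flags off and
not_migrate_acpi_index set, which mirrors the state the patch asks for on
Malta.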