diff mbox series

[2/4] ppc: add CPU IRQ state to PPC VMStateDescription

Message ID 1505054255-2990-3-git-send-email-mark.cave-ayland@ilande.co.uk
State New
Headers show
Series ppc: migration fixes for TCG | expand

Commit Message

Mark Cave-Ayland Sept. 10, 2017, 2:37 p.m. UTC
Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
appears to drop the internal CPU IRQ state from the migration stream. Whilst
testing migration on g3beige/mac99 machines, test images would randomly fail to
resume unless a key was pressed on the VGA console.

Further investigation suggests that internal CPU IRQ state isn't being
preserved and so interrupts asserted at the time of migration are lost. Adding
the pending_interrupts and irq_input_state fields back into the migration
stream appears to fix the problem here during local tests.

As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
the additional fields.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/machine.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Greg Kurz Sept. 11, 2017, 7:50 a.m. UTC | #1
On Sun, 10 Sep 2017 15:37:33 +0100
Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:

> Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> appears to drop the internal CPU IRQ state from the migration stream. Whilst
> testing migration on g3beige/mac99 machines, test images would randomly fail to
> resume unless a key was pressed on the VGA console.
> 
> Further investigation suggests that internal CPU IRQ state isn't being
> preserved and so interrupts asserted at the time of migration are lost. Adding
> the pending_interrupts and irq_input_state fields back into the migration
> stream appears to fix the problem here during local tests.
> 
> As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> the additional fields.
> 

And so this unconditionally breaks backward migration... what about adding
a subsection for this ?

> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/machine.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> index e59049f..8fec1a4 100644
> --- a/target/ppc/machine.c
> +++ b/target/ppc/machine.c
> @@ -647,7 +647,7 @@ static const VMStateDescription vmstate_compat = {
>  
>  const VMStateDescription vmstate_ppc_cpu = {
>      .name = "cpu",
> -    .version_id = 5,
> +    .version_id = 6,
>      .minimum_version_id = 5,
>      .minimum_version_id_old = 4,
>      .load_state_old = cpu_load_old,
> @@ -678,6 +678,10 @@ const VMStateDescription vmstate_ppc_cpu = {
>          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
>          /* FIXME: access_type? */
>  
> +        /* Interrupt state */
> +        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
> +        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
> +
>          /* Sanity checking */
>          VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
>          VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),
Dr. David Alan Gilbert Sept. 11, 2017, 9:30 a.m. UTC | #2
* Greg Kurz (groug@kaod.org) wrote:
> On Sun, 10 Sep 2017 15:37:33 +0100
> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> 
> > Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> > appears to drop the internal CPU IRQ state from the migration stream. Whilst
> > testing migration on g3beige/mac99 machines, test images would randomly fail to
> > resume unless a key was pressed on the VGA console.
> > 
> > Further investigation suggests that internal CPU IRQ state isn't being
> > preserved and so interrupts asserted at the time of migration are lost. Adding
> > the pending_interrupts and irq_input_state fields back into the migration
> > stream appears to fix the problem here during local tests.
> > 
> > As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> > the additional fields.
> > 
> 
> And so this unconditionally breaks backward migration... what about adding
> a subsection for this ?

and wiring it to a flag on the machine type so that older machine types
don't send it.

Dave

> > Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> > ---
> >  target/ppc/machine.c |    6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> > index e59049f..8fec1a4 100644
> > --- a/target/ppc/machine.c
> > +++ b/target/ppc/machine.c
> > @@ -647,7 +647,7 @@ static const VMStateDescription vmstate_compat = {
> >  
> >  const VMStateDescription vmstate_ppc_cpu = {
> >      .name = "cpu",
> > -    .version_id = 5,
> > +    .version_id = 6,
> >      .minimum_version_id = 5,
> >      .minimum_version_id_old = 4,
> >      .load_state_old = cpu_load_old,
> > @@ -678,6 +678,10 @@ const VMStateDescription vmstate_ppc_cpu = {
> >          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
> >          /* FIXME: access_type? */
> >  
> > +        /* Interrupt state */
> > +        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
> > +        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
> > +
> >          /* Sanity checking */
> >          VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
> >          VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),
> 


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
David Gibson Sept. 11, 2017, 10:48 a.m. UTC | #3
On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
> * Greg Kurz (groug@kaod.org) wrote:
> > On Sun, 10 Sep 2017 15:37:33 +0100
> > Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> > 
> > > Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> > > appears to drop the internal CPU IRQ state from the migration stream. Whilst
> > > testing migration on g3beige/mac99 machines, test images would randomly fail to
> > > resume unless a key was pressed on the VGA console.
> > > 
> > > Further investigation suggests that internal CPU IRQ state isn't being
> > > preserved and so interrupts asserted at the time of migration are lost. Adding
> > > the pending_interrupts and irq_input_state fields back into the migration
> > > stream appears to fix the problem here during local tests.
> > > 
> > > As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> > > the additional fields.
> > > 
> > 
> > And so this unconditionally breaks backward migration... what about adding
> > a subsection for this ?
> 
> and wiring it to a flag on the machine type so that older machine types
> don't send it.

Right, a subsection is certainly necessary to avoid breaking backwards
migration.

But apart from that I want to understand better exactly why this is
necessary.  What's the state that's being lost, and is it really not
recoverable from anywhere else.

The other thing that concerns me is how we're encoding the
information.  These are essentially internal fields, not reflecting
something with an architected encoding - adding those to the migration
stream is often a bad idea - it inhibits our ability to rework
internal encodings.
> 
> Dave
> 
> > > Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> > > ---
> > >  target/ppc/machine.c |    6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> > > index e59049f..8fec1a4 100644
> > > --- a/target/ppc/machine.c
> > > +++ b/target/ppc/machine.c
> > > @@ -647,7 +647,7 @@ static const VMStateDescription vmstate_compat = {
> > >  
> > >  const VMStateDescription vmstate_ppc_cpu = {
> > >      .name = "cpu",
> > > -    .version_id = 5,
> > > +    .version_id = 6,
> > >      .minimum_version_id = 5,
> > >      .minimum_version_id_old = 4,
> > >      .load_state_old = cpu_load_old,
> > > @@ -678,6 +678,10 @@ const VMStateDescription vmstate_ppc_cpu = {
> > >          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
> > >          /* FIXME: access_type? */
> > >  
> > > +        /* Interrupt state */
> > > +        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
> > > +        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
> > > +
> > >          /* Sanity checking */
> > >          VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
> > >          VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),
> > 
> 
>
Mark Cave-Ayland Sept. 11, 2017, 4:46 p.m. UTC | #4
On 11/09/17 11:48, David Gibson wrote:

> On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
>> * Greg Kurz (groug@kaod.org) wrote:
>>> On Sun, 10 Sep 2017 15:37:33 +0100
>>> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
>>>
>>>> Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
>>>> appears to drop the internal CPU IRQ state from the migration stream. Whilst
>>>> testing migration on g3beige/mac99 machines, test images would randomly fail to
>>>> resume unless a key was pressed on the VGA console.
>>>>
>>>> Further investigation suggests that internal CPU IRQ state isn't being
>>>> preserved and so interrupts asserted at the time of migration are lost. Adding
>>>> the pending_interrupts and irq_input_state fields back into the migration
>>>> stream appears to fix the problem here during local tests.
>>>>
>>>> As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
>>>> the additional fields.
>>>>
>>>
>>> And so this unconditionally breaks backward migration... what about adding
>>> a subsection for this ?
>>
>> and wiring it to a flag on the machine type so that older machine types
>> don't send it.
> 
> Right, a subsection is certainly necessary to avoid breaking backwards
> migration.

The suggestion of using the VMSTATE_*_V macros with an increased version
number came from Alexey's original review of the patch many months ago
which is why I did it that way.

Out of curiosity though, what is the criteria for supporting backwards
migration? Obviously forward migration is supported as-is in this
manner, so what determines if a patch needs to be backwards compatible
and how far?

> But apart from that I want to understand better exactly why this is
> necessary.  What's the state that's being lost, and is it really not
> recoverable from anywhere else.

The test case I have is installing Darwin PPC 6.02 with qemu-system-ppc
TCG and repeatedly pausing, executing "savevm foo", then quitting and
continuing with "-loadvm foo" during the installation phase. About 1 in
10 times the installer hangs after the loadvm until I press a key, at
which point it carries on as normal.

I then proceeded to going backwards through the git history until I
found out that it was the removal of the pending_interrupts,
irq_input_state and access_type fields during the conversion to
VMStateDescription commit a90db15 which seemed to cause the problem.

> The other thing that concerns me is how we're encoding the
> information.  These are essentially internal fields, not reflecting
> something with an architected encoding - adding those to the migration
> stream is often a bad idea - it inhibits our ability to rework
> internal encodings.

I'm not sure how this should be managed, however there was a similar
issue with excp_prefix (which was also removed in a90db15) that was
fixed in 2360b6e by calling a helper in cpu_post_load(). I can't easily
see how could work with env.pending_interrupts and env.irq_input_state
though.


ATB,

Mark.
Dr. David Alan Gilbert Sept. 11, 2017, 5:19 p.m. UTC | #5
* Mark Cave-Ayland (mark.cave-ayland@ilande.co.uk) wrote:
> On 11/09/17 11:48, David Gibson wrote:
> 
> > On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
> >> * Greg Kurz (groug@kaod.org) wrote:
> >>> On Sun, 10 Sep 2017 15:37:33 +0100
> >>> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> >>>
> >>>> Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> >>>> appears to drop the internal CPU IRQ state from the migration stream. Whilst
> >>>> testing migration on g3beige/mac99 machines, test images would randomly fail to
> >>>> resume unless a key was pressed on the VGA console.
> >>>>
> >>>> Further investigation suggests that internal CPU IRQ state isn't being
> >>>> preserved and so interrupts asserted at the time of migration are lost. Adding
> >>>> the pending_interrupts and irq_input_state fields back into the migration
> >>>> stream appears to fix the problem here during local tests.
> >>>>
> >>>> As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> >>>> the additional fields.
> >>>>
> >>>
> >>> And so this unconditionally breaks backward migration... what about adding
> >>> a subsection for this ?
> >>
> >> and wiring it to a flag on the machine type so that older machine types
> >> don't send it.
> > 
> > Right, a subsection is certainly necessary to avoid breaking backwards
> > migration.
> 
> The suggestion of using the VMSTATE_*_V macros with an increased version
> number came from Alexey's original review of the patch many months ago
> which is why I did it that way.
> 
> Out of curiosity though, what is the criteria for supporting backwards
> migration? Obviously forward migration is supported as-is in this
> manner, so what determines if a patch needs to be backwards compatible
> and how far?

It's a bit fuzzy.  Downstream we do backwards migration between various
versions - and those versions are pretty arbitrarily chosen.
Generally I prefer if we don't break that upstream either, although
it isn't tested much.

If it was in code that was specific to your g3beige I wouldn't mind;
but for ppc in general then if it breaks the server migration it'll
be a pain we'd have to then fix.  Best to keep it working upstream.

But it's fairly easy to put new fields in a subsection and tie it
to a property;  that makes it easy to switch it on/off in machine
types.

> > But apart from that I want to understand better exactly why this is
> > necessary.  What's the state that's being lost, and is it really not
> > recoverable from anywhere else.
> 
> The test case I have is installing Darwin PPC 6.02 with qemu-system-ppc
> TCG and repeatedly pausing, executing "savevm foo", then quitting and
> continuing with "-loadvm foo" during the installation phase. About 1 in
> 10 times the installer hangs after the loadvm until I press a key, at
> which point it carries on as normal.
> 
> I then proceeded to going backwards through the git history until I
> found out that it was the removal of the pending_interrupts,
> irq_input_state and access_type fields during the conversion to
> VMStateDescription commit a90db15 which seemed to cause the problem.
> 
> > The other thing that concerns me is how we're encoding the
> > information.  These are essentially internal fields, not reflecting
> > something with an architected encoding - adding those to the migration
> > stream is often a bad idea - it inhibits our ability to rework
> > internal encodings.
> 
> I'm not sure how this should be managed, however there was a similar
> issue with excp_prefix (which was also removed in a90db15) that was
> fixed in 2360b6e by calling a helper in cpu_post_load(). I can't easily
> see how could work with env.pending_interrupts and env.irq_input_state
> though.

Without knowing anything about this hardware... generally the migration
stream should reflect the real state of the system rather than internal
implementation detail, that way if you change the implementation you
don't need to fudge the state.  Having said that, there's generally
some internal state that's perhaps not immediately obvious from specs
until you try and implement it.

Dave

> ATB,
> 
> Mark.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Dr. David Alan Gilbert Sept. 12, 2017, 4:21 p.m. UTC | #6
* David Gibson (david@gibson.dropbear.id.au) wrote:
> On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
> > * Greg Kurz (groug@kaod.org) wrote:
> > > On Sun, 10 Sep 2017 15:37:33 +0100
> > > Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> > > 
> > > > Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> > > > appears to drop the internal CPU IRQ state from the migration stream. Whilst
> > > > testing migration on g3beige/mac99 machines, test images would randomly fail to
> > > > resume unless a key was pressed on the VGA console.
> > > > 
> > > > Further investigation suggests that internal CPU IRQ state isn't being
> > > > preserved and so interrupts asserted at the time of migration are lost. Adding
> > > > the pending_interrupts and irq_input_state fields back into the migration
> > > > stream appears to fix the problem here during local tests.
> > > > 
> > > > As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> > > > the additional fields.
> > > > 
> > > 
> > > And so this unconditionally breaks backward migration... what about adding
> > > a subsection for this ?
> > 
> > and wiring it to a flag on the machine type so that older machine types
> > don't send it.
> 
> Right, a subsection is certainly necessary to avoid breaking backwards
> migration.
> 
> But apart from that I want to understand better exactly why this is
> necessary.  What's the state that's being lost, and is it really not
> recoverable from anywhere else.
> 
> The other thing that concerns me is how we're encoding the
> information.  These are essentially internal fields, not reflecting
> something with an architected encoding - adding those to the migration
> stream is often a bad idea - it inhibits our ability to rework
> internal encodings.

Yes, agreed, where possible the contents of the stream should reflect
'real' state that's actually being modelled and be as independent of
the implementation as possible.

Dave

> > 
> > Dave
> > 
> > > > Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> > > > ---
> > > >  target/ppc/machine.c |    6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> > > > index e59049f..8fec1a4 100644
> > > > --- a/target/ppc/machine.c
> > > > +++ b/target/ppc/machine.c
> > > > @@ -647,7 +647,7 @@ static const VMStateDescription vmstate_compat = {
> > > >  
> > > >  const VMStateDescription vmstate_ppc_cpu = {
> > > >      .name = "cpu",
> > > > -    .version_id = 5,
> > > > +    .version_id = 6,
> > > >      .minimum_version_id = 5,
> > > >      .minimum_version_id_old = 4,
> > > >      .load_state_old = cpu_load_old,
> > > > @@ -678,6 +678,10 @@ const VMStateDescription vmstate_ppc_cpu = {
> > > >          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
> > > >          /* FIXME: access_type? */
> > > >  
> > > > +        /* Interrupt state */
> > > > +        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
> > > > +        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
> > > > +
> > > >          /* Sanity checking */
> > > >          VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
> > > >          VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),
> > > 
> > 
> > 
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Mark Cave-Ayland Sept. 12, 2017, 4:41 p.m. UTC | #7
On 12/09/17 17:21, Dr. David Alan Gilbert wrote:

>> Right, a subsection is certainly necessary to avoid breaking backwards
>> migration.
>>
>> But apart from that I want to understand better exactly why this is
>> necessary.  What's the state that's being lost, and is it really not
>> recoverable from anywhere else.
>>
>> The other thing that concerns me is how we're encoding the
>> information.  These are essentially internal fields, not reflecting
>> something with an architected encoding - adding those to the migration
>> stream is often a bad idea - it inhibits our ability to rework
>> internal encodings.
> 
> Yes, agreed, where possible the contents of the stream should reflect
> 'real' state that's actually being modelled and be as independent of
> the implementation as possible.

Oh sure. However I should re-iterate that I'm not trying to add new
fields into the migration stream, merely reinstate the ones that were
dropped without warning in commit a90db15 by Alexey since without them
TCG state doesn't restore correctly in my local tests.

The commit message mentions that prior to the conversion some CPU state
was missing but it doesn't mention anything about dropping existing
fields as part of the conversion process so I suspect that this was an
accidental side-effect.

I can definitely look at re-implementing the patch using a subsection in
order to preserve backwards compatibility though.


ATB,

Mark.
Mark Cave-Ayland Sept. 12, 2017, 4:46 p.m. UTC | #8
On 12/09/17 17:41, Mark Cave-Ayland wrote:

> The commit message mentions that prior to the conversion some CPU state
> was missing but it doesn't mention anything about dropping existing
> fields as part of the conversion process so I suspect that this was an
> accidental side-effect.

Actually I've clicked send a little too early here since re-reading the
last paragraph of a90db15 I can see the inference here: "Exactly what
needs to be saved in what configurations has been more carefully
examined, too".

Alexey - do you recall from your analysis why these fields were no
longer deemed necessary, and how your TCG tests were configured?


ATB,

Mark.
Alexey Kardashevskiy Sept. 13, 2017, 2:23 a.m. UTC | #9
On 13/09/17 02:46, Mark Cave-Ayland wrote:
> On 12/09/17 17:41, Mark Cave-Ayland wrote:
> 
>> The commit message mentions that prior to the conversion some CPU state
>> was missing but it doesn't mention anything about dropping existing
>> fields as part of the conversion process so I suspect that this was an
>> accidental side-effect.
> 
> Actually I've clicked send a little too early here since re-reading the
> last paragraph of a90db15 I can see the inference here: "Exactly what
> needs to be saved in what configurations has been more carefully
> examined, too".
> 
> Alexey - do you recall from your analysis why these fields were no
> longer deemed necessary, and how your TCG tests were configured?

I most certainly did not do analysis (my bad. sorry) - I took the patch
from David as he left the team, fixed to compile and pushed away. I am also
very suspicions we did not try migrating TCG or anything but pseries. My
guest that things did not break (if they did not which I am not sure about,
for the TCG case) because the interrupt controller (XICS) or the
pseries-guest took care of resending an interrupt which does not seem to be
the case for mac99.
David Gibson Sept. 13, 2017, 6:02 a.m. UTC | #10
On Wed, Sep 13, 2017 at 12:23:13PM +1000, Alexey Kardashevskiy wrote:
> On 13/09/17 02:46, Mark Cave-Ayland wrote:
> > On 12/09/17 17:41, Mark Cave-Ayland wrote:
> > 
> >> The commit message mentions that prior to the conversion some CPU state
> >> was missing but it doesn't mention anything about dropping existing
> >> fields as part of the conversion process so I suspect that this was an
> >> accidental side-effect.
> > 
> > Actually I've clicked send a little too early here since re-reading the
> > last paragraph of a90db15 I can see the inference here: "Exactly what
> > needs to be saved in what configurations has been more carefully
> > examined, too".
> > 
> > Alexey - do you recall from your analysis why these fields were no
> > longer deemed necessary, and how your TCG tests were configured?
> 
> I most certainly did not do analysis (my bad. sorry) - I took the patch
> from David as he left the team, fixed to compile and pushed away. I am also
> very suspicions we did not try migrating TCG or anything but pseries. My
> guest that things did not break (if they did not which I am not sure about,
> for the TCG case) because the interrupt controller (XICS) or the
> pseries-guest took care of resending an interrupt which does not seem to be
> the case for mac99.

Right, that's probably true.  The main point, though, is that these
fields were dropped a *long* time ago, when migration was barely
working to begin with.  In particular I'm pretty sure most of the
non-pseries platforms were already pretty broken for migration
(amongst other things).

Polishing the mac platforms up to working again, including migration,
is a reasonable goal.  But it can't be at the expense of pseries,
which is already working, used in production, and much better tested
than mac99 or g3beige ever were.
David Gibson Sept. 13, 2017, 7:03 a.m. UTC | #11
On Mon, Sep 11, 2017 at 06:19:54PM +0100, Dr. David Alan Gilbert wrote:
> * Mark Cave-Ayland (mark.cave-ayland@ilande.co.uk) wrote:
> > On 11/09/17 11:48, David Gibson wrote:
> > 
> > > On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
> > >> * Greg Kurz (groug@kaod.org) wrote:
> > >>> On Sun, 10 Sep 2017 15:37:33 +0100
> > >>> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> > >>>
> > >>>> Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> > >>>> appears to drop the internal CPU IRQ state from the migration stream. Whilst
> > >>>> testing migration on g3beige/mac99 machines, test images would randomly fail to
> > >>>> resume unless a key was pressed on the VGA console.
> > >>>>
> > >>>> Further investigation suggests that internal CPU IRQ state isn't being
> > >>>> preserved and so interrupts asserted at the time of migration are lost. Adding
> > >>>> the pending_interrupts and irq_input_state fields back into the migration
> > >>>> stream appears to fix the problem here during local tests.
> > >>>>
> > >>>> As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> > >>>> the additional fields.
> > >>>>
> > >>>
> > >>> And so this unconditionally breaks backward migration... what about adding
> > >>> a subsection for this ?
> > >>
> > >> and wiring it to a flag on the machine type so that older machine types
> > >> don't send it.
> > > 
> > > Right, a subsection is certainly necessary to avoid breaking backwards
> > > migration.
> > 
> > The suggestion of using the VMSTATE_*_V macros with an increased version
> > number came from Alexey's original review of the patch many months ago
> > which is why I did it that way.
> > 
> > Out of curiosity though, what is the criteria for supporting backwards
> > migration? Obviously forward migration is supported as-is in this
> > manner, so what determines if a patch needs to be backwards compatible
> > and how far?
> 
> It's a bit fuzzy.  Downstream we do backwards migration between various
> versions - and those versions are pretty arbitrarily chosen.
> Generally I prefer if we don't break that upstream either, although
> it isn't tested much.

Right.  In this case not breaking backwards migration upstream is
essentially a favour to downstream, since unbreaking it downstream
once its broken upstream is a real PITA.

> If it was in code that was specific to your g3beige I wouldn't mind;
> but for ppc in general then if it breaks the server migration it'll
> be a pain we'd have to then fix.  Best to keep it working upstream.

Right.

> But it's fairly easy to put new fields in a subsection and tie it
> to a property;  that makes it easy to switch it on/off in machine
> types.

In this case I'm not sure we even need a property - I think we could
migrate it only when it's non-zero.  That shold only break it in cases
where actually it would already break.

> > > But apart from that I want to understand better exactly why this is
> > > necessary.  What's the state that's being lost, and is it really not
> > > recoverable from anywhere else.
> > 
> > The test case I have is installing Darwin PPC 6.02 with qemu-system-ppc
> > TCG and repeatedly pausing, executing "savevm foo", then quitting and
> > continuing with "-loadvm foo" during the installation phase. About 1 in
> > 10 times the installer hangs after the loadvm until I press a key, at
> > which point it carries on as normal.
> > 
> > I then proceeded to going backwards through the git history until I
> > found out that it was the removal of the pending_interrupts,
> > irq_input_state and access_type fields during the conversion to
> > VMStateDescription commit a90db15 which seemed to cause the problem.
> > 
> > > The other thing that concerns me is how we're encoding the
> > > information.  These are essentially internal fields, not reflecting
> > > something with an architected encoding - adding those to the migration
> > > stream is often a bad idea - it inhibits our ability to rework
> > > internal encodings.
> > 
> > I'm not sure how this should be managed, however there was a similar
> > issue with excp_prefix (which was also removed in a90db15) that was
> > fixed in 2360b6e by calling a helper in cpu_post_load(). I can't easily
> > see how could work with env.pending_interrupts and env.irq_input_state
> > though.
> 
> Without knowing anything about this hardware... generally the migration
> stream should reflect the real state of the system rather than internal
> implementation detail, that way if you change the implementation you
> don't need to fudge the state.  Having said that, there's generally
> some internal state that's perhaps not immediately obvious from specs
> until you try and implement it.

Right.  I'm not really sure how to handle this yet.  The CPU irq
numbers are pretty much arbitrarily assigned, I don't think much
thought has gone into them.  And if its going to become part of the
migration ABI, some thought needs to be put into it.
Mark Cave-Ayland Sept. 13, 2017, 4:44 p.m. UTC | #12
On 13/09/17 07:02, David Gibson wrote:

>>> Alexey - do you recall from your analysis why these fields were no
>>> longer deemed necessary, and how your TCG tests were configured?
>>
>> I most certainly did not do analysis (my bad. sorry) - I took the patch
>> from David as he left the team, fixed to compile and pushed away. I am also
>> very suspicions we did not try migrating TCG or anything but pseries. My
>> guest that things did not break (if they did not which I am not sure about,
>> for the TCG case) because the interrupt controller (XICS) or the
>> pseries-guest took care of resending an interrupt which does not seem to be
>> the case for mac99.
> 
> Right, that's probably true.  The main point, though, is that these
> fields were dropped a *long* time ago, when migration was barely
> working to begin with.  In particular I'm pretty sure most of the
> non-pseries platforms were already pretty broken for migration
> (amongst other things).
> 
> Polishing the mac platforms up to working again, including migration,
> is a reasonable goal.  But it can't be at the expense of pseries,
> which is already working, used in production, and much better tested
> than mac99 or g3beige ever were.

Oh I completely agree since I'm well aware pseries likely has more users
than the Mac machines - my question was directed more about why we
support backwards migration.

I spent several hours yesterday poking my Darwin test case with trying
the different combinations of pending_interrupts, irq_input_state and
access_type and could easily provoke migration failures unless all 3 of
the fields were present so a practical test shows they are still
required for TCG migration. I think ppc_set_irq()'s use of the interrupt
fields in hw/ppc/ppc.c and the subsequent reference to pending
interrupts in target/ppc may explain why I see freezes/hangs until a key
is pressed in many cases.


ATB,

Mark.
Greg Kurz Sept. 13, 2017, 5:13 p.m. UTC | #13
On Wed, 13 Sep 2017 17:44:54 +0100
Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:

> On 13/09/17 07:02, David Gibson wrote:
> 
> >>> Alexey - do you recall from your analysis why these fields were no
> >>> longer deemed necessary, and how your TCG tests were configured?  
> >>
> >> I most certainly did not do analysis (my bad. sorry) - I took the patch
> >> from David as he left the team, fixed to compile and pushed away. I am also
> >> very suspicions we did not try migrating TCG or anything but pseries. My
> >> guest that things did not break (if they did not which I am not sure about,
> >> for the TCG case) because the interrupt controller (XICS) or the
> >> pseries-guest took care of resending an interrupt which does not seem to be
> >> the case for mac99.  
> > 
> > Right, that's probably true.  The main point, though, is that these
> > fields were dropped a *long* time ago, when migration was barely
> > working to begin with.  In particular I'm pretty sure most of the
> > non-pseries platforms were already pretty broken for migration
> > (amongst other things).
> > 
> > Polishing the mac platforms up to working again, including migration,
> > is a reasonable goal.  But it can't be at the expense of pseries,
> > which is already working, used in production, and much better tested
> > than mac99 or g3beige ever were.  
> 
> Oh I completely agree since I'm well aware pseries likely has more users
> than the Mac machines - my question was directed more about why we
> support backwards migration.
> 

Downstream support backward migration because end users/customers ask for it
for maximum flexibility when it comes to move workloads around different systems
with different QEMU versions. This is fairly usual in data centers with many
systems.

As others already said, breaking things upstream may turn downstream work
into a nightmare (and FWIW, most of the people working on ppc are also
involved in downstream work).

Cheers,

--
Greg

> I spent several hours yesterday poking my Darwin test case with trying
> the different combinations of pending_interrupts, irq_input_state and
> access_type and could easily provoke migration failures unless all 3 of
> the fields were present so a practical test shows they are still
> required for TCG migration. I think ppc_set_irq()'s use of the interrupt
> fields in hw/ppc/ppc.c and the subsequent reference to pending
> interrupts in target/ppc may explain why I see freezes/hangs until a key
> is pressed in many cases.
> 
> 
> ATB,
> 
> Mark.
David Gibson Sept. 14, 2017, 3:30 a.m. UTC | #14
On Wed, Sep 13, 2017 at 05:44:54PM +0100, Mark Cave-Ayland wrote:
> On 13/09/17 07:02, David Gibson wrote:
> 
> >>> Alexey - do you recall from your analysis why these fields were no
> >>> longer deemed necessary, and how your TCG tests were configured?
> >>
> >> I most certainly did not do analysis (my bad. sorry) - I took the patch
> >> from David as he left the team, fixed to compile and pushed away. I am also
> >> very suspicions we did not try migrating TCG or anything but pseries. My
> >> guest that things did not break (if they did not which I am not sure about,
> >> for the TCG case) because the interrupt controller (XICS) or the
> >> pseries-guest took care of resending an interrupt which does not seem to be
> >> the case for mac99.
> > 
> > Right, that's probably true.  The main point, though, is that these
> > fields were dropped a *long* time ago, when migration was barely
> > working to begin with.  In particular I'm pretty sure most of the
> > non-pseries platforms were already pretty broken for migration
> > (amongst other things).
> > 
> > Polishing the mac platforms up to working again, including migration,
> > is a reasonable goal.  But it can't be at the expense of pseries,
> > which is already working, used in production, and much better tested
> > than mac99 or g3beige ever were.
> 
> Oh I completely agree since I'm well aware pseries likely has more users
> than the Mac machines - my question was directed more about why we
> support backwards migration.
> 
> I spent several hours yesterday poking my Darwin test case with trying
> the different combinations of pending_interrupts, irq_input_state and
> access_type and could easily provoke migration failures unless all 3 of
> the fields were present so a practical test shows they are still
> required for TCG migration. I think ppc_set_irq()'s use of the interrupt
> fields in hw/ppc/ppc.c and the subsequent reference to pending
> interrupts in target/ppc may explain why I see freezes/hangs until a key
> is pressed in many cases.

Ok, I think we need to consider (pending_interrupts and irq_input_state)
separately from access_type.  The first two are pretty closely related
to each other, and I've got at least a rough idea of what the problems
there might be.  access_type I'm pretty sure has to be an unrelated
problem, and I've got much less of a handle on it.

I suspect we could work around the problems with pending_interrupts
and irq_input_state by having a post_load hook in the board level
interrupt controller to reassert its output irq line based on its
current state.  I believe the relevant irq inputs to the cpu are
effectively level triggered, so I think that will be enough.

access_type I don't have any good ideas for yet.  We really need to
work out what the exact race is here that's causing its state to be
lost harmfully.
David Gibson Sept. 14, 2017, 3:48 a.m. UTC | #15
On Wed, Sep 13, 2017 at 07:13:09PM +0200, Greg Kurz wrote:
> On Wed, 13 Sep 2017 17:44:54 +0100
> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> 
> > On 13/09/17 07:02, David Gibson wrote:
> > 
> > >>> Alexey - do you recall from your analysis why these fields were no
> > >>> longer deemed necessary, and how your TCG tests were configured?  
> > >>
> > >> I most certainly did not do analysis (my bad. sorry) - I took the patch
> > >> from David as he left the team, fixed to compile and pushed away. I am also
> > >> very suspicions we did not try migrating TCG or anything but pseries. My
> > >> guest that things did not break (if they did not which I am not sure about,
> > >> for the TCG case) because the interrupt controller (XICS) or the
> > >> pseries-guest took care of resending an interrupt which does not seem to be
> > >> the case for mac99.  
> > > 
> > > Right, that's probably true.  The main point, though, is that these
> > > fields were dropped a *long* time ago, when migration was barely
> > > working to begin with.  In particular I'm pretty sure most of the
> > > non-pseries platforms were already pretty broken for migration
> > > (amongst other things).
> > > 
> > > Polishing the mac platforms up to working again, including migration,
> > > is a reasonable goal.  But it can't be at the expense of pseries,
> > > which is already working, used in production, and much better tested
> > > than mac99 or g3beige ever were.  
> > 
> > Oh I completely agree since I'm well aware pseries likely has more users
> > than the Mac machines - my question was directed more about why we
> > support backwards migration.
> > 
> 
> Downstream support backward migration because end users/customers ask for it
> for maximum flexibility when it comes to move workloads around different systems
> with different QEMU versions. This is fairly usual in data centers with many
> systems.

One specific case where this matters is if you want to update the qemu
version on an entire cluster of hosts.  Management layers like
oVirt/RHV allow you to do that without disrupting guests by migrating
them off each host, one by one, allowing you to remove it from the
cluster, update and then reinsert after which guests may be migrated
onto it again.

But that process could obviously take a fair while for a big cluster;
days or even weeks.   In the meantime you have a cluster with mixed
qemu versions, and since oVirt/RHV/whatever will also migrate guests
around the cluster to load balance, that means the possibility of a
backwards migration.

> As others already said, breaking things upstream may turn downstream work
> into a nightmare (and FWIW, most of the people working on ppc are also
> involved in downstream work).

Right.  And I even used to work as a RHV support tech too :).
diff mbox series

Patch

diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index e59049f..8fec1a4 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -647,7 +647,7 @@  static const VMStateDescription vmstate_compat = {
 
 const VMStateDescription vmstate_ppc_cpu = {
     .name = "cpu",
-    .version_id = 5,
+    .version_id = 6,
     .minimum_version_id = 5,
     .minimum_version_id_old = 4,
     .load_state_old = cpu_load_old,
@@ -678,6 +678,10 @@  const VMStateDescription vmstate_ppc_cpu = {
         VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
         /* FIXME: access_type? */
 
+        /* Interrupt state */
+        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
+        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
+
         /* Sanity checking */
         VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
         VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),