diff mbox series

[3/4] ppc: add CPU access_type into the migration stream

Message ID 1505054255-2990-4-git-send-email-mark.cave-ayland@ilande.co.uk
State New
Headers show
Series ppc: migration fixes for TCG | expand

Commit Message

Mark Cave-Ayland Sept. 10, 2017, 2:37 p.m. UTC
This is referenced in cpu_ppc_handle_mmu_fault() and so should be included
in the migration stream.

Note: the vmstate_ppc version number has already been bumped by the previous
patch in this series.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/machine.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Gibson Sept. 11, 2017, 10:57 a.m. UTC | #1
On Sun, Sep 10, 2017 at 03:37:34PM +0100, Mark Cave-Ayland wrote:
> This is referenced in cpu_ppc_handle_mmu_fault() and so should be included
> in the migration stream.

That is not, on its own, sufficient reason.

> Note: the vmstate_ppc version number has already been bumped by the previous
> patch in this series.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

As with 2/4 it breaks backwards migration.

But more, I really disklike the idea of migrating this.  It's internal
state for one, and it's also essentially transitory state.  Can we
avoid putting it in the otherwise persistent structure at all? Can we
derive the state from elsewhere?  Can we prevent migration from
occurring in the small windows where this data is live?

> ---
>  target/ppc/machine.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> index 8fec1a4..10b3c41 100644
> --- a/target/ppc/machine.c
> +++ b/target/ppc/machine.c
> @@ -676,7 +676,7 @@ const VMStateDescription vmstate_ppc_cpu = {
>  
>          /* Internal state */
>          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
> -        /* FIXME: access_type? */
> +        VMSTATE_UINT8_V(env.access_type, PowerPCCPU, 6),
>  
>          /* Interrupt state */
>          VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
Mark Cave-Ayland Sept. 11, 2017, 4:52 p.m. UTC | #2
On 11/09/17 11:57, David Gibson wrote:

> On Sun, Sep 10, 2017 at 03:37:34PM +0100, Mark Cave-Ayland wrote:
>> This is referenced in cpu_ppc_handle_mmu_fault() and so should be included
>> in the migration stream.
> 
> That is not, on its own, sufficient reason.
> 
>> Note: the vmstate_ppc version number has already been bumped by the previous
>> patch in this series.
>>
>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> 
> As with 2/4 it breaks backwards migration.
> 
> But more, I really disklike the idea of migrating this.  It's internal
> state for one, and it's also essentially transitory state.  Can we
> avoid putting it in the otherwise persistent structure at all? Can we
> derive the state from elsewhere?  Can we prevent migration from
> occurring in the small windows where this data is live?

From what I can see references to access_type are scattered throughout
mmu_helper.c although I'm not necessarily familiar enough with PPC to
know whether this is something that can be derived elsewhere instead.
And once again it was something that was removed by a90db15.

When pausing a VM, does execution stop at the end of the current TB
rather than immediately? If so, perhaps someone could confirm that
guarantee is good enough for access_type?


ATB,

Mark.
David Gibson Sept. 13, 2017, 7:19 a.m. UTC | #3
On Mon, Sep 11, 2017 at 05:52:10PM +0100, Mark Cave-Ayland wrote:
> On 11/09/17 11:57, David Gibson wrote:
> 
> > On Sun, Sep 10, 2017 at 03:37:34PM +0100, Mark Cave-Ayland wrote:
> >> This is referenced in cpu_ppc_handle_mmu_fault() and so should be included
> >> in the migration stream.
> > 
> > That is not, on its own, sufficient reason.
> > 
> >> Note: the vmstate_ppc version number has already been bumped by the previous
> >> patch in this series.
> >>
> >> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> > 
> > As with 2/4 it breaks backwards migration.
> > 
> > But more, I really disklike the idea of migrating this.  It's internal
> > state for one, and it's also essentially transitory state.  Can we
> > avoid putting it in the otherwise persistent structure at all? Can we
> > derive the state from elsewhere?  Can we prevent migration from
> > occurring in the small windows where this data is live?
> 
> >From what I can see references to access_type are scattered throughout
> mmu_helper.c although I'm not necessarily familiar enough with PPC to
> know whether this is something that can be derived elsewhere instead.
> And once again it was something that was removed by a90db15.

Right, but the migration code prior to a90db15 was a complete mess.
It definitely included a number of things it didn't need to and
shouldn't as well as being missing other things that were needed.
It's not a good model.  And although it might have more-or-less worked
for certain machines like the ones you're reviving here, it was never
properly tested

> When pausing a VM, does execution stop at the end of the current TB
> rather than immediately? If so, perhaps someone could confirm that
> guarantee is good enough for access_type?

I'm pretty sure it has to; we'd have to come up out of an individual
TB in order to get to the main loop where we check the "please pause"
flag.  I'm not sure if that helps us here though - I *think* access
type is about carrying information from the point where we trigger an
exception to the point where we actually start processing the
exception.

This code is really ugly and I've never understood it well :(. It's
always seemed bogus to me that we have an essentially global variable
to carry information over that small gap, though.  Unfortunately it's
unlikely that I'd be able to dive into this and work out if it's
really needed any time soon.
Mark Cave-Ayland Sept. 13, 2017, 5:17 p.m. UTC | #4
On 13/09/17 08:19, David Gibson wrote:

>> When pausing a VM, does execution stop at the end of the current TB
>> rather than immediately? If so, perhaps someone could confirm that
>> guarantee is good enough for access_type?
> 
> I'm pretty sure it has to; we'd have to come up out of an individual
> TB in order to get to the main loop where we check the "please pause"
> flag.  I'm not sure if that helps us here though - I *think* access
> type is about carrying information from the point where we trigger an
> exception to the point where we actually start processing the
> exception.
> 
> This code is really ugly and I've never understood it well :(. It's
> always seemed bogus to me that we have an essentially global variable
> to carry information over that small gap, though.  Unfortunately it's
> unlikely that I'd be able to dive into this and work out if it's
> really needed any time soon.

From my testing yesterday it does appear to be required for TCG (or
unless this is exposing another bug in the Mac migration) so I can check
it works here and then maybe someone else confirm it works on KVM?

A couple of things I've noticed: the heathrow VMStateDescription looks
good, however I can see that the OpenPIC timers won't likely migrate
correctly without adding a few more fields - that's easy to fix.

Another thing is that if we're changing the migration stream for Mac
models Ben has some OpenPIC and other related changes in his wip queue
that it might make sense to put those in first before properly fixing
the Mac machine migration.


ATB,

Mark.
David Gibson Sept. 14, 2017, 3:54 a.m. UTC | #5
On Wed, Sep 13, 2017 at 06:17:21PM +0100, Mark Cave-Ayland wrote:
> On 13/09/17 08:19, David Gibson wrote:
> 
> >> When pausing a VM, does execution stop at the end of the current TB
> >> rather than immediately? If so, perhaps someone could confirm that
> >> guarantee is good enough for access_type?
> > 
> > I'm pretty sure it has to; we'd have to come up out of an individual
> > TB in order to get to the main loop where we check the "please pause"
> > flag.  I'm not sure if that helps us here though - I *think* access
> > type is about carrying information from the point where we trigger an
> > exception to the point where we actually start processing the
> > exception.
> > 
> > This code is really ugly and I've never understood it well :(. It's
> > always seemed bogus to me that we have an essentially global variable
> > to carry information over that small gap, though.  Unfortunately it's
> > unlikely that I'd be able to dive into this and work out if it's
> > really needed any time soon.
> 
> >From my testing yesterday it does appear to be required for TCG (or
> unless this is exposing another bug in the Mac migration) so I can check
> it works here and then maybe someone else confirm it works on KVM?
> 
> A couple of things I've noticed: the heathrow VMStateDescription looks
> good, however I can see that the OpenPIC timers won't likely migrate
> correctly without adding a few more fields - that's easy to fix.

Right.  And since OpenPIC isn't used on any platforms that have real
production use in the wild, it's fine to bump the migration stream
version for it.

> Another thing is that if we're changing the migration stream for Mac
> models Ben has some OpenPIC and other related changes in his wip queue
> that it might make sense to put those in first before properly fixing
> the Mac machine migration.

That would have something to be said for it.

It's probably not essential, though, since I don't consider the
non-pseries platforms at this stage sufficiently mature that we
guarantee a stable migration stream format.
diff mbox series

Patch

diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index 8fec1a4..10b3c41 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -676,7 +676,7 @@  const VMStateDescription vmstate_ppc_cpu = {
 
         /* Internal state */
         VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
-        /* FIXME: access_type? */
+        VMSTATE_UINT8_V(env.access_type, PowerPCCPU, 6),
 
         /* Interrupt state */
         VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),