[for-3.0,2/4] migration: disallow recovery for release-ram

Message ID 20180723123305.24792-3-peterx@redhat.com
State New
Series migration: some fixes for release-ram

Commit Message

Peter Xu July 23, 2018, 12:33 p.m. UTC
Postcopy recovery does not work with the release-ram capability, because
release-ram drops a page's buffer on the source as soon as the page is
put into the send buffer.  If a network failure happens, any page that
has already been sent from the source VM but has not yet reached the
destination VM is lost forever.  Refuse to let the client resume such a
postcopy migration.  Luckily, release-ram was designed to be used only
when the source and destination VMs are on the same host, so refusing
recovery here should be fine.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Juan Quintela July 24, 2018, 9:21 a.m. UTC | #1
Peter Xu <peterx@redhat.com> wrote:
> Postcopy recovery won't work well with release-ram capability since
> release-ram will drop the page buffer as long as the page is put into
> the send buffer.  So if there is a network failure happened, any page
> buffers that have not yet reached the destination VM but have already
> been sent from the source VM will be lost forever.  Let's refuse the
> client from resuming such a postcopy migration.  Luckily release-ram was
> designed to only be used when src and destination VMs are on the same
> host, so it should be fine.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

I wonder if we should have a FAQ somewhere and point a URL to it.

Peter Xu July 24, 2018, 11:39 a.m. UTC | #2
On Tue, Jul 24, 2018 at 11:21:47AM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > Postcopy recovery won't work well with release-ram capability since
> > release-ram will drop the page buffer as long as the page is put into
> > the send buffer.  So if there is a network failure happened, any page
> > buffers that have not yet reached the destination VM but have already
> > been sent from the source VM will be lost forever.  Let's refuse the
> > client from resuming such a postcopy migration.  Luckily release-ram was
> > designed to only be used when src and destination VMs are on the same
> > host, so it should be fine.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> Reviewed-by: Juan Quintela <quintela@redhat.com>

Thanks.

> 
> I wonder if we should have a FAQ somewhere and point an URL to there.

Yeah, we probably should.  I plan to write up a postcopy recovery
wiki page, maybe as an add-on to the old postcopy wiki page, but I
haven't started yet (especially after learning that the libvirt
developers already know well how to use it, hence I gave that a lower
priority...).

Regards,

Patch

diff --git a/migration/migration.c b/migration/migration.c
index 8d56d56930..09447f2bb5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1629,6 +1629,25 @@  static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
                        "paused migration");
             return false;
         }
+
+        /*
+         * Postcopy recovery cannot work with the release-ram
+         * capability, because release-ram drops the page buffer as
+         * soon as the page is put into the send buffer.  If a
+         * network failure happens, any page that has already been
+         * sent from the source VM but has not yet reached the
+         * destination VM is lost forever.  Refuse to let the client
+         * resume such a postcopy migration.  Luckily, release-ram
+         * was designed to be used only when the source and
+         * destination VMs are on the same host, so refusing
+         * recovery here should be fine.
+         */
+        if (migrate_release_ram()) {
+            error_setg(errp, "Postcopy recovery cannot work "
+                       "when release-ram capability is set");
+            return false;
+        }
+
         /* This is a resume, skip init status */
         return true;
     }
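
The failure mode described in the commit message can be illustrated with
a small standalone model.  This is only a sketch and is not QEMU code:
ToyMigration and every toy_* function below are hypothetical names
invented for illustration, and the model tracks pages as simple flags
rather than real buffers.  It assumes that release-ram drops the source
copy as soon as a page is queued for sending, and that a network failure
loses everything still in flight.  Only the error string is taken from
the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 4

/* Hypothetical toy model; none of these names exist in QEMU. */
typedef struct {
    bool release_ram;           /* models the release-ram capability */
    bool src_has_page[NPAGES];  /* source still holds a copy */
    bool dst_has_page[NPAGES];  /* destination has received the page */
    bool in_flight[NPAGES];     /* queued for sending, not yet delivered */
} ToyMigration;

static void toy_init(ToyMigration *m, bool release_ram)
{
    m->release_ram = release_ram;
    for (size_t i = 0; i < NPAGES; i++) {
        m->src_has_page[i] = true;
        m->dst_has_page[i] = false;
        m->in_flight[i] = false;
    }
}

/* Queue a page; with release-ram the source copy is dropped right away. */
static void toy_send_page(ToyMigration *m, size_t i)
{
    m->in_flight[i] = true;
    if (m->release_ram) {
        m->src_has_page[i] = false;
    }
}

/* Deliver every page that is currently in flight. */
static void toy_deliver(ToyMigration *m)
{
    for (size_t i = 0; i < NPAGES; i++) {
        if (m->in_flight[i]) {
            m->dst_has_page[i] = true;
            m->in_flight[i] = false;
        }
    }
}

/* Network failure: pages in flight never reach the destination. */
static void toy_network_failure(ToyMigration *m)
{
    for (size_t i = 0; i < NPAGES; i++) {
        m->in_flight[i] = false;
    }
}

/* Count pages that neither side holds any more. */
static size_t toy_lost_pages(const ToyMigration *m)
{
    size_t lost = 0;
    for (size_t i = 0; i < NPAGES; i++) {
        if (!m->src_has_page[i] && !m->dst_has_page[i]) {
            lost++;
        }
    }
    return lost;
}

/* Same shape as the check in the patch: refuse resume under release-ram. */
static bool toy_resume_allowed(const ToyMigration *m, const char **errmsg)
{
    if (m->release_ram) {
        *errmsg = "Postcopy recovery cannot work "
                  "when release-ram capability is set";
        return false;
    }
    *errmsg = NULL;
    return true;
}
```

In this model, sending a page with release_ram set and then calling
toy_network_failure() before toy_deliver() leaves that page held by
neither side, which is why resuming is refused up front instead of
being attempted.  With release_ram clear the source copy survives, so
the same sequence loses nothing and the resume is allowed.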