diff mbox series

migration: calculate expected_downtime with ram_bytes_remaining()

Message ID 20180331185536.4835-1-bala24@linux.vnet.ibm.com
State New
Headers show
Series migration: calculate expected_downtime with ram_bytes_remaining() | expand

Commit Message

Balamuruhan S March 31, 2018, 6:55 p.m. UTC
expected_downtime value is not accurate with dirty_pages_rate * page_size,
using ram_bytes_remaining would yeild it correct.

Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
---
 migration/migration.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Peter Xu April 3, 2018, 6:10 a.m. UTC | #1
On Sun, Apr 01, 2018 at 12:25:36AM +0530, Balamuruhan S wrote:
> expected_downtime value is not accurate with dirty_pages_rate * page_size,
> using ram_bytes_remaining would yeild it correct.
> 
> Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
> ---
>  migration/migration.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 58bd382730..4e43dc4f92 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2245,8 +2245,7 @@ static void migration_update_counters(MigrationState *s,
>       * recalculate. 10000 is a small enough number for our purposes
>       */
>      if (ram_counters.dirty_pages_rate && transferred > 10000) {
> -        s->expected_downtime = ram_counters.dirty_pages_rate *
> -            qemu_target_page_size() / bandwidth;
> +        s->expected_downtime = ram_bytes_remaining() / bandwidth;

This field was removed in e4ed1541ac ("savevm: New save live migration
method: pending", 2012-12-20), in which remaing RAM was used.

And it was added back in 90f8ae724a ("migration: calculate
expected_downtime", 2013-02-22), in which dirty rate was used.

However I didn't find a clue on why we changed from using remaining
RAM to using dirty rate...  So I'll leave this question to Juan.

Besides, I'm a bit confused on when we'll want such a value.  AFAIU
precopy is mostly used by setting up the target downtime before hand,
so we should already know the downtime before hand.  Then why we want
to observe such a thing?

Thanks,
Balamuruhan S April 3, 2018, 5:30 p.m. UTC | #2
On 2018-04-03 11:40, Peter Xu wrote:
> On Sun, Apr 01, 2018 at 12:25:36AM +0530, Balamuruhan S wrote:
>> expected_downtime value is not accurate with dirty_pages_rate * 
>> page_size,
>> using ram_bytes_remaining would yeild it correct.
>> 
>> Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
>> ---
>>  migration/migration.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 58bd382730..4e43dc4f92 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -2245,8 +2245,7 @@ static void 
>> migration_update_counters(MigrationState *s,
>>       * recalculate. 10000 is a small enough number for our purposes
>>       */
>>      if (ram_counters.dirty_pages_rate && transferred > 10000) {
>> -        s->expected_downtime = ram_counters.dirty_pages_rate *
>> -            qemu_target_page_size() / bandwidth;
>> +        s->expected_downtime = ram_bytes_remaining() / bandwidth;
> 
> This field was removed in e4ed1541ac ("savevm: New save live migration
> method: pending", 2012-12-20), in which remaing RAM was used.
> 
> And it was added back in 90f8ae724a ("migration: calculate
> expected_downtime", 2013-02-22), in which dirty rate was used.
> 
> However I didn't find a clue on why we changed from using remaining
> RAM to using dirty rate...  So I'll leave this question to Juan.
> 
> Besides, I'm a bit confused on when we'll want such a value.  AFAIU
> precopy is mostly used by setting up the target downtime before hand,
> so we should already know the downtime before hand.  Then why we want
> to observe such a thing?

Thanks Peter Xu for reviewing,

I tested precopy migration with 16M hugepage backed ppc guest and 
granularity
of page size in migration is 4K so any page dirtied would result in 4096 
pages
to be transmitted again, this led for migration to continue endless,

default migrate_parameters:
downtime-limit: 300 milliseconds

info migrate:
expected downtime: 1475 milliseconds

Migration status: active
total time: 130874 milliseconds
expected downtime: 1475 milliseconds
setup: 3475 milliseconds
transferred ram: 18197383 kbytes
throughput: 866.83 mbps
remaining ram: 376892 kbytes
total ram: 8388864 kbytes
duplicate: 1678265 pages
skipped: 0 pages
normal: 4536795 pages
normal bytes: 18147180 kbytes
dirty sync count: 6
page size: 4 kbytes
dirty pages rate: 39044 pages

In order to complete migration I configured downtime-limit to 1475
milliseconds but still migration was endless. Later calculated expected
downtime by remaining ram 376892 Kbytes / 866.83 mbps yeilded 3478.34
milliseconds and configuring it as downtime-limit succeeds the migration
to complete. This led to the conclusion that expected downtime is not
accurate.

Regards,
Balamuruhan S

> 
> Thanks,
Peter Xu April 4, 2018, 1:59 a.m. UTC | #3
On Tue, Apr 03, 2018 at 11:00:00PM +0530, bala24 wrote:
> On 2018-04-03 11:40, Peter Xu wrote:
> > On Sun, Apr 01, 2018 at 12:25:36AM +0530, Balamuruhan S wrote:
> > > expected_downtime value is not accurate with dirty_pages_rate *
> > > page_size,
> > > using ram_bytes_remaining would yeild it correct.
> > > 
> > > Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
> > > ---
> > >  migration/migration.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > > 
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 58bd382730..4e43dc4f92 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -2245,8 +2245,7 @@ static void
> > > migration_update_counters(MigrationState *s,
> > >       * recalculate. 10000 is a small enough number for our purposes
> > >       */
> > >      if (ram_counters.dirty_pages_rate && transferred > 10000) {
> > > -        s->expected_downtime = ram_counters.dirty_pages_rate *
> > > -            qemu_target_page_size() / bandwidth;
> > > +        s->expected_downtime = ram_bytes_remaining() / bandwidth;
> > 
> > This field was removed in e4ed1541ac ("savevm: New save live migration
> > method: pending", 2012-12-20), in which remaing RAM was used.
> > 
> > And it was added back in 90f8ae724a ("migration: calculate
> > expected_downtime", 2013-02-22), in which dirty rate was used.
> > 
> > However I didn't find a clue on why we changed from using remaining
> > RAM to using dirty rate...  So I'll leave this question to Juan.
> > 
> > Besides, I'm a bit confused on when we'll want such a value.  AFAIU
> > precopy is mostly used by setting up the target downtime before hand,
> > so we should already know the downtime before hand.  Then why we want
> > to observe such a thing?
> 
> Thanks Peter Xu for reviewing,
> 
> I tested precopy migration with 16M hugepage backed ppc guest and
> granularity
> of page size in migration is 4K so any page dirtied would result in 4096
> pages
> to be transmitted again, this led for migration to continue endless,
> 
> default migrate_parameters:
> downtime-limit: 300 milliseconds
> 
> info migrate:
> expected downtime: 1475 milliseconds
> 
> Migration status: active
> total time: 130874 milliseconds
> expected downtime: 1475 milliseconds
> setup: 3475 milliseconds
> transferred ram: 18197383 kbytes
> throughput: 866.83 mbps
> remaining ram: 376892 kbytes
> total ram: 8388864 kbytes
> duplicate: 1678265 pages
> skipped: 0 pages
> normal: 4536795 pages
> normal bytes: 18147180 kbytes
> dirty sync count: 6
> page size: 4 kbytes
> dirty pages rate: 39044 pages
> 
> In order to complete migration I configured downtime-limit to 1475
> milliseconds but still migration was endless. Later calculated expected
> downtime by remaining ram 376892 Kbytes / 866.83 mbps yeilded 3478.34
> milliseconds and configuring it as downtime-limit succeeds the migration
> to complete. This led to the conclusion that expected downtime is not
> accurate.

Hmm, thanks for the information.  I'd say your calculation seems
reasonable to me: it shows how long time will it need if we stop the
VM now on source immediately and migrate the rest. However Juan might
have an explanation on existing algorithm which I would like to know
too. So still I'll put aside the "which one is better" question.

For your use case, you can have a look on either of below way to
have a converged migration:

- auto-converge: that's a migration capability that throttles CPU
  usage of guests

- postcopy: that'll let you start the destination VM even without
  transferring all the RAMs before hand

Either of the technique can be configured via "migrate_set_capability"
HMP command or "migrate-set-capabilities" QMP command (some googling
would show detailed steps). And, either of above should help you to
migrate successfully in this hard-to-converge scenario, instead of
your current way (observing downtime, set downtime).

Meanwhile, I'm thinking whether instead of observing the downtime in
real time, whether we should introduce a command to stop the VM
immediately to migrate the rest when we want, or, a new parameter to
current "migrate" command.
Juan Quintela April 4, 2018, 9:02 a.m. UTC | #4
Peter Xu <peterx@redhat.com> wrote:
> On Sun, Apr 01, 2018 at 12:25:36AM +0530, Balamuruhan S wrote:
>> expected_downtime value is not accurate with dirty_pages_rate * page_size,
>> using ram_bytes_remaining would yeild it correct.
>> 
>> Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
>> ---
>>  migration/migration.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 58bd382730..4e43dc4f92 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -2245,8 +2245,7 @@ static void migration_update_counters(MigrationState *s,
>>       * recalculate. 10000 is a small enough number for our purposes
>>       */
>>      if (ram_counters.dirty_pages_rate && transferred > 10000) {
>> -        s->expected_downtime = ram_counters.dirty_pages_rate *
>> -            qemu_target_page_size() / bandwidth;
>> +        s->expected_downtime = ram_bytes_remaining() / bandwidth;
>
> This field was removed in e4ed1541ac ("savevm: New save live migration
> method: pending", 2012-12-20), in which remaing RAM was used.

Unrelated O:-)

> And it was added back in 90f8ae724a ("migration: calculate
> expected_downtime", 2013-02-22), in which dirty rate was used.

We didn't want to update the field if there haven't been enough activity.

> However I didn't find a clue on why we changed from using remaining
> RAM to using dirty rate...  So I'll leave this question to Juan.
>
> Besides, I'm a bit confused on when we'll want such a value.  AFAIU
> precopy is mostly used by setting up the target downtime before hand,
> so we should already know the downtime before hand.  Then why we want
> to observe such a thing?

What that field means is how much time the system needs to send
everything that is pending.

I.e. if expected_downtime = 2seconds, it means that with current dirty
rate, if we set a downtime of 2 or bigger it is going to finish
migration.

It is a help for upper layers to decide that:
- they want a 1second downtime
- system calculates with current load that they need a 2second downtime

So they can decide:
- change the downtime to 2seconds (easy)
- change the apps running on the guest to dirty less memory (It dependes
  on the guest, app, etc).

I don't know if anyone is using it at all.

Later, Juan.
Juan Quintela April 4, 2018, 9:04 a.m. UTC | #5
Balamuruhan S <bala24@linux.vnet.ibm.com> wrote:
> expected_downtime value is not accurate with dirty_pages_rate * page_size,
> using ram_bytes_remaining would yeild it correct.
>
> Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

See my other mail on the thread, my understanding is that your change is
corret (TM).

Thanks, Juan.

> ---
>  migration/migration.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 58bd382730..4e43dc4f92 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2245,8 +2245,7 @@ static void migration_update_counters(MigrationState *s,
>       * recalculate. 10000 is a small enough number for our purposes
>       */
>      if (ram_counters.dirty_pages_rate && transferred > 10000) {
> -        s->expected_downtime = ram_counters.dirty_pages_rate *
> -            qemu_target_page_size() / bandwidth;
> +        s->expected_downtime = ram_bytes_remaining() / bandwidth;
>      }
>  
>      qemu_file_reset_rate_limit(s->to_dst_file);
Balamuruhan S April 10, 2018, 9:52 a.m. UTC | #6
On Wed, Apr 04, 2018 at 11:04:59AM +0200, Juan Quintela wrote:
> Balamuruhan S <bala24@linux.vnet.ibm.com> wrote:
> > expected_downtime value is not accurate with dirty_pages_rate * page_size,
> > using ram_bytes_remaining would yeild it correct.
> >
> > Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
> 
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> 
> See my other mail on the thread, my understanding is that your change is
> corret (TM).

Juan, Please help to merge it.

Regards,
Bala

> 
> Thanks, Juan.
> 
> > ---
> >  migration/migration.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 58bd382730..4e43dc4f92 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -2245,8 +2245,7 @@ static void migration_update_counters(MigrationState *s,
> >       * recalculate. 10000 is a small enough number for our purposes
> >       */
> >      if (ram_counters.dirty_pages_rate && transferred > 10000) {
> > -        s->expected_downtime = ram_counters.dirty_pages_rate *
> > -            qemu_target_page_size() / bandwidth;
> > +        s->expected_downtime = ram_bytes_remaining() / bandwidth;
> >      }
> >  
> >      qemu_file_reset_rate_limit(s->to_dst_file);
>
Balamuruhan S April 10, 2018, 10:52 a.m. UTC | #7
On 2018-04-10 15:22, Balamuruhan S wrote:
> On Wed, Apr 04, 2018 at 11:04:59AM +0200, Juan Quintela wrote:
>> Balamuruhan S <bala24@linux.vnet.ibm.com> wrote:
>> > expected_downtime value is not accurate with dirty_pages_rate * page_size,
>> > using ram_bytes_remaining would yeild it correct.
>> >
>> > Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
>> 
>> Reviewed-by: Juan Quintela <quintela@redhat.com>
>> 
>> See my other mail on the thread, my understanding is that your change 
>> is
>> corret (TM).
> 
> Juan, Please help to merge it.

Sorry for asking it as during discussion going on, but the reason is 
currently
postcopy migration for HP backed P8 guest from P8 -> P9 is broken and to 
use
precopy with appropriate downtime value we need this patch to be 
backported
to distros that is to be released soon.

> 
> Regards,
> Bala
> 
>> 
>> Thanks, Juan.
>> 
>> > ---
>> >  migration/migration.c | 3 +--
>> >  1 file changed, 1 insertion(+), 2 deletions(-)
>> >
>> > diff --git a/migration/migration.c b/migration/migration.c
>> > index 58bd382730..4e43dc4f92 100644
>> > --- a/migration/migration.c
>> > +++ b/migration/migration.c
>> > @@ -2245,8 +2245,7 @@ static void migration_update_counters(MigrationState *s,
>> >       * recalculate. 10000 is a small enough number for our purposes
>> >       */
>> >      if (ram_counters.dirty_pages_rate && transferred > 10000) {
>> > -        s->expected_downtime = ram_counters.dirty_pages_rate *
>> > -            qemu_target_page_size() / bandwidth;
>> > +        s->expected_downtime = ram_bytes_remaining() / bandwidth;
>> >      }
>> >
>> >      qemu_file_reset_rate_limit(s->to_dst_file);
>>
diff mbox series

Patch

diff --git a/migration/migration.c b/migration/migration.c
index 58bd382730..4e43dc4f92 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2245,8 +2245,7 @@  static void migration_update_counters(MigrationState *s,
      * recalculate. 10000 is a small enough number for our purposes
      */
     if (ram_counters.dirty_pages_rate && transferred > 10000) {
-        s->expected_downtime = ram_counters.dirty_pages_rate *
-            qemu_target_page_size() / bandwidth;
+        s->expected_downtime = ram_bytes_remaining() / bandwidth;
     }
 
     qemu_file_reset_rate_limit(s->to_dst_file);