diff mbox series

[1/3] migration: In case of postcopy, the memory ends in res_postcopy_only

Message ID 20230208135719.17864-2-quintela@redhat.com
State New
Headers show
Series Remove res_compatible parameter | expand

Commit Message

Juan Quintela Feb. 8, 2023, 1:57 p.m. UTC
So remove last assignation of res_compatible.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Vladimir Sementsov-Ogievskiy Feb. 9, 2023, 5:36 p.m. UTC | #1
On 08.02.23 16:57, Juan Quintela wrote:
> So remove last assignation of res_compatible.

I hoped for some description when asked to split it out :)

> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>   migration/ram.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index b966e148c2..85ccbf88ad 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3474,7 +3474,7 @@ static void ram_state_pending_exact(void *opaque,
>   
>       if (migrate_postcopy_ram()) {
>           /* We can do postcopy, and all the data is postcopiable */
> -        *res_compatible += remaining_size;
> +        *res_postcopy_only += remaining_size;

Actually, these "remaining_size" bytes are still compatible, i.e. we can migrate these pending bytes in pre-copy, and we actually do it, until user call migrate-start-postcopy, yes? But we exploit the fact that, this change don't affect any logic, just name becomes wrong.. Yes? Or I don't follow:/

>       } else {
>           *res_precopy_only += remaining_size;
>       }
Juan Quintela Feb. 9, 2023, 6:10 p.m. UTC | #2
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> On 08.02.23 16:57, Juan Quintela wrote:
>> So remove last assignation of res_compatible.
>
> I hoped for some description when asked to split it out :)
>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>   migration/ram.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>> diff --git a/migration/ram.c b/migration/ram.c
>> index b966e148c2..85ccbf88ad 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -3474,7 +3474,7 @@ static void ram_state_pending_exact(void *opaque,
>>         if (migrate_postcopy_ram()) {
>>           /* We can do postcopy, and all the data is postcopiable */
>> -        *res_compatible += remaining_size;
>> +        *res_postcopy_only += remaining_size;
>
> Actually, these "remaining_size" bytes are still compatible, i.e. we
> can migrate these pending bytes in pre-copy, and we actually do it,
> until user call migrate-start-postcopy, yes? But we exploit the fact
> that, this change don't affect any logic, just name becomes
> wrong.. Yes? Or I don't follow:/

My definition of the fields is: how are we going to transfer that bytes.

if they are on res_precopy_only, we transfer them with precopy, if they
are on res_postocpy_only, we transfer them with postcopy.

So, the rest of RAM, if we are in postcopy, we sent it with postcopy,
and if we are in precopy, we sent them with precopy.  See the whole
code.  This is the _estimate function.

    uint64_t remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;

    if (migrate_postcopy_ram()) {
        /* We can do postcopy, and all the data is postcopiable */
        *res_postcopy_only += remaining_size;
    } else {
        *res_precopy_only += remaining_size;
    }

After the change, _exact does exactly the same.

The caller (migration_iteration_run()) does this (I remove traces and
things that don't matter for this). This is before the change.
Remember: in precopy, we add res_compat to pend_pre, and in postcopy to
pend_post.

    uint64_t pending_size = pend_pre + pend_compat + pend_post;

### pending_size is the sum of the three, so it doesn't matter.

    if (pend_pre + pend_compat <= s->threshold_size) {

###  In precopy, we add pend_compat to pend_pre, so we are ok.
###  In postcopy, we add the data to pend_postcopy, but that is right,
###  because to calculate the downtime, we only care about what we have
###  to transfer with precopy, in particular, we aren't going to send
###  more ram, so it is ok that it is in pend_post.

        qemu_savevm_state_pending_exact(&pend_pre, &pend_compat, &pend_post);
        pending_size = pend_pre + pend_compat + pend_post;
    }

    if (!pending_size || pending_size < s->threshold_size) {
        migration_completion(s);
        return MIG_ITERATE_BREAK;
    }

    /* Still a significant amount to transfer */
    if (!in_postcopy && pend_pre <= s->threshold_size &&
        qatomic_read(&s->start_postcopy)) {

#### this is what I mean.  See how we only use pend_pre to decide if we
###  are entering postcopy.

        if (postcopy_start(s)) {
            error_report("%s: postcopy failed to start", __func__);
        }
        return MIG_ITERATE_SKIP;
    }

So the only "behaviour" that we can say are having is that with the
change we are a little bit more aggressive on calling
qemu_savevm_state_pending_exact(), but I will arguee that the new
behaviour is the right one.

What do you think?

Later, Juan.
Vladimir Sementsov-Ogievskiy Feb. 14, 2023, 3:04 p.m. UTC | #3
On 09.02.23 21:10, Juan Quintela wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
>> On 08.02.23 16:57, Juan Quintela wrote:
>>> So remove last assignation of res_compatible.
>>
>> I hoped for some description when asked to split it out :)
>>
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>> ---
>>>    migration/ram.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>> diff --git a/migration/ram.c b/migration/ram.c
>>> index b966e148c2..85ccbf88ad 100644
>>> --- a/migration/ram.c
>>> +++ b/migration/ram.c
>>> @@ -3474,7 +3474,7 @@ static void ram_state_pending_exact(void *opaque,
>>>          if (migrate_postcopy_ram()) {
>>>            /* We can do postcopy, and all the data is postcopiable */
>>> -        *res_compatible += remaining_size;
>>> +        *res_postcopy_only += remaining_size;
>>
>> Actually, these "remaining_size" bytes are still compatible, i.e. we
>> can migrate these pending bytes in pre-copy, and we actually do it,
>> until user call migrate-start-postcopy, yes? But we exploit the fact
>> that, this change don't affect any logic, just name becomes
>> wrong.. Yes? Or I don't follow:/
> 
> My definition of the fields is: how are we going to transfer that bytes.
> 
> if they are on res_precopy_only, we transfer them with precopy, if they
> are on res_postocpy_only, we transfer them with postcopy.
> 
> So, the rest of RAM, if we are in postcopy, we sent it with postcopy,
> and if we are in precopy, we sent them with precopy.  See the whole
> code.  This is the _estimate function.
> 
>      uint64_t remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;
> 
>      if (migrate_postcopy_ram()) {
>          /* We can do postcopy, and all the data is postcopiable */
>          *res_postcopy_only += remaining_size;
>      } else {
>          *res_precopy_only += remaining_size;
>      }
> 
> After the change, _exact does exactly the same.
> 
> The caller (migration_iteration_run()) does this (I remove traces and
> things that don't matter for this). This is before the change.
> Remember: in precopy, we add res_compat to pend_pre, and in postcopy to
> pend_post.
> 
>      uint64_t pending_size = pend_pre + pend_compat + pend_post;
> 
> ### pending_size is the sum of the three, so it doesn't matter.
> 
>      if (pend_pre + pend_compat <= s->threshold_size) {
> 
> ###  In precopy, we add pend_compat to pend_pre, so we are ok.
> ###  In postcopy, we add the data to pend_postcopy, but that is right,
> ###  because to calculate the downtime, we only care about what we have
> ###  to transfer with precopy, in particular, we aren't going to send
> ###  more ram, so it is ok that it is in pend_post.
> 
>          qemu_savevm_state_pending_exact(&pend_pre, &pend_compat, &pend_post);
>          pending_size = pend_pre + pend_compat + pend_post;
>      }
> 
>      if (!pending_size || pending_size < s->threshold_size) {
>          migration_completion(s);
>          return MIG_ITERATE_BREAK;
>      }
> 
>      /* Still a significant amount to transfer */
>      if (!in_postcopy && pend_pre <= s->threshold_size &&
>          qatomic_read(&s->start_postcopy)) {
> 
> #### this is what I mean.  See how we only use pend_pre to decide if we
> ###  are entering postcopy.
> 
>          if (postcopy_start(s)) {
>              error_report("%s: postcopy failed to start", __func__);
>          }
>          return MIG_ITERATE_SKIP;
>      }
> 
> So the only "behaviour" that we can say are having is that with the

actualy, this one was already "changed", as _estimate never return compat other than zero.

So, The patch really changes nothing

> change we are a little bit more aggressive on calling
> qemu_savevm_state_pending_exact(), but I will arguee that the new
> behaviour is the right one.
> 
> What do you think?
> 


I think, that the order of logic and documentation changing since introducing _estimate is a bit confused.

But I agree now, that we are safe to unite old compat and old postcopy_only into one variable, as we want only

1. the total sum, to probably go to migration_completion()
2. pend_pre to probably go to postcopy_start()

So, patch is OK, and seems it changes absolutely nothing in logic. Thanks for explanations!


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Juan Quintela Feb. 14, 2023, 3:47 p.m. UTC | #4
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> On 09.02.23 21:10, Juan Quintela wrote:
>> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
>>> On 08.02.23 16:57, Juan Quintela wrote:
>>>> So remove last assignation of res_compatible.
>> 
>
>
> I think, that the order of logic and documentation changing since introducing _estimate is a bit confused.
>
> But I agree now, that we are safe to unite old compat and old postcopy_only into one variable, as we want only
>
> 1. the total sum, to probably go to migration_completion()
> 2. pend_pre to probably go to postcopy_start()
>
> So, patch is OK, and seems it changes absolutely nothing in logic. Thanks for explanations!
>
>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Thanks.

You are welcome.
Juan Quintela Feb. 15, 2023, 9:08 a.m. UTC | #5
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> On 08.02.23 16:57, Juan Quintela wrote:
>> So remove last assignation of res_compatible.
>
> I hoped for some description when asked to split it out :)
>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>   migration/ram.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>> diff --git a/migration/ram.c b/migration/ram.c
>> index b966e148c2..85ccbf88ad 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -3474,7 +3474,7 @@ static void ram_state_pending_exact(void *opaque,
>>         if (migrate_postcopy_ram()) {
>>           /* We can do postcopy, and all the data is postcopiable */
>> -        *res_compatible += remaining_size;
>> +        *res_postcopy_only += remaining_size;
>
> Actually, these "remaining_size" bytes are still compatible, i.e. we
> can migrate these pending bytes in pre-copy, and we actually do it,
> until user call migrate-start-postcopy, yes? But we exploit the fact
> that, this change don't affect any logic, just name becomes
> wrong.. Yes? Or I don't follow:/

I think of this from this different angle:
- if we are on precopy, we return on res_precopy everything (and nothing
  on res_postcopy)
- if we are on postcopy, we return on res_precopy what we _must_ sent
  through precopy, and in res_postcopy what we can sent through
  postcopy.

i.e. if we stop the guest and do the migration right now, what are we
going to send through each channel.

Later, Juan.
Vladimir Sementsov-Ogievskiy Feb. 15, 2023, 11:36 a.m. UTC | #6
On 15.02.23 12:08, Juan Quintela wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
>> On 08.02.23 16:57, Juan Quintela wrote:
>>> So remove last assignation of res_compatible.
>>
>> I hoped for some description when asked to split it out :)
>>
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>> ---
>>>    migration/ram.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>> diff --git a/migration/ram.c b/migration/ram.c
>>> index b966e148c2..85ccbf88ad 100644
>>> --- a/migration/ram.c
>>> +++ b/migration/ram.c
>>> @@ -3474,7 +3474,7 @@ static void ram_state_pending_exact(void *opaque,
>>>          if (migrate_postcopy_ram()) {
>>>            /* We can do postcopy, and all the data is postcopiable */
>>> -        *res_compatible += remaining_size;
>>> +        *res_postcopy_only += remaining_size;
>>
>> Actually, these "remaining_size" bytes are still compatible, i.e. we
>> can migrate these pending bytes in pre-copy, and we actually do it,
>> until user call migrate-start-postcopy, yes? But we exploit the fact
>> that, this change don't affect any logic, just name becomes
>> wrong.. Yes? Or I don't follow:/
> 
> I think of this from this different angle:
> - if we are on precopy, we return on res_precopy everything (and nothing
>    on res_postcopy)
> - if we are on postcopy, we return on res_precopy what we _must_ sent
>    through precopy, and in res_postcopy what we can sent through
>    postcopy.
> 
> i.e. if we stop the guest and do the migration right now, what are we
> going to send through each channel.
> 

I understand.

I've introduced the division into three parts together with block-dirty-bitmap implementation, and it seemed significant to me that block-dirty-bitmap pending is postcopy_only in the sense that it can't be migrated before source stop, unlike RAM. But yes, it turns out that that's not significant for the generic migration algorithm, it works the same way for RAM and block-dirty-bitmap not distinguishing postcopy_only vs comaptible.

Anyway final documentation and new field names that you proposed are clean and correspond to the meaning which you have expected. And it avoids extra variable that I've introduced.

Haha. Looking at my old commit 4799502640e6a29d3 "migration: introduce postcopy-only pending" I see

-                              uint64_t *non_postcopiable_pending,
-                              uint64_t *postcopiable_pending);
+                              uint64_t *res_precopy_only,
+                              uint64_t *res_compatible,
+                              uint64_t *res_postcopy_only);

so, we just rollback my change, which actually was never necessary. And it was called like I've proposed in 03 discussion thread :) Still, must_precopy and can_postcopy are nicer.
diff mbox series

Patch

diff --git a/migration/ram.c b/migration/ram.c
index b966e148c2..85ccbf88ad 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3474,7 +3474,7 @@  static void ram_state_pending_exact(void *opaque,
 
     if (migrate_postcopy_ram()) {
         /* We can do postcopy, and all the data is postcopiable */
-        *res_compatible += remaining_size;
+        *res_postcopy_only += remaining_size;
     } else {
         *res_precopy_only += remaining_size;
     }