diff mbox series

[1/2] tests/migration: Make precopy fast

Message ID 20230412142001.16501-2-quintela@redhat.com
State New
Headers show
Series tests/migration: Fix migration-test slowdown | expand

Commit Message

Juan Quintela April 12, 2023, 2:20 p.m. UTC
Otherwise we do the 1st migration iteration at a too slow speed.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 tests/qtest/migration-test.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Daniel P. Berrangé April 18, 2023, 11:53 a.m. UTC | #1
On Wed, Apr 12, 2023 at 04:20:00PM +0200, Juan Quintela wrote:
> Otherwise we do the 1st migration iteration at a too slow speed.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  tests/qtest/migration-test.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 3b615b0da9..7b05b0b7dd 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -1348,6 +1348,7 @@ static void test_precopy_common(MigrateCommon *args)
>          migrate_qmp(from, args->connect_uri, "{}");
>      }
>  
> +    migrate_ensure_converge(from);

This isn't right - it defeats the point of having the call to
migrate_ensure_non_converge() a few lines earlier.

>      if (args->result != MIG_TEST_SUCCEED) {
>          bool allow_active = args->result == MIG_TEST_FAIL;
> @@ -1365,8 +1366,6 @@ static void test_precopy_common(MigrateCommon *args)
>              wait_for_migration_pass(from);
>          }
>  
> -        migrate_ensure_converge(from);
> -

The reason why we had it here was to ensure that we test more than
1 iteration of migration. With this change, migrate will succeed
on the first pass IIUC, and so we won't be exercising the more
complex code path of repeated iterations.


I do agree with the overall idea though. We have many many migration
test scenarios and we don't need all of them to be testing multiple
iterations - a couple would be sufficient.

In fact we don't even need to be testing live migration for most
of the cases. All the TLS test cases could be run with guest CPUs
paused entirely removing any dirtying, since they're only interested
in the initial network handshake/setup process testnig.

I had some patches I was finishing off just before I went on vacation
a few weeks ago which do this kind of optimization, which I can send
out shortly.

>          /* We do this first, as it has a timeout to stop us
>           * hanging forever if migration didn't converge */
>          wait_for_migration_complete(from);


With regards,
Daniel
Juan Quintela April 18, 2023, 12:20 p.m. UTC | #2
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Wed, Apr 12, 2023 at 04:20:00PM +0200, Juan Quintela wrote:
>> Otherwise we do the 1st migration iteration at a too slow speed.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  tests/qtest/migration-test.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index 3b615b0da9..7b05b0b7dd 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -1348,6 +1348,7 @@ static void test_precopy_common(MigrateCommon *args)
>>          migrate_qmp(from, args->connect_uri, "{}");
>>      }
>>  
>> +    migrate_ensure_converge(from);
>
> This isn't right - it defeats the point of having the call to
> migrate_ensure_non_converge() a few lines earlier.

Depends on what is the definiton or "right" O:-)

>>      if (args->result != MIG_TEST_SUCCEED) {
>>          bool allow_active = args->result == MIG_TEST_FAIL;
>> @@ -1365,8 +1366,6 @@ static void test_precopy_common(MigrateCommon *args)
>>              wait_for_migration_pass(from);
>>          }
>>  
>> -        migrate_ensure_converge(from);
>> -
>
> The reason why we had it here was to ensure that we test more than
> 1 iteration of migration. With this change, migrate will succeed
> on the first pass IIUC, and so we won't be exercising the more
> complex code path of repeated iterations.

Aha.

If that is the definition of "right", then I agree that my changes are
wrong.

But then I think we should change how we do the test.  We should split
this function (then same for postcopy, multifd, etc) to have to
versions, one that want to have multiple rounds, and another that can
finish as fast as possible.

This way we need to setup the 3MB/s only for the tests that we want to
loop, and for the others put something faster.


>
> I do agree with the overall idea though. We have many many migration
> test scenarios and we don't need all of them to be testing multiple
> iterations - a couple would be sufficient.
>
> In fact we don't even need to be testing live migration for most
> of the cases. All the TLS test cases could be run with guest CPUs
> paused entirely removing any dirtying, since they're only interested
> in the initial network handshake/setup process testnig.
>
> I had some patches I was finishing off just before I went on vacation
> a few weeks ago which do this kind of optimization, which I can send
> out shortly.

I will wait for your patches before I sent anything different.

I have local patches for doing something different, changing

      "-serial file:%s/src_serial "

and other friends to:

      "-serial file:%s/src_serial%pid "

So we are sure that two tests never "reuse" the socket, as it can create
problems for example when doing the cancel and relaunching the
destination.

But as said, will wait until you send your series to send anything.

Later, Juan.
Daniel P. Berrangé April 21, 2023, 5:22 p.m. UTC | #3
On Tue, Apr 18, 2023 at 02:20:27PM +0200, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Wed, Apr 12, 2023 at 04:20:00PM +0200, Juan Quintela wrote:
> >> Otherwise we do the 1st migration iteration at a too slow speed.
> >> 
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> >> ---
> >>  tests/qtest/migration-test.c | 3 +--
> >>  1 file changed, 1 insertion(+), 2 deletions(-)
> >> 
> >> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> >> index 3b615b0da9..7b05b0b7dd 100644
> >> --- a/tests/qtest/migration-test.c
> >> +++ b/tests/qtest/migration-test.c
> >> @@ -1348,6 +1348,7 @@ static void test_precopy_common(MigrateCommon *args)
> >>          migrate_qmp(from, args->connect_uri, "{}");
> >>      }
> >>  
> >> +    migrate_ensure_converge(from);
> >
> > This isn't right - it defeats the point of having the call to
> > migrate_ensure_non_converge() a few lines earlier.
> 
> Depends on what is the definiton or "right" O:-)
> 
> >>      if (args->result != MIG_TEST_SUCCEED) {
> >>          bool allow_active = args->result == MIG_TEST_FAIL;
> >> @@ -1365,8 +1366,6 @@ static void test_precopy_common(MigrateCommon *args)
> >>              wait_for_migration_pass(from);
> >>          }
> >>  
> >> -        migrate_ensure_converge(from);
> >> -
> >
> > The reason why we had it here was to ensure that we test more than
> > 1 iteration of migration. With this change, migrate will succeed
> > on the first pass IIUC, and so we won't be exercising the more
> > complex code path of repeated iterations.
> 
> Aha.
> 
> If that is the definition of "right", then I agree that my changes are
> wrong.
> 
> But then I think we should change how we do the test.  We should split
> this function (then same for postcopy, multifd, etc) to have to
> versions, one that want to have multiple rounds, and another that can
> finish as fast as possible.
> 
> This way we need to setup the 3MB/s only for the tests that we want to
> loop, and for the others put something faster.
> 
> 
> >
> > I do agree with the overall idea though. We have many many migration
> > test scenarios and we don't need all of them to be testing multiple
> > iterations - a couple would be sufficient.
> >
> > In fact we don't even need to be testing live migration for most
> > of the cases. All the TLS test cases could be run with guest CPUs
> > paused entirely removing any dirtying, since they're only interested
> > in the initial network handshake/setup process testnig.
> >
> > I had some patches I was finishing off just before I went on vacation
> > a few weeks ago which do this kind of optimization, which I can send
> > out shortly.
> 
> I will wait for your patches before I sent anything different.
> 
> I have local patches for doing something different, changing
> 
>       "-serial file:%s/src_serial "
> 
> and other friends to:
> 
>       "-serial file:%s/src_serial%pid "
> 
> So we are sure that two tests never "reuse" the socket, as it can create
> problems for example when doing the cancel and relaunching the
> destination.
> 
> But as said, will wait until you send your series to send anything.

I've just sent a new series which has some more differences and
improvements

https://lists.gnu.org/archive/html/qemu-devel/2023-04/msg03688.html


With regards,
Daniel
diff mbox series

Patch

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 3b615b0da9..7b05b0b7dd 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1348,6 +1348,7 @@  static void test_precopy_common(MigrateCommon *args)
         migrate_qmp(from, args->connect_uri, "{}");
     }
 
+    migrate_ensure_converge(from);
 
     if (args->result != MIG_TEST_SUCCEED) {
         bool allow_active = args->result == MIG_TEST_FAIL;
@@ -1365,8 +1366,6 @@  static void test_precopy_common(MigrateCommon *args)
             wait_for_migration_pass(from);
         }
 
-        migrate_ensure_converge(from);
-
         /* We do this first, as it has a timeout to stop us
          * hanging forever if migration didn't converge */
         wait_for_migration_complete(from);