
[v2,2/6] tests/qtests: remove migration test iterations config

Message ID 20230421171411.566300-3-berrange@redhat.com
State New
Series tests/qtest: make migration-test massively faster

Commit Message

Daniel P. Berrangé April 21, 2023, 5:14 p.m. UTC
The 'unsigned int iterations' config for migration is somewhat
overkill. Most tests don't set it, and a value of '0' is treated
as equivalent to '1'. The only test that does set it, xbzrle,
used a value of '2'.

This setting, however, only relates to the migration iterations
that take place prior to allowing convergence. IOW, on top of
this iteration count, there is always at least 1 further migration
iteration done to deal with pages that are dirtied during the
previous iteration(s).

IOW, even with iterations==1, the xbzrle test will be running for
a minimum of 2 iterations. With this in mind we can simplify the
code and just get rid of the special case.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-test.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)
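
To make the pass accounting in the commit message concrete, here is a
minimal sketch. It is a toy model, not QEMU code: passes_waited() and
min_total_passes() are hypothetical helpers that only restate the
arithmetic described above (0 is treated as 1, and at least one further
pass always follows once convergence is allowed).

```c
#include <assert.h>

/*
 * Toy model of the pass accounting described in the commit message.
 * Not QEMU code: both helpers are hypothetical.
 */

/* Passes the old test code waited for before allowing convergence. */
static unsigned int passes_waited(unsigned int iterations)
{
    /* A value of 0 was treated as equivalent to 1. */
    return iterations ? iterations : 1;
}

/*
 * Lower bound on the total number of migration passes: the passes
 * waited for, plus at least one more pass to send pages dirtied
 * during the previous pass(es).
 */
static unsigned int min_total_passes(unsigned int iterations)
{
    unsigned int waited = passes_waited(iterations);

    assert(waited >= 1);
    return waited + 1;
}
```

Under this model the old xbzrle setting (iterations = 2) gives a lower
bound of 3 passes, while the simplified code (a single wait) gives a
lower bound of 2.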

Comments

Juan Quintela April 21, 2023, 9:54 p.m. UTC | #1
Daniel P. Berrangé <berrange@redhat.com> wrote:
> The 'unsigned int iterations' config for migration is somewhat
> overkill. Most tests don't set it, and a value of '0' is treated
> as equivalent to '1'. The only test that does set it, xbzrle,
> used a value of '2'.
>
> This setting, however, only relates to the migration iterations
> that take place prior to allowing convergence. IOW, on top of
> this iteration count, there is always at least 1 further migration
> iteration done to deal with pages that are dirtied during the
> previous iteration(s).
>
> IOW, even with iterations==1, the xbzrle test will be running for
> a minimum of 2 iterations. With this in mind we can simplify the
> code and just get rid of the special case.

Perhaps the old code was already wrong, but we need at least three
iterations for the xbzrle test:
- 1st iteration: xbzrle is not used, nothing is in the cache.
- 2nd iteration: pages are put into the cache, but xbzrle is still not
  used because there is no previous copy of the page to diff against.
- 3rd iteration: xbzrle is really used now, against the copy saved in
  the previous iteration.

And yes, this should be commented somewhere.

Later, Juan.
Daniel P. Berrangé April 26, 2023, 9:07 a.m. UTC | #2
On Fri, Apr 21, 2023 at 11:54:55PM +0200, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > The 'unsigned int iterations' config for migration is somewhat
> > overkill. Most tests don't set it, and a value of '0' is treated
> > as equivalent to '1'. The only test that does set it, xbzrle,
> > used a value of '2'.
> >
> > This setting, however, only relates to the migration iterations
> > that take place prior to allowing convergence. IOW, on top of
> > this iteration count, there is always at least 1 further migration
> > iteration done to deal with pages that are dirtied during the
> > previous iteration(s).
> >
> > IOW, even with iterations==1, the xbzrle test will be running for
> > a minimum of 2 iterations. With this in mind we can simplify the
> > code and just get rid of the special case.
> 
> Perhaps the old code was already wrong, but we need at least three
> iterations for the xbzrle test:
> - 1st iteration: xbzrle is not used, nothing is on cache.

Are you sure about this? I see ram_save_page() calling
save_xbzrle_page() and, unless I'm misunderstanding the
code, it doesn't appear to skip anything on the 1st
iteration.

IIUC save_xbzrle_page() will add pages into the cache on
the first iteration, so the second iteration will get
cache hits.

> - 2nd iteration: pages are put into cache, no xbzrle is used because
>   there is no previous page.
> - 3rd iteration: We really use xbzrle now against the copy of the
>   previous iterations.
> 
> And yes, this should be commented somewhere.

With regards,
Daniel
Juan Quintela April 26, 2023, 9:42 a.m. UTC | #3
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Fri, Apr 21, 2023 at 11:54:55PM +0200, Juan Quintela wrote:
>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>> > The 'unsigned int iterations' config for migration is somewhat
>> > overkill. Most tests don't set it, and a value of '0' is treated
>> > as equivalent to '1'. The only test that does set it, xbzrle,
>> > used a value of '2'.
>> >
>> > This setting, however, only relates to the migration iterations
>> > that take place prior to allowing convergence. IOW, on top of
>> > this iteration count, there is always at least 1 further migration
>> > iteration done to deal with pages that are dirtied during the
>> > previous iteration(s).
>> >
>> > IOW, even with iterations==1, the xbzrle test will be running for
>> > a minimum of 2 iterations. With this in mind we can simplify the
>> > code and just get rid of the special case.
>> 
>> Perhaps the old code was already wrong, but we need at least three
>> iterations for the xbzrle test:
>> - 1st iteration: xbzrle is not used, nothing is on cache.
>
> Are you sure about this ?  I see ram_save_page() calling
> save_xbzrle_page() and unless I'm mis-understanding the
> code, it doesn't appear to skip anything on the 1st
> iteration.

I will admit that code is convoluted as hell.
And I confuse myself a lot here O:-)

struct RAMState {
    ...
    /* Start using XBZRLE (e.g., after the first round). */
    bool xbzrle_enabled;
};

I.e. migrate_use_xbzrle() and rs->xbzrle_enabled are two completely different things.

static int ram_save_page(RAMState *rs, PageSearchStatus *pss)
{
    ...
    if (rs->xbzrle_enabled && !migration_in_postcopy()) {
        pages = save_xbzrle_page(rs, pss, &p, current_addr,
                                 block, offset);
        ....
    }
    ....
}

and

static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
{
    /* Update pss->page for the next dirty bit in ramblock */
    pss_find_next_dirty(pss);

    if (pss->complete_round && pss->block == rs->last_seen_block &&
        ...
        return PAGE_ALL_CLEAN;
    }
    if (!offset_in_ramblock(pss->block,
                            ((ram_addr_t)pss->page) << TARGET_PAGE_BITS)) {
        ....
        if (!pss->block) {
            ....
            if (migrate_use_xbzrle()) {
                rs->xbzrle_enabled = true;
            }
        }
        ...
    } else {
        /* We've found something */
        return PAGE_DIRTY_FOUND;
    }
}
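
Putting the two snippets together, here is a toy model of how a page that
is dirtied on every pass would be sent. This is only a sketch, not QEMU
code: struct toy_state, enum send_kind and the toy_*() helpers are made up
for illustration, and the cache is reduced to a single "is there a previous
copy" flag.

```c
#include <assert.h>
#include <stdbool.h>

/* How a page goes out on the wire in this toy model. */
enum send_kind {
    SEND_PLAIN,        /* xbzrle path not taken at all */
    SEND_CACHE_FILL,   /* xbzrle path taken, cache miss, full page sent */
    SEND_XBZRLE_DELTA  /* xbzrle path taken, cache hit, delta sent */
};

struct toy_state {
    bool xbzrle_capability; /* stands in for migrate_use_xbzrle() */
    bool xbzrle_enabled;    /* stands in for rs->xbzrle_enabled */
    bool page_cached;       /* is a previous copy of the page cached? */
};

/* Send the page once, mirroring the check in ram_save_page(). */
static enum send_kind toy_send_page(struct toy_state *s)
{
    if (!s->xbzrle_enabled) {
        return SEND_PLAIN;
    }
    if (!s->page_cached) {
        /* Nothing to diff against yet: remember the page for next time. */
        s->page_cached = true;
        return SEND_CACHE_FILL;
    }
    return SEND_XBZRLE_DELTA;
}

/* End of a complete round, mirroring the branch in find_dirty_block(). */
static void toy_end_round(struct toy_state *s)
{
    if (s->xbzrle_capability) {
        s->xbzrle_enabled = true;
    }
}

/* Send kind observed in the given 1-based pass. */
static enum send_kind toy_kind_in_pass(bool capability, int pass)
{
    struct toy_state s = { capability, false, false };
    enum send_kind kind = SEND_PLAIN;

    assert(pass >= 1);
    for (int i = 1; i <= pass; i++) {
        kind = toy_send_page(&s);
        toy_end_round(&s);
    }
    return kind;
}
```

In this model the first pass sends the page plain, the second only fills
the cache, and the third is the first one that can produce a delta.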



> IIUC save_xbzrle_page will add pages into the cache on
> the first iteration, so the second iteration will get
> cache hits
>
>> - 2nd iteration: pages are put into cache, no xbzrle is used because
>>   there is no previous page.
>> - 3rd iteration: We really use xbzrle now against the copy of the
>>   previous iterations.
>> 
>> And yes, this should be commented somewhere.

Seeing that it has been able to confuse you, a single comment will not
do the trick O:-)

Later, Juan.
Daniel P. Berrangé April 26, 2023, 10:15 a.m. UTC | #4
On Wed, Apr 26, 2023 at 11:42:51AM +0200, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Fri, Apr 21, 2023 at 11:54:55PM +0200, Juan Quintela wrote:
> >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >> > The 'unsigned int iterations' config for migration is somewhat
> >> > overkill. Most tests don't set it, and a value of '0' is treated
> >> > as equivalent to '1'. The only test that does set it, xbzrle,
> >> > used a value of '2'.
> >> >
> >> > This setting, however, only relates to the migration iterations
> >> > that take place prior to allowing convergence. IOW, on top of
> >> > this iteration count, there is always at least 1 further migration
> >> > iteration done to deal with pages that are dirtied during the
> >> > previous iteration(s).
> >> >
> >> > IOW, even with iterations==1, the xbzrle test will be running for
> >> > a minimum of 2 iterations. With this in mind we can simplify the
> >> > code and just get rid of the special case.
> >> 
> >> Perhaps the old code was already wrong, but we need at least three
> >> iterations for the xbzrle test:
> >> - 1st iteration: xbzrle is not used, nothing is on cache.
> >
> > Are you sure about this ?  I see ram_save_page() calling
> > save_xbzrle_page() and unless I'm mis-understanding the
> > code, it doesn't appear to skip anything on the 1st
> > iteration.
> 
> I will admit that code is convoluted as hell.
> And I confuse myself a lot here O:-)
> 
> struct RAM_STATE {
>     ...
>     /* Start using XBZRLE (e.g., after the first round). */
>     bool xbzrle_enabled;
> }
> 
> I.e. xbzrle_enabled() and m->xbzrle_enabled are two completely different things.

Aieeeee !  That's confusing indeed :-)

Let's rename that struct field to 'xbzrle_started', to better
distinguish active state from enabled state.


> static int ram_save_page(RAMState *rs, PageSearchStatus *pss)
> {
>     ...
>     if (rs->xbzrle_enabled && !migration_in_postcopy()) {
>         pages = save_xbzrle_page(rs, pss, &p, current_addr,
>                                  block, offset);
>         ....
>     }
>     ....
> }
> 
> and
> 
> static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
> {
>     /* Update pss->page for the next dirty bit in ramblock */
>     pss_find_next_dirty(pss);
> 
>     if (pss->complete_round && pss->block == rs->last_seen_block &&
>         ...
>         return PAGE_ALL_CLEAN;
>     }
>     if (!offset_in_ramblock(pss->block,
>                             ((ram_addr_t)pss->page) << TARGET_PAGE_BITS)) {
>         ....
>         if (!pss->block) {
>             ....
>             if (migrate_use_xbzrle()) {
>                 rs->xbzrle_enabled = true;
>             }
>         }
>         ...
>     } else {
>         /* We've found something */
>         return PAGE_DIRTY_FOUND;
>     }
> }
> 
> 
> 
> > IIUC save_xbzrle_page will add pages into the cache on
> > the first iteration, so the second iteration will get
> > cache hits
> >
> >> - 2nd iteration: pages are put into cache, no xbzrle is used because
> >>   there is no previous page.
> >> - 3rd iteration: We really use xbzrle now against the copy of the
> >>   previous iterations.
> >> 
> >> And yes, this should be commented somewhere.
> 
> Seeing that it has been able to confuse you, a single comment will not
> make the trick O:-)
> 
> Later, Juan.
> 

With regards,
Daniel

Patch

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index ac2e8ecac6..e16120ff30 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -568,9 +568,6 @@  typedef struct {
         MIG_TEST_FAIL_DEST_QUIT_ERR,
     } result;
 
-    /* Optional: set number of migration passes to wait for */
-    unsigned int iterations;
-
     /* Postcopy specific fields */
     void *postcopy_data;
     bool postcopy_preempt;
@@ -1354,13 +1351,7 @@  static void test_precopy_common(MigrateCommon *args)
             qtest_set_expected_status(to, EXIT_FAILURE);
         }
     } else {
-        if (args->iterations) {
-            while (args->iterations--) {
-                wait_for_migration_pass(from);
-            }
-        } else {
-            wait_for_migration_pass(from);
-        }
+        wait_for_migration_pass(from);
 
         migrate_ensure_converge(from);
 
@@ -1514,8 +1505,6 @@  static void test_precopy_unix_xbzrle(void)
         .listen_uri = uri,
 
         .start_hook = test_migrate_xbzrle_start,
-
-        .iterations = 2,
     };
 
     test_precopy_common(&args);