Message ID | 20220707184600.24164-1-peterx@redhat.com |
---|---|
State | New |
Series | tests: migration-test: Allow test to run without uffd |
On Thu, Jul 07, 2022 at 02:46:00PM -0400, Peter Xu wrote: > We used to stop running all tests if uffd is not detected. However > logically that's only needed for postcopy not the rest of tests. > > Keep running the rest when still possible. > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > tests/qtest/migration-test.c | 11 +++++------ > 1 file changed, 5 insertions(+), 6 deletions(-) Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> With regards, Daniel
On 07/07/2022 20.46, Peter Xu wrote: > We used to stop running all tests if uffd is not detected. However > logically that's only needed for postcopy not the rest of tests. > > Keep running the rest when still possible. > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > tests/qtest/migration-test.c | 11 +++++------ > 1 file changed, 5 insertions(+), 6 deletions(-) Did you test your patch in the gitlab-CI? I just added it to my testing-next branch and the test is failing reproducibly on macOS here: https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275 https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275 (without your patch the whole test is skipped instead) Thomas
Hi, Thomas, On Mon, Jul 18, 2022 at 08:23:26PM +0200, Thomas Huth wrote: > On 07/07/2022 20.46, Peter Xu wrote: > > We used to stop running all tests if uffd is not detected. However > > logically that's only needed for postcopy not the rest of tests. > > > > Keep running the rest when still possible. > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > tests/qtest/migration-test.c | 11 +++++------ > > 1 file changed, 5 insertions(+), 6 deletions(-) > > Did you test your patch in the gitlab-CI? I just added it to my testing-next > branch and the the test is failing reproducibly on macOS here: > > https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275 > https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275 > > (without your patch the whole test is skipped instead) Thanks for reporting this. Is it easy to figure out which test was failing on your side? I cannot easily reproduce this here on a MacOS with M1. Or any hint on how I could kick the same CI as you do would help too. I remembered I used to kick the test after any push with .gitlab-ci.yml but it seems it's not triggering for some reason here.
On Mon, Jul 18, 2022 at 03:14:37PM -0400, Peter Xu wrote: > Hi, Thomas, > > On Mon, Jul 18, 2022 at 08:23:26PM +0200, Thomas Huth wrote: > > On 07/07/2022 20.46, Peter Xu wrote: > > > We used to stop running all tests if uffd is not detected. However > > > logically that's only needed for postcopy not the rest of tests. > > > > > > Keep running the rest when still possible. > > > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > > --- > > > tests/qtest/migration-test.c | 11 +++++------ > > > 1 file changed, 5 insertions(+), 6 deletions(-) > > > > Did you test your patch in the gitlab-CI? I just added it to my testing-next > > branch and the the test is failing reproducibly on macOS here: > > > > https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275 > > https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275 > > > > (without your patch the whole test is skipped instead) > > Thanks for reporting this. > > Is it easy to figure out which test was failing on your side? I cannot > easily reproduce this here on a MacOS with M1. > > Or any hint on how I could kick the same CI as you do would help too. I > remembered I used to kick the test after any push with .gitlab-ci.yml but > it seems it's not triggering for some reason here. It is now opt-in with gitlab, 'git push -o ci.variable=QEMU_CI=1' to create the pipeline, then in the UI manually start the jobs you wish to run. Or QEMU_CI=2 to auto-run everything. Note for MacOS you'll need to configure Cirrus CI integration first though, per .gitlab-ci.d/cirrus/README With regards, Daniel
On 18/07/2022 21.14, Peter Xu wrote: > Hi, Thomas, > > On Mon, Jul 18, 2022 at 08:23:26PM +0200, Thomas Huth wrote: >> On 07/07/2022 20.46, Peter Xu wrote: >>> We used to stop running all tests if uffd is not detected. However >>> logically that's only needed for postcopy not the rest of tests. >>> >>> Keep running the rest when still possible. >>> >>> Signed-off-by: Peter Xu <peterx@redhat.com> >>> --- >>> tests/qtest/migration-test.c | 11 +++++------ >>> 1 file changed, 5 insertions(+), 6 deletions(-) >> >> Did you test your patch in the gitlab-CI? I just added it to my testing-next >> branch and the the test is failing reproducibly on macOS here: >> >> https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275 >> https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275 >> >> (without your patch the whole test is skipped instead) > > Thanks for reporting this. > > Is it easy to figure out which test was failing on your side? I cannot > easily reproduce this here on a MacOS with M1. I've modified the yml file to only run the migration test in verbose mode and got this: ... 
ok 5 /x86_64/migration/validate_uuid_src_not_set # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon chardev=char0,mode=control -display none -accel kvm -accel tcg -name source,debug-threads=on -m 150M -serial file:/tmp/migration-test-ef2fMr/src_serial -drive file=/tmp/migration-test-ef2fMr/bootsect,format=raw -uuid 11111111-1111-1111-1111-111111111111 2>/dev/null -accel qtest # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon chardev=char0,mode=control -display none -accel kvm -accel tcg -name target,debug-threads=on -m 150M -serial file:/tmp/migration-test-ef2fMr/dest_serial -incoming unix:/tmp/migration-test-ef2fMr/migsocket -drive file=/tmp/migration-test-ef2fMr/bootsect,format=raw 2>/dev/null -accel qtest ok 6 /x86_64/migration/validate_uuid_dst_not_set # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon chardev=char0,mode=control -display none -accel kvm -accel tcg -name source,debug-threads=on -m 150M -serial file:/tmp/migration-test-ef2fMr/src_serial -drive file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon chardev=char0,mode=control -display none -accel kvm -accel tcg -name target,debug-threads=on -m 150M -serial file:/tmp/migration-test-ef2fMr/dest_serial -incoming unix:/tmp/migration-test-ef2fMr/migsocket -drive file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest ** ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) Bail out! 
ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) qemu-system-x86_64: failed to save SaveStateEntry with id(name): 2(ram): -5 qemu-system-x86_64: Unable to write to socket: Broken pipe /var/folders/tn/f_9sf1xx5t14qm_6f83q3b840000gn/T/scripts81855ad8681d0d86d1e91e00167939cb.sh: line 9: 58011 Abort trap: 6 QTEST_QEMU_BINARY=./qemu-system-x86_64 tests/qtest/migration-test (see: https://cirrus-ci.com/task/5719789887815680?logs=build#L7205 ) So it seems like validate_uuid_dst_not_set was the last successful test, and it's likely failing with test_migrate_auto_converge ? > Or any hint on how I could kick the same CI as you do would help too. I > remembered I used to kick the test after any push with .gitlab-ci.yml but > it seems it's not triggering for some reason here. As Daniel already said, you need to set up Cirrus-CI according to .gitlab-ci.d/cirrus/README.rst to get the macOS jobs in your CI. Thomas
On Tue, Jul 19, 2022 at 12:28:24PM +0200, Thomas Huth wrote: > On 18/07/2022 21.14, Peter Xu wrote: > > Hi, Thomas, > > > > On Mon, Jul 18, 2022 at 08:23:26PM +0200, Thomas Huth wrote: > > > On 07/07/2022 20.46, Peter Xu wrote: > > > > We used to stop running all tests if uffd is not detected. However > > > > logically that's only needed for postcopy not the rest of tests. > > > > > > > > Keep running the rest when still possible. > > > > > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > > > --- > > > > tests/qtest/migration-test.c | 11 +++++------ > > > > 1 file changed, 5 insertions(+), 6 deletions(-) > > > > > > Did you test your patch in the gitlab-CI? I just added it to my testing-next > > > branch and the the test is failing reproducibly on macOS here: > > > > > > https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275 > > > https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275 > > > > > > (without your patch the whole test is skipped instead) > > > > Thanks for reporting this. > > > > Is it easy to figure out which test was failing on your side? I cannot > > easily reproduce this here on a MacOS with M1. > > I've modified the yml file to only run the migration test in verbose mode > and got this: > > ... 
> ok 5 /x86_64/migration/validate_uuid_src_not_set > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > source,debug-threads=on -m 150M -serial > file:/tmp/migration-test-ef2fMr/src_serial -drive > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -uuid > 11111111-1111-1111-1111-111111111111 2>/dev/null -accel qtest > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > target,debug-threads=on -m 150M -serial > file:/tmp/migration-test-ef2fMr/dest_serial -incoming > unix:/tmp/migration-test-ef2fMr/migsocket -drive > file=/tmp/migration-test-ef2fMr/bootsect,format=raw 2>/dev/null -accel > qtest > ok 6 /x86_64/migration/validate_uuid_dst_not_set > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > source,debug-threads=on -m 150M -serial > file:/tmp/migration-test-ef2fMr/src_serial -drive > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > target,debug-threads=on -m 150M -serial > file:/tmp/migration-test-ef2fMr/dest_serial -incoming > unix:/tmp/migration-test-ef2fMr/migsocket -drive > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest > ** > ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: > assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) 
> Bail out! > ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: > assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) This is the safety net we put in to catch the case where the test has got stuck. It is set at 2 minutes. There's a chance that is too short, so a first step might be to increase it to 10 minutes and see if the tests pass. If it still fails, then it's likely a genuine bug. > qemu-system-x86_64: failed to save SaveStateEntry with id(name): 2(ram): -5 > qemu-system-x86_64: Unable to write to socket: Broken pipe > /var/folders/tn/f_9sf1xx5t14qm_6f83q3b840000gn/T/scripts81855ad8681d0d86d1e91e00167939cb.sh: > line 9: 58011 Abort trap: 6 QTEST_QEMU_BINARY=./qemu-system-x86_64 > tests/qtest/migration-test > > (see: https://cirrus-ci.com/task/5719789887815680?logs=build#L7205 ) > > So it seems like validate_uuid_dst_not_set was the last successful test, and > it's likely failing with test_migrate_auto_converge ? Agreed, looks like the auto_converge test, which is the first test that actually tries to run a migration to completion. With regards, Daniel
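The "safety net" described above is a poll loop with a deadline. What follows is only a minimal sketch of that pattern in plain C, not the code in tests/qtest/migration-helpers.c: the names `wait_for_status`, `status_check_fn`, `countdown_reached` and `never_reached` are hypothetical, and it returns false on timeout where the real helper asserts on `g_test_timer_elapsed()`.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns true once the awaited status is reached. */
typedef bool (*status_check_fn)(void *opaque);

/* Seconds elapsed on the monotonic clock since *start. */
double elapsed_since(const struct timespec *start)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (double)(now.tv_sec - start->tv_sec) +
           (double)(now.tv_nsec - start->tv_nsec) / 1e9;
}

/*
 * Poll reached() until it returns true or timeout_sec has elapsed.
 * Returns true on success and false on timeout (the real test aborts
 * with an assertion failure instead of returning).
 */
bool wait_for_status(status_check_fn reached, void *opaque, double timeout_sec)
{
    const struct timespec pause = { 0, 1000 * 1000 }; /* 1 ms between polls */
    struct timespec start;

    assert(reached != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);

    while (!reached(opaque)) {
        if (elapsed_since(&start) >= timeout_sec) {
            return false; /* stuck, or simply slower than the limit */
        }
        nanosleep(&pause, NULL);
    }
    return true;
}

/* Demo callback: counts *opaque down by one per poll, true once it hits zero. */
bool countdown_reached(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
    }
    return *remaining == 0;
}

/* Demo callback: models a migration that never completes. */
bool never_reached(void *opaque)
{
    (void)opaque;
    return false;
}
```

A loop like this cannot tell a hung migration from a slow one, which is why raising the limit is a reasonable first diagnostic step before assuming a genuine bug.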
On Tue, Jul 19, 2022 at 11:37:55AM +0100, Daniel P. Berrangé wrote: > On Tue, Jul 19, 2022 at 12:28:24PM +0200, Thomas Huth wrote: > > On 18/07/2022 21.14, Peter Xu wrote: > > > Hi, Thomas, > > > > > > On Mon, Jul 18, 2022 at 08:23:26PM +0200, Thomas Huth wrote: > > > > On 07/07/2022 20.46, Peter Xu wrote: > > > > > We used to stop running all tests if uffd is not detected. However > > > > > logically that's only needed for postcopy not the rest of tests. > > > > > > > > > > Keep running the rest when still possible. > > > > > > > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > > > > --- > > > > > tests/qtest/migration-test.c | 11 +++++------ > > > > > 1 file changed, 5 insertions(+), 6 deletions(-) > > > > > > > > Did you test your patch in the gitlab-CI? I just added it to my testing-next > > > > branch and the the test is failing reproducibly on macOS here: > > > > > > > > https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275 > > > > https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275 > > > > > > > > (without your patch the whole test is skipped instead) > > > > > > Thanks for reporting this. > > > > > > Is it easy to figure out which test was failing on your side? I cannot > > > easily reproduce this here on a MacOS with M1. > > > > I've modified the yml file to only run the migration test in verbose mode > > and got this: > > > > ... 
> > ok 5 /x86_64/migration/validate_uuid_src_not_set > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > source,debug-threads=on -m 150M -serial > > file:/tmp/migration-test-ef2fMr/src_serial -drive > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -uuid > > 11111111-1111-1111-1111-111111111111 2>/dev/null -accel qtest > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > target,debug-threads=on -m 150M -serial > > file:/tmp/migration-test-ef2fMr/dest_serial -incoming > > unix:/tmp/migration-test-ef2fMr/migsocket -drive > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw 2>/dev/null -accel > > qtest > > ok 6 /x86_64/migration/validate_uuid_dst_not_set > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > source,debug-threads=on -m 150M -serial > > file:/tmp/migration-test-ef2fMr/src_serial -drive > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > target,debug-threads=on -m 150M -serial > > file:/tmp/migration-test-ef2fMr/dest_serial -incoming > > unix:/tmp/migration-test-ef2fMr/migsocket -drive > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest > > ** > > ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: > > assertion 
failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) > > Bail out! > > ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: > > assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) > > This is the safety net we put it to catch case where the test has > got stuck. It is set at 2 minutes. > > There's a chance that is too short, so one first step might be to > increase to 10 minutes and see if the tests pass. If it still fails, > then its likely a genuine bug Agreed, it's worth another try. Thanks both for your answers on CI. I wanted to go through the setup of Cirrus CI and kick it myself, but I got stuck at the step of generating the API token for Cirrus. It seems the button to generate the API token just didn't respond for me until I refreshed the page (then I could see that some token had been generated), however I still haven't figured out a way to see the initial 6 letters since they are always masked out. Changing browser didn't work for me either. :(
On 19/07/2022 21.53, Peter Xu wrote: ... > It seems the button to generate API token just didn't have a respond for me > until I refresh the page (then I can see some token generated), however I > still haven't figured out a way to see the initial 6 letters since they'll > be always masked out.. Changing browser didn't work for me either. :( I haven't tried in a while, but IIRC the token is indeed only shown at the first access - and if that's not happening for you, then there is likely something broken. Are you using some plug in like uMatrix or the like? Maybe it helps to switch that off? Thomas
On Wed, Jul 20, 2022 at 12:52:02PM +0200, Thomas Huth wrote: > On 19/07/2022 21.53, Peter Xu wrote: > ... > > It seems the button to generate API token just didn't have a respond for me > > until I refresh the page (then I can see some token generated), however I > > still haven't figured out a way to see the initial 6 letters since they'll > > be always masked out.. Changing browser didn't work for me either. :( > > I haven't tried in a while, but IIRC the token is indeed only shown at the > first access - and if that's not happening for you, then there is likely > something broken. Are you using some plug in like uMatrix or the like? Maybe > it helps to switch that off? Sadly no, besides the Chrome I commonly use I also tried a fresh new Firefox and Safari on different hosts. None worked for me..
On 19/07/2022 12.37, Daniel P. Berrangé wrote: > On Tue, Jul 19, 2022 at 12:28:24PM +0200, Thomas Huth wrote: >> On 18/07/2022 21.14, Peter Xu wrote: >>> Hi, Thomas, >>> >>> On Mon, Jul 18, 2022 at 08:23:26PM +0200, Thomas Huth wrote: >>>> On 07/07/2022 20.46, Peter Xu wrote: >>>>> We used to stop running all tests if uffd is not detected. However >>>>> logically that's only needed for postcopy not the rest of tests. >>>>> >>>>> Keep running the rest when still possible. >>>>> >>>>> Signed-off-by: Peter Xu <peterx@redhat.com> >>>>> --- >>>>> tests/qtest/migration-test.c | 11 +++++------ >>>>> 1 file changed, 5 insertions(+), 6 deletions(-) >>>> >>>> Did you test your patch in the gitlab-CI? I just added it to my testing-next >>>> branch and the the test is failing reproducibly on macOS here: >>>> >>>> https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275 >>>> https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275 >>>> >>>> (without your patch the whole test is skipped instead) >>> >>> Thanks for reporting this. >>> >>> Is it easy to figure out which test was failing on your side? I cannot >>> easily reproduce this here on a MacOS with M1. >> >> I've modified the yml file to only run the migration test in verbose mode >> and got this: >> >> ... 
>> ok 5 /x86_64/migration/validate_uuid_src_not_set >> # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock >> -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon >> chardev=char0,mode=control -display none -accel kvm -accel tcg -name >> source,debug-threads=on -m 150M -serial >> file:/tmp/migration-test-ef2fMr/src_serial -drive >> file=/tmp/migration-test-ef2fMr/bootsect,format=raw -uuid >> 11111111-1111-1111-1111-111111111111 2>/dev/null -accel qtest >> # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock >> -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon >> chardev=char0,mode=control -display none -accel kvm -accel tcg -name >> target,debug-threads=on -m 150M -serial >> file:/tmp/migration-test-ef2fMr/dest_serial -incoming >> unix:/tmp/migration-test-ef2fMr/migsocket -drive >> file=/tmp/migration-test-ef2fMr/bootsect,format=raw 2>/dev/null -accel >> qtest >> ok 6 /x86_64/migration/validate_uuid_dst_not_set >> # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock >> -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon >> chardev=char0,mode=control -display none -accel kvm -accel tcg -name >> source,debug-threads=on -m 150M -serial >> file:/tmp/migration-test-ef2fMr/src_serial -drive >> file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest >> # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock >> -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon >> chardev=char0,mode=control -display none -accel kvm -accel tcg -name >> target,debug-threads=on -m 150M -serial >> file:/tmp/migration-test-ef2fMr/dest_serial -incoming >> unix:/tmp/migration-test-ef2fMr/migsocket -drive >> file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest >> ** >> ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: >> assertion failed: (g_test_timer_elapsed() 
< MIGRATION_STATUS_WAIT_TIMEOUT) >> Bail out! >> ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: >> assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) > > This is the safety net we put it to catch case where the test has > got stuck. It is set at 2 minutes. > > There's a chance that is too short, so one first step might be to > increase to 10 minutes and see if the tests pass. If it still fails, > then its likely a genuine bug I tried to increase it to 5 minutes first, but that did not help. In a second try, I increased it to 10 minutes, and then the test was passing, indeed: https://cirrus-ci.com/task/5819072351830016?logs=build#L7208 Could it maybe be accelerated, e.g. by tweaking the downtime limit again? Thomas
On Wed, Jul 20, 2022 at 04:11:43PM +0200, Thomas Huth wrote: > On 19/07/2022 12.37, Daniel P. Berrangé wrote: > > On Tue, Jul 19, 2022 at 12:28:24PM +0200, Thomas Huth wrote: > > > On 18/07/2022 21.14, Peter Xu wrote: > > > > Hi, Thomas, > > > > > > > > On Mon, Jul 18, 2022 at 08:23:26PM +0200, Thomas Huth wrote: > > > > > On 07/07/2022 20.46, Peter Xu wrote: > > > > > > We used to stop running all tests if uffd is not detected. However > > > > > > logically that's only needed for postcopy not the rest of tests. > > > > > > > > > > > > Keep running the rest when still possible. > > > > > > > > > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > > > > > --- > > > > > > tests/qtest/migration-test.c | 11 +++++------ > > > > > > 1 file changed, 5 insertions(+), 6 deletions(-) > > > > > > > > > > Did you test your patch in the gitlab-CI? I just added it to my testing-next > > > > > branch and the the test is failing reproducibly on macOS here: > > > > > > > > > > https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275 > > > > > https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275 > > > > > > > > > > (without your patch the whole test is skipped instead) > > > > > > > > Thanks for reporting this. > > > > > > > > Is it easy to figure out which test was failing on your side? I cannot > > > > easily reproduce this here on a MacOS with M1. > > > > > > I've modified the yml file to only run the migration test in verbose mode > > > and got this: > > > > > > ... 
> > > ok 5 /x86_64/migration/validate_uuid_src_not_set > > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > > source,debug-threads=on -m 150M -serial > > > file:/tmp/migration-test-ef2fMr/src_serial -drive > > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -uuid > > > 11111111-1111-1111-1111-111111111111 2>/dev/null -accel qtest > > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > > target,debug-threads=on -m 150M -serial > > > file:/tmp/migration-test-ef2fMr/dest_serial -incoming > > > unix:/tmp/migration-test-ef2fMr/migsocket -drive > > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw 2>/dev/null -accel > > > qtest > > > ok 6 /x86_64/migration/validate_uuid_dst_not_set > > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > > source,debug-threads=on -m 150M -serial > > > file:/tmp/migration-test-ef2fMr/src_serial -drive > > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest > > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > > target,debug-threads=on -m 150M -serial > > > file:/tmp/migration-test-ef2fMr/dest_serial -incoming > > > unix:/tmp/migration-test-ef2fMr/migsocket -drive > > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest > > > ** > > > 
ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: > > > assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) > > > Bail out! > > > ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: > > > assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) > > > > This is the safety net we put it to catch case where the test has > > got stuck. It is set at 2 minutes. > > > > There's a chance that is too short, so one first step might be to > > increase to 10 minutes and see if the tests pass. If it still fails, > > then its likely a genuine bug > > I tried to increase it to 5 minutes first, but that did not help. In a > second try, I increased it to 10 minutes, and then the test was passing, > indeed: > > https://cirrus-ci.com/task/5819072351830016?logs=build#L7208 > > Could it maybe be accelerated, e.g. by tweaking the downtime limit again?
Oh, when I tweaked the convergence tunables I missed the auto-converge case, as its code looks a bit different.
Possibly change test_migrate_auto_converge
    /* Now, when we tested that throttling works, let it converge */
    migrate_set_parameter_int(from, "downtime-limit", downtime_limit);
    migrate_set_parameter_int(from, "max-bandwidth", max_bandwidth);
to
    migrate_ensure_converge(from);
With regards, Daniel
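The suggested change swaps two per-test parameter writes for one shared helper. The sketch below shows the shape of that refactoring in isolation. It is not the QEMU test code: `FakeMigrationState` is a hypothetical stand-in for the real QTestState, `migrate_set_parameter_int` is a local stub that only records values, and the two `ILLUSTRATIVE_*` constants are placeholder values, not ones taken from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder tunables (assumptions): generous enough that migration can finish. */
#define ILLUSTRATIVE_DOWNTIME_LIMIT_MS (30 * 1000)
#define ILLUSTRATIVE_MAX_BANDWIDTH     (INT64_C(1000) * 1000 * 1000)

/* Hypothetical stand-in for QTestState: it only records the parameters set. */
typedef struct FakeMigrationState {
    int64_t downtime_limit_ms;
    int64_t max_bandwidth;
} FakeMigrationState;

/* Local stub with the same call shape as the lines quoted in the thread. */
void migrate_set_parameter_int(FakeMigrationState *who, const char *parameter,
                               int64_t value)
{
    assert(who != NULL && parameter != NULL);

    if (strcmp(parameter, "downtime-limit") == 0) {
        who->downtime_limit_ms = value;
    } else if (strcmp(parameter, "max-bandwidth") == 0) {
        who->max_bandwidth = value;
    }
}

/*
 * One shared helper for "let the migration converge now", so that every
 * test picks up the same tunables instead of carrying its own copies.
 */
void migrate_ensure_converge(FakeMigrationState *who)
{
    migrate_set_parameter_int(who, "max-bandwidth",
                              ILLUSTRATIVE_MAX_BANDWIDTH);
    migrate_set_parameter_int(who, "downtime-limit",
                              ILLUSTRATIVE_DOWNTIME_LIMIT_MS);
}
```

The point of routing every test through one helper is that a later retuning of the convergence values reaches all callers, which is the gap described above for the auto-converge case.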
On Wed, Jul 20, 2022 at 03:32:20PM +0100, Daniel P. Berrangé wrote: > On Wed, Jul 20, 2022 at 04:11:43PM +0200, Thomas Huth wrote: > > On 19/07/2022 12.37, Daniel P. Berrangé wrote: > > > On Tue, Jul 19, 2022 at 12:28:24PM +0200, Thomas Huth wrote: > > > > On 18/07/2022 21.14, Peter Xu wrote: > > > > > Hi, Thomas, > > > > > > > > > > On Mon, Jul 18, 2022 at 08:23:26PM +0200, Thomas Huth wrote: > > > > > > On 07/07/2022 20.46, Peter Xu wrote: > > > > > > > We used to stop running all tests if uffd is not detected. However > > > > > > > logically that's only needed for postcopy not the rest of tests. > > > > > > > > > > > > > > Keep running the rest when still possible. > > > > > > > > > > > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > > > > > > --- > > > > > > > tests/qtest/migration-test.c | 11 +++++------ > > > > > > > 1 file changed, 5 insertions(+), 6 deletions(-) > > > > > > > > > > > > Did you test your patch in the gitlab-CI? I just added it to my testing-next > > > > > > branch and the the test is failing reproducibly on macOS here: > > > > > > > > > > > > https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275 > > > > > > https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275 > > > > > > > > > > > > (without your patch the whole test is skipped instead) > > > > > > > > > > Thanks for reporting this. > > > > > > > > > > Is it easy to figure out which test was failing on your side? I cannot > > > > > easily reproduce this here on a MacOS with M1. > > > > > > > > I've modified the yml file to only run the migration test in verbose mode > > > > and got this: > > > > > > > > ... 
> > > > ok 5 /x86_64/migration/validate_uuid_src_not_set > > > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > > > source,debug-threads=on -m 150M -serial > > > > file:/tmp/migration-test-ef2fMr/src_serial -drive > > > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -uuid > > > > 11111111-1111-1111-1111-111111111111 2>/dev/null -accel qtest > > > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > > > target,debug-threads=on -m 150M -serial > > > > file:/tmp/migration-test-ef2fMr/dest_serial -incoming > > > > unix:/tmp/migration-test-ef2fMr/migsocket -drive > > > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw 2>/dev/null -accel > > > > qtest > > > > ok 6 /x86_64/migration/validate_uuid_dst_not_set > > > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > > > source,debug-threads=on -m 150M -serial > > > > file:/tmp/migration-test-ef2fMr/src_serial -drive > > > > file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest > > > > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock > > > > -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon > > > > chardev=char0,mode=control -display none -accel kvm -accel tcg -name > > > > target,debug-threads=on -m 150M -serial > > > > file:/tmp/migration-test-ef2fMr/dest_serial -incoming > > > > unix:/tmp/migration-test-ef2fMr/migsocket -drive > > > > 
file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest > > > > ** > > > > ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: > > > > assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) > > > > Bail out! > > > > ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status: > > > > assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) > > > > > > This is the safety net we put it to catch case where the test has > > > got stuck. It is set at 2 minutes. > > > > > > There's a chance that is too short, so one first step might be to > > > increase to 10 minutes and see if the tests pass. If it still fails, > > > then its likely a genuine bug > > > > I tried to increase it to 5 minutes first, but that did not help. In a > > second try, I increased it to 10 minutes, and then the test was passing, > > indeed: > > > > https://cirrus-ci.com/task/5819072351830016?logs=build#L7208 > > > > Could it maybe be accelerated, e.g. by tweaking the downtime limit again? > > Oh when I tweaked convergance tunables i missed the auto-converge > case as its code looks a bit different. > > Possibly change test_migrate_auto_converge > > /* Now, when we tested that throttling works, let it converge */ > migrate_set_parameter_int(from, "downtime-limit", downtime_limit); > migrate_set_parameter_int(from, "max-bandwidth", max_bandwidth); > > to > > migrate_ensure_converge(from); Sounds good to me. Thomas, would that work for you too? I'm wondering whether you'd like to post a patch for that. I could have reposted both patches (including what Dan suggested) but I still have no good way to kick that macos test so I cannot verify it. Let me know if you want me to post those, I can do it (and test as much as I could) but I may need some help on kicking a test to verify it. Thanks!
On 21/07/2022 20.24, Peter Xu wrote:
> On Wed, Jul 20, 2022 at 03:32:20PM +0100, Daniel P. Berrangé wrote:
>> On Wed, Jul 20, 2022 at 04:11:43PM +0200, Thomas Huth wrote:
>>> On 19/07/2022 12.37, Daniel P. Berrangé wrote:
>>>> On Tue, Jul 19, 2022 at 12:28:24PM +0200, Thomas Huth wrote:
>>>>> On 18/07/2022 21.14, Peter Xu wrote:
>>>>>> Hi, Thomas,
>>>>>>
>>>>>> On Mon, Jul 18, 2022 at 08:23:26PM +0200, Thomas Huth wrote:
>>>>>>> On 07/07/2022 20.46, Peter Xu wrote:
>>>>>>>> We used to stop running all tests if uffd is not detected. However
>>>>>>>> logically that's only needed for postcopy not the rest of tests.
>>>>>>>>
>>>>>>>> Keep running the rest when still possible.
>>>>>>>>
>>>>>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>>>>>>> ---
>>>>>>>> tests/qtest/migration-test.c | 11 +++++------
>>>>>>>> 1 file changed, 5 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> Did you test your patch in the gitlab-CI? I just added it to my testing-next
>>>>>>> branch and the test is failing reproducibly on macOS here:
>>>>>>>
>>>>>>> https://gitlab.com/thuth/qemu/-/jobs/2736260861#L6275
>>>>>>> https://gitlab.com/thuth/qemu/-/jobs/2736623914#L6275
>>>>>>>
>>>>>>> (without your patch the whole test is skipped instead)
>>>>>>
>>>>>> Thanks for reporting this.
>>>>>>
>>>>>> Is it easy to figure out which test was failing on your side? I cannot
>>>>>> easily reproduce this here on a macOS with M1.
>>>>>
>>>>> I've modified the yml file to only run the migration test in verbose mode
>>>>> and got this:
>>>>>
>>>>> ...
>>>>> ok 5 /x86_64/migration/validate_uuid_src_not_set
>>>>> # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock
>>>>> -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon
>>>>> chardev=char0,mode=control -display none -accel kvm -accel tcg -name
>>>>> source,debug-threads=on -m 150M -serial
>>>>> file:/tmp/migration-test-ef2fMr/src_serial -drive
>>>>> file=/tmp/migration-test-ef2fMr/bootsect,format=raw -uuid
>>>>> 11111111-1111-1111-1111-111111111111 2>/dev/null -accel qtest
>>>>> # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock
>>>>> -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon
>>>>> chardev=char0,mode=control -display none -accel kvm -accel tcg -name
>>>>> target,debug-threads=on -m 150M -serial
>>>>> file:/tmp/migration-test-ef2fMr/dest_serial -incoming
>>>>> unix:/tmp/migration-test-ef2fMr/migsocket -drive
>>>>> file=/tmp/migration-test-ef2fMr/bootsect,format=raw 2>/dev/null -accel
>>>>> qtest
>>>>> ok 6 /x86_64/migration/validate_uuid_dst_not_set
>>>>> # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock
>>>>> -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon
>>>>> chardev=char0,mode=control -display none -accel kvm -accel tcg -name
>>>>> source,debug-threads=on -m 150M -serial
>>>>> file:/tmp/migration-test-ef2fMr/src_serial -drive
>>>>> file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest
>>>>> # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-58011.sock
>>>>> -qtest-log /dev/null -chardev socket,path=/tmp/qtest-58011.qmp,id=char0 -mon
>>>>> chardev=char0,mode=control -display none -accel kvm -accel tcg -name
>>>>> target,debug-threads=on -m 150M -serial
>>>>> file:/tmp/migration-test-ef2fMr/dest_serial -incoming
>>>>> unix:/tmp/migration-test-ef2fMr/migsocket -drive
>>>>> file=/tmp/migration-test-ef2fMr/bootsect,format=raw -accel qtest
>>>>> **
>>>>> ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status:
>>>>> assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT)
>>>>> Bail out!
>>>>> ERROR:../tests/qtest/migration-helpers.c:181:wait_for_migration_status:
>>>>> assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT)
>>>>
>>>> This is the safety net we put in to catch the case where the test has
>>>> got stuck. It is set at 2 minutes.
>>>>
>>>> There's a chance that is too short, so one first step might be to
>>>> increase to 10 minutes and see if the tests pass. If it still fails,
>>>> then it's likely a genuine bug
>>>
>>> I tried to increase it to 5 minutes first, but that did not help. In a
>>> second try, I increased it to 10 minutes, and then the test was passing,
>>> indeed:
>>>
>>> https://cirrus-ci.com/task/5819072351830016?logs=build#L7208
>>>
>>> Could it maybe be accelerated, e.g. by tweaking the downtime limit again?
>>
>> Oh when I tweaked convergence tunables I missed the auto-converge
>> case as its code looks a bit different.
>>
>> Possibly change test_migrate_auto_converge
>>
>>     /* Now, when we tested that throttling works, let it converge */
>>     migrate_set_parameter_int(from, "downtime-limit", downtime_limit);
>>     migrate_set_parameter_int(from, "max-bandwidth", max_bandwidth);
>>
>> to
>>
>>     migrate_ensure_converge(from);
>
> Sounds good to me.
>
> Thomas, would that work for you too? I'm wondering whether you'd like to
> post a patch for that.
>
> I could have reposted both patches (including what Dan suggested) but I
> still have no good way to kick that macOS test so I cannot verify it. Let
> me know if you want me to post those, I can do it (and test as much as I
> could) but I may need some help on kicking a test to verify it.

Please go ahead and post the patches - I'll then try to provide a Tested-by
as soon as possible.

 Thomas
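For readers following the timeout discussion, the safety net can be pictured as a polling loop with a deadline. What follows is only a minimal standalone sketch of that pattern, not the code in migration-helpers.c: `wait_for_status`, `status_fn`, `fake_migration` and `fake_status` are hypothetical names made up for illustration, and the sketch returns false on timeout where the real helper asserts.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Hypothetical callback type: returns the current status string. */
typedef const char *(*status_fn)(void *opaque);

/* Seconds elapsed on the monotonic clock since 'start'. */
static double elapsed_since(const struct timespec *start)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (double)(now.tv_sec - start->tv_sec) +
           (double)(now.tv_nsec - start->tv_nsec) / 1e9;
}

/*
 * Poll query() until it reports 'goal' or 'timeout_s' seconds have passed.
 * Returns true if the goal was reached, false if the deadline expired.
 */
static bool wait_for_status(status_fn query, void *opaque,
                            const char *goal, double timeout_s)
{
    struct timespec start;

    assert(query != NULL && goal != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);

    while (strcmp(query(opaque), goal) != 0) {
        if (elapsed_since(&start) >= timeout_s) {
            return false; /* the safety net: give up instead of hanging */
        }
        /* Sleep 1 ms between polls so the loop does not spin. */
        nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 1000000 }, NULL);
    }
    return true;
}

/* Illustrative status source: "active" until the countdown reaches zero. */
struct fake_migration {
    int polls_left;
};

static const char *fake_status(void *opaque)
{
    struct fake_migration *m = opaque;

    if (m->polls_left > 0) {
        m->polls_left--;
        return "active";
    }
    return "completed";
}
```

With the fake source, a short countdown lets `wait_for_status(fake_status, &m, "completed", 5.0)` return true, while a countdown that outlasts the deadline makes it return false. The trade-off is the one debated above: raising the deadline hides slowness, whereas making the workload converge faster keeps the deadline meaningful.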
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 9e64125f02..55acf9612c 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2086,14 +2086,11 @@ int main(int argc, char **argv)
 {
     char template[] = "/tmp/migration-test-XXXXXX";
     const bool has_kvm = qtest_has_accel("kvm");
+    const bool has_uffd = ufd_version_check();
     int ret;
 
     g_test_init(&argc, &argv, NULL);
 
-    if (!ufd_version_check()) {
-        return g_test_run();
-    }
-
     /*
      * On ppc64, the test only works with kvm-hv, but not with kvm-pr and TCG
      * is touchy due to race conditions on dirty bits (especially on PPC for
@@ -2122,8 +2119,10 @@ int main(int argc, char **argv)
 
     module_call_init(MODULE_INIT_QOM);
 
-    qtest_add_func("/migration/postcopy/unix", test_postcopy);
-    qtest_add_func("/migration/postcopy/recovery", test_postcopy_recovery);
+    if (has_uffd) {
+        qtest_add_func("/migration/postcopy/unix", test_postcopy);
+        qtest_add_func("/migration/postcopy/recovery", test_postcopy_recovery);
+    }
     qtest_add_func("/migration/bad_dest", test_baddest);
     qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
     qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
We used to stop running all tests if uffd is not detected. However
logically that's only needed for postcopy not the rest of tests.

Keep running the rest when still possible.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tests/qtest/migration-test.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)
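
The idea of the patch, probing the feature once and gating only the tests that need it, can be shown in isolation. This is a hedged sketch with a toy registry, not the qtest API: `struct registry`, `add_test`, `has_test` and `register_tests` are hypothetical stand-ins, and the `has_uffd` flag is passed in rather than probed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_TESTS 16

typedef void (*test_fn)(void);

/* Toy stand-in for a test harness: a fixed-size list of (path, function). */
struct registry {
    const char *paths[MAX_TESTS];
    test_fn funcs[MAX_TESTS];
    size_t count;
};

static void add_test(struct registry *r, const char *path, test_fn fn)
{
    assert(r->count < MAX_TESTS);
    r->paths[r->count] = path;
    r->funcs[r->count] = fn;
    r->count++;
}

/* Returns true if a test with this exact path has been registered. */
static bool has_test(const struct registry *r, const char *path)
{
    for (size_t i = 0; i < r->count; i++) {
        if (strcmp(r->paths[i], path) == 0) {
            return true;
        }
    }
    return false;
}

/* Empty placeholder test bodies, for illustration only. */
static void test_postcopy(void) {}
static void test_precopy(void) {}

/*
 * Register every test the host can run. Only the postcopy test depends
 * on the feature probe; the precopy test is registered unconditionally.
 */
static void register_tests(struct registry *r, bool has_uffd)
{
    if (has_uffd) {
        add_test(r, "/migration/postcopy/unix", test_postcopy);
    }
    add_test(r, "/migration/precopy/unix/plain", test_precopy);
}
```

Calling `register_tests(&r, false)` on a zero-initialised registry leaves the precopy test in place and omits only the postcopy one, instead of returning before anything is registered as the old early-exit did.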