[RFC,3/3] tests/qtest/migration-test: Enable test_ignore_shared

Message ID 20240525131241.378473-4-npiggin@gmail.com
State New
Series Fix s390x flic migration and add some more qtests

Commit Message

Nicholas Piggin May 25, 2024, 1:12 p.m. UTC
This was said to be broken on aarch64, but if it works on others,
let's try enabling it. It's already starting to bitrot...

Cc: Yury Kotov <yury-kotov@yandex-team.ru>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/qtest/migration-test.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

Comments

Fabiano Rosas May 27, 2024, 12:42 p.m. UTC | #1
Nicholas Piggin <npiggin@gmail.com> writes:

> This was said to be broken on aarch64, but if it works on others,
> let's try enabling it. It's already starting to bitrot...

Yeah, look at the state of this...

I don't know what the issue was on aarch64, but I'm all for enabling
this test globally and then we deal with the breakage if it ever
comes. I don't think it will.

However, there is an issue here still on all archs - which might very
well have been the original issue - which is the fact that the
containers on the Gitlab CI have limits on shared memory usage.
Unfortunately we cannot enable this test for the CI, so it needs a check
on the GITLAB_CI environment variable.
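
For illustration, a minimal sketch of such a check, assuming it is enough
to test whether the variable is set at all (GitLab sets GITLAB_CI to
"true" in CI jobs). The helper name running_in_gitlab_ci() is
hypothetical and not part of migration-test.c:

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in migration-test.c): report whether the
 * process runs inside a GitLab CI job, judged only by GITLAB_CI being
 * set to a non-empty value.
 */
static bool running_in_gitlab_ci(void)
{
    const char *val = getenv("GITLAB_CI");

    return val != NULL && val[0] != '\0';
}
```

main() could then wrap the migration_test_add() call for
/migration/ignore_shared in "if (!running_in_gitlab_ci())".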

There's also the cpr-reboot test, which was put under "flaky" and has
the same issue. That one should also have been gated on GITLAB_CI. From
that discussion:

  "We have an issue with this test on CI:
  
  $ df -h /dev/shm
  Filesystem      Size  Used Avail Use% Mounted on
  shm              64M     0   64M   0% /dev/shm
  
  These are shared CI runners, so AFAICT there's no way to increase the
  shared memory size.
  
  Reducing the memory for this single test also wouldn't work because we
  can run migration-test for different archs in parallel + there's the
  ivshmem_test which uses 4M.
  
  Maybe just leave it out of CI? Laptops will probably have enough shared
  memory to not hit this. If we add a warning comment to the test, might
  be enough." -- https://lore.kernel.org/all/87ttq5fvh7.fsf@suse.de

>
> Cc: Yury Kotov <yury-kotov@yandex-team.ru>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  tests/qtest/migration-test.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 7987faaded..2bcdc33b7c 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -1862,14 +1862,15 @@ static void test_precopy_unix_tls_x509_override_host(void)
>  #endif /* CONFIG_TASN1 */
>  #endif /* CONFIG_GNUTLS */
>  
> -#if 0
> -/* Currently upset on aarch64 TCG */
>  static void test_ignore_shared(void)
>  {
>      g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
>      QTestState *from, *to;
> +    MigrateStart args = {
> +        .use_shmem = true,
> +    };
>  
> -    if (test_migrate_start(&from, &to, uri, false, true, NULL, NULL)) {
> +    if (test_migrate_start(&from, &to, uri, &args)) {
>          return;
>      }
>  
> @@ -1898,7 +1899,6 @@ static void test_ignore_shared(void)
>  
>      test_migrate_end(from, to, true);
>  }
> -#endif
>  
>  static void *
>  test_migrate_xbzrle_start(QTestState *from,
> @@ -3537,7 +3537,10 @@ int main(int argc, char **argv)
>  #endif /* CONFIG_TASN1 */
>  #endif /* CONFIG_GNUTLS */
>  
> -    /* migration_test_add("/migration/ignore_shared", test_ignore_shared); */
> +    if (strcmp(arch, "aarch64") != 0) { /* Currently upset on aarch64 TCG */
> +        migration_test_add("/migration/ignore_shared", test_ignore_shared);
> +    }
> +
>  #ifndef _WIN32
>      migration_test_add("/migration/precopy/fd/tcp",
>                         test_migrate_precopy_fd_socket);

Peter Xu May 27, 2024, 2:56 p.m. UTC | #2
On Mon, May 27, 2024 at 09:42:28AM -0300, Fabiano Rosas wrote:
> However, there is an issue here still on all archs - which might very
> well have been the original issue - which is the fact that the
> containers on the Gitlab CI have limits on shared memory usage.
> Unfortunately we cannot enable this test for the CI, so it needs a check
> on the GITLAB_CI environment variable.

Another option is we teach migration-test to detect whether memory_size of
shmem is available, skip if not.  It can be a sequence of:

  memfd_create()
  fallocate()
  ret = madvise(MADV_POPULATE_WRITE)

To be run at the entry of migration-test, and skip all use_shmem=true tests
if ret != 0, or any step failed above.
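
A rough sketch of that sequence, under a few assumptions: the helper name
shmem_available() is made up, an mmap() step is added because madvise()
works on a mapping rather than on the fd, and the fallback define covers
libc headers that predate MADV_POPULATE_WRITE (Linux 5.14). On older
kernels madvise() fails with EINVAL, which this sketch treats the same as
"not available":

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23 /* value from the Linux uapi headers */
#endif

/*
 * Hypothetical probe: return true if `size` bytes of shared memory can
 * be allocated and populated right now, false if any step fails.
 */
static bool shmem_available(size_t size)
{
    bool ok = false;
    void *p;
    int fd = memfd_create("migration-test-probe", MFD_CLOEXEC);

    if (fd < 0) {
        return false;
    }
    /* Reserve the space; fails with EINVAL for size 0, ENOSPC if full. */
    if (fallocate(fd, 0, 0, (off_t)size) != 0) {
        goto out;
    }
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        goto out;
    }
    /* Fault every page in writable; an error means the memory is not there. */
    ok = madvise(p, size, MADV_POPULATE_WRITE) == 0;
    munmap(p, size);
out:
    close(fd);
    return ok;
}
```

One caveat, as far as I can tell: a memfd is not backed by the /dev/shm
mount, so a size limit set on that mount may not show up in this probe.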

Thanks,

Fabiano Rosas May 27, 2024, 3:11 p.m. UTC | #3
Peter Xu <peterx@redhat.com> writes:

> On Mon, May 27, 2024 at 09:42:28AM -0300, Fabiano Rosas wrote:
>> However, there is an issue here still on all archs - which might very
>> well have been the original issue - which is the fact that the
>> containers on the Gitlab CI have limits on shared memory usage.
>> Unfortunately we cannot enable this test for the CI, so it needs a check
>> on the GITLAB_CI environment variable.
>
> Another option is we teach migration-test to detect whether memory_size of
> shmem is available, skip if not.  It can be a sequence of:
>
>   memfd_create()
>   fallocate()
>   ret = madvise(MADV_POPULATE_WRITE)
>
> To be run at the entry of migration-test, and skip all use_shmem=true tests
> if ret != 0, or any step failed above.

There are actually two issues:

1) Trying to run a test that needs more shmem than available in the
container. This is covered well by your suggestion.

2) Trying to use some shmem while another test has already consumed all
shmem. I'm not sure if this can be done reliably as the tests run in
parallel.

Peter Xu May 27, 2024, 3:41 p.m. UTC | #4
On Mon, May 27, 2024 at 12:11:45PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, May 27, 2024 at 09:42:28AM -0300, Fabiano Rosas wrote:
> >> However, there is an issue here still on all archs - which might very
> >> well have been the original issue - which is the fact that the
> >> containers on the Gitlab CI have limits on shared memory usage.
> >> Unfortunately we cannot enable this test for the CI, so it needs a check
> >> on the GITLAB_CI environment variable.
> >
> > Another option is we teach migration-test to detect whether memory_size of
> > shmem is available, skip if not.  It can be a sequence of:
> >
> >   memfd_create()
> >   fallocate()
> >   ret = madvise(MADV_POPULATE_WRITE)
> >
> > To be run at the entry of migration-test, and skip all use_shmem=true tests
> > if ret != 0, or any step failed above.
> 
> There are actually two issues:
> 
> 1) Trying to run a test that needs more shmem than available in the
> container. This is covered well by your suggestion.
> 
> 2) Trying to use some shmem while another test has already consumed all
> shmem. I'm not sure if this can be done reliably as the tests run in
> parallel.

Maybe we can also make that check per-test: when use_shmem=true, the test
populates the shmem file before using it and skips if population fails.
If population succeeds, using that file in that test should be reliable.

Patch

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 7987faaded..2bcdc33b7c 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1862,14 +1862,15 @@  static void test_precopy_unix_tls_x509_override_host(void)
 #endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
-#if 0
-/* Currently upset on aarch64 TCG */
 static void test_ignore_shared(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
     QTestState *from, *to;
+    MigrateStart args = {
+        .use_shmem = true,
+    };
 
-    if (test_migrate_start(&from, &to, uri, false, true, NULL, NULL)) {
+    if (test_migrate_start(&from, &to, uri, &args)) {
         return;
     }
 
@@ -1898,7 +1899,6 @@  static void test_ignore_shared(void)
 
     test_migrate_end(from, to, true);
 }
-#endif
 
 static void *
 test_migrate_xbzrle_start(QTestState *from,
@@ -3537,7 +3537,10 @@  int main(int argc, char **argv)
 #endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
-    /* migration_test_add("/migration/ignore_shared", test_ignore_shared); */
+    if (strcmp(arch, "aarch64") != 0) { /* Currently upset on aarch64 TCG */
+        migration_test_add("/migration/ignore_shared", test_ignore_shared);
+    }
+
 #ifndef _WIN32
     migration_test_add("/migration/precopy/fd/tcp",
                        test_migrate_precopy_fd_socket);