[RFC] gitlab: introduce s390x wasmtime job

Message ID	20220704224844.2903473-1-iii@linux.ibm.com
State	New
From	Ilya Leoshkevich <iii@linux.ibm.com>
To	Alex Bennée <alex.bennee@linaro.org>, Philippe Mathieu-Daudé <f4bug@amsat.org>, Thomas Huth <thuth@redhat.com>, Wainer dos Santos Moschetta <wainersm@redhat.com>, Beraldo Leal <bleal@redhat.com>, Cornelia Huck <cohuck@redhat.com>
Cc	qemu-devel@nongnu.org, qemu-s390x@nongnu.org, Christian Borntraeger <borntraeger@de.ibm.com>, Ulrich Weigand <ulrich.weigand@de.ibm.com>, Ilya Leoshkevich <iii@linux.ibm.com>
Date	Tue, 5 Jul 2022 00:48:44 +0200

Ilya Leoshkevich July 4, 2022, 10:48 p.m. UTC

wasmtime is a WebAssembly runtime, which includes a large testsuite.
This testsuite uses qemu-user (aarch64 and s390x are supported) in
order to exercise foreign architectures. Over time it found several
regressions in qemu itself, and it would be beneficial to catch
similar ones earlier.

To this end, this patch introduces a job that runs the stable wasmtime
testsuite against qemu-s390x. The job is split into the following
components:

- A script for running the tests. Usable on developers' machines:

    qemu$ mkdir build
    qemu$ cd build
    qemu/build$ ../tests/wasmtime/test s390x

- A script for building the tests (build-toolchain.sh).

- A dockerfile describing an image with the prebuilt testsuite
  (debian-s390x-wasmtime-cross.docker).

- gitlab job definition for building the image.

- gitlab job definition for using the image to run the tests.

It's possible to use this with aarch64 as well, but it segfaults at
the moment, therefore this patch does not provide job definitions for
it. This needs to be investigated separately.

An example of the resulting pipeline can be seen at [1].

The test job runs for about 30 minutes mostly due to unnecessary
rebuilds. They will be gone once [2] is integrated and makes it to a
stable release.

This patch depends on madvise(MADV_DONTNEED) passthrough support [3].

[1] https://gitlab.com/iii-i/qemu/-/pipelines/579677396
[2] https://github.com/bytecodealliance/wasmtime/pull/4377
[3] https://lists.gnu.org/archive/html/qemu-devel/2022-07/msg00112.html

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 .gitlab-ci.d/container-cross.yml              | 10 +++
 .gitlab-ci.d/container-template.yml           |  2 +-
 .gitlab-ci.d/qemu-project.yml                 |  1 +
 .gitlab-ci.d/wasmtime-template.yml            |  6 ++
 .gitlab-ci.d/wasmtime.yml                     |  9 ++
 tests/docker/Makefile.include                 |  6 ++
 .../build-toolchain.sh                        | 83 +++++++++++++++++++
 .../debian-s390x-wasmtime-cross.docker        | 16 ++++
 tests/wasmtime/test                           | 39 +++++++++
 9 files changed, 171 insertions(+), 1 deletion(-)
 create mode 100644 .gitlab-ci.d/wasmtime-template.yml
 create mode 100644 .gitlab-ci.d/wasmtime.yml
 create mode 100755 tests/docker/dockerfiles/debian-s390x-wasmtime-cross.d/build-toolchain.sh
 create mode 100644 tests/docker/dockerfiles/debian-s390x-wasmtime-cross.docker
 create mode 100755 tests/wasmtime/test

Daniel P. Berrangé July 5, 2022, 12:58 p.m. UTC | #1

On Tue, Jul 05, 2022 at 12:48:44AM +0200, Ilya Leoshkevich wrote:
> wasmtime is a WebAssembly runtime, which includes a large testsuite.
> This testsuite uses qemu-user (aarch64 and s390x are supported) in
> order to exercise foreign architectures.

So you're saying that WebAssembly itself is aware of qemu-user
in its test suite, as opposed to us simply choosing to run its
test suite under qemu-user ?

Any idea why it's limited to just two arches ?  Can it be made
to cover all QEMU arches, or are these the only ones that
wasmtime knows how to generate code for ?

>                                          Over time it found several
> regressions in qemu itself, and it would be beneficial to catch the
> similar ones earlier.

If we put this job in QEMU CI someone will have to be able to
interpret the results when it fails.

How practical is it going to be for QEMU maintainers to understand
a failure in wasmtime test suite, and correlate that back to a
problem in QEMU ? The risk with introducing any significant 3rd
party project to a CI system, is the lack of knowledge around
that external project creating a significant burden for the CI
system maintainers.

> To this end, this patch introduces a job that runs stable wasmtime
> testsuite against qemu-s390x. The job is split into the following
> components:
> 
> - A script for running the tests. Usable on developers' machines:
> 
>     qemu$ mkdir build
>     qemu$ cd build
>     qemu/build$ ../tests/wasmtime/test s390x
> 
> - A script for building the tests (build-toolchain.sh).
> 
> - A dockerfile describing an image with the prebuilt testsuite
>   (debian-s390x-wasmtime-cross.docker).
> 
> - gitlab job definition for building the image.
> 
> - gitlab job definition for using the image to run the tests.
> 
> It's possible to use this with aarch64 as well, but it segfaults at
> the moment, therefore this patch does not provide job definitions for
> it. This needs to be investigated separately.
> 
> The example of a resulting pipeline can be seen at [1].
> 
> The test job runs for about 30 minutes mostly due to unnecessary
> rebuilds. They will be gone once [2] is integrated and makes it to a
> stable release.
> 
> This patch depends on madvise(MADV_DONTNEED) passthrough support [3].
> 
> [1] https://gitlab.com/iii-i/qemu/-/pipelines/579677396
> [2] https://github.com/bytecodealliance/wasmtime/pull/4377
> [3] https://lists.gnu.org/archive/html/qemu-devel/2022-07/msg00112.html
> 
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  .gitlab-ci.d/container-cross.yml              | 10 +++
>  .gitlab-ci.d/container-template.yml           |  2 +-
>  .gitlab-ci.d/qemu-project.yml                 |  1 +
>  .gitlab-ci.d/wasmtime-template.yml            |  6 ++
>  .gitlab-ci.d/wasmtime.yml                     |  9 ++
>  tests/docker/Makefile.include                 |  6 ++
>  .../build-toolchain.sh                        | 83 +++++++++++++++++++
>  .../debian-s390x-wasmtime-cross.docker        | 16 ++++
>  tests/wasmtime/test                           | 39 +++++++++
>  9 files changed, 171 insertions(+), 1 deletion(-)
>  create mode 100644 .gitlab-ci.d/wasmtime-template.yml
>  create mode 100644 .gitlab-ci.d/wasmtime.yml
>  create mode 100755 tests/docker/dockerfiles/debian-s390x-wasmtime-cross.d/build-toolchain.sh
>  create mode 100644 tests/docker/dockerfiles/debian-s390x-wasmtime-cross.docker
>  create mode 100755 tests/wasmtime/test
> 
> diff --git a/.gitlab-ci.d/container-cross.yml b/.gitlab-ci.d/container-cross.yml
> index b7963498a3..b3c4b76a16 100644
> --- a/.gitlab-ci.d/container-cross.yml
> +++ b/.gitlab-ci.d/container-cross.yml
> @@ -138,6 +138,16 @@ s390x-debian-cross-container:
>    variables:
>      NAME: debian-s390x-cross
>  
> +s390x-debian-wasmtime-cross-container:
> +  extends: .container_job_template
> +  stage: containers
> +  needs: ['s390x-debian-cross-container']

This job is in 'containers' stage, but it depends on a container
built in the same stage. Seems gitlab manages to schedule that
by effectively delaying the build. Suggests that actually we
can entirely get rid of 'containers-layer2' and just let gitlab
figure out the container dependencies.

> +  variables:
> +    NAME: debian-s390x-wasmtime-cross
> +    DOCKER_SCRIPT_ARGS: >
> +      --extra-files
> +      tests/docker/dockerfiles/debian-s390x-wasmtime-cross.d/build-toolchain.sh


> diff --git a/tests/docker/dockerfiles/debian-s390x-wasmtime-cross.d/build-toolchain.sh b/tests/docker/dockerfiles/debian-s390x-wasmtime-cross.d/build-toolchain.sh
> new file mode 100755
> index 0000000000..a28d61a353
> --- /dev/null
> +++ b/tests/docker/dockerfiles/debian-s390x-wasmtime-cross.d/build-toolchain.sh
> @@ -0,0 +1,83 @@
> +#!/bin/sh
> +
> +# Build the stable wasmtime testsuite and run it with qemu-user from $PATH.
> +# ".rustup", ".cargo" and "wasmtime" subdirectories will be created or updated
> +# in the current directory.
> +#
> +# Based on https://github.com/bytecodealliance/wasmtime/blob/v0.37.0/.github/workflows/main.yml#L208.
> +#
> +# Usage:
> +#
> +#     ./test TARGET_ARCH [CARGO_ARGS ...]
> +#
> +# where TARGET_ARCH is the architecture to test (aarch64 or s390x) and
> +# CARGO_ARGS are the extra arguments passed to cargo test.
> +
> +set -e -u -x
> +
> +# Dependency versions.
> +export RUSTUP_TOOLCHAIN=1.62.0
> +
> +# Bump when https://github.com/bytecodealliance/wasmtime/pull/4377 is
> +# integrated. Until this moment there will be some unnecessary rebuilds.
> +wasmtime_version=0.37.0
> +
> +# Script arguments.
> +arch=$1
> +shift
> +arch_upper=$(echo "$arch" | tr '[:lower:]' '[:upper:]')
> +
> +# Install/update Rust.
> +export RUSTUP_HOME="$PWD/.rustup"
> +export CARGO_HOME="$PWD/.cargo"
> +curl \
> +    --proto '=https' \
> +    --tlsv1.2 \
> +    -sSf \
> +    https://sh.rustup.rs \
> +    | sh -s -- -y \
> +        --default-toolchain="$RUSTUP_TOOLCHAIN" \
> +        --target=wasm32-wasi \
> +        --target=wasm32-unknown-unknown \
> +        --target="$arch"-unknown-linux-gnu

Why can't we just install the distros' rust packages ?


> +cat >"$CARGO_HOME/config" <<HERE
> +[build]
> +# Save space by not generating data to speed-up delta builds.
> +incremental = false
> +
> +[profile.test]
> +# Save space by not generating debug information.
> +debug = 0
> +
> +[net]
> +# Speed up crates.io index update.
> +git-fetch-with-cli = true
> +HERE
> +. "$PWD/.cargo/env"
> +
> +# Checkout/update wasmtime.
> +if [ -d wasmtime ]; then
> +    cd wasmtime
> +    git fetch --force --tags
> +    git checkout v"$wasmtime_version"
> +    git submodule update --init --recursive
> +else
> +    git clone \
> +        --depth=1 \
> +        --recurse-submodules \
> +        --shallow-submodules \
> +        -b v"$wasmtime_version" \
> +        https://github.com/bytecodealliance/wasmtime.git
> +    cd wasmtime
> +fi
> +
> +# Run wasmtime tests.
> +export CARGO_BUILD_TARGET="$arch-unknown-linux-gnu"
> +runner_var=CARGO_TARGET_${arch_upper}_UNKNOWN_LINUX_GNU_RUNNER
> +linker_var=CARGO_TARGET_${arch_upper}_UNKNOWN_LINUX_GNU_LINKER
> +eval "export $runner_var=\"qemu-$arch -L /usr/$arch-linux-gnu\""
> +eval "export $linker_var=$arch-linux-gnu-gcc"
> +export CARGO_PROFILE_DEV_OPT_LEVEL=2
> +export WASMTIME_TEST_NO_HOG_MEMORY=1
> +export RUST_BACKTRACE=1
> +ci/run-tests.sh --locked "$@"

This build-toolchain.sh script is invoked during the dockerfile
build stage, but it appears you're running the test suite here.
Shouldn't this be left until the CI build job instead ?

> diff --git a/tests/docker/dockerfiles/debian-s390x-wasmtime-cross.docker b/tests/docker/dockerfiles/debian-s390x-wasmtime-cross.docker
> new file mode 100644
> index 0000000000..d08a66dcc2
> --- /dev/null
> +++ b/tests/docker/dockerfiles/debian-s390x-wasmtime-cross.docker
> @@ -0,0 +1,16 @@
> +# Image containing pre-built wasmtime tests for s390x.
> +
> +FROM registry.gitlab.com/qemu-project/qemu/qemu/debian-s390x-cross:latest
> +
> +RUN export DEBIAN_FRONTEND=noninteractive && \
> +    eatmydata apt-get update && \
> +    eatmydata apt-get dist-upgrade -y && \
> +    eatmydata apt-get install --no-install-recommends -y \
> +            curl \
> +            libglib2.0-dev && \
> +    eatmydata apt-get autoremove -y && \
> +    eatmydata apt-get autoclean -y
> +
> +RUN mkdir /build
> +ADD build-toolchain.sh /build
> +RUN cd /build && ./build-toolchain.sh s390x --no-run


Is this '--no-run' arg used by  ci/run-tests.sh in some way ?


> diff --git a/tests/wasmtime/test b/tests/wasmtime/test
> new file mode 100755
> index 0000000000..10e2c3f886
> --- /dev/null
> +++ b/tests/wasmtime/test
> @@ -0,0 +1,39 @@
> +#!/bin/sh
> +
> +# Build qemu-user in the current directory, build the stable wasmtime
> +# testsuite, and test them together. ".rustup", ".cargo" and "wasmtime"
> +# subdirectories, as well as qemu build files, will be created or updated in
> +# the current directory.
> +#
> +# Based on https://github.com/bytecodealliance/wasmtime/blob/v0.37.0/.github/workflows/main.yml#L208.
> +#
> +# Usage:
> +#
> +#     ./test TARGET_ARCH [CARGO_ARGS ...]
> +#
> +# where TARGET_ARCH is the architecture to test (aarch64 or s390x) and
> +# CARGO_ARGS are the extra arguments passed to cargo test.
> +
> +set -e -u -x
> +
> +# Script arguments.
> +arch=$1
> +shift
> +
> +# Build QEMU.
> +srcdir=$(cd "$(dirname "$0")" && pwd)/../..
> +docker_files_dir="$srcdir"/tests/docker/dockerfiles
> +"$srcdir"/configure \
> +    --target-list="$arch"-linux-user \
> +    --disable-tools \
> +    --disable-slirp \
> +    --disable-fdt \
> +    --disable-capstone \
> +    --disable-docs
> +make --output-sync -j"$(nproc)"
> +export PATH="$PWD:$PATH"
> +
> +# Run wasmtime tests.
> +exec \
> +    "$docker_files_dir"/debian-s390x-wasmtime-cross.d/build-toolchain.sh \
> +    "$arch" "$@"
> -- 
> 2.35.3
> 
> 

With regards,
Daniel

Peter Maydell July 5, 2022, 1:57 p.m. UTC | #2

On Tue, 5 Jul 2022 at 14:04, Daniel P. Berrangé <berrange@redhat.com> wrote:
> If we put this job in QEMU CI someone will have to be able to
> interpret the results when it fails.

In particular since this is qemu-user, the answer is probably
at least some of the time going to be "oh, well, qemu-user isn't reliable
if you do complicated things in the guest". I'd be pretty wary of our having
a "pass a big complicated guest code test suite under linux-user mode"
in the CI path.

-- PMM

Ilya Leoshkevich July 5, 2022, 2:13 p.m. UTC | #3

On Tue, 2022-07-05 at 13:58 +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 05, 2022 at 12:48:44AM +0200, Ilya Leoshkevich wrote:
> > wasmtime is a WebAssembly runtime, which includes a large
> > testsuite.
> > This testsuite uses qemu-user (aarch64 and s390x are supported) in
> > order to exercise foreign architectures.
> 
> So you're saying that WebAssembly itself is aware of qemu-user
> in its test suite, as opposed to us simply choosing to run its
> test suite under qemu-user ?

The wasmtime test suite can be configured to run under qemu-user using
cargo environment variables:

https://doc.rust-lang.org/nightly/cargo/reference/config.html#environment-variables

In this patch build-toolchain.sh sets
CARGO_TARGET_S390X_UNKNOWN_LINUX_GNU_RUNNER.
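
As a minimal sketch of how that variable name comes about: cargo reads the per-target `runner` setting from an environment variable named after the upper-cased target triple, with dashes replaced by underscores. The helper name `cargo_target_var` below is made up for this illustration and is not part of the patch; the `-L` sysroot path is the one the patch uses.

```shell
#!/bin/sh
# Illustration only: cargo_target_var is a hypothetical helper, not part of
# the patch. It builds the name of cargo's per-target environment variable
# for an <arch>-unknown-linux-gnu triple.
cargo_target_var() {
    # $1: architecture (e.g. s390x), $2: setting (e.g. runner or linker)
    printf 'CARGO_TARGET_%s_UNKNOWN_LINUX_GNU_%s\n' \
        "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" \
        "$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')"
}

arch=s390x
runner_var=$(cargo_target_var "$arch" runner)

# Passing NAME=value as a single quoted argument to export avoids eval.
export "$runner_var=qemu-$arch -L /usr/$arch-linux-gnu"

# Show what cargo would see in its environment.
printenv "$runner_var"
```

With this variable set, `cargo test` for that target starts each test binary through the given command instead of executing it directly.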


> Any idea why it's limited to just two arches ?  Can it be made
> to cover all QEMU arches, or are these the only ones that
> wasmtime knows how to generate code for ?

I believe their code generator supports only aarch64, s390x and x64 at
the moment:

https://github.com/bytecodealliance/wasmtime/tree/v0.37.0/cranelift/codegen/src/isa

> >                                          Over time it found several
> > regressions in qemu itself, and it would be beneficial to catch the
> > similar ones earlier.
> 
> If we put this job in QEMU CI someone will have to be able to
> interpret the results when it fails.
> 
> How practical is it going to be for QEMU maintainers to understand
> a failure in wasmtime test suite, and correlate that back to a
> problem in QEMU ? The risk with introducing any significant 3rd
> party project to a CI system, is the lack of knowledge around
> that external project creating a significant burden for the CI
> system maintainers.

The following is my limited personal experience with the test suite.

While it is quite large, individual test cases tend to be small and
exercise only a single feature. Therefore, while looking into failures
does indeed require some Rust knowledge, normally one doesn't have to
understand the entire wasmtime code base. Also this means that all
kinds of traces that QEMU can produce for a single test have a
reasonable size.

In addition, bisect is quite a powerful tool here; that's why I tried
to move as much logic as possible from gitlab and dockerfile
definitions to stand-alone scripts. This makes it possible to just use
`git bisect run tests/wasmtime/test s390x`.
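
One possible refinement, sketched under the assumption that the build and the test run are available as two separate commands (in the patch they live in one script): `git bisect run` treats exit status 0 as good, 125 as "skip this commit", and any other status from 1 to 127 as bad. The helper name `bisect_step` is hypothetical.

```shell
#!/bin/sh
# Illustration only: bisect_step is a hypothetical helper, not part of the
# patch. It maps "build failed" and "tests failed" to the exit statuses
# that git bisect run understands.
bisect_step() {
    build_cmd=$1
    test_cmd=$2
    # A commit that does not build says nothing about the regression: skip.
    sh -c "$build_cmd" || return 125
    # A failing test run marks the commit as bad.
    sh -c "$test_cmd" || return 1
    # Build and tests both succeeded: the commit is good.
    return 0
}
```

A wrapper script that ends with `bisect_step "$build" "$tests"` could then be passed to `git bisect run`, so that commits which fail to configure or compile are skipped rather than blamed.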

The gitlab jobs build the testsuite without debuginfo due to time and
space restrictions on the gitlab builders. Just for some context, the
gitlab docker build runs on 1 CPU and takes ~40 minutes and 5G space.
When I do a local build with debuginfo, on a 4-core i3 it takes ~15
minutes and 20G space.

> > To this end, this patch introduces a job that runs stable wasmtime
> > testsuite against qemu-s390x. The job is split into the following
> > components:
> > 
> > - A script for running the tests. Usable on developers' machines:
> > 
> >     qemu$ mkdir build
> >     qemu$ cd build
> >     qemu/build$ ../tests/wasmtime/test s390x
> > 
> > - A script for building the tests (build-toolchain.sh).
> > 
> > - A dockerfile describing an image with the prebuilt testsuite
> >   (debian-s390x-wasmtime-cross.docker).
> > 
> > - gitlab job definition for building the image.
> > 
> > - gitlab job definition for using the image to run the tests.
> > 
> > It's possible to use this with aarch64 as well, but it segfaults at
> > the moment, therefore this patch does not provide job definitions
> > for
> > it. This needs to be investigated separately.
> > 
> > The example of a resulting pipeline can be seen at [1].
> > 
> > The test job runs for about 30 minutes mostly due to unnecessary
> > rebuilds. They will be gone once [2] is integrated and makes it to
> > a
> > stable release.
> > 
> > This patch depends on madvise(MADV_DONTNEED) passthrough support
> > [3].
> > 
> > [1] https://gitlab.com/iii-i/qemu/-/pipelines/579677396
> > [2] https://github.com/bytecodealliance/wasmtime/pull/4377
> > [3]
> > https://lists.gnu.org/archive/html/qemu-devel/2022-07/msg00112.html
> > 
> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > ---
> >  .gitlab-ci.d/container-cross.yml              | 10 +++
> >  .gitlab-ci.d/container-template.yml           |  2 +-
> >  .gitlab-ci.d/qemu-project.yml                 |  1 +
> >  .gitlab-ci.d/wasmtime-template.yml            |  6 ++
> >  .gitlab-ci.d/wasmtime.yml                     |  9 ++
> >  tests/docker/Makefile.include                 |  6 ++
> >  .../build-toolchain.sh                        | 83
> > +++++++++++++++++++
> >  .../debian-s390x-wasmtime-cross.docker        | 16 ++++
> >  tests/wasmtime/test                           | 39 +++++++++
> >  9 files changed, 171 insertions(+), 1 deletion(-)
> >  create mode 100644 .gitlab-ci.d/wasmtime-template.yml
> >  create mode 100644 .gitlab-ci.d/wasmtime.yml
> >  create mode 100755 tests/docker/dockerfiles/debian-s390x-wasmtime-
> > cross.d/build-toolchain.sh
> >  create mode 100644 tests/docker/dockerfiles/debian-s390x-wasmtime-
> > cross.docker
> >  create mode 100755 tests/wasmtime/test
> > 
> > diff --git a/.gitlab-ci.d/container-cross.yml b/.gitlab-
> > ci.d/container-cross.yml
> > index b7963498a3..b3c4b76a16 100644
> > --- a/.gitlab-ci.d/container-cross.yml
> > +++ b/.gitlab-ci.d/container-cross.yml
> > @@ -138,6 +138,16 @@ s390x-debian-cross-container:
> >    variables:
> >      NAME: debian-s390x-cross
> >  
> > +s390x-debian-wasmtime-cross-container:
> > +  extends: .container_job_template
> > +  stage: containers
> > +  needs: ['s390x-debian-cross-container']
> 
> This job is in 'containers' stage, but it depends on a container
> built in the same stage. Seems gitlab manages to schedule that
> by effectively delaying the build. Suggests that actually we
> can entirely get rid of 'containers-layer2' and just let gitlab
> figure out the container dependencies.

Yes, that sounds reasonable. Using dependencies instead of (or in
addition to) stages for ordering jobs seems to be officially supported
by gitlab:

https://docs.gitlab.com/ee/ci/directed_acyclic_graph/index.html

> 
> > +  variables:
> > +    NAME: debian-s390x-wasmtime-cross
> > +    DOCKER_SCRIPT_ARGS: >
> > +      --extra-files
> > +      tests/docker/dockerfiles/debian-s390x-wasmtime-
> > cross.d/build-toolchain.sh
> 
> 
> > diff --git a/tests/docker/dockerfiles/debian-s390x-wasmtime-
> > cross.d/build-toolchain.sh b/tests/docker/dockerfiles/debian-s390x-
> > wasmtime-cross.d/build-toolchain.sh
> > new file mode 100755
> > index 0000000000..a28d61a353
> > --- /dev/null
> > +++ b/tests/docker/dockerfiles/debian-s390x-wasmtime-cross.d/build-
> > toolchain.sh
> > @@ -0,0 +1,83 @@
> > +#!/bin/sh
> > +
> > +# Build the stable wasmtime testsuite and run it with qemu-user
> > from $PATH.
> > +# ".rustup", ".cargo" and "wasmtime" subdirectories will be
> > created or updated
> > +# in the current directory.
> > +#
> > +# Based on
> > https://github.com/bytecodealliance/wasmtime/blob/v0.37.0/.github/workflows/main.yml#L208
> > .
> > +#
> > +# Usage:
> > +#
> > +#     ./test TARGET_ARCH [CARGO_ARGS ...]
> > +#
> > +# where TARGET_ARCH is the architecture to test (aarch64 or s390x)
> > and
> > +# CARGO_ARGS are the extra arguments passed to cargo test.
> > +
> > +set -e -u -x
> > +
> > +# Dependency versions.
> > +export RUSTUP_TOOLCHAIN=1.62.0
> > +
> > +# Bump when
> > https://github.com/bytecodealliance/wasmtime/pull/4377 is
> > +# integrated. Until this moment there will be some unnecessary
> > rebuilds.
> > +wasmtime_version=0.37.0
> > +
> > +# Script arguments.
> > +arch=$1
> > +shift
> > +arch_upper=$(echo "$arch" | tr '[:lower:]' '[:upper:]')
> > +
> > +# Install/update Rust.
> > +export RUSTUP_HOME="$PWD/.rustup"
> > +export CARGO_HOME="$PWD/.cargo"
> > +curl \
> > +    --proto '=https' \
> > +    --tlsv1.2 \
> > +    -sSf \
> > +    https://sh.rustup.rs \
> > +    | sh -s -- -y \
> > +        --default-toolchain="$RUSTUP_TOOLCHAIN" \
> > +        --target=wasm32-wasi \
> > +        --target=wasm32-unknown-unknown \
> > +        --target="$arch"-unknown-linux-gnu
> 
> Why can't we just install the distros' rust packages ?

Unfortunately they are too old. Debian 11's rust toolchain does not
support the 2021 edition required by wasmtime.

> > +cat >"$CARGO_HOME/config" <<HERE
> > +[build]
> > +# Save space by not generating data to speed-up delta builds.
> > +incremental = false
> > +
> > +[profile.test]
> > +# Save space by not generating debug information.
> > +debug = 0
> > +
> > +[net]
> > +# Speed up crates.io index update.
> > +git-fetch-with-cli = true
> > +HERE
> > +. "$PWD/.cargo/env"
> > +
> > +# Checkout/update wasmtime.
> > +if [ -d wasmtime ]; then
> > +    cd wasmtime
> > +    git fetch --force --tags
> > +    git checkout v"$wasmtime_version"
> > +    git submodule update --init --recursive
> > +else
> > +    git clone \
> > +        --depth=1 \
> > +        --recurse-submodules \
> > +        --shallow-submodules \
> > +        -b v"$wasmtime_version" \
> > +        https://github.com/bytecodealliance/wasmtime.git
> > +    cd wasmtime
> > +fi
> > +
> > +# Run wasmtime tests.
> > +export CARGO_BUILD_TARGET="$arch-unknown-linux-gnu"
> > +runner_var=CARGO_TARGET_${arch_upper}_UNKNOWN_LINUX_GNU_RUNNER
> > +linker_var=CARGO_TARGET_${arch_upper}_UNKNOWN_LINUX_GNU_LINKER
> > +eval "export $runner_var=\"qemu-$arch -L /usr/$arch-linux-gnu\""
> > +eval "export $linker_var=$arch-linux-gnu-gcc"
> > +export CARGO_PROFILE_DEV_OPT_LEVEL=2
> > +export WASMTIME_TEST_NO_HOG_MEMORY=1
> > +export RUST_BACKTRACE=1
> > +ci/run-tests.sh --locked "$@"
> 
> This build-toolchain.sh script is invoked during the dockerfile
> build stage, but it appears you're running the test suite here.
> Shouldn't this be left until the CI build job instead ?

The name of this script is misleading. I chose it because it fit into
the existing make/docker framework, but it would be better to adjust
the framework to support arbitrary names. The appropriate name here
would be something like `wasmtime-cargo-test-wrapper`, since in the
end ci/run-tests.sh just calls `cargo test`.

During the dockerfile build stage it's called with `--no-run`, so it
just compiles the test binaries.

> > diff --git a/tests/docker/dockerfiles/debian-s390x-wasmtime-
> > cross.docker b/tests/docker/dockerfiles/debian-s390x-wasmtime-
> > cross.docker
> > new file mode 100644
> > index 0000000000..d08a66dcc2
> > --- /dev/null
> > +++ b/tests/docker/dockerfiles/debian-s390x-wasmtime-cross.docker
> > @@ -0,0 +1,16 @@
> > +# Image containing pre-built wasmtime tests for s390x.
> > +
> > +FROM registry.gitlab.com/qemu-project/qemu/qemu/debian-s390x-
> > cross:latest
> > +
> > +RUN export DEBIAN_FRONTEND=noninteractive && \
> > +    eatmydata apt-get update && \
> > +    eatmydata apt-get dist-upgrade -y && \
> > +    eatmydata apt-get install --no-install-recommends -y \
> > +            curl \
> > +            libglib2.0-dev && \
> > +    eatmydata apt-get autoremove -y && \
> > +    eatmydata apt-get autoclean -y
> > +
> > +RUN mkdir /build
> > +ADD build-toolchain.sh /build
> > +RUN cd /build && ./build-toolchain.sh s390x --no-run
> 
> 
> Is this '--no-run' arg used by  ci/run-tests.sh in some way ?

Yes, see my answer above.

> > diff --git a/tests/wasmtime/test b/tests/wasmtime/test
> > new file mode 100755
> > index 0000000000..10e2c3f886
> > --- /dev/null
> > +++ b/tests/wasmtime/test
> > @@ -0,0 +1,39 @@
> > +#!/bin/sh
> > +
> > +# Build qemu-user in the current directory, build the stable
> > wasmtime
> > +# testsuite, and test them together. ".rustup", ".cargo" and
> > "wasmtime"
> > +# subdirectories, as well as qemu build files, will be created or
> > updated in
> > +# the current directory.
> > +#
> > +# Based on
> > https://github.com/bytecodealliance/wasmtime/blob/v0.37.0/.github/workflows/main.yml#L208
> > .
> > +#
> > +# Usage:
> > +#
> > +#     ./test TARGET_ARCH [CARGO_ARGS ...]
> > +#
> > +# where TARGET_ARCH is the architecture to test (aarch64 or s390x)
> > and
> > +# CARGO_ARGS are the extra arguments passed to cargo test.
> > +
> > +set -e -u -x
> > +
> > +# Script arguments.
> > +arch=$1
> > +shift
> > +
> > +# Build QEMU.
> > +srcdir=$(cd "$(dirname "$0")" && pwd)/../..
> > +docker_files_dir="$srcdir"/tests/docker/dockerfiles
> > +"$srcdir"/configure \
> > +    --target-list="$arch"-linux-user \
> > +    --disable-tools \
> > +    --disable-slirp \
> > +    --disable-fdt \
> > +    --disable-capstone \
> > +    --disable-docs
> > +make --output-sync -j"$(nproc)"
> > +export PATH="$PWD:$PATH"
> > +
> > +# Run wasmtime tests.
> > +exec \
> > +    "$docker_files_dir"/debian-s390x-wasmtime-cross.d/build-
> > toolchain.sh \
> > +    "$arch" "$@"
> > -- 
> > 2.35.3
> > 
> > 
> 
> With regards,
> Daniel

Ilya Leoshkevich July 5, 2022, 2:37 p.m. UTC | #4

On Tue, 2022-07-05 at 14:57 +0100, Peter Maydell wrote:
> On Tue, 5 Jul 2022 at 14:04, Daniel P. Berrangé <berrange@redhat.com>
> wrote:
> > If we put this job in QEMU CI someone will have to be able to
> > interpret the results when it fails.
> 
> In particular since this is qemu-user, the answer is probably
> at least some of the time going to be "oh, well, qemu-user isn't
> reliable
> if you do complicated things in the guest". I'd be pretty wary of our
> having
> a "pass a big complicated guest code test suite under linux-user
> mode"
> in the CI path.
> 
> -- PMM

Actually exercising qemu-user is one of the goals here: just as an
example, one of the things that the test suite found was commit
9a12adc704f9 ("linux-user/s390x: Fix unwinding from signal handlers"),
so it's not only about the ISA.

At least for s390x, we've noticed that various projects use
qemu-user-based setups in their CI (either calling it explicitly, or
via binfmt-misc), and we would like these workflows to be reliable,
even if they try complicated (within reason) things there.

Best regards,
Ilya

Peter Maydell July 5, 2022, 2:40 p.m. UTC | #5

On Tue, 5 Jul 2022 at 15:37, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Tue, 2022-07-05 at 14:57 +0100, Peter Maydell wrote:
> > On Tue, 5 Jul 2022 at 14:04, Daniel P. Berrangé <berrange@redhat.com>
> > wrote:
> > > If we put this job in QEMU CI someone will have to be able to
> > > interpret the results when it fails.
> >
> > In particular since this is qemu-user, the answer is probably
> > at least some of the time going to be "oh, well, qemu-user isn't
> > reliable
> > if you do complicated things in the guest". I'd be pretty wary of our
> > having
> > a "pass a big complicated guest code test suite under linux-user
> > mode"
> > in the CI path.

> Actually exercising qemu-user is one of the goals here: just as an
> example, one of the things that the test suite found was commit
> 9a12adc704f9 ("linux-user/s390x: Fix unwinding from signal handlers"),
> so it's not only about the ISA.
>
> At least for s390x, we've noticed that various projects use
> qemu-user-based setups in their CI (either calling it explicitly, or
> via binfmt-misc), and we would like these workflows to be reliable,
> even if they try complicated (within reason) things there.

I also would like them to be reliable. But I don't think
*testing* these things is the difficulty: it is having
people who are willing to spend time on the often quite
difficult tasks of identifying why something intermittently
fails and doing the necessary design and implementation work
to correct the problem. Sometimes this is easy (as in the
s390 regression above) but quite often it is not (e.g. when
multiple threads are in use, or the guest wants to do
something complicated with clone(), etc).

thanks
-- PMM

Ilya Leoshkevich July 5, 2022, 8:41 p.m. UTC | #6

On Tue, 2022-07-05 at 15:40 +0100, Peter Maydell wrote:
> On Tue, 5 Jul 2022 at 15:37, Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > On Tue, 2022-07-05 at 14:57 +0100, Peter Maydell wrote:
> > > On Tue, 5 Jul 2022 at 14:04, Daniel P. Berrangé
> > > <berrange@redhat.com>
> > > wrote:
> > > > If we put this job in QEMU CI someone will have to be able to
> > > > interpret the results when it fails.
> > > 
> > > In particular since this is qemu-user, the answer is probably
> > > at least some of the time going to be "oh, well, qemu-user isn't
> > > reliable
> > > if you do complicated things in the guest". I'd be pretty wary of
> > > our
> > > having
> > > a "pass a big complicated guest code test suite under linux-user
> > > mode"
> > > in the CI path.
> 
> > Actually exercising qemu-user is one of the goals here: just as an
> > example, one of the things that the test suite found was commit
> > 9a12adc704f9 ("linux-user/s390x: Fix unwinding from signal
> > handlers"),
> > so it's not only about the ISA.
> > 
> > At least for s390x, we've noticed that various projects use
> > qemu-user-based setups in their CI (either calling it explicitly,
> > or
> > via binfmt-misc), and we would like these workflows to be reliable,
> > even if they try complicated (within reason) things there.
> 
> I also would like them to be reliable. But I don't think
> *testing* these things is the difficulty: it is having
> people who are willing to spend time on the often quite
> difficult tasks of identifying why something intermittently
> fails and doing the necessary design and implementation work
> to correct the problem. Sometimes this is easy (as in the
> s390 regression above) but quite often it is not (eg when
> multiple threads are in use, or the guest wants to do
> something complicated with clone(), etc).
> 
> thanks
> -- PMM
> 

For what it's worth, we can help analyzing and fixing failures detected
by the s390x wasmtime job. If something breaks, we will have to look at
it anyway, and it's better to do this sooner than later.

Best regards,
Ilya

Alex Bennée Dec. 16, 2022, 3:10 p.m. UTC | #7

Ilya Leoshkevich <iii@linux.ibm.com> writes:

> On Tue, 2022-07-05 at 15:40 +0100, Peter Maydell wrote:
>> On Tue, 5 Jul 2022 at 15:37, Ilya Leoshkevich <iii@linux.ibm.com>
>> wrote:
>> > 
>> > On Tue, 2022-07-05 at 14:57 +0100, Peter Maydell wrote:
>> > > On Tue, 5 Jul 2022 at 14:04, Daniel P. Berrangé
>> > > <berrange@redhat.com>
>> > > wrote:
>> > > > If we put this job in QEMU CI someone will have to be able to
>> > > > interpret the results when it fails.
>> > > 
>> > > In particular since this is qemu-user, the answer is probably
>> > > at least some of the time going to be "oh, well, qemu-user isn't
>> > > reliable
>> > > if you do complicated things in the guest". I'd be pretty wary of
>> > > our
>> > > having
>> > > a "pass a big complicated guest code test suite under linux-user
>> > > mode"
>> > > in the CI path.
>> 
>> > Actually exercising qemu-user is one of the goals here: just as an
>> > example, one of the things that the test suite found was commit
>> > 9a12adc704f9 ("linux-user/s390x: Fix unwinding from signal
>> > handlers"),
>> > so it's not only about the ISA.
>> > 
>> > At least for s390x, we've noticed that various projects use
>> > qemu-user-based setups in their CI (either calling it explicitly,
>> > or
>> > via binfmt-misc), and we would like these workflows to be reliable,
>> > even if they try complicated (within reason) things there.
>> 
>> I also would like them to be reliable. But I don't think
>> *testing* these things is the difficulty: it is having
>> people who are willing to spend time on the often quite
>> difficult tasks of identifying why something intermittently
>> fails and doing the necessary design and implementation work
>> to correct the problem. Sometimes this is easy (as in the
>> s390 regression above) but quite often it is not (eg when
>> multiple threads are in use, or the guest wants to do
>> something complicated with clone(), etc).
>> 
>> thanks
>> -- PMM
>> 
>
> For what it's worth, we can help analyzing and fixing failures detected
> by the s390x wasmtime job. If something breaks, we will have to look at
> it anyway, and it's better to do this sooner than later.

Sorry for necroing an old thread but I just wanted to add my 2p.

I think making 3rd party test suites easily available to developers is a worthy
goal and there are a number that I would like to see including LTP and
kvm-unit-tests. As others have pointed out I'm less sure about adding it
to the gating CI.

If we want to go forward with this we should probably think about how we
would approach this generally:

  - tests/third-party-suites/FOO?
  - should we use avocado as a wrapper or something else?
    - make check-?
  - ensuring the suites output tap for meson
  - document in docs/devel/testing.rst

Also I want to avoid adding stuff to tests/docker/dockerfiles that
aren't directly related to check-tcg and the cross builds. I want to
move away from docker.py so for 3rd party suites let's just call
docker/podman directly.

>
> Best regards,
> Ilya

Ilya Leoshkevich Dec. 19, 2022, 9:42 p.m. UTC | #8

On Fri, 2022-12-16 at 15:10 +0000, Alex Bennée wrote:
> 
> Ilya Leoshkevich <iii@linux.ibm.com> writes:
> 
> > On Tue, 2022-07-05 at 15:40 +0100, Peter Maydell wrote:
> > > On Tue, 5 Jul 2022 at 15:37, Ilya Leoshkevich <iii@linux.ibm.com>
> > > wrote:
> > > > 
> > > > On Tue, 2022-07-05 at 14:57 +0100, Peter Maydell wrote:
> > > > > On Tue, 5 Jul 2022 at 14:04, Daniel P. Berrangé
> > > > > <berrange@redhat.com>
> > > > > wrote:
> > > > > > If we put this job in QEMU CI someone will have to be able
> > > > > > to
> > > > > > interpret the results when it fails.
> > > > > 
> > > > > In particular since this is qemu-user, the answer is probably
> > > > > at least some of the time going to be "oh, well, qemu-user
> > > > > isn't
> > > > > reliable
> > > > > if you do complicated things in the guest". I'd be pretty
> > > > > wary of
> > > > > our
> > > > > having
> > > > > a "pass a big complicated guest code test suite under linux-
> > > > > user
> > > > > mode"
> > > > > in the CI path.
> > > 
> > > > Actually exercising qemu-user is one of the goals here: just as
> > > > an
> > > > example, one of the things that the test suite found was commit
> > > > 9a12adc704f9 ("linux-user/s390x: Fix unwinding from signal
> > > > handlers"),
> > > > so it's not only about the ISA.
> > > > 
> > > > At least for s390x, we've noticed that various projects use
> > > > qemu-user-based setups in their CI (either calling it
> > > > explicitly,
> > > > or
> > > > via binfmt-misc), and we would like these workflows to be
> > > > reliable,
> > > > even if they try complicated (within reason) things there.
> > > 
> > > I also would like them to be reliable. But I don't think
> > > *testing* these things is the difficulty: it is having
> > > people who are willing to spend time on the often quite
> > > difficult tasks of identifying why something intermittently
> > > fails and doing the necessary design and implementation work
> > > to correct the problem. Sometimes this is easy (as in the
> > > s390 regression above) but quite often it is not (eg when
> > > multiple threads are in use, or the guest wants to do
> > > something complicated with clone(), etc).
> > > 
> > > thanks
> > > -- PMM
> > > 
> > 
> > For what it's worth, we can help analyzing and fixing failures
> > detected
> > by the s390x wasmtime job. If something breaks, we will have to
> > look at
> > it anyway, and it's better to do this sooner than later.
> 
> Sorry for necroing an old thread but I just wanted to add my 2p.

Thanks for that though; I've been cherry-picking this patch into my
private trees for some time now, and would be happy to see it go
upstream in some form.

> I think making 3rd party test suites easily available to developers
> is a worthy
> goal and there are a number that I would like to see including LTP
> and
> kvm-unit-tests. As others have pointed out I'm less sure about adding
> it
> to the gating CI.

Another third-party test suite that I found useful was the valgrind's
one. I'll post my thoughts about integrating wasmtime's and valgrind's
test suites below, unfortunately I'm not too familiar with LTP and
kvm-unit-tests.

Not touching the gating CI is fine for me.

> If we want to go forward with this we should probably think about how
> we
> would approach this generally:
> 
>   - tests/third-party-suites/FOO?

Sounds good to me.

>   - should we use avocado as a wrapper or something else?
>     - make check-?

avocado sounds good; we might have to add a second wrapper for
producing tap output (see below).

One should definitely be able to specify the testsuite and the
architecture, e.g. `make check-third-party-wasmtime-s390x`.

In addition, we need to either hardcode or let the user choose
the way the testsuite is built and executed. I see 3 possibilities:

- Fully on the host. Easiest to implement, the results are also easy
  to debug. But this requires installing cross-toolchains manually,
  which is simple on some distros and not-so-simple on the others.

- Provide the toolchain as a Docker image. For wasmtime, the toolchain
  would include the Rust compiler and Cargo. This solves the problem
  with configuring the host, but introduces the next choice one has to
  make:

  - Build qemu on the host. Then qemu binary would have to be
    compatible with the container (e.g. no references to the latest
    greatest glibc functions).

    This is because the wasmtime testsuite needs to run inside the
    container: it's driven by Cargo, which is not available on the
    host. It is possible to only build tests with Cargo and then run
    the resulting binaries manually, but there is more than one and I'm
    not sure how to get a list of them (if we decide to do this, in the
    worst case the list can be hardcoded).

    For valgrind it's a bit easier, since the test runner is not as
    complex as Cargo, and can therefore just follow the check-tcg
    approach.

  - Build qemu inside the container. 2x space and time required, one
    might also have to install additional -dev packages for extra qemu
    features. Also, a decision needs to be made on whether the qemu
    build directory ends up in the container (needs a rebuild on every
    run), in a volume (volume lifetime needs to be managed) or in a
    mounted host directory (this can cause selinux/ownership issues if
    not done carefully).

- Provide both toolchain and testsuite as a Docker image. Essentially
  same as above, but trades build time for download time. Also the
  results are slightly harder to debug, since the test binaries are
  now located inside the container.

Sorry for the long list, it's just that since we are discussing how to
enable this for a larger audience, I felt I needed to enumerate all the
options and pitfalls I could think of.

>   - ensuring the suites output tap for meson

At the moment Rust can output either json like this:

$ cargo test -- -Z unstable-options --format=json
{ "type": "suite", "event": "started", "test_count": 1 }
{ "type": "test", "event": "started", "name": "test::hello" }
{ "type": "test", "name": "test::hello", "event": "ok" }
{ "type": "suite", "event": "ok", "passed": 1, "failed": 0, "ignored":
0, "measured": 0, "filtered_out": 0, "exec_time": 0.001460307 }

or xUnit like this:

$ cargo test -- -Z unstable-options --format=junit

# the following is on a single line; formatted for clarity

<?xml version="1.0" encoding="UTF-8"?>
<testsuites>
  <testsuite name="test" package="test" id="0" errors="0" failures="0"
tests="1" skipped="0">
    <testcase classname="integration" name="test::hello" time="0"/>
    <system-out/>
    <system-err/>
  </testsuite>
</testsuites>

I skimmed the avocado docs and couldn't find whether it can convert
between different test output formats. Based on the source code, we can
add an XUnitRunner the same way the TAPRunner was added.

In the worst case we can pipe json to a script that would output tap.
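
A minimal sketch of such a script, in Python. It assumes the JSON
events look like the sample above, one object per line, and that a
"test" event carries its result in the "event" field. The "ignored"
value and the function name json_to_tap are my assumptions for
illustration, not something defined by wasmtime or QEMU.

```python
import json


def json_to_tap(lines):
    """Return TAP lines for an iterable of JSON event lines."""
    results = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON event
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or wrapped lines
        if event.get("type") != "test":
            continue  # suite-level events carry no per-test result
        status = event.get("event")
        if status == "started":
            continue
        results.append((event.get("name", "unknown"), status))

    # The plan line comes first, since all results are buffered anyway.
    tap = ["1..%d" % len(results)]
    for number, (name, status) in enumerate(results, start=1):
        if status == "ok":
            tap.append("ok %d - %s" % (number, name))
        elif status == "ignored":  # assumed name for skipped tests
            tap.append("ok %d - %s # SKIP" % (number, name))
        else:
            tap.append("not ok %d - %s" % (number, name))
    return tap

# As a filter: print("\n".join(json_to_tap(sys.stdin))) after "import sys".
```

Buffering the results keeps the sketch short; a real wrapper would
probably want to stream the lines and print the plan at the end.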

Enhancing Rust is also an option, of course, even though this might
take some time.

>   - document in docs/devel/testing.rst

Right, we need this too; I totally ignored it in this patch.

> Also I want to avoid adding stuff to tests/docker/dockerfiles that
> aren't directly related to check-tcg and the cross builds. I want to
> move away from docker.py so for 3rd party suites let's just call
> docker/podman directly.

We could add the dockerfiles (if we decide we need them based on
the discussion above) to tests/third-party-suites/FOO. My question is,
would it be possible to build and publish the images on GitLab? Or
is it better to build them on developers' machines?

> > Best regards,
> > Ilya

Alex Bennée Dec. 19, 2022, 10:18 p.m. UTC | #9

Ilya Leoshkevich <iii@linux.ibm.com> writes:

> On Fri, 2022-12-16 at 15:10 +0000, Alex Bennée wrote:
>> 
>> Ilya Leoshkevich <iii@linux.ibm.com> writes:
>> 
>> > On Tue, 2022-07-05 at 15:40 +0100, Peter Maydell wrote:
>> > > On Tue, 5 Jul 2022 at 15:37, Ilya Leoshkevich <iii@linux.ibm.com>
>> > > wrote:
>> > > > 
>> > > > On Tue, 2022-07-05 at 14:57 +0100, Peter Maydell wrote:
>> > > > > On Tue, 5 Jul 2022 at 14:04, Daniel P. Berrangé
>> > > > > <berrange@redhat.com>
>> > > > > wrote:
>> > > > > > If we put this job in QEMU CI someone will have to be able
>> > > > > > to
>> > > > > > interpret the results when it fails.
>> > > > > 
>> > > > > In particular since this is qemu-user, the answer is probably
>> > > > > at least some of the time going to be "oh, well, qemu-user
>> > > > > isn't
>> > > > > reliable
>> > > > > if you do complicated things in the guest". I'd be pretty
>> > > > > wary of
>> > > > > our
>> > > > > having
>> > > > > a "pass a big complicated guest code test suite under linux-
>> > > > > user
>> > > > > mode"
>> > > > > in the CI path.
>> > > 
>> > > > Actually exercising qemu-user is one of the goals here: just as
>> > > > an
>> > > > example, one of the things that the test suite found was commit
>> > > > 9a12adc704f9 ("linux-user/s390x: Fix unwinding from signal
>> > > > handlers"),
>> > > > so it's not only about the ISA.
>> > > > 
>> > > > At least for s390x, we've noticed that various projects use
>> > > > qemu-user-based setups in their CI (either calling it
>> > > > explicitly,
>> > > > or
>> > > > via binfmt-misc), and we would like these workflows to be
>> > > > reliable,
>> > > > even if they try complicated (within reason) things there.
>> > > 
>> > > I also would like them to be reliable. But I don't think
>> > > *testing* these things is the difficulty: it is having
>> > > people who are willing to spend time on the often quite
>> > > difficult tasks of identifying why something intermittently
>> > > fails and doing the necessary design and implementation work
>> > > to correct the problem. Sometimes this is easy (as in the
>> > > s390 regression above) but quite often it is not (eg when
>> > > multiple threads are in use, or the guest wants to do
>> > > something complicated with clone(), etc).
>> > > 
>> > > thanks
>> > > -- PMM
>> > > 
>> > 
>> > For what it's worth, we can help analyzing and fixing failures
>> > detected
>> > by the s390x wasmtime job. If something breaks, we will have to
>> > look at
>> > it anyway, and it's better to do this sooner than later.
>> 
>> Sorry for necroing an old thread but I just wanted to add my 2p.
>
> Thanks for that though; I've been cherry-picking this patch into my
> private trees for some time now, and would be happy to see it go
> upstream in some form.
>
>> I think making 3rd party test suites easily available to developers
>> is a worthy
>> goal and there are a number that I would like to see including LTP
>> and
>> kvm-unit-tests. As others have pointed out I'm less sure about adding
>> it
>> to the gating CI.
>
> Another third-party test suite that I found useful was the valgrind's
> one. I'll post my thoughts about integrating wasmtime's and valgrind's
> test suites below, unfortunately I'm not too familiar with LTP and
> kvm-unit-tests.
>
> Not touching the gating CI is fine for me.
>
>> If we want to go forward with this we should probably think about how
>> we
>> would approach this generally:
>> 
>>   - tests/third-party-suites/FOO?
>
> Sounds good to me.
>
>>   - should we use avocado as a wrapper or something else?
>>     - make check-?
>
> avocado sounds good; we might have to add a second wrapper for
> producing tap output (see below).
>
> One should definitely be able to specify the testsuite and the
> architecture, e.g. `make check-third-party-wasmtime-s390x`.
>
> In addition, we need to either hardcode or let the user choose
> the way the testsuite is built and executed. I see 3 possibilities:
>
> - Fully on the host. Easiest to implement, the results are also easy
>   to debug. But this requires installing cross-toolchains manually,
>   which is simple on some distros and not-so-simple on the others.
>
> - Provide the toolchain as a Docker image. For wasmtime, the toolchain
>   would include the Rust compiler and Cargo. This solves the problem
>   with configuring the host, but introduces the next choice one has to
>   make:
>
>   - Build qemu on the host. Then qemu binary would have to be
>     compatible with the container (e.g. no references to the latest
>     greatest glibc functions).
>
>     This is because the wasmtime testsuite needs to run inside the
>     container: it's driven by Cargo, which is not available on the
>     host. It is possible to only build tests with Cargo and then run
>     the resulting binaries manually, but there is more than one and I'm
>     not sure how to get a list of them (if we decide to do this, in the
>     worst case the list can be hardcoded).
>
>     For valgrind it's a bit easier, since the test runner is not as
>     complex as Cargo, and can therefore just follow the check-tcg
>     approach.
>
>   - Build qemu inside the container. 2x space and time required, one
>     might also have to install additional -dev packages for extra qemu
>     features. Also, a decision needs to be made on whether the qemu
>     build directory ends up in the container (needs a rebuild on every
>     run), in a volume (volume lifetime needs to be managed) or in a
>     mounted host directory (this can cause selinux/ownership issues if
>     not done carefully).

I think building inside the container is the easiest to ensure you have
all the bits. We can provide a persistent ccache and follow the same
TARGET_LIST and option rules as the cross builds to allow for selecting
a minimal subset.
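
Roughly what I have in mind, as a Python sketch that only assembles the
container command line. The image name, the /src and /ccache mount
points and the function name are made up for illustration; whether the
compiler inside the image is actually wrapped by ccache is left to the
image.

```python
import shlex


def container_build_command(src_dir, ccache_dir, target_list,
                            image="qemu-third-party-wasmtime",  # hypothetical
                            engine="podman"):
    """Return an argv list that builds qemu inside a container."""
    targets = shlex.quote(",".join(target_list))
    # Same idea as the cross builds: only configure the requested targets.
    script = ("mkdir -p build && cd build && "
              "../configure --target-list=%s && make -j$(nproc)" % targets)
    return [
        engine, "run", "--rm",
        "-v", "%s:/src:z" % src_dir,        # qemu source tree from the host
        "-v", "%s:/ccache:z" % ccache_dir,  # persistent cache across runs
        "-e", "CCACHE_DIR=/ccache",
        "-w", "/src",
        image,
        "sh", "-c", script,
    ]
```

Passing the result to subprocess.run() with ["s390x-linux-user"] as the
target list would build just the qemu-s390x binary.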

> - Provide both toolchain and testsuite as a Docker image. Essentially
>   same as above, but trades build time for download time. Also the
>   results are slightly harder to debug, since the test binaries are
>   now located inside the container.

There certainly seems to be some mileage in having the test binaries in a
volume that is on the host system - especially if they are
self-contained or built statically.

> Sorry for the long list, it's just that since we are discussing how to
> enable this for a larger audience, I felt I needed to enumerate all the
> options and pitfalls I could think of.
>
>>   - ensuring the suites output tap for meson
>
> At the moment Rust can output either json like this:
>
> $ cargo test -- -Z unstable-options --format=json
> { "type": "suite", "event": "started", "test_count": 1 }
> { "type": "test", "event": "started", "name": "test::hello" }
> { "type": "test", "name": "test::hello", "event": "ok" }
> { "type": "suite", "event": "ok", "passed": 1, "failed": 0, "ignored":
> 0, "measured": 0, "filtered_out": 0, "exec_time": 0.001460307 }
>
> or xUnit like this:
>
> $ cargo test -- -Z unstable-options --format=junit
>
> # the following is on a single line; formatted for clarity
>
> <?xml version="1.0" encoding="UTF-8"?>
> <testsuites>
>   <testsuite name="test" package="test" id="0" errors="0" failures="0"
> tests="1" skipped="0">
>     <testcase classname="integration" name="test::hello" time="0"/>
>     <system-out/>
>     <system-err/>
>   </testsuite>
> </testsuites>
>
> I skimmed the avocado docs and couldn't find whether it can convert
> between different test output formats. Based on the source code, we can
> add an XUnitRunner the same way the TAPRunner was added.
>
> In the worst case we can pipe json to a script that would output tap.

That certainly works, there are plenty of interoperation solutions.

>
> Enhancing Rust is also an option, of course, even though this might
> take some time.
>
>>   - document in docs/devel/testing.rst
>
> Right, we need this too; I totally ignored it in this patch.
>
>> Also I want to avoid adding stuff to tests/docker/dockerfiles that
>> aren't directly related to check-tcg and the cross builds. I want to
>> move away from docker.py so for 3rd party suites let's just call
>> docker/podman directly.
>
> We could add the dockerfiles (if we decide we need them based on
> the discussion above) to tests/third-party-suites/FOO. My question is,
> would it be possible to build and publish the images on GitLab? Or
> is it better to build them on developers' machines?

Probably assume the developers' machines, especially as the actual CI
currently hammers our GitLab storage quotas and we are not expecting
everyone to be interested in such detail.

>
>> > Best regards,
>> > Ilya
