
.travis.yml: skip ppc64abi32-linux-user with plugins

Message ID 20200714175516.5475-1-alex.bennee@linaro.org
State New

Commit Message

Alex Bennée July 14, 2020, 5:55 p.m. UTC
We actually see failures on threadcount running without plugins:

  retry.py -n 1000 -c -- \
    ./ppc64abi32-linux-user/qemu-ppc64abi32 \
    ./tests/tcg/ppc64abi32-linux-user/threadcount

which reports:

  0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
  -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
  -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
  Ran command 1000 times, 978 passes

But when running with plugins we hit the failure a lot more often:

  0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
  -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
  Ran command 100 times, 91 passes

The crash occurs in guest code which is the same in both the passing and
failing cases. However, we see various messages reported on the console
about corrupted memory lists, which seems to imply that the guest's memory
allocation is corrupted. This lines up with the segfault being in the guest
__libc_free function. So we think this is a guest bug which is
exacerbated by various modes of translation. If anyone has access to
real hardware to soak test the test case we could prove this properly.
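
As a rough sanity check on "a lot more often", the two sets of counts above can be compared directly. This is only a sketch: it assumes the runs are independent, treats the 22 non-zero exits out of 1000 as the true base failure rate (itself only an estimate), and reads -6 and -11 as SIGABRT and SIGSEGV. The helper name binom_tail is made up for illustration.

```python
from math import comb


def binom_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts copied from the two reports in the commit message.
base_fail, base_runs = 22, 1000    # without plugins: 21 x -6 plus 1 x -11
plugin_fail, plugin_runs = 9, 100  # with plugins: 9 x -11

base_rate = base_fail / base_runs
plugin_rate = plugin_fail / plugin_runs

# Chance of seeing at least 9 failures in 100 runs if plugins changed nothing.
p_tail = binom_tail(plugin_runs, plugin_fail, base_rate)

print(f"without plugins: {base_rate:.1%}, with plugins: {plugin_rate:.1%}")
print(f"P(>= {plugin_fail} failures in {plugin_runs} runs at base rate) = {p_tail:.2g}")
```

Under those assumptions the tail probability comes out well under 1%, so the higher failure rate with plugins is unlikely to be sampling noise. It says nothing about whether the root cause is in the guest or in QEMU.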

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 .travis.yml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
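
For anyone wanting to soak test without retry.py to hand, the tally it prints can be approximated with a few lines of Python. This is a minimal stand-in, not the actual retry.py: the function names soak and report are hypothetical, and the demo command is a placeholder to be replaced with the qemu-ppc64abi32 invocation from the commit message. On POSIX, subprocess reports a child killed by signal N as return code -N, which matches the -6 and -11 buckets above.

```python
import subprocess
import sys
import time
from collections import defaultdict
from statistics import mean, pstdev


def soak(cmd, runs):
    """Run cmd `runs` times and return {returncode: [elapsed seconds, ...]}."""
    timings = defaultdict(list)
    for _ in range(runs):
        start = time.monotonic()
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        timings[result.returncode].append(time.monotonic() - start)
    return timings


def report(timings):
    """Print one line per return code, then the overall pass count."""
    total = sum(len(times) for times in timings.values())
    for code, times in sorted(timings.items(), reverse=True):
        share = 100 * len(times) / total
        print(
            f"{code}: {len(times)} times ({share:.2f}%), "
            f"avg time {mean(times):.3f} ({pstdev(times):.2f} deviation)"
        )
    print(f"Ran command {total} times, {len(timings.get(0, []))} passes")


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere; swap in the real
    # qemu-ppc64abi32 + threadcount command line to reproduce the soak test.
    demo_cmd = [sys.executable, "-c", "import sys; sys.exit(0)"]
    report(soak(demo_cmd, 5))
```

Bucketing by return code rather than pass/fail is what separates the abort cases from the segfaults in the reports above.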

Comments

Philippe Mathieu-Daudé July 14, 2020, 7:29 p.m. UTC | #1
On 7/14/20 7:55 PM, Alex Bennée wrote:
> We actually see failures on threadcount running without plugins:
> 
>   retry.py -n 1000 -c -- \
>     ./ppc64abi32-linux-user/qemu-ppc64abi32 \
>     ./tests/tcg/ppc64abi32-linux-user/threadcount
> 
> which reports:
> 
>   0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
>   -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
>   -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
>   Ran command 1000 times, 978 passes
> 
> But when running with plugins we hit the failure a lot more often:
> 
>   0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
>   -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
>   Ran command 100 times, 91 passes
> 
> The crash occurs in guest code which is the same in both pass and fail
> cases. However we see various messages reported on the console about
> corrupted memory lists which seems to imply the guest memory allocation
> is corrupted. This lines up with the seg fault being in the guest
> __libc_free function. So we think this is a guest bug which is
> exacerbated by various modes of translation. If anyone has access to
> real hardware to soak test the test case we could prove this properly.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> ---
>  .travis.yml | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index ab429500fc..6695c0620f 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -350,9 +350,10 @@ jobs:
>      # Run check-tcg against linux-user (with plugins)
>      # we skip sparc64-linux-user until it has been fixed somewhat
>      # we skip cris-linux-user as it doesn't use the common run loop
> +    # we skip ppc64abi32-linux-user as it seems to have a broken libc
>      - name: "GCC plugins check-tcg (user)"
>        env:
> -        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user"
> +        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user,ppc64abi32-linux-user"
>          - TEST_BUILD_CMD="make build-tcg"
>          - TEST_CMD="make check-tcg"
>          - CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"
>

David Gibson July 15, 2020, 4:06 a.m. UTC | #2
On Tue, Jul 14, 2020 at 06:55:16PM +0100, Alex Bennée wrote:
> We actually see failures on threadcount running without plugins:
> 
>   retry.py -n 1000 -c -- \
>     ./ppc64abi32-linux-user/qemu-ppc64abi32 \
>     ./tests/tcg/ppc64abi32-linux-user/threadcount
> 
> which reports:
> 
>   0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
>   -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
>   -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
>   Ran command 1000 times, 978 passes
> 
> But when running with plugins we hit the failure a lot more often:
> 
>   0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
>   -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
>   Ran command 100 times, 91 passes
> 
> The crash occurs in guest code which is the same in both pass and fail
> cases. However we see various messages reported on the console about
> corrupted memory lists which seems to imply the guest memory allocation
> is corrupted. This lines up with the seg fault being in the guest
> __libc_free function. So we think this is a guest bug which is
> exacerbated by various modes of translation. If anyone has access to
> real hardware to soak test the test case we could prove this properly.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>

Acked-by: David Gibson <david@gibson.dropbear.id.au>

Honestly, AFAICT the ppc64abi32-linux-user target is pretty much
entirely broken anyway.  Many things about it appear to make no
sense, it's difficult to work out what it's even supposed to be, and I
strongly suspect no-one's actually used it in like a decade.

> ---
>  .travis.yml | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index ab429500fc..6695c0620f 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -350,9 +350,10 @@ jobs:
>      # Run check-tcg against linux-user (with plugins)
>      # we skip sparc64-linux-user until it has been fixed somewhat
>      # we skip cris-linux-user as it doesn't use the common run loop
> +    # we skip ppc64abi32-linux-user as it seems to have a broken libc
>      - name: "GCC plugins check-tcg (user)"
>        env:
> -        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user"
> +        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user,ppc64abi32-linux-user"
>          - TEST_BUILD_CMD="make build-tcg"
>          - TEST_CMD="make check-tcg"
>          - CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"

Alex Bennée July 15, 2020, 8:02 a.m. UTC | #3
David Gibson <david@gibson.dropbear.id.au> writes:

> On Tue, Jul 14, 2020 at 06:55:16PM +0100, Alex Bennée wrote:
>> We actually see failures on threadcount running without plugins:
>> 
>>   retry.py -n 1000 -c -- \
>>     ./ppc64abi32-linux-user/qemu-ppc64abi32 \
>>     ./tests/tcg/ppc64abi32-linux-user/threadcount
>> 
>> which reports:
>> 
>>   0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
>>   -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
>>   -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
>>   Ran command 1000 times, 978 passes
>> 
>> But when running with plugins we hit the failure a lot more often:
>> 
>>   0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
>>   -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
>>   Ran command 100 times, 91 passes
>> 
>> The crash occurs in guest code which is the same in both pass and fail
>> cases. However we see various messages reported on the console about
>> corrupted memory lists which seems to imply the guest memory allocation
>> is corrupted. This lines up with the seg fault being in the guest
>> __libc_free function. So we think this is a guest bug which is
>> exacerbated by various modes of translation. If anyone has access to
>> real hardware to soak test the test case we could prove this properly.
>> 
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Cc: David Gibson <david@gibson.dropbear.id.au>
>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>
>
> Honestly, AFAICT the ppc64abi32-linux-user target is pretty much
> entirely broken anyway.  Many things about it appear to make no
> sense, it's difficult to work out what it's even supposed to be, and I
> strongly suspect no-one's actually used it in like a decade.

Should we think about marking it deprecated for 5.2?

David Gibson July 15, 2020, 8:56 a.m. UTC | #4
On Wed, Jul 15, 2020 at 09:02:05AM +0100, Alex Bennée wrote:
> 
> David Gibson <david@gibson.dropbear.id.au> writes:
> 
> > On Tue, Jul 14, 2020 at 06:55:16PM +0100, Alex Bennée wrote:
> >> We actually see failures on threadcount running without plugins:
> >> 
> >>   retry.py -n 1000 -c -- \
> >>     ./ppc64abi32-linux-user/qemu-ppc64abi32 \
> >>     ./tests/tcg/ppc64abi32-linux-user/threadcount
> >> 
> >> which reports:
> >> 
> >>   0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
> >>   -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
> >>   -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
> >>   Ran command 1000 times, 978 passes
> >> 
> >> But when running with plugins we hit the failure a lot more often:
> >> 
> >>   0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
> >>   -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
> >>   Ran command 100 times, 91 passes
> >> 
> >> The crash occurs in guest code which is the same in both pass and fail
> >> cases. However we see various messages reported on the console about
> >> corrupted memory lists which seems to imply the guest memory allocation
> >> is corrupted. This lines up with the seg fault being in the guest
> >> __libc_free function. So we think this is a guest bug which is
> >> exacerbated by various modes of translation. If anyone has access to
> >> real hardware to soak test the test case we could prove this properly.
> >> 
> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> Cc: David Gibson <david@gibson.dropbear.id.au>
> >> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> >
> > Acked-by: David Gibson <david@gibson.dropbear.id.au>
> >
> > Honestly, AFAICT the ppc64abi32-linux-user target is pretty much
> > entirely broken anyway.  Many things about it appear to make no
> > sense, it's difficult to work out what it's even supposed to be, and I
> > strongly suspect no-one's actually used it in like a decade.
> 
> Should we think about marking it deprecated for 5.2?

Yes, probably.  I just haven't gotten around to it.

Patch

diff --git a/.travis.yml b/.travis.yml
index ab429500fc..6695c0620f 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -350,9 +350,10 @@  jobs:
     # Run check-tcg against linux-user (with plugins)
     # we skip sparc64-linux-user until it has been fixed somewhat
     # we skip cris-linux-user as it doesn't use the common run loop
+    # we skip ppc64abi32-linux-user as it seems to have a broken libc
     - name: "GCC plugins check-tcg (user)"
       env:
-        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user"
+        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user,ppc64abi32-linux-user"
         - TEST_BUILD_CMD="make build-tcg"
         - TEST_CMD="make check-tcg"
         - CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"
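
The exclude list in CONFIG now carries three targets, each with its own reason in a comment above. One way to keep the two in step, sketched here purely as an illustration, is to generate the string from a table of skipped targets. Nothing like this exists in the tree; SKIPPED_TARGETS, BASE_FLAGS and build_config are hypothetical names, while the flags and reasons are copied from the hunk above.

```python
# Hypothetical table: each skipped target next to the reason it is skipped.
SKIPPED_TARGETS = {
    "sparc64-linux-user": "until it has been fixed somewhat",
    "cris-linux-user": "doesn't use the common run loop",
    "ppc64abi32-linux-user": "seems to have a broken libc",
}

# Configure flags taken from the "GCC plugins check-tcg (user)" job.
BASE_FLAGS = ["--disable-system", "--enable-plugins", "--enable-debug-tcg"]


def build_config(skipped):
    """Join the base flags with a --target-list-exclude built from `skipped`."""
    exclude = ",".join(skipped)
    return " ".join(BASE_FLAGS + [f"--target-list-exclude={exclude}"])


config = build_config(SKIPPED_TARGETS)
print(config)
```

Dropping a target from the table once it is fixed (or removed) then updates the configure line and its justification together.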