diff mbox series

[for-3.0,?] tests/libqtest: Improve kill_qemu() assert

Message ID 20180720153932.8507-1-peter.maydell@linaro.org
State New
Headers show
Series [for-3.0,?] tests/libqtest: Improve kill_qemu() assert | expand

Commit Message

Peter Maydell July 20, 2018, 3:39 p.m. UTC
In kill_qemu() we have an assert that checks that the QEMU process
didn't dump core:
            assert(!WCOREDUMP(wstatus));

Unfortunately the WCOREDUMP macro here means the resulting message
is not very easy to comprehend on at least some systems:

ahci-test: tests/libqtest.c:113: kill_qemu: Assertion `!(((__extension__ (((union { __typeof(wstatus) __in; int __i; }) { .__in = (wstatus) }).__i))) & 0x80)' failed.

and it doesn't identify what signal the process took.

Instead of using a raw assert, print the information in an
easier to understand way:

/i386/ahci/sanity: libqtest.c: kill_qemu() tried to terminate QEMU process but it dumped core with signal 11
ahci-test: tests/libqtest.c:118: kill_qemu: Assertion `0' failed.
Aborted (core dumped)

(Of course, the really useful information would be why the QEMU
process dumped core in the first place, but we don't have that
by the time the test program has picked up the exit status.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
In particular, the travis test config that enables gprof
seems to (a) run into this every so often and (b) have the
really unhelpful assertion text quoted above:
 https://travis-ci.org/qemu/qemu/jobs/406192798

Maybe for 3.0 since it's only test code.

 tests/libqtest.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Philippe Mathieu-Daudé July 20, 2018, 3:48 p.m. UTC | #1
On 07/20/2018 12:39 PM, Peter Maydell wrote:
> In kill_qemu() we have an assert that checks that the QEMU process
> didn't dump core:
>             assert(!WCOREDUMP(wstatus));
> 
> Unfortunately the WCOREDUMP macro here means the resulting message
> is not very easy to comprehend on at least some systems:
> 
> ahci-test: tests/libqtest.c:113: kill_qemu: Assertion `!(((__extension__ (((union { __typeof(wstatus) __in; int __i; }) { .__in = (wstatus) }).__i))) & 0x80)' failed.
> 
> and it doesn't identify what signal the process took.
> 
> Instead of using a raw assert, print the information in an
> easier to understand way:
> 
> /i386/ahci/sanity: libqtest.c: kill_qemu() tried to terminate QEMU process but it dumped core with signal 11

Less cryptic, indeed.

> ahci-test: tests/libqtest.c:118: kill_qemu: Assertion `0' failed.

This file:line is not very relevant, ...

> Aborted (core dumped)
> 
> (Of course, the really useful information would be why the QEMU
> process dumped core in the first place, but we don't have that
> by the time the test program has picked up the exit status.)
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> In particular, the travis test config that enables gprof
> seems to (a) run into this every so often and (b) have the
> really unhelpful assertion text quoted above:
>  https://travis-ci.org/qemu/qemu/jobs/406192798
> 
> Maybe for 3.0 since it's only test code.
> 
>  tests/libqtest.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/libqtest.c b/tests/libqtest.c
> index 098af6aec44..99341e1b47d 100644
> --- a/tests/libqtest.c
> +++ b/tests/libqtest.c
> @@ -110,7 +110,13 @@ static void kill_qemu(QTestState *s)
>          pid = waitpid(s->qemu_pid, &wstatus, 0);
>  
>          if (pid == s->qemu_pid && WIFSIGNALED(wstatus)) {
> -            assert(!WCOREDUMP(wstatus));
> +            if (WCOREDUMP(wstatus)) {
> +                fprintf(stderr,
> +                        "libqtest.c: kill_qemu() tried to terminate QEMU "
> +                        "process but it dumped core with signal %d\n",
> +                        WTERMSIG(wstatus));
> +                assert(0);

... what about directly using abort() here?

> +            }
>          }
>      }
>  }
> 

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Peter Maydell July 20, 2018, 3:49 p.m. UTC | #2
On 20 July 2018 at 16:48, Philippe Mathieu-Daudé <f4bug@amsat.org> wrote:
> On 07/20/2018 12:39 PM, Peter Maydell wrote:
>> In kill_qemu() we have an assert that checks that the QEMU process
>> didn't dump core:
>>             assert(!WCOREDUMP(wstatus));
>>
>> Unfortunately the WCOREDUMP macro here means the resulting message
>> is not very easy to comprehend on at least some systems:
>>
>> ahci-test: tests/libqtest.c:113: kill_qemu: Assertion `!(((__extension__ (((union { __typeof(wstatus) __in; int __i; }) { .__in = (wstatus) }).__i))) & 0x80)' failed.
>>
>> and it doesn't identify what signal the process took.
>>
>> Instead of using a raw assert, print the information in an
>> easier to understand way:
>>
>> /i386/ahci/sanity: libqtest.c: kill_qemu() tried to terminate QEMU process but it dumped core with signal 11
>
> Less cryptic, indeed.
>
>> ahci-test: tests/libqtest.c:118: kill_qemu: Assertion `0' failed.
>
> This file:line is not very relevant, ...

> ... what about directly using abort() here?

I did actually start with that, but decided I'd rather have
the file-and-line in there to direct people more quickly
to the immediate point where we asserted.

thanks
-- PMM
Richard Henderson July 20, 2018, 4:14 p.m. UTC | #3
On 07/20/2018 08:49 AM, Peter Maydell wrote:
> On 20 July 2018 at 16:48, Philippe Mathieu-Daudé <f4bug@amsat.org> wrote:
>> On 07/20/2018 12:39 PM, Peter Maydell wrote:
>>> In kill_qemu() we have an assert that checks that the QEMU process
>>> didn't dump core:
>>>             assert(!WCOREDUMP(wstatus));
>>>
>>> Unfortunately the WCOREDUMP macro here means the resulting message
>>> is not very easy to comprehend on at least some systems:
>>>
>>> ahci-test: tests/libqtest.c:113: kill_qemu: Assertion `!(((__extension__ (((union { __typeof(wstatus) __in; int __i; }) { .__in = (wstatus) }).__i))) & 0x80)' failed.
>>>
>>> and it doesn't identify what signal the process took.
>>>
>>> Instead of using a raw assert, print the information in an
>>> easier to understand way:
>>>
>>> /i386/ahci/sanity: libqtest.c: kill_qemu() tried to terminate QEMU process but it dumped core with signal 11
>>
>> Less cryptic, indeed.
>>
>>> ahci-test: tests/libqtest.c:118: kill_qemu: Assertion `0' failed.
>>
>> This file:line is not very relevant, ...
> 
>> ... what about directly using abort() here?
> 
> I did actually start with that, but decided I'd rather have
> the file-and-line in there to direct people more quickly
> to the immediate point where we asserted.

You already print the file, just include the line.  Perhaps

  fprintf(stderr,
          "%s:%d: kill_qemu tried to terminate QEMU "
          "process but it dumped core with signal %s\n",
          __FILE__, __LINE__, strsignal(WTERMSIG(wstatus)));
  abort();

Not that I expect the signal to ever be anything other than 11,
and that being one of the handful that are consistent across
pretty much all unix systems.  But still.


r~


PS: The bike shed should be blue.
Peter Maydell July 20, 2018, 4:25 p.m. UTC | #4
On 20 July 2018 at 17:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
> You already print the file, just include the line.  Perhaps
>
>   fprintf(stderr,
>           "%s:%d: kill_qemu tried to terminate QEMU "
>           "process but it dumped core with signal %s\n",
>           __FILE__, __LINE__, strsignal(WTERMSIG(wstatus)));
>   abort();

I wasn't convinced that strsignal() would be available
on all the host OSes we build on (we don't currently use
it outside linux-user/), and I definitely didn't think that
it merited a configure test for its presence just for a
test error message :-)

thanks
-- PMM
Richard Henderson July 20, 2018, 4:36 p.m. UTC | #5
On 07/20/2018 09:25 AM, Peter Maydell wrote:
> On 20 July 2018 at 17:14, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> You already print the file, just include the line.  Perhaps
>>
>>   fprintf(stderr,
>>           "%s:%d: kill_qemu tried to terminate QEMU "
>>           "process but it dumped core with signal %s\n",
>>           __FILE__, __LINE__, strsignal(WTERMSIG(wstatus)));
>>   abort();
> 
> I wasn't convinced that strsignal() would be available
> on all the host OSes we build on (we don't currently use
> it outside linux-user/), and I definitely didn't think that
> it merited a configure test for its presence just for a
> test error message :-)

Hmm.  It has been in _GNU_SOURCE since the dawn of time
and in POSIX since 2008.

For non-linux, I peeked at the OpenBSD man page, which says

  The strsignal() function first appeared in AT&T System V
  Release 4 UNIX and was reimplemented for NetBSD 1.0.

That suggests all of the extant BSDs should have it.

MinGW has had the function since 2008.

What other hosts do we support?


r~
Peter Maydell July 20, 2018, 4:45 p.m. UTC | #6
On 20 July 2018 at 17:36, Richard Henderson
<richard.henderson@linaro.org> wrote:
> On 07/20/2018 09:25 AM, Peter Maydell wrote:
>> On 20 July 2018 at 17:14, Richard Henderson
>> <richard.henderson@linaro.org> wrote:
>>> You already print the file, just include the line.  Perhaps
>>>
>>>   fprintf(stderr,
>>>           "%s:%d: kill_qemu tried to terminate QEMU "
>>>           "process but it dumped core with signal %s\n",
>>>           __FILE__, __LINE__, strsignal(WTERMSIG(wstatus)));
>>>   abort();
>>
>> I wasn't convinced that strsignal() would be available
>> on all the host OSes we build on (we don't currently use
>> it outside linux-user/), and I definitely didn't think that
>> it merited a configure test for its presence just for a
>> test error message :-)
>
> Hmm.  It has been in _GNU_SOURCE since the dawn of time
> and in POSIX since 2008.
>
> For non-linux, I peeked at the OpenBSD man page, which says
>
>   The strsignal() function first appeared in AT&T System V
>   Release 4 UNIX and was reimplemented for NetBSD 1.0.
>
> That suggests all of the extant BSDs should have it.
>
> MinGW has had the function since 2008.
>
> What other hosts do we support?

OSX, but that's I think OK as it inherits it from BSD.
The configure script also has support for Solaris-variants
and Haiku...

thanks
-- PMM
Richard Henderson July 20, 2018, 5:28 p.m. UTC | #7
On 07/20/2018 09:45 AM, Peter Maydell wrote:
>> For non-linux, I peeked at the OpenBSD man page, which says
>>
>>   The strsignal() function first appeared in AT&T System V
>>   Release 4 UNIX and was reimplemented for NetBSD 1.0.
>>
>> That suggests all of the extant BSDs should have it.
>>
>> MinGW has had the function since 2008.
>>
>> What other hosts do we support?
> 
> OSX, but that's I think OK as it inherits it from BSD.
> The configure script also has support for Solaris-variants
> and Haiku...

Solaris derives from SVR4, and so should have had it
since the beginning of time; certainly OpenSolaris does:
http://repo.or.cz/opensolaris.git/blob/HEAD:/usr/src/head/string.h#l94

Haiku has it as well:
https://git.haiku-os.org/haiku/tree/headers/posix/string.h#n75


r~
Michael S. Tsirkin July 22, 2018, 3:27 p.m. UTC | #8
On Fri, Jul 20, 2018 at 04:39:32PM +0100, Peter Maydell wrote:
> In kill_qemu() we have an assert that checks that the QEMU process
> didn't dump core:
>             assert(!WCOREDUMP(wstatus));
> 
> Unfortunately the WCOREDUMP macro here means the resulting message
> is not very easy to comprehend on at least some systems:
> 
> ahci-test: tests/libqtest.c:113: kill_qemu: Assertion `!(((__extension__ (((union { __typeof(wstatus) __in; int __i; }) { .__in = (wstatus) }).__i))) & 0x80)' failed.
> 
> and it doesn't identify what signal the process took.
> 
> Instead of using a raw assert, print the information in an
> easier to understand way:
> 
> /i386/ahci/sanity: libqtest.c: kill_qemu() tried to terminate QEMU process but it dumped core with signal 11
> ahci-test: tests/libqtest.c:118: kill_qemu: Assertion `0' failed.
> Aborted (core dumped)
> 
> (Of course, the really useful information would be why the QEMU
> process dumped core in the first place, but we don't have that
> by the time the test program has picked up the exit status.)
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

Up to you whether to apply in 3.0.

> ---
> In particular, the travis test config that enables gprof
> seems to (a) run into this every so often and (b) have the
> really unhelpful assertion text quoted above:
>  https://travis-ci.org/qemu/qemu/jobs/406192798
> 
> Maybe for 3.0 since it's only test code.
> 
>  tests/libqtest.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/libqtest.c b/tests/libqtest.c
> index 098af6aec44..99341e1b47d 100644
> --- a/tests/libqtest.c
> +++ b/tests/libqtest.c
> @@ -110,7 +110,13 @@ static void kill_qemu(QTestState *s)
>          pid = waitpid(s->qemu_pid, &wstatus, 0);
>  
>          if (pid == s->qemu_pid && WIFSIGNALED(wstatus)) {
> -            assert(!WCOREDUMP(wstatus));
> +            if (WCOREDUMP(wstatus)) {
> +                fprintf(stderr,
> +                        "libqtest.c: kill_qemu() tried to terminate QEMU "
> +                        "process but it dumped core with signal %d\n",
> +                        WTERMSIG(wstatus));
> +                assert(0);
> +            }
>          }
>      }
>  }
> -- 
> 2.17.1
Alex Bennée July 23, 2018, 10:59 a.m. UTC | #9
Peter Maydell <peter.maydell@linaro.org> writes:

> In kill_qemu() we have an assert that checks that the QEMU process
> didn't dump core:
>             assert(!WCOREDUMP(wstatus));
>
> Unfortunately the WCOREDUMP macro here means the resulting message
> is not very easy to comprehend on at least some systems:
>
> ahci-test: tests/libqtest.c:113: kill_qemu: Assertion `!(((__extension__ (((union { __typeof(wstatus) __in; int __i; }) { .__in = (wstatus) }).__i))) & 0x80)' failed.
>
> and it doesn't identify what signal the process took.
>
> Instead of using a raw assert, print the information in an
> easier to understand way:
>
> /i386/ahci/sanity: libqtest.c: kill_qemu() tried to terminate QEMU process but it dumped core with signal 11
> ahci-test: tests/libqtest.c:118: kill_qemu: Assertion `0' failed.
> Aborted (core dumped)
>
> (Of course, the really useful information would be why the QEMU
> process dumped core in the first place, but we don't have that
> by the time the test program has picked up the exit status.)
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> In particular, the travis test config that enables gprof
> seems to (a) run into this every so often and (b) have the
> really unhelpful assertion text quoted above:
>  https://travis-ci.org/qemu/qemu/jobs/406192798
>
> Maybe for 3.0 since it's only test code.
>
>  tests/libqtest.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/tests/libqtest.c b/tests/libqtest.c
> index 098af6aec44..99341e1b47d 100644
> --- a/tests/libqtest.c
> +++ b/tests/libqtest.c
> @@ -110,7 +110,13 @@ static void kill_qemu(QTestState *s)
>          pid = waitpid(s->qemu_pid, &wstatus, 0);
>
>          if (pid == s->qemu_pid && WIFSIGNALED(wstatus)) {
> -            assert(!WCOREDUMP(wstatus));
> +            if (WCOREDUMP(wstatus)) {
> +                fprintf(stderr,
> +                        "libqtest.c: kill_qemu() tried to terminate QEMU "
> +                        "process but it dumped core with signal %d\n",
> +                        WTERMSIG(wstatus));
> +                assert(0);
> +            }

I had something similar but I added a:

+            if (WCOREDUMP(wstatus)) {
+                   fprintf(stderr, "Child QEMU (%s) dumped core\n", getenv("QTEST_QEMU_BINARY"));
+                   abort();
+            }

So I could tell which QEMU was the one that was crashing. As you are
asserting you can drop the filename/function stuff or use __file__ and
__func__ instead. Anyway it is an improvement on what we have now so:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>


>          }
>      }
>  }


--
Alex Bennée
diff mbox series

Patch

diff --git a/tests/libqtest.c b/tests/libqtest.c
index 098af6aec44..99341e1b47d 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -110,7 +110,13 @@  static void kill_qemu(QTestState *s)
         pid = waitpid(s->qemu_pid, &wstatus, 0);
 
         if (pid == s->qemu_pid && WIFSIGNALED(wstatus)) {
-            assert(!WCOREDUMP(wstatus));
+            if (WCOREDUMP(wstatus)) {
+                fprintf(stderr,
+                        "libqtest.c: kill_qemu() tried to terminate QEMU "
+                        "process but it dumped core with signal %d\n",
+                        WTERMSIG(wstatus));
+                assert(0);
+            }
         }
     }
 }