[3/3] qtest: kill QEMU process on g_assert() failure

Message ID	1392651898-16749-4-git-send-email-stefanha@redhat.com
State	New
Headers	show Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org> From: Stefan Hajnoczi <stefanha@redhat.com> To: <qemu-devel@nongnu.org> Date: Mon, 17 Feb 2014 16:44:58 +0100 Message-Id: <1392651898-16749-4-git-send-email-stefanha@redhat.com> In-Reply-To: <1392651898-16749-1-git-send-email-stefanha@redhat.com> References: <1392651898-16749-1-git-send-email-stefanha@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org>, Andreas Faerber <afaerber@suse.de>, Anthony Liguori <aliguori@amazon.com> Subject: [Qemu-devel] [PATCH 3/3] qtest: kill QEMU process on g_assert() failure Precedence: list Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

Stefan Hajnoczi Feb. 17, 2014, 3:44 p.m. UTC

The QEMU process stays running if the test case fails.  This patch fixes
the leak by installing a SIGABRT signal handler which invokes
qtest_end().

In order to make that work for assertion failures during qtest_init(),
we need to initialize QTestState fields including file descriptors and
pids carefully.  qtest_quit() is then safe to call even during
qtest_init().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/libqtest.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

Paolo Bonzini Feb. 17, 2014, 4:16 p.m. UTC | #1

Il 17/02/2014 16:44, Stefan Hajnoczi ha scritto:
>  }
>
> +static void sigabrt_handler(int signo)
> +{
> +    qtest_end();
> +}
> +

void qtest_quit(QTestState *s)
{
     int status;

     if (s->qemu_pid != -1) {
         kill(s->qemu_pid, SIGTERM);
         waitpid(s->qemu_pid, &status, 0);
     }

     close(s->fd);
     close(s->qmp_fd);
     g_string_free(s->rx, true);
     g_free(s);
}

Not async-signal safe.  You need to ignore the g_string_free and g_free 
(perhaps even the closes) if calling from the sigabrt_handler.

Paolo

Markus Armbruster Feb. 17, 2014, 4:49 p.m. UTC | #2

Stefan Hajnoczi <stefanha@redhat.com> writes:

> The QEMU process stays running if the test case fails.  This patch fixes
> the leak by installing a SIGABRT signal handler which invokes
> qtest_end().
>
> In order to make that work for assertion failures during qtest_init(),
> we need to initialize QTestState fields including file descriptors and
> pids carefully.  qtest_quit() is then safe to call even during
> qtest_init().
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  tests/libqtest.c | 29 ++++++++++++++++++++++++++---
>  1 file changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/tests/libqtest.c b/tests/libqtest.c
> index 8b2b2d7..09a0481 100644
> --- a/tests/libqtest.c
> +++ b/tests/libqtest.c
> @@ -44,6 +44,7 @@ struct QTestState
>      bool irq_level[MAX_IRQ];
>      GString *rx;
>      pid_t qemu_pid;  /* our child QEMU process */
> +    struct sigaction sigact_old; /* restored on exit */
>  };
>  
>  #define g_assert_no_errno(ret) do { \
> @@ -88,6 +89,11 @@ static int socket_accept(int sock)
>      return ret;
>  }
>  
> +static void sigabrt_handler(int signo)
> +{
> +    qtest_end();
> +}
> +
>  QTestState *qtest_init(const char *extra_args)
>  {
>      QTestState *s;
> @@ -96,11 +102,22 @@ QTestState *qtest_init(const char *extra_args)
>      gchar *qmp_socket_path;
>      gchar *command;
>      const char *qemu_binary;
> +    struct sigaction sigact;
>  
>      qemu_binary = getenv("QTEST_QEMU_BINARY");
>      g_assert(qemu_binary != NULL);
>  
> -    s = g_malloc(sizeof(*s));
> +    s = g_malloc0(sizeof(*s));
> +    s->fd = -1;
> +    s->qmp_fd = -1;
> +    s->qemu_pid = -1;
> +
> +    /* Catch SIGABRT to clean up on g_assert() failure */
> +    sigact = (struct sigaction){
> +        .sa_handler = sigabrt_handler,
> +        .sa_flags = SA_RESETHAND,
> +    };

Assumes zero-initialization has the same effect as
sigemptyset(&sigact.sa_mask).  Quoting POSIX:

    The implementation of the sigemptyset() (or sigfillset()) function
    could quite trivially clear (or set) all the bits in the signal set.
    Alternatively, it would be reasonable to initialize part of the
    structure, such as a version field, to permit binary-compatibility
    between releases where the size of the set varies.  For such
    reasons, either sigemptyset() or sigfillset() must be called prior
    to any other use of the signal set, even if such use is read-only
    (for example, as an argument to sigpending()).

Looks like you better sigemptyset() here, for maximum portability.

> +    sigaction(SIGABRT, &sigact, &s->sigact_old);
>  
>      socket_path = g_strdup_printf("/tmp/qtest-%d.sock", getpid());
>      qmp_socket_path = g_strdup_printf("/tmp/qtest-%d.qmp", getpid());
> @@ -150,13 +167,19 @@ void qtest_quit(QTestState *s)
>  {
>      int status;
>  
> +    sigaction(SIGABRT, &s->sigact_old, NULL);
> +

Can this ever restore the action to anything but the default action?

>      if (s->qemu_pid != -1) {
>          kill(s->qemu_pid, SIGTERM);
>          waitpid(s->qemu_pid, &status, 0);
>      }
>  
> -    close(s->fd);
> -    close(s->qmp_fd);
> +    if (s->fd != -1) {
> +        close(s->fd);
> +    }
> +    if (s->qmp_fd != -1) {
> +        close(s->qmp_fd);
> +    }

I generally don't bother to avoid close(-1).

>      g_string_free(s->rx, true);
>      g_free(s);
>  }

Paolo Bonzini Feb. 17, 2014, 4:56 p.m. UTC | #3

Il 17/02/2014 17:49, Markus Armbruster ha scritto:
> Assumes zero-initialization has the same effect as
> sigemptyset(&sigact.sa_mask).  Quoting POSIX:
>
>     The implementation of the sigemptyset() (or sigfillset()) function
>     could quite trivially clear (or set) all the bits in the signal set.
>     Alternatively, it would be reasonable to initialize part of the
>     structure, such as a version field, to permit binary-compatibility
>     between releases where the size of the set varies.  For such
>     reasons, either sigemptyset() or sigfillset() must be called prior
>     to any other use of the signal set, even if such use is read-only
>     (for example, as an argument to sigpending()).
>
> Looks like you better sigemptyset() here, for maximum portability.
>

Certainly memset of struct sigaction or sigset_t * is common enough that 
no one in their right minds would do this.  Is there really an OS that 
does it?  Also, the above justification is quite feeble; it would work 
for binary compatibility of sigset_t* arguments, but not for embedded 
sigset_t structs.  I'm CCing our resident POSIX experts in hope that 
this paragraph can be eliminated from the standard. :)

Related to this, there are a bunch of Coverity reports where we use 
uninitialized fields of a struct sigaction.

Paolo

Markus Armbruster Feb. 17, 2014, 5 p.m. UTC | #4

Paolo Bonzini <pbonzini@redhat.com> writes:

> Il 17/02/2014 16:44, Stefan Hajnoczi ha scritto:
>>  }
>>
>> +static void sigabrt_handler(int signo)
>> +{
>> +    qtest_end();
>> +}
>> +
>
> void qtest_quit(QTestState *s)
> {
>     int status;
>
>     if (s->qemu_pid != -1) {
>         kill(s->qemu_pid, SIGTERM);
>         waitpid(s->qemu_pid, &status, 0);
>     }
>
>     close(s->fd);
>     close(s->qmp_fd);
>     g_string_free(s->rx, true);
>     g_free(s);
> }
>
> Not async-signal safe.  You need to ignore the g_string_free and
> g_free (perhaps even the closes) if calling from the sigabrt_handler.

kill(), waitpid() and close() are all async-signal-safe.

SIGABRT is normally synchronous enough: it's sent by abort().  But of
course, nothing stops the user from kill -ABRT.  Or GLib from calling
abort() in some place where an attempt to reenter it crashes & burns.
Not sure I'd care, but I'm pretty sure I don't care for freeing stuff on
exit :)

Stefan Hajnoczi Feb. 18, 2014, 9:05 a.m. UTC | #5

On Mon, Feb 17, 2014 at 06:00:03PM +0100, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > Il 17/02/2014 16:44, Stefan Hajnoczi ha scritto:
> >>  }
> >>
> >> +static void sigabrt_handler(int signo)
> >> +{
> >> +    qtest_end();
> >> +}
> >> +
> >
> > void qtest_quit(QTestState *s)
> > {
> >     int status;
> >
> >     if (s->qemu_pid != -1) {
> >         kill(s->qemu_pid, SIGTERM);
> >         waitpid(s->qemu_pid, &status, 0);
> >     }
> >
> >     close(s->fd);
> >     close(s->qmp_fd);
> >     g_string_free(s->rx, true);
> >     g_free(s);
> > }
> >
> > Not async-signal safe.  You need to ignore the g_string_free and
> > g_free (perhaps even the closes) if calling from the sigabrt_handler.
> 
> kill(), waitpid() and close() are all async-signal-safe.
> 
> SIGABRT is normally synchronous enough: it's sent by abort().  But of
> course, nothing stops the user from kill -ABRT.  Or GLib from calling
> abort() in some place where an attempt to reenter it crashes & burns.
> Not sure I'd care, but I'm pretty sure I don't care for freeing stuff on
> exit :)

Yes, SIGABRT is synchronous for all purposes.  So the only danger is
that g_string_free() or g_free() could fail while we're in
g_assert(false).  But they don't, which makes sense because they are
totally unrelated to g_assert() and therefore can handle re-entrancy.

In practice there is no issue and I've tested assertion failure with
glib 1.2.10.

Stefan

Stefan Hajnoczi Feb. 18, 2014, 9:17 a.m. UTC | #6

On Mon, Feb 17, 2014 at 05:49:31PM +0100, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> > +    /* Catch SIGABRT to clean up on g_assert() failure */
> > +    sigact = (struct sigaction){
> > +        .sa_handler = sigabrt_handler,
> > +        .sa_flags = SA_RESETHAND,
> > +    };
> 
> Assumes zero-initialization has the same effect as
> sigemptyset(&sigact.sa_mask).  Quoting POSIX:
> 
>     The implementation of the sigemptyset() (or sigfillset()) function
>     could quite trivially clear (or set) all the bits in the signal set.
>     Alternatively, it would be reasonable to initialize part of the
>     structure, such as a version field, to permit binary-compatibility
>     between releases where the size of the set varies.  For such
>     reasons, either sigemptyset() or sigfillset() must be called prior
>     to any other use of the signal set, even if such use is read-only
>     (for example, as an argument to sigpending()).
> 
> Looks like you better sigemptyset() here, for maximum portability.

Okay, will do that.

> > +    sigaction(SIGABRT, &sigact, &s->sigact_old);
> >  
> >      socket_path = g_strdup_printf("/tmp/qtest-%d.sock", getpid());
> >      qmp_socket_path = g_strdup_printf("/tmp/qtest-%d.qmp", getpid());
> > @@ -150,13 +167,19 @@ void qtest_quit(QTestState *s)
> >  {
> >      int status;
> >  
> > +    sigaction(SIGABRT, &s->sigact_old, NULL);
> > +
> 
> Can this ever restore the action to anything but the default action?

This code is in a "library" so I don't want to make assumptions about
the callers.  Keeping sigact_old around is literally 1 line of code so I
think it's worth it.

> >      if (s->qemu_pid != -1) {
> >          kill(s->qemu_pid, SIGTERM);
> >          waitpid(s->qemu_pid, &status, 0);
> >      }
> >  
> > -    close(s->fd);
> > -    close(s->qmp_fd);
> > +    if (s->fd != -1) {
> > +        close(s->fd);
> > +    }
> > +    if (s->qmp_fd != -1) {
> > +        close(s->qmp_fd);
> > +    }
> 
> I generally don't bother to avoid close(-1).

When I drive on the highway I stay on the lanes but I guess I could just
slide along the side barriers :).  It's a style issue but close(-1)
annoys me in strace so I try to avoid doing it.

Markus Armbruster Feb. 18, 2014, 9:55 a.m. UTC | #7

Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Mon, Feb 17, 2014 at 05:49:31PM +0100, Markus Armbruster wrote:
>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>> > +    /* Catch SIGABRT to clean up on g_assert() failure */
>> > +    sigact = (struct sigaction){
>> > +        .sa_handler = sigabrt_handler,
>> > +        .sa_flags = SA_RESETHAND,
>> > +    };
>> 
>> Assumes zero-initialization has the same effect as
>> sigemptyset(&sigact.sa_mask).  Quoting POSIX:
>> 
>>     The implementation of the sigemptyset() (or sigfillset()) function
>>     could quite trivially clear (or set) all the bits in the signal set.
>>     Alternatively, it would be reasonable to initialize part of the
>>     structure, such as a version field, to permit binary-compatibility
>>     between releases where the size of the set varies.  For such
>>     reasons, either sigemptyset() or sigfillset() must be called prior
>>     to any other use of the signal set, even if such use is read-only
>>     (for example, as an argument to sigpending()).
>> 
>> Looks like you better sigemptyset() here, for maximum portability.
>
> Okay, will do that.
>
>> > +    sigaction(SIGABRT, &sigact, &s->sigact_old);
>> >  
>> >      socket_path = g_strdup_printf("/tmp/qtest-%d.sock", getpid());
>> >      qmp_socket_path = g_strdup_printf("/tmp/qtest-%d.qmp", getpid());
>> > @@ -150,13 +167,19 @@ void qtest_quit(QTestState *s)
>> >  {
>> >      int status;
>> >  
>> > +    sigaction(SIGABRT, &s->sigact_old, NULL);
>> > +
>> 
>> Can this ever restore the action to anything but the default action?
>
> This code is in a "library" so I don't want to make assumptions about
> the callers.  Keeping sigact_old around is literally 1 line of code so I
> think it's worth it.
>
>> >      if (s->qemu_pid != -1) {
>> >          kill(s->qemu_pid, SIGTERM);
>> >          waitpid(s->qemu_pid, &status, 0);
>> >      }
>> >  
>> > -    close(s->fd);
>> > -    close(s->qmp_fd);
>> > +    if (s->fd != -1) {
>> > +        close(s->fd);
>> > +    }
>> > +    if (s->qmp_fd != -1) {
>> > +        close(s->qmp_fd);
>> > +    }
>> 
>> I generally don't bother to avoid close(-1).
>
> When I drive on the highway I stay on the lanes but I guess I could just
> slide along the side barriers :).  It's a style issue but close(-1)
> annoys me in strace so I try to avoid doing it.

For me, it's in the same category as free(NULL).

Markus Armbruster Feb. 18, 2014, 10:02 a.m. UTC | #8

Markus Armbruster <armbru@redhat.com> writes:

> Stefan Hajnoczi <stefanha@redhat.com> writes:
>
>> The QEMU process stays running if the test case fails.  This patch fixes
>> the leak by installing a SIGABRT signal handler which invokes
>> qtest_end().
>>
>> In order to make that work for assertion failures during qtest_init(),
>> we need to initialize QTestState fields including file descriptors and
>> pids carefully.  qtest_quit() is then safe to call even during
>> qtest_init().
>>
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> ---
>>  tests/libqtest.c | 29 ++++++++++++++++++++++++++---
>>  1 file changed, 26 insertions(+), 3 deletions(-)
>>
>> diff --git a/tests/libqtest.c b/tests/libqtest.c
>> index 8b2b2d7..09a0481 100644
>> --- a/tests/libqtest.c
>> +++ b/tests/libqtest.c
>> @@ -44,6 +44,7 @@ struct QTestState
>>      bool irq_level[MAX_IRQ];
>>      GString *rx;
>>      pid_t qemu_pid;  /* our child QEMU process */
>> +    struct sigaction sigact_old; /* restored on exit */
>>  };
>>  
>>  #define g_assert_no_errno(ret) do { \
>> @@ -88,6 +89,11 @@ static int socket_accept(int sock)
>>      return ret;
>>  }
>>  
>> +static void sigabrt_handler(int signo)
>> +{
>> +    qtest_end();

Don't you have to re-raise SIGABRT here, to actually terminate the
process?

>> +}
>> +
>>  QTestState *qtest_init(const char *extra_args)
>>  {
>>      QTestState *s;
[...]

Markus Armbruster Feb. 18, 2014, 10:05 a.m. UTC | #9

Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Mon, Feb 17, 2014 at 06:00:03PM +0100, Markus Armbruster wrote:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>> 
>> > Il 17/02/2014 16:44, Stefan Hajnoczi ha scritto:
>> >>  }
>> >>
>> >> +static void sigabrt_handler(int signo)
>> >> +{
>> >> +    qtest_end();
>> >> +}
>> >> +
>> >
>> > void qtest_quit(QTestState *s)
>> > {
>> >     int status;
>> >
>> >     if (s->qemu_pid != -1) {
>> >         kill(s->qemu_pid, SIGTERM);
>> >         waitpid(s->qemu_pid, &status, 0);
>> >     }
>> >
>> >     close(s->fd);
>> >     close(s->qmp_fd);
>> >     g_string_free(s->rx, true);
>> >     g_free(s);
>> > }
>> >
>> > Not async-signal safe.  You need to ignore the g_string_free and
>> > g_free (perhaps even the closes) if calling from the sigabrt_handler.
>> 
>> kill(), waitpid() and close() are all async-signal-safe.
>> 
>> SIGABRT is normally synchronous enough: it's sent by abort().  But of
>> course, nothing stops the user from kill -ABRT.  Or GLib from calling
>> abort() in some place where an attempt to reenter it crashes & burns.
>> Not sure I'd care, but I'm pretty sure I don't care for freeing stuff on
>> exit :)
>
> Yes, SIGABRT is synchronous for all purposes.  So the only danger is
> that g_string_free() or g_free() could fail while we're in
> g_assert(false).  But they don't, which makes sense because they are
> totally unrelated to g_assert() and therefore can handle re-entrancy.

The (theoretical!) problem is when it aborts in the bowels of g_*free(),
and your SIGABRT handler reenters g_*free().

> In practice there is no issue and I've tested assertion failure with
> glib 1.2.10.

Worst that can happen is we crash on the way from abort() to process
termination.  Tolerable.

Still, avoiding unnecessary cleanup on that path seems prudent to me.
If you agree, factor out the kill()/waitpid(), and call only that from
the signal handler.

Paolo Bonzini Feb. 18, 2014, 10:07 a.m. UTC | #10

Il 18/02/2014 10:05, Stefan Hajnoczi ha scritto:
>> > SIGABRT is normally synchronous enough: it's sent by abort().  But of
>> > course, nothing stops the user from kill -ABRT.  Or GLib from calling
>> > abort() in some place where an attempt to reenter it crashes & burns.
>> > Not sure I'd care, but I'm pretty sure I don't care for freeing stuff on
>> > exit :)
> Yes, SIGABRT is synchronous for all purposes.  So the only danger is
> that g_string_free() or g_free() could fail while we're in
> g_assert(false).  But they don't, which makes sense because they are
> totally unrelated to g_assert() and therefore can handle re-entrancy.

If malloc aborts due to a double free or other similar problem, you may 
risk reentering it.

Paolo

Daniel P. Berrangé Feb. 18, 2014, 10:17 a.m. UTC | #11

On Tue, Feb 18, 2014 at 11:07:53AM +0100, Paolo Bonzini wrote:
> Il 18/02/2014 10:05, Stefan Hajnoczi ha scritto:
> >>> SIGABRT is normally synchronous enough: it's sent by abort().  But of
> >>> course, nothing stops the user from kill -ABRT.  Or GLib from calling
> >>> abort() in some place where an attempt to reenter it crashes & burns.
> >>> Not sure I'd care, but I'm pretty sure I don't care for freeing stuff on
> >>> exit :)
> >Yes, SIGABRT is synchronous for all purposes.  So the only danger is
> >that g_string_free() or g_free() could fail while we're in
> >g_assert(false).  But they don't, which makes sense because they are
> >totally unrelated to g_assert() and therefore can handle re-entrancy.
> 
> If malloc aborts due to a double free or other similar problem, you
> may risk reentering it.

If you register the custom SIGABRT handler with sigaction + SA_RESETHAND
then you'd avoid the re-entrancy risk, since a cascading SIGABRT would
get handled by the system default handler, which would immediately
terminate the process.

Regards,
Daniel

Paolo Bonzini Feb. 18, 2014, 10:23 a.m. UTC | #12

Il 18/02/2014 11:17, Daniel P. Berrange ha scritto:
>>> > >Yes, SIGABRT is synchronous for all purposes.  So the only danger is
>>> > >that g_string_free() or g_free() could fail while we're in
>>> > >g_assert(false).  But they don't, which makes sense because they are
>>> > >totally unrelated to g_assert() and therefore can handle re-entrancy.
>> >
>> > If malloc aborts due to a double free or other similar problem, you
>> > may risk reentering it.
> If you register the custom SIGABRT handler with sigaction + SA_RESETHAND
> then you'd avoid the re-entrancy risk, since a cascading SIGABRT would
> get handled by the system default handler, which would immediately
> terminate the process.

I meant reentering malloc.

Paolo

Paolo Bonzini Feb. 18, 2014, 10:23 a.m. UTC | #13

Il 18/02/2014 11:05, Markus Armbruster ha scritto:
>> > Yes, SIGABRT is synchronous for all purposes.  So the only danger is
>> > that g_string_free() or g_free() could fail while we're in
>> > g_assert(false).  But they don't, which makes sense because they are
>> > totally unrelated to g_assert() and therefore can handle re-entrancy.
> The (theoretical!) problem is when it aborts in the bowels of g_*free(),
> and your SIGABRT handler reenters g_*free().
>
>> > In practice there is no issue and I've tested assertion failure with
>> > glib 1.2.10.
> Worst that can happen is we crash on the way from abort() to process
> termination.  Tolerable.

What about recursive locking of a non-recursive mutex?

Paolo

Markus Armbruster Feb. 18, 2014, 10:43 a.m. UTC | #14

Paolo Bonzini <pbonzini@redhat.com> writes:

> Il 18/02/2014 11:05, Markus Armbruster ha scritto:
>>> > Yes, SIGABRT is synchronous for all purposes.  So the only danger is
>>> > that g_string_free() or g_free() could fail while we're in
>>> > g_assert(false).  But they don't, which makes sense because they are
>>> > totally unrelated to g_assert() and therefore can handle re-entrancy.
>> The (theoretical!) problem is when it aborts in the bowels of g_*free(),
>> and your SIGABRT handler reenters g_*free().
>>
>>> > In practice there is no issue and I've tested assertion failure with
>>> > glib 1.2.10.
>> Worst that can happen is we crash on the way from abort() to process
>> termination.  Tolerable.
>
> What about recursive locking of a non-recursive mutex?

You're right, deadlock is possible.  Strengthens the argument for:

>> Still, avoiding unnecessary cleanup on that path seems prudent to me.
>> If you agree, factor out the kill()/waitpid(), and call only that from
>> the signal handler.

Stefan Hajnoczi Feb. 18, 2014, 2:38 p.m. UTC | #15

On Tue, Feb 18, 2014 at 11:02:52AM +0100, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
> 
> > Stefan Hajnoczi <stefanha@redhat.com> writes:
> >
> >> The QEMU process stays running if the test case fails.  This patch fixes
> >> the leak by installing a SIGABRT signal handler which invokes
> >> qtest_end().
> >>
> >> In order to make that work for assertion failures during qtest_init(),
> >> we need to initialize QTestState fields including file descriptors and
> >> pids carefully.  qtest_quit() is then safe to call even during
> >> qtest_init().
> >>
> >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> >> ---
> >>  tests/libqtest.c | 29 ++++++++++++++++++++++++++---
> >>  1 file changed, 26 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/tests/libqtest.c b/tests/libqtest.c
> >> index 8b2b2d7..09a0481 100644
> >> --- a/tests/libqtest.c
> >> +++ b/tests/libqtest.c
> >> @@ -44,6 +44,7 @@ struct QTestState
> >>      bool irq_level[MAX_IRQ];
> >>      GString *rx;
> >>      pid_t qemu_pid;  /* our child QEMU process */
> >> +    struct sigaction sigact_old; /* restored on exit */
> >>  };
> >>  
> >>  #define g_assert_no_errno(ret) do { \
> >> @@ -88,6 +89,11 @@ static int socket_accept(int sock)
> >>      return ret;
> >>  }
> >>  
> >> +static void sigabrt_handler(int signo)
> >> +{
> >> +    qtest_end();
> 
> Don't you have to re-raise SIGABRT here, to actually terminate the
> process?

No.  POSIX says:

  RETURN VALUE
       The abort() function shall not return.

(BTW the way to avoid that is using longjmp.)

The Linux man page is more explicit:

  If  the  SIGABRT  signal  is ignored, or caught by a handler that returns,
  the abort() function will still terminate the process.  It does this by
  restoring the default disposition for  SIGABRT and then raising the signal
  for a second time.

Stefan Hajnoczi Feb. 18, 2014, 2:38 p.m. UTC | #16

On Tue, Feb 18, 2014 at 11:43:10AM +0100, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > Il 18/02/2014 11:05, Markus Armbruster ha scritto:
> >>> > Yes, SIGABRT is synchronous for all purposes.  So the only danger is
> >>> > that g_string_free() or g_free() could fail while we're in
> >>> > g_assert(false).  But they don't, which makes sense because they are
> >>> > totally unrelated to g_assert() and therefore can handle re-entrancy.
> >> The (theoretical!) problem is when it aborts in the bowels of g_*free(),
> >> and your SIGABRT handler reenters g_*free().
> >>
> >>> > In practice there is no issue and I've tested assertion failure with
> >>> > glib 1.2.10.
> >> Worst that can happen is we crash on the way from abort() to process
> >> termination.  Tolerable.
> >
> > What about recursive locking of a non-recursive mutex?
> 
> You're right, deadlock is possible.  Strengthens the argument for:
> 
> >> Still, avoiding unnecessary cleanup on that path seems prudent to me.
> >> If you agree, factor out the kill()/waitpid(), and call only that from
> >> the signal handler.

Okay, I'm convinced.

Stefan Hajnoczi Feb. 18, 2014, 2:44 p.m. UTC | #17

On Tue, Feb 18, 2014 at 10:55:56AM +0100, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@gmail.com> writes:
> 
> > On Mon, Feb 17, 2014 at 05:49:31PM +0100, Markus Armbruster wrote:
> >> Stefan Hajnoczi <stefanha@redhat.com> writes:
> >> >      if (s->qemu_pid != -1) {
> >> >          kill(s->qemu_pid, SIGTERM);
> >> >          waitpid(s->qemu_pid, &status, 0);
> >> >      }
> >> >  
> >> > -    close(s->fd);
> >> > -    close(s->qmp_fd);
> >> > +    if (s->fd != -1) {
> >> > +        close(s->fd);
> >> > +    }
> >> > +    if (s->qmp_fd != -1) {
> >> > +        close(s->qmp_fd);
> >> > +    }
> >> 
> >> I generally don't bother to avoid close(-1).
> >
> > When I drive on the highway I stay on the lanes but I guess I could just
> > slide along the side barriers :).  It's a style issue but close(-1)
> > annoys me in strace so I try to avoid doing it.
> 
> For me, it's in the same category as free(NULL).

Understood.  I'll drop it from the next patch.

Markus Armbruster Feb. 18, 2014, 2:52 p.m. UTC | #18

Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Tue, Feb 18, 2014 at 11:02:52AM +0100, Markus Armbruster wrote:
>> Markus Armbruster <armbru@redhat.com> writes:
>> 
>> > Stefan Hajnoczi <stefanha@redhat.com> writes:
>> >
>> >> The QEMU process stays running if the test case fails.  This patch fixes
>> >> the leak by installing a SIGABRT signal handler which invokes
>> >> qtest_end().
>> >>
>> >> In order to make that work for assertion failures during qtest_init(),
>> >> we need to initialize QTestState fields including file descriptors and
>> >> pids carefully.  qtest_quit() is then safe to call even during
>> >> qtest_init().
>> >>
>> >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> >> ---
>> >>  tests/libqtest.c | 29 ++++++++++++++++++++++++++---
>> >>  1 file changed, 26 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/tests/libqtest.c b/tests/libqtest.c
>> >> index 8b2b2d7..09a0481 100644
>> >> --- a/tests/libqtest.c
>> >> +++ b/tests/libqtest.c
>> >> @@ -44,6 +44,7 @@ struct QTestState
>> >>      bool irq_level[MAX_IRQ];
>> >>      GString *rx;
>> >>      pid_t qemu_pid;  /* our child QEMU process */
>> >> +    struct sigaction sigact_old; /* restored on exit */
>> >>  };
>> >>  
>> >>  #define g_assert_no_errno(ret) do { \
>> >> @@ -88,6 +89,11 @@ static int socket_accept(int sock)
>> >>      return ret;
>> >>  }
>> >>  
>> >> +static void sigabrt_handler(int signo)
>> >> +{
>> >> +    qtest_end();
>> 
>> Don't you have to re-raise SIGABRT here, to actually terminate the
>> process?
>
> No.  POSIX says:
>
>   RETURN VALUE
>        The abort() function shall not return.
>
> (BTW the way to avoid that is using longjmp.)
>
> The Linux man page is more explicit:
>
>   If  the  SIGABRT  signal  is ignored, or caught by a handler that returns,
>   the abort() function will still terminate the process.  It does this by
>   restoring the default disposition for  SIGABRT and then raising the signal
>   for a second time.

Learn something new :)  Thanks!

Peter Maydell Feb. 18, 2014, 2:56 p.m. UTC | #19

On 18 February 2014 09:17, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> When I drive on the highway I stay on the lanes but I guess I could just
> slide along the side barriers :).  It's a style issue but close(-1)
> annoys me in strace so I try to avoid doing it.

As an aside, ever try stracing gtester? It implements its "-o file"
option by always doing the write of the XML, so if you didn't
specify -o then the write()s and close() are all done on fd -1...

thanks
-- PMM

[3/3] qtest: kill QEMU process on g_assert() failure

Commit Message

Comments

Patch